WO2022056817A1

WO2022056817A1 - Anchor frame selection for blending frames in image processing

Info

Publication number: WO2022056817A1
Application number: PCT/CN2020/116137
Authority: WO
Inventors: Wen-Chun Feng; Mian Li; Kai Liu; Ruocheng JIANG
Original assignee: Qualcomm Incorporated
Priority date: 2020-09-18
Filing date: 2020-09-18
Publication date: 2022-03-24

Abstract

Example techniques are described for selecting an anchor frame from a plurality of frames (e.g., plurality of captured images) based on tracking of a region of interest (ROI) in the plurality of frames. The ROI may be less than the entire frame and may be in different locations in two or more of the frames. A camera device may generate an output frame based on blending of the anchor frame and at least another frame of the plurality frames (e.g., blending the anchor frame and one or more of the plurality of the frames).

Description

ANCHOR FRAME SELECTION FOR BLENDING FRAMES IN IMAGE PROCESSING

TECHNICAL FIELD

The disclosure relates to blending frames for image processing.

BACKGROUND

A camera device includes one or more cameras that capture frames (e.g., images) . Examples of the camera device include stand-alone digital cameras or digital video camcorders, camera-equipped wireless communication device handsets, such as mobile telephones having one or more cameras, cellular or satellite radio telephones, camera-equipped personal digital assistants (PDAs) , panels or tablets, gaming devices, computer devices that include cameras, such as so-called “web-cams, ” or any device with digital imaging or video capabilities.

The camera device processes the captured frames and outputs the frames for display. In some examples, the camera device captures a plurality of frames and blends the frames together to form an output frame that is output for display.

SUMMARY

In general, this disclosure describes techniques for selecting an anchor frame from a plurality of frames (e.g., plurality of captured images) based on tracking of a region of interest (ROI) in the plurality of frames. The ROI may be less than the entire frame and may be in different locations in two or more of the frames. A camera device may generate an output frame based on blending of the anchor frame and at least another frame of the plurality frames (e.g., blending the anchor frame and one or more of the plurality of the frames) .

The blending of the anchor frame with at least another frame of the plurality of frames may reduce noise and avoid ghosting in the output frame. For example, by selecting one anchor frame and blending the anchor frame with at least another frame (e.g., one or more of the other frames) , the camera device may reduce blurriness, such as motion blur. The amount that the blurriness is reduced may be a factor of which frame of the plurality of frames is selected as the anchor picture. Selecting the anchor frame based on tracking of the ROI in the plurality of frames (e.g., using example techniques described in this disclosure) and blending one or more of the other frames with the selected anchor frame may result in an output frame having less blurriness as compared to selecting an anchor frame which is not based on tracking of the ROI in the plurality of frames.

In one example, the disclosure describes a device for image processing, the device comprising a memory and a processor coupled to the memory, the processor configured to determine a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame, track the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame, determine one or more sharpness values of the ROI in one or more frames of the plurality of frames, select an anchor frame from the one or more frames based on the determined one or more sharpness values, and generate an output frame based on blending the anchor frame and at least another frame of the plurality of frames.

In one example, the disclosure describes a method for image processing, the method comprising determining a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame, tracking the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame, determining one or more sharpness values of the ROI in one or more frames of the plurality of frames, selecting an anchor frame from the one or more frames based on the determined one or more sharpness values, and generating an output frame based on blending the anchor frame and at least another frame of the plurality of frames.

In one example, the disclosure describes a computer-readable storage medium storing instructions thereon that when executed cause one or more processors of a device for image processing to determine a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame, track the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame, determine one or more sharpness values of the ROI in one or more frames of the plurality of frames, select an anchor frame from the one or more frames based on the determined one or more sharpness values, and generate an output frame based on blending the anchor frame and at least another frame of the plurality of frames.

In one example, the disclosure describes a device for image processing, the device comprising means for determining a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame, means for tracking the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame, means for determining one or more sharpness values of the ROI in one or more frames of the plurality of frames, means for selecting an anchor frame from the one or more frames based on the determined one or more sharpness values, and means for generating an output frame based on blending the anchor frame and at least another frame of the plurality of frames.

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram of a device configured to perform one or more of the example techniques described in this disclosure.

FIG. 2 is a block diagram illustrating a camera processor of the device of FIG. 1 in further detail.

FIG. 3 is a conceptual diagram illustrating an example of a region of interest (ROI) in different frames.

FIG. 4 is a flowchart illustrating an example method of operation in accordance with one or more examples described in this disclosure.

FIG. 5 is a flowchart illustrating another example method of operation in accordance with one or more examples described in this disclosure.

FIG. 6 is a flowchart illustrating another example method of operation in accordance with one or more examples described in this disclosure.

DETAILED DESCRIPTION

The example techniques described in this disclosure relate to selecting an anchor frame from a plurality of frames for blending to generate one or more output frames. As described in more detail, the selection of the anchor frame may be based on tracking of a region of interest (ROI) in the plurality of frames. For example, a location or size of the ROI in a first frame of the plurality of frames may be different than a location or size of the ROI in a second frame of the plurality of frames. Tracking the ROI in the plurality of frames may include, for example, tracking the location and/or size of the ROI in the plurality of frames.

A camera device may be configured to capture a plurality of frames (e.g., images) that the camera device blends together to form one or more output frames. Rather than displaying each of the plurality of frames, a display may output the resulting one or more output frames. Blending one or more (e.g., all) of the plurality of frames may be beneficial because the blending may be a form of temporal filtering that reduces noise but maintains image details. One example technique of blending of one or more of the plurality of frames includes a multi-frame noise reduction (MFNR) technique. For example, the MFNR technique may include relatively more blending for stationary regions in the plurality of frames, and relatively less blending for moving regions in the plurality of frames to avoid ghosting. The MFNR technique may also include image alignment before blending to minimize handheld shaking blur (e.g., blur created due to accidental movement of the camera device while capturing the frames) .

In some examples, while blending reduces noise, blending may cause ghosting. This disclosure describes example techniques of selecting an anchor frame to be used for blending, so that ghosting is reduced while blending, and the noise reduction benefits of blending are preserved. For instance, the camera device may select an anchor frame from the plurality of frames, and perform blending with one or more frames of the plurality of frames with respect to the anchor frame. This disclosure describes example techniques for selecting the anchor frame such as based on tracking of an ROI in the plurality of frames, where the ROI is less than the entire frame and may be in different locations or may have different sizes in the plurality of frames. For example, the camera device may determine one or more sharpness values of the ROI in the plurality of frames (e.g., sharpness values of the ROI that is tracked in the plurality of frames) , and select an anchor frame based on the determined one or more sharpness values.

In some examples, rather than determining sharpness values in the ROI of each frame in the plurality of frames, the camera device may preselect one or more frames from the plurality of frames based on the location and/or size of the ROI in the plurality of frames, and possibly, also based on a brightness value of each frame of the plurality of frames. The camera device may determine one or more sharpness values of the ROI in one or more frames (e.g., the preselected frames) of the plurality of frames, and select an anchor frame based on the determined one or more sharpness values.

Selecting the anchor frame based on tracking of the ROI in the plurality of frames may be beneficial as compared to selecting the anchor frame without tracking of the ROI. For example, by not tracking the ROI, the camera device may not be able to distinguish background and foreground image content in the frames, resulting in selection of an anchor frame that when blended with at least another frame (e.g., the other frames) results in increased sharpness for background, and possible blurring of objects in the foreground. As another example, there may be movement of the one or more objects in the plurality of frames (e.g., because the camera device is capturing a moving object or because the camera device is moving) . By selecting an anchor frame without accounting for the movement of the object, the selected anchor frame, when blended with the other frames, may not result in an output frame that minimizes blurriness. Moreover, even in cases where a fixed ROI is used for anchor frame selection (e.g., the location and size of the ROI does not change frame-to-frame) , the fixed ROI may end up including more background information than desired if the ROI is not tracked frame-to-frame.

FIG. 1 is a block diagram of a device configured to perform one or more of the example techniques described in this disclosure. Examples of camera device 10 include stand-alone digital cameras or digital video camcorders, camera-equipped wireless communication device handsets, such as mobile telephones having one or more cameras, cellular or satellite radio telephones, camera-equipped personal digital assistants (PDAs) , panels or tablets, gaming devices, computer devices that include cameras, such as so-called “web-cams, ” or any device with digital imaging or video capabilities.

As illustrated in the example of FIG. 1, camera device 10 includes camera 12 (e.g., having an image sensor and lens) , camera processor 14 and local memory 20 of camera processor 18, a central processing unit (CPU) 16, a graphical processing unit (GPU) 18, user interface 22, memory controller 24 that provides access to system memory 30, and display interface 26 that outputs signals that cause graphical data to be displayed on display 28. Although the example of FIG. 1 illustrates camera device 10 including one camera 12, in some examples, camera device 10 may include a plurality of cameras, such as for omnidirectional image or video capture.

Also, although the various components are illustrated as separate components, in some examples the components may be combined to form a system on chip (SoC) . As an example, camera processor 14, CPU 16, GPU 18, and display interface 26 may be formed on a common integrated circuit (IC) chip. In some examples, one or more of camera processor 14, CPU 16, GPU 18, and display interface 26 may be in separate IC chips. Various other permutations and combinations are possible, and the techniques should not be considered limited to the example illustrated in FIG. 1.

The various components illustrated in FIG. 1 (whether formed on one device or different devices) may be formed as at least one of fixed-function or programmable circuitry such as in one or more microprocessors, application specific integrated circuits (ASICs) , field programmable gate arrays (FPGAs) , digital signal processors (DSPs) , or other equivalent integrated or discrete logic circuitry. Examples of local memory 20 and system memory 30 include one or more volatile or non-volatile memories or storage devices, such as random access memory (RAM) , static RAM (SRAM) , dynamic RAM (DRAM) , erasable programmable ROM (EPROM) , electrically erasable programmable ROM (EEPROM) , flash memory, a magnetic data media or an optical storage media.

The various units illustrated in FIG. 1 communicate with each other using bus 32. Bus 32 may be any of a variety of bus structures, such as a third generation bus (e.g., a HyperTransport bus or an InfiniBand bus) , a second generation bus (e.g., an Advanced Graphics Port bus, a Peripheral Component Interconnect (PCI) Express bus, or an Advanced eXtensible Interface (AXI) bus) or another type of bus or device interconnect. The specific configuration of buses and communication interfaces between the different components shown in FIG. 1 is merely exemplary, and other configurations of camera devices and/or other image processing systems with the same or different components may be used to implement the techniques of this disclosure.

Camera processor 14 is configured to receive image frames from camera 12, and process the image frames to generate output frames for display. CPU 16, GPU 18, camera processor 14, or some other circuitry may be configured to process the output frame that includes image content generated by camera processor 14 into images for display on display 28. In some examples, GPU 18 may be further configured to render graphics content on display 28.

In some examples, camera processor 14 may be configured as an image processing pipeline. For instance, camera processor 14 may include a camera interface that interfaces between camera 12 and camera processor 14. Camera processor 14 may include additional circuitry to process the image content.

Camera processor 14 outputs the resulting frames with image content (e.g., pixel values for each of the image pixels) to system memory 30 via memory controller 24. In one or more examples described in this disclosure, the frames may be further processed for generating one or more frames for display. For example, as described in more detail, camera processor 14 may be configured to select an anchor frame from a plurality of frames and generate an output frame of one or more output frames based on blending of the anchor frame with at least another frame (e.g., one or more, including all, of the other frames of the plurality of frames) . In some examples, rather than camera processor 14 performing the blending, GPU 18 or some other circuitry of camera device 10 may be configured to perform the blending.

This disclosure describes the examples techniques as being performed by camera processor 14. However, the example techniques should not be considered limited to camera processor 14 performing the example techniques. For instance, camera processor 14 in combination with CPU 16, GPU 18, and/or display interface 26 may be configured to perform the example techniques described in this disclosure. For example, a processor may be configured to perform the example techniques described in this disclosure. Examples of the processor include camera processor 14, CPU 16, GPU 18, display interface 26, or any combination of one or more of camera processor 14, CPU 16, GPU 18, and display interface 26.

CPU 16 may comprise a general-purpose or a special-purpose processor that controls operation of camera device 10. A user may provide input to camera device 10 to cause CPU 16 to execute one or more software applications. The software applications that execute on CPU 16 may include, for example, a media player application, a video game application, a graphical user interface application or another program. The user may provide input to camera device 10 via one or more input devices (not shown) such as a keyboard, a mouse, a microphone, a touch pad or another input device that is coupled to camera device 10 via user interface 22.

One example of the software application is a camera application. CPU 16 executes the camera application, and in response, the camera application causes CPU 16 to generate content that display 28 outputs. GPU 18 may be configured to process the content generated by CPU 16 for rendering on display 28. For instance, display 28 may output information such as light intensity, whether flash is enabled, and other such information. The user of camera device 10 may interface with display 28 to configure the manner in which the images are generated (e.g., with or without flash, focus settings, exposure settings, and other parameters) . As one example, the user of camera device 10 may select to take multiple frames (e.g., multiple pictures) , where two or more of the multiple frames are blended together (e.g., to reduce blur) to generate one or more output frames. However, taking multiple frames that are blended together may be the default option (e.g., no user selection is needed) . The camera application also causes CPU 16 to instruct camera processor 14 to capture and process the frames of image content captured by camera 12 in the user-defined manner.

Memory controller 24 facilitates the transfer of data going into and out of system memory 30. For example, memory controller 24 may receive memory read and write commands, and service such commands with respect to memory 30 in order to provide memory services for the components in camera device 10. Memory controller 24 is communicatively coupled to system memory 30. Although memory controller 24 is illustrated in the example of camera device 10 of FIG. 1 as being a processing circuit that is separate from both CPU 16 and system memory 30, in other examples, some or all of the functionality of memory controller 24 may be implemented on one or both of CPU 16 and system memory 30.

System memory 30 may store program modules and/or instructions and/or data that are accessible by camera processor 14, CPU 16, and GPU 18. For example, system memory 30 may store user applications (e.g., instructions for the camera application) , resulting frames from camera processor 14, etc. System memory 30 may additionally store information for use by and/or generated by other components of camera device 10. For example, system memory 30 may act as a device memory for camera processor 14.

As one example, camera processor 14 may cause camera 12 to capture a plurality of frames (e.g., a plurality of pictures) . The number of frames in the plurality of frames may be approximately 4 to 8 frames, but the techniques described in this disclosure are not limited to any particular number of frames. Camera 12 may capture the plurality of frames sequentially. Therefore, there may be a slight delay between when the first frame of the plurality of frames is captured and the last frame of the plurality of frames is captured.

Camera processor 14 may perform some initial image processing on the plurality of frames, but such initial image processing is not necessary in all examples. Camera processor 14 may output the plurality of frames to system memory 30 for storage. In some examples, rather than or in addition to, outputting the plurality of frames to system memory 30, camera processor 14 may output the plurality of frames to local memory 20. As another example, camera processor 14 may store each of the plurality of frames, as each frame is captured, into local memory 20 for temporary storage and then move the plurality of frames from local memory 20 to system memory 30. In some examples, camera 12 may bypass camera processor 14 and directly store the captured frames in system memory 30.

In accordance with one or more examples described in this disclosure, camera processor 14 may access the plurality of frames from local memory 20 and/or system memory 30 for blending two or more, or all, of the plurality of frames to generate one or more output frames (e.g., an output frame) for display. For example, camera processor 14 may be configured to perform operations of a multi-frame noise reduction (MFNR) technique. The MFNR technique may include temporal filtering the plurality of frames (e.g., approximately 4 to 8 frames) to blend together the plurality of frames to generate one or more output frames, including, in some cases, only one output frame. Such blending may reduce noise but keep image details, for example, by blending stationary regions in the frames more than moving regions in the frame, to generate an output frame with reduced ghosting and higher image quality. In some examples, image alignment may be performed before blending to avoid handheld shaking blur (e.g., blur due to inadvertent movement of camera device 10 by the user) .

The MFNR techniques may be further optimized by blending the plurality of frames with respect to an anchor frame. For example, camera processor 14 may select an anchor frame and reorder the plurality of frames, with the anchor frame as the first frame. Camera processor 14 may blend the anchor frame and at least another frame of the plurality of frames. For example, camera processor 14 may blend the plurality of frames starting with the anchor frame as the first frame. By blending the plurality of frames starting with the anchor frame, camera processor 14 may generate an output frame having a lesser amount of motion blur as compared to blending the plurality of frames without starting with an anchor frame.

For example, in some MFNR techniques, the frame having the highest sharpness is selected as the anchor frame, and then other frames are blended onto the selected anchor frame. As one example, the sample values of the samples in the anchor frame are weighted more than the sample values of the samples in the other frames. In this way, when blending between frames (e.g., determining a weighted average or summation of samples between two frames but other blending techniques are possible) , the sample values of the anchor frame are weighted more than sample values of the other frames resulting in the sharpness being preserved.

This disclosure describes example techniques for selecting an anchor frame, such as based on tracking of a region of interest (ROI) in the plurality of frames. As described in more detail, the selected anchor frame may be the anchor frame having the highest sharpness values within the ROI. Camera processor 14 may blend the anchor frame with at least another of the plurality of frames. By selecting the anchor frame based on the tracking of the ROI, camera processor 14 may blend less the areas in the frames that include moving content, to reduce ghosting, and blend more the remaining areas to remove noise. Also, because the anchor frame is selected based on the ROI having the highest sharpness values, even if there is blurriness in the other frames, the anchor frame may be weighted the most, resulting in sharper image content in the one or more output frames.

Stated another way, anchor picking mechanism may choose the “sharpest” image as anchor frame, and camera processor 14 may blend other images onto the anchor frame. In one or more examples, the anchor frame may be considered as having higher weighting especially in moving area for generating the one or more output frames. Therefore, in accordance with one or more examples described in this disclosure, if the ROI, which may include at least one object of interest that is in different locations in the frames, was blurred by movement in some of the frames, camera processor 14 can pick the frame having the sharpest ROI as the anchor frame and blend onto the anchor frame to generate the one or more output frames.

As described above, camera processor 14 may select an anchor frame, such as based on tracking of a ROI in the plurality of frames. For example, camera processor 14 may be configured to determine an ROI in a first frame of the plurality of frames, where an area of the ROI is less than an entirety of the frame. Although possible, the first frame of the plurality of frames need not necessarily be the first captured frame of the plurality of frames. Rather, the first frame of the plurality of frames may be anyone of the frames of the plurality of frames.

Camera processor 14 may track the ROI in the plurality of frames. For example, a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame. In some examples, the location and size of the ROI in the second frame of the plurality of frames is different than the location and size of the ROI in the first frame.

There may be various examples of the ROI and various reasons why the location or size of the ROI may be different in different frames. As described above, camera 12 may capture the plurality of frames sequentially, and therefore, a slight amount of time may have elapsed between each instance when camera 12 captured a frame and a following frame. During the time that elapses between capturing of frames, the location or size of the ROI may change.

As one example, the ROI may be a user selected ROI (e.g., the user may select the ROI as part of the executing the camera application) . In one or more examples, the user selected ROI may be in different locations in one or more of the plurality of frames. As another example, the ROI may be one or more objects that are moving, which is why the ROI may be in different locations in one or more of the plurality of frames. The movement of the objects may be because camera 12 is capturing frames of one or more moving objects, or the movement of the objects may be because the user of camera device 10 is moving camera device 10 during the capturing of the frames, as two examples. As another example, the ROI may be a face of a person, and the face of the person may be moving in one or more of the plurality of frames.

Camera processor 14 may determine one or more sharpness values for the ROI in the plurality of frames, and select a frame, as the anchor frame, with the ROI having the highest one or more sharpness values as compared to the one or more sharpness values of the ROI in the other frames. Rather than determining sharpness values for an entire frame or determining sharpness values for a fixed ROI (e.g., location and/or size of ROI is same in each of the frames) to select an anchor frame, camera processor 14 may determine sharpness values for an ROI that may be in different locations or may have different sizes in two or more of the plurality of frames to select an anchor frame.

In one or more examples, camera processor 14 may determine one or more sharpness values for the ROI in each of the plurality of frames. However, the example techniques are not so limited. In some examples, camera processor 14 may determine one or more sharpness values in a preselected group of the plurality of frames (e.g., one or more frames of the plurality of frames, where the one or more frames include all of the plurality of frames or a subset of the plurality of frames) . As one example, camera processor 14 may select one or more of the plurality of frames based on at least one of location or size of the ROI in the plurality of frames. As another example, camera processor 14 may select one or more of the plurality of frames based on a brightness value of the ROI in the plurality of frames. As another example, camera processor 14 may select one or more of the plurality of frames based on a combination of location or size of the ROI in the plurality of frames and the brightness value of the ROI in the plurality of frames.

For example, camera processor 14 may determine a difference value for each frame of the plurality of frames based on the ROI in each of the plurality of frames. Camera processor 14 may compare each difference value for each frame to a tolerance value and select the one or more frames of the plurality of frames (e.g., preselect the one or more frames) based on the comparison. In some examples, to determine the difference value, camera processor 14 may compare at least one of a size or location of the ROI in each frame of the plurality of frames to a size or location of the ROI in each of the other frames of the plurality of frames. Examples of comparing at least one of the size or location of the ROI in each frame to a size or location of the ROI in each of the other frames is described in further detail with respect to FIG. 3.

As described above, in some examples, camera processor 14 may determine a brightness value of the ROI in the plurality of frames. A brightness value may represent the luminance of samples in the ROI. For instance, each sample (e.g., pixel) in the ROI in each frame includes a color value. The color value may be represented in at least two manners. One way in which to represent the color value is with a red value, green value, and blue value (e.g., RGB value) . Another way in which to represent the color value is with luminance and chrominance, where luminance (Y) represents brightness and chrominance includes two colors (e.g., Cb for blue and Cr for red) . The one or more brightness values of the ROI in the plurality of frames may be based on the luminance (Y) of the samples in the ROI.

Camera processor 14 may be configured to process frames captured by camera 12 in either RGB or luminance/chrominance. For example, camera processor 14 may utilize the brightness values for preselecting frames, and then determine sharpness values for the ROI in the preselected frames. Camera processor 14 may determine brightness values based on the luminance, but determine the sharpness values based on the RGB values. In some examples, camera processor 14 may store color values for the samples of the ROI as both RGB values and luminance/chrominance values. The following is an example of equations that camera processor 14 may utilize to convert between RGB values and luminance/chrominance values, and vice-versa:

Y = 16 + 65.738*R/256 + 129.057*G/256 + 25.064*B/256

Cb = 128 –37.945*R/256 –74.494*G/256 + 112.439*B/256

Cr = 128 + 112.439*R –94.154*G/256 –18.285*B/256

R = 298.082*Y/256 + 408.583*Cr/256 –222.921

G = 298.082*Y/256 –100.291*Cb/256 –208.120*Cr/256 + 135.576

B = 298.082*Y/256 + 516.412*Cb/256 –276.836.

To determine the one or more brightness values for the ROI in each frame, as one example, camera processor 14 may determine the most common range of brightness values of the samples in the ROI in each frame. As another example, camera processor 14 may determine an average of the brightness values of the samples in the ROI as the brightness value of the ROI. As another example, camera processor 14 may select the highest brightness value of the samples in the ROI as the brightness value of the ROI. There may be various ways in which camera processor 14 may determine the brightness value for the ROI in each frame, and the example techniques are not limited to any particular way in which to determine the brightness value.

Camera processor 14 may compare the brightness value of the ROI of each frame to a tolerance value. Camera processor 14 may select the one or more frames of the plurality of frames (e.g., preselect the one or more frames) based on the comparison.

In some examples, camera processor 14 may select the one or more frames of the plurality of frames based on comparison of each difference value for each frame to a first tolerance value and comparison of the brightness value for each frame to a second tolerance value. In some examples, camera processor 14 may determine a difference value for each frame of the plurality of frames based on the ROI in each of the plurality of frames, compare each difference value for each frame to a tolerance value, and select a first subset of frames of the plurality of frames based on the comparison. Camera processor 14 may determine the brightness value for the ROI in the first subset of frames, compare each of the one or more brightness values to a tolerance value, and select a second subset of frames (e.g., from the first subset of frames) based on the comparison.

In some examples, camera processor 14 may determine a brightness value for each frame of the plurality of frames based on the ROI in each of the plurality of frames, compare each brightness value for each frame to a tolerance value, and select a first subset of frames of the plurality of frames based on the comparison. Camera processor 14 may determine a difference value for each frame of the first subset of frames based on the ROI in each of the plurality of frames, compare each difference value to a tolerance value, and select a second subset of frames (e.g., from the first subset of frames) based on the comparison.

As described above, camera processor 14 may determine one or more sharpness values of the ROI in one or more frames of the plurality of frames. In some examples, the one or more frames of the plurality of frames may be the preselected frames. Examples of the preselected frames include frames selected based on comparison of the difference value to a tolerance value, frames selected based on comparison of a brightness value for the ROI in each frame to a tolerance value, frames selected based on comparison of the difference value to a first tolerance value and comparison of the brightness value to a second tolerance value, or frames selected from the second subset of frames.

In some examples, regardless of a manner in which the one or more frames are selected, camera processor 14 may determine the one or more sharpness values of the ROI of the selected one or more frames and avoid determining one or more sharpness values of the ROI of the non-selected frames of the plurality of frames. That is, camera processor 14 may not determine the one or more sharpness values of the ROI in the non-selected frames. However, comparing difference values or the brightness values to respective tolerance values is not needed in every example. In some examples, camera processor 14 may determine the one or more sharpness values of the ROI of the one or more frames of the plurality of frames, where the one or more frames include all of the frames of the plurality of frames.

There may be various ways in which camera processor 14 determines the one or more sharpness values for the ROI in the one or more frames of the plurality of frames. As one example, camera processor 14 may divide the ROI in each frame include a plurality of blocks, where each block includes one or more samples. Camera processor 14 may determine an average color value for each block (e.g., based on average of the color values of the samples in the block) . Camera processor 14 may subtract the average color values between a current block and a horizontally neighboring block to generate a first value, and subtract the average color values between the current block and a vertically neighboring block to generate a second value. Camera processor 14 may sum an absolute value of the first value and an absolute value of the second value, and divide the result based on the size of the block to determine the sharpness value for the current block. Camera processor 14 may repeat these operations for each block of the ROI to determine a sharpness value for each block of the ROI. In this way, in one or more example, camera processor 14 may subtract color values between horizontally and vertically neighboring samples or blocks with the ROI in each of the one or more frames and sum the result of the subtraction for each of the one or more frames, as at least some of the operations for determining the one or more sharpness values for the ROI.

Camera processor 14 may select the largest sharpness value from the sharpness values for each block as the sharpness value for the ROI. As another example, camera processor 14 may average the sharpness values for each block as the sharpness value for the ROI. There may be other example ways in which to determine the sharpness value for the ROI and the above examples should not be considered limiting.

Camera processor 14 may select an anchor frame from the frames based on the determined one or more sharpness values. As one example, camera processor 14 may select the anchor frame from the frames with the ROI having the highest sharpness value among the determined one or more sharpness values.

Camera processor 14 may generate an output frame based on blending the anchor frame and at least another frame of the plurality of frames. For example, camera processor 14 may rank one or more, or all, of the frames based on sharpness values of the ROI. The anchor frame may be the first frame, followed by the frame having the second highest sharpness value, and so forth. Camera processor 14 may temporally filter (e.g., filter frames based on image content captured at different times) the one or more frames of the plurality of frames with respect to the anchor frame. For example, because the frames are captured at slightly different times, filtering the image content may be considered as filtering image content across time, and therefore referred to as temporally filtering. In some examples, camera processor 14 may temporally filter the anchor frame with the frame having the second highest sharpness value, and temporally filter the resulting frame with the frame having the third highest sharpness value, and so forth to generate an output frame.

In some examples, camera processor 14 may blend (e.g., by temporally filtering) the anchor frame and only the preselected frames. In some examples, camera processor 14 may blend the anchor frame and all frames. In some cases, where blending the anchor frame and all frames, camera processor 14 may weigh the samples of the non-selected frames even less than the weight assigned to the preselected frames. For instance, if there are eight frames, and after preselection, there are four frames that are preselected, in some examples, camera processor 14 may select an anchor frame from the four preselected frames, and blend with the remaining three preselected frames with the anchor frame. In some examples, camera processor 14 may blend the seven frames with the anchor frame.

Camera processor 14 may blend more in static region to reduce noise and blend less in moving region to avoid ghosting. Camera processor 14 may select the anchor frame as the first frame to simplify the overall blending process (e.g., averaging across samples in the frames) . Camera processor 14 may sort the frames and rearrange the frames in order to the anchor frame as the first frame. Because the ROI is based on tracking of at least one object in the frames, when camera processor 14 performs the blending, camera processor 14 may weight the samples in the ROI of the anchor frame more than the samples in the other frames so that there is less blending of the object of interest that is moving. The reduction in blending may result in the image content of the anchor frame being preserved and result in reduction in ghosting. Also, camera processor 14 may blend the static portions of the frames more, resulting in reduction in noise.

In one or more examples, camera processor 14 may include fixed-function circuitry configured to perform the above example techniques. However, camera processor 14 may include programmable circuitry or a combination of fixed-function and programmable circuitry configured to perform the above example techniques.

In some aspects, system memory 30 may include instructions that cause camera processor 14, CPU 16, GPU 18, and display interface 26 to perform the functions ascribed to these components in this disclosure. Accordingly, system memory 30 may be a computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors (e.g., camera processor 14, CPU 16, GPU 18, and display interface 26) to perform various functions.

In some examples, system memory 30 is a non-transitory storage medium. The term “non-transitory” indicates that the storage medium is not embodied in a carrier wave or a propagated signal. However, the term “non-transitory” should not be interpreted to mean that system memory 30 is non-movable or that its contents are static. As one example, system memory 30 may be removed from camera device 10, and moved to another device. As another example, memory, substantially similar to system memory 30, may be inserted into camera device 10. In certain examples, a non-transitory storage medium may store data that can, over time, change (e.g., in RAM) .

Camera processor 14, CPU 16, and GPU 18 may store image data, and the like in respective buffers that are allocated within system memory 30. Display interface 26 may retrieve the data from system memory 30 and configure display 28 to display the image represented by the generated image data. For example, display 28 may output the output frame generated by camera processor 14. In some examples, display interface 26 may include a digital-to-analog converter (DAC) that is configured to convert the digital values retrieved from system memory 30 into an analog signal consumable by display 28 to drive elements of the displays. In other examples, display interface 26 may pass the digital values directly to display 28 for processing.

Display 28 may include a monitor, a television, a projection device, a liquid crystal display (LCD) , a plasma display panel, a light emitting diode (LED) array, or another type of display unit. Display 28 may be integrated within camera device 10. For instance, display 28 may be a screen of a mobile telephone handset or a tablet computer. Alternatively, display 28 may be a stand-alone device coupled to camera device 10 via a wired or wireless communications link. For instance, display 28 may be a computer monitor or flat panel display connected to a personal computer via a cable or wireless link.

FIG. 2 is a block diagram illustrating a camera processor of the device of FIG. 1 in further detail. As illustrated, camera processor 14 includes tracking circuit 34, anchor frame selection circuit 36, blend circuit 38, and post-processing circuit 40. Tracking circuit 34, anchor frame selection circuit 36, blend circuit 38, and post-processing circuit 40 are illustrated as separate circuits simply to ease with illustration and description. However, tracking circuit 34, anchor frame selection circuit 36, blend circuit 38, and post-processing circuit 40 may be integrated together and need not necessarily be separate circuits within camera processor 14.

As illustrated, camera processor 14 may receive region information and a plurality of frames. The plurality of frames may be the plurality of frames that camera 12 captured, and camera processor 14 may receive the plurality of frames from system memory 30, local memory 20, or directly from camera 12.

Region information may include information of a region within which camera processor 14 may determine a region of interest (ROI) . As one example, the region information includes information identifying foreground areas (e.g., grids) within the plurality of frames. For example, a multiwindow (MW) autofocus (AF) algorithm, executing on CPU 16, may select the best foreground grids when autofocus (AF) finishes converging. The foreground areas may be one or more objects of interest that the user is more likely to focus on than other objects. The MW AF algorithm may divide an ROI into multi-grids, and converges on the image content that is closest (e.g., based on depth values) . The image content that is closest may be foreground areas.

As another example, the region information is user input that a user provides (e.g., based on content being displayed on display 28 or some other form of user selection) . For example, prior to taking an image, camera device 10 may be in a preview mode in which display 28 displays image content as a preview of the image that the user will take. During the preview mode, the user may interact with display 28 to provide region information.

As another example, the region information includes at least one face. For example, during the preview mode, camera device 10 may be configured to detect faces (e.g., human faces but animal faces are possible) in the image content. For instance, although not specifically shown, camera device 10 or camera processor 14 may include face detection circuitry that scans the image content in the preview mode and identifies shapes that appear to form faces. The face detection circuitry may determine shapes based on changes in contrast in the image content. The face detection circuitry may determine whether the shapes tend to form shapes of a face. In some examples, the face detection circuitry may be further configured to determine if there are shapes similar to the shapes of eyes, shapes similar to a shape of a nose, and shapes similar to a shape of a mouth in the shapes that the face detection circuitry initially classified as being faces to confirm that detected faces are actually faces. The face detection circuitry may provide information of one or more detected faces to camera processor 14.

There may be other examples of region information and the example techniques are not limited to the example region information. In some examples, the default region information may be the foreground information. For example, camera processor 14 may utilize the foreground information as default for performing the example techniques described in this disclosure. However, if there is user input for the region information or there are detected faces in the region information, then camera processor 14 may utilize the user input or detected faces for performing the example techniques described in this disclosure. In some examples, camera processor 14 may utilize any combination of the region information for performing the example techniques described in this disclosure.

Tracking circuit 34 may utilize the received region information to determine an ROI in a first frame of the plurality of frames, where the area of the ROI is less than an entirety of the first frame. For example, tracking circuit 34 may determine a polygon (e.g., rectangle) that encompasses at least one object of interest. The object of interest may be an object in the foreground, may be an object identified by user input, or may be a face. The polygon that encompasses the at least one object of interest is an example of an ROI. For example, tracking circuit 34 may determine an ROI in the plurality of frames by determining a polygon that encompasses at least one object of interest. Although the example is described with respect to one ROI, in some examples, tracking circuit 34 may determine a plurality of ROIs and the example techniques may apply to examples where there are a plurality of ROIs.

In one or more examples, tracking circuit 34 may be configured to track the ROI in the plurality of frames. For instance, a location of the ROI in a second frame of the plurality of frames may be in a different location than the ROI in the first frame. As an example, tracking circuit 34 may track the at least one object of interest in the plurality of frames, where the ROI encompasses the object of interest.

There may be movement of the at least one object of interest in the plurality of frames. Because the ROI encompasses the object of interest, the ROI may be located in different locations in the frames. As one example, the movement of the at least one object of interest may be because the object is an object in motion (e.g., moving car, jumping pet, movement of a face, etc. ) . As another example, the movement of the at least one object of interest may be because camera device 10 is moving. For example, the user may inadvertently move camera device 10 during the time that camera 12 is capturing the plurality of frames, which in turn then causes an object of interest to be in different locations in the plurality of frames.

Tracking circuit 34 may output information indicative of the ROI in the plurality of frames and anchor frame selection circuit 36 may receive the information indicative of the ROI. For example, tracking circuit 34 may output the information indicative of the ROI to local memory 20, and anchor frame selection circuit 36 may retrieve the information indicative of the ROI from local memory 20.

Anchor frame selection circuit 36 may be configured to determine one or more sharpness values of the ROI in one or more frames of the plurality frames, and select an anchor frame from the one or more frames based on the determined one or more sharpness values. As one example, anchor frame selection circuit 36 may divide the ROI in one or more frames (possibly all frames) of the plurality of frames into blocks. For each block, anchor frame selection circuit 36 may determine an average color value (e.g., average of the color values of the samples in the block) . Anchor frame selection circuit 36 may subtract the average color values of horizontally neighboring blocks and vertically neighboring blocks, and sum the results of the subtraction. In some examples, anchor frame selection circuit 36 may divide the result of the summation based on a size of the block to determine a sharpness value for each block. Anchor frame selection circuit 36 may determine the highest sharpness value from among the sharpness values for each of the blocks to determine a sharpness value for the ROI. There may be other ways in which to determine the sharpness value for the ROI such as averaging the sharpness of the sharpness values of the blocks.

Anchor frame selection circuit 36 may repeat such example operations for the ROI in each of the one or more frames. Accordingly, to determine the one or more sharpness values of the ROI in one or more frames of the plurality of frames, anchor frame selection circuit 36 may subtract color values between horizontally and vertically neighboring samples or blocks of samples within the ROI in each of the frames and sum the result of the subtraction for each of the frames.

In one or more examples, anchor frame selection circuit 36 may select an anchor frame from the frames with the ROI having a highest sharpness value among the determined one or more sharpness values. For example, the ROI in each of the one or more frames of the plurality of frames may be associated with one or more sharpness values, and anchor frame selection circuit 36 may select the anchor frame with the ROI associated with the highest sharpness value.

In some examples, rather than determining one or more sharpness values of the ROI in each of the plurality of frames for purposes of selecting an anchor frame, anchor frame selection circuit 36 may be configured to preselect the one or more frames of the plurality frames. Anchor frame selection circuit 36 may select the anchor frame as one of the one or more preselected frames. The following describes example ways in which selection circuit 36 may be configured to preselect the one or more frames.

As a first example, anchor frame selection circuit 36 may determine a difference value for each frame of the plurality of frames based on the ROI in each of the plurality of frames. For example, to determine the difference value, anchor frame selection circuit 36 may compare at least one of a size or location of the ROI in each frame of the plurality of frames to a size or location of the ROI in each of the other frames of the plurality of frames. Anchor frame selection circuit 36 may compare each difference value for each frame to a tolerance value and select the one or more frames of the plurality of frames based on the comparison. For example, if the difference value is less than the tolerance value, then the condition of the first example is satisfied and the frame is included in the one or more frames that are preselected.

As a second example, anchor frame selection circuit 36 may determine a brightness value for each frame of the plurality of frames based on the ROI in each of the plurality of frames, compare each brightness value for each frame to a tolerance value, and select the one or more frames of the plurality of frames based on the comparison. For example, if the brightness value is less than the tolerance value, then the condition of the second example is satisfied and the frame is included in the one or more frames that are preselected.

One example way to determine the brightness value for each frame is generating a histogram, where each bin of the histogram represents a range of brightness values. For example, the first bin of the histogram may be for 0 to X brightness values, the second bin of the histogram may be for X+1 to Y brightness values, and so forth. Anchor frame selection circuit 36 may arrange each of the brightness values of each of the samples in the ROI into one of the histogram bins. Anchor frame selection circuit 36 may determine the brightness value based on the median number of the bins. There may be other ways in which to determine the brightness values, and the techniques are not limited to the above example techniques.

Anchor frame selection circuit 36 may select the one or more frames using the first example or the second example. In some examples, anchor frame selection circuit 36 may utilize a combination of the first example and the second example (i.e., the conditions of both the first example and the second example are satisfied) . For example, anchor frame selection circuit 36 may preselect a frame if the frame satisfies both conditions: difference value is less than a first tolerance value and brightness value is less than a second tolerance value.

In some examples, by preselecting one or more frames, anchor frame selection circuit 36 may reduce computations for anchor frame selection. For instance, anchor frame selection circuit 36 may determine the one or more sharpness values of the ROI of the selected one or more frames and avoid determining one or more sharpness values of the ROI of the non-selected frames of the plurality of frames. However, in some examples, determining the sharpness values of the ROI for each of the frames may be utilized such as for ranking the frames for blending.

Blend circuit 38 may be configured to blend the anchor frame with at least another frame of the plurality of frames. For example, blend circuit 38 may start with the anchor frame and blend the anchor frame with another frame by temporally filtering the anchor frame with the other frame. Then, blend circuit 38 may blend the resulting frame with another frame, and so forth. The result of the blending may be an output frame. In this way, blend circuit 38 may be configured to generate an output frame based on blending of the anchor frame and at least another frame of the plurality of frames.

Post-processing circuit 40 may be configured to perform any post-processing, such as additional filtering or any other processing needed to prepare the output frame to be output. Post-processing may be optional. Examples of post-processing include spatial filtering (e.g., filtering across image content in the same frame) or edge enhancement, as a few examples. Display 28 may then output the output frame.

FIG. 3 is a conceptual diagram illustrating example of a region of interest (ROI) in different frames. For example, FIG. 3 illustrates

frames

42A, 42B, and 42C, which may be part of the plurality of frames.

Frames

42A, 42B, and 42C each include ROI 44. As illustrated in FIG. 3, a location or size of ROI 44 may be different in

frames

42A, 42B, and 42C.

As described above, anchor frame selection circuit 36 may preselect frames from the plurality of frames for selecting an anchor frame. For example, anchor frame selection circuit 36 may determine a difference value for each frame of the plurality of frames based on the ROI in each of the plurality of frames. As one example, to determine the difference value, anchor frame selection circuit 36 may compare at least one of a size or location of the ROI in each frame of the plurality of frames to a size or location of the ROI in each of the other frames of the plurality of frames.

For example, the size of ROI 44 in frame 42A is based on width 46A and height 48A. The location of ROI 44 in frame 42A is based on point 50A (e.g., x-and y-coordinates of point 50A) . The size of ROI 44 in frame 42B is based on width 46B and height 48B. The location of ROI 44 in frame 42B is based on point 50B (e.g., x-and y-coordinates of point 50B) . The size of ROI 44 in frame 42C is based on width 46C and height 48C. The location of ROI 44 in frame 42C is based on point 50C (e.g., x-and y-coordinates of point 50C) . The location of

points

50A, 50B, and 50C is one example, and other possible locations of

points

50A, 50B, and 50C on ROI 44 are possible.

For frame 42A, anchor frame selection circuit 36 may determine the difference between

width

46A and 46B and determine the difference between height 48A and height 48B. Based on the differences in the width and the height, anchor frame selection circuit 36 may determine a first size value for frame 42A. For frame 42A, anchor frame selection circuit 36 may determine the difference between

width

46A and 46C and determine the difference between height 48A and height 48C. Based on the differences in the width and the height, anchor frame selection circuit 36 may determine a second size value for frame 42A, and so forth for each of the plurality of frames. Anchor frame selection circuit 36 may sum the first size value, second size value, and so forth for each of the frames to determine a summed size value.

For frame 42A, anchor frame selection circuit 36 may determine a difference between the x-coordinate of point 50A and the x-coordinate of point 50B and determine a difference between the y-coordinate of point 50A and the y-coordinate of point 50B. Based on the differences in the x-and y-coordinates, anchor frame selection circuit 36 may determine a first location value for frame 42A. For frame 42A, anchor frame selection circuit 36 may determine the difference between the x-coordinate of point 50A and the x-coordinate of point 50C and determine a difference between the y-coordinate of point 50A and the y-coordinate of point 50C. Based on the differences in the x-and y-coordinates, anchor frame selection circuit 36 may determine a second location value for frame 42A, and so forth for each of the plurality of frames. Anchor frame selection circuit 36 may sum the first value, second value, and so forth for each of the frames to determine a summed location value.

In some examples, anchor frame selection circuit 36 may compare the summed size value to a tolerance value, and if the summed size value is less than the tolerance value, anchor frame selection circuit 36 may select frame 42A as one of the frames used to select an anchor frame. In some examples, anchor frame selection circuit 36 may compare the summed location value to a tolerance value, and if the summed location value is less than the tolerance value, anchor frame selection circuit 36 may select frame 42A as one of the frames used to select an anchor frame.

In some examples, anchor frame selection circuit 36 may combine the summed size value and the summed location value (e.g., by multiplying the summed size value and the summed location value together) to determine a combined size and location value. Anchor frame selection circuit 36 may compare the combined summed size and location value to a tolerance value, and if the combined summed size and location value is less than the tolerance value, anchor frame selection circuit 36 may select frame 42A as one of the frames used to select an anchor frame.

Anchor frame selection circuit 36 may repeat the example operations for frame 42B, frame 42C, and so forth. In some examples, anchor frame selection circuit 36 may not need to repeat all of the operations. For instance, for frame 42B, anchor frame selection circuit 36 may have already determined the difference in width and height and location with frame 42A when determining the difference in width and height and location for frame 42A. For each frame, anchor frame selection circuit 36 may determine whether the frame is to be used for selecting the anchor frame based on the above example operations.

Mathematically, the above example operations may be represented as follows:

In the above equation, i represents a current frame, j represents another one of the frames, and n represents the number of frames. For instance, if frame 42A is the current frame, frame 42A would be represented by i, and frames 42B and 42C would each be represented by j in the above equation. In the above equation, trackedROId _xj is the width of the jth frame (e.g., width 46B or width 46C) and trackedROId _xi is the width of the ith frame (e.g., width 46A) . In the above equation, trackedROId _yj is the height of the jth frame (e.g., height 48B or height 48C) and trackedROId _yi is the height of the ith frame (e.g., height 48A) . roiSizeDiff _i is an example of the summed size value for frame 42A, where frame 42A is the ith frame.

In the above equation, trackedROI _xj is the x-coordinate of a point in the jth frame (e.g., x-coordinate of point 50B or point 50C) and trackedROI _xi is the x-coordinate of a point in the ith frame (e.g., x-coordinate of point 50A) . In the above equation, trackedROI _yj is the y-coordinate of a point in the jth frame (e.g., y-coordinate of point 50B or point 50C) and trackedROI _yi is the y-coordinate of a point in the ith frame (e.g., y-coordinate of point 50A) . roiCoorDiff _i is an example of the summed location value for frame 42A, where frame 42A is the ith frame.

Anchor frame selection circuit 36 may multiply roiSizeDiff _i and roiCoorDiff _i to determine trackedROIDiff _i. Anchor frame selection circuit 36 may compare trackedROIDiff _i to a tolerance value (e.g., trackedROIDiffTolerance) . If trackedROIDiff _i is less than trackedROIDiffTolerance, then anchor frame selection circuit 36 may select the ith frame (e.g., frame 42A) as a frame that is used for selecting the anchor frame. That is, frame 42A may be a candidate for being selected as an anchor frame.

FIG. 4 is a flowchart illustrating an example method of operation in accordance with one or more examples described in this disclosure. The example of FIG. 4 is described with respect to a processor. One example of the processor is camera processor 14. However, the processor may be any one or combination of camera processor 14, CPU 16, and GPU 18. For example, the processor may be a system-on-chip (SoC) .

The processor may be configured to determine an ROI in a first frame of a plurality of frames, where an area of the ROI is less than an entirety of the first frame (52) . Camera 12 may be configured to capture the plurality of frames. As described above, tracking circuit 34 may receive region information (e.g., foreground information, user input such as user selection of ROI, or face information) . Tracking circuit 34 may be configured to determine the ROI based on the region information. For example, tracking circuit 34 may determine a polygon that encompasses at least one object of interest, where the object of interest may be a moving object, user input, or a face. In this way, tracking circuit 34 may be configured to determine the ROI in the first frame such as based on receiving information indicative of the ROI based on user input or where the ROI includes at least one face.

The processor may track the ROI in the plurality of frames (54) . As illustrated in FIG. 3, a location of the ROI (e.g., ROI 44) in a second frame (e.g., frame 42B) of the plurality of frames is in a different location than the ROI (e.g., ROI 44) in the first frame (e.g., frame 42A) . For example, tracking circuit 34 may track the at least one object of interest in the plurality of frames. That is, tracking circuit 34 may track the location and size of ROI 44 in the plurality of frames 42A–42C.

The processor may determine one or more sharpness values of the ROI in one or more frames of the plurality of frames (56) . To determine the one or more sharpness values of the ROI, anchor frame selection circuit 36 may subtract color values between horizontally and vertically neighboring samples or blocks of samples within the ROI in each of the one or more frames and sum the result of the subtraction for each of the one or more frames. In some examples, the one or more frames of the plurality of frames may be all of the plurality of frames. In some examples, the one or more frames may be preselected from the plurality of frames as described above and described with respect to FIGS. 5 and 6.

The processor may select an anchor frame from the frames based on the determined one or more sharpness values (58) . For example, anchor frame selection circuit 36 may select the anchor frame from the frames with the ROI having a highest sharpness value among the determined one or more sharpness values.

The processor may generate an output frame (e.g., one or more output frames) based on blending (e.g., weighted averaging) the anchor frame and at least another frame of the plurality of frames (60) . For example, blend circuit 38 may start with the anchor frame and temporally filter the anchor frame with another frame of the plurality of frames, and then temporally filter the resulting frame with another frame of the plurality of frames, and so forth. The temporally filtering may be averaging or weighted summation of sample intensity from the frames, as one example. However, other types of filtering may be possible. The blending may be considered as temporally filtering because the filtering is across frames that may be captured at different time instances (e.g., slightly different time instances) . In this way, blend circuit 38 may generate the output frame by temporally filtering the plurality of frames with respect to the anchor frame. Display 28 may be configured to output (e.g., display) the output frame.

FIG. 5 is a flowchart illustrating another example method of operation in accordance with one or more examples described in this disclosure. As described above, for selecting an anchor frame, in some examples, the processor may preselect one or more frames from the plurality of frames, and select the anchor frame from the preselected one or more frames. FIG. 5 is an example way of preselecting the one or more frames.

The processor may be configured to determine a difference value for each frame of the plurality of frames based on the ROI in each of the plurality of frames (62) . For example, anchor frame selection circuit 36 may determine the difference value by comparing at least one of a size or location of the ROI in each frame of the plurality of frames to a size or location of the ROI in each of the other frames of the plurality of frames. For example, anchor frame selection circuit 36 may determine roiCoorDiff _i, roiSizeDiff _i, or trackedROIDiff _i as described above for each frame. roiCoorDiff _i, roiSizeDiff _i, or trackedROIDiff _i are each examples of the difference value that the processor may be configured to determine.

The processor may compare each difference value for each frame to a tolerance value (64) . For example, anchor frame selection circuit 36 may compare trackedROIDiff _i to a tolerance value (e.g., trackedROIDiffTolerance) . The processor may select one or more frames based on the comparison (66) . For example, if anchor frame selection circuit 36 determines that trackedROIDiff _i is less than trackedROIDiffTolerance, anchor frame selection circuit 36 may select the ith frame as one of the one or more frames used to select an anchor frame. As described above, such as in FIG. 4, from the one or more frames (e.g., preselected frames using the example techniques of FIG. 5) , the processor may select the anchor frame, such as based on sharpness values of the ROI in each of the preselected frames.

FIG. 6 is a flowchart illustrating another example method of operation in accordance with one or more examples described in this disclosure. FIG. 5 described one example way in which to preselect one or more frames from the plurality of frames. FIG. 6 is another example way of preselecting the one or more frames. In some examples, the one or more frames that are preselected may be preselected based on a combination of techniques of FIGS. 5 and 6.

The processor may determine a brightness value for each frame based on the ROI of each frame (68) . The brightness value may be based on the luminance of samples in the ROI. In the event that the color values for the samples are represented as RGB values, anchor frame selection circuit 36 may convert the RGB values to luminance/chrominance values as described above. One example way in which to determine the brightness value for each frame is based on generating a histogram, where each bin represents a range of brightness values and the brightness value is based on the most common range of brightness values in the ROI (e.g., most common bin) .

The processor may compare each brightness value for each frame to a tolerance value (70) . The processor may select one or more frames based on the comparison (72) . For example, if anchor frame selection circuit 36 determines that the brightness value for the ith frame is less than the tolerance value, anchor frame selection circuit 36 may select the ith frame as one of the one or more frames used to select an anchor frame. As described above, such as in FIG. 4, from the one or more frames (e.g., preselected frames using the example techniques of FIG. 6) , the processor may select the anchor frame, such as based on sharpness values of the ROI in each of the preselected frames.

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media. In this manner, computer-readable media generally may correspond to tangible computer-readable storage media which is non-transitory. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. It should be understood that computer-readable storage media and data storage media do not include carrier waves, signals, or other transient media, but are instead directed to non-transient, tangible storage media. Disk and disc, as used herein, includes compact disc (CD) , laser disc, optical disc, digital versatile disc (DVD) , floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs) , general purpose microprocessors, application specific integrated circuits (ASICs) , field programmable logic arrays (FPGAs) , or other equivalent integrated or discrete logic circuitry. Accordingly, the term “processor, ” as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set) . Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Various examples have been described. These and other examples are within the scope of the following claims.

Claims

A device for image processing, the device comprising:

a memory; and

a processor coupled to the memory, the processor configured to:

determine a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame;

track the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame;

determine one or more sharpness values of the ROI in one or more frames of the plurality of frames;

select an anchor frame from the one or more frames based on the determined one or more sharpness values; and

generate an output frame based on blending the anchor frame and at least another frame of the plurality of frames.
The device of claim 1, wherein the ROI includes at least one object of interest, and tracking the ROI includes tracking the at least one object of interest in the plurality of frames.
The device of claim 1, wherein the processor is further configured to:

determine a difference value for each frame of the plurality of frames based on the ROI in each of the plurality of frames;

compare each difference value for each frame to a tolerance value; and

select the one or more frames of the plurality of frames based on the comparison.
The device of claim 3, wherein determining the one or more sharpness values of the ROI includes determining the one or more sharpness values of the ROI of the selected one or more frames and avoiding determining one or more sharpness values of the ROI of the non-selected frames of the plurality of frames.
The device of claim 3, wherein determining the difference value includes comparing at least one of a size or location of the ROI in each frame of the plurality of frames to a size or location of the ROI in each of the other frames of the plurality of frames.
The device of claim 1, wherein the processor is configured to:

determine a brightness value for each frame of the plurality of frames based on the ROI in each of the plurality of frames;

compare each brightness value for each frame to a tolerance value; and

select the one or more frames of the plurality of frames based on the comparison.
The device of claim 6, wherein determining the one or more sharpness values of the ROI includes determining the one or more sharpness values of the ROI of the selected one or more frames and avoiding determining one or more sharpness values of the ROI of the non-selected frames of the plurality of frames.
The device of claim 1, wherein the processor is configured to receive information indicative of the ROI based on user input.
The device of claim 1, wherein the ROI includes at least one face.
The device of claim 1, wherein determining the one or more sharpness values of the ROI in one or more frames of the plurality of frames includes subtracting color values between horizontally and vertically neighboring samples or blocks of samples within the ROI in each of the one or more frames and summing the result of the subtraction for each of the one or more frames.
The device of claim 1, further comprising a display configured to output the output frame.
The device of claim 1, wherein the device comprises a camera configured to capture the plurality of frames.
The device of claim 1, wherein generating the output frame comprises temporally filtering the plurality of frames with respect to the anchor frame.
The device of claim 1, wherein selecting the anchor frame from the one or more frames based on the determined one or more sharpness values includes selecting the anchor frame from the one or more frames with the ROI having a highest sharpness value among the determined one or more sharpness values.
A method of image processing, the method comprising:

determining a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame;

tracking the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame;

determining one or more sharpness values of the ROI in one or more frames of the plurality of frames;

selecting an anchor frame from the one or more frames based on the determined one or more sharpness values; and

generating an output frame based on blending the anchor frame and at least another frame of the plurality of frames.
The method of claim 15, wherein the ROI includes at least one object of interest, and wherein tracking the ROI comprises tracking the at least one object of interest in the plurality of frames.
The method of claim 15, further comprising:

determining a difference value for each frame of the plurality of frames based on the ROI in each of the plurality of frames;

comparing each difference value for each frame to a tolerance value; and

selecting the one or more frames of the plurality of frames based on the comparison.
The method of claim 17, wherein determining the one or more sharpness values of the ROI comprises determining the one or more sharpness values of the ROI of the selected one or more frames and avoiding determining one or more sharpness values of the ROI of the non-selected frames of the plurality of frames.
The method of claim 17, wherein determining the difference value comprises comparing at least one of a size or location of the ROI in each frame of the plurality of frames to a size or location of the ROI in each of the other frames of the plurality of frames.
The method of claim 15, further comprising:

determining a brightness value for each frame of the plurality of frames based on the ROI in each of the plurality of frames;

comparing each brightness value for each frame to a tolerance value; and

selecting the one or more frames of the plurality of frames based on the comparison.
The method of claim 20, wherein determining the one or more sharpness values of the ROI comprises determining the one or more sharpness values of the ROI of the selected one or more frames and avoiding determining one or more sharpness values of the ROI of the non-selected frames of the plurality of frames.
The method of claim 15, further comprising receiving information indicative of the ROI based on user input.
The method of claim 15, wherein the ROI includes at least one face.
The method of claim 15, wherein determining the one or more sharpness values of the ROI in one or more frames of the plurality of frames comprises subtracting color values between horizontally and vertically neighboring samples or blocks of samples within the ROI in each of the one or more frames and summing the result of the subtraction for each of the one or more frames.
The method of claim 15, further comprising displaying the output frame.
The method of claim 15, further comprising capturing the plurality of frames with a camera.
The method of claim 15, wherein generating the output frame comprises temporally filtering the plurality of frames with respect to the anchor frame.
The method of claim 15, wherein selecting the anchor frame from the one or more frames based on the determined one or more sharpness values comprises selecting the anchor frame from the one or more frames with the ROI having a highest sharpness value among the determined one or more sharpness values.
A computer-readable storage medium storing instructions thereon that when executed by one or more processors cause the one or more processors of a device for image processing to:

determine a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame;

track the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame;

determine one or more sharpness values of the ROI in one or more frames of the plurality of frames;

select an anchor frame from the one or more frames based on the determined one or more sharpness values; and

generate an output frame based on blending the anchor frame and at least another frame of the plurality of frames.
A device for image processing, the device comprising:

means for determining a region of interest (ROI) in a first frame of a plurality of frames, wherein an area of the ROI is less than an entirety of the first frame;

means for tracking the ROI in the plurality of frames, wherein a location of the ROI in a second frame of the plurality of frames is in a different location than the ROI in the first frame;

means for determining one or more sharpness values of the ROI in one or more frames of the plurality of frames;

means for selecting an anchor frame from the one or more frames based on the determined one or more sharpness values; and

means for generating an output frame based on blending the anchor frame and at least another frame of the plurality of frames.