WO2011017823A1 - Techniques to perform video stabilization and detect video shot boundaries based on common processing elements - Google Patents

Techniques to perform video stabilization and detect video shot boundaries based on common processing elements

Info

Publication number
WO2011017823A1
WO2011017823A1 PCT/CN2009/000920 CN2009000920W WO2011017823A1 WO 2011017823 A1 WO2011017823 A1 WO 2011017823A1 CN 2009000920 W CN2009000920 W CN 2009000920W WO 2011017823 A1 WO2011017823 A1 WO 2011017823A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
current frame
block
motion parameters
trajectory
Prior art date
Application number
PCT/CN2009/000920
Other languages
French (fr)
Inventor
Lidong Xu
Yijen Chiu
Qian Huang
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation
Priority to CN200980160949.5A (CN102474568B)
Priority to JP2012524073A (JP5435518B2)
Priority to EP09848153.4A (EP2465254A4)
Priority to KR1020127003602A (KR101445009B1)
Priority to PCT/CN2009/000920 (WO2011017823A1)
Publication of WO2011017823A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/144 Movement detection
    • H04N5/145 Movement estimation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682 Vibration or motion blur correction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/147 Scene change detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

An apparatus, a system and a method for performing video stabilization and detecting video shot boundaries are disclosed. The method includes: receiving a current frame of a video; down-scaling the current frame; storing the down-scaled current frame into a portion of a buffer; determining the sum of the absolute differences between the blocks in a down-scaled reference frame and a target block in the down-scaled current frame; determining the inter-frame dominant motion parameters of the down-scaled current frame; and performing at least one of the video stabilization and the shot boundary detection based in part on at least one of the motion parameters and the sum of the absolute differences.

Description

TECHNIQUES TO PERFORM VIDEO STABILIZATION AND DETECT VIDEO SHOT BOUNDARIES BASED ON COMMON PROCESSING ELEMENTS
Field
The subject matter disclosed herein relates generally to techniques to perform video stabilization and detect video shot boundaries using common processing elements.
Related Art
Video stabilization aims to improve visual qualities of video sequences captured by digital video cameras. When cameras are hand held or mounted on unstable platforms, the captured video can appear shaky because of undesired camera motions, which lead to a degraded viewer experience. Video stabilization techniques can be employed to remove or reduce the undesired motions among the captured video frames.
A video usually consists of scenes, and each scene includes one or more shots. A shot is defined as a sequence of frames captured by a single camera in a single continuous action. The change from one shot to another, also known as shot transition, includes two key types: abrupt transition (CUT) and gradual transition (GT). Video shot boundary detection aims to detect shot boundary frames. Video shot boundary detection can be applied in various applications, such as intra frame identification in video coding, video indexing, video retrieval, and video editing.
Brief Description of the Drawings
Embodiments of the present invention are illustrated by way of example, and not by way of limitation, in the drawings and in which like reference numerals refer to similar elements.
FIG. 1 depicts, in block diagram format, a video stabilization system in accordance with an embodiment.
FIG. 2 shows a block diagram of an inter-frame dominant motion estimation module, in accordance with an embodiment.
FIG. 3 provides a flow diagram of a process performed to improve video stabilization, in accordance with an embodiment.
FIG. 4 depicts a block diagram of a shot boundary detection system, in accordance with an embodiment.
FIG. 5 provides a process of a shot boundary decision scheme, in accordance with an embodiment.
FIG. 6 depicts a block diagram of a system that performs video stabilization and shot boundary detection, in accordance with an embodiment.
FIG. 7 depicts an example of identification of a matched block in a reference frame using a search window where the matched block corresponds to a target block in a current frame.
Detailed Description
Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrase "in one embodiment" or "an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment.
Furthermore, the particular features, structures, or characteristics may be combined in one or more embodiments.
A graphics processing system may need to support multiple video processing features as well as various video encoding or decoding standards. Various embodiments permit a graphics processing system to support both video stabilization and video shot boundary detection features. In particular, various embodiments permit a graphics processing system to use certain processing capabilities for both video stabilization and shot boundary detection. In some embodiments, down sampling and block motion search features of a graphics processing system are used for both video stabilization and video shot boundary detection. Reuse of features may reduce the cost of manufacturing a graphics processing system and also reduce the size of the graphics processing system.
Various embodiments are capable of encoding or decoding video or still images in accordance with a variety of standards, such as but not limited to: MPEG-4 Part 10 advanced video codec (AVC)/H.264. The H.264 standard has been prepared by the Joint Video Team (JVT), which includes ITU-T SG16 Q.6, also known as VCEG (Video Coding Expert Group) and the ISO-IEC JTC1/SC29/WG11 (2003), known as MPEG (Motion Picture Expert Group). In addition, embodiments may be used in a variety of still image or video compression systems including, but not limited to, object oriented video coding, model based video coding, scalable video coding, as well as MPEG-2 (ISO/IEC 13818-1 (2000) available from International Organization for Standardization, Geneva, Switzerland), VC1 (SMPTE 421M (2006) available from SMPTE White Plains, NY 10601), as well as variations of MPEG-4, MPEG-2, and VC1.
FIG. 1 depicts a video stabilization system 100, in block diagram format, in accordance with an embodiment. Video stabilization system 100 includes inter-frame dominant motion estimation (DME) block 102, trajectory computation block 104, trajectory smoothing block 106, and jitter compensation block 108. Inter-frame DME block 102 is to determine camera vibration between two consecutive frames in a video sequence. Inter-frame DME block 102 is to identify local motion vectors and then determine the dominant motion parameters based on those local motion vectors.
Trajectory computation block 104 is to calculate the motion trajectory with those determined dominant motion parameters. Trajectory smoothing block 106 is to smooth the calculated motion trajectory to provide a smoother trajectory. Jitter compensation block 108 is to reduce jitter in the smoother trajectory.
FIG. 2 shows a block diagram of an inter-frame dominant motion estimation module 200, in accordance with an embodiment. Module 200 includes frame down-sampling block 202, reference buffer 204, block motion search block 206, iterative least square solver block 208, and motion up-scaling block 210.
Down-sampling block 202 is to down-scale input frames to a smaller size. For example, a down-sampling factor of approximately 4-5 may be used, although other values can be used. In some embodiments, down-sampling block 202 provides smaller sized frames that are approximately 160x120 pixels. A resulting down-scaled frame has fewer blocks. A block may be 8x8, 16x16, or other sizes due to the design of the common processing element. Generally, a 16x16 block is used. The down-scaling process also down-scales block motion vectors. In various embodiments, a motion vector represents a vertical and horizontal displacement of a pixel, a block, or an image between frames. Down-scaling the frames also down-scales the x and y motions between two frames. For example, if the down-sampling factor is 4 and the motion vector is (20, 20), the down-scaled motion vector will be approximately (5, 5) in the down-scaled frames. As a result, a window/region-limited block motion search on a smaller picture can encompass larger motions on the original frames. Accordingly, the processing time and processing resources used to process blocks can be reduced.
Down-sampling block 202 is to store the down-sampled frames into reference buffer 204. Reference buffer 204 may be a region in memory that is available for use at least in performing video stabilization and shot boundary detection. The region may be a buffer or a portion of a buffer. For example, if the region is a portion of a buffer, the other portions of the same buffer can be used simultaneously or at other times by other applications or processes. In various embodiments, for video stabilization and shot boundary detection, a single reference frame is used. Accordingly, the size of the reference buffer can be set to store one frame. At each updating of the reference buffer, a reference frame can be replaced with another reference frame.
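The down-scaling step described above can be sketched as follows. This is a minimal illustration only: the text does not say which down-sampling filter is used, so the box (tile-average) filter, the function names `downscale` and `downscale_motion_vector`, and the 640x480 test frame are all assumptions made for the example.

```python
def downscale(frame, factor):
    """Down-scale a grayscale frame (list of rows) by averaging factor x factor tiles.

    The box filter is an assumption; the source does not specify the filter.
    """
    h, w = len(frame) // factor, len(frame[0]) // factor
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            total = sum(frame[r * factor + dy][c * factor + dx]
                        for dy in range(factor) for dx in range(factor))
            row.append(total / (factor * factor))
        out.append(row)
    return out


def downscale_motion_vector(mv, factor):
    """A displacement measured on the original frame shrinks by the same factor."""
    return (mv[0] / factor, mv[1] / factor)


# Hypothetical 640x480 input with a down-sampling factor of 4.
frame = [[(x + y) % 256 for x in range(640)] for y in range(480)]
small = downscale(frame, 4)
print(len(small[0]), len(small))             # width and height of the small frame
print(downscale_motion_vector((20, 20), 4))  # the (20, 20) example from the text
```

With a factor of 4, a 640x480 input yields the approximately 160x120 frame mentioned above, and a search range of a few pixels on the small frame covers four times that range on the original.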
Block motion search block 206 is to receive a down-sampled current frame from down-sampling block 202 and also to receive the down-sampled previous reference frame from reference buffer 204. Block motion search block 206 is to identify a local motion vector of selected blocks within a pre-defined search window. For example, the identified motion vector can be the motion vector associated with a block in a search window with the lowest sum of absolute differences (SAD) with respect to a target block in the current frame. The block in the search window may be a macroblock or a small block, such as 8x8 pixels, although other sizes can be used. In some embodiments, the block size is 16x16 pixels and the search window can be set to 48x32 pixels. In various embodiments, block motion search block 206 does not search for motion vectors associated with blocks on frame borders.
In some embodiments, block motion search block 206 is to determine the sum of absolute differences (SAD) for macroblocks of each frame. For example, determining a SAD for each macroblock in a frame may include comparing each 16x16 pixel macroblock of a reference frame with a 16x16 pixel macroblock in a current frame. In some embodiments, all macroblocks within a 48x32 pixel search window of a reference frame can be compared with a target 16x16 pixel macroblock in a current frame. The target macroblock can be picked one by one or in a chessboard pattern. For a full search, all macroblocks in the 48x32 search window may be compared with the target macroblock. When moving a 16x16 macroblock within a 48x32 search window, there are 32x16 positions to move to. Accordingly, 32x16 (512) macroblocks can be compared and, in this example, 512 SADs are determined.
FIG. 7 depicts an example of identification of a matched block in a reference frame using a search window where the matched block corresponds to a target block in a current frame. An exemplary block motion search may include the following steps.
(1) Select multiple target blocks in a current frame. Let the coordinates of the target blocks be (x_i, y_i), where i is the block index. Target blocks in the current frame can be selected one by one, although other selection techniques can be used, such as selecting them in a chessboard pattern.
(2) For target block i in the current frame, block motion search is used in the search window to identify the matched block and obtain a local motion vector (mvx_i, mvy_i). Finding a matched block in the search window in the reference frame for target block i can include comparing all candidate blocks in a reference frame search window with the target block, and the one with the minimum SAD is regarded as the matched block.
(3) After block motion search for block i, calculate: x'_i = x_i + mvx_i and y'_i = y_i + mvy_i. Then, (x_i, y_i) and (x'_i, y'_i) are regarded as a pair.
(4) After performing block motion search for all selected target blocks in the current frame, multiple pairs (x_i, y_i) and (x'_i, y'_i) are obtained.
As shown in FIG. 7, for one target block (x, y) in a current frame, the 48x32 search window is specified in a reference frame, and the search window can be centered at (x, y). After finding the matched block in the search window by block motion search, the local motion vector (mvx, mvy) for the target block is obtained. The coordinates of the matched block (x', y') are x' = x + mvx, y' = y + mvy. Then, (x, y) and (x', y') are regarded as a pair.
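The full search described above can be sketched in software as follows. This is only an illustration under stated assumptions: frames are plain lists of rows, block coordinates are top-left corners, and the names `sad` and `block_motion_search` are hypothetical. The displacement ranges of -16..15 horizontally and -8..7 vertically are chosen so that a 16x16 block visits 32x16 = 512 positions in a 48x32 window, matching the count given in the text; candidates that fall outside the frame are skipped.

```python
import random


def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between the size x size block at (cx, cy) in cur
    and the block at (rx, ry) in ref (top-left corner coordinates)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def block_motion_search(cur, ref, x, y, size=16, range_x=16, range_y=8):
    """Full search around (x, y); returns (mvx, mvy, min_sad) or None if no
    candidate block lies fully inside the reference frame."""
    h, w = len(ref), len(ref[0])
    best = None
    for mvy in range(-range_y, range_y):
        for mvx in range(-range_x, range_x):
            rx, ry = x + mvx, y + mvy
            if rx < 0 or ry < 0 or rx + size > w or ry + size > h:
                continue  # candidate block would leave the frame
            s = sad(cur, ref, x, y, rx, ry, size)
            if best is None or s < best[2]:
                best = (mvx, mvy, s)
    return best


# Synthetic demo: the content at (x, y) in cur comes from (x + 5, y - 3) in ref.
rng = random.Random(0)
H, W = 64, 64
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
cur = [[ref[(y - 3) % H][(x + 5) % W] for x in range(W)] for y in range(H)]
mvx, mvy, best_sad = block_motion_search(cur, ref, 24, 24)
print(mvx, mvy, best_sad)
x_matched, y_matched = 24 + mvx, 24 + mvy  # the pair is ((24, 24), (x_matched, y_matched))
```

The minimum SAD returned for each target block is the per-block value that a shot boundary decision can reuse, so a single search pass can feed both features.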
Referring again to FIG. 2, iterative least square solver 208 is to determine dominant motion parameters based on at least two identified local motion vectors. In some embodiments, iterative least square solver 208 is to apply the similarity motion model shown in FIG. 2 to approximate the dominant inter-frame motion parameters. The similarity motion model can also be written in the format of equation (1) below.
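\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} c \\ d \end{pmatrix} \tag{1} \]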
where:
(x', y') represents the matched block coordinates in a reference frame, (x, y) represents the block coordinates in the current frame, and (a, b, c, d) represents the dominant motion parameters, where parameters a and b relate to rotation and parameters c and d relate to translation. For example, block coordinates (x', y') and (x, y) could be defined as the top-left corner, bottom-right corner, or block center of a block, as long as the definition is used consistently. For a block whose coordinates are (x, y) and whose identified local motion vector (from block 206) is (mvx, mvy), the coordinates (x', y') of its matched block are obtained by x' = x + mvx and y' = y + mvy. In various embodiments, all (x, y) and (x', y') pairs of a frame are used in equation (1). Iterative least square solver block 208 is to determine motion parameters (a, b, c, d) by solving equation (1) using the Least Squares (LS) technique.
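As a sketch of how the least-squares problem can be posed, assume the similarity model takes the form x' = ax + by + c, y' = -bx + ay + d, which is the form implied by the squared estimation error and the warping step elsewhere in this description. The matrix A and the vectors p and q are notation introduced here for illustration only.

```latex
% Each pair i contributes two rows to an over-determined linear system.
\[
\underbrace{\begin{pmatrix} x_i & y_i & 1 & 0 \\ y_i & -x_i & 0 & 1 \end{pmatrix}}_{A_i}
\underbrace{\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}}_{p}
=
\underbrace{\begin{pmatrix} x'_i \\ y'_i \end{pmatrix}}_{q_i},
\qquad i = 1, \dots, n
\]
% Stacking all A_i into A and all q_i into q gives the least-squares estimate.
\[
\hat{p} = \arg\min_{p} \lVert A p - q \rVert^2 = (A^{\mathsf T} A)^{-1} A^{\mathsf T} q
\]
```

Two pairs with distinct block positions already give four equations for the four unknowns, which is consistent with the requirement of at least two local motion vectors; additional pairs over-determine the system and average out measurement noise.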
Outlier local motion vectors may negatively impact estimation of dominant motions if considered by iterative least square solver 208. Outlier local motion vectors may be identified by block motion search block 206 if some blocks in a current frame are selected from an area that includes foreground objects or repeated similar patterns. In various embodiments, iterative least square solver 208 uses an iterative least square (ILS) solver to reduce the effect of the outlier local motion vectors by identifying and removing outlier local motion vectors from consideration. In such embodiments, after determining dominant motion parameters using equation (1) above, iterative least square solver 208 is to determine the squared estimation error (SEE) of each remaining block position (x_i, y_i) in the current frame. Block position (x_i, y_i) can be the top-left corner, bottom-right corner, or block center, as long as the definition is used consistently.
\[ SEE_i = (a x_i + b y_i + c - x'_i)^2 + (-b x_i + a y_i + d - y'_i)^2 \tag{2} \]
A local motion vector is regarded as an outlier if its corresponding squared estimation error (SEE) satisfies equation (3).
\[ SEE_i > \frac{T}{n} \sum_{j=1}^{n} SEE_j \tag{3} \]
where,
T is a constant, which can be empirically set to 1.4, although other values can be used and
n is the number of remaining blocks in the current frame.
Equations (1)-(3) above are repeated until no outlier local motion vectors are detected or the number of remaining blocks is less than a predefined threshold number. For example, the threshold number can be 12, although other numbers can be used. In each iteration of equations (1)-(3), the detected outlier motion vectors and blocks associated with the outlier motion vectors are not considered. Instead, motion vectors associated with the remaining blocks are considered. After removing outlier local motion vectors from consideration, iterative least square solver block 208 solves equation (1) to determine motion parameters.
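The iteration described above might look like the following sketch. It rests on several assumptions that are not confirmed by the text: the similarity model is taken as x' = ax + by + c, y' = -bx + ay + d; the outlier test is taken to be "SEE greater than T times the mean SEE of the remaining blocks"; and the least-squares fit uses the closed-form solution for this model. The function names and the synthetic data are hypothetical.

```python
import math


def fit_similarity(pairs):
    """Closed-form least-squares fit of x' = a*x + b*y + c, y' = -b*x + a*y + d."""
    n = len(pairs)
    mx = sum(p[0][0] for p in pairs) / n
    my = sum(p[0][1] for p in pairs) / n
    mxp = sum(p[1][0] for p in pairs) / n
    myp = sum(p[1][1] for p in pairs) / n
    spread = num_a = num_b = 0.0
    for (x, y), (xp, yp) in pairs:
        xc, yc, xpc, ypc = x - mx, y - my, xp - mxp, yp - myp
        spread += xc * xc + yc * yc
        num_a += xc * xpc + yc * ypc
        num_b += yc * xpc - xc * ypc
    if spread == 0.0:
        raise ValueError("need at least two distinct block positions")
    a, b = num_a / spread, num_b / spread
    return a, b, mxp - a * mx - b * my, myp + b * mx - a * my


def squared_error(params, pair):
    """Squared estimation error (SEE) of one (x, y) -> (x', y') pair."""
    a, b, c, d = params
    (x, y), (xp, yp) = pair
    return (a * x + b * y + c - xp) ** 2 + (-b * x + a * y + d - yp) ** 2


def iterative_least_squares(pairs, T=1.4, min_blocks=12):
    """Refit while outliers are found and at least min_blocks pairs remain."""
    remaining = list(pairs)
    params = fit_similarity(remaining)
    while len(remaining) >= min_blocks:
        errors = [squared_error(params, p) for p in remaining]
        limit = T * sum(errors) / len(errors)  # assumed outlier threshold
        inliers = [p for p, e in zip(remaining, errors) if e <= limit]
        if len(inliers) == len(remaining):
            break  # no outlier detected in this pass
        remaining = inliers
        params = fit_similarity(remaining)
    return params, remaining


# Demo: 20 blocks follow one similarity motion; two are corrupted to mimic
# local motion vectors taken from a foreground object.
theta = 0.01
a0, b0, c0, d0 = math.cos(theta), math.sin(theta), 2.0, -1.0
pairs = [((x, y), (a0 * x + b0 * y + c0, -b0 * x + a0 * y + d0))
         for y in (16, 32, 48, 64) for x in (16, 32, 48, 64, 80)]
for idx in (0, 19):
    (x, y), (xp, yp) = pairs[idx]
    pairs[idx] = ((x, y), (xp + 30.0, yp - 25.0))
outliers = [pairs[0], pairs[19]]
params, kept = iterative_least_squares(pairs)
print(params)
```

In this synthetic case the first fit is biased by the two corrupted pairs, the first pass discards them, and the refit recovers the underlying parameters. On noise-free data later passes may keep pruning pairs whose error is only floating-point residue; that does not change the estimate.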
Motion up-scaling block 210 is to up-scale the translation motion parameters, c and d, according to the inverse of the down-sampling factor applied by down-sampling block 202. Because the down-sampling process does not affect the rotation and scaling motions between two frames, the parameters a and b need not be up-scaled.
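A short derivation shows why only c and d change. It assumes the similarity model x' = ax + by + c, y' = -bx + ay + d and writes s for the down-sampling factor; the subscript s on coordinates and parameters is notation introduced here to mark quantities measured on the down-scaled frame.

```latex
% Coordinates on the down-scaled frame are the original coordinates divided by s.
\[
x_s = \frac{x}{s}, \quad y_s = \frac{y}{s}, \quad x'_s = \frac{x'}{s}, \quad y'_s = \frac{y'}{s}
\]
% The model estimated on the down-scaled frame:
\[
x'_s = a\,x_s + b\,y_s + c_s, \qquad y'_s = -b\,x_s + a\,y_s + d_s
\]
% Multiplying both equations by s recovers the model on the original frame:
\[
x' = a\,x + b\,y + s\,c_s, \qquad y' = -b\,x + a\,y + s\,d_s
\]
% Hence a and b carry over unchanged, while c = s c_s and d = s d_s.
```

For example, with a down-sampling factor of 4, a translation of (5, 5) estimated on the down-scaled frame corresponds to (20, 20) on the original frame.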
Referring again to FIG. 1, trajectory computation block 104 is to determine a trajectory. For example, trajectory computation block 104 is to determine the motion trajectory of frame j, Tj, using the accumulated motion as defined in equation (4).
\[ T_j = \prod_{i=1}^{j} M_i \tag{4} \]
where,
Mj is the global motion matrix between frames j and j-1 and is based on dominant motion parameters (a, b, c, d). Dominant motion parameters (a, b, c, d) are for the current frame (referred to as frame j) in equation (4).
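Accumulating inter-frame motion into a trajectory can be sketched as below. Several details are assumptions made for the example rather than statements from the text: each M_j is written as a 3x3 homogeneous matrix built from (a, b, c, d), the trajectory starts from the identity, and each new matrix is multiplied on the left. The function names are hypothetical.

```python
import math


def motion_matrix(a, b, c, d):
    """Homogeneous form of x' = a*x + b*y + c, y' = -b*x + a*y + d (assumed layout)."""
    return [[a, b, c], [-b, a, d], [0.0, 0.0, 1.0]]


def matmul3(A, B):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def accumulate_trajectory(params_per_frame):
    """Return the accumulated motion after each frame, starting from identity."""
    T = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    trajectory = []
    for params in params_per_frame:
        T = matmul3(motion_matrix(*params), T)  # multiplication order is an assumption
        trajectory.append(T)
    return trajectory


# Three frames, each translated by (2, 1) relative to the previous one.
shifts = accumulate_trajectory([(1.0, 0.0, 2.0, 1.0)] * 3)
print(shifts[-1][0][2], shifts[-1][1][2])

# Two frames, each rotated by 0.1 rad relative to the previous one.
turns = accumulate_trajectory([(math.cos(0.1), math.sin(0.1), 0.0, 0.0)] * 2)
print(math.atan2(turns[-1][0][1], turns[-1][0][0]))
```

Translations add up and rotation angles add up along the trajectory, which is what makes slow intended camera motion and fast jitter separable by filtering.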
An inter-frame global motion vector includes camera intended motion and camera jitter motion. Trajectory smoothing block 106 is to reduce camera jitter motion from an inter-frame global motion vector. In various embodiments, trajectory smoothing block 106 is to reduce camera jitter motion by using motion trajectory smoothing. The low frequency component of the motion trajectory is recognized as the camera intended movement. After trajectory computation block 104 determines the motion trajectory of each frame, trajectory smoothing block 106 is to increase the smoothness of the motion trajectory using a low-pass filter, such as but not limited to a Gaussian filter. The Gaussian filter window can be set to 2n+1 frames. The filtering process introduces a delay of n frames. Experimental results show that n can be set to 5, although other values can be used. The smoother motion trajectory, T'j, can be determined using equation (5).
\[ T'_j = \sum_{k=-n}^{n} g(k)\, T_{j+k} \tag{5} \]
where g(k) is the Gaussian filter kernel. A Gaussian filter is a low-pass filter, \( g(k) = \frac{1}{\sqrt{2\pi\delta^2}} \exp\left(-\frac{k^2}{2\delta^2}\right) \). After specifying its variation value δ, the filter coefficients can be calculated. In some embodiments, the variation value is set to 1.5, but it can be set to other values. A larger variation value may produce a smoother motion trajectory.
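The smoothing step can be illustrated on a single scalar trajectory component, for example the accumulated horizontal translation. This is a sketch under assumptions: the kernel is the Gaussian described above with n = 5 and a variation value of 1.5, the weights are normalized to sum to one, and frames beyond either end of the sequence are replaced by the nearest available frame. Neither the normalization nor the edge handling is specified in the text, and the function names are hypothetical.

```python
import math


def gaussian_kernel(n=5, delta=1.5):
    """Coefficients g(k) for k = -n..n."""
    scale = 1.0 / math.sqrt(2.0 * math.pi * delta * delta)
    return [scale * math.exp(-(k * k) / (2.0 * delta * delta))
            for k in range(-n, n + 1)]


def smooth(series, n=5, delta=1.5):
    """Low-pass filter one trajectory component with a (2n+1)-tap Gaussian."""
    g = gaussian_kernel(n, delta)
    norm = sum(g)                      # normalization is an assumption
    last = len(series) - 1
    out = []
    for j in range(len(series)):
        acc = 0.0
        for k in range(-n, n + 1):
            idx = min(max(j + k, 0), last)  # clamp at the sequence ends (assumption)
            acc += g[k + n] * series[idx]
        out.append(acc / norm)
    return out


# A steady pan of one pixel per frame with +/-1 pixel of alternating jitter.
raw = [i + (1.0 if i % 2 else -1.0) for i in range(30)]
smoothed = smooth(raw)
jitter = [r - s for r, s in zip(raw, smoothed)]
print(round(smoothed[15], 2), round(jitter[15], 2))
```

Away from the ends of the sequence the smoothed value follows the steady pan closely, and the difference between the raw and smoothed values isolates the jitter component. Because the window looks n frames ahead, an online implementation would output frame j only after frame j+n has arrived, which is the n-frame delay noted above.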
Jitter compensation block 108 is to compensate jitter in the un-smoothed original trajectory. Camera jitter motion is the high frequency component of the trajectory. The high frequency component of the trajectory is the difference between the original trajectory and the smoothed trajectory. Jitter compensation block 108 is to compensate the high frequency component and provide a more stabilized current frame. For example, the more stabilized frame representation, frame F'(j), for the current frame may be obtained by warping current frame F(j) with the jitter motion parameters.
After performing trajectory smoothing for the j-th current frame F(j), the motion differences between T(j) and T'(j) (shown in equations 4 and 5) are regarded as jitter motions. Jitter motions can be represented by jitter motion parameters (a', b', c', d'). The following describes a manner to determine (a', b', c', d') from the difference between T(j) and T'(j). Suppose the jitter motion parameters of T(j) are (a1, b1, c1, d1) and the smoothed jitter motion parameters of T'(j) are (a2, b2, c2, d2). Setting θ1 = arctan(b1/a1) and θ2 = arctan(b2/a2), the jitter motion parameters are determined as follows:
a' = cos(θ1 - θ2), b' = sin(θ1 - θ2), c' = c1 - c2, d' = d1 - d2.
An exemplary warping process is as follows.
(1) For any pixel positioned at (x, y) in the more stabilized frame F'(j), the pixel value is denoted by F'(x,y,j).
(2) The corresponding position (x', y') in current frame F(j) is determined as x' = a'*x + b'*y + c', y' = -b'*x + a'*y + d'.
(3) If x' and y' are integers, set F'(x, y, j) = F(x', y', j). Otherwise, calculate F'(x, y, j) through bi-linear interpolation using the pixels in F(j) around the position (x', y').
(4) If (x', y') is outside the current frame F(j), set F'(x, y, j) to a black pixel.
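The jitter parameter computation and the warping steps above can be sketched as follows. The example is illustrative only: frames are lists of rows of grayscale values, black is represented as 0, `math.atan2` is used in place of arctan(b/a) (the two agree when a is positive), and the function names are hypothetical.

```python
import math


def jitter_params(p1, p2):
    """Jitter parameters (a', b', c', d') from the original trajectory parameters
    p1 = (a1, b1, c1, d1) and the smoothed ones p2 = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p1
    a2, b2, c2, d2 = p2
    t1, t2 = math.atan2(b1, a1), math.atan2(b2, a2)
    return (math.cos(t1 - t2), math.sin(t1 - t2), c1 - c2, d1 - d2)


def warp(frame, params):
    """Build the stabilized frame by sampling the current frame at
    x' = a'*x + b'*y + c', y' = -b'*x + a'*y + d' for every output pixel."""
    a, b, c, d = params
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]        # default: black pixel
    for y in range(h):
        for x in range(w):
            xs = a * x + b * y + c
            ys = -b * x + a * y + d
            if xs < 0 or ys < 0 or xs > w - 1 or ys > h - 1:
                continue                        # source position outside the frame
            x0, y0 = int(xs), int(ys)
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = xs - x0, ys - y0           # both zero at integer positions
            top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
            bottom = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bottom * fy
    return out


tiny = [[0, 10, 20], [30, 40, 50]]
print(warp(tiny, (1.0, 0.0, 1.0, 0.0)))   # shift by one pixel: last column turns black
print(warp(tiny, (1.0, 0.0, 0.5, 0.0)))   # half-pixel shift: neighbors are averaged
```

When the sampled position is an integer, the interpolation weights collapse to a direct copy, so steps (3) and (4) of the list are both covered by the same loop.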
FIG. 3 provides a flow diagram of a process to improve video stabilization, in accordance with an embodiment. Block 302 includes performing frame size down scaling. For example, techniques described with regard to down-sampling block 202 may be used to perform frame size down scaling.
Block 304 includes performing block motion search to identify two or more local motion vectors. For example, techniques described with regard to block motion search block 206 may be used to identify two or more local motion vectors.
Block 306 includes determining dominant motion parameters. For example, techniques described with regard to iterative least squares block 208 may be used to determine dominant motion parameters.
Block 308 includes up-scaling dominant motion parameters. For example, techniques described with regard to up-scaling block 210 may be used to up-scale dominant motion parameters.
Block 310 includes determining a trajectory. For example, techniques described with regard to trajectory computation block 104 may be used to determine a trajectory.
Block 312 includes improving trajectory smoothness. For example, techniques described with regard to trajectory smoothing block 106 may be used to perform trajectory smoothing.
Block 314 includes performing jitter compensation by warping a current frame to provide a more stable version of the current frame. For example, techniques described with regard to jitter compensation block 108 may be used to reduce jitter.
FIG. 4 depicts a block diagram of a shot boundary detection system, in accordance with an embodiment. In various embodiments, some results from inter-frame dominant motion estimation block 102 used by video stabilization system 100 are also used by shot boundary detection system 400. For example, the same information available from any of down-sampling block 202, reference buffer 204, and block motion search block 206 can be used in either or both of video stabilization and shot boundary detection. In some embodiments, shot boundary detection system 400 detects abrupt scene transition (i.e., a CUT scene). Shot boundary decision block 402 is to determine whether a frame is a scene change frame. For example, shot boundary decision block 402 may use a process described with regard to FIG. 5 to determine whether a current frame is a scene change frame.
FIG. 5 provides a process of a shot boundary decision scheme, in accordance with an embodiment. Blocks 502 and 504 are substantially similar to respective blocks 302 and 304.
Block 506 includes determining a mean sum of absolute differences (SAD) for the current frame. Note that the current frame is a down-scaled frame. For example, block 506 may include receiving a SAD for each macroblock in the current frame from block motion search block 206 and determining the mean of the SADs of all macroblocks in the current frame.
Block 508 includes determining whether the mean SAD is less than a threshold, T0. T0 can be empirically set to approximately 1600 for a 16x16 block, although other values can be used. If the mean SAD is less than the threshold, then the frame is not a shot-boundary frame. If the mean SAD is not less than the threshold, then block 510 follows block 508.
Block 510 includes determining a number of blocks with a SAD larger than threshold T1. Threshold T1 can be empirically set to 4 times the mean SAD, although other values can be used.
Block 512 includes determining whether the number of blocks with a SAD larger than threshold T1 is less than another threshold, T2. Threshold T2 can be empirically set to two thirds of the total number of target blocks in a frame, although other values of T2 can be used. If the number of blocks with a SAD larger than threshold T1 is less than the threshold T2, then the current frame is not considered a shot boundary frame. If the number of blocks is equal to or greater than the threshold T2, then the current frame is considered a shot boundary frame.
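The decision flow of blocks 506 to 512 can be sketched as a small function. The thresholds are passed in as parameters because the text allows other values. One point deserves care: if T1 were taken as 4 times the mean SAD of the same frame, fewer than one quarter of the blocks could exceed it, so the count could never reach two thirds of the blocks. The default used here, T1 = 4 x T0, is therefore an assumption made for the sketch and not something the text states. The function name and the sample SAD lists are hypothetical.

```python
def is_shot_boundary(block_sads, t0=1600.0, t1=None, t2_fraction=2.0 / 3.0):
    """Return True if the frame is considered a shot boundary frame.

    block_sads: minimum SAD of each target block in the down-scaled current frame.
    t0: threshold on the mean SAD (block 508).
    t1: per-block SAD threshold (block 510); defaults to 4 * t0 (assumption).
    t2_fraction: fraction of target blocks that must exceed t1 (block 512).
    """
    mean_sad = sum(block_sads) / len(block_sads)   # block 506
    if mean_sad < t0:                              # block 508
        return False
    if t1 is None:
        t1 = 4.0 * t0
    count = sum(1 for s in block_sads if s > t1)   # block 510
    t2 = t2_fraction * len(block_sads)
    return count >= t2                             # block 512


same_shot = [500] * 10                   # well-matched blocks, low mean SAD
hard_cut = [7000] * 8 + [2000] * 2       # most blocks match poorly
partial_change = [7000] * 5 + [2000] * 5
print(is_shot_boundary(same_shot), is_shot_boundary(hard_cut),
      is_shot_boundary(partial_change))
```

Because the per-block SADs come straight from the block motion search, this decision adds only a mean, a count, and two comparisons per frame on top of work already done for stabilization.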
FIG. 6 depicts a block diagram of a system that is to perform video stabilization and shot boundary detection, in accordance with an embodiment. In various embodiments, frame down-sampling and block motion search operations are implemented in hardware. The frame down-sampling and block motion search operations are shared by both video stabilization and shot boundary detection applications. In various embodiments, for video stabilization (VS), trajectory computation, trajectory smoothing, jitter motion determination, and jitter compensation operations are performed in software executed by a processor. In various embodiments, shot boundary detection (SBD) is performed in software executed by a processor, where the shot boundary detection uses results from the hardware-implemented frame down-sampling and block motion search operations. Other video or image processing techniques can make use of the results provided by down sampling or block motion search.
Processed images and video can be stored into any type of memory such as a transistor-based memory or magnetic memory.
The frame buffer may be a region in a memory. A memory can be implemented as a volatile memory device such as but not limited to a Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static RAM (SRAM), or other type of semiconductor-based memory or magnetic memory such as a magnetic storage device.
When designing a media processor with multiple video processing features, e.g., video encoding, de-interlacing, super-resolution, frame rate conversion, and so forth, hardware re-use can be a very efficient way to reduce cost and form factor. Various embodiments greatly reduce the complexity of implementing both video stabilization and video shot boundary detection features on the same media processor, especially when the media processor already supports the block motion estimation function.
The graphics and/or video processing techniques described herein may be implemented in various hardware architectures. For example, graphics and/or video functionality may be integrated within a chipset. Alternatively, a discrete graphics and/or video processor may be used. As still another embodiment, the graphics and/or video functions may be implemented by a general purpose processor, including a multi-core processor. In a further embodiment, the functions may be implemented in a consumer electronics device such as portable computers and mobile telephones with display devices capable of displaying still images or video. The consumer electronics devices may also include a network interface capable of connecting to any network such as the internet using any standards such as Ethernet (e.g., IEEE 802.3) or wireless standards (e.g., IEEE 802.11 or 802.16).
Embodiments of the present invention may be implemented as any or a combination of: one or more microchips or integrated circuits interconnected using a motherboard, hardwired logic, software stored by a memory device and executed by a microprocessor, firmware, an application specific integrated circuit (ASIC), and/or a field programmable gate array (FPGA). The term "logic" may include, by way of example, software or hardware and/or combinations of software and hardware.
Embodiments of the present invention may be provided, for example, as a computer program product which may include one or more machine-readable media having stored thereon machine-executable instructions that, when executed by one or more machines such as a computer, network of computers, or other electronic devices, may result in the one or more machines carrying out operations in accordance with embodiments of the present invention. A machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs (Compact Disc-Read Only Memories), and magneto-optical disks, ROMs (Read Only Memories), RAMs (Random Access Memories), EPROMs (Erasable Programmable Read Only Memories), EEPROMs (Electrically Erasable Programmable Read Only Memories), magnetic or optical cards, flash memory, or other type of media / machine-readable medium suitable for storing machine-executable instructions.
The drawings and the foregoing description gave examples of the present invention. Although depicted as a number of disparate functional items, those skilled in the art will appreciate that one or more of such elements may well be combined into single functional elements. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, orders of processes described herein may be changed and are not limited to the manner described herein. Moreover, the actions of any flow diagram need not be implemented in the order shown; nor do all of the acts necessarily need to be performed. Also, those acts that are not dependent on other acts may be performed in parallel with the other acts. The scope of the present invention, however, is by no means limited by these specific examples. Numerous variations, whether explicitly given in the specification or not, such as differences in structure, dimension, and use of material, are possible. The scope of the invention is at least as broad as given by the following claims.

Claims

What is claimed is:
1. An apparatus comprising:
an inter-frame dominant motion estimation logic to determine motion parameters of a current frame and to determine sum of absolute differences for blocks in the current frame;
a shot boundary decision logic to determine whether the current frame is a scene change frame based in part on the sum of absolute differences; and
a video stabilization block to provide a more stabilized version of the current frame sequence based in part on the motion parameters.
2. The apparatus of claim 1, wherein the inter-frame dominant motion estimation logic is implemented in hardware.
3. The apparatus of claim 1, wherein the inter-frame dominant motion estimation logic is to:
down-scale the current frame;
store the down-scaled current frame into a portion of a buffer;
determine sum of absolute differences between blocks in a reference frame and current frame;
determine inter-frame dominant motion parameters of the down-scaled frame; and
up-scale the inter-frame dominant motion parameters.
4. The apparatus of claim 3, wherein logic to determine inter-frame dominant motion parameters of the down-scaled frame is to:
identify a matched block in a reference frame in a search window with a lowest sum of absolute difference relative to a target block;
determine a local motion vector of the matched block;
determine coordinates of the matched block based in part on the local motion vector; and
apply a similarity motion model to determine the dominant motion parameters based in part on the coordinates of the matched block and coordinates of the target block.
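As an illustration only (not part of the claims), the block-matching steps recited in claim 4 can be sketched as follows. The sketch assumes frames are plain lists of rows of luma values and uses an exhaustive search; the names `sad` and `match_block`, the block size, and the search radius are hypothetical choices, not values taken from the specification.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between the size x size block of `cur`
    whose top-left corner is (cx, cy) and the block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def match_block(cur, ref, cx, cy, size=4, radius=2):
    """Exhaustively search `ref` within +/-radius pixels of (cx, cy).

    Returns (lowest_sad, (mvx, mvy)), where the local motion vector is the
    displacement from the target block to the matched block (assumed
    convention). Candidates that fall outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

Under this convention, the coordinates of the matched block are (cx + mvx, cy + mvy). The lowest SAD found for each target block is the same quantity that a shot boundary decision could reuse, so the search only has to run once per block.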
5. The apparatus of claim 4, wherein logic to determine inter-frame dominant motion parameters of the down-scaled frame is also to:
disregard any outlier local motion vector.
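As an illustration only (not part of the claims), the similarity-model fit of claim 4 combined with the outlier handling of claim 5 might look like the sketch below. It assumes the common four-parameter form x' = a*x - b*y + c, y' = b*x + a*y + d, a closed-form least-squares solution, and a simple median-based outlier rule; the function names, the tolerance, and the rejection rule itself are assumptions, and the specification may use a different formulation.

```python
from statistics import median


def reject_outliers(matches, tol=2.0):
    """Keep local motion vectors within `tol` pixels (per component) of the
    median vector. `matches` is a list of ((x, y), (mvx, mvy)) pairs.
    This rule is an assumption; it is crude under strong rotation or zoom."""
    med_x = median(mv[0] for _, mv in matches)
    med_y = median(mv[1] for _, mv in matches)
    return [
        (p, mv) for p, mv in matches
        if abs(mv[0] - med_x) <= tol and abs(mv[1] - med_y) <= tol
    ]


def fit_similarity(matches):
    """Least-squares fit of x' = a*x - b*y + c, y' = b*x + a*y + d.

    Each match maps target-block coordinates (x, y) to matched-block
    coordinates (x + mvx, y + mvy). Needs at least two distinct points.
    """
    n = len(matches)
    src = [p for p, _ in matches]
    dst = [(p[0] + mv[0], p[1] + mv[1]) for p, mv in matches]
    mx = sum(x for x, _ in src) / n
    my = sum(y for _, y in src) / n
    mxp = sum(x for x, _ in dst) / n
    myp = sum(y for _, y in dst) / n
    num_a = num_b = den = 0.0
    for (x, y), (xp, yp) in zip(src, dst):
        u, v = x - mx, y - my        # centered source point
        up, vp = xp - mxp, yp - myp  # centered destination point
        num_a += u * up + v * vp
        num_b += u * vp - v * up
        den += u * u + v * v
    a, b = num_a / den, num_b / den
    c = mxp - a * mx + b * my
    d = myp - b * mx - a * my
    return a, b, c, d
```

A caller would typically run `fit_similarity(reject_outliers(matches))`. Here `a` and `b` jointly encode zoom and rotation, while `c` and `d` are the translation.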
6. The apparatus of claim 1, wherein the video stabilization block comprises:
a trajectory computation block to determine a trajectory of the current frame based in part on the motion parameters;
a trajectory smoothing block to increase smoothness of the current frame's trajectory; and
a jitter compensation block to reduce jitter in the current frame's trajectory.
7. The apparatus of claim 6, wherein the video stabilization block is also to:
determine jitter motion parameters based in part on differences between the motion trajectory and the smoother trajectory; and
warp the current frame with the jitter motion parameters.
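As an illustration only (not part of the claims), the trajectory computation, smoothing, and jitter estimation of claims 6 and 7 can be sketched for the simplified case of translation-only motion. The centered moving average stands in for whatever smoothing filter an implementation actually uses, and the function names and the `radius` parameter are hypothetical. The final warp is omitted because its sign convention depends on how the motion parameters are defined.

```python
def accumulate(motions):
    """Cumulative trajectory from per-frame (dx, dy) inter-frame motion."""
    trajectory, x, y = [], 0.0, 0.0
    for dx, dy in motions:
        x, y = x + dx, y + dy
        trajectory.append((x, y))
    return trajectory


def smooth(trajectory, radius=2):
    """Centered moving average; the window is clipped at the sequence ends."""
    smoothed = []
    for i in range(len(trajectory)):
        window = trajectory[max(0, i - radius): i + radius + 1]
        smoothed.append((
            sum(p[0] for p in window) / len(window),
            sum(p[1] for p in window) / len(window),
        ))
    return smoothed


def jitter(trajectory, smoothed):
    """Per-frame jitter: raw trajectory minus smoothed trajectory."""
    return [(t[0] - s[0], t[1] - s[1]) for t, s in zip(trajectory, smoothed)]
```

Each frame would then be warped by the inverse of its jitter so that the output follows the smoothed trajectory. Because the window is clipped, the first and last few frames of a steady pan show a small nonzero jitter in this sketch.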
8. The apparatus of claim 1, wherein the shot boundary decision logic is to:
determine a mean of the sum of absolute differences for all blocks in the current frame; and
determine whether the current frame is a shot-boundary frame based in part on a number of blocks whose sum of absolute difference is greater than a threshold.
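As an illustration only (not part of the claims), the decision in claim 8 can be sketched from the per-block SAD values. The claim states only that the decision is based in part on the mean SAD and on the number of blocks above a threshold; the way the two tests are combined here, the function name, and all threshold parameters are assumptions.

```python
def is_shot_boundary(block_sads, sad_threshold, mean_threshold, ratio_threshold=0.5):
    """Flag a frame as a shot boundary when the mean block SAD is high and a
    large share of blocks individually exceed `sad_threshold`.

    `block_sads` holds the lowest SAD found for each block of the frame.
    The AND combination and every threshold are hypothetical choices.
    """
    count = len(block_sads)
    mean_sad = sum(block_sads) / count
    large_blocks = sum(1 for s in block_sads if s > sad_threshold)
    return mean_sad > mean_threshold and large_blocks / count > ratio_threshold
```

Counting blocks in addition to checking the mean helps separate a true scene change, where most blocks match poorly, from a frame in which a single large moving object inflates a few SAD values.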
9. A system comprising:
a hardware implemented inter-frame dominant motion estimator to determine motion parameters of a current frame and to determine sum of absolute differences for blocks in the current frame;
logic to determine whether the current frame is a scene change frame based in part on the sum of absolute differences;
logic to provide a more stabilized version of the current frame based in part on the motion parameters; and
a display to receive and display video.
10. The system of claim 9, wherein the logic to determine and logic to provide are stored on a computer-readable medium and executed by a processor.
11. The system of claim 9, wherein the inter-frame dominant motion estimator is to:
down-scale the current frame;
store the down-scaled current frame into a portion of a buffer;
determine sum of absolute differences between blocks in a reference frame and current frame;
determine inter-frame dominant motion parameters of the down-scaled frame; and
up-scale the inter-frame dominant motion parameters.
12. The system of claim 9, wherein the logic to provide a more stabilized version of the current frame comprises:
logic to determine a trajectory of the current frame based in part on the motion parameters;
logic to increase smoothness of the current frame's trajectory;
logic to reduce jitter in the current frame's trajectory;
logic to determine jitter motion parameters based in part on differences between the trajectory and the smoother trajectory; and
logic to warp the current frame with the jitter motion parameters.
13. The system of claim 9, wherein the logic to determine whether the current frame is a scene change frame based in part on the sum of absolute differences comprises:
logic to determine a mean of the sum of absolute differences for blocks in the current frame; and
logic to determine whether the current frame is a shot-boundary frame based in part on a number of blocks whose sum of absolute difference is greater than a threshold.
14. A computer-implemented method comprising:
receiving a current frame of a video;
down-scaling the current frame;
storing the down-scaled current frame into a portion of a buffer;
determining sum of absolute differences between blocks in a down-scaled reference frame and a target block in the down-scaled current frame;
determining inter-frame dominant motion parameters of the down-scaled current frame; and
performing at least one of video stabilization and shot boundary detection based in part on at least one of the motion parameters and the sum of absolute differences.
15. The method of claim 14, further comprising:
upscaling the inter-frame dominant motion parameters.
16. The method of claim 14, wherein determining inter-frame dominant motion parameters of the down-scaled current frame comprises:
identifying a matched block in a reference frame in a search window with a lowest sum of absolute difference relative to the target block;
determining a local motion vector of the matched block;
disregarding any outlier local motion vector;
determining coordinates of the matched block based in part on the local motion vector; and
applying a similarity motion model to determine the dominant motion parameters based in part on the coordinates of the matched block and coordinates of the target block.
17. The method of claim 14, wherein the performing video stabilization comprises:
determining a trajectory of the current frame based in part on the motion parameters;
increasing smoothness of the current frame's trajectory;
reducing jitter in the current frame's trajectory;
determining jitter motion parameters based in part on differences between the trajectory and the smoother motion trajectory; and
warping the current frame with the jitter motion parameters.
18. The method of claim 14, wherein the performing shot boundary detection comprises:
determining a mean of the sum of absolute differences for blocks in the current frame; and
determining whether the current frame is a shot-boundary frame based in part on a number of blocks whose sum of absolute difference is greater than a threshold.
19. The method of claim 14, wherein the down-scaling, storing, determining sum of absolute differences, and determining inter-frame dominant motion parameters are implemented in hardware.
20. The method of claim 14, wherein the performing at least one of video stabilization and shot boundary detection are implemented as processor-executed software instructions.
PCT/CN2009/000920 2009-08-12 2009-08-12 Techniques to perform video stabilization and detect video shot boundaries based on common processing elements WO2011017823A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN200980160949.5A CN102474568B (en) 2009-08-12 2009-08-12 Perform video stabilization based on co-treatment element and detect the technology of video shot boundary
JP2012524073A JP5435518B2 (en) 2009-08-12 2009-08-12 Apparatus, system, and method for performing video stabilization and video shot boundary detection based on common processing elements
EP09848153.4A EP2465254A4 (en) 2009-08-12 2009-08-12 Techniques to perform video stabilization and detect video shot boundaries based on common processing elements
KR1020127003602A KR101445009B1 (en) 2009-08-12 2009-08-12 Techniques to perform video stabilization and detect video shot boundaries based on common processing elements
PCT/CN2009/000920 WO2011017823A1 (en) 2009-08-12 2009-08-12 Techniques to perform video stabilization and detect video shot boundaries based on common processing elements

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2009/000920 WO2011017823A1 (en) 2009-08-12 2009-08-12 Techniques to perform video stabilization and detect video shot boundaries based on common processing elements

Publications (1)

Publication Number Publication Date
WO2011017823A1 (en) 2011-02-17

Family

ID=43585832

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/000920 WO2011017823A1 (en) 2009-08-12 2009-08-12 Techniques to perform video stabilization and detect video shot boundaries based on common processing elements

Country Status (5)

Country Link
EP (1) EP2465254A4 (en)
JP (1) JP5435518B2 (en)
KR (1) KR101445009B1 (en)
CN (1) CN102474568B (en)
WO (1) WO2011017823A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013109335A1 (en) * 2012-01-16 2013-07-25 Google Inc. Methods and systems for processing a video for stablization using dynamic crop
WO2015099816A1 (en) * 2012-11-13 2015-07-02 Intel Corporation Content adaptive dominant motion compensated prediction for next generation video coding
EP2798832A4 (en) * 2011-12-30 2016-02-24 Intel Corp Object detection using motion estimation
US9516309B2 (en) 2012-07-09 2016-12-06 Qualcomm Incorporated Adaptive difference domain spatial and temporal reference reconstruction and smoothing

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103310451B (en) * 2013-06-17 2016-12-28 中国传媒大学 Based on progressive two points and the Methods for Shot Boundary Detection of Video Sequences of adaptive threshold
CN103442161B (en) * 2013-08-20 2016-03-02 合肥工业大学 The video image stabilization method of Image estimation technology time empty based on 3D
TWI542201B (en) * 2013-12-26 2016-07-11 智原科技股份有限公司 Method and apparatus for reducing jitters of video frames
US10158802B2 (en) * 2014-09-19 2018-12-18 Intel Corporation Trajectory planning for video stabilization
CN114095659B (en) * 2021-11-29 2024-01-23 厦门美图之家科技有限公司 Video anti-shake method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007114796A1 (en) * 2006-04-05 2007-10-11 Agency For Science, Technology And Research Apparatus and method for analysing a video broadcast
CN101087413A (en) * 2006-06-07 2007-12-12 中兴通讯股份有限公司 Division method of motive object in video sequence
US20080170125A1 (en) 2005-01-18 2008-07-17 Shih-Hsuan Yang Method to Stabilize Digital Video Motion
CN101278551A (en) * 2005-09-30 2008-10-01 摩托罗拉公司 System and method for video stabilization
CN101383899A (en) * 2008-09-28 2009-03-11 北京航空航天大学 Video image stabilizing method for space based platform hovering
WO2009031751A1 (en) * 2007-09-05 2009-03-12 Electronics And Telecommunications Research Institute Video object extraction apparatus and method

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07115584A (en) * 1993-10-19 1995-05-02 Canon Inc Device for correcting image shake
US5614945A (en) * 1993-10-19 1997-03-25 Canon Kabushiki Kaisha Image processing system modifying image shake correction based on superimposed images
JP3755155B2 (en) * 1994-09-30 2006-03-15 ソニー株式会社 Image encoding device
JP2009505477A (en) * 2005-08-12 2009-02-05 エヌエックスピー ビー ヴィ Method and system for digital image stabilization
JP2007243335A (en) * 2006-03-06 2007-09-20 Fujifilm Corp Camera shake correction method, camera shake correction apparatus, and imaging apparatus
JP2007323458A (en) * 2006-06-02 2007-12-13 Sony Corp Image processor and image processing method
US8130845B2 (en) * 2006-11-02 2012-03-06 Seiko Epson Corporation Method and apparatus for estimating and compensating for jitter in digital video
US20080112630A1 (en) * 2006-11-09 2008-05-15 Oscar Nestares Digital video stabilization based on robust dominant motion estimation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080170125A1 (en) 2005-01-18 2008-07-17 Shih-Hsuan Yang Method to Stabilize Digital Video Motion
CN101278551A (en) * 2005-09-30 2008-10-01 摩托罗拉公司 System and method for video stabilization
WO2007114796A1 (en) * 2006-04-05 2007-10-11 Agency For Science, Technology And Research Apparatus and method for analysing a video broadcast
CN101087413A (en) * 2006-06-07 2007-12-12 中兴通讯股份有限公司 Division method of motive object in video sequence
WO2009031751A1 (en) * 2007-09-05 2009-03-12 Electronics And Telecommunications Research Institute Video object extraction apparatus and method
CN101383899A (en) * 2008-09-28 2009-03-11 北京航空航天大学 Video image stabilizing method for space based platform hovering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2465254A4

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2798832A4 (en) * 2011-12-30 2016-02-24 Intel Corp Object detection using motion estimation
US9525803B2 (en) 2011-12-30 2016-12-20 Intel Corporation Object detection using motion estimation
WO2013109335A1 (en) * 2012-01-16 2013-07-25 Google Inc. Methods and systems for processing a video for stablization using dynamic crop
US8810666B2 (en) 2012-01-16 2014-08-19 Google Inc. Methods and systems for processing a video for stabilization using dynamic crop
US9554043B2 (en) 2012-01-16 2017-01-24 Google Inc. Methods and systems for processing a video for stabilization using dynamic crop
US9516309B2 (en) 2012-07-09 2016-12-06 Qualcomm Incorporated Adaptive difference domain spatial and temporal reference reconstruction and smoothing
US9854259B2 (en) 2012-07-09 2017-12-26 Qualcomm Incorporated Smoothing of difference reference picture
WO2015099816A1 (en) * 2012-11-13 2015-07-02 Intel Corporation Content adaptive dominant motion compensated prediction for next generation video coding

Also Published As

Publication number Publication date
KR101445009B1 (en) 2014-09-26
JP5435518B2 (en) 2014-03-05
JP2013502101A (en) 2013-01-17
EP2465254A1 (en) 2012-06-20
CN102474568A (en) 2012-05-23
CN102474568B (en) 2015-07-29
EP2465254A4 (en) 2015-09-09
KR20120032560A (en) 2012-04-05

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200980160949.5

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09848153

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2012524073

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2009848153

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 20127003602

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE