CN102572277B - Image processing method, circuit, and camera - Google Patents

Image processing method, circuit, and camera

Info

Publication number
CN102572277B
CN102572277B CN201110436828.8A CN201110436828A
Authority
CN
China
Prior art keywords
tile
motion vector
feature point
vector
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110436828.8A
Other languages
Chinese (zh)
Other versions
CN102572277A (en)
Inventor
朴圣秀
M.布朗
E.S.K.刘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd
Publication of CN102572277A
Application granted
Publication of CN102572277B
Active legal status
Anticipated expiration


Abstract

A digital image stabilization (DIS) method includes: a feature point selection algorithm for selecting optimal feature points; an efficient tile-vector-based search algorithm for deriving the motion vectors of the selected feature points; and a feature point motion vector grouping/comparison process that pairs and groups the selected feature points based on the ratio of their vector magnitudes and the difference of their vector angles. A digital image stabilization method includes: scoring each of a plurality of transforms of tile motion vector (tile MV) groups and a plurality of transforms of feature point motion vector (FP MV) groups, and selecting the principal transform that represents the stationary/background objects in the scene of the video frame; and excluding large moving objects based on the history of the stationary (background) group and the histories of the plurality of motion vector groups.

Description

Image processing method, circuit, and camera
Technical field
The present inventive concept relates to digital image stabilization (DIS), and more particularly to methods of detecting, selecting, and grouping feature points for digital image stabilization.
Background art
Digital still cameras, digital video cameras, and handheld devices that include such cameras are often used to capture images or video while the camera is operated in the hands of a human operator. The camera may therefore shake or jitter in the operator's hands while capturing the image or video. The jitter may include a horizontal component, a vertical component, and a rotational component. Rotation may be about an axis perpendicular to the focal plane of the image-capturing circuit, about an axis parallel to that focal plane, or about an axis oblique between the perpendicular and parallel axes. Jitter may make the hand-captured video distracting or disorienting to the viewer, and it is therefore desirable to use digital circuits to digitally estimate the camera trajectory (i.e., to detect the jitter between each pair of successive frames) and to filter the jitter out of the sequence of video frames of the same scene. The circuits for estimating the camera trajectory between successive video frames and for filtering out the jitter caused by that trajectory may be contained within the video camera itself and activated to remove the jitter in real time before the captured video frames are stored (e.g., before or during MPEG encoding, if the camera includes a real-time MPEG encoder). Alternatively, the circuit for estimating the camera trajectory between successive video frames and for filtering the jitter from a stored sequence of video frames may be a general-purpose microcomputer controlled by software implementing a digital image stabilization (DIS) method, or it may be dedicated hardware, such as an MPEG video encoder embodied in an ASIC (application-specific integrated circuit) optimized to perform the DIS method.
Video produced by a steady camera, whether fixed or moving, contains mainly smooth motion (translation, rotation) in the captured video. An unsteady camera, on the other hand, produces video with high-frequency jitter (translational and/or rotational) throughout the video images.
Digital image sequences captured by physical imaging devices often exhibit unwanted high-frequency jitter motion. The amount of jitter present in an image sequence depends on the physics of the image capture device relative to the objects in the captured scene. The depth of the scene, combined with the instability of the imager's support (depending on its weight, inertia, and balance), produces the undesired jittery global motion.
A digital image stabilization (DIS) system first estimates the unwanted (unintended) motion and then applies corrections to the image sequence. The visual effect of the stabilized video depends heavily on the quality of the camera trajectory estimate. A digital image stabilization (DIS) algorithm uses well-tracked feature points to estimate the jitter motion between two consecutive frames. Digital video stabilization employs hardware and/or software methods to produce spatially stabilized video from unsteady video containing the undesired jitter caused by an unsteady camera. In traditional DIS techniques, camera motion is detected by analyzing the motion vectors of various points in the scene. Motion vectors, however, can be caused by object motion as well as by camera motion.
Functions exist that provide, for each pixel of a frame, a numerical score indicating how suitable that point is as a feature point detectable in temporally adjacent frames. One example of such a function is the Harris corner detector. The magnitudes of the scores, however, differ greatly in different parts of the image. A DIS method may compare each pixel's score against a global threshold, which does not necessarily yield an optimal distribution of feature points. There may thus be too few feature points in low-contrast regions (e.g., a cloudless blue sky yields sparse or no feature points), while in regions with many structures the feature points may crowd too closely together. An improper distribution of feature points may increase the computational burden of calculating redundant motion vectors for feature points that are too close together, and it may fail to provide accurate motion vectors.
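To make the corner-score idea concrete, here is a minimal NumPy sketch (not from the patent) of a Harris-style response computed per pixel; the window size, the constant k, and the synthetic test frame are all illustrative assumptions. Applying one global threshold to `r` keeps points only near the bright square's corners while the flat region scores zero everywhere, illustrating the uneven distribution described above.

```python
import numpy as np

def harris_response(img, k=0.04):
    """Per-pixel Harris-style corner score; higher = more corner-like.
    img: 2-D float array of luminance values."""
    # Image gradients via central differences
    iy, ix = np.gradient(img.astype(float))
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy

    # 3x3 box sums of the structure-tensor entries
    def box(a):
        p = np.pad(a, 1, mode="edge")
        return sum(p[r:r + a.shape[0], c:c + a.shape[1]]
                   for r in range(3) for c in range(3))

    sxx, syy, sxy = box(ixx), box(iyy), box(ixy)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace  # R = det(M) - k*trace(M)^2

# A synthetic frame: flat "sky" (low contrast) around a bright square
frame = np.zeros((32, 32))
frame[8:24, 8:24] = 255.0
r = harris_response(frame)
```

In the flat region the response is exactly zero, so any positive global threshold selects no feature points there, which is the behavior the paragraph above identifies as problematic.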
In embodiments of a digital image stabilization (DIS) method, it is desirable to minimize the computational cost in order to reduce the power consumption of the circuit and the time needed to perform the DIS method. It is also desirable to detect and measure the camera trajectory and characterize the jitter accurately, so that the jitter can be properly compensated and correctly removed from the stored/displayed video.
In mathematics, affine geometry is the study of geometric properties that remain unchanged by affine transformations (i.e., non-singular linear transformations plus translations). A group of mathematical equations defined by numerical coefficients, called an affine matrix, has been developed to characterize the translational (up/down), rotational, and scaling (e.g., zoom in or zoom out) motion detected between each pair of successive frames or between portions of them (e.g., moving objects in the frames).
Thus, jitter may be characterized by a first affine transformation matrix, called the principal transform or global transform, relating any actually stationary objects in the scene (e.g., rocks, tables, parked cars, mountains, the sun), while any moving objects in the frame (e.g., birds, people, balls, moving cars) may be characterized by additional affine matrices.
The principal transform (principal inter-frame transform), indicating the camera motion possibly caused by the user's hand shake, can be computed by detecting one or more points of interest (called "feature points") associated with the actually stationary objects in each frame captured at time t, then searching for the same feature points in the temporally adjacent frame (t+1) and computing a motion vector for each feature point. The multiple motion vectors associated with (grouped together with) a particular object are then used to compute the affine transform of that object, which defines its detected motion according to the affine equations:
x' = sx*x + ry*y + tx
y' = rx*x + sy*y + ty
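As a quick worked example of the six-parameter affine equations above, the following sketch (illustrative only; the function name and parameter values are not from the patent) applies the transform to a point. With identity scaling (sx = sy = 1), no rotation terms (rx = ry = 0), and translation (tx, ty) = (2, -1), the point (10, 5) maps to (12, 4):

```python
# Six-parameter affine transform: x' = sx*x + ry*y + tx, y' = rx*x + sy*y + ty
def affine_apply(p, sx, ry, tx, rx, sy, ty):
    x, y = p
    return (sx * x + ry * y + tx, rx * x + sy * y + ty)

# Pure translation by (2, -1): identity scaling, no rotation terms
result = affine_apply((10, 5), 1.0, 0.0, 2.0, 0.0, 1.0, -1.0)
```

In the DIS setting, the six coefficients of the principal transform are estimated from the grouped motion vectors of stationary objects rather than chosen by hand.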
Various search methods used in the field of video compression may be used to calculate the motion vectors of feature points between successive frames. Such search methods may use mathematical comparisons of macroblocks in two temporally adjacent frames, such as the sum of absolute differences (SAD), the mean absolute difference (MAD), or the mean squared error (MSE) (e.g., searching for the position of a feature point in the reference frame (t+1) by comparing the 8x8-pixel macroblock containing the feature point in the current frame against multiple 8x8-pixel macroblocks within a search area in the reference frame (t+1) centered on the feature point's position). The measured amount and direction of displacement of the macroblock centered on the feature point between the temporally adjacent frames (t and t+1) is called the "motion vector" of the feature point. Prior-art motion vector estimation methods using block matching algorithms (BMA) over various search ranges are described in U.S. Patents 6,895,361 (inventor Yang) and 7,680,186 (inventor Lee), which are incorporated herein by reference.
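The SAD full-search block matching described above can be sketched as follows; this is a generic illustration, not the patented search (block size, search range, and the synthetic frames are assumptions). The reference frame is the current frame shifted down by 2 and right by 1 pixel, so the recovered motion vector is (2, 1):

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-size blocks."""
    return int(np.abs(a.astype(int) - b.astype(int)).sum())

def block_match(cur, ref, top, left, bs=8, rng=4):
    """Full search: find the displacement of the bs x bs block at
    (top, left) in `cur` within a +/- rng window of `ref`.
    Returns the (dy, dx) motion vector with minimum SAD."""
    block = cur[top:top + bs, left:left + bs]
    best, mv = None, (0, 0)
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + bs > ref.shape[0] or c + bs > ref.shape[1]:
                continue  # keep the reference block inside the frame
            cost = sad(block, ref[r:r + bs, c:c + bs])
            if best is None or cost < best:
                best, mv = cost, (dy, dx)
    return mv

# Reference frame = current frame shifted down 2 and right 1
cur = np.arange(32 * 32).reshape(32, 32) % 251
ref = np.roll(np.roll(cur, 2, axis=0), 1, axis=1)
mv = block_match(cur, ref, 12, 12)
```

The boundary check mirrors the patent's restriction for border tiles, where the search is limited so that no reference block falls (partially) outside the available search area.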
A video stabilization algorithm can remove the jitter motion while preserving the camera motion that the user intended. In general, jitter is caused by hand shake and platform vibration, which are faster (i.e., higher frequency) and nonlinear, whereas intended camera motion is slower and linear or monotonic. The global motion (camera trajectory) vector is contained in the affine transformation parameters of the compensation transform P(n), which is estimated between adjacent frames using matched feature points.
The compensation transform P(n) for compensating camera jitter may be characterized as the first affine transformation matrix related to any actually stationary objects in the scene (e.g., rocks, tables, parked cars, mountains, the sun). In almost all cases, hand shake and platform vibration may cause translation, rotation, and scaling of the video frame. To model all of these, a six-parameter affine transform is needed.
Even when the compensation transform P(n) is correctly generated to compensate for the unintended jitter motion, the resulting compensated frame may still have significant oscillating movement relative to the captured input video frames, and it may extend beyond the image data available in some of the captured input video frames. This causes over-excursion of the compensation window.
To remove the jitter motion in the video, the compensation unit crops off some boundary regions of each input video frame. The amount of boundary region removed can be quantified as a cropping ratio; a larger cropping ratio means that more of the boundary region is removed. The output video frame can be modeled as a compensation window superimposed on the input video frame (see, e.g., Fig. 1). The compensation window may be rotated, shifted, scaled, etc. relative to the input video frame.
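The window-on-frame model above lends itself to a simple over-excursion check. The sketch below is a toy illustration (not the patent's mechanism): it assumes the compensation window is given as four corner points in input-frame pixel coordinates and reports how far the worst corner lies outside the frame boundary.

```python
def over_excursion(win, frame_w, frame_h):
    """Return the largest distance, in pixels, by which any corner of the
    compensation window `win` (a list of (x, y) corners) falls outside the
    input frame of size frame_w x frame_h; 0.0 means no over-excursion."""
    worst = 0.0
    for x, y in win:
        dx = max(0.0, -x, x - frame_w)   # excess beyond left or right edge
        dy = max(0.0, -y, y - frame_h)   # excess beyond top or bottom edge
        worst = max(worst, dx, dy)
    return worst

# A window shifted 30 px past the right edge of a 1920x1080 input frame
win = [(30.0, 10.0), (1950.0, 10.0), (1950.0, 1070.0), (30.0, 1070.0)]
excess = over_excursion(win, 1920.0, 1080.0)
```

A stabilizer can use such a measure per frame to build the excursion history that later drives the adaptive filtering described in the summary.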
For a given cropping ratio, the amount of movement of the compensation window is called the compensation window excursion. Movement of the compensation window beyond the input video frame boundary is called compensation window over-excursion.
If there is no jitter (no unintended camera trajectory), the compensation transform P(n) (based on the feature points of actually stationary objects) is expected to remain at the same position (e.g., UNITY) in each of two or more successive frames. If there is high-frequency jitter, it is desirable to produce stabilized video in which the degree or frequency of compensation window over-excursion is reduced.
Accordingly, a filtering method is needed that adaptively balances between insufficient video stabilization and over-excursion.
Summary of the invention
One aspect of the inventive concept provides a highly efficient process, for DIS purposes, of identifying feature points and accurately deriving the motion vectors of those feature points that move in a consistent manner due to global motion or camera motion.
A desirable feature point for a DIS algorithm is one that yields a single-valued motion vector when a suitable motion estimation algorithm is applied. To identify the feature points in an image, a Harris corner detector applied to the pixels of the video frame estimates how well suited each pixel is as a feature point. Different regions of the image have different densities of identified feature point candidates. The disclosed method of raster-scan-order selection and culling, based on small regions of the video frame called tiles, provides a final feature point distribution in which the maximum number of feature points per tile increases linearly with the variance σ² of the luminance image data of the tile.
Each video frame is divided into a small number j x k of tiles. The number of tiles j x k may range from 4x4 for standard-definition (SD) video to 6x6 or more for high-definition (HD) video; other numbers in the range (4..8) x (4..8) are also possible and may be useful. The tile size is chosen so that a sufficiently large, independently moving object covers the majority of at least one tile, so that its motion can be captured for DIS purposes while the motion of small objects is ignored.
Tiles containing more interesting image data, and therefore expected to contain more feature points, will have a higher variance σ². The feature point selection algorithm enforces a programmable minimum distance between feature points while requiring only minimal hardware memory.
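The variance-proportional budgeting just described can be sketched as follows. This is a hypothetical illustration, not the patented selection algorithm: the total budget, the minimum of one point per tile, and the rounding rule are all assumptions. A frame whose top-right quadrant contains texture (high σ²) while the rest is flat receives nearly the whole budget in that one tile:

```python
import numpy as np

def tile_budgets(frame, j, k, total_pts=64, min_per_tile=1):
    """Distribute a feature-point budget over j x k tiles in proportion
    to each tile's luminance variance (sigma^2)."""
    h, w = frame.shape
    th, tw = h // j, w // k
    var = np.array([[frame[r * th:(r + 1) * th, c * tw:(c + 1) * tw].var()
                     for c in range(k)] for r in range(j)])
    if var.sum() > 0:
        weights = var / var.sum()
    else:
        weights = np.full((j, k), 1.0 / (j * k))  # flat frame: spread evenly
    return np.maximum(min_per_tile, np.round(weights * total_pts)).astype(int)

# Flat frame except for a noisy (high-variance) top-right quadrant
frame = np.zeros((64, 64))
frame[0:32, 32:64] = np.random.default_rng(0).integers(0, 255, (32, 32))
b = tile_budgets(frame, 2, 2)
```

The per-tile minimum keeps some coverage even in low-contrast tiles, addressing the "too few feature points in flat regions" problem raised in the background section.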
A hierarchical motion estimation algorithm may be used to estimate the movement of feature points from frame to frame, in which the programmable motion range for the final search level is deliberately small, thereby favoring the movement of large objects or global movement over local movement. This reduces the number of operations required while yielding results accurate enough for digital image stabilization applications.
For each feature point selected by the selection algorithm, its motion vector is determined by block matching within a small range around a set of start vectors. The start vectors are the tile motion vectors of the tile containing the current feature point and of the surrounding tiles (e.g., above, below, left, and right). Tile motion estimation is the first step of the process of deriving the motion vectors of the feature points. It is performed on non-overlapping tiles covering the core of the input image (e.g., the same tiles used in the feature point selection algorithm). For the motion estimation of each tile, a full-search block matching is performed on a downsampled image.
The current frame is downsampled by a second downsampling factor fs2 of four to eight for standard-definition (SD) video or eight to sixteen for high-definition (HD) video. In this downsampled domain, full-search block matching is performed for every tile, and the tile vector is stored for later use (e.g., as a start vector for deriving the motion vectors of the feature points). One motion vector is estimated for each tile: a full search is carried out at the lowest resolution, over luminance data downsampled by the second downsampling factor fs2, and the motion vector candidate producing the lowest SAD is assigned to each tile. According to an embodiment, for the border tiles, the search may be restricted to the available search area, so that no motion vector is generated whose reference block lies (partially) outside the search area. Relative to the resolution used, the tile motion search produces vectors of half-pixel accuracy: the search area is upsampled by simple bilinear interpolation. This uses very little local memory, saving memory and logic area in a VLSI implementation.
One aspect of the inventive concept provides a digital image stabilization (DIS) method including a feature point motion vector grouping process in which pairs of feature points are grouped based on the ratio of their motion vector magnitudes and the angular difference between their motion vectors. A method of processing video data is provided, including: receiving first image data representing a first frame; identifying a plurality of feature points in the first frame; receiving second image data representing a second frame; deriving a motion vector corresponding to each feature point; selecting a first one of the motion vectors as the current vector A and a second one of the motion vectors as the current vector B; and comparing vector A and vector B based on the vector magnitude ratio of vector A and vector B and their angular difference.
The method may further include: setting a magnitude-ratio threshold and an angular-difference threshold; and grouping vector A and vector B together if their vector magnitude ratio falls within the magnitude-ratio threshold and their angular difference falls within the angular-difference threshold. According to another aspect, vector A and vector B are not grouped together if their vector magnitude ratio falls outside the magnitude-ratio threshold or if their angular difference falls outside the angular-difference threshold.
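The two-threshold pairing test can be sketched directly. This is an illustrative version with assumed threshold values (the patent later uses a normalized-vector-difference approximation of the angle test; here the angle is computed exactly via the dot product):

```python
import math

def same_group(a, b, ratio_thr=0.8, angle_thr_deg=15.0):
    """Pair two motion vectors a and b when their magnitude ratio
    (smaller/larger) is at least ratio_thr and the angle between
    them is at most angle_thr_deg."""
    ma = math.hypot(*a)
    mb = math.hypot(*b)
    if ma == 0.0 or mb == 0.0:
        return ma == mb  # two zero vectors trivially agree
    ratio = min(ma, mb) / max(ma, mb)
    # Angle between the vectors via the dot product
    cosang = (a[0] * b[0] + a[1] * b[1]) / (ma * mb)
    ang = math.degrees(math.acos(max(-1.0, min(1.0, cosang))))
    return ratio >= ratio_thr and ang <= angle_thr_deg

similar = same_group((10.0, 0.0), (9.5, 1.0))     # near-parallel, near-equal
dissimilar = same_group((10.0, 0.0), (0.0, 10.0))  # 90 degrees apart
```

Feature points on the same rigid (stationary) object pass this test under jitter-dominated motion, which is why the grouping isolates the background group used for the principal transform.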
According to an embodiment of the inventive concept, a video processing circuit is provided, including: a feature point circuit configured to identify a plurality of feature points in a first frame and to derive, for each feature point, a motion vector between the first frame and a second frame; a matching controller configured to select one of the motion vectors as the current vector A (xa, ya) and a different one of the motion vectors as the current vector B (xb, yb); a magnitude-ratio comparator configured to compare vector A and vector B based on their magnitude ratio; and a vector-angle comparator configured to compare vector A and vector B based on their angular difference.
According to an exemplary embodiment, a method of processing video data is provided, including: estimating a motion vector for each feature point in a first frame of the video data; grouping the motion vectors into motion vector groups based on vector magnitude ratio and angular difference; and selecting the group of motion vectors that represents the movement of the stationary objects in the scene of the first frame. The method may further include: estimating a motion vector for each of a plurality of tiles into which a portion of the first frame is divided, using downsampled luminance data, and selecting the tile motion vector candidate with the lowest sum of absolute differences (SAD); and grouping the tile motion vectors into tile motion vector groups based on vector magnitude ratio and angular difference.
According to an embodiment of the inventive concept, a camera is provided, including: an image capture circuit configured to capture images and to convert the images into a first frame and a second frame of image data; and a video processing circuit chip including a feature point circuit configured to identify a plurality of feature points in the first frame and to derive, for each feature point, a motion vector between the first frame and the second frame; a matching controller configured to select each pair of motion vectors from among the motion vectors of the feature points; a magnitude-ratio comparator configured to compare each pair of motion vectors based on their vector magnitude ratio; and a vector-angle comparator configured to compare each pair of motion vectors based on their vector angle difference.
One aspect of the inventive concept provides a digital image stabilization method including adaptively filtering the principal/compensation transform P(n), which represents the stationary/background objects in the scene of the video frame, based on the history of compensation window over-excursions.
One aspect of the inventive concept provides an efficient and predictable jitter-removal method using a strong compensation (SC) filter. The SC filter is a highly frequency-selective, high-order, linear time-invariant digital filter. Effective filtering of very jittery input video with the SC filter implies significant movement of the compensation window within the captured input video frames. For a given cropping ratio, the amount of movement of the compensation window is called the compensation window excursion. Movement of the compensation window beyond the boundary of the captured input video frame is called compensation window over-excursion. For input video with large movements, the strict application of the SC filter produces highly stable output video at the cost of a large amount of compensation window over-excursion. A weak compensation (WC) filter with a less frequency-selective characteristic, on the other hand, produces less compensation window over-excursion at the cost of less stable output video.
One aspect of the inventive concept provides an adaptive compensation (AC) filter configured to prevent excessive over-excursion for input video with large movements while maintaining excellent video stabilization characteristics.
In an exemplary embodiment of the inventive concept, a causal linear time-invariant filter comprising the WC filter complements the SC filter in order to produce predictable characteristics. The combined WC/SC filter may be controlled based on the compensation window excursion history over K frames. Small excursions in the history permit a larger influence of the SC filter on the current frame n, while large excursions in the history permit a larger influence of the WC filter on the current frame n. Moderate excursions in the history apportion proportional influence between the SC filter and the WC filter.
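The history-controlled blend can be illustrated with a toy sketch; the linear mixing rule and the normalization by a maximum excursion are assumptions for illustration, not the patented control law. Small past excursions drive the output toward the strong compensation (SC) estimate, large ones toward the weak compensation (WC) estimate:

```python
def adaptive_blend(sc_out, wc_out, excursions, max_excursion):
    """Blend SC and WC filter outputs by the recent excursion history:
    small past excursions favor SC, large ones favor WC."""
    avg = sum(excursions) / len(excursions)
    w_wc = min(1.0, avg / max_excursion)  # 0 -> pure SC, 1 -> pure WC
    return (1.0 - w_wc) * sc_out + w_wc * wc_out

# Calm history: the output follows the strongly stabilized estimate
calm = adaptive_blend(1.0, 5.0, [0.0, 0.0, 0.0], 10.0)
# Large past excursions: the output follows the weak filter
busy = adaptive_blend(1.0, 5.0, [12.0, 11.0, 13.0], 10.0)
```

Intermediate histories yield a proportional mix, matching the "moderate excursions apportion proportional influence" behavior described above.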
Another aspect of the invention provides a digital image stabilization circuit adapted to perform the DIS methods disclosed herein. The circuit may be contained within the video camera itself and activated to remove jitter in real time before the captured video frames are stored (e.g., before or during MPEG encoding, if the camera includes a real-time MPEG encoder). Alternatively, the DIS circuit for estimating the camera trajectory between successive video frames and filtering the jitter from the stored sequence of video frames may be a general-purpose microcomputer controlled by software employing the digital image stabilization (DIS) method, or it may be dedicated hardware, such as an MPEG video encoder implemented in an ASIC (application-specific integrated circuit) optimized to perform the DIS method.
Exemplary embodiments of the inventive concept are described more fully below with reference to the accompanying drawings. The inventive concept may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the inventive concept to those skilled in the art. Throughout the drawings, like numerals refer to like elements.
Accompanying drawing explanation
The accompanying drawings are included to provide a further understanding of the inventive concept, and are incorporated in and constitute a part of this specification. The drawings illustrate exemplary embodiments of the inventive concept and, together with the description, serve to explain principles of the invention. In the drawings:
Figs. 1A to 1F are views of a current frame and a reference frame, with their selected feature points and motion vectors, for illustrating the steps of a digital image stabilization method according to an exemplary embodiment of the inventive concept;
Fig. 2A is a diagram of the captured frame of Fig. 1E, containing the current frame Ft of Fig. 1A and divided into j x k tiles in a border region and a core region, for illustrating a step of the DIS method of Figs. 1A to 1F;
Fig. 2B is a diagram of one tile in the core region of the image frame of Fig. 2A, illustrating selected, rejected, and cancelled feature points, for illustrating a step of the DIS method of Figs. 1A to 1F;
Fig. 3 is a diagram of circuit blocks configured to perform a DIS process according to an embodiment of the inventive concept;
Figs. 4A and 4B are flow charts of a method of identifying and selecting a plurality of feature points in each tile of the image frame of Figs. 1A and 2A, performing a step of the DIS method of Figs. 1A to 1F;
Fig. 5 is a view of the downsampled current frame Ft of Fig. 1A with its tile motion vectors superimposed, for illustrating the motion vector calculation step of the DIS method of Figs. 1A to 1F;
Fig. 6 is a view of a portion of a tile in the downsampled frame of Fig. 5, illustrating block matching that uses the tile motion vectors of Fig. 5 as search start vectors for calculating the motion vectors of the selected feature points used in the DIS method of Figs. 1A to 1F;
Fig. 7 is a flow chart of a method of calculating the motion vectors of the feature points selected in the image frame of Figs. 1A and 2A, performing a step of the DIS method of Figs. 1A to 1F;
Fig. 8A is a diagram of the motion vectors of two feature points of the same actually stationary object at the same distance from the camera, where the camera has only translational motion and no rotational component;
Fig. 8B is a diagram of the motion vectors of two feature points of an actually stationary object at the same distance from the camera, where the camera motion has a rotational component;
Fig. 8C is a diagram of the motion vectors of two feature points of the same actually stationary object at different distances from the camera, where the camera has only translational motion and no rotational component;
Figs. 9A and 9B are diagrams of two pairs of motion vectors of feature points of actually stationary objects in a video scene, illustrating that each pair can have the same vector difference even when the directions and magnitudes of all four motion vectors differ;
Fig. 10 shows three vector diagrams illustrating the calculation of the normalized vector difference, an indirect measure of the angular difference used for grouping feature points in the DIS method of Figs. 1A to 1F;
Fig. 11 is a flow chart of a grouping algorithm that uses the normalized vector differences of Figs. 10A, 10B, and 10C to indirectly measure the angular difference between the motion vectors of the feature points selected in the image frame of Figs. 1A and 2A, performing the vector grouping step of Fig. 1D in the DIS method of Figs. 1A to 1F;
Fig. 12 is a graph of the magnitude |(a-b)| of the normalized vector difference (a-b) and of the magnitude ratio |b| as functions of the angular difference θ, illustrating the availability of approximations for use in steps of the grouping algorithm of Fig. 11;
Fig. 13 is a block diagram of a feature point grouping circuit including a grouping algorithm circuit 1310 configured to perform the feature point grouping algorithm of Fig. 11;
Fig. 14 is a block diagram of a digital image stabilization (DIS) circuit performing a digital image stabilization (DIS) method according to an exemplary embodiment of the inventive concept;
Fig. 15 is a block diagram of a detection unit in the DIS circuit of Fig. 14 adapted to calculate the affine transforms of tile vector groups;
Fig. 16 is a block diagram of a trajectory unit (TU) in the DIS circuit of Fig. 14 adapted to score the tile group transforms and feature group transforms Ti(n) and to select the principal (stationary/background) transform P(n);
Fig. 17A is a block diagram of an exemplary implementation of a group transform scoring and selection circuit configured to perform a step of the DIS method of the DIS circuit of Fig. 1;
Fig. 17B is a block diagram of an exemplary implementation of a history scoring unit;
Fig. 18 is a block diagram of an exemplary implementation of a collective transform scoring and selection circuit;
Fig. 19 is a block diagram illustrating an exemplary embodiment of a moving object exclusion circuit;
Fig. 20 is a flow chart illustrating process steps according to an embodiment of the inventive concept;
Fig. 21 is a view of a captured video frame and a compensation window computed within it, in a step of a digital image stabilization (DIS) method according to an exemplary embodiment of the inventive concept;
Fig. 22 is a block diagram of a digital image stabilization (DIS) module that performs DIS;
Fig. 23 is a block diagram of an adaptive compensation filtering method according to an exemplary embodiment of the inventive concept; and
Fig. 24 is a schematic block diagram of an adaptive compensation filter according to the inventive concept.
Detailed description of the invention
Figs. 1A to 1F are views of a current frame and a reference frame, with their selected feature points and motion vectors, for illustrating the steps of a digital image stabilization method according to an exemplary embodiment of the inventive concept.
Fig. 1A shows two successive video frames of a scene: a current frame Ft and a reference frame Ft+1. The scene includes stationary objects, such as a hillside (in the foreground), a power pole, mountains, and the sun, and a moving object, such as the bird at the upper left. The current frame Ft and the reference frame Ft+1 are each a contiguous portion of a larger-area captured frame (see Fig. 1E). The larger captured frame is the raw image captured by the image sensor prior to digital image stabilization (DIS). The camera trajectory caused by jitter motion causes the reference frame Ft+1 to be rotated and translated relative to the current frame Ft. The size of the captured frame (see Fig. 1E) is typically predetermined by the hardware size of the camera's physical image sensor (not shown). The sizes of the current frame Ft and the reference frame Ft+1 may be dynamically selected to avoid or minimize "over-excursion," in which the current frame Ft extends beyond the boundary of the captured frame due to the jitter motion of the camera.
Figure 1B illustrates and fixes object with the reality in scene and present frame F that Moving Objects associatestIn the characteristic point (circular) of multiple selections.By present frame FtBeing divided into multiple rectangular tile, each tile includes at least one characteristic point selected.By performing the step of the method for explanation in Fig. 2 A, 2B and 4A and 4B and/or can be identified and select the characteristic point of the selection shown in Figure 1B by the circuit of Fig. 3.Present frame and reference frame are stored in the memorizer 350 of the circuit of Fig. 3, and the step by performing the method for Fig. 2 A, 2B and 4A and 4B explanation identifies and selects the characteristic point of the selection shown in Figure 1B simultaneously.
Fig. 1 C illustrates the present frame F with motion vector (arrow)tThe characteristic point of each selection.The motion vector of the characteristic point of the selection shown in Fig. 1 C can be calculated by the step performing the method shown in Fig. 6 and 7.
Fig. 1 D illustrates that the motion vector in scene has been grouped (such as, group A, group B, group C).Reality in scene is fixed the motion vector (group B and group C) of object and is moved (such as shake) by photographic head and cause.The packet of the motion vector of the characteristic point of the selection shown in Fig. 1 D can be performed, wherein based on using the pairing algorithm of Amplitude Ratio and normalized vector difference by motion vector pairing/packet (including or excluding) by the step of the method shown in Figure 10 A, 10B and 10C and Figure 11.
Fig. 1 E is shown in the reference frame F in the overall background of the bigger capture frame exported by imageing sensor (not shown)t+1.By using the reality shown in Fig. 1 D fix the group B of object and organize the motion vector definition reference frame F of Ct+1Affine coefficients determine reference frame Ft+1Position.So that the circuit of the step of the method shown in Fig. 6 and 7 of the motion vector that the view data of the capture frame outside the border of reference frame is to performing for calculating group B and group C can be used.
Fig. 1 F illustrates the shake cam movement then reference frame F indicated without the motion vector being fixed object by the reality shown in Fig. 1 Dt+1In the position that it should be received by imageing sensor (not shown).By compensating circuit (not shown) application reference frame Ft+1Affine coefficients with rotate and translation reference frame Ft+1Correct the jitter motion of photographic head.
Feature point identification, selection, and distribution
Figure 2A is a view of the captured frame, divided into j × k tiles in a border region and a core region, for the identification and selection of feature points of the current frame Ft (see Figure 1E) in steps of the DIS method illustrated in Figures 1A to 1F. The boundary between the border region and the core region may be predetermined by hardware or by software, independently of the content of the received image data, while the boundary of the current frame Ft may be dynamically selected based on, e.g., the degree of jittery camera motion indicated by the received image data content, so as to prevent or reduce over-excursion of the current frame. Thus, the core region may or may not correspond to the size and position of the current image frame Ft shown in Figure 1A.
Each captured video frame is divided into a small number of non-overlapping tiles (e.g., 4 × 4 tiles for standard definition and 6 × 6 or more tiles for high definition), with the aim of algorithmically selecting feature points that provide a feature point distribution suitable for digital image stabilization. Different regions of the image may have different densities of suitable feature points. In an extreme case, a region of the frame may have no suitable feature points at all, such as in the case of a cloudless blue sky. In other regions, potential feature points may be very dense. When all feature points are identified and selected using a global threshold, feature points tend to concentrate in small regions of the image, resulting in poor DIS results. It is nevertheless desirable to have more feature points in regions of the image with more structure, because there is more potential for motion of interest there. In these dense regions, a further problem is how to ensure that not all feature points are packed together. One aspect of the present invention therefore provides an efficient process for ensuring a minimum distance (MIN_DIST) between the feature points used for DIS.
For the stability of the DIS algorithm, feature points are distributed as broadly as possible while the total number of feature points is limited. A "good distribution" of feature points may be characterized as follows: it has a large convex hull; the feature points are not too close together (MIN_DIST); in tiles with fewer suitable feature points, at least a minimum number (min_features) of feature points is selected if possible; and in tiles with more suitable feature points, more feature points are selected (max_num_features = min_features + max_plus_features * (tile_variance σ²/total_variance)).
The maximum number of feature points in each tile (max_num_features) is determined based on the luminance variance σ² of the tile.
In one embodiment, the maximum number of feature points in each tile (max_num_features) is the programmable minimum number of feature points per tile (min_features) plus the programmable maximum number of additional feature points (max_plus_features) multiplied by the ratio of the variance σ² of the particular tile to the sum of all tile variances. If tiles have different sizes, a correction factor may be applied. Alternatively, the final maximum number of selected feature points of each tile may be min_features plus a portion of var_features proportional to the variance σ² of the tile, normalized by a corresponding per-tile weight. Border tiles may be given a higher weight because they include a large part of the border region, and in this alternative case the maximum number of feature points of a given tile is calculated accordingly.
Thus, the maximum number of selected feature points (max_num_features) is not constant across all tiles, nor is it necessarily constant between frame Ft and frame Ft+1.
In one embodiment, the maximum number of feature points in each tile (max_num_features) is a function of the variance σ² of the luminance data in the tile divided by the total luminance variance, which requires precomputing the luminance variance σ² of each tile and the total variance of the frame. Those of ordinary skill in the art will understand that other functions are also possible, for example functions involving the average luminance value and the tile variance σ².
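For illustration, the per-tile budget described above can be sketched as follows (a minimal Python sketch; the helper name and the example variances are not from the patent, and min_features = 4 and max_plus_features = 28 are assumed values):

```python
def max_num_features(tile_variance, total_variance, min_features=4, max_plus_features=28):
    """Per-tile feature point budget: a fixed floor plus a share of the
    variable budget proportional to the tile's luminance variance."""
    if total_variance <= 0:
        return min_features  # degenerate frame: fall back to the floor
    return min_features + int(max_plus_features * (tile_variance / total_variance))

# Example: a frame of four tiles whose luminance variances are known.
tile_variances = [10.0, 40.0, 0.0, 50.0]
total = sum(tile_variances)
budgets = [max_num_features(v, total) for v in tile_variances]
print(budgets)  # [6, 15, 4, 18]
```

High-variance tiles receive a larger share of the variable budget, while a flat tile (variance 0) still receives the minimum of min_features points.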
To identify feature points, a corner detector such as the Harris corner detector may be used. The Harris corner detector evaluates every pixel of the image as a possible feature point candidate. Preferred feature point candidates are points where the feature quality estimation function has a local maximum. The disclosed feature point selection method optimizes the selection among the feature points identified by the Harris corner detector by comparing the resulting value of each identified feature point (estimating how suitable the pixel is as a feature point) with a LOCAL rather than a GLOBAL (full-frame) threshold. The disclosed method thereby takes into account the feature point density of each local region and even differences of contrast in different parts of the frame.
The resulting feature point distribution is based on small regions of the video frame (e.g., non-overlapping tiles), where the number of feature points in each tile increases linearly with the variance σ² of the luminance image data of the tile. Tiles with more interesting image data, which are therefore expected to need more feature points, have a higher variance σ².
Figures 4A and 4B are flow charts illustrating a method of enforcing the minimum distance (MIN_DIST) between feature points in each tile while requiring only a small amount of local state information, thereby reducing the hardware implementation cost.
Figure 2B is a view of one tile in the core region of the image frame of Figure 2A, illustrating selected (gray), rejected (white), and previously selected but cancelled (gray with a cross) feature points. The feature points shown as small squares in Figure 2B have been identified as feature point candidates using the Harris corner detector algorithm, and are then selected, rejected, or cancelled in raster scan order according to the steps of the method illustrated in Figures 4A and 4B.
Up to the maximum number (max_num_features) of identified feature point candidates is selected for each tile. According to embodiments of the invention, each identified feature point candidate may be selected, e.g., in raster scan order, as follows:
i. An identified feature point candidate is a pixel where the Harris corner estimation function exceeds a programmable threshold and has a local maximum. To qualify as a local maximum, the value at the position in question must be greater than the values of all direct and diagonal neighbors that precede this pixel in scan order, but only greater than or equal to the values of the direct and diagonal neighbors that follow this position in scan order. This is implemented to accommodate the fact that identical values are quite likely.
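The asymmetric local-maximum test of step i can be sketched as follows (an illustrative Python model operating on a small response array; the function name and the example data are assumptions, not part of the patent):

```python
def is_local_max(resp, x, y):
    """Asymmetric local-maximum test: strictly greater than the direct and
    diagonal neighbors that precede (x, y) in raster scan order, and merely
    greater than or equal to those that follow it, so that exactly one pixel
    of a plateau of equal responses qualifies."""
    h, w = len(resp), len(resp[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                continue
            earlier = (dy < 0) or (dy == 0 and dx < 0)  # neighbor precedes in scan order
            if earlier and resp[ny][nx] >= resp[y][x]:
                return False
            if not earlier and resp[ny][nx] > resp[y][x]:
                return False
    return True

# On a plateau of three equal responses, only the first pixel in scan order qualifies.
resp = [[0, 0, 0],
        [0, 5, 5],
        [0, 5, 0]]
print([(x, y) for y in range(3) for x in range(3) if is_local_max(resp, x, y)])  # [(1, 1)]
```

A symmetric "strictly greater than all neighbors" test would reject every pixel of the plateau; the asymmetric rule keeps exactly one.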
ii. Once a feature point candidate has been identified, it enters a data storage structure (e.g., a selection list, though other embodiments are possible) that can retain up to a predetermined maximum number of feature point candidates for each tile, e.g., at most 32, 48, 64 or more finally selected feature points, provided there is no feature point candidate with a larger estimation function value within the programmable exclusion range (MIN_DIST). For illustrative purposes, a maximum of 32 is chosen to describe the present embodiment.
iii. If another feature point candidate identified later is stored in the data structure, then all other feature points that are closer to that point than the exclusion range (MIN_DIST) and have smaller estimation function values are deleted from the data storage structure.
For illustrative purposes, assume that the predetermined maximum number of feature point candidates of tile (5,4) is four (i.e., max_num_features = 4). As shown in Figure 2B, tile (5,4) contains the four finally selected feature points (gray) SFP3, SFP4, SFP5 and SFP7 in raster scan order, three previously selected but cancelled (gray with a cross) feature points SFP1, SFP2 and SFP6, plus two rejected (never selected) feature points (white). The previously selected but cancelled feature points SFP1, SFP2 and SFP6 are feature point candidates that were selected as feature points in raster scan order during the process of the method illustrated in Figures 4A and 4B, but were subsequently cancelled as feature point candidates, either because they fell within the exclusion zone (MIN_DIST) of a larger feature point candidate identified and selected later, or because the list of selected feature points was already full (i.e., the number of selected feature point candidates SFP_count = max_num_features) and the earlier-selected feature point was the smallest in the list of selected feature points and smaller than a feature point candidate identified and selected later.
The previously selected but cancelled feature point SFP1 is the first feature point identified and selected in raster scan order according to the steps of the method illustrated in Figures 4A and 4B. Later, the previously selected but cancelled feature point SFP2 is identified and selected, but after SFP2 is selected, the selected feature point SFP3 is identified and is larger than SFP2. Because SFP2 lies within the exclusion zone (MIN_DIST) of the larger, selected feature point SFP3, SFP2 is cancelled immediately when SFP3 is selected. After SFP3 is selected, a feature point candidate is identified in the lower right corner of the exclusion zone (MIN_DIST) of SFP3, and because this candidate is smaller than SFP3 and lies within its exclusion zone, it is immediately rejected (i.e., not selected). Next, a feature point candidate is identified just below the exclusion zone (MIN_DIST) of SFP3, and it is selected as SFP4 (and is never cancelled). Next, a feature point candidate is identified further below and to the right of the exclusion zone (MIN_DIST) of SFP3, and it is selected as SFP5 (and is never cancelled, because although it is close, it does not lie within the exclusion zone of SFP7). Next, a feature point candidate is identified to the right of and below the exclusion zone (MIN_DIST) of SFP5, and it is selected as SFP6 (and is cancelled later, because it lies within the exclusion zone of the larger feature point SFP7 selected later). When SFP6 is selected, the list of selected feature points is "full" (e.g., the maximum number of feature points of this tile is 4); because SFP1 is the smallest among the selected feature points SFP1, SFP3, SFP4 and SFP5, and because SFP6 is larger than SFP1, SFP1 is cancelled. Next, a feature point candidate is identified within the area below the exclusion zone (MIN_DIST) of SFP6, and it is selected as SFP7 (SFP6 is cancelled immediately, because the selected feature point SFP7 is larger than SFP6 and/or because the list is full, etc.). Next, a feature point candidate is identified within the area below the exclusion zone (MIN_DIST) of SFP7, and it is rejected (not selected) because this last feature point is smaller than SFP7. It is possible that SFP7 is actually smaller than the cancelled SFP2 (if SFP3 is much larger than SFP7), but a good distribution of feature points has nevertheless been obtained. The programmable exclusion range (MIN_DIST) ensures that the finally selected feature points are not clustered too closely together.
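The select/reject/cancel bookkeeping traced in the example above can be sketched as follows (a simplified Python model of the sorter's decisions, not the hardware implementation; the candidate coordinates, responses, and parameter values are invented for illustration):

```python
MIN_DIST = 4          # programmable exclusion range (illustrative value)
MAX_NUM_FEATURES = 4  # per-tile budget, as in the tile (5,4) example

def dist(p, q):
    # Chebyshev distance between two points, as defined for the sorter
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def offer(selected, cand):
    """Offer one identified candidate (x, y, response) to the list of selected
    feature points of a tile; returns the updated list (select/reject/cancel)."""
    near = [s for s in selected if dist(s, cand) < MIN_DIST]
    if any(s[2] >= cand[2] for s in near):
        return selected                                # reject: larger point nearby
    selected = [s for s in selected if s not in near]  # cancel smaller nearby points
    if len(selected) >= MAX_NUM_FEATURES:
        weakest = min(selected, key=lambda s: s[2])
        if cand[2] <= weakest[2]:
            return selected                            # reject: list full, candidate too small
        selected.remove(weakest)                       # cancel the smallest selected point
    return selected + [cand]                           # select the candidate

candidates = [(0, 0, 10), (2, 0, 20), (10, 0, 5), (10, 10, 30), (0, 10, 8), (20, 20, 9)]
selected = []
for c in candidates:                # candidates arrive in raster scan order
    selected = offer(selected, c)
print(selected)
```

Each decision uses only the current list and the incoming candidate, which is what makes the scheme implementable with limited local storage; as the text notes, the result is order-dependent rather than globally optimal.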
The pixel luminance variance σ² of each tile can be determined during the downscaling process in which each tile is downsampled. The maximum number of feature points in each tile is determined as the programmable constant minimum number of feature points per tile plus the total number of variable feature points multiplied by the ratio of the variance σ² of the particular tile to the sum of all tile variances. A correction factor may be added for tile regions at the edges and corners of the tiled area, because feature points may also lie in the border region. For each tile, for each feature point candidate identified in raster scan order, the selection process described above (i.e., select, reject, cancel) is used to collect and store up to the maximum number of feature point candidates. In the end, the finally selected feature point candidates of each tile are exactly those with the highest estimation function responses, up to the predetermined maximum number. There may be cases where not enough feature point candidates are available in a given tile, such as a tile of low-contrast image data, in which case the number of resulting feature points finally used will be smaller than the programmed minimum number (e.g., fewer than min_features).
Thus, a method of processing feature point candidates in raster scan order is provided, in which a list containing only selected feature points, up to the computed maximum number and not clustered too closely together, is maintained even though feature point candidates identified later can update the list. This raster-scan-order method of selecting feature points has the advantage of reducing the amount of memory and computation compared with various alternative methods that prioritize and select among the identified feature point candidates. For example, in an alternative embodiment, all identified feature point candidates of a tile could be stored in a large list in memory, and only after all feature point candidates of the tile have been identified, a mathematical selection algorithm could be applied to find the optimal set (with a predetermined maximum size) of the largest feature point candidates that do not lie in the exclusion zone (MIN_DIST) of any other member of the set. However, such a selection algorithm requires more physical memory (to store the entire list of identified feature point candidates of a tile) and potentially more total computation than the raster-order selection (select, reject, cancel) method of Figures 4A and 4B (an exemplary result of which is illustrated in Figure 2B). The raster-scan-order selection algorithm of Figures 4A and 4B does not necessarily provide a globally optimal set of selected feature points, because feature point candidates can be cancelled from the list by later-selected feature point candidates that are themselves cancelled later still; instead, it provides an algorithm that can be implemented in hardware with limited local storage. Although the method of Figures 4A and 4B is described in terms of the "raster scan order" (i.e., processing from left to right and from top to bottom) of the identified feature point candidates, which is the pixel order commonly used for the Harris corner detector, the method may use any sequence for selecting feature point candidates, even a discontinuous sequence of non-adjacent feature point candidates, as long as all feature points are identified sequentially and eventually considered for selection.
Figure 3 is a block diagram of a feature point circuit according to an embodiment of the present inventive concept. The feature point circuit 3000 includes a feature point selector 300, a selected feature point (SFP) motion vector calculator 700, and a shared RAM memory 350. The feature point selector 300 includes a downsampler 310, a feature point candidate evaluator 330, and a feature point candidate sorter 340.
The feature point candidate evaluator 330 identifies feature point candidates using the Harris corner detector algorithm and outputs the identified feature points, one tile at a time in raster scan order, to the feature point candidate sorter 340, e.g., as pixel locations and Harris corner responses. The feature point candidate sorter 340 is configured to perform the method of Figures 4A and 4B, further illustrated in Figures 1B and 2B, of selecting among the identified feature points of each tile one by one. The downsampler 310 includes a tile-variance σ² calculator 320 functional block, which calculates the tile variance σ² of each tile of the image frame according to the following equation:
σ² = (Σy²)/N − (Σy/N)²
where the y values are the luminance values within the tile and N is the number of pixels in the tile.
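As an illustration, the tile-variance equation may be sketched as follows (a Python sketch in which a flat list of luminance values stands in for a tile; the helper name is an assumption):

```python
def tile_variance(y_values):
    """sigma^2 = sum(y^2)/N - (sum(y)/N)^2 over the tile's luminance values."""
    n = len(y_values)
    return sum(v * v for v in y_values) / n - (sum(y_values) / n) ** 2

# A flat tile has zero variance; a textured tile has a positive one.
print(tile_variance([8, 8, 8, 8]))    # 0.0
print(tile_variance([0, 10, 0, 10]))  # 25.0
```

This one-pass form (accumulating Σy and Σy² together) is what allows the variance to be computed while the downsampler streams through each tile, without a second pass over the pixels.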
The circuit shown in Figure 3 may be implemented in a semiconductor chip having input/output pins configured to receive image data from a camera that has an image sensor for capturing images and circuitry for converting the captured image to image data. The data processed by the circuit of Figure 3 is output to other components of the camera via the input/output pins. As described further below, the memory 350 resides within the semiconductor chip, and to minimize the size of the chip the memory needs to be as small as possible, so its storage capacity is limited. To save computational power and reduce the number of required operations, the feature point selector 300 may operate only on luminance data, which is downsampled horizontally and vertically by the downsampler 310 by a factor fs1 of 2, 4 or 8 (in the present embodiment, a factor fs1 of 4 is chosen). The fs1-downsampled luminance data is used by the feature point candidate evaluator 330 for feature point identification, and in an alternative embodiment may later be used for feature point motion vector estimation by the block-matching search unit 730 of the SFP motion vector calculator 700. While the smaller downsampled image is being computed by the downsampler 310, the luminance variance (tile variance) σ² of each tile is calculated, and the global maximum of the smaller eigenvalue of the 3 × 3 Harris corner matrix is identified. The tile offset, i.e., the coordinates of the top left pixel of the top left tile, and the tile pixel dimensions are preferably multiples of the largest downsampling factor used (fs2). Also preferably, the image core region is centered within the whole image, so that the width of the left border region equals the width of the right border region, and the height of the top border region equals the height of the bottom border region (see Figure 2A).
Once the luminance data of an incoming frame has been downsampled and stored in the RAM memory 350, it is read back tile by tile by the feature point candidate evaluator 330, and the identified feature point candidates are fed sequentially to the feature point candidate sorter 340. For the feature point identification process of block 330, the statistics region of potential feature points in tiles adjacent to the border region extends into the border region, so that the pixels of each border region are processed together with the pixels of the adjacent tile. The pixel data within each tile is read in raster scan order: line by line from top to bottom, and pixel by pixel within each line.
To process each tile, the feature point candidate evaluator 330 needs three additional pixels beyond each internal tile boundary for feature point identification using the Harris corner detector. Consequently, these pixels are read more than once. The identified feature point candidates are the pixels where the smaller eigenvalue λ1 of the Harris matrix has a local maximum in each tile. To qualify as a local maximum, the corner response of the pixel in question must be greater than the corner responses of its upper-left, upper, upper-right and left neighbors, and greater than or equal to the corner responses of its right, lower-left, lower and lower-right neighbors. With this definition, at least one point of a large area with an identical constant corner response will be identified as a potential feature point candidate. The local-maximum detection logic requires two line buffers of corner responses. A point with a local corner response maximum is first compared with a programmable corner response threshold. If the corner response of the point in question is below this threshold, it is ignored. Otherwise, the coordinates and the corner response of the feature point are supplied to the feature point candidate sorter 340.
The feature point candidate sorter 340 keeps at most max_num_features (e.g., 32) feature point candidates with the highest corner responses in each tile, while ensuring that all feature points have a programmable minimum distance (MIN_DIST) from one another. The distance between two points used in the above algorithm is defined as:
dist((x1, y1), (x2, y2)) = max(|x1 − x2|, |y1 − y2|)
The selection of the method of Figures 4A and 4B is realized with operations that consider only the current contents of the sorter's list of selected feature points and the incoming feature point candidate, and decide immediately. Consequently, the feature point candidate sorter 340 adapted to perform the method of Figures 4A and 4B does not inherently calculate a global optimum, and the result depends on the order in which the incoming feature point candidates are provided.
The feature point candidate sorter 340 outputs the selected feature points one by one, and they are stored in an SFP list in a portion of the memory 350 of the circuit of Figure 3.
Figures 4A and 4B are flow charts of the method, performing a step in the DIS method illustrated in Figures 1A to 1F, of identifying and selecting a plurality of feature points in each tile of the image frame of Figures 1A and 2A. The method begins with a data input step S400, in which the luminance data of the current frame Ft is received, followed by a downsampling step S402. An initialization step S404 resets the tile counter value current_tile and the pixel counter value current_pixel.
Next, as current_pixel is incremented (step S428), the Harris corner detector is applied (steps S406, SD408 and S410) in raster scan order to each pixel of the downsampled luminance data of current_tile. When the corner response of the current pixel is a local maximum exceeding the threshold (i.e., the Yes branch of decision step SD408), the current pixel is identified as the current FP (feature point) candidate (step S410) and immediately undergoes the feature point selection algorithm (SD412, SD414, SD416, S417, SD430, S418, S420).
The feature point selection algorithm selects the current FP candidate (S420) only if the current FP candidate is larger than the smallest previously selected FP candidate in the list of stored selected feature points (the Yes branch of decision step SD412); otherwise the current FP candidate is rejected (rejection step S417) and not selected (the No branch of decision step SD412). If the list of selected feature points is already full when the current FP candidate is to be selected, e.g., as indicated by the selected feature point count SFP_count (i.e., SFP_count = max_num_features = min_features + max_plus_features * (tile_variance/total_variance)), then the smallest previously selected FP candidate is cancelled from the list (SD430); otherwise the SFP_count value is incremented (SD430).
The feature point selection algorithm selects the current FP candidate (S420) only if no larger (SD416) previously selected feature point has the current FP candidate within its exclusion zone (MIN_DIST) (SD414). Thus, if the current FP candidate lies within the MIN_DIST of any larger (SD416) previously selected feature point (SD414), it is rejected (the No branch of decision step SD416, and rejection step S417) and not selected. On the other hand, if the current FP candidate lies within the MIN_DIST of only smaller (SD416) previously selected feature points (SD414), then all of those smaller (SD416) previously selected feature points are cancelled (the Yes branch of decision step SD416, and cancellation step S418), the current FP candidate is selected (S420), and SFP_count is updated accordingly (e.g., decreased or left unchanged) (S418).
Once the current FP candidate has been selected (S420) or rejected (S417), the Harris corner detector outputs (SD422) results for the next (S428) current pixel (S410) of the current tile, and the next identified FP candidate immediately undergoes the feature point selection algorithm (SD412, SD414, SD416, S417, SD430, S418, S420), and so on. When the last pixel of the current tile has been processed (SD422), the next tile is processed (SD424, S426). When the last tile has been processed, the process is complete until the next image frame is to be processed.
Feature point motion vector calculation
After the feature points of each tile in the current frame Ft have been identified and selected, the next step in the DIS method of Figures 1A to 1F is to obtain the motion vector of each selected feature point.
Block matching algorithms (BMA) for calculating the motion vectors of feature points are well known. In block matching, an error function (e.g., SAD, MAD, MSE) is calculated for all possible positions of a block within a target area of the reference frame. The position with the minimum value of this function is used to determine the estimated motion vector. Block matching is computationally expensive. There are several known methods to reduce the computational cost. Hierarchical or multi-resolution block matching is one of these methods: the global movement is first calculated at low resolution. The resulting vector is then used to search a smaller range at higher resolution, reducing the total number of arithmetic operations required.
For most applications, especially for video encoding, accurate motion vectors are needed for all blocks of a frame. Consequently, the search range in the later stages is still relatively large. In the digital image stabilization (DIS) method illustrated in Figures 1A to 1F, it is only necessary to estimate the relative movement of feature points (e.g., of actual stationary objects) from one frame to the next. For image stabilization purposes, accurate motion vectors representing the movement of the background and of large objects are needed, while smaller objects do not need an accurate motion vector associated with them. Any inaccurate vectors of smaller objects can be filtered out at a later stage of the DIS algorithm.
It is expected that the feature points of the large stationary objects of significance to the DIS method will move in a consistent way because of global movement or camera movement. We recognize that sufficiently large independently moving objects cover the major part of at least one tile, so that their motion can be estimated as the predominant motion of the tile itself, while the motion of small objects has little influence on the motion vector of the tile itself. Consequently, the process of calculating the motion vectors can be modified to reduce computation, e.g., by using a hierarchical motion estimation algorithm and by preferring the tile motion vector over local motion. Thus, the first step is to divide the current image frame into j × k tiles (this first step has already been performed for the purpose of feature point selection, as described above with reference to Figures 1B and 2A).
The second step of accurately calculating the motion vectors of the feature points used for DIS is to derive one motion vector per tile using block matching at the lowest resolution. In this step, the SAD (sum of absolute differences) for a given tile is calculated. The motion vector of a given tile is the one that minimizes the SAD. The SAD for a given motion vector candidate v = (vx, vy) is defined as:
SAD(vx, vy) = Σ(x,y)∈tile |ref(x, y) − search(x + vx, y + vy)|
By using a downsampled, low-resolution image, computation is reduced and the influence of small objects in the scene is further reduced.
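The second step, deriving one tile motion vector by exhaustive SAD minimization, can be sketched as follows (a Python sketch under an assumed tile geometry and search range; a real implementation operates on the downsampled frames and in hardware):

```python
def sad(ref, search, tx, ty, tw, th, vx, vy):
    """SAD(vx, vy) = sum over the tile of |ref(x, y) - search(x + vx, y + vy)|."""
    return sum(abs(ref[y][x] - search[y + vy][x + vx])
               for y in range(ty, ty + th) for x in range(tx, tx + tw))

def tile_motion_vector(ref, search, tx, ty, tw, th, rng=2):
    """Exhaustive search over [-rng, rng]^2 in a fixed scan order; on a tie
    the first minimizing vector found is kept."""
    best = None
    for vy in range(-rng, rng + 1):
        for vx in range(-rng, rng + 1):
            cost = sad(ref, search, tx, ty, tw, th, vx, vy)
            if best is None or cost < best[0]:
                best = (cost, vx, vy)
    return best[1], best[2]

# A bright pixel shifted by (+1, +1) between the frames is recovered as (1, 1).
ref = [[0] * 8 for _ in range(8)]
ref[3][3] = 100
search = [[0] * 8 for _ in range(8)]
search[4][4] = 100
print(tile_motion_vector(ref, search, 2, 2, 2, 2))  # (1, 1)
```

Note that the search frame must extend beyond the tile by at least the search range on every side, which is the reason the text places the tiles inside a border region sized for the largest supported motion vector.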
In third step, the beginning vector of the Local Search of the motion vector of characteristic point that the motion vector of tile will be used as in each tile in block matching algorithm.Because the biggest object covering the major part of at least one tile can expand to adjacent tiles, it is likely that in each tile, some characteristic point more strongly can associate with the motion vector of adjacent tiles rather than be found the motion vector of residing tile with them and associate.Therefore, will be effectively the motion vector using whole adjacent tiles multiple beginning vectors as the Block-matching search of the motion vector of the characteristic point of any given tile.Tile used herein is placed in the middle in the frame have the borderline region of size of at least maximum motion vector supported so that can complete whole characteristic points in whole tile motion search and without reference frame outside pixel.
Fig. 5 is a view of the downsampled current frame Ft of Figure 1A, with the calculated tile motion vectors superimposed, illustrating the motion vector calculation step of the DIS method of Figures 1A to 1F. The smaller (fewer pixels, less data) image of Fig. 5 is derived by downsampling the originally captured current frame horizontally and vertically, or from a previously downsampled image (step S402 of Figs. 4A and 4B). Downsampling by a factor fs2 (e.g., 4) is used for global (tile) motion estimation. A 4 × 4 downsampling simply averages 16 pixels (with rounding), with no overlap on the input side. A block matching search using each complete downsampled tile is then performed to determine the motion vector of each tile.
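The non-overlapping averaging downsampler described above can be sketched as follows. This is a hypothetical helper with illustrative names; the patent's hardware presumably pipelines this, and the rounding convention here (round half up) is our assumption.

```python
# Hedged sketch: fs2 x fs2 downsampling by averaging non-overlapping blocks of
# luminance samples, with rounding. Input dimensions are assumed to be exact
# multiples of fs2. luma is row-major: luma[y][x].
def downsample(luma, fs2=4):
    h, w = len(luma), len(luma[0])
    n = fs2 * fs2  # e.g. 16 pixels averaged per output pixel when fs2 = 4
    out = []
    for by in range(0, h, fs2):
        row = []
        for bx in range(0, w, fs2):
            s = sum(luma[by + dy][bx + dx]
                    for dy in range(fs2) for dx in range(fs2))
            row.append((s + n // 2) // n)  # integer average with rounding
        out.append(row)
    return out
```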
The motion vector of a given tile is the one that minimizes the SAD. In the case of a tie, the first motion vector found is used. This motion vector serves as the start vector for the local search for the motion vectors of nearby feature points. The motion range about each start vector is programmable.

Because the number of operations needed for tile motion estimation is only about 12% of that required for local motion estimation, calculating roughly eight absolute differences per cycle is sufficient. A systolic array is therefore not needed.
Fig. 6 is a view of a portion of the tiles in the downsampled frame of Fig. 5, illustrating the use of the tile motion vectors of Fig. 5 as block matching search start vectors for calculating the motion vectors of the selected feature points used in the DIS method of Figures 1A to 1F.
For each feature point, a small localized block matching search is performed in the higher-resolution domain around each start vector in a set of start vectors. This step can be performed at the original video resolution, or downsampled by a factor fs3 of 2 or 4. The start vectors used are the tile motion vectors already determined as described above: the vector of the tile to which the feature point belongs and the vectors of its four nearest neighbors (upper tile, left tile, right tile, lower tile), provided they exist. Thus, in Fig. 6, the start vector for block matching search region 1 is the motion vector of the feature point's (FP's) own tile; the start vector for block matching search region 2 is the motion vector of the tile below the FP's tile; the start vector for block matching search region 3 is the motion vector of the tile to the right of the FP's tile; the start vector for block matching search region 4 is the motion vector of the tile to the left of the FP's tile; and the start vector for block matching search region 5 is the motion vector of the tile above the FP's tile. According to another embodiment, the start vectors of the four diagonal neighbors are used. Other steps for selecting start vectors may be performed (e.g., to reduce the number of block matching calculations), especially if a first group of tile vectors has similar magnitudes and directions suggesting one large object (see the discussion of motion vector grouping with respect to Figs. 8A, 8B, 9, 10A, 10B, and 10C). Alternatively, block matching may be performed with a given priority, or only where two or more block matching search regions overlap, or only between block matching search regions that are close to one another, etc.
In general, motion vectors are assigned to feature points tile by tile, and each feature point of a given tile uses the same start vectors (i.e., the same selection of tile motion vectors). However, in various other embodiments, feature points in different parts of a given tile may use different selections of start vectors, on the premise that a detected grouping among the tile motion vectors adjacent to each tile makes it more likely that the feature points are visible points of the same object commonly found in each member of that group. Thus, the block matching search may first be performed for those feature points near the perimeter of each tile, to detect whether all or nearly all of them have motion vectors similar to that of their own tile and/or similar to the tile motion vectors of an adjacent group of tiles. For example, if the motion vectors of all the initially selected feature points (e.g., all feature points near the perimeter of a given tile, or those farthest from its midpoint) are the same as or similar to the motion vector of their own tile, the set of start vectors selected for the remaining feature points can be reduced.
For each start vector used, we use a very small range for the local search. The object here is not to determine the exact vector of every feature point. Rather, the feature points of interest are those belonging to the background or to large objects. For those feature points, one of the tile motion vectors should be good, or close to the motion vector of the feature point of interest, and a small local search about each selected tile motion vector is therefore sufficient.
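Gathering the start vectors for a feature point from its own tile and its four nearest neighbor tiles, as described above, can be sketched as follows. The names are illustrative, not from the patent; the deduplication step is one possible way to reduce the number of local searches, as the text suggests.

```python
# Hedged sketch: collect block-matching start vectors for a feature point from its
# own tile and the four nearest neighbors (up, down, left, right), when they exist.
# tile_mvs is a j x k grid of (vx, vy) tile motion vectors; (ti, tj) is the index
# of the tile containing the feature point.
def start_vectors(tile_mvs, ti, tj):
    j, k = len(tile_mvs), len(tile_mvs[0])
    vecs = [tile_mvs[ti][tj]]  # the feature point's own tile comes first
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # up, down, left, right
        ni, nj = ti + di, tj + dj
        if 0 <= ni < j and 0 <= nj < k:  # neighbor exists inside the grid
            vecs.append(tile_mvs[ni][nj])
    # Duplicate start vectors would trigger identical local searches, so they
    # may be removed (order-preserving) to cut the block-matching workload.
    seen, uniq = set(), []
    for v in vecs:
        if v not in seen:
            seen.add(v)
            uniq.append(v)
    return uniq
```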
Referring again to Fig. 3, the SFP (selected feature point) motion vector calculator 700 of the feature point circuit 3000 includes: a second downsampler 710, for outputting luminance data downsampled further than that of the first downsampler 310, for tile vector calculation; a tile vector calculator 720, for calculating the motion vector of each tile; and a hierarchical block matching search unit 730, for determining and outputting the motion vector of each selected feature point (SFP) received from the feature point candidate sorter 340 of the feature point selector 300. The second downsampler 710 outputs the deeply downsampled current frame Ft shown in Fig. 5. The tile vector calculator 720 calculates the motion vector of each tile using the deeply downsampled luminance data of the current frame Ft output by the second downsampler 710. The hierarchical block matching search unit 730 uses the full-resolution luminance data of two consecutive frames (or the output of the first downsampler 310), with the tile vectors discussed above as start vectors, to determine the motion vector of each selected feature point output by the feature point candidate sorter 340 of the feature point selector 300.
Fig. 7 is a flow chart of a method of calculating the motion vectors of the feature points (SFPs) selected in the current frame Ft of Figures 1A and 2A, for performing the steps of the DIS method of Figures 1A to 1F.
In the initial steps, the hierarchical block matching search unit 730 shown in Fig. 3 receives the luminance data of two consecutive frames of video (the current frame and the reference frame) (step S700i) and the pixel locations of the selected feature points (S700ii). The current frame Ft is divided into multiple downsampled tiles (S710), which may preferably be the same tiles previously used by the feature point sorting method of Figs. 4A and 4B. In sub-step S710-A, the current frame Ft is divided into a j × k grid of tiles plus a border region, as shown in Fig. 2A. In sub-step S710-B, the luminance data associated with each tile is downsampled by a factor fs2 (e.g., fs2 = 4 or 8 for SD; fs2 = 8 or 16 for HD), as shown in Fig. 5.

Next, in step S720, the motion vector of each tile is calculated from the deeply downsampled luminance data using full-search block matching, as shown in Fig. 5, achieving half-pel accuracy relative to the downsampled resolution. The minimum SAD value corresponding to each calculated motion vector can be saved for other features of DIS (e.g., for filtering out the feature points of small objects). In step S730, start vectors are selected for the current selected feature point (SFP) based on the tile motion vectors calculated in step S720, as described above. In step S740, a block matching algorithm using full-resolution luminance data and the selected tile-vector-based start vectors determines the motion vector of the current SFP. Steps S730 and S740 are repeated until the motion vector of every SFP of every tile has been calculated (via loop steps dS750 and S752).
Grouping Feature Points by Motion Vector Magnitude and Direction
Motion between video frames is detected by calculating the motion vectors of identifiable "feature points" in consecutive frames. The motion vectors of the feature points can then be "grouped" so as to identify moving objects within the scene, as distinct from the global motion of the camera/scene. The global motion of the camera/scene is analyzed to distinguish between intentional (e.g., panning) and unintentional (jitter) global motion.

If there is no camera motion (no camera trajectory), each detected feature point of an actually stationary object (e.g., the corner of a rock, the peak of a mountain) is expected to be found at the same position in each of two or more consecutive frames, and the motion vectors of all those detected feature points will be measured as null. If there is camera motion, however, the vectors of the many feature points of any given actually stationary object can have different magnitudes and directions. A digital image stabilization circuit can be used to correctly "group" multiple (feature point) motion vectors so that they are attributed to the same actually stationary object.

Typical camera motion is a mixture of translational and rotational movement, and the distance from the camera to the objects in the scene varies. While translational camera motion contributes differences in motion vector magnitude based on an object's distance from the camera, rotational camera motion contributes to both the magnitude and the direction of the motion vectors.
Figs. 8A and 8B illustrate the different vectors produced by rotational camera motion as compared with purely translational vector motion. In the figures, it is assumed that two selected feature points SFP4 and SFP5 of the same stationary physical object are physically at the same distance from the camera; vector A is the motion vector of SFP4 and B is the motion vector of SFP5 in the case of purely translational camera motion, while vector A' is the motion vector of SFP4 and B' is the motion vector of SFP5 in the case of camera motion that includes rotation.

For purely translational camera motion, vectors A and B are identical, but vectors A' and B' have different magnitudes and different directions due to the rotational camera motion, even though the feature points are at the same distance from the camera.

Fig. 8C illustrates the different vectors produced by purely translational camera motion in the case where two feature points of the same stationary object are at different distances from the camera. It is assumed that two selected feature points SFP4 and SFP7 of the same stationary physical object are physically at different distances from the camera, and that vector A is still the motion vector of SFP4 while vector C'' is the motion vector of SFP7, in the case of purely translational camera motion. Because SFP7 is closer to the camera than SFP4, while both are points on the same stationary object, the magnitudes of their motion vectors differ (vector C'' is smaller than vector A).
Therefore, when grouping motion vectors, a margin for the vector magnitude difference and the vector direction (angle) difference caused by these factors must be allowed, so that the motion vectors of all the feature points of the same stationary object can be grouped together. The usual method of detecting motion vector groups with an error margin, using the simple motion vector difference, is to define an error threshold.
The magnitude ΔM of the motion vector difference is a measurement that can serve as the basis for the grouping decision, with an error margin Th_ΔM, defined as:

ΔM = SQRT((xa − xb)^2 + (ya − yb)^2) < Th_ΔM, where
A = (xa, ya);
B = (xb, yb); and
Th_ΔM is the error threshold (a positive number) for the magnitude ΔM of the vector difference.
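The simple magnitude-of-difference grouping test above can be expressed directly in code. This is a minimal sketch with illustrative names; it implements only the single-threshold test, not the two-rule method introduced later.

```python
import math

# Hedged sketch: two motion vectors A and B may share a group under the simple
# rule when the magnitude of their difference, dM = |A - B|, is below Th_dM.
def same_group_simple(A, B, th_dM):
    xa, ya = A
    xb, yb = B
    dM = math.sqrt((xa - xb) ** 2 + (ya - yb) ** 2)
    return dM < th_dM
```

As the text notes next, this test accepts pairs with small vector differences regardless of direction, which is adequate only for purely translational camera motion.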
When the camera motion is purely translational (up/down and/or sideways), the magnitude-of-difference method is sufficient, because the motion vectors of all the feature points of a stationary object have the same direction, all being defined by the same translational camera motion. As a comparison of Figs. 8A and 8C shows, even in the case of purely translational camera motion, the motion vectors of different stationary feature points can differ because the objects are at different distances from the camera. However, the magnitude differences among the motion vectors of the feature points of the same stationary object in a typical video scene are relatively small, and this magnitude difference can be tolerated by allowing some margin for the vector magnitude difference (|A| − |B|); in such cases the motion vector difference magnitude ΔM method is suitable.
Fig. 9A illustrates a case in which the magnitude ΔM of the difference of two motion vectors A and B is a good basis for grouping two feature points together.

In some cases, the magnitude ΔM of the vector difference alone may not be a good basis for grouping vectors.

Fig. 9B illustrates a case in which the magnitude ΔM' of the difference of two motion vectors A' and B' is not a good basis for grouping two feature points together.

In Figs. 9A and 9B, as indicated, the vector pairs (A, B) and (A', B') have vector differences of the same magnitude (ΔM = ΔM'). Each pair (A, B) and (A', B') may also have the magnitude ΔM, ΔM' of its respective vector difference within the error margin Th_ΔM. Vectors A and B can properly be grouped together on the basis of the magnitude ΔM of their vector difference. But the pair A' and B' has too large an angular (directional) difference (e.g., compared with A and B), so that grouping vector A' and vector B' in the same group would be inappropriate.

The vector difference magnitude ΔM method by itself is thus unsuitable for motion vector grouping in instances where two feature points have the magnitude of their vector difference within the margin Th_ΔM while at the same time having a large angular (directional) difference. The rotational component of the camera trajectory can cause one or more feature points of a stationary object to have the same or similar magnitudes but different directions, which is not detected by the vector difference magnitude method. The vector difference magnitude method can therefore cause incorrect jitter compensation, and/or less than optimal video compression, and/or excessive computational power or time consumption, and/or video artifacts due to incorrect video compression of stationary or moving objects.
The motion vectors of the selected feature points (SFPs) output by the hierarchical block matching search unit 730 of the feature point circuit 3000 are next grouped according to their magnitude and direction, so as to associate the motion vectors of the selected feature points with objects in the scene based on the perceived relative motion of the objects between consecutive video frames.
When the camera motion has a rotational component, for example about an axis orthogonal to the plane of the image sensor/photodiode array, the directions of the motion vectors of one object (e.g., the background) cannot all be identical. The magnitudes and directions of the different feature point vectors of the background will differ to some degree, even though the points are actually stationary and at the same distance from the camera.
Instead of using only the magnitude ΔM of the motion vector difference with an error margin Th_ΔM for the grouping decision, we use the magnitude ratio of the motion vectors and the normalized vector difference to detect, and tolerate, a certain amount of motion vector difference caused by rotational camera motion.
Where vector A = (xa, ya) and vector B = (xb, yb):

The first grouping decision rule is based on the magnitude ratio |b|, where

|b|^2 = (|B|/|A|)^2 = (|B|^2)/(|A|^2) = (xb^2 + yb^2)/(xa^2 + ya^2).

The second grouping decision rule is based on the normalized vector difference (an assessment of the angular difference) |a − b|, where

|a − b|^2 = [(xa − xb)^2 + (ya − yb)^2]/(xa^2 + ya^2).
Because the first grouping decision is based on the magnitude ratio (|B|/|A|) rather than the absolute difference (A − B), we use a magnitude ratio threshold rth in place of the absolute error margin Th_ΔM. For the grouping decision, we define the lower bound and the upper bound of the magnitude ratio threshold as MrLth and MrUth, respectively:

MrLth^2 < |b|^2 < MrUth^2, where
MrLth ≡ (1 − rth);
MrUth ≡ (1 + rth); and
0 < rth < 1.

For example, if a magnitude ratio margin of 30% is allowed, then rth is 0.3, MrLth is 0.7, and MrUth is 1.3, yielding the range:

0.7^2 < |b|^2 < 1.3^2.
For the angular difference between vector A and vector B to be less than a threshold of θth degrees, the second grouping decision rule is

|a − b|^2 < Math^2, where
Math^2 = (1 + |b|^2 − 2*|b|*cos θth); and
|b| = SQRT{(xb^2 + yb^2)/(xa^2 + ya^2)}.
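The two grouping decision rules above can be sketched together as follows. The function and parameter names are ours, not the patent's; squared quantities are used throughout so that the only square root needed is the |b| term inside Math^2, and vector A is assumed to be non-null.

```python
import math

# Hedged sketch of the two-rule grouping test: rth is the magnitude-ratio margin
# (e.g. 0.3) and theta_th the angle threshold in degrees (e.g. 30). Assumes
# |A| > 0 so the normalization by |A|^2 is well defined.
def may_group(A, B, rth=0.3, theta_th=30.0):
    xa, ya = A
    xb, yb = B
    a2 = xa * xa + ya * ya             # |A|^2
    b2 = (xb * xb + yb * yb) / a2      # |b|^2 = (|B|/|A|)^2
    # Rule 1: magnitude ratio within (MrLth, MrUth) = (1 - rth, 1 + rth)
    if not (1 - rth) ** 2 < b2 < (1 + rth) ** 2:
        return False
    # Rule 2: normalized vector difference |a - b|^2 below
    # Math^2 = 1 + |b|^2 - 2|b|cos(theta_th), per the law of cosines
    ab2 = ((xa - xb) ** 2 + (ya - yb) ** 2) / a2
    math2 = 1 + b2 - 2 * math.sqrt(b2) * math.cos(math.radians(theta_th))
    return ab2 < math2
```

A pair with similar magnitude and small angle passes; a right-angle pair of equal magnitude fails Rule 2; a pair whose magnitudes differ by 2× fails Rule 1 without Rule 2 being evaluated.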
Even when rotational camera motion is present, a grouping method whose decisions are based on these two grouping decision rules can perform near-optimal motion vector grouping.
Figure 10 shows three vector diagrams illustrating the calculation of the normalized vector difference, an indirect measure of the angular difference, used for grouping feature points in the DIS method of Figures 1A to 1F. Referring to diagram (a) of Figure 10, the difference vector (A − B) between vector A and vector B is drawn as the horizontal vector labeled (A − B). For readability, diagram (c) is drawn at a larger scale than diagrams (a) and (b). The absolute magnitude ΔM(A − B) of the difference vector (A − B) is calculated as follows:

ΔM(A − B) = SQRT((xa − xb)^2 + (ya − yb)^2), where
A = (xa, ya)
B = (xb, yb)
Referring to diagram (b) of Figure 10, the normalized vector a is defined as vector A divided by its own magnitude |A|, so that normalized vector a has magnitude 1 (see diagram (c) of Figure 10). The normalized vector b is defined as vector B divided by the magnitude |A| of vector A. The magnitudes |A| and |B| are defined by the following equations:

|A|^2 = (xa^2 + ya^2)
|B|^2 = (xb^2 + yb^2)

Note that b = B/|A|, so the magnitude ratio is the absolute value |b| = |B|/|A| = |(B/|A|)|. Thus |b| is the magnitude of the normalized vector b, obtained by normalizing vector B by the magnitude |A| of vector A (i.e., b = B/|A|). Thus, the magnitude ratio |b| = SQRT{(xb^2 + yb^2)/(xa^2 + ya^2)}.
Because the magnitude of normalized vector a is 1, the magnitude |b| is also equal to the ratio between the magnitude of normalized vector a and the magnitude |b| of normalized vector b. The magnitude |b| is therefore referred to as the magnitude ratio |b|. The magnitude ratio |b| is not a function of the angular difference θ between vectors A and B.
As our first grouping decision rule, we check whether |b| (|b| = |B|/|A|) is within the low and high thresholds MrLth and MrUth. If |b|^2 < MrLth^2 or |b|^2 > MrUth^2, we decide that vector A cannot be in the same group as vector B. But if MrLth^2 < |b|^2 < MrUth^2, then we make a second comparison, based on the normalized vector difference |a − b|, as the second criterion.
The absolute magnitude |a − b| of the normalized vector difference (a − b) is calculated from the following equation:

|a − b|^2 = [(xa − xb)^2 + (ya − yb)^2]/(xa^2 + ya^2)
The normalized vector difference (a − b) has the absolute magnitude |a − b| shown in diagram (c) of Figure 10, where the lengths |a|, |b|, and |(a − b)| form a triangle with the side |(a − b)| opposite the vector difference angle θ, which means that |(a − b)| can also be calculated, as a function of θ, using the law of cosines. The law of cosines defines the mathematical relationship between the length of an unknown side of a triangle, the lengths of the other sides, and the angle opposite the unknown side. If the angle between the two vectors is given, the magnitude |a − b| of the normalized vector difference (a − b) can be obtained with the law of cosines formula:

|(a − b)| = SQRT(1 + |b|^2 − 2*|b|*cos θ).

Thus, an angular threshold expressed as a threshold on the magnitude |a − b| of the normalized vector difference (the side of the triangle opposite the angular difference θ in diagram (c) of Figure 10) can be calculated as a function of |b| and the angular difference θ, as the law of cosines indicates. We can therefore define a threshold magnitude Math on |(a − b)| corresponding to an angular difference threshold θth, where Math is a function of the chosen threshold angle θth. We can then compare the square of the magnitude |a − b| of the normalized vector difference (a − b) with the square of Math. Thus, comparing |a − b|^2 with Math^2 determines whether the angular difference between vector A and vector B is small enough that they should be grouped together.
We define Math^2 = (1 + |b|^2 − 2*|b|*cos θth), where
θth is the predetermined angle threshold for grouping decision purposes (e.g., 30 degrees), and
|b| = |(B/|A|)| = SQRT{(xb^2 + yb^2)/(xa^2 + ya^2)}.

If |a − b|^2 is less than Math^2, we decide that vector A can be in the same group as vector B. Thus, if |a − b|^2 is less than Math^2, the appropriate final grouping decision is that vector A is in the same group as vector B.
Thus, vector A and vector B are placed in the same group if |a − b|^2 is less than Math^2 and if, and only if, |b|^2 (≡ (|B|/|A|)^2) is greater than MrLth^2 and less than MrUth^2. The exact calculation of Math^2 requires a square root calculation (i.e., to compute |b|), and square root calculations can be computationally complex or require substantial hardware to implement. Eliminating the square root calculation can therefore substantially reduce computational cost or hardware. We have contemplated an approximation for Math (namely, Math = 0.5), which provides good grouping results for |b| within plus or minus 30% of 1 (i.e., for 0.7 ≤ |b| ≤ 1.3) and for vector error (vector difference) angles within 30 degrees. Using this approximation, the second grouping criterion becomes |a − b|^2 < 0.5^2.
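The quality of the Math ≈ 0.5 approximation over the stated range can be checked numerically. This is a hedged sketch (the function name is ours): it evaluates the exact law-of-cosines threshold over 0.7 ≤ |b| ≤ 1.3 with θth = 30 degrees and confirms the values cluster near 0.5.

```python
import math

# Exact threshold Math = sqrt(1 + |b|^2 - 2|b|cos(theta_th)) as a function of
# the magnitude ratio |b|. Geometrically this is the triangle side opposite
# theta_th, so its minimum over |b| is sin(theta_th) = 0.5 at |b| = cos(theta_th).
def exact_math(b_mag, theta_deg=30.0):
    return math.sqrt(1 + b_mag ** 2 - 2 * b_mag * math.cos(math.radians(theta_deg)))

# Sample the empirical range 0.7 <= |b| <= 1.3 from the text.
vals = [exact_math(0.7 + 0.1 * i) for i in range(7)]
```

Over this range the exact threshold stays between 0.5 and roughly 0.66, which is why the fixed value 0.5 is a workable (slightly conservative at the edges) substitute that avoids the square root.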
If we plot the relationship between the angular difference θ, the magnitude ratio |b|, and the normalized difference |a − b|, we obtain the graph of Figure 12.

Figure 12 is a graph of the magnitude |(a − b)| of the normalized vector difference (a − b) as a function of the magnitude ratio |b|, for various values of the angular difference θ (between 0 and 90 degrees), illustrating the usability of the approximation that can be used in decision step dS1140 of the grouping algorithm of Figure 11.
Experiments on exemplary videos give good grouping results for a magnitude ratio within plus or minus 30% of 1 and an angular difference θ of up to about 30 degrees, as shown by the boundary of the square region in Figure 12. As the graph of Figure 12 shows, this empirical range corresponds to normalized vector difference values between 0 and approximately 0.5.

Using the approximation, SQRT(1 + |b|^2 − 2*|b|*cos θ) can be approximated by 0.5 regardless of |b|, to reduce the computational burden. With this approximation, the second grouping criterion becomes |a − b|^2 < 0.5^2.
Figure 11 is a flow chart illustrating a grouping process according to an embodiment of the inventive concept. Grouping process 1100 uses two criteria, including the normalized vector difference as an indirect measure of the angular difference between the motion vectors of the selected feature points (see the image frames of Figures 1A and 2A), to perform the vector grouping step of Fig. 1D in the DIS method of Figures 1A to 1F. Grouping algorithm 1100 includes a magnitude ratio grouping decision rule (in decision step dS1120) and a normalized vector difference grouping decision rule (in decision step dS1140). A pairing algorithm (steps S1104, S1106, S1152, dS1150, and S1154) operating around grouping algorithm 1100 controls which feature points (motion vectors) are paired with which others, which remain unpaired, and which are excluded entirely from grouping algorithm 1100. The pairing algorithm supplies a pair of SFP motion vectors A and B as inputs to grouping algorithm 1100 (S1104, S1106). In initial step iS1102, the magnitude ratio margins (e.g., MrLth^2 and MrUth^2) and the angle margin are received from external circuits and supplied to grouping algorithm 1100.
Grouping algorithm 1100 calculates |A|^2 based on the received vector A and |B|^2 based on the received vector B (steps S1112 and S1114), so that these calculated values can be used in the subsequent calculations of steps S1116, dS1120, and dS1140. Thus, when the received vector B is excluded from grouping with the received vector A (the YES branch of decision step dS1120 or the NO branch of decision step dS1140), the pairing algorithm assigns a new vector B (step S1152), and grouping algorithm 1100 calculates the new value of |B|^2 based on the new vector B (step S1114) but need not update the calculated value |A|^2 of the current vector A (step S1112), because the comparisons continue with the same vector A but with a new vector B. Accordingly, hardware or software adapted to perform grouping algorithm 1100 can be configured to separately store one or more instances of the values |B|^2 and |A|^2, one of which is reused repeatedly so that the comparisons are computed efficiently as long as only one of the vectors A and B changes.
Next, grouping algorithm 1100 uses |A|^2 and |B|^2 (from steps S1112 and S1114) to calculate the magnitude ratio (|b|^2) and |a − b|^2 (S1116). The first (magnitude ratio) grouping decision rule is applied in decision step dS1120, which compares the square |b|^2 of the magnitude ratio with the magnitude ratio margins (from step iS1102). If |b|^2 < MrLth^2 or |b|^2 > MrUth^2 (the YES branch of decision step dS1120), the current vector A and the current vector B are not grouped, the comparison with the current vector B ends, and a new vector B is selected (step S1152). If |b|^2 is between MrLth^2 and MrUth^2 (the NO branch of decision step dS1120), the current vector A may yet be grouped with the current vector B, and the second grouping decision rule is applied (in decision step dS1140). If |b| is within a predetermined range (e.g., based on the value of |b|^2) and the angular difference θ is within a predetermined range (the YES branch of decision step dS1130), the magnitude Math of the normalized difference vector (a − b) is approximated (e.g., Math^2 = 0.5^2). Otherwise (the NO branch of decision step dS1130), the magnitude Math of the normalized difference vector (a − b) is calculated (S1132).
Next, the approximated or calculated magnitude Math of the normalized difference vector (a − b) is used in the second (normalized vector difference) grouping decision rule in decision step dS1140. In decision step dS1140, the square of Math (Math^2) is compared with (|a − b|)^2 and/or with the angle margin (from step iS1102). If (|a − b|)^2 is less than Math^2 (the YES branch of decision step dS1140), the current vector A and the current vector B can be grouped together (step S1142). If (|a − b|)^2 is not less than Math^2 (the NO branch of decision step dS1140), the current vector A and the current vector B are not grouped, the comparison with the current vector B ends, and a new vector B is selected (step S1152).
Once the current vector A has been compared with all available grouping candidate vectors B (the YES branch of decision step dS1150), a new vector A is selected and the comparisons continue with the remaining (ungrouped) grouping candidate vectors B (S1154, S1112, etc.); or, if all vectors have been grouped, grouping algorithm 1100 waits until a new frame needs processing.
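The outer pairing loop of grouping process 1100 can be sketched as a simplified sequential model of the flow described above: each ungrouped vector A is compared against the remaining candidate vectors B, and vectors that pass the pairwise test join A's group. The predicate passed in is a stand-in for the two-rule test of decision steps dS1120/dS1140; any pairwise test of that shape works here, and the function and variable names are ours.

```python
# Hedged sketch of the pairing loop: vectors is a list of (vx, vy) motion vectors,
# may_group is a pairwise predicate (e.g. the two-rule grouping test). Returns a
# list of groups, each a list of indices into vectors.
def group_vectors(vectors, may_group):
    groups = []
    unused = list(range(len(vectors)))
    while unused:
        ai = unused.pop(0)   # next vector A (corresponds to step S1154)
        group = [ai]
        remaining = []
        for bi in unused:    # compare A with every remaining candidate B
            if may_group(vectors[ai], vectors[bi]):
                group.append(bi)      # grouped (step S1142)
            else:
                remaining.append(bi)  # stays available for a later vector A
        unused = remaining
        groups.append(group)
    return groups
```

Note this greedy model groups each B with the first compatible A, mirroring the sequential comparisons of the flow chart rather than performing any global clustering.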
Figure 13 is a block diagram of a feature point grouping circuit 1300 including a grouping algorithm circuit 1310 configured to perform the feature point grouping algorithm of Figure 11. The feature point grouping circuit 1300 includes: a magnitude ratio (|b|) comparator 1320, configured to perform the first grouping decision (decision step dS1120 of Figure 11) based on the magnitude ratio threshold (MrLth and MrUth) criterion; and a vector angle comparator 1330, configured to perform the second grouping decision (decision step dS1140 of Figure 11) based on the normalized vector difference criterion (an assessment of the angular difference). The vector angle comparator 1330 in this exemplary embodiment of the inventive concept includes a normalized difference magnitude (Math) calculator/estimator 1332 and a normalized difference magnitude (Math) comparator 1334. The normalized difference magnitude (Math) calculator/estimator 1332 produces or calculates Math as described above with respect to steps dS1130, S1132, and S1134 of Figure 11.
The feature point grouping circuit 1300 shares the RAM memory 350 with the feature point circuit 3000 of Fig. 3. The SFP list 352 portion of the memory 350 contains the list of selected feature points output by the feature point candidate sorter 340. A pairing algorithm controller 1302 in the feature point grouping circuit 1300 accesses the SFP list 352 by DMA and selects the vectors A and B to be compared in the grouping algorithm circuit 1310, as described above with reference to steps S1152, S1154, dS1156, and dS1150 of Figure 11. When the comparisons yield one or more groups of vectors (groups of selected feature points), the pairing algorithm controller 1302 writes the grouped vectors, or a list describing them, into the FP group inventory 354 portion of the memory 350. According to one embodiment, the feature point grouping circuit 1300 and the memory 350 are implemented in a single integrated circuit chip, and the data retrieved from the memory 350 is output to circuits outside the chip via I/O pins. According to another embodiment, the feature point grouping circuit 1300, the RAM memory 350, and the feature point circuit 3000 are implemented in a single integrated circuit chip. In such embodiments, as a person skilled in the art will recognize, additional RAM memory can be added outside the IC chip if extra memory capacity is needed.
The feature point grouping circuit 1300 also includes an |A|^2 calculator 1312, a |B|^2 calculator 1314, and a |b|^2 and |a − b|^2 calculator 1316, configured to perform steps S1112, S1114, and S1116 of Figure 11, respectively.
The subject matter disclosed above is to be considered illustrative and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments that fall within the true spirit and scope of the inventive concept. Thus, to the maximum extent allowed by law, the scope of the invention is to be determined by the broadest permissible interpretation of the appended claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.
According to embodiments of the inventive concept, for the purpose of algorithmically selecting specific points that provide a feature point distribution suitable for digital image stabilization, each captured video frame is divided into a small number of non-overlapping tiles (typically 4 × 4 tiles for standard definition and 6 × 6 tiles for high definition). Different regions of the image may have different densities of suitable feature points. In an extreme case, such as a frame of cloudless blue sky, a region of the frame may not have any suitable feature points. In other regions, the potential feature points may be very dense.
The resulting feature point distribution is based on small regions of the video frame (e.g., non-overlapping tiles), where the number of feature points in each tile increases linearly with the variance σ² of the luminance image data of that tile. Tiles with interesting image data, and therefore a need for more feature points, are expected to have a higher variance σ². Co-pending application No. 8729-357 describes a process for setting the minimum distance between feature points in each tile while requiring only little local state information, thereby reducing the hardware implementation cost. The disclosure of 8729-357 is incorporated herein by reference.
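As an illustration of the variance-proportional allocation described above, the following sketch distributes a feature-point budget across a tile grid. It is a minimal stand-in, not the method of the co-pending application; the grid size, total budget and per-tile minimum are assumed values.

```python
import numpy as np

def feature_points_per_tile(luma, grid=(4, 4), total_budget=64, min_per_tile=1):
    """Allocate a feature-point budget across non-overlapping tiles so that
    each tile's share grows linearly with its luminance variance."""
    h, w = luma.shape
    th, tw = h // grid[0], w // grid[1]
    variances = np.array([
        np.var(luma[r * th:(r + 1) * th, c * tw:(c + 1) * tw].astype(np.float64))
        for r in range(grid[0]) for c in range(grid[1])
    ])
    if variances.sum() == 0:  # perfectly flat frame: spread the budget evenly
        counts = np.full(variances.size, total_budget // variances.size)
    else:
        counts = np.floor(total_budget * variances / variances.sum()).astype(int)
    # every tile keeps at least a minimal allocation
    return np.maximum(counts, min_per_tile).reshape(grid)
```

A high-variance tile (e.g., a textured corner of the frame) thus receives most of the budget, while a flat-sky tile receives only the minimum.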
If a scene is captured under low-light conditions, it will have relatively more noise, and the noise will affect feature points more than tiles, because the number of pixels in a feature point is much smaller than the number of pixels in a tile. The larger number of pixels in a tile provides a noise-cancelling effect, so the tile-based motion vectors of the downsampled image are also more accurate.
Even if a scene is not captured under low-light conditions, tile-based motion vectors may be more accurate if the scene is too flat. If the tile scene is flat, like a cloudy or blue sky, there may be some feature points in the flat tiles, and matching points of similar grade may be found for those feature points at any position in the next frame. Tile-based matching, however, does not rely on small feature-point search regions, and all the patterns in a tile can contribute to the tile matching process. As a result, when the scene is flat, tile-based motion vectors are more reliable.
When the best score of the feature point motion vector groups is less than a given threshold, we decide to use the tile-based motion vectors rather than the feature-point-based motion vectors. This strategy works well for high-noise or flat scenes.
According to embodiments of the inventive concept, we select estimated motion vectors that represent the movement of the background and of large objects; smaller objects need not have accurate motion vectors associated with them. Any inaccurate vectors for smaller objects can be filtered out at later stages of the DIS algorithm.
Feature points of large stationary objects, which are expected to be significant, will move in a uniform manner due to global motion or camera movement. It will be appreciated that significantly large objects that move independently cover the major portion of at least one tile, so that their motion can be estimated as the predominant motion of the tile itself, while the motion of small objects has little effect on the motion vector of the tile itself.
We use block matching at the lowest resolution to derive one motion vector per tile. The tile-based motion vectors can be used to decide the camera trajectory in certain special cases (such as high-noise or flat-scene video).
The motion vector for a given tile is the one that minimizes the sum of absolute differences (SAD). Moreover, the process of calculating the motion vectors of the feature points of each tile can be modified to reduce computation, by using a hierarchical motion estimation algorithm and by preferring tile movement over local movement, using the tile's motion vector as a start vector. Because an object large enough to cover the majority of at least one tile may extend into adjacent tiles, some feature points in each tile may be associated more strongly with the motion vector of an adjacent tile than with the motion vector of the tile in which they are found. Consequently, in the block matching search for the motion vectors of the feature points of any given tile, it is effective to use the motion vectors of all the adjacent tiles as multiple start vectors. Thus, the start vectors used to obtain the motion vector of a selected feature point are the start vector of the tile to which that feature point belongs and the start vectors of the four directly adjacent tiles (the upper tile, left tile, right tile and lower tile), provided each exists. For each start vector used, we use only a very small range for the local search of the feature point motion vector. The goal is not to determine an accurate vector for every feature point (bad motion vectors are culled later in the DIS processing chain). Rather, the only feature points of interest are those belonging to the background or to large objects. For those feature points, one of the tile motion vectors should be good, or close to, the motion vector of the feature point of interest, and a small local search about each of the selected tile motion vectors is therefore sufficient. For each selected feature point in a tile, a small local block matching search is performed around each start vector in the set of start vectors, in a higher-resolution domain (this may be the original video resolution, or may be downsampled by a factor fs3 of 2 or 4).
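The start-vector-based local search described above can be sketched as follows. This is a simplified illustration, not the patented hardware search: the block size, search radius and the use of SAD as the cost are assumed choices, and the start vectors are taken to be the feature point's own tile motion vector plus those of its neighbors.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-sized blocks."""
    return np.abs(a.astype(np.int64) - b.astype(np.int64)).sum()

def refine_fp_vector(prev, curr, fp, start_vectors, block=8, radius=2):
    """Small local block-matching search for one feature point: try every
    start vector (own tile + neighbors) and search only +/-radius around it."""
    h, w = prev.shape
    y, x = fp
    ref = prev[y:y + block, x:x + block]
    best, best_mv = None, (0, 0)
    for sy, sx in start_vectors:
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                ty, tx = y + sy + dy, x + sx + dx
                if 0 <= ty <= h - block and 0 <= tx <= w - block:
                    cost = sad(ref, curr[ty:ty + block, tx:tx + block])
                    if best is None or cost < best:
                        best, best_mv = cost, (sy + dy, sx + dx)
    return best_mv
```

Note that the search visits only (2·radius + 1)² candidates per start vector, which is the point of seeding the search with tile vectors instead of scanning a full window.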
FIG. 14 is a block diagram of a digital image stabilization (DIS) circuit performing a digital image stabilization (DIS) method according to an exemplary embodiment of the inventive concept. The DIS circuit includes: a detector unit (DU) that analyzes the received jittery video and outputs inter-frame transforms Ti(n); a trajectory unit (TU) that outputs a selected principal/compensation transform P(n) selected from among the inter-frame transforms Ti(n); and a compensation unit (CU) that modifies the jittery video using the selected principal/compensation transform P(n) and outputs stabilized video.
The detector unit (DU) estimates the inter-frame motion vectors of the feature points (FPs) and the inter-frame motion vectors of the non-overlapping tiles (tile vectors) in the received video data frames. The detector unit further outputs the FP motion vector group transforms and the tile group transforms as Ti(n).
The trajectory unit (TU) selects one of the inter-frame transforms Ti(n) (or the identity transform in the case of a large moving object covering the scene) as the principal transform P(n), thereby excluding the inter-frame transforms of small moving objects and the inter-frame transform of a large moving object (which may move across and cover the whole frame).
FIG. 15 is a block diagram of the detector unit 2000 in the DIS circuit of FIG. 14, which is adapted to calculate the affine transforms of the tile vector groups as steps of the DIS method of the DIS circuit of FIG. 14. The detector unit 2000 includes the feature point circuit 3000, the motion vector (MV) grouping circuit 1300 and the motion vector (MV) group affine transform calculator 2010.
The feature point circuit 3000 receives each frame of video data and preferably divides each video frame into a small number of j×k non-overlapping tiles. The number of j×k tiles can range from 4×4 for SD video to 6×6 for HD video; other numbers in the range (4..8)×(4..8) are also possible and may be useful. The tile size is chosen so that a significantly large object that moves independently covers the major portion of at least one tile, so that its motion can be captured for DIS purposes, while the motion of small objects can be ignored. The feature point circuit 3000 identifies and selects feature points (SFPs) in the received video frame, and outputs the motion vectors of the feature points and of the tiles (SFP MVs and tile MVs).
The feature point circuit 3000 includes a feature point selector, a motion vector calculator and the shared RAM memory 350. The feature point circuit 3000 may also include a Harris corner feature point candidate evaluator and a feature point candidate sorter. To save computational power and reduce the number of required operations, the feature point circuit 3000 operates only on luminance data, and includes one or more downsamplers and a multi-frame block matching search unit.
The feature point circuit 3000 estimates a motion vector for each tile. Tile motion vector (tile MV) estimation is carried out on the basis of non-overlapping tiles covering the center of the input image (for example, the same tiles that may be used in the feature point selection algorithm). For each tile, a full block matching search is performed on a deeply downsampled image. The full block matching search is carried out for each tile, and the tile motion vector (tile MV) is stored (356) for later use: for example, as a start vector in the multi-frame block matching search unit for deriving the motion vectors of the feature points (the SFP MVs stored at 352), and for stationary object detection.
The feature point circuit 3000 preferably provides a feature point list 352 whose distribution is based on small regions of the video frame called tiles, in which the maximum number of feature points per tile increases linearly with the variance σ² of the luminance image data of that tile. A desirable feature point for the DIS method is a point that yields a single-valued motion vector when a suitable motion estimation algorithm is applied. To identify feature points in an image, a Harris corner detection algorithm is applied to the pixels of the video frame to measure how well suited each pixel is to being a feature point. Different regions (tiles) of the image may have different densities of identified feature point candidates.
The feature point circuit 3000 preferably includes: a motion vector calculator, which performs the function of a tile vector calculator for computing the motion vector of each tile; and a multi-frame block matching search unit for determining and outputting the motion vector of each selected feature point (SFP). The tile vector calculator computes the motion vector of each tile using the deeply downsampled luminance data of the current frame Ft. The multi-frame block matching search unit determines the motion vector of each selected feature point using the full-resolution or downsampled luminance data of two successive frames, and the tile motion vectors can be used as start vectors.
All the feature points and the tile-related data are passed to the next DIS block, in particular to the motion vector grouping circuit 1300.
The motion vector grouping circuit 1300 is configured to perform the grouping algorithm on the FP motion vectors and the tile motion vectors. The motion vector grouping circuit 1300 includes a motion vector comparator 1310 configured to perform a grouping decision by comparing each pair of vectors selected by the pairing algorithm controller 1302.
The motion vector grouping circuit 1300 associates groups of FP motion vectors with objects in the scene, based on the perceived relative movement of the objects between successive video frames, by grouping the motion vectors of the selected feature points (SFPs). The motion vector grouping circuit 1300 also groups the tile motion vectors, based on the perceived relative movement of objects between successive video frames, to associate the tile vectors with objects in the scene.
The motion vector grouping circuit 1300 and the feature point circuit 3000 share the RAM memory 350. The SFP MV list portion 352-FP of the memory 350 contains the list of positions and motion vectors of the selected feature points (SFPs). The tile MV list portion 352-TMV of the memory 350 contains the list of positions and motion vectors of the non-overlapping tiles.
The pairing algorithm controller 1302 keeps track of which feature points and tiles (motion vectors) have been paired with which other feature points and tiles, which remain unpaired, and which will be excluded from grouping entirely. The pairing algorithm repeatedly provides motion vector pairs (vector A and vector B) as inputs to the MV comparator 1310.
The pairing algorithm controller 1302 in the feature point grouping circuit 1300 accesses the SFP MV list (352-FP) and the tile MV list (352-TMV), and selects vector A and vector B for comparison in the motion vector comparator 1310. When a series of vector A–vector B comparisons results in one or more groups of vectors (e.g., groups of selected feature points and groups of tiles), the pairing algorithm controller 1302 writes the grouped motion vectors, or a descriptive list thereof, to the FP MV group list portion 354 and the tile MV group list portion 358 of the memory 350.
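The pairwise comparison driving the grouping can be sketched as below, following the abstract's criterion of comparing the vector magnitudes and the angle difference of each pair. The thresholds and the greedy first-fit grouping pass are illustrative assumptions, not the patented steps S1112–S1116.

```python
import math

def same_group(va, vb, mag_ratio_thresh=2.0, angle_thresh_deg=30.0):
    """Grouping decision for one pair (vector A, vector B): the pair belongs
    to the same group when their magnitudes are similar and the angle between
    them is small. The two thresholds here are illustrative assumptions."""
    ax, ay = va
    bx, by = vb
    mag_a = math.hypot(ax, ay)
    mag_b = math.hypot(bx, by)
    if mag_a == 0.0 and mag_b == 0.0:  # two zero vectors always match
        return True
    if mag_a == 0.0 or mag_b == 0.0:
        return False
    ratio = max(mag_a, mag_b) / min(mag_a, mag_b)
    cos_angle = (ax * bx + ay * by) / (mag_a * mag_b)
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return ratio <= mag_ratio_thresh and angle <= angle_thresh_deg

def group_vectors(vectors):
    """Greedy pass in the spirit of the pairing algorithm controller: each
    vector joins the first existing group whose representative matches it."""
    groups = []
    for v in vectors:
        for g in groups:
            if same_group(g[0], v):
                g.append(v)
                break
        else:
            groups.append([v])
    return groups
```

Vectors moving together (background, or one large object) thus collapse into one group, while a vector with a divergent direction or speed starts its own group.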
The motion vector (MV) group affine transform calculator 2010 computes the inter-frame transform of each group of feature point motion vectors, computes the inter-frame transform of each group of tile motion vectors, and outputs all of them as Ti(n).
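One conventional way to compute an inter-frame transform for a motion vector group is a least-squares affine fit over the point correspondences the group implies; the patent does not spell out the estimation method used by calculator 2010, so the following is a generic sketch.

```python
import numpy as np

def group_affine_transform(points, motion_vectors):
    """Least-squares affine fit for one motion vector group: each feature
    point p with motion vector v gives the correspondence p -> p + v.
    Returns the 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    src = np.asarray(points, dtype=np.float64)
    dst = src + np.asarray(motion_vectors, dtype=np.float64)
    ones = np.ones((src.shape[0], 1))
    A = np.hstack([src, ones])                         # N x 3 design matrix
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)   # 3 x 2 solution
    return coeffs.T                                    # 2 x 3 affine matrix
```

At least three non-collinear points are needed for a unique fit; with more points the least-squares solution averages out small per-vector errors, which suits groups of background feature points.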
FIG. 16 is a block diagram of the trajectory unit (TU) 4000 of the DIS circuit of FIG. 14, which is adapted to select the principal (stationary/background) transform P(n), as steps of the DIS method of the DIS circuit of FIG. 14, based on a method of scoring the tile group transforms and the feature group transforms Ti(n).
The trajectory unit (TU) 4000 (FIG. 16) includes a tile group transform scoring and selection circuit 4100-1 (FIG. 17A), a feature group transform scoring and selection circuit 4100-2 (FIG. 17A), a collective group selection circuit 4200 (FIG. 18), a moving object exclusion circuit 4400 (FIG. 19) and an adaptive compensation filter.
The trajectory unit (TU) 4000 identifies the principal motion P(n) caused by the unsteady camera while ignoring moving objects in the scene, filters the selected principal transform P(n), and outputs the compensation transform C(n). The trajectory unit (TU) 4000 uses a plurality of continuous scoring functions to select the principal transform P(n) from among the received inter-frame transforms Ti(n).
FIG. 17A is a block diagram of the group transform scoring and selection circuit 4100 of the trajectory unit (TU) 4000 of the DIS circuit of FIG. 14. The group transform scoring and selection circuit 4100 includes the total transform score calculator 4150, the transform quality calculator 4160 and a group transform and quality selector 4170, configured to perform steps in the DIS method of the DIS circuit of FIG. 14. The group transform scoring and selection circuit 4100 is adapted to output (as 4100-1) the tile group principal transform GPTile(n) from among the tile group inter-frame transforms TTile,i(n), and is adapted to output (as 4100-2) the feature group principal transform GPFP(n) from among the FP inter-frame transforms TFP,i(n).
FIG. 17B is a block diagram of the exemplary implementation 4110-1 of the history score calculation unit 4110 in the group transform scoring and selection circuit 4100 of FIG. 17A, as realized in 4100-1.
Referring to FIGS. 17A and 17B, the group transform scoring and selection circuit 4100 includes the history score calculation unit 4110 (e.g., 4110-1), the motion score calculation unit 4120, the feature score calculation unit 4130 and the extent score calculation unit 4140, plus the total transform score Si(n) calculator 4150, the transform quality Qi(n) calculator 4160 and the group transform and quality selector 4170 (e.g., 4170-1).
The group transform and quality selector 4170 of the group transform scoring and selection circuit 4100 of FIG. 17A selects one of the inter-frame transforms Ti(n) as the group principal transform GP(n), based on the total transform score Si(n) of each inter-frame transform Ti(n) received from the total transform score calculator 4150 (thereby rejecting the inter-frame transforms of small moving objects), and outputs the group principal transform GP(n) and its associated quality Q(n).
Let Ti(n) be the i-th transform among all the candidate transforms received from the detector unit (DU) 2000, where n indicates the frame and its temporal order. Let GP(n) be the group principal transform selected at frame time n, i.e., GP(n) = Ti(n) for the selected i.
For each Ti(n), the total transform score Si(n) calculator 4150 receives the history score Hi(n) from the history score calculation unit 4110 (e.g., 4110-1), receives the motion score Mi(n) from the motion score calculation unit 4120, receives the feature score Fi(n) from the feature score calculation unit 4130, and receives the extent score Ei(n) from the extent score calculation unit 4140, and calculates the total transform score Si(n) based on the following equation:
Si(n) = Hi(n) * Mi(n) * Fi(n) * Ei(n)
For each Ti(n), the transform quality Qi(n) calculator 4160 receives the feature score Fi(n) from the feature score calculation unit 4130 and the extent score Ei(n) from the extent score calculation unit 4140, and calculates the transform quality Qi(n) based on the following equation:
Qi(n) = Fi(n) * Ei(n)
The Ti(n) having the largest Si(n) should be selected as the group principal transform GP(n) by the group transform selector 4170 of the group transform scoring and selection circuit 4100 of FIG. 17A. Thus, in the exemplary embodiment, the inter-frame transform candidate Ti(n) with the highest score Si(n) is selected as the group principal transform GP(n), and is then adaptively filtered to produce the compensation transform C(n) that compensates the jittery camera motion in the DIS compensation unit (CU) 6000 of the DIS circuit of FIG. 14.
The history score calculation unit 4110 (e.g., 4110-1) stores the history of the group principal transform GP(n) and, when each Ti(n) is received from the detector unit (DU) 2000, calculates the history score Hi(n) for each Ti(n) in turn, based on the stored history of the group principal transform GP(n) of predetermined length HL, where HL is an integer indicating a predetermined number of previous frames. The incoming Ti(n) is mathematically compared with each of the HL stored previously selected group principal transforms GP(n−1), ..., GP(n−k), where k is an integer frame-time index ranging from 1 (indicating the immediately preceding frame: n−1) to HL (the frame most distant in time: n−HL). Among the Ti(n), the transform having higher correlation with the HL stored previously selected group principal transforms GP(n−1) to GP(n−k) has the higher history score Hi(n).
The correlation Hi,k(n) between Ti(n) and each GP(n−k) is 1 minus a normalized norm, (1 − |Ti(n) − GP(n−k)|), and lies in the range [0, 1], where a value of Hi,k(n) = 1 indicates high correlation.
The contribution of each correlation Hi,k(n) = (1 − |Ti(n) − GP(n−k)|) is weighted by the corresponding history weight HW(k).
The history score Hi(n) is the overall correlation: the sum of the correlations Hi,k(n) weighted by HW(k), for 1 ≤ k ≤ HL, where HL is the length of the history (the number of past frames). Thus,
Hi(n) = Σ (1 − |Ti(n) − GP(n−k)|) * HW(k), 1 ≤ k ≤ HL
The weights HW(1) to HW(HL) are preferably chosen so that they sum to 1, and so that the history score Hi(n) output is non-linearly normalized and has the continuous range [0, 1].
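In scalar form, the weighted-sum history score can be sketched directly from the equation above; reducing each transform to a single number and pre-clamping the norm to [0, 1] are simplifications for illustration.

```python
def history_score(t_i, gp_history, weights):
    """History score Hi(n) = sum over k of (1 - |Ti(n) - GP(n-k)|) * HW(k).
    Transforms are reduced to scalars for illustration; the norm is clamped
    to [0, 1]; the weights are expected to sum to 1."""
    score = 0.0
    for gp, hw in zip(gp_history, weights):    # gp_history[0] is GP(n-1)
        corr = 1.0 - min(1.0, abs(t_i - gp))   # clamp keeps each term in [0, 1]
        score += corr * hw
    return score
```

A candidate identical to every stored GP(n−k) scores 1.0; a candidate far from all of them scores 0.0, which is how transforms of newly appearing moving objects are penalized.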
The exemplary hardware implementation 4110-1 of the history score calculation unit 4110 shown in FIG. 17B includes a FIFO (first-in, first-out) memory buffer for storing the HL previously selected group principal transforms GP(n−1) to GP(n−HL), with HL taps (for n−1 to n−HL) so that their stored contents are output to the comparator 4114. The comparator 4114 compares the current Ti(n) with each of the HL stored previously selected group principal transforms GP(n−1) to GP(n−HL), and outputs each comparison, weighted by the history weights HW(n−1) to HW(n−HL), to the total history calculator 4116, which outputs the overall correlation as the total history score Hi(n) in the continuous range [0, 1].
The motion score calculation unit 4120 receives each Ti(n) and calculates its motion score Mi(n) based only on Ti(n). In an alternative embodiment, the motion score calculation unit 4120 can be configured to receive stored information from the detector unit 2000 in order to calculate the motion score Mi(n). Transforms with small motion have higher motion scores Mi(n) and are more likely to become the group principal transform GP(n). For each inter-frame transform among the Ti(n), the motion score calculation unit 4120 calculates the motion score Mi(n).
A large value of Mi(n) corresponds to small motion, and vice versa. The motion can be based on the horizontal, vertical or total linear displacement of the transform. The motion score Mi(n) is inversely related to the linear displacement, and is preferably non-linearly normalized to have the continuous range [0, 1].
The feature score calculation unit 4130 receives each Ti(n) and calculates its feature score Fi(n) based only on Ti(n). In an alternative embodiment, the feature score calculation unit 4130 can be configured to receive stored information from the detector unit 2000 in order to calculate the feature score Fi(n). For each inter-frame transform among the Ti(n), the feature score calculation unit 4130 calculates the feature score Fi(n). The feature score Fi(n) is related to the number of feature points grouped together to form the feature point group represented by each inter-frame transform among the Ti(n). Among the Ti(n), transforms whose groups have more feature points have higher feature scores Fi(n). The feature score Fi(n) is preferably non-linearly normalized to have the continuous range [0, 1].
The extent score calculation unit 4140 receives each Ti(n) and calculates its extent score Ei(n) based only on Ti(n). In an alternative embodiment, the extent score calculation unit 4140 can be configured to receive stored information from the detector unit 2000 in order to calculate the extent score Ei(n). For each inter-frame transform among the Ti(n), the extent score calculation unit 4140 calculates the extent score Ei(n). Among the Ti(n), transforms whose feature points cover (are spread over) a larger area are scored higher. A larger value of the extent score Ei(n) corresponds to a larger covered area, and vice versa. The extent score Ei(n) is related to the height times width of the rectangular region containing all the feature points of the transform's group. The extent score Ei(n) is preferably non-linearly normalized to have the continuous range [0, 1].
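Taken together, the four normalized sub-scores combine multiplicatively into Si(n) and Qi(n), and selection takes the maximum over candidates. The sketch below mirrors those two equations; the candidate tuples and their values are hypothetical.

```python
def transform_scores(h, m, f, e):
    """Combine the four normalized sub-scores (each in [0, 1]):
    the total score Si(n) = H*M*F*E drives principal-transform selection;
    the quality Qi(n) = F*E is kept for the later collective decision."""
    s = h * m * f * e
    q = f * e
    return s, q

def select_group_principal(candidates):
    """Pick the candidate with the largest total score Si(n).
    `candidates` is a list of (transform_id, H, M, F, E) tuples."""
    best = max(candidates, key=lambda c: transform_scores(*c[1:])[0])
    return best[0]
```

Because the scores multiply, any single low sub-score (a short history, a large displacement, few feature points or a small extent) suppresses a candidate, which is how small-moving-object transforms are rejected.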
Each exemplary embodiment of the inventive concept uses scene history analysis to exclude a large object moving across the whole scene, which would otherwise cause undesirable results in video stabilization. Without correct scene history analysis, the principal transform selector is more likely to select the transform candidate corresponding to the large moving object, particularly when it covers the whole scene. It will be appreciated that when a large object moves across and fills the whole scene, the transform candidates Ti(n) do not include a principal transform P(n) corresponding to the unsteady camera.
FIG. 18 is a block diagram of an exemplary implementation of the collective transform scoring and selection circuit 4200 of the trajectory unit (TU) 4000 of the DIS circuit of FIG. 14, which includes: a collective decision calculator 4250, configured to calculate the collective decision CD(n); and a collective transform selector 4260, configured to output the collective principal transform CP(n) as a step in the DIS method of the DIS circuit of FIG. 14.
The collective decision calculator 4250 of FIG. 18 calculates the collective decision CD(n) from the feature group transform quality QFP(n), the tile group transform quality QTile(n) and the number KFG(n) of feature group transform candidates, received from the detector unit (DU) 2000.
An exemplary implementation of the collective decision calculator 4250 calculates a non-linearly normalized fragmentation measure ΘF(n) from the number KFG(n) of feature groups, such that ΘF(n) is 0 when KFG(n) is small and ΘF(n) is 1 when KFG(n) is large. Thus, a ΘF(n) value close to 1 indicates that the feature points in the video scene are divided into many feature groups, and vice versa.
The collective decision calculator 4250 outputs the collective decision CD(n) by comparing QFP(n) with ΘF(n)*QTile(n): if QFP(n) > ΘF(n)*QTile(n), the collective decision CD(n) is set to select the feature groups; and if QFP(n) <= ΘF(n)*QTile(n), the collective decision CD(n) is set to select the tile groups. In this formulation, if the feature groups are not very fragmented, ΘF(n) is close to 0 and the feature groups are more likely to be selected. Otherwise, if the feature groups are fragmented, ΘF(n) is close to 1 and the tile group transform quality QTile(n) competes with the feature group transform quality QFP(n) on an equal footing.
The collective transform selector 4260 performs a selection between the feature group principal transform GPFP(n) and the tile group principal transform GPTile(n). The collective transform selector 4260 is controlled by the collective decision CD(n), such that the output collective principal transform CP(n) is set to the feature group principal transform GPFP(n) when CD(n) is set to feature groups, and is otherwise set to the tile group principal transform GPTile(n).
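The collective decision logic can be sketched as follows; a piecewise-linear ramp stands in for the non-linear normalization of ΘF(n), and its endpoints are assumed values.

```python
def collective_decision(q_fp, q_tile, num_feature_groups, k_small=2, k_large=8):
    """Collective decision CD(n): compare the feature group quality QFP(n)
    against Theta_F(n) * QTile(n), where the fragmentation measure
    Theta_F(n) rises from 0 to 1 as the number of feature groups grows.
    The ramp endpoints k_small/k_large are assumed values."""
    # piecewise-linear stand-in for the non-linear normalization
    theta = min(1.0, max(0.0, (num_feature_groups - k_small) / (k_large - k_small)))
    return "feature" if q_fp > theta * q_tile else "tile"
```

With few feature groups the feature-based transform wins almost by default; as fragmentation grows, the tile-based transform must merely match the feature-based quality to be chosen.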
In this embodiment, the collective transform scoring and selection circuit 4200 performs the selection based on the feature group transform quality QFP(n) and the tile group transform quality QTile(n). These group transform qualities are calculated by the transform quality calculator 4160 of FIG. 17A, which receives inputs from the feature score calculation unit 4130 and the extent score calculation unit 4140.
The feature score calculation unit 4130 calculates the feature scores Fi(n) of the feature-based and tile-based transforms Ti(n). In this embodiment, in order to calculate the feature score Fi(n), the feature score calculation unit 4130 is configured to receive stored information from the detector unit 2000. For each inter-frame transform among the Ti(n), the feature score calculation unit 4130 calculates the feature score Fi(n). A transform Ti(n) whose group has more feature points, or whose group has more tiles, will have a higher feature score Fi(n), and will lead to a higher feature group transform quality QFP(n) or tile group transform quality QTile(n), respectively. In some embodiments, the number of feature points per tile can set the feature score Fi(n) of a feature-based transform Ti(n). In other embodiments, the number of tiles can set the feature score Fi(n) of a tile-based transform Ti(n). The number of feature points per tile and/or the number of tiles in each tile vector group can be obtained directly from the detector unit 2000.
The extent score calculation unit 4140 calculates the extent scores Ei(n) of the feature-based and tile-based transforms Ti(n). In this embodiment, in order to calculate the extent score Ei(n), the extent score calculation unit 4140 is configured to receive the stored extent information from the detector unit 2000. Transforms whose feature points or tiles cover a larger area are scored higher. The number of feature points and the size of the tiles in each tile vector group can be obtained directly from the detector unit 2000. Similarly, the horizontal and vertical extent of each feature-based motion vector group can be obtained directly from the detector unit 2000. A feature group or tile group covering a larger extent will have a higher extent score Ei(n), and will lead to a higher feature group transform quality QFP(n) or tile group transform quality QTile(n), respectively.
After the collective principal transform CP(n) has been selected by the collective transform scoring and selection circuit 4200, the large object exclusion hardware determines whether the selected collective principal transform CP(n) is attributable to a large moving object that has moved to cover the whole scene. When such an exclusion is made, an identity transform (UT) is created to substitute for, and serve as, the selected principal transform P(n) for the compensation circuit of the DIS system, so that the stabilized video does not incorrectly or unnecessarily follow the transform of the large moving object.
According to an embodiment of the inventive concept, the moving object exclusion method is activated based on two observations: a pre-existing stationary background (indicated by the history of P(n)); and a period of time during which the stationary background and a large moving object coexist.
The moving object exclusion method can efficiently handle the following scenario: the scene has an almost stationary background, with or without moving objects; a large moving object enters the scene and gradually covers a larger area; the large moving object covers the whole scene; the large moving object begins to leave the scene, and the background starts to reappear; and the large moving object finally moves away.
The moving object analyzer detects the exclusion scenario as follows:
the continued existence of the stationary MV group indicates an existing scene with an almost stationary background;
an increasing member count of the continuous similar-speed MV group indicates that an object is moving into the scene; and
if the trend continues, and at time n the continuous similar-speed MV group covers the whole scene while the stationary MV group ceases to exist, then the exclusion scenario is detected.
The exclusion decision ED(n) is sent to the exclusion transform selector. The exclusion transform selector selects the collective principal transform CP(n) unless ED(n) indicates the exclusion scenario, in which case the principal transform P(n) is instead set to the identity transform. Thus, even if a large moving object covers the whole scene, the stabilized video does not incorrectly follow the large moving object.
FIG. 19 is a block diagram of an exemplary implementation of the moving object exclusion circuit 4400 of the trajectory unit (TU) 4000 of the DIS circuit of FIG. 14; the moving object exclusion circuit 4400 includes a moving object analyzer 4470, configured to perform steps in the DIS method of the DIS circuit of FIG. 14, and an exclusion transform selector 4480.
The moving object exclusion circuit 4400 includes a plurality of group history circuits 4410, 4420, 4430, 4440 for storing the scene history, and the moving object analyzer 4470. At any time there is only one designated stationary group G0, but there can be zero or more existing motion groups Gk, where k > 0. There can also be a new motion group GN, which will become one of the k existing motion groups Gk at the next frame (e.g., k(n+1) = k(n) + 1).
The stationary group G0 has an associated group history GH0. Each of the k existing motion groups Gk has an associated group history GHk and an associated motion vector Mk. Each existing motion group Gk has a motion vector Mk, which is substantially the low-pass-filtered |Ti(n)| of the Ti(n) of similar speed at each frame up to frame n.
Each new motion group GN has an associated group history GHN(n), which is initialized when the group is created. The moving object analyzer 4470 receives the scene history composed of the plurality of group histories GH0(n), GH1(n), ..., GHJ(n) and GHK(n) and GHN(n), and calculates the exclusion decision ED(n) from them.
The exclusion transform selector 4480 performs a selection between the identity transform (UT) and the collective principal transform CP(n). The exclusion transform selector 4480 is controlled by the exclusion decision ED(n), such that the output principal transform P(n) is set to the identity transform (UT) when ED(n) is activated, and is otherwise set to the collective principal transform CP(n). The identity transform (UT) causes the compensation unit to do nothing during compensation. Thus, when the moving object analyzer 4470 detects the "large moving object" scenario and activates the exclusion decision ED(n), the transform of the large moving object, which might otherwise be selected as the principal transform P(n), is excluded. As a result, when a large moving object is detected, its transform is excluded from the compensation performed by the compensation unit 6000 of FIG. 14.
FIG. 20 is a hybrid block diagram-flow chart illustrating details of the moving-object exclusion circuit 4400 of FIG. 19 configured to perform steps in the DIS method of the DIS circuit of FIG. 14. FIG. 20 illustrates details of the representative group history circuits 4410, 4430, and 4440, corresponding respectively to the stationary group G0, an existing motion group Gk, and a newly created motion group GN+1.
Each of the group histories (e.g., GH0(n)) that the moving-object analyzer 4470 of the moving-object exclusion circuit 4400 of FIG. 19 receives from a group history circuit (e.g., 4410) includes two types of history data associated with each group: the selection history (e.g., SH0(n)) and the existence history (e.g., EH0(n)).
The moving-object analyzer 4470 detects the exclusion condition as follows. Continued existence and selection indications over many frames in the group history GH0 of the stationary transform G0(n) indicate a scene with a mostly stationary background. A gradually increasing count of continued existence in the group history GHK of a particular motion group GK indicates that an object is moving into the scene. If this existence-and-motion trend continues, and if at time (n) no stationary transform joins G0 while the selected transform P(n) joins GK, then the large-moving-object condition is detected, and the activated exclusion decision ED(n) is sent to the principal transform selector 4160-2. If ED(n) indicates the large-object exclusion condition, the principal transform P(n) is set to the unity transform; otherwise the principal transform P(n) is selected according to scoring functions of the Ti(n).
Each of the group history circuits 4410, 4420, 4430, 4440 stores and processes three types of history information associated with the group to which one of the received inter-frame transforms Ti(n) is related. The three types of group history are the selection history, the existence history, and the motion history. When video stabilization begins, the stationary group G0 is created with an empty history. The motion history of the stationary group G0 can be omitted and assumed to be null. Motion groups (G1, ..., GK, ..., GN) are created or deleted during the DIS video-processing process.
Referring to FIG. 20, the group history circuits 4410, 4430, and 4440, corresponding respectively to the stationary group G0, the N existing motion groups Gk, and the newly created motion group GN+1, provide the group histories GH0, GHk, and GHN+1.
The group history circuit 4410 of the stationary group G0 includes a History0 memory for storing the selection history SH0 and the existence history EH0. The existence history EH0 is a one-bit value for each past frame, indicating whether an inter-frame transform Ti(n) was added to the group G0 in that previous frame. The selection history SH0 is a one-bit value for each past frame, indicating whether the inter-frame transform Ti(n) that joined the group G0 in that previous frame was selected as the principal transform P(n).
The group history circuit 4410 of the stationary group G0 omits the motion history M0, because the decision (decision step dS4418) whether any Ti(n), including the selected principal transform P(n), joins the stationary group G0 depends on a comparison of Ti(n) with a threshold thd0 rather than with a history-based variable motion vector M0, since the group G0 is deemed stationary. When video stabilization begins, the stationary group G0 is created with an empty history.
If during frame n, Ti(n) satisfies |Ti(n)| < thd0 (the YES branch of decision step dS4418), then:
this Ti(n) joins G0;
the existence history EH0 is updated to indicate that a stationary transform exists at frame n; and
if P(n) = this Ti(n), the selection history SH0 is updated to indicate that this Ti(n) was selected.
Otherwise (the NO branch of decision step dS4418), those Ti(n) that do not satisfy |Ti(n)| < thd0 during the frame are compared with the group history in each of the existing motion groups G1 through GN.
The group history circuit 4430 of motion group Gk includes a Historyk memory for storing the selection history SHk, the existence history EHk, and the motion history Mk. The existence history EHk is a one-bit value for each past frame, indicating whether an inter-frame transform Ti(n) was added to the motion group Gk in that previous frame. The selection history SHk is a one-bit value for each past frame, indicating whether the inter-frame transform Ti(n) that joined the motion group Gk in that previous frame was selected as the principal transform P(n).
The motion history Mk stores information indicating the vector Mk of the overall motion of group Gk. Each Ti(n) also maps to a motion vector M, and each motion group Gk maps to a motion vector Mk. Let |Ti(n)| be the magnitude of the motion vector of Ti(n), and let |Ti(n) - MK| be the deviation of Ti(n) from the motion vector MK of the existing motion group GK, for 1 ≤ K ≤ N, where N is the number of currently existing motion groups. Among the N existing motion groups, the motion group GJ with the smallest |Ti(n) - MJ| indicates the best matching group GJ for Ti(n). The join decision can be determined by comparing |Ti(n) - MJ| with a predetermined threshold thd1. Thus, for example, in decision step dS4438, if for a particular J and all K between 1 and N, |Ti(n) - MJ| ≤ |Ti(n) - MK| and |Ti(n) - MJ| < thd1 (the YES branch of decision step dS4438), then this Ti(n) joins the existing motion group GJ.
If for all K, | Ti(n)-MJ|≤|Ti(n)-MK|, and | Ti(n)-MJ| < thd1, (the "Yes" branch of deciding step dS4438), then:
TiN () adds GJ
Adjust motion history MJThe T being newly added with reflectioni(n);
Update history of existence EHJAt frame n, exercise group G is there is with instructionJ
If P (n)=this Ti(n), the most the newly selected history SHJ, select this T with instructioni(n)=P (n).
On the other hand, if after decision step dS4438 has been repeated for Ti(n) and for all existing motion groups (G1 through GN), no MK satisfies |Ti(n) - MK| < thd1 (the NO branch of decision step dS4438), then this Ti(n) joins a newly created motion group GN+1 (step S4449). If this Ti(n) joins the newly created motion group GN+1 (step S4449), then:
Ti(n) joins the newly created motion group GN+1;
the motion history MN+1 is set to the motion vector of this Ti(n);
the existence history EHN+1 is initialized to indicate that the new motion group GN+1 exists at frame n; and
if P(n) = this Ti(n), the selection history SHN+1 is updated to indicate that this Ti(n) = P(n) was selected.
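The join logic of decision steps dS4418 and dS4438 and creation step S4449 can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: transforms are reduced to 2-D motion vectors, the thresholds thd0 and thd1 are placeholder values, and the "adjust MJ" update is assumed to be a leaky running average (the text says only that the motion history is adjusted to reflect the newly added Ti(n)).

```python
import math

THD0 = 1.0   # stationary threshold thd0 (illustrative value)
THD1 = 2.0   # motion-group matching threshold thd1 (illustrative value)

def assign_to_group(ti, stationary, motion_groups, alpha=0.2):
    """Assign the motion vector ti = (x, y) of one inter-frame transform
    Ti(n) to the stationary group G0, the best-matching existing motion
    group GJ, or a newly created motion group (dS4418 / dS4438 / S4449).
    Each motion group is a dict carrying a low-pass-filtered group vector M."""
    # dS4418: |Ti(n)| < thd0  ->  join the stationary group G0
    if math.hypot(*ti) < THD0:
        stationary["members"].append(ti)
        return "G0"
    # dS4438: find the best-matching existing motion group GJ
    if motion_groups:
        j, gj = min(enumerate(motion_groups),
                    key=lambda kg: math.dist(ti, kg[1]["M"]))
        if math.dist(ti, gj["M"]) < THD1:
            gj["members"].append(ti)
            # adjust motion history MJ toward the newly added Ti(n);
            # the exact update rule is not specified, a leaky average is assumed
            gj["M"] = tuple((1 - alpha) * m + alpha * t
                            for m, t in zip(gj["M"], ti))
            return f"G{j + 1}"
    # S4449: no group matched -> create motion group G(N+1) with M = Ti(n)
    motion_groups.append({"M": ti, "members": [ti]})
    return f"G{len(motion_groups)}"
```

Small vectors fall into G0, vectors near an existing group vector join and pull that group's M toward them, and everything else seeds a new group.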
Any motion group (among G0 through GJ) to which no Ti(n) has been added within an extended period of time (frames) will be deleted.
FIG. 21 is a view of a video frame captured at time n and of the compensation window corresponding to the compensation transform C(n) computed from the principal transform P(n), showing an excessive vertical deviation v1 to be reduced. In a step of the digital image stabilization (DIS) method according to an exemplary embodiment of the inventive concept, the excessive vertical deviation of the compensation window is measured as v1.
As shown in FIG. 21, the compensation window corresponding to the compensation transform C(n) of the captured video frame can have an excessive vertical deviation (v0 or v1), an excessive horizontal deviation (u0 or u1), or both an excessive vertical deviation (v0 or v1) and an excessive horizontal deviation (u0 or u1). Each of the potential over-deviations (v0, v1, u0, and u1) can be caused by the translational component of the compensation transform C(n), by its rotational component, or by both its translational and rotational components.
It is desirable to minimize the over-deviations (v0, v1, u0, and u1) by adaptively filtering the principal transform P(n), based on the deviation history, to output a filtered compensation transform C(n) for each captured video frame.
FIG. 22 is a block diagram of a DIS circuit performing a digital image stabilization (DIS) method according to another exemplary embodiment of the inventive concept. The DIS circuit includes: a detection unit (DU) 2000, which analyzes the received jittery video and outputs the inter-frame transforms Ti(n); a trajectory unit (TU) 4000, including a principal transform selection circuit (4100, 4200, 4400) that identifies the principal transform P(n) among the Ti(n) and an adaptive compensation filter 8000 that filters P(n) into the compensation transform C(n); and a compensation unit (CU) 6000, which outputs stabilized video by modifying the jittery video frames using C(n).
The principal transform selection circuit (4100, 4200, 4400) selects one of the inter-frame transforms Ti(n) as the principal transform P(n) by identifying the inter-frame transform Ti(n) of the global motion caused by unsteady camera movement while ignoring the inter-frame transforms Ti(n) of moving objects in the scene, and outputs its selection as the computed principal transform P(n). Thus, the principal transform selection circuit (4100, 4200, 4400) of the DIS circuit selects and outputs one of the inter-frame transforms Ti(n) as the computed principal transform P(n). The compensation transform C(n) is obtained by adaptively filtering the principal transform P(n). The compensation transform C(n) is a description of the geometric relationship of the stabilized video image (the compensation window) to the corresponding input video image. This description can include position, angle, scale, and so on. Common compensation transforms are the similarity transform and the affine transform, but the inventive concept is not limited to these transforms, and we illustrate exemplary methods according to the inventive concept using the affine transform.
The principal transform selection circuit (4100, 4200, 4400) sequentially outputs the selected principal transforms P(n-∞), ..., P(n-1), P(n) of the sequence of successive frames to the adaptive compensation filter 8000, where the notation P(n-∞) indicates the use of a recursive (infinite impulse response, IIR) filter. The adaptive compensation filter 8000 estimates the intended camera trajectory from the jittery motion represented by the principal transform sequence P(n-∞), ..., P(n-1), P(n), and outputs the compensation transform C(n) according to the estimated camera trajectory.
The visual effect of the stabilized video depends heavily on the quality of the adaptive compensation filter 8000. Conventional trajectory estimation methods include motion vector integration, Kalman filtering, and the like. However, these and other conventional trajectory estimation methods do not perform well over a wide range of jittery-video characteristics. In exemplary embodiments of the inventive concept, an adaptive compensation filter is used to filter the jittery motion and produce stabilized video.
FIG. 23 is a block diagram of the adaptive compensation filter 8000 in the trajectory unit (TU) 4000 of the DIS circuit of FIG. 22, configured to adaptively filter the principal transform P(n) based on the compensation-window deviation history. The adaptive compensation filter 8000 receives the principal transforms P(n-∞), ..., P(n-1), P(n) of the sequence of successive frames, filters the principal transform P(n), and outputs the adaptively filtered compensation transform C(n).
The adaptive compensation filter 8000 includes a strong compensation filter (SC) 8700, a weak compensation filter (WC) 8600, an adaptive filter control circuit 8500 for outputting a control signal E(n), and a deviation-modulated mixer 8200. The SC filter is a highly frequency-selective high-order linear time-invariant digital filter and is effective for very jittery input video. The weak compensation (WC) filter, on the other hand, has a less frequency-selective characteristic, and produces smaller compensation-window over-deviations at the cost of a less stable output video.
The adaptive compensation filter 8000 is effectively a combination of the SC filter and the WC filter. The deviation-modulated mixer 8200 mixes the SC filter and WC filter outputs based on the control signal E(n), which the adaptive filter controller 8500 generates and outputs on the basis of the compensation-window deviation history.
FIG. 24 is a block diagram of a first exemplary implementation 8000-1 of the adaptive compensation filter 8000 of the trajectory unit (TU) 4000 of the DIS circuit of FIG. 22. The exemplary adaptive compensation filter 8000-1 includes the strong compensation filter 8700, the weak compensation filter 8600, and a feedback loop into the deviation computer 8510 of the adaptive filter controller 8500-1.
Referring to FIG. 24, the strong compensation filter (SC) 8700 is a high-order linear time-invariant recursive digital filter with a highly frequency-selective output F(n); this high-order linear time-invariant recursive digital filter has a cutoff frequency at about 1.0 Hz and a sharp rolloff, in order to obtain visually very stable video.
The weak compensation filter (WC) 8600 is a high-order or lower-order linear time-invariant recursive digital filter. The WC 8600 has a less frequency-selective output G(n), with a cutoff frequency slightly above 1 Hz (e.g., at 1.2 Hz) and a soft rolloff, to reduce over-deviations.
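The SC/WC distinction is purely one of filter design (cutoff and rolloff). A first-order recursive low-pass is far softer than the high-order designs described above, but it illustrates the cutoff difference; the 30 fps frame rate and the one-pole form are assumptions, not taken from this description:

```python
import math

FPS = 30.0  # assumed frame rate; the text does not fix one

def one_pole_lowpass(xs, cutoff_hz, fs=FPS):
    """First-order recursive (IIR) low-pass: y[n] = y[n-1] + a*(x[n] - y[n-1]).
    A simple stand-in for the higher-order SC/WC designs described above."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, out = 0.0, []
    for x in xs:
        y += a * (x - y)
        out.append(y)
    return out

# Pure 5 Hz "jitter" test signal: the lower-cutoff, more selective filter
# (SC-like) passes less of it than the softer one (WC-like).
jitter = [5.0 * math.sin(2.0 * math.pi * 5.0 * n / FPS) for n in range(300)]
sc_like = one_pole_lowpass(jitter, 1.0)   # ~1.0 Hz cutoff (SC stand-in)
wc_like = one_pole_lowpass(jitter, 1.2)   # ~1.2 Hz cutoff (WC stand-in)
```

Comparing the steady-state amplitude of the two outputs shows the SC-like filter attenuating the jitter band more strongly, which is exactly the trade-off the text describes: stronger jitter suppression versus smaller over-deviation.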
The deviation-modulated mixer 8200-1 of the adaptive compensation filter 8000-1 performs deviation-modulated adaptive filtering by combining F(n) and G(n) according to the scalar control signal E(n). Both the output F(n) of the SC filter and the output G(n) of the WC filter are intermediate compensation transforms, and the output C(n) of the deviation-modulated mixer 8200-1 is also a compensation transform. The deviation-modulated mixer 8200-1 outputs C(n) = (1 - E(n)) * F(n) + E(n) * G(n), where E(n) is a nonlinearly normalized scalar control signal in the range [0, 1], "*" is a multiplication between a scalar and a transform that maps to a transform, and "+" is an addition between two transforms that maps to a transform. The adaptive compensation filter 8000-1 in this exemplary embodiment is therefore a linear combination of the SC filter and the WC filter. Consequently, by the principle of linear superposition, the adaptive compensation filter 8000-1 is effectively a high-order linear recursive digital filter with known stability characteristics.
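The mixing formula itself can be sketched directly. Representing a compensation transform as a 2x3 affine matrix (so that "*" and "+" become ordinary scalar-matrix and matrix-matrix operations) is an assumption, chosen to be consistent with the affine transforms used elsewhere in this description:

```python
import numpy as np

def mix_transforms(e_n, f_n, g_n):
    """Deviation-modulated mix C(n) = (1 - E(n)) * F(n) + E(n) * G(n).
    Transforms are represented as 2x3 affine matrices (an assumption here)."""
    assert 0.0 <= e_n <= 1.0, "E(n) is normalized into [0, 1]"
    return (1.0 - e_n) * f_n + e_n * g_n

F = np.array([[1.00, 0.00, 4.0],    # SC output F(n): stronger correction
              [0.00, 1.00, 2.0]])
G = np.array([[1.00, 0.00, 1.0],    # WC output G(n): milder correction
              [0.00, 1.00, 0.5]])

C = mix_transforms(0.25, F, G)      # small deviation history -> closer to F
```

E(n) = 0 reproduces the SC output exactly and E(n) = 1 the WC output, matching the behavior described in the next paragraph.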
The linearly combined compensation transform C(n) is controlled by the scalar control signal E(n), which is based on the compensation-window deviation history. Small deviations in the history produce a small E(n) and thus increase the influence of the SC filter on the current frame n, while large deviations in the history produce an E(n) close to 1 and thus increase the influence of the WC filter on the current frame n. Moderate deviations in the stored history apportion proportional influence between the SC filter and the WC filter.
Thus, the SC filter provides the main contribution when deviations are small, and is highly effective at filtering out severe jitter. Because the WC filter contributes more when deviations are larger, the occurrence of over-deviations is greatly reduced. The adaptive compensation filter 8000-1 prevents excessive over-deviations for input video with large motion, while maintaining excellent video-stabilization characteristics.
Referring to FIG. 24, the adaptive filter controller 8500-1 includes the deviation computer 8510, four deviation-history time-integrators 8520, and the modulation-factor computer 8530-1. The adaptive filter controller 8500-1 is part of a feedback loop. The output E(n) of the deviation computer 8510 is derived from the previous adaptive compensation filter outputs C(n-∞), ..., C(n-2), C(n-1), where n denotes the time-sequence index, so that E(n) and C(n) do not form a delay-free loop, which would not be physically realizable. The exemplary embodiment is therefore suitable for real-time video stabilization, and comprises a causal linear time-varying filter with predictable characteristics.
The deviation computer 8510 receives the feedback of the compensation transform C(n) output by the deviation-modulated mixer 8200-1. The deviation computer 8510 includes a u0 computer, a u1 computer, a v0 computer, and a v1 computer, which separately calculate the left, right, bottom, and top deviations of each frame based on the positions of the four corner points of the compensation window (see FIG. 21).
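A sketch of the per-side deviation computation, assuming a 2x3 affine C(n), an illustrative 1920x1080 frame, a 100-pixel compensation-window inset, and image coordinates with y increasing downward (none of these values or conventions are fixed by the text):

```python
import numpy as np

FRAME_W, FRAME_H = 1920, 1080   # illustrative frame size
MARGIN = 100                    # illustrative compensation-window inset

def side_deviations(c_n):
    """Compute (u0, u1, v0, v1), the left/right/bottom/top over-deviations of
    the compensation window under compensation transform C(n) (2x3 affine).
    A deviation is how far a transformed corner leaves the frame; it is zero
    when the window stays inside."""
    corners = np.array([[MARGIN,           MARGIN,           1],
                        [FRAME_W - MARGIN, MARGIN,           1],
                        [MARGIN,           FRAME_H - MARGIN, 1],
                        [FRAME_W - MARGIN, FRAME_H - MARGIN, 1]], dtype=float)
    moved = corners @ c_n.T                        # transformed corner positions
    u0 = max(0.0, -moved[:, 0].min())              # past the left edge
    u1 = max(0.0, moved[:, 0].max() - FRAME_W)     # past the right edge
    v0 = max(0.0, -moved[:, 1].min())              # past one vertical edge
    v1 = max(0.0, moved[:, 1].max() - FRAME_H)     # past the other vertical edge
    return u0, u1, v0, v1
```

For example, a pure translation of 150 pixels to the right pushes the window 50 pixels past the right frame edge, giving u1 = 50 and zero for the other three sides.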
The adaptive filter controller 8500-1 maintains the deviation history by using recursive filters. Each side deviation output from the deviation computer 8510 is then individually time-integrated by a deviation-history time-integrator (essentially a low-pass recursive filter). The output of each low-pass recursive filter (Hu0, Hu1, Hv0, Hv1) is then fed to the modulation-factor computer 8530-1. The modulation-factor computer 8530-1 selects the largest of the four time-integrated deviations (Hu0, Hu1, Hv0, Hv1) and produces the nonlinearly normalized scalar control signal E(n) with the continuous range [0, 1].
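The per-side integrators and the modulation-factor computer can be sketched as follows. The smoothing constant of the low-pass recursive filter and the saturating-exponential normalization curve are assumptions; the text specifies only that E(n) is a nonlinearly normalized value in [0, 1] derived from the maximum of the four time-integrated deviations:

```python
import math

class DeviationHistory:
    """Per-side deviation history: a low-pass recursive (IIR) integrator,
    H[n] = (1 - a) * H[n-1] + a * d[n]  (smoothing constant 'a' assumed)."""
    def __init__(self, a=0.1):
        self.a, self.h = a, 0.0

    def update(self, deviation):
        self.h = (1.0 - self.a) * self.h + self.a * deviation
        return self.h

def modulation_factor(hu0, hu1, hv0, hv1, scale=10.0):
    """Pick the largest time-integrated side deviation and map it through a
    nonlinear normalization into E(n) in [0, 1] (the exact curve is not
    specified in the text; a saturating exponential is assumed here)."""
    worst = max(hu0, hu1, hv0, hv1)
    return 1.0 - math.exp(-worst / scale)

histories = [DeviationHistory() for _ in range(4)]   # u0, u1, v0, v1
deviations = (0.0, 25.0, 0.0, 5.0)                   # one frame's side deviations
h = [hist.update(d) for hist, d in zip(histories, deviations)]
e_n = modulation_factor(*h)    # larger histories push E(n) toward the WC filter
```

Zero accumulated deviation gives E(n) = 0 (pure SC filtering), and growing deviation drives E(n) monotonically toward 1 (pure WC filtering), as the surrounding paragraphs describe.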
The modulation-factor computer 8530-1 outputs the nonlinearly normalized scalar control signal E(n) to modulate the mixing of F(n) and G(n). A small value of E(n) implies a small deviation history, and a large value of E(n) implies a large deviation history.
Thus, under the control of the scalar control signal E(n), the mixing of F(n) and G(n) produces and outputs the compensation transform C(n) based on the compensation-window deviation history. This exemplary embodiment provides good stabilization without frequent over-deviations, has a known frequency response and predictable stability characteristics, and is suitable for real-time video stabilization.
The subject matter disclosed above is to be considered illustrative and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments that fall within the true spirit and scope of the inventive concept. Thus, to the maximum extent allowed by law, the scope of the inventive concept is to be determined by the broadest permissible interpretation of the appended claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.
Cross-Reference to Related Applications
This application claims priority from two U.S. provisional applications filed with the U.S. Patent and Trademark Office on December 23, 2010: No. 61/426,970 and No. 61/426,975. The disclosures of both provisional applications are incorporated herein by reference.

Claims (10)

1. A method of processing an image, comprising:
receiving first image data representing a first image frame;
dividing a portion of the first image frame into a plurality of tiles;
identifying a feature point in each tile;
receiving second image data representing a second frame;
deriving motion vectors corresponding to the identified feature points;
grouping the motion vectors into motion vector groups having similar motion characteristics;
selecting the motion vector group containing motion vectors that represent the movement of stationary objects in the scene of the first image frame; and
identifying, from the selected motion vector group, a compensation transform representing the camera movement to be compensated,
wherein the motion vectors are grouped into motion vector groups based on their vector magnitude ratios and angle differences.
2. The method of claim 1, further comprising deriving a tile motion vector corresponding to each tile.
3. The method of claim 1, further comprising: adaptively filtering the compensation transform based on time-integrated deviations to prevent excessive over-deviation of a compensation window.
4. The method of claim 1, further comprising: identifying a large moving object in the scene and excluding the compensation transform corresponding to the large moving object.
5. An image processing circuit, comprising:
a receiver configured to receive frames of image data;
a memory configured to store a first group of motion vectors having a first motion characteristic and a second group of motion vectors having a second motion characteristic;
a transform selector configured to identify, from the first and second groups of motion vectors, a compensation transform representing the camera movement to be compensated; and
an adaptive filter configured to prevent excessive over-deviation of a compensation window based on nonlinearly normalized time-integrated deviations.
6. A camera, comprising:
an image sensor configured to capture images;
an image data circuit configured to convert the captured images into frames of image data;
an image processing circuit, comprising:
a receiver configured to receive the frames of image data;
a motion vector detector configured to detect the motion of objects and generate motion vectors;
a transform selector configured to identify, from transforms of the motion vectors, a compensation transform representing the camera movement to be compensated;
an adaptive filter configured to filter the over-deviation of a compensation window; and
a compensation circuit configured to adjust the captured images based on the compensation transform and the output of the adaptive filter,
wherein the motion vectors are grouped into motion vector groups having similar motion characteristics based on their vector magnitude ratios and angle differences.
7. The camera of claim 6, further comprising a grouping circuit configured to group the motion vectors into groups of at least two types, including feature point groups and tile groups, the tile groups comprising non-overlapping tiles divided from a video frame.
8. The camera of claim 6, wherein the transform selector is configured to identify the compensation transform based on a plurality of scoring functions selected from history, motion, feature, and extent scores.
9. The camera of claim 6, wherein the adaptive filter is configured to prevent excessive over-deviation of the compensation window based on nonlinearly normalized time-integrated deviations.
10. The camera of claim 6, wherein the adaptive filter is configured to maintain a deviation history by using recursive filters.
CN201110436828.8A 2010-12-23 2011-12-23 Image processing method and circuit, and camera Active CN102572277B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201061426970P 2010-12-23 2010-12-23
US201061426975P 2010-12-23 2010-12-23
US61/426,975 2010-12-23
US61/426,970 2010-12-23

Publications (2)

Publication Number Publication Date
CN102572277A CN102572277A (en) 2012-07-11
CN102572277B true CN102572277B (en) 2016-12-14

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7383642B2 (en) 2018-05-31 2023-11-20 アリババ グループ ホウルディング リミテッド Method and apparatus for removing video jitter

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1471691A (en) * 2001-06-20 2004-01-28 Sony Corp Image processing apparatus and method, and image pickup apparatus
CN1208970C (en) * 1999-09-13 2005-06-29 Sony Corp Image processing apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant