CN101742122B - Method and system for removing video jitter - Google Patents

Method and system for removing video jitter

Info

Publication number
CN101742122B
Authority
CN
China
Prior art keywords
current image
image frame
frame
sampling
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2009102427956A
Other languages
Chinese (zh)
Other versions
CN101742122A (en)
Inventor
黄磊
刘昌平
姚波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hanwang Technology Co Ltd
Original Assignee
Hanwang Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hanwang Technology Co Ltd filed Critical Hanwang Technology Co Ltd
Priority to CN2009102427956A priority Critical patent/CN101742122B/en
Publication of CN101742122A publication Critical patent/CN101742122A/en
Application granted granted Critical
Publication of CN101742122B publication Critical patent/CN101742122B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a method and a system for removing video jitter. The method comprises the following steps: 1. registering the current image frame of a video against its adjacent image frame; 2. accumulating the trajectory information of the jitter produced during registration, smoothing the resulting motion trajectory, and correcting the current image frame in the direction opposite to the motion trajectory; 3. filling the blank area produced at the edge of the current image frame with an image frame adjacent to it; and 4. advancing to the next image frame and returning to the first step until all image frames of the video have been processed. The method and system process frames quickly, are independent of image contrast, achieve relatively high registration precision, ensure that the filled image has good continuity and consistency at the edges, and reduce traces of artificial processing.

Description

Method and system for removing video jitter
Technical field
The invention belongs to the field of image processing and in particular relates to techniques for removing video jitter.
Background art
When a camera is mounted on a building or a pole, it sways in the wind; cameras mounted on vehicles (such as cars, aircraft, and ships), on heating and ventilation equipment, on air conditioners, or on PTZ pan-tilt heads are also subject to vibration and output unstable, jittery video. The jitter is especially severe when a high-magnification lens is used, seriously degrading the visual effect. For video output by such cameras, existing de-jittering methods often cannot process frames in real time, or they reduce the resolution of the processed video.
The core task in video de-jittering is registration between consecutive image frames, and the precision of this registration directly determines the quality of the de-jittering result. Methods commonly used for image registration include optical flow, shape-context-based methods, and corner detection and matching. Optical flow has high computational complexity and is difficult to run on video in real time. Shape-context-based methods are better suited to content matching and retrieval between images. Corner detection and matching methods mainly include SIFT detection and matching, the Harris corner detector, and the SUSAN corner detector. SIFT detection and matching is scale- and rotation-invariant and can register images accurately, but its computational complexity is high and real-time processing is difficult. The Harris and SUSAN corner detectors are relatively fast and can locate corners in the image accurately, but when image contrast is low or the image offset is large, registration accuracy drops and the continuity and consistency at the edges are poor.
Summary of the invention
The object of the present invention is to provide a method and a system for removing video jitter that smooth the jitter trajectory produced when registering the original image frames of a video and fill the blank areas that result.
This method comprises the following steps:
Step 1: register the current image frame of the video with respect to its adjacent image frame;
Step 2: accumulate the trajectory information of the jitter produced during registration, smooth the resulting motion trajectory, and correct the current image frame in the direction opposite to the motion trajectory;
Step 3: use an image frame adjacent to the current image frame to fill the blank area produced at the edge of the current image frame by the smoothing;
Step 4: advance to the next image frame and return to Step 1 until all image frames of the video have been processed.
Step 1 comprises the following sub-steps:
Step a: calculate the maximum down-sampling scale of the current image frame, and down-sample the current image frame and its adjacent image frame according to this maximum scale to generate down-sampled images;
Step b: compute the texture feature of each pixel of the down-sampled images;
Step c: compute the cost function between the down-sampled images of the two frames from the per-pixel texture features, thereby obtaining the motion direction of the down-sampled images;
Step d: correct the current image frame according to the motion direction, and obtain the trajectory information of the current image frame in the row and column directions;
Step e: reduce the down-sampling scale, down-sample the corrected image, and return to Step b until the down-sampling scale reaches 0.
In Step a, the maximum down-sampling scale τ is
τ = max{ i − γ | 2^i ≤ min{H, W} }
where i denotes the critical down-sampling scale: when the down-sampling scale exceeds i, the down-sampled image becomes smaller than 2 × 2 pixels. γ is a relative parameter, H denotes the image height, and W denotes the image width.
In Step b, the texture feature is represented by a three-dimensional vector of the image's gray value and its horizontal and vertical gradients, as follows:
f(x, y) = [ I(x, y), I′_x(x, y), I′_y(x, y) ]
where I(x, y) is the gray value of the image at pixel (x, y), and I′_x(x, y) and I′_y(x, y) are the horizontal and vertical gradients of the image at pixel (x, y).
In Step c, the cost function is
C(u, v) = (1 / (H × W)) · Σ_{x=1..H} Σ_{y=1..W} ( f_1(x, y) − f_2(x + u, y + v) )²
where u is the pixel offset in the horizontal direction, v is the pixel offset in the vertical direction, and both u and v take values in {−1, 0, 1}; C(u, v) is the cost of moving between the two frames in direction (u, v); H denotes the image height and W the image width; f_1(x, y) is the texture feature of the image at pixel (x, y), and f_2(x + u, y + v) is the texture feature of the adjacent image frame at pixel (x + u, y + v).
In Step c, the motion direction of the image is computed as
(u′, v′) = the (u, v) minimizing C(u, v) over u, v ∈ {−1, 0, 1}, if that minimum is at most φ; otherwise (u′, v′) = (0, 0),
where φ is a decision threshold and (u′, v′) is the motion direction between the two adjacent frames.
In Step e, the corrected image is down-sampled using a pyramid model.
In Step 2, when the trajectory information of the jitter produced during registration is accumulated, the number of pixels δ_0r that the current frame must move in the row direction and the number of pixels δ_0c that it must move in the column direction are first computed:
δ_0r = p_0r − (1 / (2a + 1)) · Σ_{i=−a..a} p_ir
δ_0c = p_0c − (1 / (2a + 1)) · Σ_{j=−a..a} p_jc
where p_ir is the accumulated row-direction displacement on the motion trajectory of the image i frames away from the current image frame, p_jc is the accumulated column-direction displacement on the motion trajectory of the image j frames away from the current image frame, a is the number of frames on each side of the current frame used to filter the jitter, p_0r is the accumulated row-direction displacement of the current image frame, p_0c is its accumulated column-direction displacement, and i and j index the neighbouring frames. The current image frame is then moved in the opposite direction by δ_0r and δ_0c to smooth the motion trajectory produced by the jitter.
In Step 3, a shared edge constraint value between the current image frame and an image frame adjacent to it is computed. When the shared edge constraint value is less than a set threshold, the corresponding region of the adjacent image frame is used to fill the blank area produced at the edge of the current image frame by the smoothing; otherwise the next frame after that adjacent image frame is tested against the threshold, until the blank area at the edge of the current image frame has been completely filled.
In Step 3, the shared edge constraint value S_k(x_0, y_0) is
S_k(x_0, y_0) = S_0k(x_0, y_0) + S_k0(x_0, y_0)
where
S_0k(x_0, y_0) = (1/m) · Σ_{(x,y)∈ω_1(x_0,y_0)} ( I(x, y) − Ī(x_0^k, y_0^k) )²
is the shared edge constraint of the current image frame with respect to the adjacent k-th frame at (x_0, y_0), and
S_k0(x_0, y_0) = (1/m) · Σ_{(x,y)∈ω_2(x_0,y_0)} ( I_k(x, y) − Ī(x_0^k, y_0^k) )²
is the shared edge constraint of the adjacent k-th frame with respect to the current image frame at (x_0, y_0); (x_0, y_0) is the coordinate of a pixel on the edge shared by the current image frame and the adjacent k-th frame; (x, y) is a pixel coordinate in the display window; m is the number of pixels in the set neighbourhood centred at (x_0, y_0); ω_1(x_0, y_0) is the part of that neighbourhood covered by the current image frame and ω_2(x_0, y_0) the part covered by the adjacent k-th frame; I(x, y) is the brightness of the current image frame at (x, y), I_k(x, y) is the brightness of the adjacent k-th frame at (x, y), and
Ī(x_0^k, y_0^k) = (1/m) · ( Σ_{(x,y)∈ω_1(x_0,y_0)} I(x, y) + Σ_{(x,y)∈ω_2(x_0,y_0)} I_k(x, y) )
is the mean brightness of the current image frame and the adjacent k-th frame over the neighbourhood centred at (x_0, y_0).
The present invention also provides a system for removing video jitter, comprising:
a registration device for registering the current image frame of the video with respect to its adjacent image frame;
a correction device for accumulating the trajectory information of the jitter produced by registering the current image frame, smoothing the resulting motion trajectory, and correcting the current image frame in the direction opposite to the motion trajectory; and
a filling device for using an image frame adjacent to the current image frame to fill the blank area produced at the edge of the current image frame by the smoothing.
The registration device comprises:
a down-sampling unit for calculating the maximum down-sampling scale of the current image frame, down-sampling the current image frame and its adjacent image frame at that scale to generate down-sampled images, and then reducing the down-sampling scale and down-sampling the image corrected by the motion-direction correction unit until the down-sampling scale reaches 0;
a computing unit for computing the texture feature of each pixel of the down-sampled images;
a motion-direction determination unit for computing, from the per-pixel texture features, the cost function between the down-sampled images of the two frames, thereby obtaining the motion direction of the down-sampled images; and
a motion-direction correction unit for correcting the current image frame according to the motion direction and obtaining the trajectory information of the current image frame in the row and column directions.
When the correction device accumulates the trajectory information of the jitter produced during registration, it first computes the number of pixels δ_0r that the current frame must move in the row direction and the number of pixels δ_0c that it must move in the column direction:
δ_0r = p_0r − (1 / (2a + 1)) · Σ_{i=−a..a} p_ir
δ_0c = p_0c − (1 / (2a + 1)) · Σ_{j=−a..a} p_jc
where p_ir is the accumulated row-direction displacement on the motion trajectory of the image i frames away from the current image frame, p_jc is the accumulated column-direction displacement on the motion trajectory of the image j frames away from the current image frame, a is the number of frames on each side of the current frame used to filter the jitter, p_0r is the accumulated row-direction displacement of the current image frame, p_0c is its accumulated column-direction displacement, and i and j index the neighbouring frames. The current image frame is then moved in the opposite direction by δ_0r and δ_0c to smooth the motion trajectory produced by the jitter.
Compared with the prior art, the method and system for removing video jitter of the present invention register image frames at multiple sampling scales using texture features. Processing is fast, depends only on the image size and not on its contrast, uses a cost function to determine the motion direction of the image, and corrects the image along that direction, which keeps registration accurate even when the image contrast is low or the image offset is large. Adjacent image frames are used to fill the current frame, so the filled frame has good continuity and consistency at its edges and traces of artificial processing are reduced; the output video has the same resolution as the original video and preserves its sharpness and continuity.
Description of drawings
Fig. 1 is a flow chart of the method for removing video jitter of the present invention;
Fig. 2 is a flow chart of the image-frame registration in the method for removing video jitter of the present invention;
Fig. 3 is a schematic diagram of the pyramid model used in the method for removing video jitter of the present invention;
Fig. 4 is a schematic block diagram of the system for removing video jitter of the present invention;
Fig. 5a shows the motion trajectory of a video in the row direction and the trajectories after smoothing to various degrees;
Fig. 5b shows the motion trajectory of a video in the column direction and the trajectories after smoothing to various degrees;
Fig. 6 is a schematic diagram of the shared edge constraint between the current frame and an image frame adjacent to it;
Fig. 7a shows the registration accuracy of the different methods for images with different offset ratios;
Fig. 7b shows the registration accuracy of the different methods for images with different contrasts;
Fig. 7c shows the influence of image contrast on the number of corners detected;
Fig. 7d shows the time required by the three methods to register images with different contrasts.
Embodiment
The embodiments of the invention are described in detail below with reference to the accompanying drawings.
In this embodiment, 8 five-minute jittery videos of different scenes and 30 personally recorded jittery videos lasting from 3 to 54 seconds are used. The image frames are 320 × 240 pixels, and the computer is configured with a Pentium(R) D 3 GHz CPU and 1 GB of memory. A total of 100 image frames with different contrasts and sharpness were extracted from videos of different scenes for illustration. Each frame was given a random offset of varying degree, with the offset ratio ranging from 0 up to 1/2 of the image width or height, in both the horizontal and the vertical direction.
Fig. 1 is a flow chart of the method for removing video jitter of the present invention:
Step 1: register the current image frame of the video with respect to its adjacent image frame.
This step is shown in detail in Fig. 2:
Step a: calculate the maximum down-sampling scale of the current image frame and down-sample the original image according to it, generating down-sampled images. The maximum down-sampling scale τ is
τ = max{ i − γ | 2^i ≤ min{H, W} }
where i denotes the critical down-sampling scale: when the down-sampling scale exceeds i, the down-sampled image becomes smaller than 2 × 2 pixels. γ is a relative parameter introduced to guarantee registration accuracy: τ is adjusted through the value of γ so that the down-sampled image is no smaller than 2 × 2 pixels, which preserves the precision of the registration. The larger γ is, the higher the registration precision, but the fewer pixels are correctly registered; in this embodiment γ is set to 2. H denotes the image height and W the image width. The larger the image, the higher the precision of the global texture registration. The value of γ affects both the registration accuracy and the range of image offsets that can be registered. As shown in Fig. 3, with γ = 2 the down-sampled image in this embodiment is at least 3 × 3 pixels, which provides enough precision for registration while allowing the registration range to reach half of the image width or height, satisfying the registration requirements of severely jittery video.
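For illustration only, a minimal Python sketch of this computation (the function name is ours, not part of the patent) is:

```python
# Illustrative sketch, not part of the patent: maximum down-sampling scale
# tau = max{ i - gamma | 2**i <= min(H, W) } for a frame of height H and width W.
import math

def max_downsampling_scale(height, width, gamma=2):
    """The critical scale i is the largest integer with 2**i <= min(H, W);
    gamma (2 in this embodiment) trades registration range against accuracy."""
    i = int(math.floor(math.log2(min(height, width))))
    return max(i - gamma, 0)

# For the 320 x 240 frames of this embodiment: i = 7, so tau = 5.
print(max_downsampling_scale(240, 320))  # -> 5
```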
Step b: compute the texture feature of each pixel of the down-sampled images.
The texture feature is represented by a three-dimensional vector of the image's gray value and its horizontal and vertical gradients, as follows:
f(x, y) = [ I(x, y), I′_x(x, y), I′_y(x, y) ]
where I(x, y) is the gray value of the image at pixel (x, y), and I′_x(x, y) and I′_y(x, y) are the horizontal and vertical gradients of the image at pixel (x, y).
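A minimal sketch of this feature computation, assuming a grayscale NumPy image and central-difference gradients (the function name is ours), could be:

```python
# Illustrative sketch: per-pixel texture feature f(x, y) = [I, I'_x, I'_y].
import numpy as np

def texture_features(gray):
    """Return an H x W x 3 array of (gray value, horizontal gradient, vertical gradient)."""
    gray = np.asarray(gray, dtype=np.float32)
    grad_y, grad_x = np.gradient(gray)  # gradients along rows (vertical) and columns (horizontal)
    return np.stack([gray, grad_x, grad_y], axis=-1)
```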
Step c: compute the cost function between the down-sampled images of the two frames from the per-pixel texture features, thereby obtaining the motion direction of the down-sampled images.
The cost function is
C(u, v) = (1 / (H × W)) · Σ_{x=1..H} Σ_{y=1..W} ( f_1(x, y) − f_2(x + u, y + v) )²
where u is the pixel offset in the horizontal direction, v is the pixel offset in the vertical direction, and both u and v take values in {−1, 0, 1}; C(u, v) is the cost of moving between the two frames in direction (u, v); H denotes the image height and W the image width; f_1(x, y) is the texture feature of the image at pixel (x, y), and f_2(x + u, y + v) is the texture feature of the adjacent image frame at pixel (x + u, y + v).
The motion direction of the image is computed as
(u′, v′) = the (u, v) minimizing C(u, v) over u, v ∈ {−1, 0, 1}, if that minimum is at most φ; otherwise (u′, v′) = (0, 0),
where φ is a decision threshold and (u′, v′) is the motion direction between the two adjacent frames. A minimum cost greater than the threshold φ means that the two frames show entirely different scenes; otherwise the motion direction (u′, v′) of the two adjacent frames is obtained.
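A sketch of this search over the nine candidate offsets, evaluated on the overlapping region of the two feature maps (the axis convention and function names are ours, and the mean is taken over the overlap rather than over all H × W pixels), might look like:

```python
# Illustrative sketch: evaluate C(u, v) for u, v in {-1, 0, 1} and pick the
# motion direction; phi is the scene-change decision threshold.
import numpy as np

def motion_direction(feat_cur, feat_adj, phi):
    """feat_cur, feat_adj: H x W x 3 texture-feature arrays of the two frames."""
    H, W, _ = feat_cur.shape
    best_cost, best_uv = None, (0, 0)
    for u in (-1, 0, 1):
        for v in (-1, 0, 1):
            # compare f1 at (x, y) with f2 at (x + u, y + v) where both are defined
            a = feat_cur[max(0, -u):H - max(0, u), max(0, -v):W - max(0, v)]
            b = feat_adj[max(0, u):H - max(0, -u), max(0, v):W - max(0, -v)]
            cost = np.mean((a - b) ** 2)
            if best_cost is None or cost < best_cost:
                best_cost, best_uv = cost, (u, v)
    return best_uv if best_cost <= phi else (0, 0)
```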
Step d: correct the original image according to the motion direction, and obtain the trajectory information of the original image in the row and column directions.
The maximum offset ratio is divided into 10 groups ranging from 1/10 to 1/2; the specific offset ratios are listed in Table 1. For example, any image in the 1/2 group is offset by between 0 and 1/2 of its width or height. Each image frame is tested with all 10 maximum offset ratios, 100 randomly offset images are generated per group, and the offset direction and number of offset pixels are recorded to obtain the trajectory information of the original image in the row and column directions.
Table 1. Registration accuracy at different offset scales
Offset ratio:          1/10   1/8    1/6    1/5    1/4.5  1/4    1/3.5  1/3    1/2.5  1/2
Registration accuracy: 0.991  0.989  0.982  0.979  0.972  0.955  0.916  0.836  0.735  0.631
Step e: reduce the down-sampling scale, down-sample the corrected image to generate new down-sampled images, and return to Step b; processing ends when the down-sampling scale reaches 0.
In this embodiment, the corrected image is down-sampled using a pyramid model. As shown in Fig. 3, the bottom layer is the original image; one 2 × 2 down-sampling produces the middle-layer image, each of whose pixels is the average of a 2 × 2 block of pixels of the original image. Returning to Step b, the texture feature of each pixel is computed, the cost function between the down-sampled images of the two frames is evaluated from those features, and the resulting motion direction of the down-sampled images is (1, 1), meaning that the middle-layer image is offset by 1 pixel horizontally and 1 pixel vertically. The original image is corrected according to this registration result, giving trajectory information of (2, 2) for the original image in the row and column directions.
The down-sampling scale is then changed again: down-sampling the middle-layer image by 2 × 2 (equivalent to 4 × 4 down-sampling of the original image) produces the top-layer image, each of whose pixels is the average of a 2 × 2 block of pixels of the middle-layer image. Returning to Step b, the texture features are computed, the cost function between the down-sampled images of the two frames is evaluated, and the motion direction of the down-sampled images is again (1, 1), meaning that the top-layer image is offset by 1 pixel horizontally and 1 pixel vertically. The middle-layer image is corrected according to this registration result, giving trajectory information of (2, 2) for the middle-layer image and hence (4, 4) for the original image in the row and column directions. At this point the down-sampling scale is 0 and processing ends.
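A minimal sketch of the 2 × 2 averaging that builds each pyramid level (the function name is ours; odd-sized images are cropped here, an assumption not stated in the patent) could be:

```python
# Illustrative sketch: one pyramid level is built by replacing every 2 x 2
# block of pixels with its mean, as described for Fig. 3.
import numpy as np

def downsample_2x2(img):
    """Return an image half the size in each dimension; trailing odd rows/columns are cropped."""
    img = np.asarray(img, dtype=np.float32)
    H, W = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:H, :W]
    return img.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

# Applied twice to a 320 x 240 frame this yields the middle (160 x 120)
# and top (80 x 60) pyramid layers.
```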
Step 2: accumulate the trajectory information of the jitter produced during registration, smooth the resulting motion trajectory, and correct the image in the direction opposite to the motion trajectory.
When the trajectory information of the jitter produced during registration is accumulated, 100 image frames in total are processed in this embodiment; curve 501 in Fig. 5a is the motion trajectory of the video frames in the row direction, and curve 501 in Fig. 5b is the motion trajectory of the video frames in the column direction. First compute the number of pixels δ_0r that the current image frame must move in the row direction and the number of pixels δ_0c that it must move in the column direction:
δ_0r = p_0r − (1 / (2a + 1)) · Σ_{i=−a..a} p_ir
δ_0c = p_0c − (1 / (2a + 1)) · Σ_{j=−a..a} p_jc
where, as shown by curve 501 in Fig. 5a, p_ir is the accumulated row-direction displacement on the motion trajectory of the image i frames away from the current image frame, p_jc is the accumulated column-direction displacement on the motion trajectory of the image j frames away from the current image frame, a is the number of frames on each side of the current frame used to filter the jitter (for example 5, 10 or 20 frames in Fig. 5a and Fig. 5b), p_0r is the accumulated row-direction displacement of the current image frame, p_0c is its accumulated column-direction displacement, and i and j index the neighbouring frames. The current image frame is then moved in the opposite direction by δ_0r and δ_0c to smooth the motion trajectory produced by the jitter.
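A sketch of this per-frame correction as a moving-average filter over the accumulated trajectory (the function name and the edge padding are our assumptions) could be:

```python
# Illustrative sketch: delta_i = p_i - mean(p_{i-a} .. p_{i+a}) for every frame,
# computed separately for the row-direction and column-direction trajectories.
import numpy as np

def smoothing_offsets(traj, a):
    """traj: 1-D sequence of accumulated displacements; returns the per-frame corrections."""
    traj = np.asarray(traj, dtype=np.float32)
    padded = np.pad(traj, a, mode='edge')              # boundary handling is our assumption
    kernel = np.ones(2 * a + 1, dtype=np.float32) / (2 * a + 1)
    smoothed = np.convolve(padded, kernel, mode='valid')
    return traj - smoothed

# Each frame is then shifted by -delta in the corresponding direction.
```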
Multi-scale global texture registration detects the jitter between video frames, and accumulating the jitter yields the motion information of the camera. The motion trajectories in the row and column directions contain large high-frequency components, i.e. violent jitter. The low-frequency components of the trajectory are the intentional motion of the camera and should be preserved. De-jittering the video therefore amounts to filtering the high-frequency components out of the motion trajectory. Once the motion information of the video has been detected, the row- and column-direction trajectories can be used to filter out the jitter. In the present invention, smoothing the motion trajectory achieves this filtering, with good results.
The jitter filtering in this embodiment is similar to low-pass filtering the motion trajectory. Curve 502 in Fig. 5a is the result of smoothing the trajectory over a width of 11 frames (about 5 frames on each side); smoothing over a width of 21 frames gives the trajectory shown as curve 503 in Fig. 5a, and smoothing over a width of 41 frames gives curve 504 in Fig. 5a. Compared with the original trajectory 501, the trajectory of the de-jittered video is smoother: the high-frequency jitter has been filtered out while the low-frequency trend of curve 501, which reflects the intentional motion of the camera, is preserved. Similarly, the different curves in Fig. 5b show the column-direction trajectory of the video after smoothing to various degrees.
After motion detection and jitter filtering, the video is essentially stable. However, shifting the images during correction produces blank areas around the periphery of the video. Earlier de-jittering algorithms often dealt with this by reducing the resolution of the output video. The present invention instead fills the blank areas with adjacent image frames selected by the shared edge constraint, so that the output video has the same resolution as the input, the continuity and consistency at the frame edges are guaranteed, and the contrast of the video frames is preserved.
Step 3: use an image frame adjacent to the current image frame to fill the blank area produced at the edge of the current image frame by the smoothing.
Image registration inevitably contains some error, so directly filling the blank area of the current frame with an adjacent image frame easily leaves visible traces of artificial processing. The present invention therefore uses the shared edge constraint to verify the registration result. As shown in Fig. 6, the solid box 601 is the display area and the dashed box 602 is the image frame after translation. After motion detection and filtering, the current image frame must be translated towards the upper left to eliminate the jitter, which leaves blank areas on the lower and right sides of the display window. The image frame adjacent to the current frame (dashed box 603) must be translated towards the lower right to cover part of the blank area in the display window. The heavy black line 604 in Fig. 6 is the edge shared by the two frames; the variance within a set neighbourhood is used to describe how smooth the shared edge is. The neighbourhood, shown as the grey box 605 in Fig. 6, is centred in turn on each point of the shared edge, and the average of the brightness variances of the pixels within the neighbourhood is taken as the shared edge constraint value.
The shared edge constraint value S_k(x_0, y_0) is
S_k(x_0, y_0) = S_0k(x_0, y_0) + S_k0(x_0, y_0)
where
S_0k(x_0, y_0) = (1/m) · Σ_{(x,y)∈ω_1(x_0,y_0)} ( I(x, y) − Ī(x_0^k, y_0^k) )²
is the shared edge constraint of the current image frame with respect to the adjacent k-th frame at (x_0, y_0), and
S_k0(x_0, y_0) = (1/m) · Σ_{(x,y)∈ω_2(x_0,y_0)} ( I_k(x, y) − Ī(x_0^k, y_0^k) )²
is the shared edge constraint of the adjacent k-th frame with respect to the current image frame at (x_0, y_0); (x_0, y_0) is the coordinate of a pixel on the edge shared by the current image frame and the adjacent k-th frame; (x, y) is a pixel coordinate in the display window; m is the number of pixels in the set neighbourhood centred at (x_0, y_0); ω_1(x_0, y_0) is the part of that neighbourhood covered by the current image frame and ω_2(x_0, y_0) the part covered by the adjacent k-th frame; I(x, y) is the brightness of the current image frame at (x, y), I_k(x, y) is the brightness of the adjacent k-th frame at (x, y), and
Ī(x_0^k, y_0^k) = (1/m) · ( Σ_{(x,y)∈ω_1(x_0,y_0)} I(x, y) + Σ_{(x,y)∈ω_2(x_0,y_0)} I_k(x, y) )
is the mean brightness of the current image frame and the adjacent k-th frame over the neighbourhood centred at (x_0, y_0).
The set neighbourhood is a square region centred at the coordinate (x_0, y_0) of a pixel on the shared edge, with a side length of between 1% and 3% of the display-area width.
The smaller the shared edge constraint value S_k(x_0, y_0), the smoother the two frames are across the shared edge, i.e. the more accurate the image registration, and the higher the confidence in filling the current image frame with that adjacent frame. In this embodiment the local window is 5 × 5 pixels.
Once the shared edge constraint of the two frames has been obtained, the precision of the registration can be judged from the constraint value along the edge shared by that frame and the current image frame. When the shared edge constraint value is less than the set threshold, the corresponding region of the adjacent image frame is used to fill the blank area produced at the edge of the current frame by the smoothing; otherwise the blank area is filled using the next frame after that adjacent frame. That is, when the shared edge constraint is less than the set threshold ρ (set to 100 in this embodiment), the adjacent frame can be used to fill the blank area of the current image frame; otherwise that adjacent frame cannot be used. λ_k = 1 indicates that the adjacent k-th frame can be used for the blank-area filling.
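A sketch of this check for a single point on the shared edge, using the 5 × 5 window and threshold ρ = 100 of this embodiment (the (row, column) convention, function name and mask argument are our assumptions), could be:

```python
# Illustrative sketch: shared edge constraint S_k at one shared-edge point.
# cur_mask is True where the window pixel is covered by the current frame
# (omega_1) and False where it is covered by the adjacent frame (omega_2).
import numpy as np

def shared_edge_constraint(cur, adj, cur_mask, r0, c0, half=2):
    """cur, adj: brightness images; (r0, c0): a pixel on the shared edge."""
    sl = (slice(r0 - half, r0 + half + 1), slice(c0 - half, c0 + half + 1))
    win_cur = cur[sl].astype(np.float64)
    win_adj = adj[sl].astype(np.float64)
    mask = cur_mask[sl]
    m = win_cur.size
    mean = (win_cur[mask].sum() + win_adj[~mask].sum()) / m   # mean over omega_1 and omega_2
    s_0k = ((win_cur[mask] - mean) ** 2).sum() / m            # current frame vs. shared mean
    s_k0 = ((win_adj[~mask] - mean) ** 2).sum() / m           # adjacent frame vs. shared mean
    return s_0k + s_k0

# The adjacent frame k may be used for filling when this value is below rho = 100.
```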
When several adjacent image frames can all be used to fill the blank area of the current image frame, the frame closer to the current image frame is chosen preferentially. The closer two frames are in the video, the smaller the rotation and scaling between them, the better they satisfy the assumptions of the fast image registration method proposed by the present invention, the higher the precision of the image registration, and the fainter the traces of artificial processing left after the blank area is filled.
Step 4: advance to the next frame and return to Step 1; processing finishes once all image frames of the video have been processed.
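Pulling the steps together, one plausible sketch of the per-frame loop (register_pair and fill_blank_area are hypothetical helpers standing in for Steps 1 and 3 above, smoothing_offsets is the moving-average sketch from Step 2, and none of this code appears in the patent) is:

```python
# Illustrative sketch only: overall de-jittering loop over a list of grayscale frames.
import numpy as np

def dejitter(frames, a=10, phi=50.0):
    trajectory = [(0, 0)]                           # accumulated (row, col) displacement per frame
    for prev, cur in zip(frames, frames[1:]):       # Step 1: register each frame to its neighbour
        du, dv = register_pair(cur, prev, phi)      # hypothetical multi-scale registration helper
        trajectory.append((trajectory[-1][0] + du, trajectory[-1][1] + dv))

    rows = smoothing_offsets([p[0] for p in trajectory], a)   # Step 2: smooth both trajectories
    cols = smoothing_offsets([p[1] for p in trajectory], a)

    out = []
    for k, frame in enumerate(frames):
        # Shift opposite to the jitter; np.roll stands in for a shift that would
        # really leave blank edges for Step 3 to fill.
        shifted = np.roll(frame, (-int(round(rows[k])), -int(round(cols[k]))), axis=(0, 1))
        out.append(fill_blank_area(shifted, frames, k))        # Step 3: fill edges from neighbours
    return out                                                  # Step 4: the loop covers every frame
```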
Referring to Fig. 4, a system 400 for removing video jitter is disclosed, comprising a registration device 401, a correction device 402 and a filling device 403. The registration device 401 registers the current image frame of the video with respect to its adjacent image frame; the correction device 402 accumulates the trajectory information of the jitter produced when registering the current image frame, smooths the resulting motion trajectory, and corrects the current image frame in the direction opposite to the motion trajectory; the filling device 403 uses an image frame adjacent to the current image frame to fill the blank area produced at the edge of the current image frame by the smoothing. The registration device 401 further comprises: a down-sampling unit 411, for calculating the maximum down-sampling scale of the current image frame, down-sampling the original image at that scale to generate down-sampled images, and then reducing the down-sampling scale and down-sampling the motion-corrected image until the down-sampling scale reaches 0; a computing unit 412, for computing the texture feature of each pixel of the down-sampled images; a motion-direction determination unit 413, for computing, from the per-pixel texture features, the cost function between the down-sampled images of the two frames and thereby obtaining their motion direction; and a motion-direction correction unit 414, for correcting the original image according to the motion direction and obtaining its trajectory information in the row and column directions. The method for removing video jitter described above can be implemented with this system.
The registration procedure of this method was compared with Harris and SUSAN corner detection and matching in terms of registration accuracy as a function of offset ratio, registration accuracy as a function of image contrast, and registration speed as a function of image contrast. Fig. 7a shows the registration accuracy of the different methods for images with different offset ratios. In the figure, the registration procedure of this method is labelled MSGTR (Multi-Scale Global Texture Registration), since it collects texture features at multiple down-sampling scales and registers with them. An abscissa value of 10 means that the offsets in the row and column directions range from 0 to 1/10 of the image width and height. The figure shows that when the image offset is large, the registration accuracy of this method remains relatively high, at 63.1%; when the offset is small, its accuracy is comparable to that of SUSAN corner detection and matching and somewhat lower than that of Harris corner detection and matching. The detailed registration accuracies of the methods are given in Table 2.
Table 2. Registration accuracy of the three methods at different offset scales
Fig. 7b shows the registration accuracy of the different methods for images with different contrasts. When image contrast is low, the precision of Harris and SUSAN corner detection decreases, which in turn degrades the precision of the image registration. As the figure shows, the registration procedure of this method (MSGTR) is relatively little affected by contrast: at the lowest contrast its registration accuracy is 82%, whereas Harris corner detection and matching is affected the most, with a registration accuracy of only 69%. At higher contrasts, all three methods register accurately.
Fig. 7c shows the influence of image contrast on the number of corners detected. As the figure shows, Harris and SUSAN detect more corners as the image contrast increases, and at the same contrast the SUSAN detector finds more corners than Harris. This is why, for images of the same contrast, SUSAN corner detection and matching is slightly more precise, as shown in Fig. 5a and Fig. 5b.
Fig. 7d shows how the time the three methods need to register images changes with image contrast. As the figure shows, higher contrast makes the Harris and SUSAN algorithms detect more corners and therefore consume more time, whereas the registration procedure of this method (MSGTR) is little affected by image contrast and is clearly faster than the other two methods.
Compared with Harris and SUSAN corner detection and matching, the registration speed of this method's registration procedure is independent of image contrast and depends only on image size. For a 320 × 240 image, the registration procedure of this method takes only 16 ms. Filling the blank area produced at the edge of the current frame with an adjacent frame chosen by the shared edge constraint value preserves the continuity and consistency of the edges, gives the reconstructed image the same resolution as the original input video, produces smooth transitions at the edges, keeps the contrast of the filled area high, and effectively removes the jitter from the video.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to cover them as well.

Claims (9)

1. A method for removing video jitter, characterized by comprising:
Step 1: registering the current image frame of the video with respect to its adjacent image frame;
Step 2: accumulating the trajectory information of the jitter produced by registering the current image frame, smoothing the resulting motion trajectory, and correcting the current image frame in the direction opposite to the motion trajectory;
Step 3: using an image frame adjacent to the current image frame to fill the blank area produced at the edge of the current image frame by the smoothing;
Step 4: advancing to the next image frame and returning to Step 1 until all image frames of the video have been processed;
wherein Step 1 further comprises:
Step a: calculating the maximum down-sampling scale of the current image frame, and down-sampling the current image frame and its adjacent image frame according to this maximum scale to generate down-sampled images;
Step b: computing the texture feature of each pixel of the down-sampled images;
Step c: computing the cost function between the down-sampled images of the two frames from the per-pixel texture features, thereby obtaining the motion direction of the down-sampled images;
Step d: correcting the current image frame according to the motion direction, and obtaining the trajectory information of the current image frame in the row and column directions;
Step e: reducing the down-sampling scale, down-sampling the corrected image, and returning to Step b until the down-sampling scale reaches 0;
and wherein in Step 2, when the trajectory information of the jitter produced during registration is accumulated, the number of pixels δ_0r that the current image frame must move in the row direction and the number of pixels δ_0c that it must move in the column direction are first computed:
δ_0r = p_0r − (1 / (2a + 1)) · Σ_{i=−a..a} p_ir
δ_0c = p_0c − (1 / (2a + 1)) · Σ_{j=−a..a} p_jc
where p_ir is the accumulated row-direction displacement on the motion trajectory of the image i frames away from the current image frame, p_jc is the accumulated column-direction displacement on the motion trajectory of the image j frames away from the current image frame, a is the number of frames on each side of the current frame used to filter the jitter, p_0r is the accumulated row-direction displacement of the current image frame, p_0c is its accumulated column-direction displacement, and i and j index the neighbouring frames; the current image frame is then moved in the opposite direction by δ_0r and δ_0c to smooth the motion trajectory produced by the jitter.
2. The method according to claim 1, characterized in that in Step a the maximum down-sampling scale τ is
τ = max{ i − γ | 2^i ≤ min{H, W} }
where i denotes the critical down-sampling scale (when the down-sampling scale exceeds i, the down-sampled image becomes smaller than 2 × 2 pixels), γ is a relative parameter, H denotes the image height and W the image width.
3. The method according to claim 1, characterized in that in Step b the texture feature is represented by a three-dimensional vector of the image's gray value and its horizontal and vertical gradients, as follows:
f(x, y) = [ I(x, y), I′_x(x, y), I′_y(x, y) ]
where I(x, y) is the gray value of the image at pixel (x, y), and I′_x(x, y) and I′_y(x, y) are the horizontal and vertical gradients of the image at pixel (x, y).
4. The method according to claim 1, characterized in that in Step c the cost function is
C(u, v) = (1 / (H × W)) · Σ_{x=1..H} Σ_{y=1..W} ( f_1(x, y) − f_2(x + u, y + v) )²
where u is the pixel offset in the horizontal direction, v is the pixel offset in the vertical direction, and both u and v take values in {−1, 0, 1}; C(u, v) is the cost of moving between the two frames in direction (u, v); H denotes the image height and W the image width; f_1(x, y) is the texture feature of the image at pixel (x, y), and f_2(x + u, y + v) is the texture feature of the adjacent image frame at pixel (x + u, y + v).
5. The method according to claim 4, characterized in that in Step c the motion direction of the image is computed as
(u′, v′) = the (u, v) minimizing C(u, v) over u, v ∈ {−1, 0, 1}, if that minimum is at most φ; otherwise (u′, v′) = (0, 0),
where φ is a decision threshold and (u′, v′) is the motion direction between the two adjacent frames.
6. The method according to claim 1, characterized in that in Step e the corrected image is down-sampled using a pyramid model.
7. The method according to claim 1, characterized in that in Step 3 a shared edge constraint value between the current image frame and an image frame adjacent to it is computed; when the shared edge constraint value is less than a set threshold, the corresponding region of the adjacent image frame is used to fill the blank area produced at the edge of the current image frame by the smoothing; otherwise the next frame after that adjacent image frame is tested against the threshold, until the blank area at the edge of the current image frame has been completely filled.
8. The method according to claim 7, characterized in that in Step 3 the shared edge constraint value S_k(x_0, y_0) is
S_k(x_0, y_0) = S_0k(x_0, y_0) + S_k0(x_0, y_0)
where
S_0k(x_0, y_0) = (1/m) · Σ_{(x,y)∈ω_1(x_0,y_0)} ( I(x, y) − Ī(x_0^k, y_0^k) )²
is the shared edge constraint of the current image frame with respect to the adjacent k-th frame at (x_0, y_0), and
S_k0(x_0, y_0) = (1/m) · Σ_{(x,y)∈ω_2(x_0,y_0)} ( I_k(x, y) − Ī(x_0^k, y_0^k) )²
is the shared edge constraint of the adjacent k-th frame with respect to the current image frame at (x_0, y_0); (x_0, y_0) is the coordinate of a pixel on the edge shared by the current image frame and the adjacent k-th frame; (x, y) is a pixel coordinate in the display window; m is the number of pixels in the set neighbourhood centred at (x_0, y_0); ω_1(x_0, y_0) is the part of that neighbourhood covered by the current image frame and ω_2(x_0, y_0) the part covered by the adjacent k-th frame; I(x, y) is the brightness of the current image frame at (x, y), I_k(x, y) is the brightness of the adjacent k-th frame at (x, y), and
Ī(x_0^k, y_0^k) = (1/m) · ( Σ_{(x,y)∈ω_1(x_0,y_0)} I(x, y) + Σ_{(x,y)∈ω_2(x_0,y_0)} I_k(x, y) )
is the mean brightness of the current image frame and the adjacent k-th frame over the neighbourhood centred at (x_0, y_0).
9. A system for removing video jitter, characterized by comprising:
a registration device for registering the current image frame of the video with respect to its adjacent image frame;
a correction device for accumulating the trajectory information of the jitter produced by registering the current image frame, smoothing the resulting motion trajectory, and correcting the current image frame in the direction opposite to the motion trajectory; and
a filling device for using an image frame adjacent to the current image frame to fill the blank area produced at the edge of the current image frame by the smoothing,
wherein the registration device further comprises:
a down-sampling unit for calculating the maximum down-sampling scale of the current image frame, down-sampling the current image frame and its adjacent image frame at that scale to generate down-sampled images, and then reducing the down-sampling scale and down-sampling the image corrected by the motion-direction correction unit to generate down-sampled images until the down-sampling scale reaches 0;
a computing unit for computing the texture feature of each pixel of the down-sampled images;
a motion-direction determination unit for computing, from the per-pixel texture features, the cost function between the down-sampled images of the two frames, thereby obtaining the motion direction of the down-sampled images of the current image frame; and
a motion-direction correction unit for correcting the current image frame according to the motion direction, and obtaining the trajectory information of the current image frame in the row and column directions,
and wherein, when the correction device accumulates the trajectory information of the jitter produced during registration, it first computes the number of pixels δ_0r that the current image frame must move in the row direction and the number of pixels δ_0c that it must move in the column direction:
δ_0r = p_0r − (1 / (2a + 1)) · Σ_{i=−a..a} p_ir
δ_0c = p_0c − (1 / (2a + 1)) · Σ_{j=−a..a} p_jc
where p_ir is the accumulated row-direction displacement on the motion trajectory of the image i frames away from the current image frame, p_jc is the accumulated column-direction displacement on the motion trajectory of the image j frames away from the current image frame, a is the number of frames on each side of the current frame used to filter the jitter, p_0r is the accumulated row-direction displacement of the current image frame, p_0c is its accumulated column-direction displacement, and i and j index the neighbouring frames; the current image frame is then moved in the opposite direction by δ_0r and δ_0c to smooth the motion trajectory produced by the jitter.
CN2009102427956A 2009-12-21 2009-12-21 Method and system for removing video jitter Active CN101742122B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009102427956A CN101742122B (en) 2009-12-21 2009-12-21 Method and system for removing video jitter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009102427956A CN101742122B (en) 2009-12-21 2009-12-21 Method and system for removing video jitter

Publications (2)

Publication Number Publication Date
CN101742122A CN101742122A (en) 2010-06-16
CN101742122B true CN101742122B (en) 2012-06-06

Family

ID=42464932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009102427956A Active CN101742122B (en) 2009-12-21 2009-12-21 Method and system for removing video jitter

Country Status (1)

Country Link
CN (1) CN101742122B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102143376B (en) * 2010-11-29 2013-05-22 北大方正集团有限公司 Method and device for detecting consistency of twin-channel video signals
CN102499634B (en) * 2011-10-26 2014-01-08 中国科学院光电技术研究所 Living human eye retina dynamic imaging device with image stabilizing function and method
CN103051903A (en) * 2012-12-24 2013-04-17 四川九洲电器集团有限责任公司 Space adaptive H.264 video I frame error concealment method
CN104349039B (en) * 2013-07-31 2017-10-24 展讯通信(上海)有限公司 Video anti-fluttering method and device
CN103679749B (en) * 2013-11-22 2018-04-10 北京奇虎科技有限公司 A kind of image processing method and device based on motion target tracking
CN104796580B (en) * 2014-01-16 2018-07-31 北京亿羽舜海科技有限公司 A kind of real-time steady picture video routing inspection system integrated based on selection
EP3195590A4 (en) * 2014-09-19 2018-04-25 Intel Corporation Trajectory planning for video stabilization
CN104751488B (en) * 2015-04-08 2017-02-15 努比亚技术有限公司 Photographing method for moving track of moving object and terminal equipment
CN106550174B (en) * 2016-10-28 2019-04-09 大连理工大学 A kind of real time video image stabilization based on homography matrix
CN107370941B (en) * 2017-06-29 2020-06-23 联想(北京)有限公司 Information processing method and electronic equipment
US10587807B2 (en) * 2018-05-18 2020-03-10 Gopro, Inc. Systems and methods for stabilizing videos
CN110390688A (en) * 2019-07-23 2019-10-29 中国人民解放军国防科技大学 Steady video SAR image sequence registration method
CN111263069B (en) * 2020-02-24 2021-08-03 Oppo广东移动通信有限公司 Anti-shake parameter processing method and device, electronic equipment and readable storage medium
CN113744277A (en) * 2020-05-29 2021-12-03 广州汽车集团股份有限公司 Video jitter removal method and system based on local path optimization
CN113163120A (en) * 2021-04-21 2021-07-23 安徽清新互联信息科技有限公司 Transformer-based video anti-shake method
CN114245176A (en) * 2021-12-16 2022-03-25 北京数码视讯技术有限公司 Transmission detection device and method for multimedia stream

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101281650A (en) * 2008-05-05 2008-10-08 北京航空航天大学 Quick global motion estimating method for steadying video
CN101511024A (en) * 2009-04-01 2009-08-19 北京航空航天大学 Movement compensation method of real time electronic steady image based on motion state recognition

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101281650A (en) * 2008-05-05 2008-10-08 北京航空航天大学 Quick global motion estimating method for steadying video
CN101511024A (en) * 2009-04-01 2009-08-19 北京航空航天大学 Movement compensation method of real time electronic steady image based on motion state recognition

Also Published As

Publication number Publication date
CN101742122A (en) 2010-06-16

Similar Documents

Publication Publication Date Title
CN101742122B (en) Method and system for removing video jitter
US11240471B2 (en) Road vertical contour detection
US20150086080A1 (en) Road vertical contour detection
JP3868876B2 (en) Obstacle detection apparatus and method
TWI489418B (en) Parallax Estimation Depth Generation
KR100985805B1 (en) Apparatus and method for image stabilization using adaptive Kalman filter
US20120327189A1 (en) Stereo Camera Apparatus
CN105069804B (en) Threedimensional model scan rebuilding method based on smart mobile phone
US20120162395A1 (en) Method for filling hole-region and three-dimensional video system using the same
CN105872345A (en) Full-frame electronic image stabilization method based on feature matching
CN110245199B (en) Method for fusing large-dip-angle video and 2D map
Cho et al. Affine motion based CMOS distortion analysis and CMOS digital image stabilization
CN103440664A (en) Method, system and computing device for generating high-resolution depth map
US10043106B2 (en) Corresponding point searching method and distance detection device
JP2010152521A (en) Apparatus and method for performing stereographic processing to image
CN101815225A (en) Method for generating depth map and device thereof
JP4985542B2 (en) Corresponding point search device
US9380285B2 (en) Stereo image processing method, stereo image processing device and display device
JP4831084B2 (en) Corresponding point search device
CN103118221B (en) Based on the real-time video electronic image stabilization method of field process
JPH10124687A (en) Method and device for detecting white line on road
JP4985863B2 (en) Corresponding point search device
Aboussouan Super-Resolution Image Construction Using an Array Camera
Huang et al. Video stabilization with distortion correction for wide-angle lens dashcam
KR20160115068A (en) Method and apparatus for hierarchical stereo matching

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant