CN112001860A - Video debounce algorithm based on content-aware blocking strategy - Google Patents
Video debounce algorithm based on content-aware blocking strategy Download PDFInfo
- Publication number
- CN112001860A CN112001860A CN202010810101.0A CN202010810101A CN112001860A CN 112001860 A CN112001860 A CN 112001860A CN 202010810101 A CN202010810101 A CN 202010810101A CN 112001860 A CN112001860 A CN 112001860A
- Authority
- CN
- China
- Prior art keywords
- frame
- video
- follows
- constraint
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 230000000903 blocking effect Effects 0.000 title claims abstract description 24
- 230000009466 transformation Effects 0.000 claims abstract description 66
- 238000009499 grossing Methods 0.000 claims abstract description 17
- 230000011218 segmentation Effects 0.000 claims abstract description 13
- 238000013507 mapping Methods 0.000 claims abstract description 9
- PXFBZOLANLWPMH-UHFFFAOYSA-N 16-Epiaffinine Natural products C1C(C2=CC=CC=C2N2)=C2C(=O)CC2C(=CC)CN(C)C1C2CO PXFBZOLANLWPMH-UHFFFAOYSA-N 0.000 claims abstract description 7
- 238000000034 method Methods 0.000 claims description 21
- 238000005457 optimization Methods 0.000 claims description 15
- 239000011159 matrix material Substances 0.000 claims description 9
- 230000000007 visual effect Effects 0.000 claims description 9
- 238000004364 calculation method Methods 0.000 claims description 7
- 230000003044 adaptive effect Effects 0.000 claims description 5
- 241000764238 Isis Species 0.000 claims description 3
- 230000006978 adaptation Effects 0.000 claims description 3
- 238000010606 normalization Methods 0.000 claims description 3
- 239000000843 powder Substances 0.000 claims description 3
- 238000000605 extraction Methods 0.000 claims description 2
- 241000197727 Euscorpius alpha Species 0.000 claims 1
- 238000012545 processing Methods 0.000 abstract description 2
- 230000000694 effects Effects 0.000 abstract 1
- 238000005192 partition Methods 0.000 description 3
- 238000013459 approach Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 230000002708 enhancing effect Effects 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 238000000638 solvent extraction Methods 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/02—Affine transformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/207—Analysis of motion for motion estimation over a hierarchy of resolutions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/14—Picture signal circuitry for video frequency region
- H04N5/21—Circuitry for suppressing or minimising disturbance, e.g. moiré or halo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a video debounce algorithm based on a content-aware blocking strategy, which comprises the following steps: step1, extracting a characteristic track from the video, obtaining the characteristic point distribution of each frame according to the characteristic track, and performing triangle segmentation according to the characteristic points; step2, based on the track smooth constraint, the interframe similarity transformation constraint, the intraframe similarity transformation constraint and the regular constraint, and solving the stable position of the characteristic point through self-adaptive weight setting, and further solving the stable position of the edge control point through the interframe similarity transformation constraint and the intraframe similarity transformation constraint according to the obtained stable position of the characteristic point; step 3. mapping from a dithered view to a stabilized view is established based on the affine transformed frame-by-frame image. According to the invention, each frame in the video is divided into triangular mesh structures with different numbers and sizes according to the distribution of the track points, and the triangular meshes are utilized to carry out inter-frame motion estimation and smoothing processing, so that a more stable effect can be generated, and the phenomenon of local distortion in the generated video is avoided.
Description
Technical Field
The invention relates to the technical field of computer vision and video debouncing, in particular to a video debouncing algorithm based on a content-aware blocking strategy.
Background
In recent years, as the video debounce problem is studied more and more algorithms are proposed. Despite the numerous correlation algorithms, existing algorithms follow the following three steps: camera motion estimation, camera motion smoothing, and mapping of a shake perspective to a stable perspective. We can classify the debounce algorithm into a 2D algorithm, a 3D algorithm, and a 2.5D algorithm according to the method used in the above three steps.
The 2D algorithm refers to a method of representing inter-frame motion of a video frame using an inter-frame transformation matrix and smoothing camera motion by smoothing a series of inter-frame transformation matrices. While gaussian low-pass filtering, particle filtering, regularization, etc. are typically used when smoothing camera motion. Grundmann et al achieve smoothing of the camera path by constraining the derivatives of the motion changes between frames. Joshi et al use the video frame sequence in a certain neighborhood of the current frame to perform inter-frame feature point matching to obtain the maximum interior point set that fits the video frame sequence, thereby excluding the influence of part of the foreground region on jitter estimation. Liu et al, using a warping transform based on content preservation to divide the video frame into a grid-like structure and perform camera motion estimation and smoothing locally, can more robustly cope with scenes with discontinuous depth-of-field transformation.
The 3D algorithm is specially designed for the video of a three-dimensional scene, so that the video de-jittering problem of a scene with large parallax can be better solved. The method adopts an SFM method to recover the 3D attitude matrix of the camera, and then smoothes the 3D attitude matrix sequence to eliminate jitter. Liu et al reconstructs the three-dimensional motion pattern of the captured scene and employs a mesh-based warping transformation for generation of stable frames. Zhang et al performs estimation and smoothing of camera motion by establishing a smoothness-based optimization function.
In recent years, people design 2.5D algorithms by combining the advantages of the 2D algorithm and the 3D algorithm, firstly extract the characteristic tracks of the jitter video, then smooth the characteristic tracks through various algorithms, and finally realize the mapping from the jitter frame to the stable frame by utilizing the corresponding relation of the jitter characteristic tracks and the smooth characteristic tracks in each frame. Lee et al extract feature trajectories from the dithered video and design optimization functions to solve for smooth trajectories at stable viewing angles. The above approach does not perform well when processing video containing large foreground and parallax scenes. Liu et al impose well-known subspace constraints on the feature trajectories for motion estimation and smoothing. In order to ensure the performance of a video jitter removal algorithm in a scene with a large foreground, Ling et al propose an algorithm for separating foreground and background characteristic tracks and removing jitter of a video based on a feedback strategy.
As can be seen, the conventional method divides the entire video frame into several rectangles of fixed size, and then estimates a mapping relationship from a local jittered view to a stable view for each small block. This approach has the obvious drawback of ignoring the content of the video and considering all regions of the full picture equally. For example, for areas lacking useful information, such as sky, roads, and rich content areas, such as crowd, the method divides them into rectangular areas of the same size and performs an estimation of the transformation matrix within each rectangle. For the former region, such a blocking operation may be unnecessary since the same motion pattern may be satisfied over a larger range, but for the latter region, such a blocking operation is not fine enough since there is a more complicated motion pattern and a discontinuous disparity transform.
In conclusion, the method has poor applicability to scenes with large foreground shielding, complex parallax change and the like, and has poor de-jittering capability to videos of complex scenes.
Disclosure of Invention
In order to solve the technical problem, the invention adaptively blocks the video frame according to the scene content, simultaneously imposes constraints on the foreground and the background, and performs camera motion estimation by using the full-image information, thereby enhancing the jitter removal capability of the video containing the complex scene. The invention provides a video debounce algorithm based on a content-aware blocking strategy. Unlike existing de-jittering algorithms based on fixed blocking strategies, the present algorithm considers video content and adaptively partitions the video frame into blocks accordingly. And carrying out Delaunay triangulation on the feature points of the feature track in each frame to realize the self-adaptive block partitioning strategy and realize different partitioning results in each frame. By solving a two-stage optimization problem, the position of each triangle in the triangular mesh segmentation under the stable visual angle can be obtained, and the mapping from the jittering visual angle to the stable visual angle is carried out according to the position. The algorithm proposed by the invention does not distinguish foreground and background feature tracks any more and uses all feature tracks to estimate and smooth the camera motion. In order to further improve the robustness of the algorithm, two adaptive weight setting strategies are provided, and the performance of scenes with large foreground and parallax change is obviously improved.
The technical scheme of the invention is as follows: a video de-jittering algorithm based on a content-aware blocking strategy comprises the following steps:
and Step1, extracting a characteristic track from the video, obtaining the characteristic point distribution of each frame according to the characteristic track, and performing triangle segmentation according to the characteristic points.
And Step2, solving the stable position of the characteristic point through self-adaptive weight setting based on trajectory smoothing constraint, inter-frame similarity transformation constraint, intra-frame similarity transformation constraint and regular constraint. And solving the stable position of the edge control point according to the obtained stable position of the characteristic point, the interframe similarity transformation constraint and the intraframe similarity transformation constraint.
Step3 mapping of the frame-by-frame image from a dithered view to a stabilized view based on affine transformation.
Further, in the video debounce algorithm based on the content-aware blocking strategy, the feature trajectory extraction method in Step1 is as follows:
the method comprises the steps of extracting feature points from video frames by using a KLT algorithm and tracking to generate feature tracks, dividing the video frames into grids of 10x10 in size in order to avoid all the feature tracks from gathering in a middle region, extracting 200 corner points by using a uniform threshold, and ensuring that at least one feature point can be detected by reducing the threshold for a region without corner points.
Further, in the video debounce algorithm based on the content-aware blocking strategy, a delaunay algorithm is adopted for triangulation in Step 1.
Using standard Delaunay triangulation method to pair MtPerforming segmentation and generating triangular mesh segmentation resultWherein KtRepresents QtThe number of medium triangles. MtAnd QtAre respectively expressed asAndto find the result at a stable viewing angle for these edge regions, 10 control points are set on each of the four sides of the video frame and defined as E for the set of t-th framest={Et,1,...,Et,36}. We then base the video frame on the new set of points Mt,EtDelaunay triangulation operation is performed. The segmentation result is represented asWhereinReferred to as "interior triangles" meaning vertices all made up of MtThe point in (1) constitutes a triangle.Referred to as an "outer triangle," denotes a triangle whose vertices contain at least one control point. Qt、BtThe position under a stable viewing angle is represented as
Further, in the video debounce algorithm based on the content-aware blocking strategy, the Step of solving the stable position of the feature point based on the trajectory smoothing constraint, the inter-frame similarity transformation constraint, the intra-frame similarity transformation constraint and the regularization constraint in Step2 is as follows:
the method solves an optimization problem comprising three constraints on the characteristic trackIs estimated, the three constraints are:
(1) a characteristic track P is giveniPosition at stable viewing angleShould change slowly between frames.
(2) In the t-th frame, the frame,should each stabilized triangle maintain and QtThe corresponding original triangles in (a) are similar.
(3) In the t-th frame, the frame,should maintain the transformation relationship between similar triangles in (1) and (Q)tThe transformation relations between the corresponding triangles in the tree are consistent.
Based on the above constraints, an optimization function is designed that is minimized as follows:
wherein:
(1)is a "smoothing term" used to smooth the feature trajectory by constraining the first and second derivatives of the feature trajectory to smooth the feature trajectory.The definition is as follows:
where α and β are weighting coefficients, α ═ 2 and β ═ 10.
(2)Is an "inter-frame similarity transformation constraint term" that ensures that the video frame after transformation and the original video frame remain similarly transformed, for QtK in (1)tA Delaunay triangle whose stable viewing angle is defined asQtAndthe vertex of the ith triangle is defined asAnd is required and QtThe corresponding triangles in (a) are similar.The definition is as follows:
wherein a isBAnd bBThis can be obtained by the following formula:
(3)the method is an intra-frame similarity transformation constraint item which is used for ensuring that the transformation relation between local areas in a video frame obtained by transformation is similar to the transformation relation between original video frames.The definition is as follows:
wherein:
is a weight coefficient, 20, phi (i) denotes all the adjoining triangles of triangle i, a, b, c can be obtained by the following formula:
a+b+c=1
(4)is a "regularization term" that limits the stabilized feature trajectory from remaining positionally close to the original feature trajectory, thereby avoiding excessive transformation resulting in substantial loss of video content. The definition is as follows:
further, in the video debounce algorithm based on the content-aware blocking strategy, the Step of solving the stable position of the feature point based on the trajectory smoothing constraint, the inter-frame similarity transformation constraint, the intra-frame similarity transformation constraint and the regularization constraint in Step2 is as follows:
further, in the video debounce algorithm based on the content-aware blocking strategy, the Step of solving the stable position of the edge control point in Step2 is as follows:
the following optimization problem is designed to find the positions of these control points at a stable viewing angle:
wherein:
whereinDifferent optimization operations for the feature points and the control points are represented, and are specifically defined as follows:
indicating the stable position of the feature point. Gamma is a weight coefficient with a value equal to 1.
wherein:
wherein a, b, c are solved by the following equations:
a+b+c=1
Further, in the video debounce algorithm based on the content-aware blocking strategy, the adaptive weight setting strategy in Step2 is as follows:
(1) time-adaptive based weight setting
The term expects that the change between adjacent frames of the feature trajectory tends to 0. However, rapid camera motion often results in large frame-to-frame variations in the feature trajectories, which can lead to collapse of the video content. I.e., alpha, should be appropriately reduced when fast camera motion is detected, so improvementsIn the form:
wherein the sigma is 10, the total weight of the powder,andrepresenting the velocities in the x and y directions, respectively, is calculated as follows:
(2) weight setting based on spatial adaptation
Since handheld devices capture videos of real-world scenes, there are inevitably discontinuous depth variations in these videos and inconsistencies in local motion due to foreground occlusion. This may create a distortion phenomenon in the stabilized video frame. In order to solve the problems, whether a dynamic foreground object exists in each triangle divided by a triangular mesh is judged, and then a triangular area with the foreground object is addedThe weight of the term. For the characteristic track i, an overdetermined equation is adopted to calculate Ct,iAnd Ct-1,iA transformation matrix between, wherein Ht,iThe definition is as follows:
the parameters are solved by the following equations:
the above equation is solved by the least square method, and the above formula is expressed asAt,iβt,i=Bt,iTo obtainResidual is | | At,iβ-Bt,i||2The residual is specifically defined as:
wherein theta ist,iIs a spatial scale defined as:
where ρ ═ (W/τ + H/τ)/2, and W and H denote the width and height of the video frame. τ is used to control the scale size of the partitions in the normalization, and is set to a value of 10 and remains equal to the number of control points on each side of the video frame.
By finding all Pi,tAndand performing affine transformation and mapping the jitter frame to the stable frame.
Compared with the prior art, the invention has the advantages that:
(1) the invention adaptively blocks the video frame according to the scene content, simultaneously imposes constraints on the foreground and the background, and utilizes the full-image information to carry out camera motion estimation, thereby enhancing the jitter removal capability of the video containing the complex scene.
(2) Further, adaptive weight setting based on time and space further improves the robustness of the algorithm.
(3) Further, the calculation of the stable position of the edge control point improves the stability of the image of the edge area.
Drawings
Fig. 1 is a flow chart of a video debounce algorithm method based on a content-aware blocking strategy according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, rather than all embodiments, and all other embodiments obtained by a person skilled in the art based on the embodiments of the present invention belong to the protection scope of the present invention without creative efforts.
As shown in fig. 1, according to an embodiment of the present invention, a video de-jittering algorithm method based on content-aware blocking strategy is provided, which includes the following steps:
step1, extracting and triangulating characteristic track
The method comprises the steps of extracting feature points from video frames by using a KLT algorithm and tracking to generate feature tracks, in order to avoid gathering all the feature tracks in a middle region, firstly dividing the video frames into grids of 10x10, and extracting 200 corner points in total by using a uniform threshold (considering the calculation amount, preferably, only 200 corner points are extracted), and for regions without corner points, ensuring that at least one feature point can be detected by reducing the threshold. And then generating a feature track by using the extracted feature points.
Suppose that N feature tracks are extracted in a jittered video, which is defined asThese trajectories may be fromBackground or foreground regions. For a feature trajectory i (i e [1, N ]]) The first and last frames of its occurrence are defined as siAnd ei. Defining a characteristic track i at the t(s)i≤t≤ei) The position of the frame is Pi,t. The set of the feature points of which all the feature tracks appear in the t-th frame is Mt={Pi,t|i∈[1,N],t∈[si,ei]}。
Using standard Delaunay triangulation method to pair MtPerforming segmentation and generating triangular mesh segmentation resultWherein KtRepresents QtThe number of medium triangles. MtAnd QtAre respectively expressed asAndto find the result at stable viewing angles of these edge regions, 10 control points, for a total of 36 control points, are set on each of the four sides of the video frame and defined as E in the t-th frame sett={Et,1,...,Et,36}. The video frame is then based on the new set of points Mt,EtDelaunay triangulation operation is performed. The segmentation result is represented asWhereinReferred to as "interior triangles" meaning vertices all made up of MtThe point in (1) constitutes a triangle.Referred to as an "outer triangle," denotes a triangle whose vertices contain at least one control point. Qt、BtPosition under stable viewing angle is byIs shown as
Step2, smoothing the characteristic track by using an optimization function
2.1 calculation of Stable positions of feature points
The step is to solve an optimization problem pair characteristic track containing three constraintsIs estimated, the three constraints are:
(1) a characteristic track P is giveniPosition at stable viewing angleShould change slowly between frames.
(2) In the t-th frame, the frame,should each stabilized triangle maintain and QtThe corresponding original triangles in (a) are similar.
(3) In the t-th frame, the frame,should maintain the transformation relationship between similar triangles in (1) and (Q)tThe transformation relations between the corresponding triangles in the tree are consistent.
Based on the above constraints, an optimization function is designed that is minimized as follows:
wherein:
(1)is a "smoothing term" used to smooth the feature trajectory by constraining the first and second derivatives of the feature trajectory to smooth the feature trajectory.The definition is as follows:
where α and β are weighting coefficients, α ═ 2 and β ═ 10.
(2)Is an "inter-frame similarity transformation constraint term" that ensures that the video frame after transformation and the original video frame remain similarly transformed, for QtK in (1)tA Delaunay triangle whose stable viewing angle is defined asQtAndthe vertex of the ith triangle is defined asAnd is required and QtThe corresponding triangles in (a) are similar.The definition is as follows:
wherein a isBAnd bBThis can be obtained by the following formula:
(3)the method is an intra-frame similarity transformation constraint item which is used for ensuring that the transformation relation between local areas in a video frame obtained by transformation is similar to the transformation relation between original video frames.The definition is as follows:
wherein:
where is the weight factor, 20, phi (i) denotes all the adjoining triangles of triangle i, a, b, c can be obtained by the following formula:
a+b+c=1
(4)is a "regularization term" that limits the stabilized feature trajectory from remaining positionally close to the original feature trajectory, thereby avoiding excessive transformation resulting in substantial loss of video content. The definition is as follows:
the adaptive weight setting strategy is as follows:
(1) time-adaptive based weight setting
The term expects that the change between adjacent frames of the feature trajectory tends to 0. However, rapid camera motion often results in large frame-to-frame variations in the feature trajectories, which can lead to collapse of the video content. I.e., alpha, should be appropriately reduced when fast camera motion is detected, so improvementsIn the form:
wherein the sigma is 10, the total weight of the powder,andthe calculation method of (c) is as follows:
(2) weight setting based on spatial adaptation
Since handheld devices capture videos of real-world scenes, there are inevitably discontinuous depth variations in these videos and inconsistencies in local motion due to foreground occlusion. This may create a distortion phenomenon in the stabilized video frame. In order to solve the problems, whether a dynamic foreground object exists in each triangle divided by a triangular mesh is judged, and then a triangular area with the foreground object is addedThe weight of the term. For the characteristic trajectory i, C is calculated by using the overdetermined equationt,iAnd Ct-1,iA transformation matrix between, wherein Ht,iThe definition is as follows:
the parameters are solved by the following equations:
the above equation is solved by the least square method, and the above formula is expressed as At,iβt,i=Bt,iTo obtainResidual is | | At,iβ-Bt,i||2The residual is specifically defined as:
wherein theta ist,iIs a spatial scale defined as:
where ρ ═ (W/τ + H/τ)/2, and W and H denote the width and height of the video frame. τ is used to control the scale size of the partitions in the normalization, and is set to a value of 10 and remains equal to the number of control points on each side of the video frame.
2.2 calculation of Stable position of control Point
The following optimization problem is designed to find the positions of these control points at a stable viewing angle:
wherein:
wherein:
wherein:
wherein a, b, c are solved by the following equations:
a+b+c=1
step3, affine transformation from jittering visual angle to stable visual angle
And performing homography matrix calculation according to the characteristic point coordinates under the dithering visual angle of the t-th frame and the estimated characteristic point coordinates under the stable visual angle, and performing affine transformation.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (7)
1. A video debounce algorithm based on a content-aware blocking strategy is characterized by comprising the following steps:
step1, extracting a characteristic track from the video, obtaining the characteristic point distribution of each frame according to the characteristic track, and performing triangle segmentation according to the characteristic points;
step2, based on the track smooth constraint, the interframe similarity transformation constraint, the intraframe similarity transformation constraint and the regular constraint, and solving the stable position of the characteristic point through self-adaptive weight setting, and further solving the stable position of the edge control point through the interframe similarity transformation constraint and the intraframe similarity transformation constraint according to the obtained stable position of the characteristic point;
step 3. mapping from a dithered view to a stabilized view is established based on the affine transformed frame-by-frame image.
2. The video de-jittering algorithm based on content-aware blocking strategy according to claim 1,
the characteristic track extraction method in Step1 is as follows:
extracting feature points from a video frame by using a KLT algorithm and tracking to generate a feature track, firstly dividing the video frame into grids of 10x10, extracting 200 corner points by using a unified threshold, and ensuring that at least one feature point can be detected by reducing the threshold for an area without corner points;
suppose that N feature tracks are extracted in a jittered video, which is defined asThese trajectories come from the background or foreground region, for a characteristic trajectory i, i ∈ [1, N ]]The first and last frames of its occurrence are defined as siAnd eiDefining the position of the characteristic track i in the t frame as Pi,tWherein s isi≤t≤eiThe set of feature points of which all feature tracks appear in the t-th frame is Mt={Pi,t|i∈[1,N],t∈[si,ei]}。
3. The video de-jittering algorithm based on content-aware blocking strategy according to claim 1, wherein said triangulation Step in Step1 is as follows:
set M of feature points by using standard Delaunay triangulation methodtPerforming segmentation and generating triangular mesh segmentation resultWherein KtRepresents QtNumber of medium triangles, MtAnd QtAre respectively expressed asAndto find the result at a stable view angle of the edge region, four at the video frameSetting 10 control points on each side of the edges, and defining the set of the control points in the t-th frame as Et={Et,1,...,Et,36Then the video frame is put according to the new point set { M }t,EtCarry out Delaunay triangulation operation, the segmentation result is expressed asWhereinReferred to as "interior triangles" meaning vertices all made up of MtThe point(s) in (b) constitute a triangle,referred to as "outer triangles" and representing triangles having vertices containing at least one control point, the number Lt,Qt、BtThe position under a stable viewing angle is represented as
4. The video de-jittering algorithm based on content-aware blocking strategy according to claim 1, wherein said Step2 is based on trajectory smoothing constraint, inter-frame similarity transformation constraint, intra-frame similarity transformation constraint and regularization constraint, and the Step of solving stable positions of feature points is as follows:
by solving an optimization problem pair feature trajectory containing three constraintsIs estimated, the three constraints are:
(A) a characteristic track P is giveniPosition at stable viewing angleShould change slowly between frames;
(B) in the t-th frame, the frame,should each stabilized triangle maintain and QtThe corresponding original triangles in (1) are similar;
(C) in the t-th frame, the frame,should maintain the transformation relation between adjacent trianglestThe transformation relations between the corresponding adjacent triangles in the triangle are consistent;
based on the above constraints, an optimization function is designed that is minimized as follows:
wherein:
(1)is a 'smoothing term' used to smooth the feature trajectory by constraining the first and second derivatives of the feature trajectory to smooth the feature trajectory;the definition is as follows:
wherein α and β are weighting coefficients, α ═ 2, β ═ 10;
(2)is an "inter-frame similarity transformation constraint term" that ensures that the video frame after transformation and the original video frame remain similarly transformed, for QtK in (1)tDelaunay triangle, the desired view angle being defined asStable viewing angle is defined asQtAndthe vertex of the ith triangle is defined asAnd is required and QtThe corresponding triangles in (a) are similar,the definition is as follows:
γ is a weight coefficient, γ ═ 10, where:
wherein a isBAnd bBIs the coefficient of the corresponding vector edge, solved by the following equation:
(3)the video frame is an intra-frame similarity transformation constraint item which is used for ensuring that the transformation relation between local areas in the video frame obtained by transformation is similar to the transformation relation between the local areas in the original video frame;
where γ is a weight coefficient, φ (i) represents all adjacent triangles of triangle i,is the expected stable position of the intra-frame similarity transformation, defined as follows:
where is the weighting factor, 20, a, b, c are the coefficients corresponding to the three vertices, which is obtained by the following formula:
a+b+c=1
(4)is a "regularization term" used to limit the stabilized feature trajectory from remaining positionally close to the original feature trajectory, thereby avoiding a large loss of video content due to excessive transformation, defined as follows:
5. the video de-jittering algorithm based on content-aware blocking strategy according to claim 4, wherein the Step of solving the stable position of the edge control point in Step2 is as follows:
the following optimization problem is designed to find the positions of these control points at a stable viewing angle:
wherein:
whereinDifferent optimization operations for the feature points and the control points are represented, and are specifically defined as follows:
representing the position where the feature point is stable, gamma is a weight coefficient with a value equal to 1.
(2)Is an intra-frame similarity transformation constraint term, boundary BtThe triangles in the frame are required to keep consistent transformation relation between adjacent triangles in the stable frame and the relation between corresponding triangles in the original video frame; the adjacent triangle j of the triangle i may belong to BtAnd QtTriangle j with Bt,iDifferent vertices may belong to BtOr QtA 1 to BtAnd QtIs defined asIn the t-th frame, Bt,iIs defined as BQt,j,Bt,iAnd BQt,jThe different vertices in the set are defined asAndusing linear texture mappingIs represented as BQt,jCombinations of vertices, i.e. A triangle at a stable viewing angle is shown,has a vertex of The definition is as follows:
whereinRepresenting different operations according to the control point and the feature point to which the vertex of the stabilized triangle belongs, is defined as follows:
a+b+c=1。
6. the video de-jittering algorithm based on content-aware blocking strategy according to claim 1, wherein said adaptive weight setting method in Step2 adopts two methods as follows:
(1) time-adaptive based weight setting
The change between adjacent frames of the term expected feature trajectory tends to 0, i.e. alpha should be reduced appropriately when fast camera motion is detected, improvementIn the form:
wherein the sigma is 10, the total weight of the powder,andrepresenting the velocities in the x and y directions, respectively, is calculated as follows:
(2) weight setting based on spatial adaptation
For a video shot by a handheld device in a real scene, discontinuous depth change and inconsistency of local motion caused by foreground occlusion exist, and a distortion phenomenon is generated in a stabilized video frame, so that whether a dynamic foreground object exists in each triangle segmented by a triangular mesh is judged firstly, and then a triangular area with the foreground object is addedThe weight of the item; for the characteristic trajectory i, C is calculated by using the overdetermined equationt,iAnd Ct-1,iA transformation matrix between, wherein Ht,iThe definition is as follows:
the parameters are solved by the following equations:
the above equation is solved by the least square method, and the above formula is expressed as At,iβt,i=Bt,iTo obtainResidual is | | At,iβ-Bt,i||2The residual is specifically defined as:
wherein theta ist,iIs a spatial scale defined as:
where ρ ═ (W/τ + H/τ)/2, W and H denote the width and height of the video frame; tau is used for controlling the scale size of the blocks in the normalization, the value of tau is set to 10, and the tau is equal to the number of control points on each edge of the video frame;
7. The video de-jittering algorithm based on content-aware blocking strategy according to claim 1, wherein said Step3 specifically comprises:
and performing homography matrix calculation according to the characteristic point coordinates under the dithering visual angle of the t-th frame and the estimated characteristic point coordinates under the stable visual angle, and performing affine transformation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010810101.0A CN112001860A (en) | 2020-08-13 | 2020-08-13 | Video debounce algorithm based on content-aware blocking strategy |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010810101.0A CN112001860A (en) | 2020-08-13 | 2020-08-13 | Video debounce algorithm based on content-aware blocking strategy |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112001860A true CN112001860A (en) | 2020-11-27 |
Family
ID=73463980
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010810101.0A Pending CN112001860A (en) | 2020-08-13 | 2020-08-13 | Video debounce algorithm based on content-aware blocking strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112001860A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112804444A (en) * | 2020-12-30 | 2021-05-14 | 影石创新科技股份有限公司 | Video processing method and device, computing equipment and storage medium |
-
2020
- 2020-08-13 CN CN202010810101.0A patent/CN112001860A/en active Pending
Non-Patent Citations (1)
Title |
---|
MINDA ZHAO等: "Adaptively Meshed Video Stabilization", 《ARXIV》 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112804444A (en) * | 2020-12-30 | 2021-05-14 | 影石创新科技股份有限公司 | Video processing method and device, computing equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Spatially and temporally optimized video stabilization | |
Wang et al. | Motion-aware temporal coherence for video resizing | |
CN106331480B (en) | Video image stabilization method based on image splicing | |
US20150022677A1 (en) | System and method for efficient post-processing video stabilization with camera path linearization | |
US9165401B1 (en) | Multi-perspective stereoscopy from light fields | |
US9838604B2 (en) | Method and system for stabilizing video frames | |
EP1864502A2 (en) | Dominant motion estimation for image sequence processing | |
CN101853497A (en) | Image enhancement method and device | |
CN111614965B (en) | Unmanned aerial vehicle video image stabilization method and system based on image grid optical flow filtering | |
CN106851102A (en) | A kind of video image stabilization method based on binding geodesic curve path optimization | |
Berdnikov et al. | Real-time depth map occlusion filling and scene background restoration for projected-pattern-based depth cameras | |
Wang et al. | Video stabilization: A comprehensive survey | |
CN104469086A (en) | Method and device for removing dithering of video | |
Zhao et al. | Adaptively meshed video stabilization | |
KR101851896B1 (en) | Method and apparatus for video stabilization using feature based particle keypoints | |
CN105282400B (en) | A kind of efficient video antihunt means based on geometry interpolation | |
CN112001860A (en) | Video debounce algorithm based on content-aware blocking strategy | |
Chan et al. | An object-based approach to image/video-based synthesis and processing for 3-D and multiview televisions | |
CN109729263A (en) | Video based on fusional movement model removes fluttering method | |
CN108596858B (en) | Traffic video jitter removal method based on characteristic track | |
Rawat et al. | Adaptive motion smoothening for video stabilization | |
WO2022040988A1 (en) | Image processing method and apparatus, and movable platform | |
Yousaf et al. | Real time video stabilization methods in IR domain for UAVs—A review | |
Lee et al. | ROI-based video stabilization algorithm for hand-held cameras | |
Dervişoğlu et al. | Interpolation-based smart video stabilization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20201127 |