WO2022214037A1

WO2022214037A1 - Video anti-shake processing method and apparatus, electronic device, and storage medium

Info

Publication number: WO2022214037A1
Application number: PCT/CN2022/085634
Authority: WO
Inventors: 杨松; 刘宇龙
Original assignee: 北京字跳网络技术有限公司
Priority date: 2021-04-08
Filing date: 2022-04-07
Publication date: 2022-10-13
Also published as: CN115209031A; CN115209031B

Abstract

Embodiments of the present disclosure relate to a video anti-shake processing method and apparatus, an electronic device, and a storage medium. The method comprises: by means of performing feature point tracking between different image frames in a video, and on the basis of an initial transformation mode, determining an initial amount of variation in video-recording positions in respect of different image frames in the video; on the basis of a fitting error corresponding to the initial amount of variation, using a target transformation mode matching the fitting error in order to determine the target amount of variation in the video-recording positions in respect of different image frames in the video; and on the basis of the target amount of variation, forming a movement trajectory in respect of the video-recording positions of the video; performing smoothing on each of the video-recording positions in respect of different image frames in the movement trajectory; and, on the basis of the difference between a smooth trajectory and the movement trajectory, performing shaping on the video to obtain an anti-shake-processed video. In the embodiments of the present disclosure, the target transformation mode with respect to different image frames in the video is dynamically determined according to the fitting error, ensuring the video anti-shake processing effect, and avoiding the introduction of excessively large fitting errors.

Description

Video anti-shake processing method, device, electronic device and storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese patent application No. 202110379651.6, filed on April 8, 2021 and titled "Video anti-shake processing method, device, electronic device and storage medium", the entire content of which is incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to the technical field of video processing, and in particular, to a video anti-shake processing method, device, electronic device and storage medium.

BACKGROUND

With the rise of short videos, video shooting has become more and more important. At present, users generally use handheld devices to shoot video, which can easily cause video jitter, resulting in poor video quality. Therefore, how to perform anti-shake processing on the video and improve the video quality is still a problem to be solved at present.

SUMMARY OF THE INVENTION

In order to solve the above technical problem or at least partially solve the above technical problem, the embodiments of the present disclosure provide a video anti-shake processing method, apparatus, electronic device, and storage medium.

In a first aspect, an embodiment of the present disclosure provides a video anti-shake processing method, including:

determining, by tracking feature points between different image frames in a video and based on an initial transformation method, an initial change amount of the shooting position between the different image frames in the video;

determining, based on a fitting error corresponding to the initial change amount, a target change amount of the shooting position between the different image frames in the video by adopting a target transformation method that matches the fitting error;

forming a movement trajectory of the shooting position of the video based on the target change amount of the shooting position between the different image frames in the video, wherein the movement trajectory is used to indicate the shooting positions of the different image frames in the video;

smoothing the shooting positions of the different image frames in the movement trajectory respectively to obtain a smooth trajectory; and

deforming the video based on the difference between the smooth trajectory and the movement trajectory to obtain an anti-shake processed video.

In a second aspect, an embodiment of the present disclosure further provides a video anti-shake processing device, including:

an initial change amount determination module, configured to determine, by tracking feature points between different image frames in a video and based on an initial transformation method, an initial change amount of the shooting position between the different image frames in the video;

a target change amount determination module, configured to determine, based on a fitting error corresponding to the initial change amount, a target change amount of the shooting position between the different image frames in the video by adopting a target transformation method that matches the fitting error;

a movement trajectory generation module, configured to form a movement trajectory of the shooting position of the video based on the target change amount of the shooting position between the different image frames in the video, wherein the movement trajectory is used to indicate the shooting positions of the different image frames in the video;

a smooth trajectory determination module, configured to smooth the shooting positions of the different image frames in the movement trajectory respectively to obtain a smooth trajectory; and

a video anti-shake processing module, configured to deform the video based on the difference between the smooth trajectory and the movement trajectory to obtain an anti-shake processed video.

In a third aspect, an embodiment of the present disclosure further provides an electronic device, including a memory and a processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device is caused to implement any of the video anti-shake processing methods provided in the embodiments of the present disclosure.

In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium, wherein a computer program is stored in the storage medium, and when the computer program is executed by a computing device, the computing device is caused to implement any of the video anti-shake processing methods provided in the embodiments of the present disclosure.

In a fifth aspect, there is provided a computer program product comprising computer program instructions that cause a computer to perform a method as in the first aspect or implementations thereof.

In a sixth aspect, there is provided a computer program, the computer program causing a computer to perform the method as in the first aspect or implementations thereof.

Compared with the prior art, the technical solutions provided by the embodiments of the present disclosure have at least the following advantages. In the embodiments of the present disclosure, the initial change amount of the shooting position between different image frames in the video is first determined based on the initial transformation method. Then, based on the fitting error corresponding to the initial change amount, the target change amount of the shooting position between different image frames in the video is determined by using the target transformation method that matches the fitting error; that is, the fitting error can be used to evaluate whether the selection of the initial transformation method is reasonable. Next, the movement trajectory of the shooting position of the video is formed based on the target change amount of the shooting position between different image frames in the video. Finally, video anti-shake processing is achieved through trajectory smoothing and video deformation. The embodiments of the present disclosure thus dynamically determine the target transformation method between different image frames in the video based on the fitting error corresponding to the initial change amount of the shooting position between those frames, which ensures the video anti-shake processing effect, avoids introducing excessive fitting errors, and effectively improves the video quality.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description serve to explain the principles of the disclosure.

In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, a person of ordinary skill in the art can also obtain other drawings from these drawings without creative effort.

FIG. 1 is a flowchart of a video anti-shake processing method provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of another video anti-shake processing method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic structural diagram of a video anti-shake processing apparatus according to an embodiment of the present disclosure;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

DETAILED DESCRIPTION

In order to more clearly understand the above objects, features and advantages of the present disclosure, the solutions of the present disclosure will be further described below. It should be noted that the embodiments of the present disclosure and the features in the embodiments may be combined with each other under the condition of no conflict.

Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure can also be implemented in ways other than those described herein. Obviously, the embodiments in the specification are only some of the embodiments of the present disclosure, not all of them.

FIG. 1 is a flowchart of a video anti-shake processing method provided by an embodiment of the present disclosure, which can be applied to a situation of performing anti-shake processing on a video. The method can be performed by a video anti-shake processing apparatus, which can be implemented by software and/or hardware, and can be integrated on any electronic device with computing capability, such as a terminal or a server.

In the embodiment of the present disclosure, the video to be processed may be a video that is being shot or a video that has already been shot. That is, anti-shake processing may be performed on the video in real time during shooting, or on the video after shooting is complete; in either case the quality of the video can be improved.

As shown in FIG. 1 , the video anti-shake processing method provided by the embodiment of the present disclosure may include:

S101. By tracking feature points between different image frames in the video, and based on an initial transformation method, determine an initial change amount of the shooting position between the different image frames in the video.

By tracking feature points between different image frames in the video, the matching feature points between the different image frames can be determined. Matching feature points are the feature points of the same object in different image frames, and their number can be determined according to the situation. Then, based on the initial transformation method and the matching feature points, the initial change amount of the shooting position between the different image frames can be determined. The feature point tracking may be implemented with reference to the prior art, which is not specifically limited in the embodiment of the present disclosure. The different image frames in the video may be two adjacent frames of images in the video, or may be images separated by at least two frames, such as the current frame and the first frame of the video. The initial transformation method is the calculation method used by default to calculate the change amount of the shooting position between different image frames. It can be implemented, for example, by an initial transformation matrix that characterizes the change of the shooting position, such as a homography matrix. It should be understood that, in actual processing, the initial transformation method may be flexibly selected from multiple available transformation methods according to processing requirements, which is not specifically limited in the embodiment of the present disclosure.

The initial change amount of the shooting position between different image frames may be, for example, the change amount of the shooting position of the following frame image relative to the previous frame image. Taking the implementation of the initial transformation method using an initial transformation matrix as an example, the initial change amount of the shooting position between different image frames may be a transformation matrix from the previous frame image to the subsequent frame image.

Optionally, the video anti-shake processing method provided by the embodiment of the present disclosure further includes: calculating, based on the initial transformation method and the feature points that are successfully matched between different image frames, a fitting error corresponding to the initial change amount of the shooting position between the different image frames. The fitting error can be used to evaluate whether the selection of the initial transformation method is reasonable, and thereby to determine how the initial transformation method affects the result of the video anti-shake processing. To calculate the fitting error, the initial transformation method can be used to perform coordinate transformation on the feature points on the previous frame image, or the inverse of the initial transformation method can be used to perform coordinate transformation on the feature points on the subsequent frame image. The transformed coordinates are then compared with the image coordinates of the feature points on the other frame image, so as to calculate the fitting error corresponding to the initial change amount of the shooting position between the different image frames.

Further, based on the initial transformation method and the feature points that are successfully matched between different image frames, calculating the fitting error corresponding to the initial variation of the shooting position between different image frames may include:

Use the initial transformation method to perform coordinate transformation on the feature points on the previous frame image in different image frames, and obtain the transformed coordinates of the feature points on the previous frame image;

Based on the image coordinates of the feature points on the subsequent frame image and the transformed coordinates of the feature points on the previous frame image in different image frames, the fitting error corresponding to the initial variation of the shooting position between different image frames is calculated.

As an example, and not as a specific limitation on the embodiments of the present disclosure, the calculation of the fitting error is described below for the case in which the initial transformation method is implemented with an initial transformation matrix and the different image frames are two adjacent frames of images in the video. Assume that the video $V$ contains $n$ frames of images in total and denote the i-th frame image as $f_i$, so that $V = \{f_1, f_2, \ldots, f_{n-1}, f_n\}$. The following processing is performed:

1) For the i-th frame image $f_i$, extract feature points, denoted as $p_i$.

2) Track the feature points $p_i$ of the i-th frame image on the (i+1)-th frame image $f_{i+1}$; the tracked feature points are denoted as $p_{i \sim i+1}$. That is, $p_i$ and $p_{i \sim i+1}$ are the matching feature points on the i-th frame image $f_i$ and the (i+1)-th frame image $f_{i+1}$.

3) According to the correspondence between $p_i$ and $p_{i \sim i+1}$, fit the initial transformation matrix from the i-th frame image $f_i$ to the (i+1)-th frame image $f_{i+1}$ (that is, the initial change amount of the shooting position), denoted as $T_i$.

4) Use the initial transformation matrix $T_i$ to perform coordinate transformation on the matching feature points $p_i$ to obtain the transformed coordinates, denoted as $T_i * p_i$.

5) Compare $T_i * p_i$ with $p_{i \sim i+1}$, and calculate the fitting error corresponding to the initial change amount of the shooting position between the two adjacent frames of images.

In theory, the smaller the difference between $T_i * p_i$ and $p_{i \sim i+1}$, the smaller the fitting error and the better the initial transformation matrix fits the motion between the different image frames. Otherwise, the initial transformation matrix fits the motion between the image frames poorly, and the initial transformation matrix, that is, the initial transformation method, needs to be dynamically replaced.

The specific way of obtaining the fitting error from the image coordinates of the feature points on the subsequent frame image and the transformed coordinates of the feature points on the previous frame image can be flexibly determined in actual processing. For example, according to the correspondence between the feature points on the two frames of images, the difference or the quotient between the image coordinates of each feature point on the subsequent frame image and the transformed coordinates of the corresponding feature point on the previous frame image can be calculated. The differences or quotients can then be summed (including weighted summation) to obtain the fitting error corresponding to the initial change amount of the shooting position between different image frames. Alternatively, the average of the differences or the average of the quotients can be used as the fitting error.
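The mean-of-differences variant can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the transformation is taken to be a 3x3 matrix acting on homogeneous image coordinates, the "difference" is taken to be the Euclidean distance between coordinates, and the helper names `apply_transform` and `fitting_error` are hypothetical.

```python
import math


def apply_transform(T, point):
    """Apply a 3x3 transformation matrix T (nested lists) to a 2D point."""
    x, y = point
    u = T[0][0] * x + T[0][1] * y + T[0][2]
    v = T[1][0] * x + T[1][1] * y + T[1][2]
    w = T[2][0] * x + T[2][1] * y + T[2][2]
    # Divide by the homogeneous coordinate to return to image coordinates.
    return (u / w, v / w)


def fitting_error(T, prev_points, next_points):
    """Mean distance between transformed previous-frame points and the
    matched points tracked on the subsequent frame (assumed metric)."""
    if len(prev_points) != len(next_points) or not prev_points:
        raise ValueError("need equally sized, non-empty lists of matched points")
    total = 0.0
    for p, q in zip(prev_points, next_points):
        u, v = apply_transform(T, p)
        total += math.hypot(u - q[0], v - q[1])
    return total / len(prev_points)


# Illustrative data: T is a pure translation by (2, 1).
T = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
prev_pts = [(0.0, 0.0), (10.0, 5.0)]
next_pts = [(2.0, 1.0), (12.0, 9.0)]  # second match deviates by 3 pixels
error = fitting_error(T, prev_pts, next_pts)
```

A weighted sum or a quotient-based measure, both of which the text also allows, would replace only the accumulation inside the loop.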

S102. Based on the fitting error corresponding to the initial variation, determine the target variation of the shooting position between different image frames in the video by adopting a target transformation method that matches the fitting error.

According to the relationship between the fitting error and an error threshold, a target transformation method matching the fitting error can be determined, so as to improve the video anti-shake processing effect. The error threshold can be set to a single value. In this case, if the fitting error is less than the error threshold, the initial transformation method is determined as the target transformation method that matches the fitting error; if the fitting error is greater than or equal to the error threshold, a transformation method whose degree of freedom differs from that of the initial transformation method is used as the target transformation method, so as to reduce the fitting error of the shooting position between different image frames. The error threshold can also be set hierarchically as multiple values, with each threshold corresponding to a different transformation method. The optional transformation methods differ in their degrees of freedom, and the fitting errors corresponding to the initial change amount of the shooting position between different image frames obtained with each transformation method also differ. It should be noted that each threshold mentioned in the embodiments of the present disclosure can be flexibly set in actual processing, which is not specifically limited in the embodiment of the present disclosure.

For example, the error threshold includes a first error threshold and a second error threshold, where the first error threshold is less than the second error threshold. If the fitting error corresponding to the initial change amount of the shooting position between different image frames in the video is less than the first error threshold, the initial transformation method is determined as the target transformation method that matches the fitting error, and the target transformation method is used to determine the target change amount of the shooting position between the different image frames in the video; or

if the fitting error corresponding to the initial change amount is greater than or equal to the first error threshold and less than the second error threshold, a first transformation method whose degree of freedom is smaller than that of the initial transformation method is determined as the target transformation method that matches the fitting error, and the target transformation method is used to determine the target change amount of the shooting position between the different image frames in the video; or

if the fitting error corresponding to the initial change amount is greater than or equal to the second error threshold, a second transformation method whose degree of freedom is smaller than that of the first transformation method is determined as the target transformation method that matches the fitting error, and the target transformation method is used to determine the target change amount of the shooting position between the different image frames in the video.

Further, the initial transformation mode includes a homography transformation mode, the first transformation mode includes an affine transformation mode, and the second transformation mode includes a similarity transformation mode.
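The two-threshold selection can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `select_transform`, the string labels, and the threshold values in the usage lines are assumptions chosen only for the example.

```python
def select_transform(fitting_error, first_threshold, second_threshold):
    """Pick the transformation method that matches a fitting error.

    Below the first threshold the initial (homography) method is kept;
    between the two thresholds the affine method is used; at or above
    the second threshold the similarity method is used.
    """
    if first_threshold >= second_threshold:
        raise ValueError("first_threshold must be less than second_threshold")
    if fitting_error < first_threshold:
        return "homography"   # initial method, eight degrees of freedom
    if fitting_error < second_threshold:
        return "affine"       # first method, six degrees of freedom
    return "similarity"       # second method, four degrees of freedom


# Hypothetical thresholds, in pixels.
low = select_transform(0.5, first_threshold=1.0, second_threshold=3.0)
mid = select_transform(1.0, first_threshold=1.0, second_threshold=3.0)
high = select_transform(3.0, first_threshold=1.0, second_threshold=3.0)
```

The boundary cases follow the text: an error equal to a threshold falls into the lower-degree-of-freedom branch.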

A homography transformation is a transformation from one plane to another plane and has eight degrees of freedom in total. An affine transformation is a linear transformation from two-dimensional coordinates to two-dimensional coordinates that preserves the "straightness" and "parallelism" of two-dimensional figures; it mainly includes translation, rotation, scaling, shear (also called tilt, staggered, or offset transformation), and flip, and has six degrees of freedom in total. Compared with the affine transformation, the similarity transformation has no shear and no flip, and has four degrees of freedom in total.

Each transformation relationship corresponds to a motion model. A motion model with a higher degree of freedom (that is, a transformation matrix with a higher degree of freedom) has better fitting ability, but is also more prone to introducing fitting errors. Therefore, during video anti-shake processing, a motion model with a high degree of freedom can first be used to fit the motion between different image frames in the video, and the type of motion model between different image frames can then be dynamically adjusted according to the fitting error. That is, if the fitting error is too large, a motion model with a lower degree of freedom replaces the motion model with a higher degree of freedom. This gives up some smoothing capability in exchange for avoiding large fitting errors, achieves a balance between motion smoothing and fitting error between different image frames, and ensures the final video stabilization effect.
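To make the lowest-degree-of-freedom model concrete, the sketch below re-fits matched feature points with a four-degree-of-freedom similarity transformation. The source does not specify how the replacement model is fitted; the closed-form least-squares fit over complex coordinates used here is a swapped-in standard technique, and the function name `fit_similarity` is hypothetical.

```python
def fit_similarity(src_points, dst_points):
    """Least-squares similarity transform mapping src_points to dst_points.

    Points are treated as complex numbers z = x + iy and the model is
    w = a * z + b, where a encodes scale and rotation and b encodes
    translation (four real parameters, no shear and no flip).
    Returns the equivalent 3x3 matrix as nested lists.
    """
    if len(src_points) != len(dst_points) or len(src_points) < 2:
        raise ValueError("need at least two matched point pairs")
    z = [complex(x, y) for x, y in src_points]
    w = [complex(x, y) for x, y in dst_points]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    denom = sum(abs(zk - z_mean) ** 2 for zk in z)
    if denom == 0.0:
        raise ValueError("source points must not all coincide")
    a = sum((wk - w_mean) * (zk - z_mean).conjugate()
            for zk, wk in zip(z, w)) / denom
    b = w_mean - a * z_mean
    return [[a.real, -a.imag, b.real],
            [a.imag, a.real, b.imag],
            [0.0, 0.0, 1.0]]


# Illustrative data: rotate by 90 degrees, scale by 2, translate by (3, -1).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dst = [(3.0, -1.0), (3.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
M = fit_similarity(src, dst)
```

Because the model has only four parameters, it cannot absorb perspective or shear in the matches, which is the trade-off between fitting ability and robustness that the paragraph above describes.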

S103. Based on the target change amount of the shooting position between different image frames in the video, form a movement trajectory of the shooting position of the video, wherein the movement trajectory is used to indicate the shooting positions of different image frames in the video.

For example, one frame of image in the video can be selected as the reference frame image (the reference frame image can be determined adaptively). The target change amounts of the shooting position between different image frames are then used to obtain the target change amount of the shooting position of each frame of image in the video relative to the reference frame image, and the movement trajectory of the shooting position of the video (that is, the movement trajectory of the device shooting the video) is obtained from these target change amounts.

Optionally, in this embodiment of the present disclosure, the initial change amount or the target change amount may be represented by a transformation matrix. The movement trajectory then includes multiple transformation matrices, and the different transformation matrices in the movement trajectory respectively represent the shooting positions of different image frames in the video. Correspondingly, taking the first frame image in the video as the reference frame image as an example, forming the movement trajectory of the shooting position of the video based on the target change amount of the shooting position between different image frames in the video includes:

Determine the transformation matrix of each frame image in the video relative to the first frame image based on the target transformation matrix of the shooting position between different image frames in the video;

Based on the transformation matrix of each frame of image in the video relative to the first frame of image, the movement track of the shooting position of the video is formed.

Assume that the target transformation matrix of the shooting position between the i-th frame image $f_i$ and the (i+1)-th frame image $f_{i+1}$ in the video is denoted as $T_i$. The target transformation matrices of the shooting position between the different image frames up to the i-th frame (for example, between each pair of adjacent frames) can then be accumulated, for example by cumulative multiplication (the specific operation can be determined according to the actual processing), to obtain the transformation matrix of the shooting position of the i-th frame image $f_i$ relative to the first frame image, denoted as $C_i$.

By obtaining in turn the transformation matrix of the shooting position of each frame of image in the video relative to the first frame image, the movement trajectory of the shooting position of the video can be expressed as $C = \{C_1, C_2, \ldots, C_{n-1}, C_n\}$, where $n$ represents the number of image frames included in the video.
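The cumulative multiplication can be sketched as follows. The source leaves the exact accumulation to the actual processing, so the convention here is an assumption: the first frame is assigned the identity matrix, and each later entry is obtained by left-multiplying the previous entry by the inter-frame matrix (column-vector convention). The names `matmul3` and `build_trajectory` are hypothetical.

```python
def matmul3(A, B):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def build_trajectory(inter_frame_matrices):
    """Accumulate inter-frame matrices into a trajectory.

    inter_frame_matrices[k] maps frame k+1 to frame k+2 (1-based frames).
    The result holds one matrix per frame, each relative to frame 1.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    trajectory = [identity]          # frame 1 relative to itself
    for T in inter_frame_matrices:
        # Assumed order: apply the accumulated motion first, then T.
        trajectory.append(matmul3(T, trajectory[-1]))
    return trajectory


def translation(dx, dy):
    """3x3 matrix for a pure translation, used only for the example."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


# Three frames: shift by (2, 0) between frames 1 and 2, then by (1, 3).
trajectory = build_trajectory([translation(2.0, 0.0), translation(1.0, 3.0)])
```

With n frames there are n - 1 inter-frame matrices and n trajectory entries, matching the set C described above.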

S104. Smooth the shooting positions of different image frames in the movement trajectory respectively to obtain a smooth trajectory.

Once the movement trajectory of the shooting position of the video is obtained, the shaking trend of the shooting position of the video is determined. Any available smoothing algorithm in the prior art, such as a Gaussian smoothing algorithm, can be used to smooth the shooting positions of different image frames in the movement trajectory to obtain a smooth trajectory. The smooth trajectory is used to indicate the shooting positions of different image frames in the smoothed video.

S105. Deform the video based on the difference between the smooth trajectory and the movement trajectory to obtain an anti-shake processed video.

By comparing the smooth trajectory with the movement trajectory $C$ before smoothing, the adjustment parameter $W = \{W_1, W_2, \ldots, W_{n-1}, W_n\}$ of the movement trajectory can be determined, where each sub-value in the adjustment parameter $W$ corresponds to one frame of image in the video and $n$ represents the number of image frames included in the video. Then, according to the correspondence between each sub-value in the adjustment parameter and each frame of image in the video, the corresponding image frame can be deformed based on each sub-value in the adjustment parameter, so as to obtain the anti-shake processed video. The deformation may involve rotating, translating, scaling, or cropping a specific frame image, as required by the actual processing.
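The comparison step can be illustrated with a deliberately simplified sketch. It assumes, for illustration only, that each shooting position is reduced to a two-dimensional translation, so that each adjustment sub-value is the per-frame offset from the original position to the smoothed one; the source does not state this form, and the names `adjustment_parameters` and `translation_matrix` are hypothetical.

```python
def adjustment_parameters(moving, smooth):
    """Per-frame offsets that move each original position onto its
    smoothed position (translation-only assumption)."""
    if len(moving) != len(smooth):
        raise ValueError("trajectories must have one entry per frame")
    return [(sx - mx, sy - my)
            for (mx, my), (sx, sy) in zip(moving, smooth)]


def translation_matrix(dx, dy):
    """3x3 warp matrix that shifts a frame by (dx, dy)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


# Illustrative trajectories for three frames.
moving = [(0.0, 0.0), (4.0, -2.0), (2.0, 2.0)]   # jittery positions
smooth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]    # smoothed positions
W = adjustment_parameters(moving, smooth)
warps = [translation_matrix(dx, dy) for dx, dy in W]
```

Each matrix in `warps` would then be handed to whatever image-warping routine is in use; with full transformation matrices instead of translations, the offset would become a matrix relating the two trajectories rather than a subtraction.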

In the embodiment of the present disclosure, the initial change amount of the shooting position between different image frames in the video is first determined based on the initial transformation method. Then, based on the fitting error corresponding to the initial change amount, the target change amount of the shooting position between different image frames in the video is determined by using the target transformation method that matches the fitting error; that is, the fitting error can be used to evaluate whether the selection of the initial transformation method is reasonable. Next, the movement trajectory of the shooting position of the video is formed based on the target change amount of the shooting position between different image frames in the video. Finally, video anti-shake processing is achieved through trajectory smoothing and video deformation. The embodiment of the present disclosure thus dynamically determines the target transformation method between different image frames in the video based on the fitting error corresponding to the initial change amount of the shooting position between those frames, which ensures the video anti-shake processing effect, avoids introducing excessive fitting errors, and effectively improves the video quality.

FIG. 2 is a flowchart of another video anti-shake processing method provided by an embodiment of the present disclosure, which is further optimized and expanded based on the foregoing technical solution, and may be combined with the foregoing optional implementation manners.

As shown in FIG. 2 , the video anti-shake processing method provided by the embodiment of the present disclosure may include:

S201. By tracking feature points between different image frames in the video, and based on an initial transformation method, determine the initial change amount of the shooting position between the different image frames in the video.

S202. Use the initial transformation method to perform coordinate transformation on the feature points on the previous frame image in different image frames, to obtain the transformed coordinates of the feature points on the previous frame image.

S203. Use the image coordinates of the feature points on the subsequent frame image and the transformed coordinates of the feature points on the previous frame image in different image frames to calculate the cumulative error corresponding to the initial change amount of the shooting position between the different image frames.

For example, take the i-th frame image $f_i$ and the (i+1)-th frame image $f_{i+1}$. The number of successfully matched feature points on these two frames is denoted as $M_i$, and the j-th feature point on the i-th frame image $f_i$ is denoted as $p_i^j$. The transformed coordinates corresponding to the j-th feature point can then be expressed as $T_i * p_i^j$, where $T_i$ represents the initial change amount of the shooting position between the i-th frame image $f_i$ and the (i+1)-th frame image $f_{i+1}$. The j-th feature point on the (i+1)-th frame image $f_{i+1}$ is denoted as $p_{i \sim i+1}^j$. The cumulative error corresponding to the initial change amount of the shooting position between the i-th frame image $f_i$ and the (i+1)-th frame image $f_{i+1}$ can then be expressed as

$$\sum_{j=1}^{M_i} \left\| T_i * p_i^j - p_{i \sim i+1}^j \right\|$$

where $\| \cdot \|$ denotes the difference between the two coordinates.

S204. Based on the cumulative error and the number of feature points on the previous frame image, calculate the fitting error corresponding to the initial change amount of the shooting position between different image frames.

Continuing the above example, the mean value can be calculated from the cumulative error and the number $M_i$ of feature points on the previous frame image (equal to the number of successfully matched feature points between the different image frames) to obtain the fitting error $E_i$ corresponding to the initial change amount of the shooting position between the different image frames, which can be expressed as follows:

$$E_i = \frac{1}{M_i} \sum_{j=1}^{M_i} \left\| T_i * p_i^j - p_{i \sim i+1}^j \right\|$$

By calculating the mean value of multiple successfully matched feature points, the fitting error corresponding to the initial variation of the shooting position between different image frames is determined, which ensures the accuracy of the fitting error calculation.
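Steps S202 to S204 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes that T_i is a 3x3 matrix acting on homogeneous image coordinates and that the per-point deviation is the Euclidean distance, neither of which is fixed by the text. The names `apply_transform` and `fitting_error` and the sample coordinates are hypothetical.

```python
from math import hypot


def apply_transform(T, point):
    """Apply a 3x3 matrix T to a 2D point using homogeneous coordinates."""
    x, y = point
    xh = T[0][0] * x + T[0][1] * y + T[0][2]
    yh = T[1][0] * x + T[1][1] * y + T[1][2]
    w = T[2][0] * x + T[2][1] * y + T[2][2]
    return (xh / w, yh / w)


def fitting_error(T, prev_points, next_points):
    """Mean deviation between transformed previous-frame points and the
    matched points on the subsequent frame (S203 and S204)."""
    if len(prev_points) != len(next_points) or not prev_points:
        raise ValueError("need the same non-zero number of matched points")
    cumulative = 0.0
    for p, q in zip(prev_points, next_points):
        tx, ty = apply_transform(T, p)              # S202: transformed coordinates
        cumulative += hypot(tx - q[0], ty - q[1])   # S203: assumed Euclidean deviation
    return cumulative / len(prev_points)            # S204: mean over the M_i points


# Hypothetical data: T is a pure translation by (2, 1).
T = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
prev_pts = [(0.0, 0.0), (10.0, 5.0)]
next_pts = [(2.0, 1.0), (12.0, 9.0)]  # second match is off by 3 pixels vertically
error = fitting_error(T, prev_pts, next_pts)
print(error)
```

With these sample points the first match is fitted exactly and the second deviates by 3 pixels, so the mean over the two points is 1.5.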

S205. Based on the fitting error corresponding to the initial variation, determine the target variation of the shooting position between different image frames in the video by adopting a target transformation method that matches the fitting error.

S206. Based on the target variation of the shooting position between different image frames in the video, form a movement track of the shooting position of the video, wherein the movement track is used to indicate the shooting position of different image frames in the video.

S207. Perform smoothing processing on the shooting positions of different image frames in the moving track respectively to obtain a smooth track.

Exemplarily, based on a preset smoothing radius, smoothing processing may be performed on the shooting positions of different image frames in the moving track respectively to obtain a smooth track. The value of the preset smoothing radius determines the number of image frames involved in the smoothing process; its specific value can be chosen flexibly in actual processing and is not specifically limited in the embodiment of the present disclosure. Exemplarily, for each frame of image in the video, the preset number of frames of images participating in the smoothing process may be determined based on the preset smoothing radius. The smoothed shooting position of each frame of image is then determined from the shooting positions, in the moving track, of that preset number of frames of images, for example by performing a weighted sum calculation on those shooting positions. Finally, the smooth track is obtained from the smoothed shooting positions of all frames of image.

Optionally, determining, based on the preset smoothing radius, the preset number of frames of images participating in the smoothing process may include either of the following. In the first option, for each frame of image in the video, a first preset number of previous frame images before that frame of image is determined based on the preset smoothing radius (the first preset number of frames here equals the preset smoothing radius), and the frame of image together with those previous frame images is determined as the preset number of frames of images participating in the smoothing process. In the second option, a second preset number of previous frame images before each frame of image and a second preset number of subsequent frame images after it are determined based on the preset smoothing radius (the second preset number of frames here equals the preset smoothing radius), and the frame of image together with those previous and subsequent frame images is determined as the preset number of frames of images participating in the smoothing process.
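The two ways of choosing the frames that participate in smoothing can be sketched as a small index helper. This is an illustrative sketch only: the function name `window_indices`, the zero-based frame indexing, and the clamping of the window at the start and end of the video are assumptions that the text does not specify.

```python
def window_indices(i, n_frames, radius, symmetric=True):
    """Return the indices of the frames that take part in smoothing frame i.

    symmetric=False: frame i plus the `radius` frames before it.
    symmetric=True:  frame i plus `radius` frames before and after it.
    The window is clamped to the valid range [0, n_frames - 1] (assumption).
    """
    if not 0 <= i < n_frames:
        raise IndexError("frame index out of range")
    start = max(0, i - radius)
    end = min(n_frames - 1, i + radius) if symmetric else i
    return list(range(start, end + 1))


print(window_indices(5, 10, 2, symmetric=False))  # previous frames only
print(window_indices(5, 10, 2, symmetric=True))   # previous and subsequent frames
```

The one-sided window only needs frames that have already been seen, while the two-sided window also looks ahead by the smoothing radius.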

Take as an example the case in which the same number of frames, equal to the preset smoothing radius, is taken before and after each frame of image, and the smoothed shooting position of each frame of image is represented by a smoothing matrix. The smoothing matrix of each frame image f_i can then be represented as the weighted sum of the transformation matrices C_t over the frames t from i-r to i+r, each weighted by w_{i~t}.

Among them, r is the preset smoothing radius, C_t is the transformation matrix (i.e., the shooting position) in the moving track C of each frame of image participating in the smoothing process, and w_{i~t} is the weight of each frame of image participating in the smoothing process, whose value can be set adaptively and is not specifically limited in the embodiment of the present disclosure. After the smoothing matrix of each frame of image is obtained, the smooth trajectory can be expressed as the sequence of these smoothing matrices.
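The weighted-sum smoothing of S207 can be sketched as below. This is a minimal sketch under stated assumptions, not the disclosed implementation: each shooting position is a 3x3 matrix stored as nested lists, the weights default to uniform values, the weights are normalized to sum to one, and the window is clamped at the ends of the video. The name `smooth_trajectory` and the sample trajectory are hypothetical.

```python
def smooth_trajectory(trajectory, radius, weight=None):
    """Smooth a list of 3x3 matrices with a weighted sum over [i - r, i + r]."""
    if weight is None:
        weight = lambda i, t: 1.0  # assumed default: uniform weights
    n = len(trajectory)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n - 1, i + radius)  # clamped window
        weights = [weight(i, t) for t in range(lo, hi + 1)]
        total = sum(weights)
        acc = [[0.0] * 3 for _ in range(3)]
        for w, t in zip(weights, range(lo, hi + 1)):
            for r in range(3):
                for c in range(3):
                    # normalized weight (assumption) times the entry of C_t
                    acc[r][c] += (w / total) * trajectory[t][r][c]
        smoothed.append(acc)
    return smoothed


def translation(dx, dy):
    """Helper: 3x3 matrix of a pure translation."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


# Hypothetical jittery horizontal positions of five frames.
moving = [translation(x, 0.0) for x in (0.0, 4.0, 2.0, 6.0, 4.0)]
smooth = smooth_trajectory(moving, radius=1)
print([round(m[0][2], 3) for m in smooth])
```

A non-uniform weight, for example one that decreases with the distance between i and t, can be passed through the `weight` argument.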

S208. Determine the adjustment parameter based on the difference between the smooth track and the moving track.

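Exemplarily, the difference between the smooth trajectory of the photographing device and the moving trajectory C before smoothing can be taken to obtain the adjustment parameter W = {W_1, W_2, ..., W_{n-1}, W_n}, where each sub-value W_i in the adjustment parameter W is the difference between the smoothed shooting position of the i-th frame image and its shooting position in the moving trajectory C.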

S209. Transform the video by using the adjustment parameters to obtain a video that has undergone anti-shake processing.

According to the correspondence between each sub-value in the adjustment parameter and each frame of image in the video, the corresponding image frame can be deformed based on each sub-value in the adjustment parameter, so as to obtain the video after anti-shake processing.
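The adjustment parameter of S208 can be sketched as a per-frame difference between the two trajectories. This is only an illustration: it assumes that each shooting position is a 3x3 matrix, that the difference is taken element by element, and that the sign convention is smooth minus moving, none of which is fixed by the text. The names `adjustment_parameters` and `translation` and the sample values are hypothetical, and the deformation of the image itself in S209 is not shown.

```python
def translation(dx, dy):
    """Helper: 3x3 matrix of a pure translation."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def adjustment_parameters(smooth_track, moving_track):
    """Return one sub-value W_i per frame: smooth position minus moving position."""
    if len(smooth_track) != len(moving_track):
        raise ValueError("both trajectories must cover the same frames")
    return [
        [[s[r][c] - m[r][c] for c in range(3)] for r in range(3)]
        for s, m in zip(smooth_track, moving_track)
    ]


# Hypothetical trajectories for three frames.
moving = [translation(0.0, 0.0), translation(4.0, 1.0), translation(2.0, 0.0)]
smooth = [translation(1.0, 0.0), translation(2.0, 0.5), translation(3.0, 0.0)]
W = adjustment_parameters(smooth, moving)
print(W[1][0][2], W[1][1][2])
```

Each W_i lines up with one frame of the video, which is the correspondence used when the frames are deformed.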

In the embodiment of the present disclosure, by calculating the fitting error corresponding to the initial variation of the shooting position between different image frames in the video, the transformation mode available between different image frames is determined dynamically according to the fitting error, which ensures the processing effect of video anti-shake, avoids the introduction of excessive fitting errors, and effectively improves the video quality.

FIG. 3 is a schematic structural diagram of a video anti-shake processing apparatus according to an embodiment of the present disclosure, which can be applied to a situation of performing anti-shake processing on a video. The apparatus can be implemented by software and/or hardware, and can be integrated on any electronic device with computing capability, such as a terminal or a server.

As shown in FIG. 3, the video anti-shake processing apparatus 300 provided by the embodiment of the present disclosure may include an initial change amount determination module 301, a target change amount determination module 302, a movement trajectory generation module 303, a smooth trajectory determination module 304, and a video anti-shake processing module 305, where:

The initial change amount determination module 301 is used to determine the initial change amount of the shooting position between different image frames in the video based on the initial transformation method by performing feature point tracking between different image frames in the video;

The target change amount determination module 302 is used for determining the target change amount of the shooting position between different image frames in the video based on the fitting error corresponding to the initial change amount by adopting a target transformation method that matches the fitting error;

The movement trajectory generation module 303 is used to form the movement trajectory of the shooting position of the video based on the target variation of the shooting position between different image frames in the video, wherein the movement trajectory is used to indicate the shooting position of different image frames in the video;

The smooth trajectory determination module 304 is configured to perform smoothing processing on the shooting positions of different image frames in the moving trajectory respectively to obtain a smooth trajectory;

The video anti-shake processing module 305 is configured to deform the video based on the difference between the smooth track and the moving track, so as to obtain a video that has undergone anti-shake processing.

Optionally, the video anti-shake processing apparatus 300 provided in this embodiment of the present disclosure further includes:

The transformation coordinate determination module is used to perform coordinate transformation on the feature points on the previous frame image in different image frames by using the initial transformation method, so as to obtain the transformed coordinates of the feature points on the previous frame image;

The fitting error calculation module is used to calculate the fitting error corresponding to the initial change amount of the shooting position between different image frames, based on the image coordinates of the feature points on the subsequent frame image and the transformed coordinates of the feature points on the previous frame image in the different image frames.

Optionally, the fitting error calculation module includes:

The cumulative error calculation unit is used to calculate the cumulative error corresponding to the initial change amount of the shooting position between different image frames by using the image coordinates of the feature points on the subsequent frame image and the transformed coordinates of the feature points on the previous frame image in the different image frames;

The fitting error calculation unit is configured to calculate the fitting error corresponding to the initial variation of the shooting position between different image frames based on the accumulated error and the number of feature points on the previous frame image.

Optionally, the target variation determination module 302 includes:

a first determining unit, configured to determine the initial transformation mode as the target transformation mode matching the fitting error if the fitting error corresponding to the initial change amount is smaller than the first error threshold, and to use the target transformation mode to determine the target change amount of the shooting position between different image frames in the video; or

a second determining unit, configured to determine a first transformation mode with a degree of freedom smaller than that of the initial transformation mode as the target transformation mode matching the fitting error if the fitting error corresponding to the initial change amount is greater than or equal to the first error threshold and less than the second error threshold, and to use the target transformation mode to determine the target change amount of the shooting position between different image frames in the video; or

a third determining unit, configured to determine a second transformation mode with a degree of freedom smaller than that of the first transformation mode as the target transformation mode matching the fitting error if the fitting error corresponding to the initial change amount is greater than or equal to the second error threshold, and to use the target transformation mode to determine the target change amount of the shooting position between different image frames in the video.

Optionally, the initial transformation mode includes a homography transformation mode, the first transformation mode includes an affine transformation mode, and the second transformation mode includes a similarity transformation mode.
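The threshold logic of the determining units, with the optional homography, affine and similarity modes, can be sketched as follows. The function name `select_transformation_mode` and the threshold values in the example are hypothetical; the disclosure does not give concrete thresholds. The degrees of freedom noted in the comments are the standard ones for planar transformations.

```python
def select_transformation_mode(fitting_error, first_threshold, second_threshold):
    """Pick the target transformation mode that matches the fitting error."""
    if first_threshold > second_threshold:
        raise ValueError("first threshold must not exceed the second threshold")
    if fitting_error < first_threshold:
        return "homography"   # initial mode, 8 degrees of freedom
    if fitting_error < second_threshold:
        return "affine"       # first mode, 6 degrees of freedom
    return "similarity"       # second mode, 4 degrees of freedom


# Hypothetical thresholds, in pixels.
for e in (0.5, 1.0, 3.0):
    print(e, select_transformation_mode(e, first_threshold=1.0, second_threshold=3.0))
```

A larger fitting error therefore leads to a mode with fewer degrees of freedom, which is the fallback order described above.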

Optionally, the initial change amount or the target change amount is represented by a transformation matrix;

The movement trajectory generation module 303 includes:

A transformation matrix determining unit, for determining the transformation matrix of each frame of image relative to the first frame of image in the video based on the target transformation matrix of the shooting position between different image frames in the video;

The movement trajectory generation unit is configured to form the movement trajectory of the shooting position of the video based on the transformation matrix of each frame of image in the video relative to the first frame of image.
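Forming the movement trajectory from the inter-frame transformation matrices can be sketched as a running matrix product. This is a sketch under assumptions: the matrices are 3x3, the first frame is given the identity matrix, and each new inter-frame matrix is multiplied on the left of the accumulated one; the text does not state the multiplication order. The names `matmul3`, `build_trajectory` and `translation` are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def build_trajectory(inter_frame):
    """Accumulate inter-frame matrices T_1..T_{n-1} into per-frame matrices
    relative to the first frame: C_1 = I, C_{i+1} = T_i * C_i (assumed order)."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    trajectory = [identity]
    for T in inter_frame:
        trajectory.append(matmul3(T, trajectory[-1]))
    return trajectory


def translation(dx, dy):
    """Helper: 3x3 matrix of a pure translation."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


# Two hypothetical inter-frame motions give a trajectory of three frames.
track = build_trajectory([translation(2.0, 0.0), translation(1.0, 3.0)])
print(track[2][0][2], track[2][1][2])
```

For pure translations the accumulated matrix simply adds the offsets, so the third frame sits at (3, 3) relative to the first.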

Optionally, the video anti-shake processing module 305 includes:

an adjustment parameter determination unit for determining an adjustment parameter based on the difference between the smooth trajectory and the moving trajectory;

The video deformation unit is used to deform the video by using the adjustment parameters to obtain the video that has undergone anti-shake processing.

The video anti-shake processing apparatus provided by the embodiment of the present disclosure can execute any video anti-shake processing method provided by the embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method. For the content that is not described in detail in the apparatus embodiment of the present disclosure, reference may be made to the description in any method embodiment of the present disclosure.

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, which is used to exemplarily describe an electronic device that implements the video anti-shake processing method provided by the embodiment of the present disclosure. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), and vehicle-mounted terminals (e.g., car navigation terminals), as well as stationary terminals such as digital TVs, desktop computers, smart home devices, wearable electronic devices, servers, and the like. The electronic device shown in FIG. 4 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.

As shown in FIG. 4 , electronic device 400 includes one or more processors 401 and memory 402 .

Processor 401 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 400 to perform desired functions.

Memory 402 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory, among others. Non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 401 may execute the program instructions to implement the video anti-shake processing method provided by the embodiments of the present disclosure, and may also implement other desired functions. Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.

Wherein, the video anti-shake processing method provided by the embodiment of the present disclosure may include: determining, by tracking feature points between different image frames in the video and based on an initial transformation method, the initial change amount of the shooting position between different image frames in the video; determining, based on the fitting error corresponding to the initial change amount, the target change amount of the shooting position between different image frames in the video by adopting a target transformation method that matches the fitting error; forming a moving track of the shooting position of the video based on the target change amount of the shooting position between different image frames in the video, wherein the moving track is used to indicate the shooting position of different image frames in the video; smoothing the shooting positions of different image frames in the moving track respectively to obtain a smooth track; and deforming the video based on the difference between the smooth track and the moving track to obtain an anti-shake processed video. It should be understood that the electronic device 400 may also perform other optional implementations provided by the method embodiments of the present disclosure.

In one example, the electronic device 400 may also include an input device 403 and an output device 404 interconnected by a bus system and/or other form of connection mechanism (not shown).

The input device 403 may include, for example, a keyboard, a mouse, and the like.

The output device 404 can output various information to the outside, including the determined distance information, direction information, and the like. The output devices 404 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.

Of course, for simplicity, only some of the components in the electronic device 400 related to the present disclosure are shown in FIG. 4 , and components such as buses, input/output interfaces, and the like are omitted. Besides, the electronic device 400 may also include any other appropriate components according to the specific application.

In addition to the above-mentioned methods and devices, the embodiments of the present disclosure may also be a computer program product, which includes a computer program or computer program instructions that, when executed by a processor, cause a computing device to implement any video anti-shake processing method provided by the embodiments of the present disclosure.

The program code of the computer program product for performing operations of the embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user electronic device, partly on the user electronic device, as a stand-alone software package, partly on the user electronic device and partly on a remote electronic device, or entirely on the remote electronic device.

In addition, the embodiments of the present disclosure may further provide a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, they cause a computing device to implement any video anti-shake processing method provided by the embodiments of the present disclosure.

Wherein, the video anti-shake processing method provided by the embodiment of the present disclosure may include: determining, by tracking feature points between different image frames in the video and based on an initial transformation method, the initial change amount of the shooting position between different image frames in the video; determining, based on the fitting error corresponding to the initial change amount, the target change amount of the shooting position between different image frames in the video by adopting a target transformation method that matches the fitting error; forming a moving track of the shooting position of the video based on the target change amount of the shooting position between different image frames in the video, wherein the moving track is used to indicate the shooting position of different image frames in the video; smoothing the shooting positions of different image frames in the moving track respectively to obtain a smooth track; and deforming the video based on the difference between the smooth track and the moving track to obtain an anti-shake processed video. It should be understood that, when the computer program instructions are executed by the processor, they can also cause the computing device to implement other optional implementations provided by the method embodiments of the present disclosure.

A computer-readable storage medium can employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.

It should be noted that, in this document, relational terms such as "first" and "second" are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or sequence between these entities or operations. Moreover, the terms "comprising", "including" or any other variation thereof are intended to encompass a non-exclusive inclusion, such that a process, method, article or device that includes a list of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The above are only specific embodiments of the present disclosure, so that those skilled in the art can understand or implement the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure is not to be limited to the embodiments herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

A video anti-shake processing method, comprising:

By tracking feature points between different image frames in the video, the initial change amount of the shooting position between different image frames in the video is determined based on the initial transformation method;

Based on the fitting error corresponding to the initial variation, the target variation of the shooting position between different image frames in the video is determined by adopting a target transformation method that matches the fitting error;

forming a moving track of the shooting position of the video based on the target change amount of the shooting position between different image frames in the video, wherein the moving track is used to indicate the shooting position of different image frames in the video;

respectively smoothing the shooting positions of different image frames in the moving track to obtain a smooth track;

Based on the difference between the smooth trajectory and the moving trajectory, the video is deformed to obtain an anti-shake processed video.
The method of claim 1, further comprising:

Using the initial transformation method to perform coordinate transformation on the feature points on the previous frame image in the different image frames, to obtain the transformed coordinates of the feature points on the previous frame image;

Based on the image coordinates of the feature points on the subsequent frame image in the different image frames and the transformed coordinates of the feature points on the previous frame image, a fitting error corresponding to the initial change amount of the shooting position between the different image frames is calculated.
The method according to claim 2, wherein calculating the fitting error corresponding to the initial change amount of the shooting position between the different image frames, based on the image coordinates of the feature points on the subsequent frame image in the different image frames and the transformed coordinates of the feature points on the previous frame image, comprises:

Using the image coordinates of the feature points on the subsequent frame image in the different image frames and the transformed coordinates of the feature points on the previous frame image, calculate the cumulative error corresponding to the initial change amount of the shooting position between the different image frames;

Based on the accumulated error and the number of feature points on the previous frame image, a fitting error corresponding to the initial variation of the shooting position between the different image frames is calculated.
The method according to claim 1, characterized in that determining, based on the fitting error corresponding to the initial change amount, the target change amount of the shooting position between different image frames in the video by adopting a target transformation method that matches the fitting error comprises:

If the fitting error corresponding to the initial change amount is smaller than a first error threshold, determining the initial transformation method as the target transformation method matching the fitting error, and using the target transformation method to determine the target change amount of the shooting position between different image frames in the video; or

If the fitting error corresponding to the initial change amount is greater than or equal to the first error threshold and less than a second error threshold, determining a first transformation method with a degree of freedom smaller than that of the initial transformation method as the target transformation method matching the fitting error, and using the target transformation method to determine the target change amount of the shooting position between different image frames in the video; or

If the fitting error corresponding to the initial change amount is greater than or equal to the second error threshold, determining a second transformation method with a degree of freedom smaller than that of the first transformation method as the target transformation method matching the fitting error, and using the target transformation method to determine the target change amount of the shooting position between different image frames in the video.
The method according to claim 4, wherein the initial transformation mode includes a homography transformation mode, the first transformation mode includes an affine transformation mode, and the second transformation mode includes a similarity transformation mode.
The method according to any one of claims 1 to 5, wherein the initial change amount or the target change amount is represented by a transformation matrix;

Based on the target variation of the shooting position between different image frames in the video, the movement trajectory of the shooting position of the video is formed, including:

Based on the target transformation matrix of the shooting positions between different image frames in the video, determine the transformation matrix of each frame of image in the video relative to the first frame of image;

Based on the transformation matrix of each frame of image in the video relative to the first frame of image, a movement track of the shooting position of the video is formed.
The method according to claim 1, wherein the video is deformed based on the difference between the smooth trajectory and the moving trajectory to obtain an anti-shake processed video, comprising:

determining an adjustment parameter based on the difference between the smooth trajectory and the moving trajectory;

The video is deformed by using the adjustment parameter to obtain an anti-shake processed video.
A video anti-shake processing device, comprising:

an initial change amount determination module, configured to determine the initial change amount of the shooting position between different image frames in the video based on the initial transformation method by tracking feature points between different image frames in the video;

A target variation determination module, configured to determine, based on the fitting error corresponding to the initial variation, the target variation of the shooting position between different image frames in the video by adopting a target transformation method that matches the fitting error;

a movement trajectory generation module, configured to form a movement trajectory of the shooting position of the video based on the target change amount of the shooting position between different image frames in the video, wherein the movement trajectory is used to indicate the shooting position of different image frames in the video;

a smooth trajectory determination module, configured to perform smooth processing on the shooting positions of different image frames in the moving trajectory respectively to obtain a smooth trajectory;

A video anti-shake processing module, configured to deform the video based on the difference between the smooth track and the moving track, so as to obtain a video that has undergone anti-shake processing.
An electronic device, characterized by comprising a memory and a processor, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device is made to implement the video anti-shake processing method according to any one of claims 1-7.
A computer-readable storage medium, characterized in that a computer program is stored in the storage medium, and when the computer program is executed by a computing device, the computing device is made to implement the video anti-shake processing method according to any one of claims 1-7.
A computer program product comprising instructions, characterized in that, when the computer program product is run on an electronic device, the electronic device is caused to execute the video anti-shake processing method according to any one of claims 1-7.