CN112561839A

CN112561839A - Video clipping method and device, storage medium and electronic equipment

Info

Publication number: CN112561839A
Application number: CN202011401472.XA
Authority: CN
Inventors: 吴昊; 王长虎
Original assignee: Beijing Youzhuju Network Technology Co Ltd
Current assignee: Beijing Youzhuju Network Technology Co Ltd
Priority date: 2020-12-02
Filing date: 2020-12-02
Publication date: 2021-03-26
Anticipated expiration: 2040-12-02
Also published as: WO2022116772A1; CN112561839B

Abstract

The disclosure relates to a video cropping method and device, a storage medium, and an electronic device, which aim to reduce frequent picture shaking during playback of a cropped video and thereby improve its playback effect. The video cropping method comprises the following steps: acquiring an original video to be cropped and size information of a target crop box; performing split-mirror detection (shot detection) on the original video to determine the split-mirror segments in the original video; for each split-mirror segment, determining a cropping path corresponding to the split-mirror segment according to the main content of target video frames in the split-mirror segment, wherein the target video frames are some or all of the video frames in the split-mirror segment, and the cropping path represents the path along which the position of the target crop box moves, in the width direction or the length direction of the video frames, across all video frames included in the split-mirror segment; and cropping the original video according to the size information of the target crop box and the cropping path corresponding to each split-mirror segment to obtain the cropped target video.

Description

Video clipping method and device, storage medium and electronic equipment

Technical Field

The present disclosure relates to the field of video processing technologies, and in particular, to a video clipping method and apparatus, a storage medium, and an electronic device.

Background

Video cropping is needed when the playback size of a video differs from the size of the original video. Video cropping algorithms in the related art generally crop each video frame with a crop box of the target playback size. Specifically, a loss function is applied to the text information included in each video frame: the result of the loss function is smallest when the text lies entirely inside or entirely outside the crop box, and largest when half of the text lies inside the crop box and half outside, so as to improve the cropping effect.

However, when a portrait video is cropped into a landscape video, satisfying the loss function for text information such as subtitles and logos places the crop box at different positions in different video frames, which causes frequent picture shaking during playback of the cropped video and degrades its playback effect.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

In a first aspect, the present disclosure provides a method for video cropping, the method comprising:

acquiring an original video to be cropped and size information of a target crop box;

performing split-mirror detection on the original video to determine the split-mirror segments in the original video;

for each split-mirror segment, determining a cropping path corresponding to the split-mirror segment according to the main content of target video frames in the split-mirror segment, where the target video frames are some or all of the video frames in the split-mirror segment, and the cropping path represents the path along which the position of the target crop box moves, in the width direction or the length direction of the video frames, across all video frames included in the split-mirror segment;

and cropping the original video according to the size information of the target crop box and the cropping path corresponding to each split-mirror segment to obtain a cropped target video.

In a second aspect, the present disclosure provides a video cropping device, the device comprising:

an acquisition module, configured to acquire an original video to be cropped and size information of a target crop box;

a first determining module, configured to perform split-mirror detection on the original video so as to determine the split-mirror segments in the original video;

a second determining module, configured to determine, for each split-mirror segment, a cropping path corresponding to the split-mirror segment according to the main content of target video frames in the split-mirror segment, where the target video frames are some or all of the video frames in the split-mirror segment, and the cropping path represents the path along which the position of the target crop box moves, in the width direction or the length direction of the video frames, across all video frames included in the split-mirror segment;

and a cropping module, configured to crop the original video according to the size information of the target crop box and the cropping path corresponding to each split-mirror segment to obtain the cropped target video.

In a third aspect, the present disclosure provides a computer readable medium having stored thereon a computer program which, when executed by a processing apparatus, performs the steps of the method of the first aspect.

In a fourth aspect, the present disclosure provides an electronic device comprising:

a storage device having a computer program stored thereon;

a processing device for executing the computer program in the storage device to carry out the steps of the method of the first aspect.

With the above technical solution, the cropping path corresponding to each split-mirror segment can be determined according to the main content of the target video frames in that split-mirror segment, and the video is then cropped according to the cropping path corresponding to each split-mirror segment. Because the video frames included in one split-mirror segment correspond to the same or similar scenes, their main content differs little. Determining the cropping path from the main content of the target video frames in the split-mirror segment therefore keeps the position deviation of the crop box small along the cropping path of that split-mirror segment, which reduces frequent picture shaking during playback of the cropped video and improves the video playback effect.

Additional features and advantages of the disclosure will be set forth in the detailed description which follows.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale. In the drawings:

FIG. 1 is a schematic diagram illustrating a cropping result of a video cropping method in the related art;

FIG. 2 is a flow diagram illustrating a method of video cropping, according to an exemplary embodiment of the present disclosure;

FIG. 3 is a diagram illustrating interpolation calculations in a method of video cropping, according to an exemplary embodiment of the present disclosure;

FIG. 4 is a diagram illustrating a smoothing filtering process in a video cropping method according to an exemplary embodiment of the present disclosure;

FIG. 5 is a block diagram illustrating a video cropping device according to an exemplary embodiment of the present disclosure;

FIG. 6 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.

It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence of the functions performed by the devices, modules or units. It should further be noted that the modifiers "a", "an", and "the" in the present disclosure are illustrative rather than limiting, and those skilled in the art will understand that they mean "one or more" unless the context clearly dictates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

As mentioned in the background, in the related art video cropping algorithm, a loss function is applied to the text information included in each video frame, and the result of the loss function is the smallest when the text is completely inside or outside the cropping frame, and the result of the loss function is the largest when the text is half inside and half outside the cropping frame, so as to improve the cropping effect.

However, when a portrait video is cropped into a landscape video, satisfying the loss function for text information such as subtitles and logos places the crop box at different positions in different video frames, which causes frequent picture shaking during playback of the cropped video and degrades its playback effect. For example, referring to FIG. 1, the first video frame, the second video frame, and the third video frame are three temporally consecutive frames of the same video. The crop box of the first video frame is crop box A. In the second video frame, crop box B is shifted down as a whole relative to crop box A in order to frame the subtitles. In the third video frame, crop box C is shifted up as a whole relative to crop box B in order to frame all the faces. As a result, the picture frequently shakes up and down during playback of the cropped video, and the playback effect is poor.

In view of this, the present disclosure provides a video clipping method, a video clipping device, a storage medium, and an electronic device, so as to reduce the problem of frequent shaking of a picture in the playing process of a clipped video, thereby improving the playing effect of the clipped video. Fig. 2 is a flow chart illustrating a video cropping method according to an exemplary embodiment of the present disclosure. Referring to fig. 2, the video cropping method may include:

Step 201, obtaining the original video to be cropped and the size information of the target crop box.

For example, a user may input a URL (Uniform Resource Locator) corresponding to an original video in the electronic device, and then the electronic device may download the original video from a corresponding Resource server according to the URL for video cropping. Or, the electronic device may, in response to a video cropping request triggered by a user, retrieve a stored video from a memory as an original video for video cropping, and so on, and the embodiment of the present disclosure does not limit a manner of retrieving the original video.

For example, the size information of the target crop box may define the width and length of the cropped video and may be determined according to the playback size of the video playing device. For example, if the size of the original video is 720 × 1280 pixels and the playback aspect ratio of the video playing device is 1:1, the size of the target crop box may be determined to be 720 × 720 pixels. Alternatively, the size information of the target crop box may be customized according to actual business requirements, which is not limited in this disclosure.
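The 720 × 1280 example above can be reproduced with a small calculation. The following is a minimal sketch, assuming the target crop box is the largest box of the requested aspect ratio that fits inside the original frame; the function name `target_crop_size` and this "largest fit" rule are illustrative assumptions, not part of the disclosure.

```python
from fractions import Fraction


def target_crop_size(src_w: int, src_h: int, ratio_w: int, ratio_h: int) -> tuple[int, int]:
    """Largest crop box with aspect ratio ratio_w:ratio_h inside a src_w x src_h frame.

    Hypothetical helper: the disclosure only says the size may follow the
    playback size or be customized.
    """
    target = Fraction(ratio_w, ratio_h)
    if Fraction(src_w, src_h) > target:
        # The source is relatively wider than the target: keep the full height.
        return int(src_h * target), src_h
    # The source is relatively taller (or equal): keep the full width.
    return src_w, int(src_w / target)


print(target_crop_size(720, 1280, 1, 1))  # (720, 720)
```

Exact fractions are used for the ratio comparison so that floating-point rounding cannot flip the branch for common ratios such as 16:9.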

Step 202, performing a split-mirror detection on the original video to determine a split-mirror segment in the original video.

For example, split-mirror detection (that is, shot detection) may determine the different shot segments in the original video. A single split-mirror segment comprises a plurality of video frames whose shot scenes are the same or similar, so if the video frames in the same split-mirror segment are cropped with crop boxes whose positions deviate greatly from one another, the cropped picture shakes frequently and the playback effect of the cropped video is degraded. Video frames in different split-mirror segments correspond to different shot scenes, so a change in crop box position between segments does not greatly affect the playback effect. Therefore, in the embodiment of the present disclosure, in order to reduce frequent picture shaking during playback of the cropped video and improve its playback effect, split-mirror detection may be performed on the original video to determine its split-mirror segments, so that in subsequent processing a cropping path can be planned individually for each split-mirror segment and the video frames within each split-mirror segment correspond to crop boxes whose positions change only slightly.

Step 203, for each split mirror segment, determining a clipping path corresponding to the split mirror segment according to the main content of the target video frame in the split mirror segment. The target video frame is a part of video frames or all video frames in the split-mirror segment, and the cropping path is used for representing the position moving path of the target cropping frame in the width direction or the length direction of the video frames in all the video frames included in the split-mirror segment.

For example, the target video frame may be obtained by performing frame extraction processing on the split mirror segment, and the target video frame may include all video frames in the split mirror segment, or may include part of video frames in the split mirror segment, which is not limited in this disclosure. Or, the original video may be subjected to frame extraction processing, and then the video frame obtained by the frame extraction processing may be marked. After that, performing the split mirror detection to obtain a split mirror segment, and then taking the video frame with the mark in the split mirror segment as the target video frame, and so on.
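The second option above (extract and mark frames first, then pick the marked frames inside each split-mirror segment) can be sketched as follows. This is an illustrative sketch only: the fixed sampling stride, the `(start, end)` segment representation with an exclusive end, and the function names are assumptions.

```python
def mark_frames(total_frames: int, stride: int) -> set[int]:
    """Frame extraction: mark every `stride`-th frame index (assumed sampling rule)."""
    return set(range(0, total_frames, stride))


def target_frames_per_segment(segments, marked):
    """For each (start, end) segment, end exclusive, list the marked frame indices."""
    return [[i for i in range(start, end) if i in marked] for start, end in segments]


marked = mark_frames(10, 3)  # marks frames 0, 3, 6, 9
print(target_frames_per_segment([(0, 5), (5, 10)], marked))  # [[0, 3], [6, 9]]
```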

Illustratively, the shot scenes corresponding to the video frames included in each of the split-mirror segments are the same or similar, so that the difference of the main content included in the video frames is not large, the clipping path is determined according to the main content of the target video frame in the split-mirror segment, the position movement deviation of the target clipping frame in the clipping path corresponding to the same split-mirror segment can be small, the problem of frequent shaking of the picture in the playing process of the clipped video is reduced, and the video playing effect is improved.

Step 204, cropping the original video according to the size information of the target crop box and the cropping path corresponding to each split-mirror segment to obtain the cropped target video.

Illustratively, the size information of the target crop box defines the size of the crop box, and the cropping path corresponding to each split-mirror segment defines the position of the crop box, so each frame of the original video can be cropped according to the size information of the target crop box and the cropping path corresponding to each split-mirror segment to obtain the cropped target video.

In a possible manner, each video frame in a split-mirror segment is cropped according to the size information of the target crop box and the cropping path corresponding to that split-mirror segment, and the cropped video frames are then spliced in time order to obtain the cropped target video. That is, each video frame in each split-mirror segment is cropped, and the cropped video frames of the different split-mirror segments are then spliced sequentially in time order to obtain the cropped target video.

For example, the first split-mirror segment includes video frame 1, video frame 2, and video frame 3, the second split-mirror segment includes video frame 4 and video frame 5, and the times corresponding to video frames 1 to 5 increase in that order, that is, video frames 1, 2, 3, 4, and 5 are played in sequence during video playback. In this case, video frames 1, 2, and 3 of the first split-mirror segment may be cropped while video frames 4 and 5 of the second split-mirror segment are cropped at the same time, and the cropped video frames of the first split-mirror segment and of the second split-mirror segment may then be spliced in time order to obtain the cropped target video.
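The crop-then-splice step can be sketched in a few lines. This sketch assumes each frame is a nested list of pixel rows, each segment is a `(start, end)` index range with an exclusive end, and each cropping path is a list of Y offsets, one per frame in the segment; `crop_frame` and `crop_video` are hypothetical names.

```python
def crop_frame(frame, x, y, w, h):
    """Cut a w x h box whose top-left corner is (x, y) out of a frame given as rows."""
    return [row[x:x + w] for row in frame[y:y + h]]


def crop_video(frames, segments, paths, box_w, box_h):
    """Crop every frame of every segment, then splice the results in time order.

    segments: list of (start, end) frame index ranges, end exclusive.
    paths: for each segment, the Y offset of the crop box for each of its frames.
    Assumption: cropping happens along the length (Y) direction only, so x = 0.
    """
    out = []
    # Sorting by segment start restores the time order even if segments were
    # processed independently (for example in parallel).
    for (start, end), path in sorted(zip(segments, paths)):
        for idx, y in zip(range(start, end), path):
            out.append(crop_frame(frames[idx], 0, y, box_w, box_h))
    return out
```

Because each split-mirror segment carries its own path, the segments can be cropped independently and only the final splice depends on their order.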

With this method, for each split-mirror segment, the cropping path corresponding to the split-mirror segment can be determined according to the main content of the target video frames in that split-mirror segment, and the video is then cropped according to the cropping path corresponding to each split-mirror segment. Because the video frames included in one split-mirror segment correspond to the same or similar scenes, their main content differs little. Determining the cropping path from the main content of the target video frames in the split-mirror segment therefore keeps the position deviation of the crop box small along the cropping path of that split-mirror segment, which reduces frequent picture shaking during playback of the cropped video and improves the video playback effect. For example, the original video may be a portrait video and the size information of the target crop box may correspond to a landscape video, so the video cropping method provided by the present disclosure can address picture shaking of the cropped video when a portrait video is cropped into a landscape video, improving the video playback effect.

In order to make the video cropping method provided by the present disclosure more understandable to those skilled in the art, the above steps are exemplified in detail below.

In a possible approach, performing a split-mirror detection on the original video to determine a split-mirror segment in the original video may be: performing split-mirror detection on the original video through a frame difference method to determine a split-mirror segment in the original video; or inputting the original video into a pre-trained split-mirror detection model, and determining a split-mirror segment in the original video according to an output result of the split-mirror detection model, wherein the split-mirror detection model is obtained by training a sample video and a sample split-mirror segment corresponding to the sample video.

For example, the frame difference method is one of the moving object detection and segmentation methods. Its basic principle is to compute the pixel-wise temporal difference between two or three adjacent frames of an image sequence and then apply thresholding to extract the moving regions in the image. The specific calculation is similar to that in the related art and is not described here again. In the embodiment of the disclosure, the moving regions corresponding to different video frames in the original video can be determined by the frame difference method, video frames with the same or similar moving regions are determined to belong to the same split-mirror segment, and at least one split-mirror segment corresponding to the original video is thereby obtained. Alternatively, a split-mirror detection model can be trained on sample videos and the sample split-mirror segments corresponding to those sample videos, and at least one split-mirror segment corresponding to the original video is then determined by the trained split-mirror detection model.
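A frame-difference split can be illustrated with a deliberately simplified variant. Instead of comparing moving regions, the sketch below thresholds the mean absolute pixel difference between adjacent grayscale frames and starts a new split-mirror segment whenever the difference is large. The threshold value, the nested-list frame representation, and the function names are all assumptions for illustration.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def split_segments(frames, threshold=30.0):
    """Return (start, end) index ranges, end exclusive, one per detected segment.

    A new segment starts wherever the difference between adjacent frames
    exceeds `threshold` (an assumed tuning parameter).
    """
    if not frames:
        return []
    segments, start = [], 0
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(frames)))
    return segments
```

In practice the threshold would need tuning per content type, and fast motion inside one shot can trigger false cuts, which is one reason a trained detection model is offered as the alternative.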

After at least one split-mirror segment corresponding to the original video is obtained, a cropping path can be determined for each split-mirror segment according to the main content of the target video frames in that split-mirror segment. In a possible manner, if the target video frames are all video frames in the split-mirror segment, a target crop box that contains the main content of the target video frame may be determined for each target video frame, and the position coordinates of the target crop box in the width direction or the length direction of the target video frame may then be determined, so as to obtain the cropping path. However, this approach requires determining the target crop box one by one for every video frame in the split-mirror segment, and the resulting amount of calculation is large, which reduces the efficiency of video cropping.

In order to solve the problem and improve the video cropping efficiency, in a possible manner, if the target video frame is a partial video frame in the split-mirror segment, a target cropping frame capable of including the main content in the target video frame may be determined for each target video frame, and a position coordinate of the target cropping frame in the width direction or the length direction of the target video frame may be determined. And then, carrying out interpolation calculation according to the position coordinates of each target video frame to obtain the position coordinates of the target cutting frame corresponding to other video frames except the target video frame in the split-mirror segment. And finally, determining a cutting path corresponding to the split-mirror segment according to the position coordinate corresponding to each video frame in the split-mirror segment.

For example, the main content may be the main picture content occupying most of the image area. In a video in which a character tells a story, for instance, the character is the main content of the target video frame. For each target video frame, at least one of the following detection modes may be performed to determine the main content: saliency detection, face detection, text detection, and logo detection. Saliency detection is used to detect the position of the main component of the target video frame. Face detection is used to detect the positions of faces in the target video frame. Text detection is used to detect the position and content of text in the target video frame. Logo detection is used to detect the positions of logos, watermarks, and similar content in the target video frame. In addition, before the main content is detected, border detection may be performed on the target video frame and useless borders that are detected, such as black bars or Gaussian-blurred borders, may be removed, so as to improve the accuracy of the subsequent main content detection.
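Once the detectors have produced regions, a crop box position has to be chosen from them. The disclosure does not prescribe how, so the following is only one plausible sketch: take the union of the detected extents along the cropping axis, center the crop box on it, and clamp the box to the frame. The function name `crop_offset`, the `(lo, hi)` extent format, and the centered fallback are assumptions.

```python
def crop_offset(detections, frame_len, box_len):
    """Offset of the crop box along the cropping axis.

    detections: list of (lo, hi) extents of detected content along that axis,
    for example from face or text detection (hypothetical input format).
    """
    if not detections:
        # Assumption: with nothing detected, fall back to a centered box.
        return (frame_len - box_len) // 2
    lo = min(d[0] for d in detections)
    hi = max(d[1] for d in detections)
    offset = round((lo + hi) / 2 - box_len / 2)
    # Clamp so the box stays entirely inside the frame.
    return max(0, min(offset, frame_len - box_len))
```

If the union of the detections is longer than the crop box, not all content can be included, and a real implementation would need a priority rule among the detection types.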

In a possible manner, determining the position coordinates of the target crop box in the width direction or the length direction of the target video frame may be as follows. If it is determined, according to the size information of the original video and the size information of the target crop box, that cropping is to be performed along the width direction, the position coordinate of the target crop box in the width direction of the target video frame is determined. If it is determined, according to the size information of the original video and the size information of the target crop box, that cropping is to be performed along the length direction, the position coordinate of the target crop box in the length direction of the target video frame is determined.

It should be understood that, in general, the length and the width of the corresponding frame picture in the original video may be respectively cropped according to the size information of the target crop frame, and therefore, the position coordinates of the target crop frame in the width direction and the length direction of the target video frame may be determined, that is, if the width direction of the target video frame is taken as an X axis and the length direction of the target video frame is taken as a Y axis, the position coordinates include an X coordinate value and a Y coordinate value.

In the embodiment of the present disclosure, in order to improve video cropping efficiency, cropping may be performed along only the length or only the width of the corresponding frame picture, according to the size information of the target crop box and the size information of the original video. For example, if the aspect ratio of the target crop box is 1:1 and the size of the original video is 720 × 1280 pixels, the corresponding frame picture can be cropped along its length (the Y-axis direction), and the size of the cropped video is 720 × 720 pixels. In this case, the position coordinate of the target crop box in the length direction of the target video frame may be determined, that is, if the width direction of the target video frame is taken as the X axis and the length direction as the Y axis, the position coordinates include the Y coordinate value. In other cases, if it is determined according to the size information of the target crop box and the size information of the original video that cropping is to be performed in the width direction, the position coordinate of the target crop box in the width direction of the target video frame may be determined, that is, the position coordinates include the X coordinate value.
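The direction decision described above reduces to comparing the crop box with the frame on each axis. A minimal sketch, assuming sizes are given as pixel widths and lengths; the function name `crop_axis` and its return values are illustrative only.

```python
def crop_axis(src_w, src_h, box_w, box_h):
    """Return the axis or axes along which the crop box can move.

    "x": crop along the width, "y": crop along the length,
    "xy": both, None: the box already matches the frame.
    """
    if box_w > src_w or box_h > src_h:
        raise ValueError("crop box does not fit inside the frame")
    move_x, move_y = box_w < src_w, box_h < src_h
    if move_x and move_y:
        return "xy"
    if move_x:
        return "x"
    if move_y:
        return "y"
    return None


print(crop_axis(720, 1280, 720, 720))  # y
```

When only one axis is returned, the cropping path needs only a single coordinate per frame, which is the efficiency gain the paragraph describes.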

For example, the interpolation calculation is performed according to the position coordinate of each target video frame, where the position coordinate may be any one of the position coordinate of the target cropping frame in the width direction of the target video frame, the position coordinate of the target cropping frame in the length direction of the target video frame, and the position coordinates of the target cropping frame in the width direction and the length direction of the target video frame, and may be determined according to actual business requirements in a specific application. For the sake of easy understanding, the following description will be made with the positional coordinates being the positional coordinates of the target crop box in the longitudinal direction of the target video frame (i.e., the positional coordinates include the Y-coordinate value).

For example, the position coordinates of the target video frames are $y_0, y_1, y_2, \dots, y_{n-1}$ (where $n$ is the number of target video frames), so interpolation calculation can be performed on these position coordinates to obtain the position coordinates of the target crop box corresponding to the other video frames in the original video. The interpolation calculation method may be any interpolation calculation method in the related art, and the present disclosure does not limit this.

In a possible manner, ordinary linear interpolation may make the interpolated position change abruptly, so that the target crop box moves sharply along the cropping path obtained by interpolation and the cropped video picture shakes. The embodiment of the present disclosure therefore adopts cubic spline interpolation, which makes the cropping path obtained by interpolation smoother and reduces shaking of the cropped video picture.

Specifically, the objective function may be determined according to the position coordinates of each target video frame, where the objective function includes a plurality of piecewise functions, each piecewise function is determined according to the position coordinates of each two adjacent target video frames, and each piecewise function and the objective function are cubic equations with an argument being time and a dependent variable being position coordinates, and a first derivative and a second derivative of the objective function are continuous in time. Correspondingly, the position coordinates of the target crop box corresponding to other video frames can be determined according to the target function and the time corresponding to other video frames except the target video frame in the split-mirror segment.

For example, the position coordinates of every two adjacent target video frames may be taken as a segmentation interval, each segmentation interval may correspond to a segmentation function, and the segmentation function is a cubic equation with an independent variable being time and a dependent variable being position coordinates, so that a segmentation curve corresponding to each segmentation function may be obtained. The objective function comprises a plurality of piecewise functions, i.e. the objective function is the sum of the plurality of piecewise functions. In addition, the target function is a cubic equation with the independent variable being time and the dependent variable being position coordinates, and the first derivative and the second derivative are continuous in time, so that the piecewise curves corresponding to the multiple piecewise functions can be connected into a smooth curve, and the shaking of the cut video picture is reduced.

For example, for the time variable $t$ with $t_0 \le t_1 \le t_2 \le \dots \le t_{n-1}$ and the corresponding position coordinates $y$ with $y_0 \le y_1 \le y_2 \le \dots \le y_{n-1}$, the objective function $S(t) = S_0(t) + S_1(t) + S_2(t) + \dots + S_{n-1}(t)$ satisfies: 1) on each segment interval $[t_i, t_{i+1}]$, the piecewise function $S_i(t)$ is a cubic function; 2) $S_i(t_i) = y_i$; 3) the first derivative $S'(t)$ and the second derivative $S''(t)$ of the objective function $S(t)$ are continuous on $[t_0, t_{n-1}]$, and the objective function $S(t)$ is a cubic function. The expression of $S_i(t)$ may be $S_i(t) = a_i + b_i (t - t_i) + c_i (t - t_i)^2 + d_i (t - t_i)^3$, where $i$ takes the values $0, 1, 2, \dots, n-1$.

That is to say, the embodiment of the present disclosure may calculate, by cubic spline interpolation, the position coordinates of the target crop box corresponding to the other video frames in the split-mirror segment according to the position coordinates of the target crop box in each target video frame. The second-order continuity of cubic spline interpolation makes the cropping path obtained by the interpolation calculation smoother, and therefore reduces shaking of the cropped video picture. For example, referring to FIG. 3, by performing interpolation calculation on the position coordinates Y1, Y2, Y3, and Y4 of the target video frames in the above manner, the position coordinates of the other video frames between every two adjacent target video frames can be obtained, so as to determine the cropping path corresponding to the split-mirror segment.
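The interpolation described above can be illustrated with a minimal sketch in plain Python. The function name `natural_cubic_spline`, the key-frame times, and the coordinate values are hypothetical, and the natural boundary condition (second derivative equal to zero at both ends) is an assumption, since the text does not state which boundary conditions are used. With n key frames the sketch builds n-1 cubic pieces of the form $S_i(t) = a_i + b_i(t - t_i) + c_i(t - t_i)^2 + d_i(t - t_i)^3$.

```python
def natural_cubic_spline(ts, ys):
    """Return a function S(t) interpolating the points (ts[i], ys[i]).

    Assumption: natural boundary conditions (S'' = 0 at both ends);
    the description does not specify the boundary conditions.
    """
    n = len(ts)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two key frames with matching lengths")
    h = [ts[i + 1] - ts[i] for i in range(n - 1)]
    if any(step <= 0 for step in h):
        raise ValueError("times must be strictly increasing")

    # Tridiagonal system for c_i (half the second derivative at each knot).
    lower, diag, upper, rhs = [0.0] * n, [1.0] * n, [0.0] * n, [0.0] * n
    for i in range(1, n - 1):
        lower[i] = h[i - 1]
        diag[i] = 2.0 * (h[i - 1] + h[i])
        upper[i] = h[i]
        rhs[i] = 3.0 * ((ys[i + 1] - ys[i]) / h[i]
                        - (ys[i] - ys[i - 1]) / h[i - 1])

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    c = [0.0] * n
    c[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        c[i] = (rhs[i] - upper[i] * c[i + 1]) / diag[i]

    a = list(ys)
    b = [(ys[i + 1] - ys[i]) / h[i] - h[i] * (2.0 * c[i] + c[i + 1]) / 3.0
         for i in range(n - 1)]
    d = [(c[i + 1] - c[i]) / (3.0 * h[i]) for i in range(n - 1)]

    def spline(t):
        # Find the piece whose interval contains t (end pieces extrapolate).
        i = 0
        while i < n - 2 and t > ts[i + 1]:
            i += 1
        dt = t - ts[i]
        return a[i] + b[i] * dt + c[i] * dt ** 2 + d[i] * dt ** 3

    return spline


# Hypothetical key frames: frame indices used as time, crop-box coordinates.
key_times = [0, 10, 20, 30]
key_coords = [100.0, 140.0, 120.0, 180.0]
S = natural_cubic_spline(key_times, key_coords)
# Position coordinate of the crop box for every frame in the segment.
path = [S(t) for t in range(key_times[0], key_times[-1] + 1)]
```

The resulting `path` passes through the key-frame coordinates and varies smoothly between them, which is the property the description relies on to reduce shaking.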

In a possible manner, the interpolation calculation according to the position coordinates of each target video frame, which obtains the position coordinates of the target crop box corresponding to the video frames in the split-mirror segment other than the target video frames, may also be performed as follows: smoothing filtering is first performed on the position coordinates of each target video frame to obtain smoothed position coordinates of each target video frame, and interpolation calculation is then performed according to the smoothed position coordinates to obtain the position coordinates of the target crop box corresponding to the other video frames in the split-mirror segment.

For example, the position coordinates of each target video frame may be processed by Gaussian smoothing filtering to obtain the smoothed position coordinates of each target video frame. For example, if the position coordinates of the target crop box in the target video frames are $y_0, y_1, y_2, \dots, y_{n-1}$, a Gaussian smoothing filter with a window of $2M+1$ ($M$ being a positive integer) may be used, in which the weight $G(\Delta y)$ corresponding to a position offset by $\Delta y$ from the window center follows a Gaussian distribution.

The sliding-window convolution kernel of length $2M+1$ is $[G(-M), G(-M+1), \dots, G(0), \dots, G(M-1), G(M)]$. For the meaning of the parameters of the Gaussian distribution, reference may be made to the related art, and details are not repeated here. Of course, in other possible manners, the position coordinates of each target video frame may also be processed by other smoothing filtering methods, such as mean filtering, which is not limited in the embodiments of the present disclosure.
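A minimal sketch of such a sliding-window Gaussian filter is given below. The weights assume the standard zero-mean Gaussian shape $G(k) \propto \exp(-k^2 / (2\sigma^2))$, normalized so that the kernel sums to 1. The parameter `sigma`, the normalization, and the edge handling (repeating the end values) are assumptions not fixed by the text, and the function names are hypothetical.

```python
import math


def gaussian_kernel(m, sigma):
    """Weights [G(-M), ..., G(0), ..., G(M)], normalized to sum to 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-m, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_smooth(ys, m=1, sigma=1.0):
    """Slide the 2M+1 window over ys and return the smoothed coordinates.

    Assumption: positions outside the sequence reuse the nearest end value.
    """
    kernel = gaussian_kernel(m, sigma)
    last = len(ys) - 1
    smoothed = []
    for i in range(len(ys)):
        acc = 0.0
        for k in range(-m, m + 1):
            j = min(max(i + k, 0), last)  # clamp the index at the edges
            acc += kernel[k + m] * ys[j]
        smoothed.append(acc)
    return smoothed


# Hypothetical crop-box coordinates of four target video frames.
raw = [0.0, 10.0, 0.0, 10.0]
smooth = gaussian_smooth(raw, m=1, sigma=1.0)
```

In this example the offsets between neighbouring smoothed coordinates are smaller than the offsets between the raw coordinates, which is the effect the smoothing step is intended to achieve.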

Then, interpolation calculation can be performed according to the smoothed position coordinates of each target video frame to obtain the position coordinates of the target crop box corresponding to the video frames in the split-mirror segment other than the target video frames. In this way, since the position coordinates of the target crop box in the target video frames are smoothed before the interpolation calculation, the position offset between the target crop boxes in the target video frames can be reduced. For example, referring to FIG. 4, smoothing filtering is performed on the initial position coordinates Y1, Y2, Y3, and Y4 of the target crop box in the target video frames, and smoothed position coordinates Y1', Y2', Y3', and Y4' may be obtained. As can be seen from FIG. 4, compared with the initial position coordinates, the position offset between the smoothed position coordinates is reduced, so that the position offset between the target crop boxes corresponding to the other video frames obtained by interpolation can be reduced, the smoothness of the cropping path is further improved, and the shaking of the cropped video picture is reduced.

Based on the same inventive concept, the present disclosure also provides a video cropping device, which may be implemented as part or all of an electronic device by software, hardware, or a combination of both. Referring to FIG. 5, the video cropping device 500 includes:

an obtaining module 501, configured to obtain size information of an original video to be clipped and a target clipping frame;

a first determining module 502, configured to perform split detection on the original video to determine a split segment in the original video;

a second determining module 503, configured to determine, for each of the split-mirror segments, a cropping path corresponding to the split-mirror segment according to the main content of a target video frame in the split-mirror segment, where the target video frame is a part of or all of the video frames in the split-mirror segment, and the cropping path is used to represent a position moving path of the target cropping frame, in all video frames included in the split-mirror segment, along a width direction or a length direction of the video frame;

and the cropping module 504 is configured to crop the original video according to the size information of the target cropping frame and the cropping path corresponding to each of the split-mirror segments, so as to obtain a cropped target video.

Optionally, the second determining module 503 is configured to:

for each target video frame, determining a target cutting frame capable of including the main body content in the target video frame, and determining the position coordinates of the target cutting frame in the width direction or the length direction of the target video frame;

performing interpolation calculation according to the position coordinates of each target video frame to obtain the position coordinates of the target cutting frame corresponding to other video frames except the target video frame in the split-mirror segment;

and determining the cutting path corresponding to the split mirror segment according to the position coordinate corresponding to each video frame in the split mirror segment.
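As a hedged illustration of the first step above, the following sketch places a crop box of a given size so that it is centred on the main content along the cropping direction and stays inside the frame. The representation of the main content as a one-dimensional extent (`subject_min`, `subject_max`), the use of the box's starting edge as its position coordinate, and the function name are all assumptions made for illustration only.

```python
def crop_box_position(subject_min, subject_max, crop_size, frame_size):
    """Return the starting coordinate of the crop box along one axis.

    Assumptions: the main content is given as the extent
    [subject_min, subject_max] along the cropping axis, and the position
    coordinate is the box's starting (left or top) edge.
    """
    if crop_size > frame_size:
        raise ValueError("crop box cannot be larger than the frame")
    center = (subject_min + subject_max) / 2.0
    start = center - crop_size / 2.0
    # Keep the whole crop box inside the frame.
    return min(max(start, 0.0), float(frame_size - crop_size))


# Hypothetical example: a 608-pixel-wide box in a 1920-pixel-wide frame.
x = crop_box_position(800, 1000, 608, 1920)
```

Coordinates obtained this way for the target video frames could then serve as the key points for the interpolation step.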

Optionally, the second determining module 503 is configured to:

if the clipping is determined to be carried out along the width direction according to the size information of the original video and the size information of the target clipping frame, determining the position coordinate of the target clipping frame in the width direction of the target video frame;

and if the clipping is determined to be carried out along the length direction according to the size information of the original video and the size information of the target clipping frame, determining the position coordinate of the target clipping frame in the length direction of the target video frame.
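One plausible way to make this determination is to compare the aspect ratios of the original video and the target cropping frame. The sketch below is an assumption rather than the rule stated in the disclosure: the function name, the comparison itself, and the extra `"none"` result for equal aspect ratios are hypothetical.

```python
def crop_direction(orig_w, orig_h, target_w, target_h):
    """Decide the cropping direction by comparing aspect ratios.

    Assumption: a relatively wider original is cropped along the width
    direction, a relatively taller one along the length direction.
    Cross-multiplication avoids floating-point division.
    """
    if orig_w * target_h > orig_h * target_w:
        return "width"
    if orig_w * target_h < orig_h * target_w:
        return "length"
    return "none"  # same aspect ratio, no positional cropping needed


# Hypothetical example: a 16:9 landscape video cropped to 9:16 portrait.
direction = crop_direction(1920, 1080, 9, 16)
```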

Optionally, the second determining module 503 is configured to:

determining an objective function from the position coordinates of each target video frame, wherein the objective function comprises a plurality of piecewise functions, each piecewise function is determined from the position coordinates of two adjacent target video frames, each piecewise function and the objective function are cubic equations with time as the independent variable and the position coordinate as the dependent variable, and a first derivative and a second derivative of the objective function are continuous in time;

and determining the position coordinates of the target cutting frame corresponding to other video frames according to the target function and the time corresponding to other video frames except the target video frame in the split-mirror segment.

Optionally, the second determining module 503 is configured to:

performing smooth filtering processing on the position coordinate of each target video frame to obtain a smooth position coordinate of each target video frame;

and carrying out interpolation calculation according to the smooth position coordinates of each target video frame to obtain the position coordinates of the target cutting frame corresponding to other video frames except the target video frame in the split-mirror segment.

Optionally, the first determining module 502 is configured to:

performing split-mirror detection on the original video through a frame difference method to determine a split-mirror segment in the original video; or inputting the original video into a pre-trained split-mirror detection model, and determining a split-mirror segment in the original video according to an output result of the split-mirror detection model, wherein the split-mirror detection model is obtained by training a sample video and a sample split-mirror segment corresponding to the sample video.
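The frame difference option can be sketched as follows. The sketch assumes each frame is a flat sequence of grayscale pixel values of equal length, uses the mean absolute difference between consecutive frames as the difference measure, and starts a new split-mirror segment when that difference exceeds a threshold. The function name, the difference measure, and the threshold value are assumptions, not details given in the disclosure.

```python
def detect_shots(frames, threshold=30.0):
    """Split frames into segments as (start, end) index pairs, end exclusive.

    Assumptions: each frame is a non-empty, equal-length sequence of
    grayscale values, and a mean absolute difference above `threshold`
    marks a boundary.
    """
    if not frames:
        return []
    boundaries = [0]
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        diff = sum(abs(p - q) for p, q in zip(prev, cur)) / len(cur)
        if diff > threshold:
            boundaries.append(i)  # frame i starts a new segment
    boundaries.append(len(frames))
    return [(boundaries[k], boundaries[k + 1])
            for k in range(len(boundaries) - 1)]


# Hypothetical two-pixel frames: a large change occurs at frame index 2.
frames = [[10, 10], [12, 11], [200, 210], [201, 209]]
shots = detect_shots(frames)
```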

Optionally, the cropping module 504 is configured to:

for each split-mirror segment, cutting each video frame in the split-mirror segment according to the size information of the target cutting frame and the cutting path corresponding to the split-mirror segment;

and splicing each cut video frame according to a time sequence to obtain the cut target video.
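The two steps above can be sketched as follows for cropping along the width direction. Frames are represented as lists of pixel rows, each split-mirror segment is a time-ordered list of frames, and each cropping path holds one position coordinate per frame. These data structures, the rounding of coordinates to whole pixels, and the function names are assumptions made for illustration.

```python
def crop_frame(frame, start, crop_w):
    """Crop columns [x, x + crop_w) from a frame given as a list of rows."""
    x = int(round(start))
    x = min(max(x, 0), len(frame[0]) - crop_w)  # keep the box inside the frame
    return [row[x:x + crop_w] for row in frame]


def crop_video(segments, paths, crop_w):
    """Crop every frame of every segment and splice the results in time order.

    Assumption: segments are already in time order and paths[k][i] is the
    crop-box position for frame i of segment k.
    """
    output = []
    for seg_frames, path in zip(segments, paths):
        if len(seg_frames) != len(path):
            raise ValueError("each frame needs one position coordinate")
        for frame, start in zip(seg_frames, path):
            output.append(crop_frame(frame, start, crop_w))
    return output


# Hypothetical 2x5 frame reused in two segments.
frame = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
target = crop_video([[frame, frame], [frame]], [[0, 1.2], [2]], crop_w=3)
```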

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Based on the same inventive concept, the disclosed embodiments also provide a computer readable medium, on which a computer program is stored, which when executed by a processing device, implements the steps of any of the above-mentioned video cropping methods.

Based on the same inventive concept, an embodiment of the present disclosure further provides an electronic device, including:

a storage device having a computer program stored thereon;

and the processing device is used for executing the computer program in the storage device so as to realize the steps of any video clipping method.

Referring now to FIG. 6, a block diagram of an electronic device 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.

It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, the communication may be performed using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring size information of an original video to be cut and a target cutting frame; performing split mirror detection on the original video to determine a split mirror segment in the original video; for each split-mirror segment, determining a clipping path corresponding to the split-mirror segment according to the main content of a target video frame in the split-mirror segment, where the target video frame is a part of or all video frames in the split-mirror segment, and the clipping path is used to represent a position moving path of the target clipping frame in all video frames included in the split-mirror segment along the width direction or the length direction of the video frame; and cutting the original video according to the size information of the target cutting frame and the cutting path corresponding to each split mirror segment to obtain a cut target video.

Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present disclosure may be implemented by software or hardware. Wherein the name of a module in some cases does not constitute a limitation on the module itself.

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

Example 1 provides a video cropping method, according to one or more embodiments of the present disclosure, comprising:

According to one or more embodiments of the present disclosure, example 2 provides the method of example 1, where determining, according to the main content of the target video frame in the split mirror segment, a clipping path corresponding to the split mirror segment includes:

Example 3 provides the method of example 2, the determining position coordinates of the target crop box in a width direction or a length direction of the target video frame, according to one or more embodiments of the present disclosure, including:

Example 4 provides the method of example 2, where performing interpolation calculation according to the position coordinates of each target video frame to obtain position coordinates of a target crop box corresponding to other video frames in the split-mirror segment except for the target video frame includes:

Example 5 provides the method of any one of examples 2 to 4, wherein the performing interpolation according to the position coordinates of each target video frame to obtain the position coordinates of the target crop box corresponding to the other video frames except the target video frame in the split-mirror segment includes:

Example 6 provides the method of any one of examples 1-4, wherein performing the split mirror detection on the original video to determine a split mirror segment in the original video, includes:

Example 7 provides the method of any one of examples 1 to 4, wherein the cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each of the split mirror segments to obtain the cropped target video includes:

Example 8 provides, in accordance with one or more embodiments of the present disclosure, a video cropping apparatus, the apparatus comprising:

Example 9 provides the apparatus of example 8, the second determination module to:

Example 10 provides the apparatus of example 9, the second determination module to:

Example 11 provides the apparatus of example 9, the second determination module to:

Example 12 provides the apparatus of any one of examples 9-11, the second determination module to:

Example 13 provides the apparatus of any one of examples 8-11, the first determination module to:

Example 14 provides the apparatus of any one of examples 8-11, the cropping module to:

Example 15 provides a computer readable medium having stored thereon a computer program that, when executed by a processing apparatus, performs the steps of the method of any one of examples 1 to 7, in accordance with one or more embodiments of the present disclosure.

Example 16 provides, in accordance with one or more embodiments of the present disclosure, an electronic device, comprising:

a storage device having a computer program stored thereon;

processing means for executing the computer program in the storage means to carry out the steps of the method of any one of examples 1 to 7.

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments in which any combination of the features described above or their equivalents does not depart from the spirit of the disclosure. For example, a technical solution may be formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.

Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims

1. A method of video cropping, the method comprising:

2. The method according to claim 1, wherein the determining the clipping path corresponding to the split mirror segment according to the main content of the target video frame in the split mirror segment comprises:

3. The method of claim 2, wherein the determining the position coordinates of the target crop box in the width direction or the length direction of the target video frame comprises:

4. The method according to claim 2, wherein the performing interpolation calculation according to the position coordinates of each target video frame to obtain the position coordinates of the target crop box corresponding to the other video frames except the target video frame in the split-mirror segment comprises:

5. The method according to any one of claims 2 to 4, wherein the performing interpolation calculation according to the position coordinates of each target video frame to obtain the position coordinates of the target crop box corresponding to the other video frames except the target video frame in the split-mirror segment comprises:

6. The method according to any one of claims 1-4, wherein the performing the split-mirror detection on the original video to determine the split-mirror segments in the original video comprises:

7. The method according to any one of claims 1 to 4, wherein the cropping the original video according to the size information of the target cropping frame and the cropping path corresponding to each of the split mirror segments to obtain the cropped target video comprises:

8. A video cropping device, characterized in that said device comprises:

9. A computer-readable medium, on which a computer program is stored, characterized in that the program, when being executed by processing means, carries out the steps of the method of any one of claims 1 to 7.

10. An electronic device, comprising:

a storage device having a computer program stored thereon;

processing means for executing the computer program in the storage means to carry out the steps of the method according to any one of claims 1 to 7.