CN114666656A - Video clipping method, video clipping device, electronic equipment and computer readable medium - Google Patents

Video clipping method, video clipping device, electronic equipment and computer readable medium

Info

Publication number
CN114666656A
CN114666656A (application CN202210255943.3A)
Authority
CN
China
Prior art keywords
video
image
image frame
clipped
segments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210255943.3A
Other languages
Chinese (zh)
Inventor
周芳汝
杨玫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202210255943.3A priority Critical patent/CN114666656A/en
Publication of CN114666656A publication Critical patent/CN114666656A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/44008: involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/44016: involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N 21/4402: involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440281: by altering the temporal resolution, e.g. by frame skipping

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The embodiments of the disclosure provide a video clipping method, a video clipping device, electronic equipment and a computer readable medium, wherein the method comprises the following steps: acquiring a video to be clipped having a video duration T1 and determining a target video duration T2; performing feature extraction on the image frames of the video to be clipped to obtain image features of the image frames; determining image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2 and the image features of the image frames; and deleting the image frames to be deleted from the video to be clipped so as to clip the video to be clipped. The video clipping method, the video clipping device, the electronic equipment and the computer readable medium provided by the embodiments of the disclosure can realize automatic and accurate clipping of videos and reduce the consumption of labor and time cost.

Description

Video clipping method, video clipping device, electronic equipment and computer readable medium
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to a video editing method and apparatus, an electronic device, and a computer-readable medium.
Background
With the arrival of the information age, the number of video users has grown rapidly, video on various platforms has grown explosively, and video clipping has become increasingly important. However, manually clipping a large amount of shot material requires a great deal of manpower and time.
Therefore, a new video clipping method, apparatus, electronic device, and computer readable medium are needed.
The above information disclosed in this background section is only for enhancement of understanding of the background of the disclosure.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a video clipping method, an apparatus, an electronic device, and a computer-readable medium, which can implement automatic clipping of a video and reduce labor and time cost consumption.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to a first aspect of the embodiments of the present disclosure, a video clipping method is provided, which includes: acquiring a video to be clipped having a video duration T1 and determining a target video duration T2; performing feature extraction on the image frames of the video to be clipped to obtain image features of the image frames; determining image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2 and the image features of the image frames; and deleting the image frames to be deleted from the video to be clipped so as to clip the video to be clipped.
In an exemplary embodiment of the present disclosure, performing feature extraction on the image frames of the video to be clipped to obtain the image features of the image frames includes: processing the (t-s)-th to (t+s)-th image frames of the video to be clipped through an encoder to obtain the image feature of the t-th image frame of the video to be clipped, where 0 < t < N, N is the total number of image frames of the video to be clipped, and s > 0.
In an exemplary embodiment of the present disclosure, determining the image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2 and the image features of the image frames includes: obtaining an importance score s_t of the t-th image frame according to the image features of the (t-m)-th to (t+m)-th image frames in the video to be clipped, where m > 0; determining the number n of frames to be deleted according to the difference between the video duration T1 and the target video duration T2 and the frame rate of the video to be clipped; dividing the image frames of the video to be clipped into n intervals; and determining the image frame with the smallest importance score in each interval as an image frame to be deleted.
In an exemplary embodiment of the present disclosure, obtaining the importance score s_t of the t-th image frame according to the image features of the (t-m)-th to (t+m)-th image frames in the video to be clipped includes computing:

s_t = Σ_{i=-m}^{m} w_i · Dist(F_t, F_{t+i})  (1)

where F_{t+i} is the image feature of the (t+i)-th image frame, w_i is the weight of the i-th image frame, and Dist is a distance function.
In an exemplary embodiment of the present disclosure, the weight w_i is given by formula (2). [Formula (2) appears only as an image in the original publication; per the accompanying description, it assigns larger weights w_i to image frames near the t-th image frame and smaller weights to frames farther away.]
in an exemplary embodiment of the disclosure, the video duration T of the video to be clipped is determined according to the video duration T1The target video duration T2And the image characteristics of the image frames determining the image frames to be deleted in the image frames comprises: dividing the video to be clipped into I video segments, wherein I is an integer greater than 1; determining an importance score of each image frame in the ith video segment; determining the importance score of the ith video segment according to the average value of the importance scores of all the image frames in the ith video segment; sorting the video clips in a descending order according to the importance scores of the video clips to obtain a sorting result; according to the video time length T1The target video duration T2With the duration of each video segment, willDetermining the first q video clips in the sequencing result as reserved clips; according to the video time lengths of the first q video segments in the sequencing result and the target video time length T2Determining a first image frame in the first q video segments in the sequencing result according to the image characteristics of the image frames in the first q video segments in the sequencing result; and determining a first image frame in the first q video segments in the sequencing result and image frames in the (q + 1) th to the I-th video segments in the sequencing result as the image frame to be deleted.
In an exemplary embodiment of the present disclosure, determining the first q video segments in the sorting result as retained segments according to the video duration T1, the target video duration T2 and the duration of each video segment includes selecting q such that:

Σ_{j=1}^{q} n'_j ≥ N' and Σ_{j=1}^{q-1} n'_j < N'  (5)

where N' = T2 × fps, fps is the frame rate of the video to be clipped, and n'_j is the number of image frames of the j-th video segment in the sorting result.
In an exemplary embodiment of the present disclosure, acquiring the video to be clipped includes: acquiring K video segments to be clipped, where K is an integer greater than 2, the K video segments to be clipped include L sorted segments and K-L segments to be sorted, and 1 ≤ L ≤ K-2; determining the segment feature of each video segment to be clipped according to the image features of the image frames in that video segment; determining correlation scores between the L-th sorted segment and the K-L segments to be sorted according to the distances between the segment feature of the L-th sorted segment and the segment features of the K-L segments to be sorted; determining the segment to be sorted with the highest correlation score with the L-th sorted segment among the K-L segments to be sorted as the (L+1)-th sorted segment; adding one to L and returning to the above steps until L = K-1, obtaining K sorted segments; and synthesizing the K sorted segments in their order to obtain the video to be clipped.
According to a second aspect of the embodiments of the present disclosure, a video clipping device is provided, which includes: a video acquisition module for acquiring a video to be clipped having a video duration T1 and determining a target video duration T2; a feature extraction module for performing feature extraction on the image frames of the video to be clipped to obtain image features of the image frames; an image frame positioning module for determining image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2 and the image features of the image frames; and a video clipping module for deleting the image frames to be deleted from the video to be clipped so as to clip the video to be clipped.
According to a third aspect of the embodiments of the present disclosure, an electronic device is provided, which includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any of the video clipping methods described above.
According to a fourth aspect of embodiments of the present disclosure, a computer-readable medium is proposed, on which a computer program is stored, which program, when executed by a processor, implements a video clipping method as defined in any of the above.
According to the video clipping method, the video clipping device, the electronic equipment and the computer readable medium provided by some embodiments of the present disclosure, feature extraction is performed on the image frames of a video to be clipped to obtain image features of the image frames, so that importance scores of different image frames or video segments can be evaluated based on the image features of the image frames; a certain number of image frames to be deleted are determined from among the image frames based on the image features of the image frames, the video duration T1 of the video to be clipped and the target video duration T2; and the image frames to be deleted are deleted from the video to be clipped so as to clip the video to be clipped. Automatic and accurate clipping of videos can thus be realized while reducing the consumption of labor and time cost.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. The drawings described below are merely some embodiments of the present disclosure, and other drawings may be derived from those drawings by those of ordinary skill in the art without inventive effort.
FIG. 1 is a system block diagram illustrating a video clipping method and apparatus according to an example embodiment.
FIG. 2 is a flow diagram illustrating a method of video clipping in accordance with an exemplary embodiment.
FIG. 3 is a flowchart illustrating a method of video clipping in accordance with another exemplary embodiment.
Fig. 4 is a flowchart illustrating a video clipping method according to yet another exemplary embodiment.
FIG. 5 is a flowchart illustrating a video clipping method according to yet another exemplary embodiment.
Fig. 6(a) is a schematic diagram illustrating importance scores of image frames according to an exemplary embodiment.
FIG. 6(b) is a schematic diagram illustrating importance scores for video segments according to an example embodiment.
Fig. 7 is a flowchart illustrating training of an unsupervised network in accordance with an exemplary embodiment.
FIG. 8 is a flowchart illustrating a video clipping method according to yet another exemplary embodiment.
FIG. 9 is a block diagram illustrating a video clipping device according to an example embodiment.
Fig. 10 schematically illustrates a block diagram of an electronic device in an exemplary embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.
The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations or operations have not been shown or described in detail to avoid obscuring aspects of the invention.
The drawings are merely schematic illustrations of the present invention, in which the same reference numerals denote the same or similar parts, and thus, a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and steps, nor do they necessarily have to be performed in the order described. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
The following detailed description of exemplary embodiments of the invention refers to the accompanying drawings.
FIG. 1 is a system block diagram illustrating a video clipping method and apparatus according to an example embodiment.
In the system 100 of video clipping methods and apparatus, the server 105 may be a server providing various services, such as a background management server (for example only) providing support over the network 104 for a video clipping system operated by users with terminal devices 101, 102, 103. The backend management server may analyze and otherwise process data such as the received video clip request, and feed back a processing result (e.g., a clipped video — just an example) to the terminal device.
The server 105 may be a single physical server, or may be composed of a plurality of servers. For example, one part of the server 105 may serve as a video clipping task submitting system in the present disclosure, e.g., for obtaining a task to execute a video clipping command; and another part of the server 105 may serve as a video clipping system in the present disclosure, for: acquiring a video to be clipped having a video duration T1 and determining a target video duration T2; performing feature extraction on the image frames of the video to be clipped to obtain image features of the image frames; determining image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2 and the image features of the image frames; and deleting the image frames to be deleted from the video to be clipped so as to clip the video to be clipped.
FIG. 2 is a flowchart illustrating a method of video clipping in accordance with an exemplary embodiment. The video clipping method provided by the embodiments of the present disclosure may be executed by any electronic device with computing processing capability, such as the terminal devices 101, 102, and 103 and/or the server 105, and in the following embodiments, the server executes the method as an example for illustration, but the present disclosure is not limited thereto. The video clipping method provided by the embodiment of the present disclosure may include steps S202 to S208.
As shown in fig. 2, in step S202, a video to be clipped having a video duration T1 is acquired, and a target video duration T2 is determined.
In the embodiment of the present disclosure, for example, a video clip request may be received, where the video clip request includes the video to be clipped and the target video duration T2. The target video duration T2 is the desired duration of the video after the video to be clipped is clipped, where T1 > T2. In particular, the video clip request may include video segments to be clipped; after the video clip request is obtained, the video to be clipped may be synthesized from the video segments to be clipped, for which reference may be made to the embodiment shown in fig. 5.
In step S204, feature extraction is performed on the image frames of the video to be clipped, so as to obtain image features of the image frames.
In the embodiment of the disclosure, the image feature of each frame of image (i.e., each image frame) in the video to be clipped can be obtained using self-supervised learning.
In an exemplary embodiment, the (t-s)-th to (t+s)-th image frames of the video to be clipped can be processed through an encoder to obtain the image feature of the t-th image frame of the video to be clipped, where 0 < t < N, N is the total number of image frames of the video to be clipped, and s > 0. In this embodiment, for the t-th image frame of the video to be clipped, the encoder combines the image information of the preceding s frames and the following s frames, making full use of the information of the frames around the t-th image frame to obtain an image feature that can sufficiently represent the t-th image frame.
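As a concrete illustration, the following is a minimal sketch of such a windowed encoder, assuming a small per-frame CNN followed by a GRU; the layer sizes, the CNN/GRU choices and the class name WindowEncoder are illustrative assumptions, not the patent's actual network.

```python
# A minimal sketch (not the patent's actual network) of an encoder that maps a
# (2s+1)-frame window onto a feature for the centre (t-th) frame.
import torch
import torch.nn as nn

class WindowEncoder(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Per-frame convolutional embedding: 3xHxW -> 64-d vector.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 64),
        )
        # RNN module so the centre-frame feature carries neighbouring-frame context.
        self.rnn = nn.GRU(input_size=64, hidden_size=feat_dim, batch_first=True)

    def forward(self, window: torch.Tensor) -> torch.Tensor:
        # window: (batch, 2s+1, 3, H, W) -> centre-frame feature: (batch, feat_dim)
        b, w, c, h, ww = window.shape
        per_frame = self.cnn(window.reshape(b * w, c, h, ww)).reshape(b, w, -1)
        outputs, _ = self.rnn(per_frame)
        return outputs[:, w // 2, :]  # feature aligned with the t-th (centre) frame

encoder = WindowEncoder()
frames = torch.randn(1, 5, 3, 64, 64)  # s = 2 -> a 5-frame window
feature = encoder(frames)              # F_t in R^128
```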
In step S206, image frames to be deleted are determined from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2 and the image features of the image frames.
In the embodiment of the disclosure, the importance score of each video segment or each image frame in the video to be clipped can be calculated from the image features, and the image frames to be deleted are then determined based on those importance scores. The video segments can be obtained, for example, by clustering the image frames of the video to be clipped, or by splitting the video at segmentation points found with difference- or gradient-based methods.
In step S208, deleting image frames to be deleted in the video to be clipped to clip the video to be clipped.
After the video to be clipped is clipped, a target clipped video having the target video duration T2 is obtained.
According to the video clipping method provided by the embodiment of the disclosure, feature extraction is performed on the image frames of the video to be clipped to obtain image features of the image frames, so that importance scores of different image frames or video segments can be evaluated based on the image features of the image frames; a certain number of image frames to be deleted are determined from among the image frames based on the image features of the image frames, the video duration T1 of the video to be clipped and the target video duration T2; and the image frames to be deleted are deleted from the video to be clipped so as to clip the video to be clipped. Automatic and accurate clipping of videos can thus be realized while reducing the consumption of labor and time cost.
FIG. 3 is a flow chart illustrating a method of video clipping in accordance with another exemplary embodiment. Step S206 of the embodiment of fig. 2 may include steps S302 to S308.
As shown in fig. 3, in step S302, the importance score s_t of the t-th image frame is obtained according to the image features of the (t-m)-th to (t+m)-th image frames in the video to be clipped, where m > 0.

In the embodiment of the present disclosure, the importance score s_t of the t-th image frame may be expressed by the following formula (1):

s_t = Σ_{i=-m}^{m} w_i · Dist(F_t, F_{t+i})  (1)

where F_{t+i} is the image feature of the (t+i)-th image frame and Dist represents a distance function, such as the Euclidean distance. w_i denotes the weight: the weight is somewhat larger when computing the feature distance to frames near the t-th image frame, and correspondingly smaller for frames farther from the t-th frame. For example, w_i can be determined by formula (2). [Formula (2) appears only as an image in the original publication; per this description, it assigns larger weights to nearby frames and smaller weights to farther frames.]

The importance score s_t expresses that when the t-th image frame differs greatly from the image frames near it, its importance score s_t is large; conversely, when the t-th image frame is similar to its neighboring image frames, its importance score s_t is low.
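To make formula (1) concrete, the following is a minimal sketch in Python. Since formula (2) is not reproduced in the text, normalized inverse-offset weights are used purely as a stand-in with the described property (nearer frames weigh more); the function name importance_score and the border clamping are likewise illustrative assumptions.

```python
# A sketch of formula (1) under an assumed weighting scheme standing in for
# formula (2): nearer frames receive larger weights.
import numpy as np

def importance_score(features: np.ndarray, t: int, m: int) -> float:
    """features: (N, d) array of per-frame image features F_0..F_{N-1}."""
    offsets = [i for i in range(-m, m + 1) if i != 0]
    weights = np.array([1.0 / abs(i) for i in offsets])
    weights /= weights.sum()                       # assumed normalisation
    score = 0.0
    for w, i in zip(weights, offsets):
        j = min(max(t + i, 0), len(features) - 1)  # clamp at the video borders
        score += w * np.linalg.norm(features[t] - features[j])  # Euclidean Dist
    return score
```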
In step S304, the number n of frames to be deleted is determined according to the difference between the video duration T1 and the target video duration T2 and the frame rate of the video to be clipped.

In the embodiment of the present disclosure, denoting the frame rate of the video to be clipped as fps, the number n of frames to be deleted can be calculated by the following formula (3):

n = (T1 - T2) × fps  (3)
in step S306, the image frames of the video to be clipped are divided into n sections.
Wherein, the total frame number N of the image frames of the video to be edited is T1Xfps, then each of the video to be edited can be determined
Figure BDA0003548472350000082
Each image frame is a section.
In step S308, the image frame with the smallest importance score in each interval is determined as the image frame to be deleted.
Each interval has an image frame to be deleted, and n intervals can determine n image frames to be deleted. Fig. 6(a) is a schematic diagram illustrating importance scores of image frames according to an exemplary embodiment. As shown in fig. 6(a), the horizontal axis represents 400 image frames of the video to be clipped (i.e., the total number of frames N is 400), and the vertical axis represents the importance score of each image frame. Wherein, the dots in fig. 6(a) represent the image frames to be deleted in the video to be clipped.
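A minimal sketch of steps S304 to S308 follows; the function name frames_to_delete and the rounding choices are illustrative assumptions.

```python
# A sketch of steps S304-S308: n = (T1 - T2) * fps frames are deleted, one per
# interval of floor(N/n) frames, always the frame with the lowest importance score.
import numpy as np

def frames_to_delete(scores: np.ndarray, t1: float, t2: float, fps: float) -> list[int]:
    n = int(round((t1 - t2) * fps))  # formula (3): number of frames to delete
    N = len(scores)                  # total frames, N = T1 * fps
    step = N // n                    # interval length floor(N/n)
    deleted = []
    for k in range(n):
        lo, hi = k * step, min((k + 1) * step, N)
        deleted.append(lo + int(np.argmin(scores[lo:hi])))
    return deleted
```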
Fig. 4 is a flowchart illustrating a video clipping method according to yet another exemplary embodiment. Step S206 of the embodiment of fig. 2 may include steps S402 to S414.
As shown in fig. 4, in step S402, the video to be clipped is divided into I video segments, where I is an integer greater than 1.
In the embodiment of the disclosure, the image frames belonging to the same category may be divided into the same video segment by using a clustering method. For another example, a difference method, a gradient method, etc. may be used to find the segmentation point of the video segment, and the video to be edited is divided into different video segments at the segmentation point.
In step S404, the importance score of each image frame in the ith video segment is determined.
The calculation method of the importance score of each image frame is shown in formula (1).
In step S406, the importance score of the ith video segment is determined according to the average of the importance scores of the image frames in the ith video segment.
For the I video segments {clip_1, …, clip_I}, where the i-th video segment clip_i has n_i frames, the importance scores of the image frames in clip_i may be denoted s^i_1, …, s^i_{n_i}, and the importance score S_i of the i-th video segment can be expressed by the following formula (4):

S_i = (1/n_i) · Σ_{j=1}^{n_i} s^i_j  (4)
In step S408, the video segments are sorted in descending order according to the importance scores of the video segments, and a sorting result is obtained.
The higher the importance score of a video segment, the greater the picture change of the image frames in the segment, and the more highlight-worthy the segment. Sorting the I video segments {clip_1, …, clip_I} in descending order of their importance scores S_1, …, S_I yields clip'_1, …, clip'_I, whose corresponding importance scores S'_1, …, S'_I satisfy S'_1 ≥ S'_2 ≥ … ≥ S'_I, and the sorted video segments have n'_1, …, n'_I image frames respectively.
In step S410, the first q video segments in the sorting result are determined as retained segments according to the video duration T1, the target video duration T2 and the duration of each video segment.

The value of q can be determined by the following formula (5):

Σ_{j=1}^{q} n'_j ≥ N' and Σ_{j=1}^{q-1} n'_j < N'  (5)

where N' = T2 × fps is the total number of frames of the clipped video, fps is the frame rate of the video to be clipped, and n'_j is the number of image frames of the j-th video segment in the sorting result. FIG. 6(b) is a schematic diagram illustrating importance scores of video segments according to an example embodiment. As shown in fig. 6(b), each horizontal line (solid or dotted) represents a video segment of the video to be clipped, and the vertical axis represents the importance score of each video segment. The solid lines represent the retained segments of the video to be clipped, and the dotted lines represent the video segments to be deleted. A retained segment is a video segment in which there are no image frames to be deleted. The total number of image frames of the retained segments may be expressed as N_m = Σ_{j=1}^{q} n'_j.
In step S412, first image frames in the first q video segments in the sorting result are determined according to the video durations of the first q video segments in the sorting result, the target video duration T2 and the image features of the image frames in the first q video segments in the sorting result.

When N_m > N', a method similar to the embodiment shown in fig. 3 for determining image frames to be deleted may further be adopted to determine the image frames to be deleted from the first q video segments in the sorting result as the first image frames. For example, for the first q video segments of the sorting result, whose total number of image frames is N_m: first, the importance score of each image frame in the first q video segments is determined, and the number of frames to be deleted from the first q video segments is determined as N_m - N'; the first q video segments are then divided into N_m - N' intervals; and in each of the N_m - N' intervals, the image frame with the smallest importance score is determined as a first image frame of the first q video segments.

In step S414, the first image frames in the first q video segments in the sorting result and the image frames in the (q+1)-th to I-th video segments in the sorting result are determined as the image frames to be deleted.
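The following is a minimal sketch of this segment-level branch (steps S402 to S414), assuming segment boundaries are supplied by the caller; the helper name clip_by_segments and the tie-breaking details are illustrative.

```python
# A sketch of steps S402-S414: score each segment by the mean frame score
# (formula (4)), keep the top-q segments per formula (5), then, if the kept
# segments still exceed N' frames, trim single frames inside them as in fig. 3.
import numpy as np

def clip_by_segments(scores: np.ndarray, bounds: list[tuple[int, int]], n_target: int):
    seg_scores = [scores[a:b].mean() for a, b in bounds]   # formula (4)
    order = np.argsort(seg_scores)[::-1]                   # descending sort
    kept, total = [], 0
    for idx in order:                                      # formula (5): smallest q reaching N'
        kept.append(int(idx))
        total += bounds[idx][1] - bounds[idx][0]
        if total >= n_target:
            break
    kept_frames = np.concatenate([np.arange(*bounds[i]) for i in sorted(kept)])
    extra = len(kept_frames) - n_target                    # N_m - N'
    if extra > 0:                                          # per-interval frame deletion
        step = len(kept_frames) // extra
        drop = {k * step + int(np.argmin(scores[kept_frames[k * step:(k + 1) * step]]))
                for k in range(extra)}
        kept_frames = np.array([f for j, f in enumerate(kept_frames) if j not in drop])
    return kept_frames                                     # indices of frames to retain
```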
FIG. 5 is a flowchart illustrating a method of video clipping in accordance with yet another exemplary embodiment. Step S202 of the embodiment of fig. 2 may include steps S502 to S512.
As shown in fig. 5, in step S502, K video segments to be clipped are obtained, where K is an integer greater than 2, the K video segments to be clipped include L sequenced segments and K-L segments to be sequenced, and L is greater than or equal to 1 and less than or equal to K-2. Wherein the L sorted segments are sorted in a specified order.
The K video segments to be clipped may, for example, be included in the video clip request. The K video segments to be clipped can be respectively denoted V_1, V_2, …, V_K.
In step S504, segment characteristics of each video segment to be clipped are determined according to image characteristics of image frames in each video segment to be clipped.
Suppose the i-th video segment V_i to be clipped has n' image frames, which can be recorded as {I^i_1, …, I^i_{n'}}, where I^i_j represents the j-th image frame of the i-th video segment to be clipped, and the image feature of the j-th image frame of the i-th video segment to be clipped is denoted F^i_j. The segment feature F_i of the i-th video segment to be clipped can be determined according to the following formula (6):

F_i = (1/n') · Σ_{j=1}^{n'} F^i_j  (6)
In step S506, relevance scores of the lth sorted segment and the K-L segments to be sorted are determined according to distances between the segment features of the lth sorted segment and the segment features of the K-L segments to be sorted.
In the embodiment of the present disclosure, the correlation score between the L-th sorted segment and the i-th segment to be sorted may be determined from the distance between their segment features according to the following formula (7), where i is an integer greater than 0 and not greater than K-L:

S_{Li} = -Dist(F_L, F_i)  (7)

where F_L is the segment feature of the L-th sorted segment and Dist represents a distance function. It is assumed here that the 1st video segment to be clipped among the K video segments to be clipped is the designated beginning segment.
In step S508, the segment to be sorted with the highest relevance score to the L-th sorted segment among the K-L segments to be sorted is determined as the L + 1-th sorted segment.
In step S510, L is increased by one and steps S506 to S510 described above are repeated until L = K-1, obtaining K sorted segments.

When L = K-1, the remaining one segment to be sorted can directly be taken as the K-th sorted segment, thereby obtaining K sorted segments.
In step S512, the K sorted segments are synthesized according to the order of the K sorted segments, so as to obtain the video to be clipped.
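A minimal sketch of steps S502 to S512 follows: segment features are mean-pooled frame features (formula (6)), correlation is the negative feature distance (formula (7)), and the playing order is grown greedily from a given first segment; the function name order_segments and the random data are illustrative.

```python
# A sketch of steps S502-S512: greedy ordering of segments by feature correlation.
import numpy as np

def order_segments(seg_feats: list[np.ndarray], first: int = 0) -> list[int]:
    ordered = [first]
    remaining = set(range(len(seg_feats))) - {first}
    while remaining:
        last = seg_feats[ordered[-1]]
        # highest correlation score == smallest feature distance (formula (7))
        nxt = min(remaining, key=lambda i: np.linalg.norm(last - seg_feats[i]))
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered  # concatenate the segments in this order to form the video

# Usage with mean-pooled per-frame features (formula (6)) on random stand-in data:
feats = [frames.mean(axis=0) for frames in
         (np.random.rand(30, 128), np.random.rand(40, 128), np.random.rand(25, 128))]
print(order_segments(feats, first=0))
```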
In still another exemplary embodiment of the present disclosure, the video clipping method may include the following four steps: in the step 1, the image characteristics of each image frame in the video to be edited are obtained based on self-supervision learning; in the step 2, judging the video clipping mode, wherein the video clipping mode comprises clipping the image frame of the video to be clipped (step 3) and clipping the video segment in the video to be clipped (step 4); in step 3, clipping image frames in a video to be clipped; in step 4, a video segment in the video is clipped.
Specifically, in step 1, the feature of each frame of image in the video is obtained based on self-supervised learning. In order to clip the video fully automatically, the feature of each frame of image in the video can be extracted using a self-supervised network, whose training flowchart is shown in fig. 7.
An encoder can be constructed to extract the features of the image frames of the video to be clipped, where the features of adjacent image frames are required to have high similarity. An RNN module may be introduced into the encoder so that the extracted image features carry information from the preceding and following frames. Denote the t-th image frame as I_t; the t-th image frame, together with the preceding s image frames and the following s image frames, is input into the encoder at the same time, and the encoder outputs the image feature F_t ∈ R^n of the t-th image frame.
Further, as shown in fig. 7, the self-supervised learning network may further include a decoder to supervise the encoding so that accurate features F_t are generated. The image feature F_t of the t-th image frame is input into the decoder, which outputs Y_t of the same size as I_t; the distance between I_t and Y_t is calculated as the loss of the self-supervised network. Through self-supervised learning, the encoder can thus obtain the feature F_t of the t-th image frame I_t, which combines the information of the frames before and after the t-th frame and can sufficiently represent I_t.
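A minimal training sketch of this scheme follows, reusing the illustrative WindowEncoder from the earlier sketch; the decoder architecture, the optimiser settings and the stand-in data loader are all assumptions, not the patent's actual configuration.

```python
# A sketch of the fig. 7 self-supervision: the decoder reconstructs the centre
# frame I_t from F_t, and the reconstruction distance is the loss.
import torch
import torch.nn as nn

decoder = nn.Sequential(            # F_t (128-d) -> Y_t with I_t's size (3x64x64)
    nn.Linear(128, 3 * 64 * 64),
    nn.Unflatten(1, (3, 64, 64)),
)
encoder = WindowEncoder()           # illustrative encoder sketched earlier
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)

# Stand-in for a real dataloader of (batch, 2s+1, 3, H, W) frame windows.
loader = [torch.randn(4, 5, 3, 64, 64) for _ in range(10)]

for windows in loader:
    f_t = encoder(windows)                      # feature of the centre frame
    y_t = decoder(f_t)                          # reconstruction Y_t, same size as I_t
    i_t = windows[:, windows.shape[1] // 2]     # the centre frame I_t
    loss = torch.norm(y_t - i_t)                # distance between I_t and Y_t as the loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```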
Specifically, in step 2, two clipping manners may be adopted for the video to be clipped: clipping image frames of the video to be clipped, and clipping video segments of the video to be clipped. For the video duration T1 of the video to be clipped and the target video duration T2, when the ratio of T2 to T1 reaches a threshold θ, the first clipping manner (clipping image frames of the video to be clipped) is adopted; when the ratio of T2 to T1 is below θ, the second clipping manner (clipping video segments of the video to be clipped) is adopted, where 0 < θ ≤ 1. [The exact inequalities appear only as equation images in the original publication.]
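A small sketch of this decision follows; since the exact inequalities are equation images in the original, the condition on the retained ratio T2/T1 is inferred from context, and the default θ is an arbitrary illustration.

```python
# A sketch of the step-2 clipping-mode decision, assuming the threshold applies
# to the retained ratio T2/T1 (inferred; the original equations are images).
def choose_mode(t1: float, t2: float, theta: float = 0.5) -> str:
    assert 0 < theta <= 1 and t1 > t2 > 0
    return "clip_frames" if t2 / t1 >= theta else "clip_segments"
```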
When the video clip request includes the video segments to be clipped, the video segments can be sequenced and combined into a complete video (the combined video is the video to be clipped), and then the clipping mode is judged. For the ordering of a group of video segments to be edited, 2 ordering modes can be adopted. The first sorting mode: determining the sequence of the video segments to be clipped according to a user instruction; the second sort mode: the L sorted video segments to be clipped (i.e. the sorted segments) are determined according to the user instruction, and the next playing order is determined according to the algorithm, which can specifically refer to the embodiment shown in fig. 5.
For example, the K video segments to be clipped may be denoted V_1, V_2, …, V_K, where the a-th video segment V_a to be clipped is the segment specified by the user to be played first. The i-th video segment V_i to be clipped has n frames, i.e., V_i = {I^i_1, …, I^i_n}, where I^i_j represents the j-th image frame of the i-th video segment to be clipped. The self-supervised network is used to extract the feature F^i_j of I^i_j, and the feature F_i of the video segment V_i to be clipped is calculated as:

F_i = (1/n) · Σ_{j=1}^{n} F^i_j  (8)
For every two video segments V_i and V_j to be clipped, the correlation score S_ij of the two video segments can be obtained by calculating the distance between their features:

S_ij = -Dist(F_i, F_j)  (9)
Where Dist represents a distance function.
For the user-specified starting video segment V_a, the video segment to be clipped with the highest correlation score can be taken as the video segment to be played next. Then, for any video segment V_i to be clipped, the index j* of the next played video segment V_{j*} in {V_1, V_2, …, V_K} is calculated as:

j* = argmax_j S_ij, with j = 1, …, K and j ≠ i  (10)
A group of video segments to be clipped is sorted in the above manner and synthesized into a complete video, and the clipping manner is then determined according to the video duration.
Specifically, in step 3, when the ratio of T2 to T1 reaches the threshold θ, single image frames in the video to be clipped can be clipped.
In step 1, the image feature F_t of each image frame I_t in the video to be clipped has been extracted. For the t-th image frame, its importance score s_t can be calculated using the image features of the 2m image frames before and after it; see formula (1) and the related description.
The importance score s_t of the t-th image frame is larger when the t-th image frame differs more from nearby images; conversely, when the t-th image frame is similar to nearby images, its importance score is lower.
Denoting the frame rate of the video to be clipped as fps, the number n of video frames to be deleted after clipping is n = (T1 - T2) × fps.
In this manner, the number n of video frames to be deleted is a relatively small value; that is, the video can be clipped while keeping all segments of the video, with no segment deleted entirely. To ensure that the n deleted image frames do not form a complete segment (it is possible that the importance scores of all the frames in one segment are low), the image frames may be clipped by interval-based frame deletion: with the total frame number N = T1 × fps of the video to be clipped, the interval ⌊N/n⌋ is calculated, and from every ⌊N/n⌋ frames the one image with the lowest importance score is selected for deletion, deleting n frames in total.
Specifically, in step 4, when the ratio of T2 to T1 is below the threshold θ, video segments in the video to be clipped can be clipped.
Firstly, dividing a video to be edited into I video segments, for example, using a clustering method to take image frames belonging to the same category as one segment; or using difference, gradient and other methods to find the segmentation points of the segments, and dividing the video into different segments at the segmentation points.
With the total frame number N = T1 × fps of the video, the video is split into I video segments {clip_1, …, clip_I}, where the i-th video segment clip_i has n_i frames, i.e., Σ_{i=1}^{I} n_i = N. The image features {F^i_1, …, F^i_{n_i}} of the image frames in clip_i are extracted, and the importance score of each image frame, {s^i_1, …, s^i_{n_i}}, is calculated according to formula (1).
The average of the importance scores of the image frames in clip_i is taken as the importance score of the current segment, i.e., the importance score S_i of the i-th segment clip_i; see formula (4).
The higher the importance score of a video segment, the greater the picture change of the image frames in the video segment, and the more highlight-worthy the segment. The duration of the clipped video is T2, and its number of frames is N' = T2 × fps. The I video segments {clip_1, …, clip_I} are sorted in descending order of their importance scores S_1, …, S_I to obtain clip'_1, …, clip'_I, with corresponding importance scores S'_1, …, S'_I satisfying S'_1 ≥ S'_2 ≥ … ≥ S'_I, and the sorted segments have n'_1, …, n'_I frames respectively.
The first q video segments are selected as the retained segments such that they satisfy formula (5).
Fig. 6(b) shows the clipping result of the video segment, the abscissa is the image frame in the video to be clipped, the ordinate is the importance score of the video segment, and the line segment represents the score of the video segment. Where the solid line segment is the retained video segment and the dashed line segment is the video segment to be deleted with a lower score, q being 10 in this example.
The selected q video segments are spliced in the order of the video segments in the original video to form a complete video v_q, whose number of frames is N_m = Σ_{j=1}^{q} n'_j. When N_m > N', the video-frame clipping method in step 3 may need to be applied to v_q, so that the number of frames after clipping goes from N_m to N'.
Further, FIG. 8 is a flowchart illustrating a video clipping method according to yet another exemplary embodiment. The video clipping method of the disclosed embodiment may include steps S802 to S814.
In step S802, a user input is received, and a video clip request is determined based on the information input by the user. The video clip request may include the complete video to clip and the target video duration T2, the video to clip having a video duration T1. In another embodiment, the video clip request may include a video segment to be clipped.
In step S804, it is determined whether the video is a complete video to be clipped, if so, S810 is executed, otherwise, S806 is executed.
In step S806, feature extraction is performed on image frames in the video segment to be clipped.
In step S808, the video segments to be clipped are sorted, and the video segments to be clipped are synthesized into a complete video to be clipped according to the sorting result.
In step S810, it is judged whether the ratio of T2 to T1 reaches the threshold θ; if so, S812 is executed, otherwise S814 is executed.
In step S812, the image frames in the video to be clipped are clipped. See step 3 of the above example.
In step S814, a video segment in the video to be clipped is clipped. See step 4 of the above example.
The present application provides a video clipping method based on self-supervised learning. After a user uploads a complete video to be clipped or a group of video segments to be clipped, the user only needs to specify the desired target video duration, and the video can then be clipped fully automatically. The whole clipping process requires no manual participation: intelligent clipping is completed by retaining the highlight segments or frames of the video to be clipped, based on the image feature of each image frame and the calculated importance scores.
It should be clearly understood that this disclosure describes how to make and use particular examples, but the principles of this disclosure are not limited to any details of these examples. Rather, these principles can be applied to many other embodiments based on the teachings of the present disclosure.
Those skilled in the art will appreciate that all or part of the steps for implementing the above embodiments may be implemented as a computer program executed by a central processing unit (CPU). When the computer program is executed by the CPU, it performs the above-described functions defined by the methods provided by the present disclosure. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disk, or the like.
Furthermore, it should be noted that the above-mentioned figures are only schematic illustrations of the processes involved in the methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
FIG. 9 is a block diagram illustrating a video clipping device according to an example embodiment. Referring to fig. 9, a video clipping apparatus 90 provided by an embodiment of the present disclosure may include: a video acquisition module 902, a feature extraction module 904, an image frame positioning module 906, and a video clipping module 908.
In the video clipping device 90, the video acquisition module 902 may be configured to acquire a video to be clipped having a video duration T1 and determine a target video duration T2.
The feature extraction module 904 can be configured to perform feature extraction on the image frames of the video to be clipped, so as to obtain image features of the image frames.
The image frame positioning module 906 may be configured to determine image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2 and the image features of the image frames.
The video clipping module 908 may be configured to delete the to-be-deleted image frames in the to-be-clipped video to clip the to-be-clipped video.
According to the video clipping device provided by the embodiment of the disclosure, feature extraction is performed on the image frames of the video to be clipped to obtain image features of the image frames, so that importance scores of different image frames or video segments can be evaluated based on the image features of the image frames; a certain number of image frames to be deleted are determined from among the image frames based on the image features of the image frames, the video duration T1 of the video to be clipped and the target video duration T2; and the image frames to be deleted are deleted from the video to be clipped so as to clip the video to be clipped. Automatic and accurate clipping of videos can thus be realized while reducing the consumption of labor and time cost.
In an exemplary embodiment, the feature extraction module 904 can be configured to: process the (t-s)-th to (t+s)-th image frames of the video to be clipped through an encoder to obtain the image feature of the t-th image frame of the video to be clipped, where 0 < t < N, N is the total number of image frames of the video to be clipped, and s > 0.
In an exemplary embodiment, the image frame positioning module 906 may include: a first image frame score calculation unit, configured to obtain the importance score s_t of the t-th image frame according to the image features of the (t-m)-th to (t+m)-th image frames in the video to be clipped, where m > 0; a deletion frame number determination unit, configured to determine the number n of frames to be deleted according to the difference between the video duration T1 and the target video duration T2 and the frame rate of the video to be clipped; an interval dividing unit, configured to divide the image frames of the video to be clipped into n intervals; and a first deleted frame determination unit, configured to determine the image frame with the smallest importance score in each interval as an image frame to be deleted.
In an exemplary embodiment, the first image frame score calculation unit may compute formula (1):

s_t = Σ_{i=-m}^{m} w_i · Dist(F_t, F_{t+i})  (1)

where F_{t+i} is the image feature of the (t+i)-th image frame and w_i is the weight of the i-th image frame.
In an exemplary embodiment, in the video clipping device 90, the weight w_i may further be determined by formula (2) described above.
in an exemplary embodiment, the image frame positioning module 906 may include: the segment dividing unit can be used for dividing the video to be clipped into I video segments, wherein I is an integer larger than 1; a second image frame score calculating unit operable to determine an importance score of each image frame in the ith video segment; the segment score calculating unit is used for determining the importance score of the ith video segment according to the average value of the importance scores of the image frames in the ith video segment; the segment sorting unit is used for sorting the video segments in a descending order according to the importance scores of the video segments to obtain a sorting result; a reserved segment determining unit operable to determine a segment based onSaid video time length T1The target video duration T2Determining the first q video clips in the sequencing result as reserved clips according to the duration of each video clip; a first image frame determining unit, configured to determine the video duration of the first q video segments in the sorting result and the target video duration T2Determining a first image frame in the first q video segments in the sequencing result according to the image characteristics of the image frames in the first q video segments in the sequencing result; and the second deleted frame determining unit may be configured to determine, as the image frame to be deleted, a first image frame in the first q video segments in the sorting result and image frames in the (q + 1) th to I-th video segments in the sorting result.
In an exemplary embodiment, the retained segment determination unit may determine q by formula (5):

Σ_{j=1}^{q} n'_j ≥ N' and Σ_{j=1}^{q-1} n'_j < N'  (5)

where N' = T2 × fps, fps is the frame rate of the video to be clipped, and n'_j is the number of image frames of the j-th video segment in the sorting result.
In an exemplary embodiment, the video acquisition module 902 may include: a video segment acquisition unit, which can be used to acquire K video segments to be clipped, where K is an integer greater than 2, the K video segments to be clipped include L sorted segments and K-L segments to be sorted, and 1 ≤ L ≤ K-2; a segment feature determination unit, which can be used to determine the segment feature of each video segment to be clipped according to the image features of the image frames in that video segment; a segment correlation calculation unit, which can be used to determine the correlation scores between the L-th sorted segment and the K-L segments to be sorted according to the distances between the segment feature of the L-th sorted segment and the segment features of the K-L segments to be sorted; a single segment sorting unit, which can be used to determine the segment to be sorted with the highest correlation score with the L-th sorted segment among the K-L segments to be sorted as the (L+1)-th sorted segment; a segment ordering unit, which can be used to increase L by one and return to the above steps until L = K-1, obtaining K sorted segments; and a segment synthesis unit, which can be used to synthesize the K sorted segments in their order to obtain the video to be clipped.
An electronic device 1000 according to this embodiment of the invention is described below with reference to fig. 10. The electronic device 1000 shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 10, the electronic device 1000 is in the form of a general purpose computing device. The components of the electronic device 1000 may include, but are not limited to: the at least one processing unit 1010, the at least one memory unit 1020, and a bus 1030 that couples various system components including the memory unit 1020 and the processing unit 1010.
Wherein the storage unit stores program code that is executable by the processing unit 1010 to cause the processing unit 1010 to perform steps according to various exemplary embodiments of the present invention as described in the "exemplary methods" section above in this specification. For example, the processing unit 1010 may perform the steps as shown in fig. 2 or fig. 3 or fig. 4 or fig. 5 or fig. 8.
The storage unit 1020 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 10201 and/or a cache memory unit 10202, and may further include a read-only memory unit (ROM) 10203.
The memory unit 1020 may also include a program/utility 10204 having a set (at least one) of program modules 10205, such program modules 10205 including but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 1030 may be any one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, and a local bus using any of a variety of bus architectures.
The electronic device 1000 may also communicate with one or more external devices 1100 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 1000, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 1000 to communicate with one or more other computing devices. Such communication may occur through input/output (I/O) interfaces 1050. Also, the electronic device 1000 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) via the network adapter 1060. As shown, the network adapter 1060 communicates with the other modules of the electronic device 1000 over the bus 1030. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 1000, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above section "exemplary method" of this description, when said program product is run on said terminal device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In situations involving remote computing devices, the remote computing devices may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to external computing devices (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (11)

1. A video clipping method, comprising:
acquiring a video to be clipped and a video duration T1 of the video to be clipped, and determining a target video duration T2;
performing feature extraction on the image frames of the video to be clipped to obtain image features of the image frames;
determining image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2, and the image features of the image frames;
deleting the image frames to be deleted in the video to be clipped so as to clip the video to be clipped.
2. The method of claim 1, wherein performing feature extraction on the image frames of the video to be clipped to obtain the image features of the image frames comprises:
processing the (t-s)-th to (t+s)-th image frames in the video to be clipped through an encoder to obtain the image feature of the t-th image frame in the video to be clipped;
wherein 0 < t < N, N is the total number of image frames of the video to be clipped, and s > 0.
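For illustration, a sketch of the sliding-window feature extraction of claim 2, where `encoder` stands for any clip encoder mapping a short frame window to one feature vector (e.g., a 3D CNN); clamping the window at the video boundaries is our assumption, since the claim leaves boundary handling open:

```python
import numpy as np

def windowed_features(frames, encoder, s=2):
    # Image feature of frame t = encoding of the window [t-s, t+s],
    # clamped to the valid frame index range.
    n = len(frames)
    feats = []
    for t in range(n):
        lo, hi = max(0, t - s), min(n, t + s + 1)
        feats.append(encoder(frames[lo:hi]))
    return np.stack(feats)
```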
3. The method of claim 1, wherein determining the image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2, and the image features of the image frames comprises:
obtaining an importance score s_t of the t-th image frame according to the image features of the (t-m)-th to (t+m)-th image frames in the video to be clipped, where m > 0;
determining the number n of image frames to be deleted according to the difference between the video duration T1 and the target video duration T2 and the frame rate of the video to be clipped;
dividing the image frames of the video to be clipped into n intervals; and
determining the image frame with the minimum importance score in each interval as an image frame to be deleted.
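A sketch of the interval-based deletion of claim 3; it assumes n equals the duration surplus times the frame rate and that the intervals are of roughly equal length, both consistent with the claim text:

```python
import numpy as np

def frames_to_delete(scores, t1, t2, fps):
    n = int(round((t1 - t2) * fps))                         # number of frames to delete
    bounds = np.linspace(0, len(scores), n + 1, dtype=int)  # n roughly equal intervals
    doomed = []
    for a, b in zip(bounds[:-1], bounds[1:]):
        if b > a:                                            # skip empty intervals
            doomed.append(a + int(np.argmin(scores[a:b])))   # least important frame
    return doomed
```

Deleting one minimum-score frame per interval spreads the cuts evenly across the video instead of removing a contiguous block.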
4. The method of claim 3, wherein obtaining the importance score s_t of the t-th image frame according to the image features of the (t-m)-th to (t+m)-th image frames in the video to be clipped comprises computing:

[equation image FDA0003548472340000011 — not reproduced in this text: s_t is computed from the image features F_{t+i}, i = -m, ..., m, and the weights w_i]

wherein F_{t+i} is the image feature of the (t+i)-th image frame, and w_i is the weight of the i-th image frame.
5. The method of claim 4, wherein:

[equation image FDA0003548472340000021 — not reproduced in this text: it specifies the weights w_i]
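Since the published formulas of claims 4 and 5 are equation images not reproduced in this text, the following sketch only illustrates one plausible form of such a score: a weighted sum of feature distances between frame t and its neighbours. Both the distance-based form and the weight map `w` are assumptions:

```python
import numpy as np

def importance_score(features, t, m, w):
    # w maps an offset i in [-m, m] to a weight w_i (e.g., decaying with |i|).
    total = 0.0
    for i in range(-m, m + 1):
        u = t + i
        if i == 0 or u < 0 or u >= len(features):
            continue
        total += w[i] * float(np.linalg.norm(features[t] - features[u]))
    return total
```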
6. The method of claim 1, wherein determining the image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2, and the image features of the image frames comprises:
dividing the video to be clipped into I video segments, where I is an integer greater than 1;
determining an importance score of each image frame in the i-th video segment;
determining the importance score of the i-th video segment as the average of the importance scores of the image frames in the i-th video segment;
sorting the video segments in descending order of their importance scores to obtain a sorting result;
determining the first q video segments in the sorting result as reserved segments according to the video duration T1, the target video duration T2, and the duration of each video segment;
determining first image frames in the first q video segments in the sorting result according to the video durations of the first q video segments in the sorting result, the target video duration T2, and the image features of the image frames in the first q video segments in the sorting result; and
determining the first image frames in the first q video segments in the sorting result and the image frames in the (q+1)-th to I-th video segments in the sorting result as the image frames to be deleted.
7. The method of claim 6, wherein determining the first q video segments in the sorting result as reserved segments according to the video duration T1, the target video duration T2, and the duration of each video segment comprises selecting q such that

$\sum_{j=1}^{q} n_j \geq N'$ and $\sum_{j=1}^{q-1} n_j < N'$,

wherein N' = T2 × fps, fps is the frame rate of the video to be clipped, and n_j is the number of image frames of the j-th video segment in the sorting result.
8. The method of claim 1, wherein obtaining the video to be clipped comprises:
acquiring K video segments to be clipped, where K is an integer greater than 2, the K video segments to be clipped comprise L sorted segments and K-L segments to be sorted, and 1 ≤ L ≤ K-2;
determining a segment feature of each video segment to be clipped according to the image features of the image frames in that video segment;
determining relevance scores between the L-th sorted segment and the K-L segments to be sorted according to the distances between the segment feature of the L-th sorted segment and the segment features of the K-L segments to be sorted;
determining the segment to be sorted that has the highest relevance score with the L-th sorted segment as the (L+1)-th sorted segment;
incrementing L by one and returning to the relevance-score determining step until L equals K-1, thereby obtaining K sorted segments; and
synthesizing the K sorted segments in their sorted order to obtain the video to be clipped.
9. A video clipping apparatus, comprising:
a video acquisition module, configured to acquire a video to be clipped and a video duration T1 of the video to be clipped, and to determine a target video duration T2;
a feature extraction module, configured to perform feature extraction on the image frames of the video to be clipped to obtain image features of the image frames;
an image frame positioning module, configured to determine image frames to be deleted from among the image frames according to the video duration T1 of the video to be clipped, the target video duration T2, and the image features of the image frames; and
a video clipping module, configured to delete the image frames to be deleted from the video to be clipped so as to clip the video to be clipped.
10. An electronic device, comprising:
at least one processor;
storage means for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-8.
11. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, carries out the method of any one of claims 1-8.
CN202210255943.3A 2022-03-15 2022-03-15 Video clipping method, video clipping device, electronic equipment and computer readable medium Pending CN114666656A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210255943.3A CN114666656A (en) 2022-03-15 2022-03-15 Video clipping method, video clipping device, electronic equipment and computer readable medium


Publications (1)

Publication Number Publication Date
CN114666656A true CN114666656A (en) 2022-06-24

Family ID: 82029739




Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110915224A (en) * 2018-08-01 2020-03-24 深圳市大疆创新科技有限公司 Video editing method, device, equipment and storage medium
CN109977895A (en) * 2019-04-02 2019-07-05 重庆理工大学 A kind of wild animal video object detection method based on multi-characteristic fusion
CN112153462A (en) * 2019-06-26 2020-12-29 腾讯科技(深圳)有限公司 Video processing method, device, terminal and storage medium
CN110505519A (en) * 2019-08-14 2019-11-26 咪咕文化科技有限公司 Video editing method, electronic equipment and storage medium
WO2021208255A1 (en) * 2020-04-15 2021-10-21 上海摩象网络科技有限公司 Video clip marking method and device, and handheld camera
US20220067383A1 (en) * 2020-08-25 2022-03-03 Beijing Xiaomi Pinecone Electronics Co., Ltd. Method and apparatus for video clip extraction, and storage medium
CN112182299A (en) * 2020-09-25 2021-01-05 北京字节跳动网络技术有限公司 Method, device, equipment and medium for acquiring highlight segments in video
CN112532897A (en) * 2020-11-25 2021-03-19 腾讯科技(深圳)有限公司 Video clipping method, device, equipment and computer readable storage medium
CN113542865A (en) * 2020-12-25 2021-10-22 腾讯科技(深圳)有限公司 Video editing method, device and storage medium
CN112866683A (en) * 2021-01-07 2021-05-28 中国科学技术大学 Quality evaluation method based on video preprocessing and transcoding
CN113301430A (en) * 2021-07-27 2021-08-24 腾讯科技(深圳)有限公司 Video clipping method, video clipping device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAO Xiaoli; GAO Yong: "Multi-level mutual information entropy extraction algorithm for video key frames under the CUDA framework", Journal of University of Electronic Science and Technology of China, no. 05, 30 September 2018 (2018-09-30) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114866839A (en) * 2022-07-11 2022-08-05 深圳市鼎合丰科技有限公司 Video editing software system based on repeated frame image merging
CN114866839B (en) * 2022-07-11 2022-10-25 深圳市鼎合丰科技有限公司 Video editing software system based on repeated frame image merging


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination