CN110381371B - Video editing method and electronic equipment

Info

Publication number
CN110381371B
CN110381371B (application CN201910696203.1A)
Authority
CN
China
Prior art keywords
video
style
target
input
clip
Prior art date
Legal status
Active
Application number
CN201910696203.1A
Other languages
Chinese (zh)
Other versions
CN110381371A (en)
Inventor
龚烜
Current Assignee
Nanjing Weiwo Software Technology Co ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Application filed by Vivo Mobile Communication Co Ltd
Priority to CN201910696203.1A
Publication of CN110381371A
Application granted
Publication of CN110381371B

Classifications

    • H04N21/435: Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/47205: End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N21/81: Monomedia components thereof

Abstract

The invention discloses a video editing method and electronic equipment, wherein the method comprises the following steps: displaying a content tag set and a style tag set of a first video to be clipped, wherein the content tag set comprises at least one content tag, each content tag corresponds to at least one video segment in the first video, the style tag set comprises at least one style tag, each style tag corresponds to at least one material combination, and each material combination comprises at least one clipping material; receiving a first input of a content tag set by a user; in response to the first input, intercepting a target video segment corresponding to a target content label selected by the first input from the first video; receiving a second input of the style label set by the user; responding to the second input, and acquiring a target material combination corresponding to the target style label selected by the second input; and synthesizing the target video segment and the clip material in the target material combination to generate a second video.

Description

Video editing method and electronic equipment
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to a video editing method and electronic equipment.
Background
Video editing technology processes the video segments of a video by editing them, so as to generate video works with different expressive power, and is often applied in scenarios such as short-video production and video compilations.
In the prior art, video clipping is mainly performed manually. When clipping a video, the user has to spend a large amount of time adjusting the video speed, aligning clip lengths on the track, screening transition effects, matching the audio rhythm, and so on. The operation is cumbersome and the video clipping efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a video clipping method and electronic equipment, and aims to solve the technical problem of low video clipping efficiency in the prior art.
To solve the above technical problem, the embodiment of the present invention is implemented as follows:
in a first aspect, an embodiment of the present invention provides a video clipping method, where the method includes:
displaying a content tag set and a style tag set of a first video to be clipped, wherein the content tag set comprises at least one content tag, each content tag corresponds to at least one video segment in the first video, the style tag set comprises at least one style tag, each style tag corresponds to at least one material combination, and each material combination comprises at least one clipping material;
receiving a first input of the content tag set by a user;
in response to the first input, intercepting a target video segment corresponding to a target content label selected by the first input from the first video;
receiving a second input of the style label set by the user;
responding to the second input, and acquiring a target material combination corresponding to the target style label selected by the second input;
and synthesizing the target video segment and the clip material in the target material combination to generate a second video.
Optionally, as an embodiment, before displaying the content tag set and the genre tag set of the first video to be clipped, the method further includes:
classifying each video frame in the first video to obtain at least one video clip, wherein the video frames in each video clip have the same category;
extracting a subtitle fragment of each video fragment;
extracting at least one keyword of each subtitle segment according to the occurrence frequency of each word in the subtitle segment;
and determining the at least one keyword of each subtitle segment as a content tag of the corresponding video segment.
Optionally, as an embodiment, before displaying the content tag set and the genre tag set of the first video to be clipped, the method further includes:
acquiring a preset number of video clip samples, wherein the video clip samples are videos processed by video clips, and each video clip sample comprises at least one clip material;
extracting at least one material feature of each clip material in each video clip sample, the material features identifying the clip materials;
obtaining a style label of each video clip sample;
and combining the extracted material characteristics, and mapping to the corresponding style labels to obtain the material combination corresponding to each style label.
Optionally, as an embodiment, the combining the extracted material features and mapping the combined material features to corresponding genre labels to obtain a material combination corresponding to each genre label includes:
for the acquired N types of material features under each style label Pi, counting the number of times each material feature in each type of material features is used;
determining the material features whose use counts rank in the top M within each type of material features;
combining the top-M material features across the N types of material features to obtain M^N material feature sets;
calculating the material relevance of each material feature set, and mapping the material feature sets whose material relevance ranks in the top S to the style label Pi to obtain the material combinations corresponding to the style label Pi;
wherein the style label Pi is the i-th style label among the acquired style labels, each material feature set comprises N material features of mutually different types, the material relevance is the correlation of all the material features in the material feature set, and N, M and S are integers greater than 1.
Optionally, as an embodiment, after the synthesizing the target video segment and the clip material in the target material combination to generate the second video, the method further includes:
receiving a third input of the second video from the user;
in response to the third input, adding a style label for the second video that was input by the third input;
determining the second video as one of the preset number of video clip samples.
Optionally, as an embodiment, the clip material includes at least one of: audio material, filters, transition effects, special effects, subtitle style, and shot switching frequency.
Optionally, as an embodiment, the synthesizing the target video segment and the clip material in the target material combination to generate the second video includes:
extracting the sound wave fluctuation frequency of the audio material under the condition that the target material combination comprises the audio material;
generating and displaying at least one video shot switching frequency mode alternative according to the sound wave fluctuation frequency;
receiving a fourth input of a target video shot switching frequency mode alternative in the at least one video shot switching frequency mode alternative by the user;
and responding to the fourth input, synthesizing the target video clip and the clip material in the target material combination according to the target video shot switching frequency mode alternative, and generating a second video.
In a second aspect, an embodiment of the present invention provides an electronic device, including:
a display unit, configured to display a content tag set and a style tag set of a first video to be clipped, wherein the content tag set comprises at least one content tag, each content tag corresponds to at least one video segment in the first video, the style tag set comprises at least one style tag, each style tag corresponds to at least one material combination, and each material combination comprises at least one clipping material;
a first receiving unit, configured to receive a first input of the content tag set by a user;
the intercepting unit is used for responding to the first input and intercepting a target video fragment corresponding to a target content label selected by the first input from the first video;
the second receiving unit is used for receiving a second input of the style label set by the user;
the first obtaining unit is used for responding to the second input and obtaining a target material combination corresponding to a target style label selected by the second input;
and the synthesizing unit is used for synthesizing the target video segment and the clip material in the target material combination to generate a second video.
Optionally, as an embodiment, the electronic device further includes:
the classification unit is used for classifying each video frame in the first video to obtain at least one video clip, and the video frames in each video clip have the same category;
a first extraction unit for extracting a subtitle segment for each video segment;
the second extraction unit is used for extracting at least one keyword of each caption segment according to the occurrence frequency of each word in the caption segments;
a first determining unit, configured to determine the at least one keyword of each subtitle segment as a content tag of a corresponding video segment.
Optionally, as an embodiment, the electronic device further includes:
the second acquiring unit is used for acquiring a preset number of video clip samples, wherein the video clip samples are videos processed by video clips, and each video clip sample comprises at least one clip material;
a third extraction unit for extracting at least one material feature of each clip material in each video clip sample, the material feature identifying the clip material;
a third acquiring unit for acquiring a style label of each video clip sample;
and the mapping unit is used for combining the extracted material characteristics and mapping the material characteristics to the corresponding style labels to obtain the material combination corresponding to each style label.
Optionally, as an embodiment, the mapping unit includes:
a statistics subunit, configured to count, for the acquired N types of material features under each style label Pi, the number of times each material feature in each type of material features is used;
a determining subunit, configured to determine the material features whose use counts rank in the top M within each type of material features;
a combining subunit, configured to combine the top-M material features across the N types of material features to obtain M^N material feature sets;
a calculating subunit, configured to calculate the material relevance of each material feature set, and map the material feature sets whose material relevance ranks in the top S to the style label Pi to obtain the material combinations corresponding to the style label Pi;
wherein the style label Pi is the i-th style label among the acquired style labels, each material feature set comprises N material features of mutually different types, the material relevance is the correlation of all the material features in the material feature set, and N, M and S are integers greater than 1.
Optionally, as an embodiment, the electronic device further includes:
a third receiving unit, configured to receive a third input to the second video by a user;
an adding unit, configured to add, in response to the third input, a style label input by the third input to the second video;
a second determining unit for determining the second video as one of the preset number of video clip samples.
Optionally, as an embodiment, the clip material includes at least one of: audio material, filters, transition effects, special effects, subtitle style, and shot switching frequency.
Optionally, as an embodiment, the synthesis unit includes:
an extracting subunit configured to extract a sound wave fluctuation frequency of an audio material in a case where the audio material is included in the target material combination;
a display subunit, configured to generate and display at least one video shot switching frequency mode alternative according to the sound wave fluctuation frequency;
a receiving subunit, configured to receive a fourth input of a target video shot switching frequency mode alternative in the at least one video shot switching frequency mode alternative from the user;
and the synthesizing subunit is used for responding to the fourth input, synthesizing the target video clip and the clip material in the target material combination according to the target video shot switching frequency mode alternative, and generating a second video.
In a third aspect, an embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored on the memory and executable on the processor, and when executed by the processor, the computer program implements the steps of the video clipping method described above.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the video clipping method described above.
In the embodiment of the present invention, when video clipping is performed, the target video segment that actually needs to be clipped is screened out of the first video to be clipped through the content tag, the target material combination required for the actual clipping is obtained through the style tag, and the screened target video segment is synthesized with the clip material in the obtained target material combination to produce the clipped work. The user therefore does not need to perform tedious searching, track alignment, combination, and other processing, which simplifies the clipping operation and improves video clipping efficiency.
Drawings
FIG. 1 is a flow chart of a video clipping method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of a method for generating a content tag set according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for generating a style tag set according to an embodiment of the present invention;
FIG. 4 is a diagram of an example of a video clip interface provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Video clipping performs nonlinear editing and synthesis on a video source through clip software: various analog picture and sound-effect materials undergo A/D (analog-to-digital) conversion, the converted data are archived, and post-production synthesis software, such as Premiere, VideoStudio, and the like, is then used to edit and synthesize the video, audio, and special-effect images. Nonlinear editing technology can add special picture effects, sound effects, animation effects, matched subtitles, and the like to the video frames, giving film and television works more texture and impact. Common nonlinear editing technologies mainly include picture editing, sound editing, montage, and special-effect addition.
In the prior art, video clipping is mainly performed manually. When clipping a video, the user has to spend a large amount of time adjusting the video speed, aligning clip lengths on the track, screening transition effects, matching the audio rhythm, and so on. The operation is cumbersome and the video clipping efficiency is low.
In order to solve the above technical problem, embodiments of the present invention provide a video clipping method and an electronic device. First, a video clipping method according to an embodiment of the present invention will be described.
It should be noted that the video clipping method provided by the embodiment of the present invention is applicable to an electronic device. In practical applications, the electronic device may include mobile terminals such as smart phones, tablet computers, and personal digital assistants, and may also include computer equipment such as notebook computers and desktop computers.
Fig. 1 is a flowchart of a video clipping method provided by an embodiment of the present invention, and as shown in fig. 1, the method may include the following steps: step 101, step 102, step 103, step 104, step 105 and step 106, wherein,
in step 101, a content tag set and a genre tag set of a first video to be clipped are displayed, wherein the content tag set includes at least one content tag, each content tag corresponds to at least one video segment in the first video, the genre tag set includes at least one genre tag, each genre tag corresponds to at least one material combination, and each material combination includes at least one clip material.
In an embodiment of the present invention, the content tag set is used to locate video segments in the first video, wherein each content tag in the content tag set is used to indicate at least one of a main character, a plurality of key characters, scenario information or video content, and at least one key plot in one video segment.
In the embodiment of the present invention, the style tag set is used for recommending a clipping material combination scheme for the user, wherein the clipping material may include at least one of the following items: audio material, filters, transition effects, special effects, subtitle style, and shot switching frequency.
For ease of understanding, the content tag set and the style tag set are described with specific examples. In one example, the first video to be edited is episode 30 of "The Heaven Sword and Dragon Saber", and the content tag set includes: the content tag "Zhang Wuji", the content tag "Zhao Min", the content tag "Zhou Zhiruo", and so on, where the content tag "Zhang Wuji" corresponds to the video segments of the episode that contain "Zhang Wuji", the content tag "Zhao Min" corresponds to the video segments that contain "Zhao Min", and the content tag "Zhou Zhiruo" corresponds to the video segments that contain "Zhou Zhiruo".
In the embodiment of the present invention, the content tag is used to locate the specific position, in the first video to be edited, of the video segment corresponding to that content tag. For example, if the content tag "Zhang Wuji" is selected, the video segments containing "Zhang Wuji" in the first video to be edited can be located, i.e., the desired video segment can be obtained directly through the content tag.
In another example, the style tag set includes: the style label "martial arts", the style label "Euro-American", the style label "TVB group portrait", and so on, where the style label "martial arts" corresponds to one material combination, the style label "Euro-American" corresponds to another material combination, and the style label "TVB group portrait" corresponds to yet another material combination.
In the embodiment of the invention, the material combination corresponding to the style label can be obtained through the style label, for example, the material combination corresponding to the style label martial arts can be obtained through the style label martial arts.
In the embodiment of the invention, the electronic device can acquire the content tag set of the first video to be clipped from the server, namely the content tag set can be generated by the server; or the electronic device may also generate a corresponding content tag set according to the first video to be clipped, that is, the content tag set may be generated by the electronic device.
When a content tag set is generated by an electronic device, as shown in fig. 2, fig. 2 is a flowchart of a content tag set generation method provided by an embodiment of the present invention, where the method may include the following steps: step 201, step 202, step 203 and step 204, wherein,
in step 201, each video frame in the first video is classified to obtain at least one video segment, where the category of the video frame in each video segment is the same.
In the embodiment of the invention, the first video to be edited, which is imported by a user, can be decomposed into the video frame sequence, and the video frame sequence is processed through a clustering algorithm, so that each video frame in the first video is classified. Correspondingly, the step 201 may specifically include the following steps (not shown in the figure): step 2011, step 2012, and step 2013, wherein,
in step 2011, the first video is decomposed into video frames;
in step 2012, image features of each video frame are extracted;
in an embodiment of the present invention, the image features may include: the video frame processing method comprises the following steps of color features and/or texture features, wherein the color features are HSV color features, H represents hue, S represents saturation, and V represents brightness, and the texture features can be obtained by processing the video frame through an LBP texture feature operator.
In step 2013, the extracted image features are clustered through a preset clustering algorithm to obtain the category of each video frame, and the video frames of the same category are determined to be a video clip.
In the embodiment of the invention, the preset clustering algorithm may be the K-means clustering algorithm, whose principle is as follows: a) random initialization: K objects are randomly selected from the n data objects as the initial cluster centers; b) the assignment is updated based on the current centers: each point is assigned to the class whose center is closest to it; c) the center of each class is updated based on the current assignment, i.e., the mean of the points in each class is computed; steps b) and c) are repeated until the overall error falls within the set parameter range.
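Purely as an illustration of steps 2011 to 2013, the following Python sketch decomposes a video into frames, extracts HSV colour and LBP texture histograms, and clusters the frames with K-means; the function names, frame-sampling interval, and histogram sizes are assumptions rather than details given in the patent.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.cluster import KMeans

def frame_features(frame, bins=8):
    """Concatenate an HSV colour histogram with an LBP texture histogram."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    color_hist = cv2.calcHist([hsv], [0, 1, 2], None, [bins, bins, bins],
                              [0, 180, 0, 256, 0, 256]).flatten()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")   # values 0..9
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10))
    feat = np.concatenate([color_hist, lbp_hist]).astype(np.float64)
    return feat / (np.linalg.norm(feat) + 1e-8)

def cluster_frames(video_path, k=5, sample_every=10):
    """Decompose the video into frames (step 2011), extract image features
    (step 2012), and cluster them with K-means (step 2013)."""
    cap = cv2.VideoCapture(video_path)
    feats, frame_ids, idx = [], [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:        # sample frames to keep the clustering cheap
            feats.append(frame_features(frame))
            frame_ids.append(idx)
        idx += 1
    cap.release()
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(np.array(feats))
    # Contiguous runs of frames sharing a label approximate one video segment.
    return list(zip(frame_ids, labels))
```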
In step 202, a subtitle segment for each video segment is extracted.
In step 203, at least one keyword of each caption segment is extracted according to the occurrence frequency of each word in the caption segment.
In the embodiment of the invention, each subtitle segment can be processed through a Word2vec word vector model to obtain at least one keyword of the segment. Specifically, word vectors of each category are trained with a Word2vec model, and the keywords of each category are obtained by calculating the probability of each word.
In step 204, the at least one keyword of each subtitle segment is determined as a content tag of the corresponding video segment.
In one example, the video clips from 4 minutes 28 seconds to 6 minutes 30 seconds of episode 30 of "The Heaven Sword and Dragon Saber" are grouped into one category, and through word vector training and a probability threshold setting, four words that satisfy the threshold are obtained: (Zhang Wuji, -27.9013707845), (Zhao Min, -28.1072913493), (Zhou Zhiruo, -30.482187911), (Guangmingding, -36.3372344659); these four words are used as the keywords of the corresponding video segment.
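As a minimal sketch of steps 203 and 204, the snippet below ranks the words of one subtitle segment and keeps the top few as content tags. The patent scores candidates with a Word2vec word vector model (the negative values in the example above read like log-probability scores); plain word-frequency ranking is substituted here only to keep the sketch self-contained, and the stop-word list and function name are illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a proper word
# segmenter and stop-word dictionary for the subtitle language.
STOP_WORDS = {"the", "a", "of", "and", "to", "is", "in"}

def subtitle_keywords(subtitle_text, top_k=4):
    """Tokenise one subtitle segment, score each word by how often it occurs,
    and return the top_k words as the segment's content tags."""
    words = [w.lower() for w in re.findall(r"\w+", subtitle_text)]
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_k)]
```

In the patent's variant, the frequency score inside `subtitle_keywords` would be replaced by the per-category Word2vec probability, with a threshold deciding how many keywords survive.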
It should be noted that the server may also generate the content tag set by using the processing operations in step 201 to step 204.
Therefore, in the embodiment of the invention, the first video and the subtitle fragments of each video fragment in the first video can be processed through the K-means clustering algorithm and the Word2vec Word vector model to generate the content tags, and further a content tag set is generated, so that a user can quickly locate the video fragments actually needing to be edited in the first video through the content tags.
In the embodiment of the invention, the electronic equipment can acquire the style label set from the server, namely the style label set can be generated by the server; or the electronic device may generate the style label set itself.
In the embodiment of the invention, a large number of video editing works can be collected, the material features of each work extracted, and the style label of each work obtained. The clip materials and style labels are then automatically learned and matched by an algorithm to generate a material combination recommendation scheme, which forms a style tag set recommended to the user, so that the user can obtain a material combination matching his or her own needs through the style labels without repeatedly fine-tuning to optimize the editing effect.
When a style tag set is generated by an electronic device, as shown in fig. 3, fig. 3 is a flowchart of a style tag set generation method provided by an embodiment of the present invention, where the method may include the following steps: step 301, step 302, step 303 and step 304, wherein,
in step 301, a preset number of video clip samples are obtained, where the video clip samples are videos processed by video clips, and each video clip sample contains at least one clip material.
In the embodiment of the invention, a large amount of video clip samples can be obtained, wherein the video clip samples are video clip works.
In step 302, at least one material feature is extracted for each clip material in each video clip sample, wherein the material features are used to identify the clip material.
In the embodiment of the invention, when the clip material is an audio material, the material feature may be the audio name; when the clip material is a filter, the material feature is the filter type; when the clip material is a transition effect, the material feature is the transition effect type; when the clip material is a special effect, the material feature is the special effect type; when the clip material is a subtitle style, the material feature is the subtitle style type; and when the clip material is the shot switching frequency, the material feature is a specific numerical value.
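For illustration only, one way to represent a clip material together with the feature that identifies it is sketched below; the class name, field names, and example values are assumptions, not terms taken from the patent.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class MaterialFeature:
    material_type: str           # "audio", "filter", "transition", "effect", "subtitle", "shot_freq"
    value: Union[str, float]     # audio name, filter/transition/effect/subtitle type, or a numeric frequency

sample_features = [
    MaterialFeature("audio", "Tiexuedanxin"),     # identified by its audio name
    MaterialFeature("filter", "vintage"),         # identified by its filter type
    MaterialFeature("shot_freq", 0.5),            # e.g. one shot switch every 2 seconds
]
```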
In step 303, style labels are obtained for each sample of video clips.
In the embodiment of the invention, the style label is used to identify the style of the video clip sample, and the style labels may be derived from tag collection and manual annotation of massive video data.
In step 304, the extracted material features are combined and mapped to corresponding genre labels to obtain a material combination corresponding to each genre label.
In the embodiment of the invention, the clipping materials such as audio sources, filters, transition effects, special effects, subtitle styles, lens switching frequencies and the like of massive video clipping samples can be combined through the multi-label mapping rule and mapped to the plurality of style labels to form a corresponding material combination recommendation scheme, and a user can obtain the recommended material combination scheme through screening according to the style labels.
Correspondingly, the step 303 may specifically include the following steps (not shown in the figure): step 3031, step 3032, step 3033 and step 3034, wherein,
in step 3031, for the acquired N-type material characteristics under each style label Pi, the number of times of use of each material characteristic in each type of material characteristics is counted.
For ease of understanding, taking the style label "martial arts" as an example, there are 6 types (i.e., N = 6) of clip material under the style label "martial arts", corresponding to 6 types of material features. The number of uses of each audio track is counted under the audio material, the number of uses of each filter under the filter material, the number of uses of each transition effect under the transition material, the number of uses of each special effect under the special-effect material, the number of uses of each subtitle style under the subtitle-style material, and the number of uses of each shot switching frequency under the shot-switching-frequency material.
Taking the statistics of the number of times of use of all audio under the audio material as an example, as shown in table 1 below, table 1 shows the use of each audio material under the style label "martial arts":
Ranking | Audio name | Number of uses
1 | "The book of memorial difficulties" | 5948
2 | "Tiexuedanxin" | 4655
3 | "Knife and sword dream" | 4342
4 | "Return to the future" | 3062
5 | "Ren Xiaoyao" | 2856
6 | "A great deal of love in the fallen city" | 2565
7 | "Jinghong Ying" | 2130
8 | "Wanshengjie" | 2003
9 | "Tian Hui Ming" | 1986
10 | "The world is not in good condition" | 1975
TABLE 1
Similarly, the use times of other types of materials under the style label martial arts can be counted.
In step 3032, the material features whose use counts rank in the top M are determined within each type of material features;
still taking the audio material under the style label "martial arts" as an example, if M is 3, then the audio materials using the first 3 bits of the audio in the order of the first 3 bits are determined as follows: the book of memorial difficulties, the book of Tiexuedanxin, and the book of Shaoye such as dream. Similarly, the filter arranged at the first 3 bits, the transition effect arranged at the first 3 bits, the special effect arranged at the first 3 bits, the subtitle style arranged at the first 3 bits, and the shot switching frequency arranged at the first 3 bits can be determined.
In step 3033, the top-M material features under the N types of material features are combined to obtain M^N material feature sets, where each material feature set comprises N material features of mutually different types;
For example, if the top-3 audio tracks are {audio 1, audio 2, audio 3}, the top-3 filters {filter 1, filter 2, filter 3}, the top-3 transition effects {transition effect 1, transition effect 2, transition effect 3}, the top-3 special effects {special effect 1, special effect 2, special effect 3}, the top-3 subtitle styles {subtitle style 1, subtitle style 2, subtitle style 3}, and the top-3 shot switching frequencies {frequency 1, frequency 2, frequency 3}, the combination yields 3^6 = 729 material feature sets, namely: {audio 1, filter 1, transition effect 1, special effect 1, subtitle style 1, frequency 1}, ..., {audio 3, filter 3, transition effect 3, special effect 3, subtitle style 3, frequency 3}.
In step 3034, the material relevance of each material feature set is calculated, and the material feature sets whose material relevance ranks in the top S are mapped to the style label Pi to obtain the material combinations corresponding to the style label Pi; the style label Pi is the i-th style label among the acquired style labels, each material feature set comprises N material features of mutually different types, the material relevance is the correlation of all the material features in the material feature set, and N, M and S are integers greater than 1.
In the embodiment of the invention, when the material relevance is calculated, a weight needs to be allocated to each type of material feature. Still taking the style label "martial arts" as an example, the audio weight is 0.3, the filter weight is 0.2, the transition weight is 0.2, the special-effect weight is 0.1, the subtitle-effect weight is 0.1, and the shot-switching-frequency weight is 0.1.
The material relevance is then calculated according to a formula of the form: (audio weight × audio use count + filter weight × filter use count) × correlation coefficient of the two + (filter weight × filter use count + transition weight × transition use count) × correlation coefficient of the two + … . The correlation coefficient of two materials (for example, the audio material and the filter material) is calculated as follows: given the top-3 audio tracks {audio 1, audio 2, audio 3} and the top-3 filters {filter 1, filter 2, filter 3}, an audio use-count vector (uses of audio 1, uses of audio 2, uses of audio 3) and a filter use-count vector (uses of filter 1, uses of filter 2, uses of filter 3) are constructed, both vectors are normalized, and the Pearson coefficient of the two normalized vectors is calculated.
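The formula above appears partly garbled in the translated text; the sketch below implements one plausible reading of it (pairing consecutive material types and weighting each pair by the use counts of the chosen features), with the example weights from the previous paragraph. The input structures and function names are assumptions.

```python
import numpy as np

WEIGHTS = {"audio": 0.3, "filter": 0.2, "transition": 0.2,
           "effect": 0.1, "subtitle": 0.1, "shot_freq": 0.1}   # example weights for "martial arts"

def pearson(u, v):
    """Pearson correlation coefficient of two use-count vectors, normalized first."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    u, v = u / u.sum(), v / v.sum()
    return float(np.corrcoef(u, v)[0, 1])

def relevance(feature_set, counts):
    """Score one candidate material feature set (step 3034).  `feature_set` maps
    material type -> chosen feature value; `counts` maps material type ->
    {feature value: use count} for that type's top-M features under the style label."""
    types = [t for t in WEIGHTS if t in feature_set]            # fixed pairing order
    score = 0.0
    for a, b in zip(types, types[1:]):
        pair_weight = (WEIGHTS[a] * counts[a][feature_set[a]]
                       + WEIGHTS[b] * counts[b][feature_set[b]])
        score += pair_weight * pearson(list(counts[a].values()), list(counts[b].values()))
    return score
```

The S highest-scoring feature sets would then be kept as the material combinations mapped to the style label Pi.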
Therefore, in the embodiment of the invention, the high-frequency material characteristics under each style label can be combined to generate a material combination scheme.
The server may generate the style label set by using the processing operations in steps 301 to 304.
Therefore, in the embodiment of the invention, a large amount of video clip works can be collected, the material characteristics and the style labels of each video clip work are extracted, the clip materials and the style labels are automatically learned and matched through an algorithm, the style label set is generated and recommended to a user, so that the user can obtain the recommended material combination under the style through the style labels, and the clipping effect does not need to be optimized through continuous fine adjustment.
In step 102, a first input to a set of content tags by a user is received.
In the embodiment of the present invention, the first input is used to select a target content tag from a content tag set, where the first input may be a click operation.
In step 103, in response to the first input, a target video segment corresponding to the target content tag selected by the first input is intercepted from the first video.
In the embodiment of the invention, the user can directly position the target video segment corresponding to the target content tag from the first video and intercept the target video segment through the target content tag without manually searching the target video segment by the user.
In step 104, a second input to the collection of style tags by the user is received.
In the embodiment of the present invention, the second input is used to select a target style label from the style label set, where the second input may be a click operation.
In step 105, in response to the second input, a target material combination corresponding to the target style label selected by the second input is obtained.
In the embodiment of the invention, the user can obtain the material combination scheme recommended by the system through the style label, and because the material combination scheme recommended by the system is generated based on big data, the recommendation result is reasonable, and the user does not need to search and match the materials one by one.
In step 106, the clip material in the target video segment and the target material combination is synthesized to generate a second video.
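The patent does not name any particular library for the synthesis in step 106; purely as an illustration, cutting out the target segment and compositing it with an audio material could be sketched with moviepy 1.x as below. The paths, time stamps, speed factor, and the `synthesize` helper are hypothetical.

```python
from moviepy.editor import VideoFileClip, AudioFileClip, vfx

def synthesize(video_path, start, end, audio_path, speed=1.0, out_path="second_video.mp4"):
    """Cut the target segment out of the first video, optionally change its speed,
    replace its audio with the selected audio material, and export the second video."""
    segment = VideoFileClip(video_path).subclip(start, end)
    if speed != 1.0:
        segment = segment.fx(vfx.speedx, speed)                  # e.g. 0.8x, as in the Fig. 4 example
    music = AudioFileClip(audio_path).subclip(0, segment.duration)
    segment = segment.set_audio(music)
    segment.write_videofile(out_path)

# synthesize("first_video.mp4", start=268, end=390, audio_path="theme.mp3", speed=0.8)
```

Filters, transitions, special effects, and subtitle styles from the target material combination would be applied to `segment` in the same way before exporting.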
In the embodiment of the present invention, it is also possible to learn, for each user, the features and style of the clipped works the user has newly completed, and to feed them back into the style-label generation process so as to continuously optimize the style tag set. In this case, the following steps may be added after step 106:
receiving a third input of the second video from the user; wherein the third input is used to add a style label to the second video;
in response to a third input, adding a style label input by the third input to the second video;
the second video is determined to be one of a preset number of samples of video clips.
Therefore, in the embodiment of the invention, the user can add a style label to the new work, which feeds back into the calculation result of the mapping rule and optimizes the recommendation scheme that maps feature combinations to labels. The scheme can thus keep adjusting itself by learning current trends and personal preferences, and output personalized video clip recommendation schemes tailored to each user.
In this embodiment of the present invention, if the material combination includes an audio material, a clipping scheme may be recommended according to the sound wave fluctuation information of the audio material's track, and in this case step 106 may specifically include the following steps:
generating and displaying at least one video shot switching frequency mode alternative according to the sound wave fluctuation frequency of the audio material;
receiving a fourth input of a target video shot switching frequency mode alternative in the at least one video shot switching frequency mode alternative by the user;
and responding to the fourth input, synthesizing the target video clip and the clip material in the target material combination according to the target video shot switching frequency mode alternative, and generating a second video.
Therefore, in the embodiment of the invention, the editing mode recommendation scheme is generated according to the sound wave fluctuation frequency of the audio material for the user to select, so that the operation difficulty of making high-quality editing video works can be reduced, and the time cost of video editing can be reduced.
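The patent does not specify how the sound wave fluctuation frequency is measured. As one hedged possibility, beat tracking could stand in for it, with the candidate shot switching frequency modes derived from the estimated beat interval; the function name, the librosa-based approach, and the 1/2/4-beat choices below are all assumptions.

```python
import librosa

def shot_switch_options(audio_path):
    """Estimate the audio's rhythmic rate and derive a few candidate shot
    switching frequency modes for the user to choose from."""
    y, sr = librosa.load(audio_path)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)   # estimated tempo in beats per minute
    beat_interval = 60.0 / float(tempo)              # seconds per beat
    # Offer a cut every 1, 2 or 4 beats as the selectable frequency-mode alternatives.
    return [{"shot_length_s": round(beat_interval * k, 2)} for k in (1, 2, 4)]
```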
In one example, as shown in FIG. 4, FIG. 4 is an example diagram of a video clip interface. The video clip interface 40 includes: a content tag set area 41, a style tag set area 42, a video identification module area 43, a video preview interface 44, and an audio track area 45, where the first video may be dragged to the area 43 to generate the corresponding content tag set. An audio material is selected from the material combination corresponding to the target style label and placed in the audio track; according to the sound wave fluctuation frequency of the audio material, a video shot switching frequency mode alternative is given and displayed in video track 1. Specifically, a video shot switching frequency mode is given automatically: the video rate is 0.8 times and the clip length is 3 seconds. If the user agrees to this video shot switching frequency mode, the clipped work is generated according to it, and finally the preview effect of the clipped work can be viewed in the video preview interface 44.
According to this embodiment, when video clipping is performed, the target video segment that actually needs to be clipped can be screened out of the first video to be clipped through the content tag, the target material combination required for the actual clipping is obtained through the style tag, and the screened target video segment is synthesized with the clip material in the obtained target material combination to produce the clipped work. The user therefore does not need to perform tedious searching, track alignment, combination, and other processing, which simplifies the clipping operation and improves video clipping efficiency.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 5, the electronic device 500 may include: a display unit 501, a first receiving unit 502, a clipping unit 503, a second receiving unit 504, a first obtaining unit 505, and a synthesizing unit 506, wherein,
a display unit 501, configured to display a content tag set and a genre tag set of a first video to be clipped, where the content tag set includes at least one content tag, each content tag corresponds to at least one video segment in the first video, the genre tag set includes at least one genre tag, each genre tag corresponds to at least one material combination, and each material combination includes at least one clip material;
a first receiving unit 502, configured to receive a first input of the content tag set by a user;
an intercepting unit 503, configured to, in response to the first input, intercept, from the first video, a target video segment corresponding to a target content tag selected by the first input;
a second receiving unit 504, configured to receive a second input of the style label set from the user;
a first obtaining unit 505, configured to, in response to the second input, obtain a target material combination corresponding to a target style label selected by the second input;
a synthesizing unit 506, configured to synthesize the target video segment and the clip material in the target material combination, and generate a second video.
According to this embodiment, when video clipping is performed, the target video segment that actually needs to be clipped can be screened out of the first video to be clipped through the content tag, the target material combination required for the actual clipping is obtained through the style tag, and the screened target video segment is synthesized with the clip material in the obtained target material combination to produce the clipped work. The user therefore does not need to perform tedious searching, track alignment, combination, and other processing, which simplifies the clipping operation and improves video clipping efficiency.
Optionally, as an embodiment, the electronic device 500 may further include:
the classification unit is used for classifying each video frame in the first video to obtain at least one video clip, and the video frames in each video clip have the same category;
a first extraction unit for extracting a subtitle segment for each video segment;
the second extraction unit is used for extracting at least one keyword of each caption segment according to the occurrence frequency of each word in the caption segments;
a first determining unit, configured to determine the at least one keyword of each subtitle segment as a content tag of a corresponding video segment.
Optionally, as an embodiment, the electronic device 500 may further include:
the second acquiring unit is used for acquiring a preset number of video clip samples, wherein the video clip samples are videos processed by video clips, and each video clip sample comprises at least one clip material;
a third extraction unit for extracting at least one material feature of each clip material in each video clip sample, the material feature identifying the clip material;
a third acquiring unit for acquiring a style label of each video clip sample;
and the mapping unit is used for combining the extracted material characteristics and mapping the material characteristics to the corresponding style labels to obtain the material combination corresponding to each style label.
Optionally, as an embodiment, the mapping unit may include:
a statistics subunit, configured to count, for the acquired N types of material features under each style label Pi, the number of times each material feature in each type of material features is used;
a determining subunit, configured to determine the material features whose use counts rank in the top M within each type of material features;
a combining subunit, configured to combine the top-M material features across the N types of material features to obtain M^N material feature sets;
a calculating subunit, configured to calculate the material relevance of each material feature set, and map the material feature sets whose material relevance ranks in the top S to the style label Pi to obtain the material combinations corresponding to the style label Pi;
wherein the style label Pi is the i-th style label among the acquired style labels, each material feature set comprises N material features of mutually different types, the material relevance is the correlation of all the material features in the material feature set, and N, M and S are integers greater than 1.
Optionally, as an embodiment, the electronic device 500 may further include:
a third receiving unit, configured to receive a third input to the second video by a user;
an adding unit, configured to add, in response to the third input, a style label input by the third input to the second video;
a second determining unit for determining the second video as one of the preset number of video clip samples.
Optionally, as an embodiment, the clip material may include at least one of: audio material, filters, transition effects, special effects, subtitle style, and shot switching frequency.
Optionally, as an embodiment, the synthesizing unit 506 may include:
an extracting subunit configured to extract a sound wave fluctuation frequency of an audio material in a case where the audio material is included in the target material combination;
a display subunit, configured to generate and display at least one video shot switching frequency mode alternative according to the sound wave fluctuation frequency;
a receiving subunit, configured to receive a fourth input of a target video shot switching frequency mode alternative in the at least one video shot switching frequency mode alternative from the user;
and the synthesizing subunit is used for responding to the fourth input, synthesizing the target video clip and the clip material in the target material combination according to the target video shot switching frequency mode alternative, and generating a second video.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device for implementing various embodiments of the present invention, and as shown in fig. 6, the electronic device 600 includes, but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 6 does not constitute a limitation of the electronic device, and that the electronic device may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
The processor 610 is configured to display a content tag set and a genre tag set of a first video to be clipped, where the content tag set includes at least one content tag, each content tag corresponds to at least one video segment in the first video, the genre tag set includes at least one genre tag, each genre tag corresponds to at least one material combination, and each material combination includes at least one clip material; receiving a first input of the content tag set by a user; in response to the first input, intercepting a target video segment corresponding to a target content label selected by the first input from the first video; receiving a second input of the style label set by the user; responding to the second input, and acquiring a target material combination corresponding to the target style label selected by the second input; and synthesizing the target video segment and the clip material in the target material combination to generate a second video.
In the embodiment of the invention, when the video is edited, the target video segment that actually needs to be clipped can be screened out of the first video to be clipped through the content tag, the target material combination required for the actual editing is obtained through the style tag, and the screened target video segment is synthesized with the clip material in the obtained target material combination to produce the edited work. The user therefore does not need to perform tedious searching, track alignment, combination, and other processing, which simplifies the editing operation and improves video editing efficiency.
Optionally, as an embodiment, before displaying the content tag set and the genre tag set of the first video to be clipped, the method further includes:
classifying each video frame in the first video to obtain at least one video clip, wherein the video frames in each video clip have the same category;
extracting a subtitle fragment of each video fragment;
extracting at least one keyword of each subtitle segment according to the occurrence frequency of each word in the subtitle segment;
and determining the at least one keyword of each subtitle segment as a content tag of the corresponding video segment.
Optionally, as an embodiment, before displaying the content tag set and the genre tag set of the first video to be clipped, the method further includes:
acquiring a preset number of video clip samples, wherein the video clip samples are videos processed by video clips, and each video clip sample comprises at least one clip material;
extracting at least one material feature of each clip material in each video clip sample, the material features identifying the clip materials;
obtaining a style label of each video clip sample;
and combining the extracted material characteristics, and mapping to the corresponding style labels to obtain the material combination corresponding to each style label.
Optionally, as an embodiment, the combining the extracted material features and mapping the combined material features to corresponding genre labels to obtain a material combination corresponding to each genre label includes:
counting, for the N types of material features acquired under each style label Pi, the number of times each material feature in each type of material features is used;
determining, in each type of material features, the material features whose usage counts rank in the top M;
combining the top-M material features across the N types of material features to obtain M^N material feature sets;
calculating the material relevance of each material feature set, and mapping the S material feature sets with the highest material relevance to the style label Pi to obtain the material combinations corresponding to the style label Pi;
wherein the style label Pi is the ith style label among the acquired style labels, each material feature set includes N material features of mutually different types, the material relevance is the relevance among all the material features in the material feature set, and N, M, and S are integers greater than 1.
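The selection step for one style label Pi can be sketched as below; the relevance score used here (how many samples contain the entire feature set) is only a placeholder for whatever material relevance measure an implementation adopts, and M, N, and S are kept tiny for readability.

```python
from collections import Counter
from itertools import product

# Illustrative sketch of the combination step for one style label Pi.
# Per-type usage Counters for the label "nostalgic" (N = 2 material types).
usage_nostalgic = {
    "audio":      Counter({"soft_piano": 2, "acoustic": 1}),
    "transition": Counter({"fade": 2, "dissolve": 1}),
}
samples_nostalgic = [
    {"audio": "soft_piano", "transition": "fade"},
    {"audio": "soft_piano", "transition": "fade"},
    {"audio": "acoustic",   "transition": "dissolve"},
]

def material_combinations(usage, samples, M=2, S=2):
    # Keep the M most frequently used features of each of the N types.
    top = {t: [f for f, _ in c.most_common(M)] for t, c in usage.items()}
    types = sorted(top)
    # One feature per type -> up to M**N candidate feature sets.
    candidates = [dict(zip(types, combo))
                  for combo in product(*(top[t] for t in types))]
    # Relevance placeholder: number of samples in which the whole set co-occurs.
    def relevance(cand):
        return sum(all(s.get(t) == f for t, f in cand.items()) for s in samples)
    # Map the S highest-relevance feature sets to the style label Pi.
    return sorted(candidates, key=relevance, reverse=True)[:S]

print(material_combinations(usage_nostalgic, samples_nostalgic))
```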
Optionally, as an embodiment, after the synthesizing the target video segment and the clip material in the target material combination to generate the second video, the method further includes:
receiving a third input of the second video from the user;
in response to the third input, adding a style label for the second video that was input by the third input;
determining the second video as one of the preset number of video clip samples.
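A few illustrative lines (hypothetical names, plain Python data) showing how a user-labelled second video could be folded back into the sample pool:

```python
# Illustrative sketch: once the user attaches a style label to the generated
# second video (third input), the video is appended to the clip-sample pool so
# the label-to-material mapping can be refreshed later.

clip_samples = []   # the preset pool of labelled video clip samples

def add_labelled_result(second_video, style_label):
    sample = dict(second_video)          # copy the generated video's materials
    sample["style"] = style_label        # attach the label given by the third input
    clip_samples.append(sample)          # it now counts as one of the samples
    return sample

add_labelled_result({"audio": "soft_piano.mp3", "filter": "sepia"}, "nostalgic")
print(clip_samples)
```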
Optionally, as an embodiment, the clip material includes at least one of: audio material, filters, transition effects, special effects, subtitle style, and shot switching frequency.
Optionally, as an embodiment, the synthesizing the target video segment and the clip material in the target material combination to generate the second video includes:
extracting the sound wave fluctuation frequency of the audio material under the condition that the target material combination comprises the audio material;
generating and displaying at least one video shot switching frequency mode alternative according to the sound wave fluctuation frequency;
receiving a fourth input of a target video shot switching frequency mode alternative in the at least one video shot switching frequency mode alternative by the user;
and in response to the fourth input, synthesizing the target video segment and the clip material in the target material combination according to the target video shot switching frequency mode alternative to generate the second video.
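A minimal sketch of this audio-driven step, assuming the audio material is available as a raw sampled waveform; the amplitude-envelope peak count used as the "fluctuation frequency" and the 0.5x/1x/2x alternatives are illustrative assumptions, not the disclosed algorithm.

```python
import math

# Illustrative sketch: estimate a fluctuation frequency from a waveform by
# counting amplitude-envelope peaks per second, then derive a few candidate
# shot-switching frequencies from it for the user to choose between.

def fluctuation_frequency(samples, sample_rate, window=1024):
    # Crude envelope: max absolute amplitude per window.
    envelope = [max(abs(s) for s in samples[i:i + window])
                for i in range(0, len(samples) - window, window)]
    # Count local maxima of the envelope as "fluctuations".
    peaks = sum(1 for i in range(1, len(envelope) - 1)
                if envelope[i] > envelope[i - 1] and envelope[i] > envelope[i + 1])
    return peaks / (len(samples) / sample_rate)    # fluctuations per second

def shot_switch_options(freq_hz):
    # Offer slower / matched / faster cut rates as user-selectable alternatives.
    return [{"label": name, "cuts_per_second": round(freq_hz * k, 2)}
            for name, k in (("relaxed", 0.5), ("matched", 1.0), ("fast", 2.0))]

# Synthetic 3-second, 8 kHz "audio" with a 2 Hz amplitude wobble.
rate = 8000
audio = [math.sin(2 * math.pi * 440 * t / rate)
         * (0.5 + 0.5 * math.sin(2 * math.pi * 2 * t / rate))
         for t in range(3 * rate)]
freq = fluctuation_frequency(audio, rate)
print(freq, shot_switch_options(freq))
```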
It should be understood that, in the embodiment of the present invention, the radio frequency unit 601 may be used for receiving and sending signals during message transmission and reception or during a call; specifically, it receives downlink data from a base station and forwards the data to the processor 610 for processing, and it also sends uplink data to the base station. In general, the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. Further, the radio frequency unit 601 may also communicate with a network and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user via the network module 602, such as assisting the user in sending and receiving e-mails, browsing web pages, and accessing streaming media.
The audio output unit 603 may convert audio data received by the radio frequency unit 601 or the network module 602, or stored in the memory 609, into an audio signal and output it as sound. The audio output unit 603 may also provide audio output related to a specific function performed by the electronic apparatus 600 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
The input unit 604 is used to receive audio or video signals. The input unit 604 may include a Graphics Processing Unit (GPU) 6041 and a microphone 6042; the graphics processor 6041 processes image data of a still picture or video obtained by an image capturing apparatus (such as a camera) in a video capture mode or an image capture mode. The processed image may be displayed on the display unit 606. The image processed by the graphics processor 6041 may be stored in the memory 609 (or other storage medium) or transmitted via the radio frequency unit 601 or the network module 602. The microphone 6042 can receive sound and process it into audio data. In the phone call mode, the processed audio data may be converted into a format transmittable to a mobile communication base station via the radio frequency unit 601 for output.
The electronic device 600 also includes at least one sensor 605, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 6061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 6061 and/or the backlight when the electronic apparatus 600 is moved to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of an electronic device (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 605 may also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc., which are not described in detail herein.
The display unit 606 is used to display information input by the user or information provided to the user. The Display unit 606 may include a Display panel 6061, and the Display panel 6061 may be configured by a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 607 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. Touch panel 6071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations by a user on or near touch panel 6071 using a finger, stylus, or any suitable object or accessory). The touch panel 6071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 610, receives a command from the processor 610, and executes the command. In addition, the touch panel 6071 can be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The user input unit 607 may include other input devices 6072 in addition to the touch panel 6071. Specifically, the other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a track ball, a mouse, and a joystick, which are not described herein again.
Further, the touch panel 6071 can be overlaid on the display panel 6061, and when the touch panel 6071 detects a touch operation on or near the touch panel 6071, the touch operation is transmitted to the processor 610 to determine the type of the touch event, and then the processor 610 provides a corresponding visual output on the display panel 6061 according to the type of the touch event. Although the touch panel 6071 and the display panel 6061 are shown in fig. 6 as two separate components to implement the input and output functions of the electronic device, in some embodiments, the touch panel 6071 and the display panel 6061 may be integrated to implement the input and output functions of the electronic device, and this is not limited here.
The interface unit 608 is an interface for connecting an external device to the electronic apparatus 600. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the electronic device 600 or may be used to transmit data between the electronic device 600 and external devices.
The memory 609 may be used to store software programs as well as various data. The memory 609 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the electronic device, and the like. Further, the memory 609 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device.
The processor 610 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 609, and calling data stored in the memory 609, thereby performing overall monitoring of the electronic device. Processor 610 may include one or more processing units; preferably, the processor 610 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 610.
The electronic device 600 may further include a power supply 611 (e.g., a battery) for supplying power to the various components, and preferably, the power supply 611 may be logically connected to the processor 610 via a power management system, such that the power management system may be used to manage charging, discharging, and power consumption.
In addition, the electronic device 600 includes some functional modules that are not shown, and are not described in detail herein.
Preferably, an embodiment of the present invention further provides an electronic device, which includes a processor 610, a memory 609, and a computer program stored in the memory 609 and capable of running on the processor 610, where the computer program, when executed by the processor 610, implements each process of the above-mentioned video clipping method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of the video clipping method embodiment, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It should be noted that, in the present specification, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (9)

1. A method of video clipping, the method comprising:
displaying a content tag set and a style tag set of a first video to be clipped, wherein the content tag set comprises at least one content tag, each content tag corresponds to at least one video segment in the first video, the style tag set comprises at least one style tag, each style tag corresponds to at least one material combination, and each material combination comprises at least one clipping material;
receiving a first input of the content tag set by a user;
in response to the first input, intercepting a target video segment corresponding to a target content label selected by the first input from the first video;
receiving a second input of the style label set by the user;
in response to the second input, acquiring a target material combination corresponding to the target style label selected by the second input;
synthesizing the target video segment and the editing material in the target material combination to generate a second video;
wherein, before displaying the content tag set and the style tag set of the first video to be edited, the method further comprises:
acquiring a preset number of video clip samples, wherein the video clip samples are videos that have already been processed by video clipping, and each video clip sample comprises at least one clip material;
extracting at least one material feature of each clip material in each video clip sample, the material feature identifying the clip material;
obtaining a style label of each video clip sample;
and combining the extracted material features, and mapping the combinations to the corresponding style labels to obtain the material combination corresponding to each style label.
2. The method of claim 1, wherein, prior to displaying the content tag set and the style tag set of the first video to be clipped, the method further comprises:
classifying each video frame in the first video to obtain at least one video segment, wherein the video frames in each video segment belong to the same category;
extracting a subtitle segment of each video segment;
extracting at least one keyword of each subtitle segment according to the occurrence frequency of each word in the subtitle segment;
and determining the at least one keyword of each subtitle segment as a content tag of the corresponding video segment.
3. The method of claim 1, wherein the combining the extracted material features and mapping the combined material features to corresponding style labels to obtain a material combination corresponding to each style label comprises:
counting, for the N types of material features acquired under each style label Pi, the number of times each material feature in each type of material features is used;
determining, in each type of material features, the material features whose usage counts rank in the top M;
combining the top-M material features across the N types of material features to obtain M^N material feature sets;
calculating the material relevance of each material feature set, and mapping the S material feature sets with the highest material relevance to the style label Pi to obtain the material combinations corresponding to the style label Pi;
wherein the style label Pi is the ith style label among the acquired style labels, each material feature set comprises N material features of mutually different types, the material relevance is the relevance among all the material features in the material feature set, and N, M, and S are integers greater than 1.
4. The method of claim 1, wherein after synthesizing the target video segment and the clip material in the target material combination to generate the second video, further comprising:
receiving a third input of the second video from the user;
in response to the third input, adding a style label for the second video that was input by the third input;
determining the second video as one of the preset number of video clip samples.
5. The method of any of claims 1 to 4, wherein the clip material comprises at least one of: audio material, filters, transition effects, special effects, subtitle style, and shot switching frequency.
6. The method of claim 5, wherein the synthesizing of the target video segment and the clip material in the target material combination to generate the second video comprises:
extracting the sound wave fluctuation frequency of the audio material under the condition that the target material combination comprises the audio material;
generating and displaying at least one video shot switching frequency mode alternative according to the sound wave fluctuation frequency;
receiving a fourth input of a target video shot switching frequency mode alternative in the at least one video shot switching frequency mode alternative by the user;
and in response to the fourth input, synthesizing the target video segment and the clip material in the target material combination according to the target video shot switching frequency mode alternative to generate the second video.
7. An electronic device, comprising:
a display unit, configured to display a content tag set and a style tag set of a first video to be clipped, wherein the content tag set comprises at least one content tag, each content tag corresponds to at least one video segment in the first video, the style tag set comprises at least one style tag, each style tag corresponds to at least one material combination, and each material combination comprises at least one clipping material;
a first receiving unit, configured to receive a first input of the content tag set by a user;
the intercepting unit is used for responding to the first input and intercepting a target video fragment corresponding to a target content label selected by the first input from the first video;
the second receiving unit is used for receiving a second input of the style label set by the user;
the first obtaining unit is used for responding to the second input and obtaining a target material combination corresponding to a target style label selected by the second input;
the synthesizing unit is used for synthesizing the target video segment and the editing material in the target material combination to generate a second video;
the second acquiring unit is used for acquiring a preset number of video clip samples, wherein the video clip samples are videos processed by video clips, and each video clip sample comprises at least one clip material;
a third extraction unit for extracting at least one material feature of each clip material in each video clip sample, the material feature identifying the clip material;
a third acquiring unit for acquiring a style label of each video clip sample;
and the mapping unit is used for combining the extracted material characteristics and mapping the material characteristics to the corresponding style labels to obtain the material combination corresponding to each style label.
8. An electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the video clipping method of any one of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the video clipping method according to any one of claims 1 to 6.
CN201910696203.1A 2019-07-30 2019-07-30 Video editing method and electronic equipment Active CN110381371B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910696203.1A CN110381371B (en) 2019-07-30 2019-07-30 Video editing method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910696203.1A CN110381371B (en) 2019-07-30 2019-07-30 Video editing method and electronic equipment

Publications (2)

Publication Number Publication Date
CN110381371A CN110381371A (en) 2019-10-25
CN110381371B true CN110381371B (en) 2021-08-31

Family

ID=68257088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910696203.1A Active CN110381371B (en) 2019-07-30 2019-07-30 Video editing method and electronic equipment

Country Status (1)

Country Link
CN (1) CN110381371B (en)

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110797055B (en) * 2019-10-29 2021-09-03 北京达佳互联信息技术有限公司 Multimedia resource synthesis method and device, electronic equipment and storage medium
CN110933460B (en) * 2019-12-05 2021-09-07 腾讯科技(深圳)有限公司 Video splicing method and device and computer storage medium
US11678029B2 (en) 2019-12-17 2023-06-13 Tencent Technology (Shenzhen) Company Limited Video labeling method and apparatus, device, and computer-readable storage medium
CN110996138B (en) * 2019-12-17 2021-02-05 腾讯科技(深圳)有限公司 Video annotation method, device and storage medium
CN111158492B (en) * 2019-12-31 2021-08-06 维沃移动通信有限公司 Video editing method and head-mounted device
CN111541936A (en) * 2020-04-02 2020-08-14 腾讯科技(深圳)有限公司 Video and image processing method and device, electronic equipment and storage medium
CN111491213B (en) * 2020-04-17 2022-03-08 维沃移动通信有限公司 Video processing method, video processing device and electronic equipment
CN111491206B (en) * 2020-04-17 2023-03-24 维沃移动通信有限公司 Video processing method, video processing device and electronic equipment
CN111629252B (en) * 2020-06-10 2022-03-25 北京字节跳动网络技术有限公司 Video processing method and device, electronic equipment and computer readable storage medium
CN113938751B (en) * 2020-06-29 2023-12-22 抖音视界有限公司 Video transition type determining method, device and storage medium
CN113395542B (en) * 2020-10-26 2022-11-08 腾讯科技(深圳)有限公司 Video generation method and device based on artificial intelligence, computer equipment and medium
CN112508284A (en) * 2020-12-10 2021-03-16 网易(杭州)网络有限公司 Display material preprocessing method, putting method, system, device and equipment
CN112887794B (en) * 2021-01-26 2023-07-18 维沃移动通信有限公司 Video editing method and device
CN113115055B (en) * 2021-02-24 2022-08-05 华数传媒网络有限公司 User portrait and live video file editing method based on viewing behavior
CN113259708A (en) * 2021-04-06 2021-08-13 阿里健康科技(中国)有限公司 Method, computer device and medium for introducing commodities based on short video
CN113365147B (en) * 2021-08-11 2021-11-19 腾讯科技(深圳)有限公司 Video editing method, device, equipment and storage medium based on music card point
CN113923477A (en) * 2021-09-30 2022-01-11 北京百度网讯科技有限公司 Video processing method, video processing device, electronic equipment and storage medium
CN114040248A (en) * 2021-11-23 2022-02-11 维沃移动通信有限公司 Video processing method and device and electronic equipment
CN114302253B (en) * 2021-11-25 2024-03-12 北京达佳互联信息技术有限公司 Media data processing method, device, equipment and storage medium
CN114245171B (en) * 2021-12-15 2023-08-29 百度在线网络技术(北京)有限公司 Video editing method and device, electronic equipment and medium
CN114173067A (en) * 2021-12-21 2022-03-11 科大讯飞股份有限公司 Video generation method, device, equipment and storage medium
CN114339399A (en) * 2021-12-27 2022-04-12 咪咕文化科技有限公司 Multimedia file editing method and device and computing equipment
CN116708917A (en) * 2022-02-25 2023-09-05 北京字跳网络技术有限公司 Video processing method, device, equipment and medium
CN115567660B (en) * 2022-02-28 2023-05-26 荣耀终端有限公司 Video processing method and electronic equipment
CN115022712B (en) * 2022-05-20 2023-12-29 北京百度网讯科技有限公司 Video processing method, device, equipment and storage medium
CN116634058B (en) * 2022-05-30 2023-12-22 荣耀终端有限公司 Editing method of media resources, electronic equipment and readable storage medium
CN115278306A (en) * 2022-06-20 2022-11-01 阿里巴巴(中国)有限公司 Video editing method and device
CN115119050B (en) * 2022-06-30 2023-12-15 北京奇艺世纪科技有限公司 Video editing method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101390032A (en) * 2006-01-05 2009-03-18 眼点公司 System and methods for storing, editing, and sharing digital video
CN101901620A (en) * 2010-07-28 2010-12-01 复旦大学 Automatic generation method and edit method of video content index file and application
CN106233707A (en) * 2014-04-21 2016-12-14 微软技术许可有限责任公司 Alternatively make camera motion stylization
CN108769733A (en) * 2018-06-22 2018-11-06 三星电子(中国)研发中心 Video clipping method and video clipping device
CN109002857A (en) * 2018-07-23 2018-12-14 厦门大学 A kind of transformation of video style and automatic generation method and system based on deep learning
CN109121021A (en) * 2018-09-28 2019-01-01 北京周同科技有限公司 A kind of generation method of Video Roundup, device, electronic equipment and storage medium
CN109688463A (en) * 2018-12-27 2019-04-26 北京字节跳动网络技术有限公司 A kind of editing video generation method, device, terminal device and storage medium
CN109819179A (en) * 2019-03-21 2019-05-28 腾讯科技(深圳)有限公司 A kind of video clipping method and device
CN110019880A (en) * 2017-09-04 2019-07-16 优酷网络技术(北京)有限公司 Video clipping method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8285121B2 (en) * 2007-10-07 2012-10-09 Fall Front Wireless Ny, Llc Digital network-based video tagging system
EP2593884A2 (en) * 2010-07-13 2013-05-22 Motionpoint Corporation Dynamic language translation of web site content

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101390032A (en) * 2006-01-05 2009-03-18 眼点公司 System and methods for storing, editing, and sharing digital video
CN101901620A (en) * 2010-07-28 2010-12-01 复旦大学 Automatic generation method and edit method of video content index file and application
CN106233707A (en) * 2014-04-21 2016-12-14 微软技术许可有限责任公司 Alternatively make camera motion stylization
CN110019880A (en) * 2017-09-04 2019-07-16 优酷网络技术(北京)有限公司 Video clipping method and device
CN108769733A (en) * 2018-06-22 2018-11-06 三星电子(中国)研发中心 Video clipping method and video clipping device
CN109002857A (en) * 2018-07-23 2018-12-14 厦门大学 A kind of transformation of video style and automatic generation method and system based on deep learning
CN109121021A (en) * 2018-09-28 2019-01-01 北京周同科技有限公司 A kind of generation method of Video Roundup, device, electronic equipment and storage medium
CN109688463A (en) * 2018-12-27 2019-04-26 北京字节跳动网络技术有限公司 A kind of editing video generation method, device, terminal device and storage medium
CN109819179A (en) * 2019-03-21 2019-05-28 腾讯科技(深圳)有限公司 A kind of video clipping method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Xian-Sheng Hua, Lie Lu, Hong-Jiang Zhang, "Optimization-based automated home video editing system," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 5, 2004-05-04, full text *
Cheng Yuan, "Research on content-based movie video retrieval and highlight video clipping system," China Masters' Theses Full-text Database, 2007-06-15, full text *

Also Published As

Publication number Publication date
CN110381371A (en) 2019-10-25

Similar Documents

Publication Publication Date Title
CN110381371B (en) Video editing method and electronic equipment
CN109558512B (en) Audio-based personalized recommendation method and device and mobile terminal
CN109857905B (en) Video editing method and terminal equipment
CN110557683B (en) Video playing control method and electronic equipment
CN109561211B (en) Information display method and mobile terminal
CN112689201B (en) Barrage information identification method, barrage information display method, server and electronic equipment
CN110933511B (en) Video sharing method, electronic device and medium
CN108628985B (en) Photo album processing method and mobile terminal
CN108334196B (en) File processing method and mobile terminal
CN108646960B (en) File processing method and flexible screen terminal
CN111445927B (en) Audio processing method and electronic equipment
CN108460817B (en) Jigsaw puzzle method and mobile terminal
CN109495616B (en) Photographing method and terminal equipment
CN111491123A (en) Video background processing method and device and electronic equipment
CN109246474B (en) Video file editing method and mobile terminal
CN109753202B (en) Screen capturing method and mobile terminal
CN111491205B (en) Video processing method and device and electronic equipment
CN107728877B (en) Application recommendation method and mobile terminal
CN110544287B (en) Picture allocation processing method and electronic equipment
CN111143614A (en) Video display method and electronic equipment
CN108595107B (en) Interface content processing method and mobile terminal
CN111460180B (en) Information display method, information display device, electronic equipment and storage medium
CN109510897B (en) Expression picture management method and mobile terminal
CN108628534B (en) Character display method and mobile terminal
CN110941592A (en) Data management method and mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220822

Address after: 5 / F, building B, No. 25, Andemen street, Yuhuatai District, Nanjing City, Jiangsu Province, 210012

Patentee after: NANJING WEIWO SOFTWARE TECHNOLOGY CO.,LTD.

Address before: 523860 No. 283 BBK Avenue, Changan Town, Changan, Guangdong.

Patentee before: VIVO MOBILE COMMUNICATION Co.,Ltd.