CN111741331A - Video clip processing method, device, storage medium and equipment

Video clip processing method, device, storage medium and equipment

Info

Publication number: CN111741331A
Authority: CN (China)
Prior art keywords: packaging, template, processed, video, different
Legal status: Granted; currently Active
Application number: CN202010791005.6A
Other languages: Chinese (zh)
Other versions: CN111741331B
Inventor: 李磊
Current Assignee: Beijing Meishe Network Technology Co., Ltd.
Original Assignee: Beijing Meishe Network Technology Co., Ltd.
Priority and filing date: 2020-08-07
Application filed by Beijing Meishe Network Technology Co., Ltd.
Publication of CN111741331A: 2020-10-02; application granted; publication of CN111741331B: 2020-12-22

Classifications

    • H04N21/23418: Processing of video elementary streams, involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/2343: Processing of video elementary streams, involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234345: Reformatting performed only on part of the stream, e.g. a region of the image or a time segment
    • H04N21/44008: Processing of video elementary streams, involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/4402: Processing of video elementary streams, involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245: Reformatting performed only on part of the stream, e.g. a region of the image or a time segment
    • H04N21/47205: End-user interface for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • H04N21/8456: Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • G06F18/23: Pattern recognition; analysing; clustering techniques
    • G06V20/40: Scenes; scene-specific elements in video content

Abstract

The embodiments of the present application provide a video clip processing method, a video clip processing apparatus, a storage medium, and a device, belonging to the technical field of image processing. The method comprises the following steps: reading template resources, and acquiring the packaging elements in each template and the different priorities corresponding to the packaging elements; for each template, forming different packaging combinations of the packaging elements included in the template according to the different priorities of the packaging elements; matching a plurality of to-be-processed video clips with the different packaging combinations to obtain different scores, and selecting the highest of the scores as the score of the template where the packaging combination is located; comparing the scores among the templates, and selecting the target template with the highest score; and introducing the packaging elements included in the highest-scoring packaging combination on the target template, and modifying the plurality of to-be-processed video clips through an SDK interface. With the video clip processing method provided by the present application, video editing can be performed without manual operation, saving time and labor.

Description

Video clip processing method, device, storage medium and equipment
Technical Field
The embodiment of the application relates to the technical field of image processing, in particular to a video clip processing method, a video clip processing device, a storage medium and video clip processing equipment.
Background
Video clipping remixes a video with added materials such as pictures, background music, special effects and scenes, cuts and combines video sources, and generates, through secondary encoding, new videos with different expressive power.
Traditional video clipping is performed manually, and manual clipping is time-consuming, labor-intensive and slow to produce a finished piece.
Disclosure of Invention
The embodiments of the present application provide a video clip processing method, apparatus, storage medium and device, aiming to solve the problems that manual clipping is time-consuming and labor-intensive and that it produces finished pieces slowly.
A first aspect of an embodiment of the present application provides a video clip processing method, where the method includes:
reading template resources, and acquiring packaging elements in each template and different priorities corresponding to the packaging elements;
for each template, forming different packaging combinations of the packaging elements included in the template according to different priorities of the packaging elements;
matching a plurality of video clips to be processed with different packaging combinations to obtain different scores, and selecting the highest score in the scores as the score of the template where the packaging combination is located;
comparing scores among the templates, and selecting a target template with the highest score;
and introducing packaging elements included by the packaging combination with the highest score on the target template, and modifying the plurality of video segments to be processed by utilizing an SDK interface.
Optionally, the video segment to be processed is obtained by:
respectively identifying the picture contents of the multi-frame images of the video to be processed;
and clustering the multi-frame images according to the respective picture contents and the respective timestamps of the multi-frame images to obtain a plurality of video clips to be processed.
Optionally, clustering the multiple frames of images according to their respective picture contents and their respective timestamps to obtain multiple to-be-processed video clips, including:
identifying the confidence coefficient of the multi-frame image;
respectively comparing the confidence degrees of the identified multi-frame images with a first threshold value, and if the confidence degrees of the identified multi-frame images are smaller than the first threshold value, rejecting the frame images;
and clustering the frames of images left after the elimination according to the respective picture content and the respective time stamp of the frames of images left after the elimination to obtain a plurality of video clips to be processed.
Optionally, matching a plurality of video clips to be processed with different packaging combinations to obtain different scores, including:
acquiring weight coefficients of different picture contents in the same frame of image;
respectively comparing the weight coefficients of different image contents in the same frame of image with a second threshold, and if the weight coefficients are greater than the second threshold, taking the image contents with the weight coefficients greater than the second threshold as the main contents of the frame of image;
taking the main content of each frame of image of the video clip to be processed as the main content of the video clip to be processed;
respectively matching the main contents of the video clips to be processed with the same packaging combination to obtain different scores;
adding the different scores to obtain scores of the multiple video clips to be processed matched with the same package combination;
and sequentially and circularly matching the plurality of video clips to be processed with a plurality of packaging combinations, and summing the obtained scores to obtain different scores.
Optionally, introducing, on the target template, a packaging element included in a highest-scoring packaging combination, specifically including:
pre-setting a positioning mark of a packaging element required by the style of each template on each template;
and introducing the packaging elements included in the packaging combination with the highest score from the resource packaging library according to the positioning marks.
Optionally, the resource packaging library includes several sub-resource libraries;
introducing the packaging elements included in the packaging combination with the highest score from the resource packaging library according to the positioning mark, wherein the method comprises the following steps:
randomly calling packaging elements included by the packaging combination with the highest score from the plurality of sub-resource libraries according to the positioning marks.
A second aspect of the embodiments of the present application provides a video clip processing apparatus, where the apparatus includes a template reading module, a packaging combination module, a matching module, a comparison module, and a rendering module;
the reading module reads the template resources and acquires the packaging elements in each template and different priorities corresponding to the packaging elements;
the packaging combination module is used for forming different packaging combinations of the packaging elements included by each template according to different priorities of the packaging elements aiming at each template;
the matching module matches a plurality of video clips to be processed with different packaging combinations to obtain different scores, and selects the highest score in the scores as the score of the template where the packaging combination is located;
the comparison module compares scores among the templates and selects a target template with the highest score;
and the rendering module introduces packaging elements included by the packaging combination with the highest score on the target template, and modifies the video fragments to be processed by using an SDK interface.
A third aspect of embodiments of the present application provides a readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps in the method according to the first aspect of the present application.
A fourth aspect of the embodiments of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method according to the first aspect of the present application.
By adopting the video clip processing method provided by the application, the packaging elements in each template and different priorities corresponding to the packaging elements are obtained by reading the template resources; for each template, forming different packaging combinations of the packaging elements included in the template according to different priorities of the packaging elements; matching a plurality of video clips to be processed with different packaging combinations to obtain different scores, and selecting the highest score in the scores as the score of the template where the packaging combination is located; comparing scores among the templates, and selecting a target template with the highest score; and introducing packaging elements included by the packaging combination with the highest score on the target template, and modifying the plurality of video segments to be processed by utilizing an SDK interface. The video clip processing method provided by the embodiment does not need manual clipping, so that time and labor are saved, and in the process, the processing speed of the video clip is higher than that of manual clipping because the system clips the video.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments of the present application will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive exercise.
Fig. 1 is a block flow diagram of a video segment processing method according to an embodiment of the present application;
FIG. 2 is a block diagram of a process for obtaining a plurality of video clips to be processed according to another embodiment of the present application;
FIG. 3 is a block diagram of a process for image culling according to yet another embodiment of the present application;
FIG. 4 is a block diagram of a process for matching a plurality of pending video segments to a wrapped combination according to yet another embodiment of the present application;
FIG. 5 is a block flow diagram of introducing packaging elements as set forth in yet another embodiment of the present application;
fig. 6 is a block diagram of a video segment processing apparatus according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, a flowchart illustrating the steps of an embodiment of a video clip processing method of the present application is shown; the method may specifically include the following steps:
step 101, reading template resources, and obtaining packaging elements in each template and different priorities corresponding to the packaging elements.
The template resources are arranged in a template library. The template library defines a plurality of templates of different styles; each template carries various packaging elements, and each packaging element has several different priorities.
For example, the template library contains a fresh-style template that defines two packaging elements, sunlight and beach; the sunlight packaging element has candidate priorities of 70% and 30%, and the beach packaging element likewise has candidate priorities of 70% and 30%.
A template is not limited to two packaging elements; it may include one packaging element or more.
And 102, aiming at each template, forming different packaging combinations of the packaging elements included in the template according to different priorities of the packaging elements.
Here, the different packaging combinations on the same template are formed by combining the different packaging elements at their different priorities, yielding packaging combinations of different priorities.
For example, with the sunlight packaging element at priorities of 70% and 30% and the beach packaging element at priorities of 70% and 30%, combining the elements forms the following four packaging combinations:
Packaging combination 1: sunlight at 70% priority, beach at 70% priority;
Packaging combination 2: sunlight at 70% priority, beach at 30% priority;
Packaging combination 3: sunlight at 30% priority, beach at 70% priority;
Packaging combination 4: sunlight at 30% priority, beach at 30% priority.
In this step, one or more packaging elements may be included in one template, so that when the packaging elements are combined, various packaging combinations are formed.
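As a concrete illustration of step 102, the following minimal Python sketch enumerates the packaging combinations of one template as the Cartesian product of each packaging element's candidate priorities. The dict layout of `template` and the function name are assumptions made for illustration only; the patent does not prescribe a data format.

    from itertools import product

    # Illustrative template: each packaging element maps to its candidate
    # priorities (sunlight at 70% or 30%, beach at 70% or 30%).
    template = {"sunlight": [0.7, 0.3], "beach": [0.7, 0.3]}

    def packaging_combinations(template):
        """Yield every packaging combination as an element -> priority dict."""
        elements = list(template)
        for priorities in product(*(template[e] for e in elements)):
            yield dict(zip(elements, priorities))

    # Yields the four combinations listed above:
    # {'sunlight': 0.7, 'beach': 0.7}, {'sunlight': 0.7, 'beach': 0.3}, ...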
And 103, matching the multiple video clips to be processed with different packaging combinations to obtain different scores, and selecting the highest score in the scores as the score of the template where the packaging combination is located.
When the to-be-processed video clips are matched with the different packaging combinations, they are matched against packaging combinations 1 to 4 of step 102 in turn, yielding four different scores; the highest-scoring packaging combination among the four is selected, and its score serves as the score of the fresh-style template. The higher the priority matched by a to-be-processed video clip, the higher the corresponding score.
If the picture content of a to-be-processed video clip does not correspond to every packaging element in a packaging combination, only the priorities of the packaging elements that the clip matches are counted into the score; the priorities of the unmatched packaging elements are not counted.
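The matching rule just described can be sketched as follows, again under assumed data shapes: a clip is represented by the set of picture contents recognized in it, and only the priorities of matched packaging elements contribute to the score.

    def score_combination(clip_contents, combination):
        """Sum the priorities of the matched packaging elements only;
        unmatched elements contribute nothing to the score."""
        return sum(priority for element, priority in combination.items()
                   if element in clip_contents)

    # A beach-only clip scores 0.7 against {'sunlight': 0.7, 'beach': 0.7}:
    # sunlight is unmatched, so only the beach priority is counted.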
And 104, comparing scores among the templates, and selecting the target template with the highest score.
After the video clip to be processed is matched with each different packaging element defined by each template, the score of each template is obtained, wherein the score represents the matching degree of each template and the video clip to be processed, and the score is higher when the matching degree is higher; and after the scores of all the templates are compared, selecting a target template, wherein the target template is the template with the highest matching degree with the video clip to be processed.
In this step, since each template has a different score after matching with the video segment to be processed, the target template with the highest matching degree with the video segment to be processed can be selected from the templates, so that the slicing effect can be better after the video segment to be processed is modified by using the packaging element defined by the target template with the highest matching degree in the subsequent step 105.
In this step, a mapping relationship between to-be-processed video clips and templates may instead be preset in a database on the terminal; the matching between video clips and packaging combinations in step 103 is then omitted, step 104 is executed directly, and the target template is obtained directly from the to-be-processed video clips through the preset mapping relationship.
And 105, introducing packaging elements included in the packaging combination with the highest score into the target template, and modifying the video clips to be processed by using an SDK interface.
After the target template with the highest matching degree is selected, the packaging elements included in the packaging combination with the highest score on the target template are obtained, the packaging elements included in the packaging combination with the highest score are applied to the video clip to be processed through the SDK interface, and the assembly of the video clip to be processed and the packaging elements is completed.
In this step, if the target template is matched in a mapping relationship manner, since various packaging elements are defined on the target template, the various packaging elements are directly applied to the video clip to be processed.
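Step 105 can be sketched as below. The patent does not disclose the SDK's actual interface, so `sdk.apply` and its parameters are hypothetical placeholders that merely indicate the shape of the call.

    def render_clips(clips, best_combination, sdk):
        """Apply each packaging element of the highest-scoring combination
        to every to-be-processed clip via the editing SDK interface.
        `sdk.apply` is a hypothetical call, not the actual SDK API."""
        for clip in clips:
            for element, priority in best_combination.items():
                sdk.apply(clip, element, strength=priority)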
In the video clip processing method provided in the embodiment, manual clipping is not required, so that time and labor are saved, and in the process, since the system clips the video, the processing speed of the video clip is faster than that of manual clipping.
Referring to fig. 2, in step 103, before matching a plurality of to-be-processed video clips with different packaging combinations to obtain different scores, the plurality of to-be-processed video clips are obtained through the following steps:
step 201, respectively identifying the picture contents of the multi-frame images of the video to be processed.
The video to be processed is identified through an AI identification technology, and the respective picture content and the respective time stamp of the multi-frame image of the video to be processed are identified. The AI identification technology is an intelligent identification technology based on a neural network, and can acquire basic contents of a screen, wherein the basic contents of the screen specifically include a series of multi-dimensional screen information such as people, animals, landmark sites, timestamp information, and the like.
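The patent does not fix a data format for these recognition results; the record below is one assumption, and the later sketches in this description build on it.

    # Assumed shape of one AI-recognition result per frame (illustrative only):
    frame = {
        "timestamp": 23.0,                           # seconds into the video
        "contents": {"person": 0.6, "beach": 0.3},   # picture content -> weight coefficient
        "confidence": 0.92,                          # confidence of the recognition result
        "features": [0.1, 0.4, 0.2],                 # feature vector used for similarity
    }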
Step 202, clustering the multiple frames of images according to the respective picture contents and the respective timestamps of the multiple frames of images to obtain a plurality of video clips to be processed.
The multi-frame images are clustered into several groups according to a preset rule, and the frames in one group form one to-be-processed video clip. The span from the timestamp of the first frame to the timestamp of the last frame of each to-be-processed video clip is that clip's playing period. For example, if the timestamp of the first frame is 00:23 and the timestamp of the last frame is 00:25, the clip's playing period is 00:23 to 00:25.
The preset rule is as follows: calculate the similarity between the corresponding dimensions of two adjacent frames; when the similarity exceeds a third threshold, the two adjacent frames are determined to be similar images; after adjacent frames are compared in sequence, similar images are clustered into the same group.
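A minimal sketch of this preset rule, assuming the frame records shown under step 201 and using cosine similarity as the per-frame similarity measure (the patent does not specify which measure is used):

    import numpy as np

    def cosine_similarity(a, b):
        """Cosine similarity between two frame feature vectors."""
        a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def cluster_frames(frames, third_threshold):
        """Group consecutive frames whose similarity exceeds the third
        threshold into one to-be-processed video clip."""
        clips, current = [], [frames[0]]
        for prev, cur in zip(frames, frames[1:]):
            if cosine_similarity(prev["features"], cur["features"]) > third_threshold:
                current.append(cur)
            else:
                clips.append(current)
                current = [cur]
        clips.append(current)
        return clips

    # The playing period of a clip runs from its first frame's timestamp
    # to its last frame's timestamp, e.g. 00:23 to 00:25.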
Referring to fig. 3, after step 201 and before step 202, the following sub-steps are further included:
and a sub-step a1 of identifying a confidence level of the multi-frame image.
After AI identification is performed on the multi-frame images of the video to be processed, a confidence is identified for each frame; the confidence represents how reliable the recognition result for that frame is.
And a substep a2, comparing the confidences of the identified multiple frame images with a first threshold respectively, and rejecting the frame image if the confidences are smaller than the first threshold.
If the confidence of an identified frame is smaller than the first threshold, the recognition result for that frame is unreliable and the AI identification may be erroneous, so the system automatically removes that frame.
And a substep A3, clustering the frames of images left after being removed according to the respective picture content and the respective time stamp of the frames of images left after being removed, so as to obtain a plurality of video segments to be processed.
If the confidence of an identified frame is greater than the first threshold, the recognition result for that frame is reliable, and the system automatically clusters the high-confidence frames that remain after culling.
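Sub-steps A1 and A2 reduce to a single filtering pass; the sketch below assumes the same frame records as before.

    def cull_low_confidence(frames, first_threshold):
        """Reject frames whose recognition confidence is below the first
        threshold; the remaining frames proceed to clustering."""
        return [f for f in frames if f["confidence"] >= first_threshold]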
In this embodiment, the multi-frame images obtained by AI identification of the video to be processed may contain misidentified frames; culling the frames whose confidence is below the first threshold ensures the accuracy of the image input before clustering. After clustering, frames whose similarity exceeds the third threshold are grouped into one to-be-processed video clip, and frames belonging to different clips have low mutual similarity. Consequently, in the subsequent matching against packaging combinations, only each to-be-processed video clip needs to be matched, rather than every frame within it, which improves the running speed of the system to a certain extent.
Referring to fig. 4, in step 103, matching a plurality of video segments to be processed with different package combinations to obtain different scores, and selecting the highest score among the scores as the score of the template where the package combination is located, specifically includes the following steps:
step 301, obtaining the weight coefficients of different picture contents in the same frame of image.
After AI identification, the weight coefficients corresponding to different picture contents in the same frame of image can be identified.
Step 302, comparing the weight coefficients of the different image contents in the same frame image with a second threshold, and if the weight coefficients are greater than the second threshold, taking the image contents with the weight coefficients greater than the second threshold as the main contents of the frame image.
After the weight coefficients of the different picture contents in the same frame are identified, each weight coefficient is compared with a second threshold; a picture content whose weight coefficient exceeds the second threshold is of high importance and is taken as main content of that frame. In this step, the number of picture contents whose weight coefficients exceed the second threshold may be one or more.
For example, the AI recognizes that a cat, a dog, and a person are included in the same frame image, and if the weight coefficient of the person is larger than the weight coefficients of the cat and the dog, the AI takes the person as the main content of the frame image.
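A sketch of steps 301 and 302 under the same assumed frame record; the function name and the threshold value are illustrative:

    def frame_main_contents(frame, second_threshold):
        """Keep only picture contents whose weight coefficient exceeds the
        second threshold; they become the frame's main content."""
        return {content for content, weight in frame["contents"].items()
                if weight > second_threshold}

    # A frame recognized as {"person": 0.6, "cat": 0.2, "dog": 0.2} with a
    # second threshold of 0.5 yields {"person"} as main content.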
Step 303, using the main content of each frame image of the video clip to be processed as the main content of the video clip to be processed.
As stated in step 202, the frames within the same to-be-processed video clip are highly similar, so the main contents of those frames are likewise highly similar; on that basis, the shared main content of the similar frames is taken directly as the main content of the to-be-processed video clip.
And 304, respectively matching the main contents of the video clips to be processed with the same packaging combination to obtain different scores.
After a video to be processed is divided into a plurality of video segments to be processed, the similarity of the main picture contents displayed by different video segments to be processed is low, so that different scores can be obtained when the main picture contents are respectively matched with the same packaging combination.
And 305, summing the different scores to obtain the scores of the multiple video clips to be processed matched with the same package combination.
After a complete video to be processed is divided into several to-be-processed video clips, matching those clips against one packaging combination on a template yields several scores; adding them gives the score between the complete video and that packaging combination. Cycling through the remaining packaging combinations on the same template in turn yields the different scores between the complete video and the different packaging combinations.
And step 306, sequentially and circularly matching the plurality of video clips to be processed with a plurality of packaging combinations, and summing the obtained scores to obtain different scores.
Step 305 matches the video to be processed against a single packaging combination; since the same template holds several packaging combinations, all of them must be cycled through to obtain the scores between the same video and every packaging combination on that template.
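Steps 303 to 306 can be combined with the earlier sketches (`packaging_combinations` and `score_combination`) into the loop below, which sums each combination's score over all clips, keeps the best combination per template, and selects the target template. The structure is an illustration under the stated assumptions, not the patent's implementation.

    def select_target_template(clips_main_contents, templates):
        """clips_main_contents: one set of main contents per clip;
        templates: template name -> {element: [candidate priorities]}.
        Returns (target template name, best combination, score)."""
        best = None
        for name, template in templates.items():
            for combo in packaging_combinations(template):
                total = sum(score_combination(contents, combo)
                            for contents in clips_main_contents)
                if best is None or total > best[2]:
                    best = (name, combo, total)
        return best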
In this embodiment, after a video to be processed is divided into several to-be-processed video clips, the clips are matched against the packaging combinations on the same template, and the highest of the resulting scores is selected as the matching score between the video and that template. Because different templates define different styles (such as fresh, dark or cool styles), a score is obtained between the video and each template; the highest-scoring template is selected as the target template, and the highest-scoring packaging combination is used to render each to-be-processed video clip on the target template, completing the assembly of the packaging elements with the video to be processed.
Because the target template defines a style, once the whole video to be processed has been rendered, the style defined on the target template is added to it; for example, the whole video takes on the fresh style.
Referring to fig. 5, in step 105, the step of introducing the packaging elements included in the highest-scoring packaging combination to the target template specifically includes the following steps:
step 401, positioning marks of the packaging elements required by the style of each template are preset on each template.
Each template is predefined with positioning marks for the pre-designed packaging elements its style requires; each positioning mark denotes a different type of packaging element.
And 402, introducing packaging elements included in the packaging combination with the highest score from the resource packaging library according to the positioning marks.
And the SDK interface introduces the packaging elements included in the packaging combination with the highest score from the resource packaging library according to the positioning marks marked in advance on the template.
When step 402 is executed, the method specifically includes: randomly calling packaging elements included by the packaging combination with the highest score from the plurality of sub-resource libraries according to the positioning marks.
The resource packaging library contains several types of packaging resources, such as filters, subtitles and music. Each packaging resource defines sub-resource libraries for different picture contents, and each sub-resource library includes different packaging elements. For example, within the filter packaging resource, a sub-resource library applicable to beach picture content is predefined, and its packaging elements include a yellow filter, a blue filter and the like; within the music packaging resource, the sub-resource library applicable to beach includes packaging elements such as happy music and relaxed music.
In this embodiment, randomly calling the packaging resources in the sub-resource libraries produces finished pieces with different effects. For example, when the SDK needs to call up beach music, it picks at random from the sub-resource library: it may call up happy music, relaxed music, or other music suitable for a beach. Because the music called by the SDK differs from run to run while remaining suitable for the to-be-processed video clip, the SDK may apply happy music on one run and relaxed music on another, so that the same input video can be output as finished pieces with different effects.
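A sketch of the random call of steps 401 and 402; the library layout, the positioning-mark keys and the element names are assumptions chosen to mirror the beach example above.

    import random

    # Hypothetical resource packaging library: each packaging resource type
    # maps a positioning mark (e.g. "beach") to a sub-resource library of
    # interchangeable packaging elements.
    resource_library = {
        "filter": {"beach": ["yellow filter", "blue filter"]},
        "music":  {"beach": ["happy music", "relaxed music"]},
    }

    def call_element(resource_type, positioning_mark):
        """Randomly pick one packaging element from the sub-resource library
        indicated by the positioning mark, so repeated runs can output
        finished pieces with different effects."""
        return random.choice(resource_library[resource_type][positioning_mark])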
Fig. 6 is a block diagram of a video clip processing apparatus of the present application.
Referring to fig. 6, the apparatus includes a template reading module, a packaging combination module, a matching module, a comparison module, and a rendering module;
the reading module reads the template resources and acquires the packaging elements in each template and different priorities corresponding to the packaging elements;
the packaging combination module is used for forming different packaging combinations of the packaging elements included by each template according to different priorities of the packaging elements aiming at each template;
the matching module matches a plurality of video clips to be processed with different packaging combinations to obtain different scores, and selects the highest score in the scores as the score of the template where the packaging combination is located;
the comparison module compares scores among the templates and selects a target template with the highest score;
and the rendering module introduces packaging elements included by the packaging combination with the highest score on the target template, and modifies the video fragments to be processed by using an SDK interface.
The apparatus further comprises a to-be-processed video clip acquisition module, which includes an image identification module and a clustering module;
the image identification module respectively identifies the picture contents of multi-frame images of the video to be processed;
and the clustering module clusters the multi-frame images according to the respective picture contents and the respective timestamps of the multi-frame images to obtain a plurality of video clips to be processed.
Further, the clustering module comprises an image rejection module;
the image culling module comprises:
identifying the confidence coefficient of the multi-frame image;
respectively comparing the confidence degrees of the identified multi-frame images with a first threshold value, and if the confidence degrees of the identified multi-frame images are smaller than the first threshold value, rejecting the frame images;
and clustering the frames of images left after the elimination according to the respective picture content and the respective time stamp of the frames of images left after the elimination to obtain a plurality of video clips to be processed.
Further, the matching module comprises a weight comparison module and a sub-matching module;
the weight comparison module comprises: acquiring weight coefficients of different picture contents in the same frame of image;
respectively comparing the weight coefficients of different image contents in the same frame of image with a second threshold, and if the weight coefficients are greater than the second threshold, taking the image contents with the weight coefficients greater than the second threshold as the main contents of the frame of image;
the sub-matching module includes: taking the main content of each frame of image of the video clip to be processed as the main content of the video clip to be processed;
respectively matching the main contents of the video clips to be processed with the same packaging combination to obtain different scores;
adding the different scores to obtain scores of the multiple video clips to be processed matched with the same package combination;
and sequentially and circularly matching the plurality of video clips to be processed with the plurality of packaging combinations, and summing the obtained scores to obtain different scores.
Further, the rendering module comprises a positioning mark module;
the positioning mark module is provided with a positioning mark of a packaging element required by the style of each template in advance;
and introducing the packaging elements included in the packaging combination with the highest score from the resource packaging library according to the positioning marks.
Further, the resource packaging library comprises a plurality of sub-resource libraries; the positioning marking module comprises a random calling module;
and the random calling module randomly calls the packaging elements included in the packaging combination with the highest score from the plurality of sub-resource libraries according to the positioning marks.
Based on the same inventive concept, another embodiment of the present application provides a readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the video clip processing method according to any of the above embodiments of the present application.
Based on the same inventive concept, another embodiment of the present application provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the program, the steps of the video clip processing method according to any of the above embodiments of the present application are implemented.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The foregoing describes in detail a video clip processing method, apparatus, storage medium, and device provided by the present application, and specific examples are applied herein to illustrate the principles and implementations of the present application, and the descriptions of the foregoing examples are only used to help understand the method and core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (9)

1. A method for processing video segments, the method comprising:
reading template resources, and acquiring packaging elements in each template and different priorities corresponding to the packaging elements;
for each template, forming different packaging combinations of the packaging elements included in the template according to different priorities of the packaging elements;
matching a plurality of video clips to be processed with different packaging combinations to obtain different scores, and selecting the highest score in the scores as the score of the template where the packaging combination is located;
comparing scores among the templates, and selecting a target template with the highest score;
and introducing packaging elements included by the packaging combination with the highest score on the target template, and modifying the plurality of video segments to be processed by utilizing an SDK interface.
2. The method according to claim 1, wherein the video segment to be processed is obtained by:
respectively identifying the picture contents of the multi-frame images of the video to be processed;
and clustering the multi-frame images according to the respective picture contents and the respective timestamps of the multi-frame images to obtain a plurality of video clips to be processed.
3. The method according to claim 2, wherein clustering the multiple frames of images according to their respective picture contents and their respective timestamps to obtain a plurality of video clips to be processed comprises:
identifying the confidence coefficient of the multi-frame image;
respectively comparing the confidence degrees of the identified multi-frame images with a first threshold value, and if the confidence degrees of the identified multi-frame images are smaller than the first threshold value, rejecting the frame images;
and clustering the frames of images left after the elimination according to the respective picture content and the respective time stamp of the frames of images left after the elimination to obtain a plurality of video clips to be processed.
4. The method of claim 1, wherein matching a plurality of video clips to be processed with different packaging combinations to obtain different scores comprises:
acquiring weight coefficients of different picture contents in the same frame of image;
respectively comparing the weight coefficients of different image contents in the same frame of image with a second threshold, and if the weight coefficients are greater than the second threshold, taking the image contents with the weight coefficients greater than the second threshold as the main contents of the frame of image;
taking the main content of each frame of image of the video clip to be processed as the main content of the video clip to be processed;
respectively matching the main contents of the video clips to be processed with the same packaging combination to obtain different scores;
adding the different scores to obtain scores of the multiple video clips to be processed matched with the same package combination;
and sequentially and circularly matching the plurality of video clips to be processed with a plurality of packaging combinations, and summing the obtained scores to obtain different scores.
5. The method according to claim 1, wherein introducing the packaging elements included in the highest-scoring packaging combination on the target template specifically comprises:
pre-setting a positioning mark of a packaging element required by the style of each template on each template;
and introducing the packaging elements included in the packaging combination with the highest score from the resource packaging library according to the positioning marks.
6. The method of claim 5, wherein the resource wrapper library comprises a plurality of sub-resource libraries;
introducing the packaging elements included in the packaging combination with the highest score from the resource packaging library according to the positioning mark, wherein the method comprises the following steps:
randomly calling packaging elements included by the packaging combination with the highest score from the plurality of sub-resource libraries according to the positioning marks.
7. A video clip processing apparatus, characterized by comprising a template reading module, a packaging combination module, a matching module, a comparison module and a rendering module;
the reading module reads the template resources and acquires the packaging elements in each template and different priorities corresponding to the packaging elements;
the packaging combination module is used for forming different packaging combinations of the packaging elements included by each template according to different priorities of the packaging elements aiming at each template;
the matching module matches a plurality of video clips to be processed with different packaging combinations to obtain different scores, and selects the highest score in the scores as the score of the template where the packaging combination is located;
the comparison module compares scores among the templates and selects a target template with the highest score;
and the rendering module introduces packaging elements included by the packaging combination with the highest score on the target template, and modifies the video fragments to be processed by using an SDK interface.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 6 are implemented when the computer program is executed by the processor.
CN202010791005.6A 2020-08-07 2020-08-07 Video clip processing method, device, storage medium and equipment Active CN111741331B (en)

Priority Applications (1)

Application Number: CN202010791005.6A; Priority Date: 2020-08-07; Filing Date: 2020-08-07; Title: Video clip processing method, device, storage medium and equipment

Publications (2)

Publication Number Publication Date
CN111741331A (en) 2020-10-02
CN111741331B (en) 2020-12-22

Family

ID=72658251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010791005.6A Active CN111741331B (en) 2020-08-07 2020-08-07 Video clip processing method, device, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN111741331B (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008042895A (en) * 2006-08-02 2008-02-21 Fuji Xerox Co Ltd Method for clustering plurality of videos, apparatus, system, and program related thereto
US20140099023A1 (en) * 2012-10-05 2014-04-10 National Applied Research Laboratories Search method for video clip
CN102982572A (en) * 2012-10-31 2013-03-20 北京百度网讯科技有限公司 Intelligent image editing method and device thereof
CN103402100A (en) * 2013-08-23 2013-11-20 北京奇艺世纪科技有限公司 Video processing method and mobile terminal
US20180089204A1 (en) * 2016-09-29 2018-03-29 British Broadcasting Corporation Video Search System & Method
CN106534967A (en) * 2016-10-25 2017-03-22 司马大大(北京)智能系统有限公司 Video editing method and device
CN107436921A (en) * 2017-07-03 2017-12-05 李洪海 Video data handling procedure, device, equipment and storage medium
CN109426658A (en) * 2017-09-01 2019-03-05 奥多比公司 Document beautification is carried out using the intelligent characteristic suggestion based on text analyzing
CN108062760A (en) * 2017-12-08 2018-05-22 广州市百果园信息技术有限公司 Video editing method, device and intelligent mobile terminal
CN109996011A (en) * 2017-12-29 2019-07-09 深圳市优必选科技有限公司 Video clipping device and method
CN108391063A (en) * 2018-02-11 2018-08-10 北京秀眼科技有限公司 Video clipping method and device
CN109819179A (en) * 2019-03-21 2019-05-28 腾讯科技(深圳)有限公司 A kind of video clipping method and device
CN110139149A (en) * 2019-06-21 2019-08-16 上海摩象网络科技有限公司 A kind of video optimized method, apparatus, electronic equipment
CN110139159A (en) * 2019-06-21 2019-08-16 上海摩象网络科技有限公司 Processing method, device and the storage medium of video material
CN111013150A (en) * 2019-12-09 2020-04-17 腾讯科技(深圳)有限公司 Game video editing method, device, equipment and storage medium
CN111105819A (en) * 2019-12-13 2020-05-05 北京达佳互联信息技术有限公司 Clipping template recommendation method and device, electronic equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114390367A (en) * 2020-10-16 2022-04-22 上海哔哩哔哩科技有限公司 Audio and video processing method and device
CN113111222A (en) * 2021-03-26 2021-07-13 北京达佳互联信息技术有限公司 Method and device for generating short video template, server and storage medium
CN113111222B (en) * 2021-03-26 2024-03-19 北京达佳互联信息技术有限公司 Short video template generation method, device, server and storage medium
CN115269889A (en) * 2021-04-30 2022-11-01 北京字跳网络技术有限公司 Clipping template searching method and device

Also Published As

Publication number Publication date
CN111741331B (en) 2020-12-22

Similar Documents

Publication Publication Date Title
CN111741331B (en) Video clip processing method, device, storage medium and equipment
CN111866585B (en) Video processing method and device
CN109740670B (en) Video classification method and device
US8032539B2 (en) Method and apparatus for semantic assisted rating of multimedia content
CN109756751B (en) Multimedia data processing method and device, electronic equipment and storage medium
CN111683209A (en) Mixed-cut video generation method and device, electronic equipment and computer-readable storage medium
CN109828993B (en) Statistical data query method and device
US20190347294A1 (en) Retrieval method and device for judgment documents
US20110052086A1 (en) Electronic Apparatus and Image Processing Method
US11531839B2 (en) Label assigning device, label assigning method, and computer program product
WO2020009777A1 (en) Production of modified image inventories
CN111488813A (en) Video emotion marking method and device, electronic equipment and storage medium
CN110851675A (en) Data extraction method, device and medium
CN112749299A (en) Method and device for determining video type, electronic equipment and readable storage medium
CN107369450B (en) Recording method and recording apparatus
WO2021171099A2 (en) Method for atomically tracking and storing video segments in multi-segment audio-video compositions
CN110532773B (en) Malicious access behavior identification method, data processing method, device and equipment
US10289915B1 (en) Manufacture of image inventories
CN112528073A (en) Video generation method and device
CN115049963A (en) Video classification method and device, processor and electronic equipment
CN114037889A (en) Image identification method and device, electronic equipment and storage medium
CN108600864B (en) Movie preview generation method and device
EP3113069A1 (en) Method and apparatus for deriving a feature point based image similarity measure
CN111797765A (en) Image processing method, image processing apparatus, server, and storage medium
CN110674720A (en) Picture identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant