CN113408461A - Method and device for extracting highlight segments, computer equipment and storage medium - Google Patents

Method and device for extracting highlight segments, computer equipment and storage medium

Info

Publication number
CN113408461A
Authority
CN
China
Prior art keywords: highlight, segments, candidate, frame, video data
Prior art date
Legal status
Granted
Application number
CN202110731863.6A
Other languages
Chinese (zh)
Other versions
CN113408461B (en)
Inventor
焦小珍
Current Assignee
Shenzhen Wondershare Software Co Ltd
Original Assignee
Shenzhen Sibo Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Shenzhen Sibo Technology Co., Ltd.
Priority to CN202110731863.6A
Publication of CN113408461A
Application granted
Publication of CN113408461B
Legal status: Active

Abstract

The invention discloses a method and device for extracting highlight segments, computer equipment and a storage medium. Video data is read and preprocessed to obtain a plurality of candidate sub-segments; the candidate sub-segments undergo screening, clustering and knapsack strategy processing to obtain a plurality of candidate highlight segments. When a fixed-duration highlight compilation is extracted from one piece of video data, the candidate highlight segments are ranked, a plurality of target highlight segments are selected and processed, and the highlight compilation is output. When a fixed-duration highlight compilation is extracted from a plurality of pieces of video data with differing frame rates, the candidate highlight segments of the video data are partitioned, converted, deduplicated and sorted to obtain a plurality of target highlight segments, and the fixed-duration highlight compilation is output. The method effectively extracts the highlight segments in the video data and generates a fixed-duration highlight compilation, with high extraction efficiency and good results.

Description

Method and device for extracting highlight segments, computer equipment and storage medium
Technical Field
The invention relates to the field of video data processing, and in particular to a method and device for extracting highlight segments, computer equipment and a storage medium.
Background
With the rapid development of information technology, waves of new information appear every day, among which video is a particularly prominent carrier, creating a demand for video understanding. With the rise of short video, short and well-edited video content undoubtedly offers the greatest convenience to today's viewers.
In the related art, supervised and unsupervised methods are often used to extract highlights from a video and generate a short highlight compilation. However, supervised methods have low extraction efficiency, while unsupervised methods are efficient but their extraction quality rarely meets expectations.
Disclosure of Invention
The invention aims to provide a method and device for extracting highlight segments, computer equipment and a storage medium, so as to solve the prior-art problems of low extraction efficiency and poor extraction quality when highlight segments are extracted from a video to generate a highlight compilation.
In order to solve the above technical problems, the invention adopts the following technical scheme: a highlight extraction method comprising:
reading video data and preprocessing the video data to obtain a plurality of candidate sub-segments;
performing screening, clustering and knapsack strategy processing on the candidate sub-segments to obtain a plurality of candidate highlight segments;
when a fixed-duration highlight compilation is extracted from one piece of video data, ranking the candidate highlight segments, selecting a plurality of target highlight segments, processing the selected target highlight segments according to a preset frame length with a trim-or-pad strategy (trimming excess frames, padding missing frames), and outputting the highlight compilation;
when a fixed-duration highlight compilation is extracted from a plurality of pieces of video data with differing frame rates, partitioning, converting, deduplicating and sorting the candidate highlight segments of the video data to obtain a plurality of target highlight segments, and outputting the highlight compilation.
In addition, an object of the present invention is to provide a highlight extracting apparatus, comprising:
a preprocessing unit, configured to read video data and preprocess the video data to obtain a plurality of candidate sub-segments;
an obtaining unit, configured to perform screening, clustering and knapsack strategy processing on the candidate sub-segments to obtain a plurality of candidate highlight segments;
a first extraction unit, configured to, when a fixed-duration highlight compilation is extracted from one piece of video data, rank the candidate highlight segments, select a plurality of target highlight segments, process the selected target highlight segments with the trim-or-pad strategy according to a preset frame length, and output the highlight compilation;
and a second extraction unit, configured to, when a fixed-duration highlight compilation is extracted from a plurality of pieces of video data with differing frame rates, partition, convert, deduplicate and sort the candidate highlight segments of the plurality of pieces of video data to obtain a plurality of target highlight segments, and output the highlight compilation.
In addition, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the highlight extracting method according to the first aspect when executing the computer program.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to execute the highlight extracting method according to the first aspect.
The embodiment of the invention discloses a method and device for extracting highlight segments, computer equipment and a storage medium. Video data is read and preprocessed to obtain a plurality of candidate sub-segments; the candidate sub-segments undergo screening, clustering and knapsack strategy processing to obtain a plurality of candidate highlight segments. When a fixed-duration highlight compilation is extracted from one piece of video data, the candidate highlight segments are ranked, a plurality of target highlight segments are selected and processed, and the highlight compilation is output. When a fixed-duration highlight compilation is extracted from a plurality of pieces of video data with differing frame rates, the candidate highlight segments of the video data are partitioned, converted, deduplicated and sorted to obtain a plurality of target highlight segments, and the fixed-duration highlight compilation is output. The embodiment of the invention effectively extracts the highlights in the video data and generates a fixed-duration highlight compilation, with high extraction efficiency and good results.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic flow chart of a method for extracting highlights provided by an embodiment of the invention;
FIG. 2 is a schematic view of a sub-flow of the method for extracting highlights according to an embodiment of the invention;
FIG. 3 is a schematic view of another sub-flow of the method for extracting highlights according to the embodiment of the invention;
FIG. 4 is a schematic view of another sub-flow of the method for extracting highlights according to the embodiment of the invention;
FIG. 5 is a schematic view of another sub-flow of the method for extracting highlights according to the embodiment of the invention;
FIG. 6 is a schematic view of another sub-flow of the method for extracting highlights according to the embodiment of the invention;
FIG. 7 is a schematic view of another sub-flow of the method for extracting highlights according to the embodiment of the invention;
FIG. 8 is a schematic block diagram of a highlight extracting apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to fig. 1, fig. 1 is a schematic flow chart of a method for extracting highlight segments according to an embodiment of the present invention.
as shown in fig. 1, the method includes steps S101 to S104.
S101, reading video data and preprocessing the video data to obtain a plurality of candidate sub-segments.
Specifically, as shown in fig. 2, step S101 includes:
S201, reading video data, and acquiring the image color, image texture features, image quality, frame difference value and edge change rate value of each frame in the video data;
In this step, the image color, image texture features, image quality, frame difference value and edge change rate value of each frame are obtained while the video data is being read. It should be noted that during reading, the image quality of a frame is computed as soon as that frame is read; the previous frame, current frame and next frame are read, and the average of the frame difference between the previous and current frames and the frame difference between the current and next frames is taken as the frame difference value of the current frame; and the edge change rate value of two consecutive frames is computed as soon as they are read. Compared with the original approach of acquiring image color, image texture features, image quality, frame difference values and edge change rate values only after all frames have been read, this embodiment only needs to cache three frames (and two frames) of images at a time, which reduces resource usage; for a video several hours long, caching all frames as in the original method would very likely stall the program. It should also be noted that the frame difference values and edge change rate values acquired in this embodiment are stored in an image data format.
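A minimal sketch of this streaming computation follows, assuming OpenCV-style BGR frames read one at a time; the concrete quality, frame-difference and edge-change measures (image_quality, frame_diff, edge_change_rate) are illustrative stand-ins, not the patent's exact definitions.

```python
import cv2
import numpy as np

def image_quality(frame):
    # Stand-in quality measure: variance of the Laplacian (a sharpness proxy).
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def frame_diff(a, b):
    # Stand-in frame difference: mean absolute per-pixel difference.
    return float(np.mean(cv2.absdiff(a, b)))

def edge_change_rate(a, b):
    # Stand-in edge change rate: fraction of edge pixels that differ.
    ea = cv2.Canny(cv2.cvtColor(a, cv2.COLOR_BGR2GRAY), 100, 200) > 0
    eb = cv2.Canny(cv2.cvtColor(b, cv2.COLOR_BGR2GRAY), 100, 200) > 0
    return float(np.mean(ea ^ eb))

def stream_metrics(path):
    """Yield (index, quality, frame_diff, edge_change_rate) per frame while
    caching at most three frames, as the description requires."""
    cap = cv2.VideoCapture(path)
    window, idx = [], 0
    ok, frame = cap.read()
    while ok:
        window.append(frame)
        if len(window) == 3:
            prev, cur, nxt = window
            # Current frame's value = average of diff(prev, cur) and diff(cur, nxt).
            fd = (frame_diff(prev, cur) + frame_diff(cur, nxt)) / 2.0
            yield idx - 1, image_quality(cur), fd, edge_change_rate(cur, nxt)
            window.pop(0)   # drop the oldest frame; the cache never exceeds 3
        ok, frame = cap.read()
        idx += 1
    cap.release()
```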
S202, setting a first filtering proportion according to the frame length of video data, and filtering frames with low image quality according to the first filtering proportion;
In this step, all frames are sorted in ascending order of image quality, and the lowest-quality frames are filtered out at the first filtering ratio. The first filtering ratio is set according to the number of frames of the video data; for example, a filtering ratio of 10% is used for video data with fewer than 30000 frames, and 12% for video with more than 30000 frames.
S203, setting a second filtering proportion, and filtering the frames with large frame difference values and edge change rate values according to the second filtering proportion;
In this step, all frames are sorted in ascending order of frame difference and edge change rate, and the frames with the largest frame differences and edge change rates are filtered out at the second filtering ratio, which may be 10%.
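A sketch of the two ratio-based passes, assuming per-frame metric tuples like those produced above; combining the frame-difference and edge-change ranks lexicographically in the second pass is one reading of the description, not a confirmed detail.

```python
def filter_frames(metrics):
    """metrics: list of (idx, quality, frame_diff, edge_change_rate) tuples."""
    n = len(metrics)
    # First pass: drop the lowest-quality frames (10% below 30000 frames, else 12%).
    ratio1 = 0.10 if n < 30000 else 0.12
    by_quality = sorted(metrics, key=lambda m: m[1])        # ascending quality
    keep = set(m[0] for m in by_quality[int(n * ratio1):])  # discard the bottom
    metrics = [m for m in metrics if m[0] in keep]

    # Second pass: drop the frames with the largest frame difference and
    # edge change rate (combined here by rank), at a ratio of 10%.
    by_motion = sorted(metrics, key=lambda m: (m[2], m[3])) # ascending
    cut = int(len(by_motion) * 0.90)
    drop = set(m[0] for m in by_motion[cut:])               # discard the top 10%
    return [m for m in metrics if m[0] not in drop]
```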
S204, concatenating the image color features and the image texture features to obtain image color-texture features;
S205, performing scene segmentation on the video data, setting a longest frame length and a shortest frame length for each scene to limit the segment duration of each scene, and outputting candidate sub-segments, wherein one scene outputs one or more candidate sub-segments;
In this step, because a piece of video usually contains multiple scenes, the video must first be segmented into scenes. For each scene, a longest frame length and a shortest frame length are set to limit the duration of the segments within that scene, so that each scene can be split into one or more candidate sub-segments. Specifically, different longest and shortest frame lengths can be set for videos of different lengths: for a video shorter than half an hour, the division duration can be set to 5 seconds, the corresponding shortest frame length being the frame length of 5 seconds and the longest frame length being three times the shortest; for a video longer than half an hour, the division duration is set to 10 seconds, the corresponding shortest frame length being the frame length of 10 seconds and the longest frame length being three times the shortest.
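The following sketch splits one detected scene into candidate sub-segments under these bounds; scene detection itself is assumed to have happened upstream, and the handling of tails shorter than the minimum (absorbed when possible, otherwise dropped) is an assumption of this sketch.

```python
def split_scene(start, end, fps, video_seconds):
    """Split one scene [start, end) into candidate sub-segments."""
    base = 5 if video_seconds < 1800 else 10   # division duration in seconds
    min_len = base * fps                       # shortest frame length
    max_len = 3 * min_len                      # longest frame length
    segments, pos = [], start
    while end - pos >= min_len:
        seg_end = min(pos + max_len, end)
        # Absorb a tail shorter than min_len when the remainder still fits.
        if end - seg_end < min_len and end - pos <= max_len:
            seg_end = end
        segments.append((pos, seg_end))
        pos = seg_end
    return segments   # one or more candidate sub-segments per (long enough) scene
```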
S206, extracting the highlight covers of the candidate sub-segments through clustering.
This step mainly extracts the highlight covers: for each candidate sub-segment, the N best-quality frames are obtained, the most static frames among them are selected, and similar redundant frames are filtered out, finally yielding several key frames of high quality, high static value and no redundancy, which serve as the highlight covers.
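A sketch of one plausible cover-selection pass, assuming the per-frame qualities and frame differences computed earlier; the histogram-intersection similarity test and its threshold are illustrative choices, not the patent's.

```python
import cv2
import numpy as np

def color_hist(frame):
    h = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                     [0, 256, 0, 256, 0, 256]).flatten()
    return h / (h.sum() + 1e-9)

def extract_covers(frames, qualities, frame_diffs, n_best=10, n_covers=3, sim=0.92):
    # N best-quality frames, then the most static (lowest frame diff) first.
    idx = sorted(np.argsort(qualities)[::-1][:n_best], key=lambda i: frame_diffs[i])
    covers, hists = [], []
    for i in idx:
        h = color_hist(frames[i])
        # Histogram intersection as a similarity proxy; skip redundant frames.
        if all(np.minimum(h, h2).sum() < sim for h2 in hists):
            covers.append(i)
            hists.append(h)
        if len(covers) == n_covers:
            break
    return covers   # key frames: high quality, static, non-redundant
```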
S102, screening, clustering and knapsack strategy processing are carried out on the candidate sub-segments, and a plurality of candidate highlight segments are obtained.
Specifically, as shown in fig. 3, step S102 includes:
S301, setting a frame length threshold, a frame difference threshold and an edge change rate threshold for each candidate sub-segment;
In this step, the frame difference mean and the frame difference median of each candidate sub-segment are computed, and the smaller of the two is taken as that candidate sub-segment's frame difference threshold; likewise, the edge change rate mean and the edge change rate median of each candidate sub-segment are computed, and the smaller value is taken as its edge change rate threshold.
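Expressed directly in code, a small sketch under the stated rule (the smaller of mean and median, per segment):

```python
import numpy as np

def segment_thresholds(frame_diffs, ecrs):
    """Per-segment thresholds: the smaller of mean and median, as described."""
    fd_thresh = min(np.mean(frame_diffs), np.median(frame_diffs))
    ecr_thresh = min(np.mean(ecrs), np.median(ecrs))
    return fd_thresh, ecr_thresh
```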
S302, screening out candidate highlight segments from the candidate sub-segments according to the frame length threshold, the frame difference threshold and the edge change rate threshold;
In this step, based on the obtained candidate sub-segments, the frame length threshold, frame difference threshold and edge change rate threshold set for each candidate sub-segment are used in a three-layer screening, so that videos of different highlight levels are screened differently. The goal is to screen out as many highlight segments as possible within the preset segment count; if uniform parameters were set, non-highlight segments would also be selected, so that during the subsequent knapsack strategy processing the limited capacity would be filled with too many non-highlight segments, lowering the overall highlight level.
Specifically, the first-layer screening mainly targets videos rich in motion: candidate sub-segments exceeding the frame length threshold, frame difference threshold and edge change rate threshold are screened out, which can yield segments that contain large-area occlusion yet are themselves highly highlight-worthy. A large-area occlusion segment is one in which a large portion of the frame may be occluded; for example, when selecting weapons or viewing rankings in a game video, a large attribute page occludes the frame. If the candidate sub-segments passing the first layer exceed the preset segment count but number fewer than twice the preset count, screening ends; if they exceed twice the preset count, the second-layer screening is entered.
Second-layer screening: on the basis of the candidate sub-segments obtained from the first layer, screening is performed again, this time only against the frame length threshold and the frame difference threshold. The edge change rate threshold is not used, because relatively static video generally has a low edge change rate. If the candidate sub-segments passing the second layer still exceed twice the preset segment count, the frame length threshold and frame difference threshold are raised and screening continues.
Third-layer screening: this mainly handles videos that are not rich in motion. When the candidate sub-segments passing the first layer fall below the preset segment count, the frame length threshold and frame difference threshold are lowered by loop iteration until enough candidate sub-segments are screened out to meet the preset count.
Further, in the extreme case where the preset segment count still cannot be reached, the missing segments are made up by merging adjacent segments.
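A compact sketch of the three-layer logic described above, assuming each candidate sub-segment carries its frame length, per-segment statistics and thresholds; the threshold growth/shrink factors are illustrative assumptions.

```python
def three_layer_screen(segs, preset, len_thresh, grow=1.1, shrink=0.9):
    """segs: objects with frame_len, fd_mean, ecr_mean, fd_thresh, ecr_thresh."""
    def passes(s, k, use_ecr):
        ok = s.frame_len > k * len_thresh and s.fd_mean > k * s.fd_thresh
        return ok and (not use_ecr or s.ecr_mean > s.ecr_thresh)

    # Layer 1: all three thresholds (motion-rich video).
    picked = [s for s in segs if passes(s, 1.0, use_ecr=True)]
    if preset <= len(picked) <= 2 * preset:
        return picked
    if len(picked) > 2 * preset:
        # Layer 2: drop the edge-change-rate test, then raise the frame
        # length / frame difference thresholds until few enough remain.
        k = 1.0
        picked = [s for s in segs if passes(s, k, use_ecr=False)]
        while len(picked) > 2 * preset:
            k *= grow
            picked = [s for s in segs if passes(s, k, use_ecr=False)]
        return picked
    # Layer 3: too few segments (static video) -- lower thresholds iteratively.
    k = 1.0
    while len(picked) < preset and k > 0.1:
        k *= shrink
        picked = [s for s in segs if passes(s, k, use_ecr=True)]
    return picked  # if still short, adjacent segments would be merged
```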
S303, clustering the candidate highlight segments based on the image color texture characteristics, and calculating the score of each candidate highlight segment according to the frame difference mean value of each candidate highlight segment;
the clustering processing in this step is to cluster similar images together into a class, so as to ensure that the features in the candidate sub-segments are similar, and score the candidate highlight segments by calculating the frame difference mean value of each candidate highlight segment, and obtain a score, wherein the larger the frame difference mean value is, the higher the corresponding score is.
S304, performing knapsack strategy processing according to the score and frame length of each candidate highlight segment, and selecting the combination of candidate highlight segments with the highest total score.
The knapsack strategy is a dynamic programming algorithm: within a limited capacity, it accommodates as many of the highest-scoring segments as possible. Suppose the knapsack's total carrying capacity is W, the total number of items is n, the list of item weights is w, and the list of item values is v. Let dp(i, j) denote the maximum value achievable using the first i items with total weight not exceeding j. Then there are the following cases:
1) If the weight of the i-th item w[i] > j, the i-th item cannot be taken, i.e., dp(i, j) = dp(i-1, j).
2) If w[i] <= j, there are two cases: either the i-th item is not taken, giving dp(i, j) = dp(i-1, j); or the i-th item is taken, giving dp(i, j) = dp(i-1, j-w[i]) + v[i]. The maximum of the two should be chosen, i.e., dp(i, j) = max{dp(i-1, j), dp(i-1, j-w[i]) + v[i]}.
Therefore, in this step the total carrying capacity W is the preset frame length, the weight w is the frame length of a single candidate highlight segment, and the value v is the score of a single candidate highlight segment (the total value being the sum of the scores of all selected candidate highlight segments). The knapsack is applied here to accommodate as many highlight segments as possible under the preset frame length limit, i.e., to select the combination of candidate highlight segments with the highest total score within the limited capacity.
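A direct, self-contained rendering of this 0/1 knapsack over segments (capacity = preset frame length, weight = segment frame length, value = score), with a backtrack to recover which segments were chosen; the numbers in the example are toy values, not from the patent.

```python
def knapsack_select(frame_lens, scores, preset_frames):
    """0/1 knapsack: maximise total score within the preset frame length."""
    n, W = len(frame_lens), preset_frames
    dp = [[0.0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = frame_lens[i - 1], scores[i - 1]
        for j in range(W + 1):
            dp[i][j] = dp[i - 1][j]                             # skip segment i
            if w <= j:
                dp[i][j] = max(dp[i][j], dp[i - 1][j - w] + v)  # or take it
    chosen, j = [], W                                           # backtrack
    for i in range(n, 0, -1):
        if dp[i][j] != dp[i - 1][j]:
            chosen.append(i - 1)
            j -= frame_lens[i - 1]
    return sorted(chosen)

# Toy example: 120/200/150-frame segments scoring 3.5/5.0/4.2, 300-frame budget.
print(knapsack_select([120, 200, 150], [3.5, 5.0, 4.2], 300))   # -> [0, 2]
```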
S103, when a fixed-duration highlight compilation is extracted from one piece of video data, the candidate highlight segments are ranked and a plurality of target highlight segments are selected, the selected target highlight segments are processed with the trim-or-pad strategy according to a preset frame length, and the highlight compilation is output.
Specifically, as shown in fig. 4, step S103 includes:
S401, ranking the selected candidate highlight segments according to their scores, and selecting a preset number of top-ranked candidate highlight segments as the target highlight segments;
In this step, the selected candidate highlight segments are sorted by score, and the 5 highest-ranked candidate highlight segments are selected as the target highlight segments.
S402, when the total frame length of all the target highlight segments exceeds the preset frame length, performing start-frame self-addition or end-frame self-subtraction on each target highlight segment with the trim strategy, based on the frame difference mean and edge change rate mean of each target highlight segment, until the total frame length of all the target highlight segments reaches the preset frame length;
In this step, when the total frame length of all the target highlight segments exceeds the preset frame length, the trim strategy must be applied. Specifically, the frame difference means and edge change rate means of the frames around each target highlight segment's boundaries are computed from its frame difference and edge change rate values. If the frame difference mean and edge change rate mean of the current frame are both smaller than those of the next frame, the start frame of the target highlight segment is incremented; otherwise, its end frame is decremented. Either way the target highlight segment is shortened. The 5 target highlight segments are iterated over in this way until their total frame length reaches the preset frame length.
S403, when the total frame length of all the target highlight segments is less than the preset frame length, performing start-frame self-subtraction or end-frame self-addition on each target highlight segment with the pad strategy, based on the frame difference mean and edge change rate mean of each target highlight segment, until the total frame length of all the target highlight segments reaches the preset frame length;
In this step, when the total frame length of all the target highlight segments is less than the preset frame length, the pad strategy must be applied. Specifically, the frame difference means and edge change rate means of the frames around each target highlight segment's boundaries are computed from its frame difference and edge change rate values. If the frame difference mean and edge change rate mean of the current frame are both smaller than those of the previous frame, the start frame of the target highlight segment is decremented; otherwise, its end frame is incremented. Either way the target highlight segment is lengthened. The 5 target highlight segments are iterated over in this way until their total frame length reaches the preset frame length (a code sketch follows step S404 below).
And S404, merging all the target highlight segments reaching the preset frame length, and outputting the highlight segment compilation.
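A sketch of the trim and pad moves from steps S402-S403 and the fitting loop, assuming per-frame arrays fd and ecr holding the frame-difference and edge-change-rate values; boundary checks at the video's first and last frames are omitted for brevity.

```python
def trim_once(seg, fd, ecr):
    s, e = seg
    # Trim from whichever boundary is less interesting, per the rule above.
    if fd[s] < fd[s + 1] and ecr[s] < ecr[s + 1]:
        return s + 1, e            # start-frame self-addition
    return s, e - 1                # end-frame self-subtraction

def pad_once(seg, fd, ecr):
    s, e = seg
    # Extend the start when the previous frame beats the current one, else the end.
    if fd[s] < fd[s - 1] and ecr[s] < ecr[s - 1]:
        return s - 1, e            # start-frame self-subtraction
    return s, e + 1                # end-frame self-addition

def fit_to_budget(segments, fd, ecr, preset_frames):
    """Iterate over the selected segments until the total frame length
    matches the preset frame length."""
    total = sum(e - s + 1 for s, e in segments)
    i = 0
    while total != preset_frames:
        k = i % len(segments)
        if total > preset_frames:
            segments[k] = trim_once(segments[k], fd, ecr)
            total -= 1
        else:
            segments[k] = pad_once(segments[k], fd, ecr)
            total += 1
        i += 1
    return segments
```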
And S104, when a fixed-duration highlight compilation is extracted from a plurality of pieces of video data with differing frame rates, partitioning, converting, deduplicating and sorting the candidate highlight segments of the video data to obtain a plurality of target highlight segments and output the highlight compilation.
In actual product use, a fixed-duration highlight compilation must be extracted from multiple videos with differing frame rates. The algorithm works at a fixed frame rate, so a fixed duration corresponds to a fixed frame length inside the algorithm, whereas the product timeline is based on the original video durations. That is, the timeline has a frame interpolation strategy: even though the timeline frame rate does not match the original video frame rates, the video is automatically interpolated, so the total duration of the segments finally delivered is computed from the durations of the original videos. This raises the problem that the duration must be computed separately for each segment, i.e., the duration of each segment at the algorithm frame rate must be converted to the original video's frame rate. For example, if a segment lasts 1 s at the algorithm frame rate of 30 fps but the original video is 15 fps, it must be converted to 15 frames rather than the 30 frames of the algorithm frame rate.
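The conversion itself reduces to a frame-rate ratio; a minimal sketch (the 30/15 fps figures mirror the example above):

```python
def to_video_frames(n_algo_frames, algo_fps=30, video_fps=15):
    """Convert a frame count at the algorithm's fixed rate into the
    equivalent frame count at the original video's rate."""
    duration = n_algo_frames / algo_fps      # segment duration in seconds
    return round(duration * video_fps)       # frames in the original video

# 1 s is 30 frames at the 30 fps algorithm rate but 15 frames at 15 fps.
assert to_video_frames(30, 30, 15) == 15
```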
Therefore, in this embodiment, the candidate highlight segments need to be partitioned, converted, deduplicated, and sorted to obtain a plurality of target highlight segments and output a highlight segment compilation.
In one embodiment, as shown in fig. 5, step S104 includes:
s501, based on the frame length of each video data, partitioning each candidate highlight segment;
In this step, because the method treats the multiple pieces of video data with differing frame rates as one video to be processed, the start frame and end frame of some candidate highlight segments may not lie in the same video before partitioning. Each candidate highlight segment is therefore partitioned based on the frame length of each video; specifically, the video to which a candidate highlight segment should belong can be determined from its frame numbers.
S502, determining the video data to which each candidate highlight segment belongs after partition processing, and obtaining, through conversion processing, the segment of equal duration in the corresponding video data for each candidate highlight segment;
In this step, each candidate highlight segment is traversed, the video to which it belongs is determined, and the segment of equal duration in the corresponding video is computed from the candidate highlight segment's duration.
S503, carrying out duplicate removal processing on the candidate highlight segments after the conversion processing, and then carrying out partition processing to obtain a plurality of target highlight segments;
In this step, some candidate highlight segments overlap after the conversion processing, and the overlapping intervals must be merged and deduplicated; the deduplicated candidate highlight segments are then partitioned again to obtain a plurality of target highlight segments. Re-partitioning is needed because deduplication may produce segments that cross video boundaries.
Specifically, the deduplication and partitioning process is described with one implementation. Suppose a user inputs 2 videos, one at 60 fps with a total frame length of 600 and one at 30 fps with a total frame length of 150, and the method adopts a uniform frame rate of 30 fps. Suppose that processing the two videos yields 2 candidate highlight segments with a total duration of 10 s, at frames [310, 550] and [610, 670] respectively. Converted to video durations, the first candidate highlight segment is 8 s and the second is 2 s; converted to the videos' own frame rates (rounding by half), the ranges become [190, 670] and [610, 670]. These are 8 s and 2 s segments on the videos, but they clearly overlap and cross video boundaries, a problem frequently encountered when multiple videos are processed together, so partitioning and deduplication are required. Partitioning re-splits the two candidate highlight segments into [190, 600] and [601, 670], with a total duration of 9.16 s, short of the fixed 10 s; frames must therefore be added back to the candidate highlight segments to reach the fixed 10 s. Here the deficit of 0.834 s is padded onto the first candidate highlight segment, 0.834 × 60 ≈ 50 frames, and the final ranges are [140, 600] and [601, 670]. Dividing the two candidate highlight segments by their respective video frame rates yields highlight segments with a total duration of about 10 s.
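The merge-and-repartition step of this worked example can be sketched as plain interval arithmetic; the frame numbers below are taken from the example above, and the boundary list is an assumption of this sketch.

```python
def merge_overlaps(segments):
    """Merge overlapping [start, end] frame ranges (deduplication)."""
    segments = sorted(segments)
    merged = [list(segments[0])]
    for s, e in segments[1:]:
        if s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)   # overlap: merge
        else:
            merged.append([s, e])
    return [tuple(m) for m in merged]

def partition(segments, boundaries):
    """Split any segment that crosses a video boundary (re-partitioning)."""
    out = []
    for s, e in segments:
        for b in sorted(boundaries):
            if s <= b < e:
                out.append((s, b))      # boundary falls inside the segment
                s = b + 1
        out.append((s, e))
    return out

# Values from the worked example: two videos, boundary after frame 600.
segs = merge_overlaps([(190, 670), (610, 670)])     # -> [(190, 670)]
print(partition(segs, [600]))                       # -> [(190, 600), (601, 670)]
```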
S504, calculating the total duration of the target highlight segments based on the frame rate of the video data;
And S505, calculating the duration difference between the total duration and the fixed duration, processing the target highlight segments with the trim-or-pad strategy according to the duration difference, and outputting the highlight compilation.
In this step, each target highlight segment is traversed and the video to which it belongs is determined; the total duration of all target highlight segments is computed from the video frame rates. If the total duration exceeds the fixed duration, the trim strategy is applied; if it falls short of the fixed duration, the pad strategy is applied, until the total duration of all target highlight segments reaches the fixed duration. All target highlight segments are then merged and the highlight compilation is output.
In one embodiment, as shown in fig. 6, step S505 includes:
S601, when the total duration is longer than the fixed duration, calculating the duration difference between the total duration and the fixed duration, traversing each target highlight segment, and performing start-frame self-addition and end-frame self-subtraction;
S602, after traversing all the target highlight segments once, calculating the duration difference again, until the total duration matches the fixed duration;
And S603, arranging the target highlight segments whose total duration has reached the fixed duration in ascending order, and outputting the highlight compilation.
In this embodiment, when the total duration is longer than the fixed duration, frames must be removed from the target highlight segments. Specifically, the duration difference is calculated from the total duration and the fixed duration, and start-frame self-addition and end-frame self-subtraction are performed on the target highlight segments according to the duration difference; after each full pass over all the target highlight segments, the duration difference is recalculated, until the total duration matches the fixed duration and the removal of frames is complete. All the target highlight segments, whose total duration now equals the fixed duration, are then arranged in ascending order and merged, and the highlight compilation is output.
In one embodiment, as shown in fig. 7, step S505 further includes:
S701, when the total duration is shorter than the fixed duration, calculating the duration difference between the total duration and the fixed duration, traversing each target highlight segment, and performing start-frame self-subtraction and end-frame self-addition;
S702, after traversing all the target highlight segments once, calculating the duration difference again, until the total duration matches the fixed duration;
And S703, arranging the target highlight segments whose total duration has reached the fixed duration in ascending order, and outputting the highlight compilation.
In this embodiment, when the total duration is shorter than the fixed duration, frames must be added to the target highlight segments. Specifically, the duration difference is calculated from the total duration and the fixed duration, and start-frame self-subtraction and end-frame self-addition are performed on the target highlight segments according to the duration difference; after each full pass over all the target highlight segments, the duration difference is recalculated, until the total duration matches the fixed duration and the addition of frames is complete. All the target highlight segments, whose total duration now equals the fixed duration, are then arranged in ascending order and merged, and the highlight compilation is output.
The embodiment of the invention also provides a highlight extracting device, which is used for executing any embodiment of the highlight extracting method. Specifically, please refer to fig. 8, fig. 8 is a schematic block diagram of a highlight extracting apparatus according to an embodiment of the present invention.
As shown in fig. 8, the highlight extracting apparatus 800 includes: a preprocessing unit 801, an acquisition unit 802, a first extraction unit 803, and a second extraction unit 804.
A preprocessing unit 801, configured to read video data and preprocess the video data to obtain a plurality of candidate sub-segments;
An obtaining unit 802, configured to perform screening, clustering and knapsack strategy processing on the candidate sub-segments to obtain a plurality of candidate highlight segments;
A first extracting unit 803, configured to, when a fixed-duration highlight compilation is extracted from one piece of video data, rank the candidate highlight segments and select a plurality of target highlight segments, process the selected target highlight segments with the trim-or-pad strategy according to a preset frame length, and output the highlight compilation;
The second extracting unit 804 is configured to, when a fixed-duration highlight compilation is extracted from a plurality of pieces of video data with differing frame rates, partition, convert, deduplicate and sort the candidate highlight segments of the plurality of pieces of video data to obtain a plurality of target highlight segments, and output the highlight compilation.
The device preprocesses the video data to obtain a plurality of candidate sub-segments; through screening, clustering and knapsack strategy processing, the selected candidate highlight segments have a higher overall highlight level; finally, the target highlight segments are selected from the candidate highlight segments on the basis of a fixed duration, processed, and output as the highlight compilation. The device effectively extracts the highlight segments in the video data and generates a fixed-duration highlight compilation, with high extraction efficiency and good results.
In one embodiment, the pre-processing unit 801 includes:
the parameter acquisition unit is used for reading the video data and acquiring the image color, the image texture characteristic, the image quality, the frame difference value and the edge change rate value of each frame in the video data;
the first filtering unit is used for setting a first filtering proportion according to the frame length of the video data and filtering frames with low image quality according to the first filtering proportion;
the second filtering unit is used for setting a second filtering proportion and filtering the frames with large frame difference values and edge change rate values according to the second filtering proportion;
the cascade unit is used for cascading the image color and the image texture characteristics to obtain the image color texture characteristics;
the video data processing device comprises a segmentation unit, a search unit and a display unit, wherein the segmentation unit is used for carrying out scene segmentation on video data, setting a longest frame length and a shortest frame length for each scene to limit the segment duration of each scene, and outputting candidate sub-segments, wherein one scene outputs one or more candidate sub-segments;
and the clustering processing unit is used for extracting the wonderful cover of the candidate sub-segments through clustering processing.
In one embodiment, the obtaining unit 802 includes:
the setting unit is used for setting a frame length threshold, a frame difference threshold and an edge change rate threshold of each candidate sub-segment;
the screening unit is used for screening out the candidate highlight segments from the candidate sub-segments according to the frame length threshold, the frame difference threshold and the edge change rate threshold;
the score calculating unit is used for clustering the candidate highlight segments based on the image color texture characteristics and calculating the score of each candidate highlight segment according to the frame difference mean value of each candidate highlight segment;
and the knapsack strategy processing unit is used for carrying out knapsack strategy processing according to the score and the frame length of each candidate highlight segment and selecting the candidate highlight segment with the highest score sum.
In an embodiment, the first extraction unit 803 includes:
the arrangement unit is used for carrying out ascending arrangement according to the scores of the selected candidate highlight segments and selecting a plurality of candidate highlight segments with preset ranks as target highlight segments;
the trim strategy unit is used for, when the total frame length of all the target highlight segments exceeds the preset frame length, performing start-frame self-addition or end-frame self-subtraction on each target highlight segment with the trim strategy based on the frame difference mean and edge change rate mean of each target highlight segment, until the total frame length of all the target highlight segments reaches the preset frame length;
the pad strategy unit is used for, when the total frame length of all the target highlight segments is less than the preset frame length, performing start-frame self-subtraction or end-frame self-addition on each target highlight segment with the pad strategy based on the frame difference mean and edge change rate mean of each target highlight segment, until the total frame length of all the target highlight segments reaches the preset frame length;
and the merging unit is used for merging all the target highlight segments reaching the preset frame length and outputting the highlight segment compilation.
In an embodiment, the second extracting unit 804 includes:
the partition unit is used for partitioning each candidate highlight segment based on the frame length of each video data;
the conversion unit is used for judging the video data to which each candidate highlight segment belongs after partition processing, and obtaining segments with the same time length in the corresponding video data of each candidate highlight segment through conversion processing;
the de-duplication unit is used for performing de-duplication processing on the candidate highlight segments after the conversion processing and then performing partition processing to obtain a plurality of target highlight segments;
the duration calculation unit is used for calculating the total duration of the target highlight segments based on the frame rate of the video data;
and the output unit is used for calculating the duration difference between the total duration and the fixed duration, processing the target highlight segments with the trim-or-pad strategy according to the duration difference, and outputting the highlight compilation.
The embodiment of the invention also provides computer equipment, which comprises a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the highlight segment extraction method is implemented when the processor executes the computer program.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when being executed by a processor, the computer program realizes the highlight extracting method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
While the embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the embodiments of the present invention. Therefore, the protection scope of the embodiments of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A highlight extraction method is characterized by comprising the following steps:
reading video data and preprocessing the video data to obtain a plurality of candidate sub-segments;
screening, clustering and knapsack strategy processing are carried out on the candidate sub-segments to obtain a plurality of candidate highlight segments;
when a fixed-duration highlight compilation is extracted from one piece of video data, ranking the candidate highlight segments and selecting a plurality of target highlight segments, processing the selected target highlight segments with a trim-or-pad strategy according to a preset frame length, and outputting the highlight compilation;
when a fixed-duration highlight compilation is extracted from a plurality of pieces of video data with differing frame rates, partitioning, converting, deduplicating and sorting the candidate highlight segments of the video data to obtain a plurality of target highlight segments, and outputting the highlight compilation.
2. The method of claim 1, wherein the reading video data and pre-processing the video data to obtain a plurality of candidate sub-segments comprises:
reading the video data, and acquiring the image color, the image texture feature, the image quality, the frame difference value and the edge change rate value of each frame in the video data;
setting a first filtering proportion according to the frame length of the video data, and filtering frames with low image quality according to the first filtering proportion;
setting a second filtering proportion, and filtering frames with large frame difference values and edge change rate values according to the second filtering proportion;
cascading the image color and the image texture characteristics to obtain image color texture characteristics;
performing scene segmentation on the video data, setting a longest frame length and a shortest frame length for each scene to limit the segment duration of each scene, and outputting candidate sub-segments, wherein one scene outputs one or more candidate sub-segments;
and extracting the wonderful cover of the candidate sub-segments through clustering.
3. The highlight segment extraction method according to claim 2, wherein the screening, clustering and knapsack strategy processing of the candidate sub-segments to obtain candidate highlight segments comprises:
setting a frame length threshold, a frame difference threshold and an edge change rate threshold of each candidate sub-segment;
screening out candidate highlight segments from the candidate sub-segments according to the frame length threshold, the frame difference threshold and the edge change rate threshold;
clustering the candidate highlight segments based on the image color texture features, and calculating the score of each candidate highlight segment according to the frame difference mean value of each candidate highlight segment;
and performing knapsack strategy processing according to the score and the frame length of each candidate highlight segment, and selecting the combination of candidate highlight segments with the highest total score.
4. The highlight segment extraction method according to claim 3, wherein when extracting a fixed-duration highlight compilation based on one piece of video data, the ranking of the candidate highlight segments and selection of a plurality of target highlight segments, the processing of the selected target highlight segments with a trim-or-pad strategy according to a preset frame length, and the outputting of the highlight compilation comprise:
performing ascending arrangement according to the scores of the selected candidate highlight segments, and selecting a plurality of candidate highlight segments with preset ranks as target highlight segments;
when the total frame length of all the target highlight segments exceeds the preset frame length, performing start-frame self-addition or end-frame self-subtraction on each target highlight segment with the trim strategy, based on the frame difference mean and edge change rate mean of each target highlight segment, until the total frame length of all the target highlight segments reaches the preset frame length;
when the total frame length of all the target highlight segments is less than the preset frame length, performing start-frame self-subtraction or end-frame self-addition on each target highlight segment with the pad strategy, based on the frame difference mean and edge change rate mean of each target highlight segment, until the total frame length of all the target highlight segments reaches the preset frame length;
and merging all the target highlight segments reaching the preset frame length, and outputting the highlight segment compilation.
5. The highlight segment extraction method according to claim 1, wherein when extracting a fixed-duration highlight compilation based on a plurality of pieces of video data with differing frame rates, the partitioning, converting, deduplicating and sorting of the candidate highlight segments of the plurality of pieces of video data to obtain a plurality of target highlight segments and the outputting of the highlight compilation comprise:
partitioning each candidate highlight segment based on the frame length of each video data;
judging the video data of each candidate highlight segment after partition processing, and obtaining segments with the same time length of each candidate highlight segment in the corresponding video data through conversion processing;
carrying out duplicate removal processing on the candidate highlight segments after conversion processing, and then carrying out partition processing to obtain a plurality of target highlight segments;
calculating the total duration of the target highlight segments based on the frame rate of the video data;
and calculating the duration difference between the total duration and the fixed duration, processing the target highlight segments with the trim-or-pad strategy according to the duration difference, and outputting the highlight compilation.
6. The highlight segment extraction method according to claim 5, wherein said calculating the duration difference between the total duration and the fixed duration, processing the target highlight segments with the trim-or-pad strategy according to the duration difference, and outputting the highlight compilation comprises:
when the total duration is longer than the fixed duration, calculating the duration difference between the total duration and the fixed duration, traversing each target highlight segment, and performing start-frame self-addition and end-frame self-subtraction;
after traversing all the target highlight segments once, calculating the duration difference again, until the total duration matches the fixed duration;
and arranging the target highlight segments whose total duration has reached the fixed duration in ascending order, and outputting the highlight compilation.
7. The highlight segment extraction method according to claim 5, wherein said calculating the duration difference between the total duration and the fixed duration, processing the target highlight segments with the trim-or-pad strategy according to the duration difference, and outputting the highlight compilation further comprises:
when the total duration is shorter than the fixed duration, calculating the duration difference between the total duration and the fixed duration, traversing each target highlight segment, and performing start-frame self-subtraction and end-frame self-addition;
after traversing all the target highlight segments once, calculating the duration difference again, until the total duration matches the fixed duration;
and arranging the target highlight segments whose total duration has reached the fixed duration in ascending order, and outputting the highlight compilation.
8. A highlight extracting apparatus, comprising:
the preprocessing unit is used for reading the video data and preprocessing the video data to obtain a plurality of candidate sub-segments;
the acquisition unit is used for screening, clustering and knapsack strategy processing on the candidate sub-segments to obtain a plurality of candidate highlight segments;
the first extraction unit is used for, when a fixed-duration highlight compilation is extracted from one piece of video data, ranking the candidate highlight segments and selecting a plurality of target highlight segments, processing the selected target highlight segments with a trim-or-pad strategy according to a preset frame length, and outputting the highlight compilation;
and the second extraction unit is used for partitioning, converting, de-duplicating and sequencing the candidate highlight segments of the plurality of video data when extracting a highlight segment compilation with a fixed duration based on the plurality of video data with indefinite frame rates to obtain a plurality of target highlight segments and outputting the highlight segment compilation.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the highlight extraction method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the highlight extraction method according to any one of claims 1 to 7.
CN202110731863.6A 2021-06-30 2021-06-30 Method and device for extracting highlight segments, computer equipment and storage medium Active CN113408461B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110731863.6A CN113408461B (en) 2021-06-30 2021-06-30 Method and device for extracting highlight segments, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113408461A (en) 2021-09-17
CN113408461B CN113408461B (en) 2022-07-01

Family

ID=77680255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110731863.6A Active CN113408461B (en) 2021-06-30 2021-06-30 Method and device for extracting wonderful segments, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113408461B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130138232A1 (en) * 2011-11-28 2013-05-30 Electronics And Telecommunications Research Institute Apparatus and method for extracting highlight section of music
US20160012296A1 (en) * 2014-07-14 2016-01-14 Carnegie Mellon University System and Method For Processing A Video Stream To Extract Highlights
CN111246287A (en) * 2020-01-13 2020-06-05 腾讯科技(深圳)有限公司 Video processing method, video publishing method, video pushing method and devices thereof
CN111988638A (en) * 2020-08-19 2020-11-24 北京字节跳动网络技术有限公司 Method and device for acquiring spliced video, electronic equipment and storage medium
CN112445935A (en) * 2020-11-25 2021-03-05 开望(杭州)科技有限公司 Automatic generation method of video selection collection based on content analysis

Also Published As

Publication number Publication date
CN113408461B (en) 2022-07-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211118

Address after: 518000 1001, block D, building 5, software industry base, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen Wanxing Software Co.,Ltd.

Address before: 518000 1002, block D, building 5, software industry base, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: SHENZHEN SIBO TECHNOLOGY Co.,Ltd.

GR01 Patent grant