CN109195026A - Video abstraction generating method and system - Google Patents

Video abstraction generating method and system

Info

Publication number
CN109195026A
CN109195026A (application CN201811195007.8A)
Authority
CN
China
Prior art keywords
video
frame
similarity
video frame
characteristic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811195007.8A
Other languages
Chinese (zh)
Inventor
曹风云
周猛
唐杰晓
谢飞
施培蓓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Normal University
Original Assignee
Hefei Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Normal University filed Critical Hefei Normal University
Priority to CN201811195007.8A priority Critical patent/CN109195026A/en
Publication of CN109195026A publication Critical patent/CN109195026A/en
Pending legal-status Critical Current


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/854: Content authoring
    • H04N21/8549: Creating video summaries, e.g. movie trailer

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a video abstraction generating method and system. The video abstraction generating method includes: extracting characteristic information of video frames from a video according to a dictionary learning and sparse representation method; calculating similarity according to the characteristic information; adaptively adjusting the video according to the similarity to segment the video; and, according to the similarity within and between segments of the segmented video, adaptively adjusting the similar-frame discrimination criterion and merging similar video frames to generate a final video summary. By using the dictionary learning and sparse representation method and adaptively adjusting the video segmentation criterion, the video abstraction generating method is made more adaptable.

Description

Video abstraction generating method and system
Technical field
The present invention relates to the technical field of video processing, and in particular to a video abstraction generating method and system.
Background technique
In recent years, with the popularization of digital photographing apparatus and the development of network technology, video has increasingly become an important form for recording people's lives and communicating. To save time, it is desirable to extract the important content of a video so as to quickly understand its key content. Video summarization is a technology that satisfies this demand: a video summarization algorithm assesses the importance of the various parts of a video according to its content, and extracts the more important parts to constitute a video summary. However, video types are numerous and contents are complex, which places high requirements on the design of video summarization algorithms, and existing video summary generation methods suffer from unsatisfactory performance and poor universality across different scenes.
Summary of the invention
The object of the present invention is to provide a video abstraction generating method and system.
In order to solve the above technical problem, the present invention provides a video abstraction generating method, comprising:
extracting characteristic information of video frames from the video according to a dictionary learning and sparse representation method;
calculating similarity according to the characteristic information;
adaptively adjusting the video according to the similarity, to segment the video;
merging similar video frames according to the similarity within and between segments of the segmented video, to generate a final video summary.
Further, the method of extracting the characteristic information of video frames from the video includes:
performing preprocessing and feature extraction on the content of the video, so that the video is represented in the form of video frames;
down-sampling the video frames;
reading in each video frame and extracting the characteristic information.
Further, the characteristic information includes a SIFT feature and an HSV feature.
Further, the method of calculating similarity according to the characteristic information includes:
after the characteristic information is extracted from the video, each video segment is expressed as a matrix;
its reconstruction error on the corresponding dictionary is defined as ||X_i - D·A_i||_F², where D denotes the dictionary, A_i denotes the reconstruction coefficients corresponding to the video frame, and X_i is the SIFT feature extracted from the video;
the difference in content between the current video and the preceding video is judged according to the variation of the reconstruction error and the reconstruction coefficients; according to the coding matrix and a spatial pyramid, the coding matrix is converted into a multi-scale information vector; the distances between the multi-scale information vector and the vectors corresponding to the three frames n-1, n-2 and n-3 are measured, and their average value is taken as the similarity between the current frame and the preceding frames, where n denotes the current frame.
Further, the method of adaptively adjusting the video according to the similarity to segment the video includes:
according to the inter-frame similarity at video shot segmentation positions and within video shot segments, calculating a reasonable threshold for video segmentation and segmenting the video shots;
according to the segmentation result, extracting the video frames of each video segment and measuring the similarity between video frames, to adaptively adjust the segmentation result.
Further, the method of calculating the reasonable threshold for video segmentation and segmenting the video shots includes:
Step S1: extracting data and features, expressing each image frame as a three-dimensional feature vector, extracting local minima according to the similarity value curve between adjacent frames, and extracting the data at the minima positions;
Step S2: performing an initial classification of the data by constructing a two-class nearest neighbor classifier: among the data at the minima positions, finding the data with the smallest similarity value and taking its corresponding feature vector as the class center of the video shot boundary class; then finding a maximum value and taking its feature vector as the class center of the non-boundary class; and classifying the remaining data using the classifier;
Step S3: reselecting positive and negative samples, updating the classifier, and reclassifying the data;
Step S4: repeating step S3 until a termination condition is reached, i.e., the algorithm reaches the maximum number of iterations or the classification result of the data no longer changes.
Further, the method of adaptively adjusting the similar-frame discrimination criterion according to the similarity within and between segments of the segmented video, merging similar video frames, and generating the final video summary includes:
Step A1: extracting the video frames in each segmented video according to the segmentation result, and measuring the similarity between the video frames;
Step A2: detecting, according to the similarity, shots in the video whose transition or content change time is less than a preset value, and deleting them;
Step A3: merging the processed video frames and corresponding video segments, and generating the final video summary.
Further, the video abstraction generating method further includes:
evaluating the quality of the summary according to the Fscore value of the final video summary, wherein the Fscore is calculated by combining the measurement indexes precision and recall, and the formula is as follows:
Fscore = 2 × precision × recall / (precision + recall)
where precision = N_match/N_AS and recall = N_match/N_US;
where N_match is the number of algorithm-generated video frames that match manually extracted video frames, and N_AS and N_US respectively denote the numbers of algorithm-generated video frames and manually selected video frames.
The present invention also provides a video abstraction generating system, comprising:
a characteristic extraction module, adapted to extract characteristic information of video frames from the video according to a dictionary learning and sparse representation method;
a similarity calculation module, adapted to calculate similarity according to the characteristic information;
a segmentation module, adapted to adaptively adjust the video according to the similarity to segment the video;
a summary generation module, adapted to merge similar video frames according to the similarity within and between segments of the segmented video, to generate a final video summary.
Further, the video abstraction generating system further includes:
a quality assessment module, adapted to evaluate the quality of the summary according to the Fscore value of the final video summary. The Fscore is calculated by combining the measurement indexes precision and recall, so as to ensure that both precision and recall are high. In actual experiments, however, high precision and high recall are difficult to achieve simultaneously, so the Fscore value is used instead; the formula is as follows:
Fscore = 2 × precision × recall / (precision + recall)
where precision = N_match/N_AS and recall = N_match/N_US;
where N_match is the number of algorithm-generated video frames that match manually extracted video frames, and N_AS and N_US respectively denote the numbers of algorithm-generated video frames and manually selected video frames.
The invention has the following advantages. The present invention provides a video abstraction generating method and system, wherein the video abstraction generating method includes: extracting characteristic information of video frames from the video according to a dictionary learning and sparse representation method; calculating similarity according to the characteristic information; adaptively adjusting the video according to the similarity to segment the video; and, according to the similarity within and between segments of the segmented video, adaptively adjusting the similar-frame discrimination criterion and merging similar video frames to generate a final video summary. By using the dictionary learning and sparse representation method and adaptively adjusting the video segmentation criterion, the video abstraction generating method is made more adaptable.
Detailed description of the invention
The present invention will be further explained below with reference to the accompanying drawings and embodiments.
Fig. 1 shows the flow diagram of video abstraction generating method provided by the embodiment of the present invention.
Fig. 2 is the sub-step flow chart of step S130 in Fig. 1.
Fig. 3 is the sub-step flow chart of step S131 in Fig. 2.
Fig. 4 shows the functional block diagram that video frequency abstract provided by the embodiment of the present invention generates system.
Specific embodiment
The present invention is further explained in detail below in conjunction with the accompanying drawings. These drawings are simplified schematic diagrams that illustrate the basic structure of the invention only by way of illustration, and therefore show only the components relevant to the invention.
As shown in Fig. 1, an embodiment of the present invention provides a video abstraction generating method. The video abstraction generating method includes the following steps:
S110: extracting characteristic information of video frames from the video according to a dictionary learning and sparse representation method.
By using the dictionary learning and sparse representation method, the characteristic information of video frames is adaptively extracted, making the video abstraction generating method more adaptable. The characteristic information of video frames is extracted from the video by the following method:
Preprocessing and feature extraction are performed on the content of the video, so that the video is represented in the form of video frames;
The video frames are down-sampled. In this embodiment, down-sampling reduces the number of video frames to 1/10 of the original; in other embodiments, the amount of down-sampling can be set freely, which still falls within the protection scope of the present invention. Reducing the number of video frames by down-sampling reduces the processing load on the processor and the memory occupancy.
Each video frame is read in, and the characteristic information is extracted. In this embodiment, the characteristic information includes a SIFT feature and an HSV feature.
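The down-sampling and per-frame feature extraction of this embodiment can be sketched as follows. This is an illustrative sketch only: the helper names are invented, a normalised per-channel color histogram stands in for the HSV descriptor, and a real implementation would convert frames to HSV and add SIFT keypoints (e.g. with OpenCV).

```python
import numpy as np

def downsample_frames(frames, factor=10):
    """Keep every `factor`-th frame, matching the embodiment's 1/10 reduction."""
    return frames[::factor]

def color_histogram(frame, bins=8):
    """Normalised per-channel histogram as a simplified stand-in for the
    HSV feature; `frame` is an (H, W, 3) uint8 array."""
    hist = []
    for c in range(3):
        h, _ = np.histogram(frame[:, :, c], bins=bins, range=(0, 256))
        hist.append(h / frame[:, :, c].size)  # each channel's bins sum to 1
    return np.concatenate(hist)

# toy "video": 40 random 32x32 RGB frames
rng = np.random.default_rng(0)
video = [rng.integers(0, 256, (32, 32, 3), dtype=np.uint8) for _ in range(40)]
kept = downsample_frames(video, factor=10)
features = np.stack([color_histogram(f) for f in kept])
print(features.shape)  # (4, 24): 4 kept frames, 3 channels x 8 bins
```

The resulting per-frame vectors are the "characteristic information" on which the later similarity computation operates.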
S120: similarity is calculated according to the characteristic information.
The characteristic information of the video frames is adaptively extracted, and the video segmentation criterion is adaptively adjusted according to the calculated similarity between video frames, making the video abstraction generating method more adaptable. Step S120 includes the following steps:
After the characteristic information is extracted from the video, each video segment is expressed as a matrix;
The reconstruction error on the corresponding dictionary is defined as ||X_i - D·A_i||_F², where D denotes the dictionary, A_i denotes the reconstruction coefficients corresponding to the video frame, and X_i is the SIFT feature extracted from the video;
The difference in content between the current video and the preceding video is judged according to the variation of the reconstruction error and the reconstruction coefficients; according to the coding matrix and a spatial pyramid, the coding matrix is converted into a multi-scale information vector; the distances between the multi-scale information vector and the vectors corresponding to the three frames n-1, n-2 and n-3 are measured, and their average value is taken as the similarity between the current frame and the preceding frames, where n denotes the current frame.
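The reconstruction error ||X_i - D·A_i||_F² and the three-frame similarity measure can be illustrated numerically as follows. This is a sketch under stated assumptions: the dictionary is random rather than learned, the coefficients A_i are least-squares rather than truly sparse, and the multi-scale vectors are random stand-ins, so it demonstrates the quantities rather than the full procedure of the disclosure.

```python
import numpy as np

rng = np.random.default_rng(1)

d, k = 64, 16                 # descriptor dimension, number of dictionary atoms
D = rng.normal(size=(d, k))   # stand-in dictionary (a real one would be learned)

def reconstruction_error(X, D):
    """||X - D @ A||_F^2 with A the least-squares coefficients of X on D."""
    A, *_ = np.linalg.lstsq(D, X, rcond=None)
    return np.linalg.norm(X - D @ A) ** 2

def frame_distance(vectors, n):
    """Average distance between frame n's multi-scale vector and those of
    frames n-1, n-2, n-3; the description uses this average as the
    similarity measure between the current frame and the preceding frames."""
    return sum(np.linalg.norm(vectors[n] - vectors[n - j]) for j in (1, 2, 3)) / 3.0

X = rng.normal(size=(d, 10))        # SIFT-like descriptors of one video segment
err = reconstruction_error(X, D)
vectors = rng.normal(size=(8, 32))  # stand-in multi-scale vectors per frame
sim = frame_distance(vectors, 5)
print(err > 0.0, sim > 0.0)
```

A descriptor matrix that lies in the span of the dictionary reconstructs with near-zero error, while off-span content yields a large error, which is what signals a content change.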
S130: the video is adaptively adjusted according to the similarity, to segment the video.
Referring to Fig. 2, the video is adaptively adjusted according to the similarity and segmented using the following steps:
S131: according to the inter-frame similarity at video shot segmentation positions and within video shot segments, a reasonable threshold for video segmentation is calculated and the video shots are segmented. Referring to Fig. 3, in this embodiment, the method of segmenting the video shots includes:
Step S1: extracting data and features, expressing each image frame as a three-dimensional feature vector, extracting local minima according to the similarity value curve between adjacent frames, and extracting the data at the minima positions;
Step S2: performing an initial classification of the data by constructing a two-class nearest neighbor classifier: among the data at the minima positions, finding the data with the smallest similarity value and taking its corresponding feature vector as the class center of the video shot boundary class; then finding a maximum value and taking its feature vector as the class center of the non-boundary class; and classifying the remaining data using the classifier;
Step S3: reselecting positive and negative samples, updating the classifier, and reclassifying the data;
Step S4: repeating step S3 until a termination condition is reached, i.e., the algorithm reaches the maximum number of iterations or the classification result of the data no longer changes.
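Steps S1 to S4 can be sketched as a two-class nearest-neighbour scheme over the local minima of the adjacent-frame similarity curve. The seeding and update rules below are one plausible reading of the description, and the toy curve and feature construction are illustrative assumptions.

```python
import numpy as np

def local_minima(sim):
    """Indices where the adjacent-frame similarity curve has a local minimum (step S1)."""
    return [i for i in range(1, len(sim) - 1)
            if sim[i] < sim[i - 1] and sim[i] < sim[i + 1]]

def classify_boundaries(features, sim_at_min, max_iter=20):
    """Steps S2-S4: the minimum-similarity candidate seeds the shot-boundary
    class centre, the maximum-similarity one the non-boundary centre; labels
    and centres are then refined until they stop changing or max_iter is hit."""
    features = np.asarray(features, dtype=float)
    centres = np.stack([features[np.argmin(sim_at_min)],   # class 0: boundary
                        features[np.argmax(sim_at_min)]])  # class 1: non-boundary
    labels = None
    for _ in range(max_iter):
        dists = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                      # termination condition of step S4
        labels = new_labels
        for c in (0, 1):               # step S3: update each class centre
            if np.any(labels == c):
                centres[c] = features[labels == c].mean(axis=0)
    return labels

# toy similarity curve: deep dips at frames 5 and 12 suggest shot boundaries
sim = np.array([.9, .9, .85, .9, .88, .2, .87, .9, .91, .9, .89, .9, .25, .9, .9])
minima = local_minima(sim)
feats = [[sim[i - 1], sim[i], sim[i + 1]] for i in minima]  # 3-D vectors (step S1)
labels = classify_boundaries(feats, sim[minima])
print(minima, list(labels))  # frames 5 and 12 fall in the boundary class (0)
```

Only the deep dips end up in the boundary class; shallow local minima are absorbed into the non-boundary class, which is the adaptive thresholding effect the description aims for.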
S132: according to the segmentation result, the video frames of each video segment are extracted and the similarity between video frames is measured, to adaptively adjust the segmentation result.
S140: according to the similarity within and between segments of the segmented video, the similar-frame discrimination criterion is adaptively adjusted, and similar video frames are then merged to generate a final video summary.
By providing an adaptive video frame extraction algorithm, including the two links of filtering out shot segments and merging similar video frames, the adaptive video segmentation and video frame extraction algorithms greatly improve the quality of the generated video summary. Step S140 includes the following steps:
Step A1: extracting the video frames in each segmented video according to the segmentation result, and measuring the similarity between the video frames;
Step A2: detecting, according to the similarity, shots in the video whose transition or content change time is less than a preset value, and deleting them. In this embodiment, the video frames of shots whose transition or content change time is less than the preset value are deleted, to improve the precision of the samples;
Step A3: merging the processed video frames and corresponding video segments, and generating the final video summary.
In this embodiment, the similar-frame discrimination criterion works as follows: first, according to the adaptive video segmentation result, the video frames of each video shot segment are extracted and the similarity between the video frames is measured; then, according to these similarities, shot segments in the video with transitions or excessively fast content changes are detected and processed; finally, the remaining video frames and corresponding video segments are merged, and new video frames are generated.
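A toy sketch of the shot filtering (step A2) and similar-frame merging (step A3) described above; the similarity function, the threshold, and the minimum shot length are illustrative assumptions, not values from the disclosure.

```python
def filter_short_shots(shots, min_len=3):
    """Step A2 sketch: drop shots whose frame count is below a preset value,
    treating them as transitions or too-fast content changes."""
    return [s for s in shots if len(s) >= min_len]

def merge_similar_frames(frames, sim, threshold=0.8):
    """Step A3 sketch: keep a frame only when its similarity to the last kept
    frame falls below `threshold`, i.e. the content has visibly changed;
    `sim(a, b)` is any similarity in [0, 1], higher meaning more alike."""
    kept = [frames[0]]
    for f in frames[1:]:
        if sim(kept[-1], f) < threshold:
            kept.append(f)
    return kept

# toy example: frames are scalars, similarity decays with their distance
frames = [0.0, 0.05, 0.1, 3.0, 3.1, 6.0]
sim = lambda a, b: max(0.0, 1.0 - abs(a - b))
shots = filter_short_shots([[1, 2, 3], [4], [5, 6, 7, 8]])
summary = merge_similar_frames(frames, sim)
print(shots)    # [[1, 2, 3], [5, 6, 7, 8]]
print(summary)  # [0.0, 3.0, 6.0]
```

Runs of near-duplicate frames collapse to a single representative, which is what keeps the generated summary compact.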
In this embodiment, the video abstraction generating method further includes:
Step S150: evaluating the quality of the summary according to the Fscore value of the final video summary. The Fscore is calculated by combining the measurement indexes precision and recall, so as to ensure that both precision and recall are high. In actual experiments, however, high precision and high recall are difficult to achieve simultaneously, so the Fscore value is used instead; the formula is as follows:
Fscore = 2 × precision × recall / (precision + recall), where precision = N_match/N_AS and recall = N_match/N_US. Here N_match is the number of algorithm-generated video frames that match manually extracted video frames, and N_AS and N_US respectively denote the numbers of algorithm-generated video frames and manually selected video frames. Evaluating the quality of the summary by its Fscore value facilitates adjusting the values of the parameters inside the video abstraction generating method.
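Assuming the standard harmonic-mean form of the Fscore, the evaluation of step S150 can be computed as:

```python
def f_score(n_match, n_as, n_us):
    """Fscore combining precision and recall as defined in the description:
    n_match: algorithm-generated frames matching manually extracted frames;
    n_as, n_us: counts of algorithm-generated / manually selected frames."""
    precision = n_match / n_as
    recall = n_match / n_us
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. 8 of 10 generated frames match, against 16 manually selected frames
score = f_score(8, 10, 16)
print(round(score, 4))  # precision 0.8, recall 0.5 -> 0.6154
```

The harmonic mean rewards only configurations where precision and recall are both reasonably high, matching the stated goal of the quality assessment.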
Referring to Fig. 4, the present invention also provides a video abstraction generating system. The video abstraction generating system includes: a characteristic extraction module, adapted to extract characteristic information of video frames from the video according to a dictionary learning and sparse representation method; a similarity calculation module, adapted to calculate similarity according to the characteristic information; a segmentation module, adapted to adaptively adjust the video according to the similarity to segment the video; and a summary generation module, adapted to adaptively adjust the similar-frame discrimination criterion according to the similarity within and between segments of the segmented video and then merge similar video frames to generate a final video summary. By using the dictionary learning and sparse representation method and adaptively adjusting the video segmentation criterion, the video abstraction generating method is made more adaptable.
In this embodiment, the video abstraction generating system further includes: a quality assessment module, adapted to evaluate the quality of the summary according to the Fscore value of the final video summary. The Fscore is calculated by combining the measurement indexes precision and recall, so as to ensure that both precision and recall are high. In actual experiments, however, high precision and high recall are difficult to achieve simultaneously, so the Fscore value is used instead; the formula is as follows:
Fscore = 2 × precision × recall / (precision + recall), where precision = N_match/N_AS and recall = N_match/N_US. Here N_match is the number of algorithm-generated video frames that match manually extracted video frames, and N_AS and N_US respectively denote the numbers of algorithm-generated video frames and manually selected video frames. Evaluating the quality of the summary by its Fscore value facilitates adjusting the values of the parameters inside the video abstraction generating method.
In conclusion the present invention provides a kind of video abstraction generating method and systems, wherein video abstraction generating method Including the method according to dictionary learning and rarefaction representation, to the characteristic information of video extraction video frame;According to the characteristic information Calculate similarity;Adaptive adjustment is carried out to video according to similarity to be segmented video;According in the section of the video of segmentation And the similarity between section, it is adaptively adjusted similar frame discrimination standard, and similar video frame is merged, generated final Video frequency abstract.By using the method for dictionary learning and rarefaction representation, standard, plucks video with being adaptively adjusted video segmentation Want generation method more adaptable.
Taking the above ideal embodiments according to the present invention as inspiration, and through the above description, those skilled in the art can make various changes and amendments without departing from the scope of the technical idea of the present invention. The technical scope of the invention is not limited to the contents of the specification; it must be determined according to the scope of the claims.

Claims (10)

1. A video abstraction generating method, characterized by comprising:
extracting characteristic information of video frames from the video according to a dictionary learning and sparse representation method;
calculating similarity according to the characteristic information;
adaptively adjusting the video according to the similarity, to segment the video;
merging similar video frames according to the similarity within and between segments of the segmented video, to generate a final video summary.
2. The video abstraction generating method as described in claim 1, characterized in that
the method of extracting the characteristic information of video frames from the video comprises:
performing preprocessing and feature extraction on the content of the video, so that the video is represented in the form of video frames;
down-sampling the video frames;
reading in each video frame and extracting the characteristic information.
3. The video abstraction generating method as claimed in claim 2, characterized in that the characteristic information comprises a SIFT feature and an HSV feature.
4. The video abstraction generating method as claimed in claim 3, characterized in that
the method of calculating similarity according to the characteristic information comprises:
after the characteristic information is extracted from the video, expressing each video segment as a matrix;
defining its reconstruction error on the corresponding dictionary as ||X_i - D·A_i||_F², where D denotes the dictionary, A_i denotes the reconstruction coefficients corresponding to the video frame, and X_i is the SIFT feature extracted from the video;
judging the difference in content between the current video and the preceding video according to the variation of the reconstruction error and the reconstruction coefficients; converting the coding matrix into a multi-scale information vector according to the coding matrix and a spatial pyramid; measuring the distances between the multi-scale information vector and the vectors corresponding to the three frames n-1, n-2 and n-3, and taking their average value as the similarity between the current frame and the preceding frames, where n denotes the current frame.
5. The video abstraction generating method as claimed in claim 4, characterized in that
the method of adaptively adjusting the video according to the similarity to segment the video comprises:
calculating a reasonable threshold for video segmentation according to the inter-frame similarity at video shot segmentation positions and within video shot segments, and segmenting the video shots;
extracting the video frames of each video segment according to the segmentation result, and measuring the similarity between video frames, to adaptively adjust the segmentation result.
6. The video abstraction generating method as claimed in claim 5, characterized in that
the method of calculating the reasonable threshold for video segmentation and segmenting the video shots comprises:
step S1: extracting data and features, expressing each image frame as a three-dimensional feature vector, extracting local minima according to the similarity value curve between adjacent frames, and extracting the data at the minima positions;
step S2: performing an initial classification of the data by constructing a two-class nearest neighbor classifier: among the data at the minima positions, finding the data with the smallest similarity value and taking its corresponding feature vector as the class center of the video shot boundary class; then finding a maximum value and taking its feature vector as the class center of the non-boundary class; and classifying the remaining data using the classifier;
step S3: reselecting positive and negative samples, updating the classifier, and reclassifying the data;
step S4: repeating step S3 until a termination condition is reached, i.e., the algorithm reaches the maximum number of iterations or the classification result of the data no longer changes.
7. The video abstraction generating method as described in claim 1, characterized in that
the method of merging similar video frames according to the similarity within and between segments of the segmented video, to generate the final video summary, comprises:
step A1: extracting the video frames in each segmented video according to the segmentation result, and measuring the similarity between the video frames;
step A2: detecting, according to the similarity, shots in the video whose transition or content change time is less than a preset value, and deleting them;
step A3: merging the processed video frames and corresponding video segments, and generating the final video summary.
8. The video abstraction generating method as described in claim 1, characterized in that the video abstraction generating method further comprises: evaluating the quality of the summary according to the Fscore value of the final video summary, wherein the Fscore is calculated by combining the measurement indexes precision and recall, and the formula is as follows:
Fscore = 2 × precision × recall / (precision + recall)
where precision = N_match/N_AS and recall = N_match/N_US;
where N_match is the number of algorithm-generated video frames that match manually extracted video frames, and N_AS and N_US respectively denote the numbers of algorithm-generated video frames and manually selected video frames.
9. A video abstraction generating system, characterized by comprising:
a characteristic extraction module, adapted to extract characteristic information of video frames from the video according to a dictionary learning and sparse representation method;
a similarity calculation module, adapted to calculate similarity according to the characteristic information;
a segmentation module, adapted to adaptively adjust the video according to the similarity to segment the video;
a summary generation module, adapted to merge similar video frames according to the similarity within and between segments of the segmented video, to generate a final video summary.
10. The video abstraction generating system as claimed in claim 9, characterized in that the video abstraction generating system further comprises:
a quality assessment module, adapted to evaluate the quality of the summary according to the Fscore value of the final video summary.
CN201811195007.8A 2018-10-15 2018-10-15 Video abstraction generating method and system Pending CN109195026A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811195007.8A CN109195026A (en) 2018-10-15 2018-10-15 Video abstraction generating method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811195007.8A CN109195026A (en) 2018-10-15 2018-10-15 Video abstraction generating method and system

Publications (1)

Publication Number Publication Date
CN109195026A true CN109195026A (en) 2019-01-11

Family

ID=64944461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811195007.8A Pending CN109195026A (en) 2018-10-15 2018-10-15 Video abstraction generating method and system

Country Status (1)

Country Link
CN (1) CN109195026A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110267040A (en) * 2019-06-27 2019-09-20 国网山东省电力公司建设公司 A kind of method for compressing image based on video flow detection
CN111586473A (en) * 2020-05-20 2020-08-25 北京字节跳动网络技术有限公司 Video clipping method, device, equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106034264A (en) * 2015-03-11 2016-10-19 中国科学院西安光学精密机械研究所 Coordination-model-based method for obtaining video abstract
CN106056627A (en) * 2016-05-30 2016-10-26 河海大学 Robustness object tracking method based on local identification sparse representation

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106034264A (en) * 2015-03-11 2016-10-19 中国科学院西安光学精密机械研究所 Coordination-model-based method for obtaining video abstract
CN106056627A (en) * 2016-05-30 2016-10-26 河海大学 Robustness object tracking method based on local identification sparse representation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Jiatong: "Research on Adaptive Video Summarization Algorithms", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110267040A (en) * 2019-06-27 2019-09-20 国网山东省电力公司建设公司 A kind of method for compressing image based on video flow detection
CN111586473A (en) * 2020-05-20 2020-08-25 北京字节跳动网络技术有限公司 Video clipping method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111104898B (en) Image scene classification method and device based on target semantics and attention mechanism
CN108830855B (en) Full convolution network semantic segmentation method based on multi-scale low-level feature fusion
CN109151501A (en) A kind of video key frame extracting method, device, terminal device and storage medium
Cong et al. Towards scalable summarization of consumer videos via sparse dictionary selection
CN103593464B (en) Video fingerprint detecting and video sequence matching method and system based on visual features
US7869657B2 (en) System and method for comparing images using an edit distance
Johnson et al. Sparse coding for alpha matting
CN102236796B (en) Method and system for sorting defective contents of digital video
WO2002077909A1 (en) Video segmentation using statistical pixel modeling
CN108921130A (en) Video key frame extracting method based on salient region
JP2004199669A (en) Face detection
JP2004192637A (en) Face detection
CN113112519B (en) Key frame screening method based on interested target distribution
CN111768388A (en) Product surface defect detection method and system based on positive sample reference
WO2023066173A1 (en) Image processing method and apparatus, and storage medium and electronic device
CN109195026A (en) Video abstraction generating method and system
CN116030396B (en) Accurate segmentation method for video structured extraction
WO2023217046A1 (en) Image processing method and apparatus, nonvolatile readable storage medium and electronic device
CN110188625B (en) Video fine structuring method based on multi-feature fusion
Zhao et al. Detecting deepfake video by learning two-level features with two-stream convolutional neural network
KR20210011707A (en) A CNN-based Scene classifier with attention model for scene recognition in video
Zhang [Retracted] Sports Action Recognition Based on Particle Swarm Optimization Neural Networks
CN105893967B (en) Human behavior classification detection method and system based on time sequence retention space-time characteristics
Chen et al. An adaptive noise removal tool for iot image processing under influence of weather conditions
Liu et al. Key frame extraction based on improved frame blocks features and second extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190111
