CN103079117A

CN103079117A - Video abstract generation method and video abstract generation device

Info

Publication number: CN103079117A
Application number: CN2012105877576A
Authority: CN
Inventors: 王海峰
Original assignee: XINZHENG ELECTRONIC TECHNOLOGY (BEIJING) Co Ltd
Current assignee: IROBOTCITY (BEIJING) CO.,LTD.; TELEFRAME TECHNOLOGY (BEIJING) CO LTD
Priority date: 2012-12-30
Filing date: 2012-12-30
Publication date: 2013-05-01
Anticipated expiration: 2032-12-30
Also published as: CN103079117B

Abstract

The invention provides a video abstract generation method and a video abstract generation device. The method comprises: performing background modeling on a target frame image in an original video to obtain a background model; extracting moving targets from the target frame image using the background model; classifying the extracted moving targets with a preset classifier; and arranging the trajectories of the extracted moving targets according to the classification result to generate an abstract. The technical scheme improves the efficiency with which video is used; because the abstract is organized by object class, it supports rapid browsing of multiple targets.

Description

Video abstract generation method and video abstract generation device

Technical field

The invention belongs to the fields of computer vision and pattern recognition, and specifically relates to a video abstract generation method and a video abstract generation device.

Background technology

A video abstract, also called video condensation, is a summary of video content. In an automatic or semi-automatic manner, moving targets are extracted through moving-target analysis, the motion trajectory of each target is analyzed, and the different targets are spliced into a common background scene and combined in some way. With the development of video technology, video abstracts play an increasingly important role in video analysis and content-based video retrieval.

The purpose of video abstract extraction is to allow the moving targets that appear in a video to be browsed quickly. However, video content is complex and different users care about different targets, so prior-art video abstract generation methods cannot generate a satisfactory video abstract completely and accurately for varied user needs.

No effective solution has yet been proposed for the problem that prior-art video abstract generation methods cannot generate video abstracts that distinguish between different kinds of moving targets.

Summary of the invention

The present invention proposes a video abstract generation method and a video abstract generation device, to solve the problem that prior-art video abstract generation methods cannot generate video abstracts that distinguish between different kinds of moving targets.

The technical solution adopted by the present invention to solve this technical problem is as follows:

According to one aspect of the present invention, a video abstract generation method is provided. The method comprises: performing background modeling on a target frame image in an original video to obtain a background model; extracting moving targets from the target frame image using the background model; classifying the extracted moving targets using a preset classifier; and arranging the trajectories of the extracted moving targets according to the classification result to generate an abstract.

Further, performing background modeling on the target frame image in the original video comprises: processing the target frame image with a Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.

Further, classifying the extracted moving targets using a preset classifier comprises: classifying the extracted moving targets using a preset object classifier model trained by a support vector machine.

Further, after extracting the moving targets from the target frame image using the background model, the method further comprises:

Performing trajectory tracking between the moving target in the target frame image and the moving target in the frame preceding the target frame, to obtain the motion trajectory of the moving target.

Further, performing trajectory tracking between the moving target in the target frame image and the moving target in the preceding frame comprises: calculating the intersection area of the moving target in the target frame image and the moving target in the preceding frame; determining whether the intersection area is greater than a preset area value; when the intersection area is greater than the preset area value, updating the motion trajectory according to the position of the moving target in the target frame image; and when the intersection area is less than or equal to the preset area value, generating a new motion trajectory according to the position of the moving target in the target frame image.

Further, arranging the trajectories of the extracted moving targets according to the classification result comprises: arranging each motion trajectory according to the time at which it appears and its spatial position; and superimposing the arranged motion trajectories on a background image.

According to another aspect of the present invention, a video abstract generation device is provided. The device comprises: a background modeling module for performing background modeling on a target frame image in an original video to obtain a background model; a moving target extraction module for extracting moving targets from the target frame image using the background model; a moving target classification module for classifying the extracted moving targets using a preset classifier; and an abstract generation module for arranging the trajectories of the extracted moving targets according to the classification result to generate an abstract.

Further, the background modeling module is also used to: process the target frame image with a Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.

Further, the moving target classification module is also used to: classify the extracted moving targets using a preset object classifier model trained by a support vector machine.

Further, the video abstract generation device also comprises: a trajectory tracking module for performing trajectory tracking between the moving target in the target frame image and the moving target in the frame preceding the target frame, to obtain the motion trajectory of the moving target.

With the technical scheme of the present invention, after moving targets are extracted from the image, a preset classifier is used to model objects that commonly appear in video, the extracted foreground (that is, the moving targets) is classified, and the video abstract is generated according to the classification result. A video abstract that meets user needs is thus generated completely and accurately, so that the user can quickly obtain the required video information from it, which improves the efficiency with which video is used. Because the abstract is organized by object class, it supports rapid browsing of multiple targets.

Description of drawings

Fig. 1 is a schematic diagram of the video abstract generation device according to an embodiment of the invention;

Fig. 2 is a schematic diagram of the video abstract generation method according to an embodiment of the invention;

Fig. 3 is a flow chart of the video abstract generation method according to an embodiment of the invention.

Embodiment

It should be noted that, provided they do not conflict, the embodiments in this application and the features in the embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

An embodiment of the invention provides a video abstract generation device. Fig. 1 is a schematic diagram of the video abstract generation device according to an embodiment of the invention. As shown in Fig. 1, the device comprises: a background modeling module 11 for performing background modeling on a target frame image in an original video to obtain a background model; a moving target extraction module 13 for extracting moving targets from the target frame image using the background model; a moving target classification module 15 for classifying the extracted moving targets using a preset classifier; and an abstract generation module 17 for arranging the trajectories of the extracted moving targets according to the classification result to generate an abstract.

With the video abstract generation device of this embodiment, after the background model is established, a preset classifier is used to model objects that commonly appear in video, the extracted foreground (that is, the moving targets) is classified, and the video abstract is generated according to the classification result. A video abstract that meets user needs is thus generated completely and accurately, so that the user can quickly obtain the required video information from it, which improves the efficiency with which video is used.

The background modeling module 11 may use any of a variety of image background modeling algorithms. After the background model is established, the current image is compared with the background model, and the foreground targets (that is, the moving targets to be extracted) are determined from the comparison result. Specifically, the image background modeling algorithm may be a color background model or a texture background model. A color background model models the color value (grayscale or color) of each pixel in the image. If the color value of the pixel at coordinates (x, y) in the current image differs substantially from the color value at (x, y) in the background model, the current pixel is considered foreground; otherwise it is considered background.
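The per-pixel comparison described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `foreground_mask`, the grayscale list-of-lists image representation, and the threshold of 30 are all assumptions made here, since the text only speaks of a "larger difference".

```python
def foreground_mask(frame, background, threshold=30):
    """Return a binary mask: 1 where a pixel differs strongly from the background.

    frame and background are grayscale images given as lists of rows.
    The threshold value is an illustrative assumption.
    """
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([1 if abs(p - b) > threshold else 0
                     for p, b in zip(frame_row, bg_row)])
    return mask


# Toy 2x3 images: the middle column changes strongly, the rest only flickers.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 200, 11], [9, 180, 10]]
mask = foreground_mask(frame, background)
```

Small differences (such as 12 versus 10) stay below the threshold and are treated as background, which is the behavior the color background model relies on.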

The background modeling module 11 of the video abstract generation device of this embodiment may preferably use the Gaussian mixture background algorithm, which is a color background model. The Gaussian mixture background model (Gaussian Mixture Model) improves on the single-Gaussian background model: it smoothly approximates a density distribution function of arbitrary shape by a weighted average of several Gaussian probability density functions, and is particularly suited to processing images of outdoor environments. By exploiting these characteristics of the Gaussian mixture background algorithm, the background modeling module 11 can identify moving targets in outdoor video quickly and accurately.
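For reference, the weighted combination of Gaussian densities mentioned above is usually written as below. The symbols are notation introduced here rather than taken from the patent: $x_t$ is the value of one pixel at time $t$, $K$ is the number of Gaussian components, and $w_k$, $\mu_k$, $\Sigma_k$ are the weight, mean and covariance of component $k$.

```latex
p(x_t) = \sum_{k=1}^{K} w_k \, \mathcal{N}\!\left(x_t;\ \mu_k, \Sigma_k\right),
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0
```

With $K = 1$ this reduces to the single-Gaussian background model that the mixture model improves upon.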

When performing background modeling, the background modeling module 11 may filter out illumination changes and shadows in the target frame image, so that they are not mistaken for moving targets and do not affect generation of the video abstract.

When the Gaussian mixture background algorithm is used, the background modeling module 11 may also be used to: process the target frame image with the Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.

The preset classifier used by the moving target classification module 15 may use any of a variety of classification algorithms. In this embodiment, a support vector machine (Support Vector Machine, SVM) may be used to train models of objects that frequently appear in video (for example vehicles such as bicycles, cars and motorcycles, or animals). A support vector machine is designed for the linearly separable case. For the linearly inseparable case, a nonlinear mapping converts samples that are linearly inseparable in the low-dimensional input space into a high-dimensional feature space where they become linearly separable, so that a linear algorithm in the high-dimensional feature space can analyze the nonlinear characteristics of the samples. In addition, based on structural risk minimization theory, the support vector machine constructs an optimal separating hyperplane in the feature space, so that the learner is globally optimal and the expected risk over the whole sample space satisfies a certain upper bound with a certain probability.

The support vector machine algorithm uses a nonlinear mapping p to map the sample space into a high-dimensional or even infinite-dimensional feature space (a Hilbert space), so that a problem that is not linearly separable in the original sample space becomes linearly separable in the feature space; that is, linear separability is achieved by raising the dimension. To avoid the computational complexity that raising the dimension may bring, the support vector machine uses the kernel function expansion theorem, so the explicit expression of the nonlinear mapping need not be known. Because the linear learning machine is built in the high-dimensional feature space, computational complexity hardly increases compared with a linear model, and different models can be generated by selecting different kernel functions.
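The point that a kernel function removes the need for an explicit nonlinear mapping can be checked numerically on a small case. The sketch below uses a degree-2 homogeneous polynomial kernel on 2-D inputs; the kernel choice and the names `poly_kernel` and `feature_map` are illustrative assumptions, as the patent does not say which kernel is used.

```python
import math


def poly_kernel(x, y):
    """Degree-2 homogeneous polynomial kernel on 2-D inputs: (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x):
    """Explicit 3-D feature map whose inner product reproduces poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(a, b):
    """Plain inner product of two equal-length tuples."""
    return sum(u * v for u, v in zip(a, b))


x, y = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly_kernel(x, y)                       # computed in the 2-D input space
explicit_value = dot(feature_map(x), feature_map(y))   # computed in the 3-D feature space
```

Both values agree up to floating-point rounding, yet `poly_kernel` never constructs the higher-dimensional vectors. That is the saving the paragraph above describes.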

In this embodiment, SVM training may be performed in advance on images of many kinds of objects, and the resulting typical models are stored. The moving target classification module 15 may therefore also be used to classify the extracted moving targets using a preset object classifier model trained by a support vector machine.

The video abstract generation device of this embodiment may further comprise a trajectory tracking module that tracks the moving targets detected in each frame. Tracking may use a nearest-neighbor method. Specifically, trajectory tracking is performed between the moving target in the target frame image and the moving target in the frame preceding the target frame, to obtain the motion trajectory of the moving target.

Trajectory tracking may comprise several decisions: trajectory association, trajectory generation and trajectory disappearance. The specific decision method is: calculate the intersection area of the moving target in the target frame image and the moving target in the frame preceding the target frame; determine whether the intersection area is greater than a preset area value; when the intersection area is greater than the preset area value, update the motion trajectory according to the position of the moving target in the target frame image; when the intersection area is less than or equal to the preset area value, generate a new motion trajectory according to the position of the moving target in the target frame image. These steps are repeated until all moving targets extracted from the current frame have been traversed.

Suppose the area of the moving target in the frame preceding the target frame is $S_{Pre}$ and the area of the moving target in the target frame image is $S_{Temp}$. The preset area value may be set to $\min(S_{Pre}, S_{Temp}) \times R$. When the intersection area $S_{Cross}$ satisfies $S_{Cross} > \min(S_{Pre}, S_{Temp}) \times R$, the moving target in the target frame image is judged to be associated with the trajectory of the moving target in the preceding frame, and that motion trajectory is updated according to the position of the moving target in the target frame image. Here $R$ is the intersection ratio, which may be set empirically; a typical value is 0.4.

The intersection area is calculated as $S_{Cross} = Width_{Cross} \times Height_{Cross}$, where:

$Width_{Cross} = \min(right_{Pre}, right_{Temp}) - \max(left_{Pre}, left_{Temp})$. Here $right_{Pre}$ is the maximum abscissa of the preceding-frame moving target in image coordinates, that is, its rightmost position in the preceding frame image; $left_{Pre}$ is the minimum abscissa of the preceding-frame moving target in image coordinates, that is, its leftmost position in the preceding frame image; $right_{Temp}$ is the maximum abscissa of the target-frame moving target in image coordinates, that is, its rightmost position in the target frame image; and $left_{Temp}$ is the minimum abscissa of the target-frame moving target in image coordinates, that is, its leftmost position in the target frame image. Thus $\min(right_{Pre}, right_{Temp})$ is the smaller of $right_{Pre}$ and $right_{Temp}$, and $\max(left_{Pre}, left_{Temp})$ is the larger of $left_{Pre}$ and $left_{Temp}$.

$Height_{Cross} = \min(Top_{Pre}, Top_{Temp}) - \max(Bottom_{Pre}, Bottom_{Temp})$. Here $Bottom_{Pre}$ is the minimum ordinate of the preceding-frame moving target in image coordinates, that is, its bottom position in the preceding frame image; $Top_{Pre}$ is the maximum ordinate of the preceding-frame moving target in image coordinates, that is, its top position in the preceding frame image; $Top_{Temp}$ is the maximum ordinate of the target-frame moving target in image coordinates, that is, its top position in the target frame image; and $Bottom_{Temp}$ is the minimum ordinate of the target-frame moving target in image coordinates, that is, its bottom position in the target frame image. Thus $\min(Top_{Pre}, Top_{Temp})$ is the smaller of $Top_{Pre}$ and $Top_{Temp}$, and $\max(Bottom_{Pre}, Bottom_{Temp})$ is the larger of $Bottom_{Pre}$ and $Bottom_{Temp}$.

When the intersection area is less than or equal to the preset area value, that is, $S_{Cross} \le \min(S_{Pre}, S_{Temp}) \times R$, a new motion trajectory is generated according to the position of the moving target in the target frame image, and the previous motion trajectory is judged to have disappeared.
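The association test built from the intersection area can be sketched as below. The `Box` type and the function names are hypothetical, and the moving target is assumed to be represented by its bounding rectangle. One step is added that the text does not state: the width and height are clamped so that disjoint rectangles yield an intersection area of zero, because otherwise two negative extents would multiply to a positive area.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Bounding rectangle of a moving target (hypothetical representation)."""
    left: float
    right: float
    bottom: float
    top: float

    @property
    def area(self) -> float:
        return (self.right - self.left) * (self.top - self.bottom)


def cross_area(pre: Box, temp: Box) -> float:
    """Intersection area of the preceding-frame box and the target-frame box."""
    width = min(pre.right, temp.right) - max(pre.left, temp.left)
    height = min(pre.top, temp.top) - max(pre.bottom, temp.bottom)
    # Assumption added here: non-overlapping boxes give zero, not a spurious
    # positive product of two negative extents.
    if width <= 0 or height <= 0:
        return 0.0
    return width * height


def is_associated(pre: Box, temp: Box, ratio: float = 0.4) -> bool:
    """True when the intersection area exceeds min(S_Pre, S_Temp) * R."""
    return cross_area(pre, temp) > min(pre.area, temp.area) * ratio


pre = Box(left=0, right=10, bottom=0, top=10)
near = Box(left=5, right=15, bottom=0, top=10)     # overlaps half of pre
far = Box(left=20, right=30, bottom=20, top=30)    # no overlap at all
```

Here `is_associated(pre, near)` holds because the overlap of 50 exceeds the threshold of 40, so the existing trajectory would be updated, while `far` would start a new trajectory.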

The processing performed by the abstract generation module 17 may specifically comprise: arranging each motion trajectory according to the time at which it appears and its spatial position; and superimposing the arranged motion trajectories on the background image, thereby generating the video abstract.
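The patent does not spell out the arrangement rule, so the following is only one possible reading: trajectories are grouped by their class label and each group is ordered by first appearance time, then by starting horizontal position. The dictionary keys `label`, `start_frame` and `start_x` and the function name `arrange_tracks` are hypothetical.

```python
from collections import defaultdict


def arrange_tracks(tracks):
    """Group tracks by class label; order each group by start time, then x-position."""
    groups = defaultdict(list)
    for track in tracks:
        groups[track["label"]].append(track)
    for label in groups:
        groups[label].sort(key=lambda t: (t["start_frame"], t["start_x"]))
    return dict(groups)


tracks = [
    {"id": 1, "label": "car", "start_frame": 40, "start_x": 5},
    {"id": 2, "label": "person", "start_frame": 10, "start_x": 90},
    {"id": 3, "label": "car", "start_frame": 12, "start_x": 60},
]
arranged = arrange_tracks(tracks)
car_order = [t["id"] for t in arranged["car"]]
```

Each ordered group could then be superimposed on the stored background image to produce a per-class abstract; the overlay step itself is left out of this sketch.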

An embodiment of the invention also provides a video abstract generation method, which may be carried out by any video abstract generation device provided by the above embodiments of the invention. Fig. 2 is a schematic diagram of the video abstract generation method according to an embodiment of the invention. The method comprises:

Step S21: perform background modeling on a target frame image in an original video to obtain a background model;

Step S23: extract moving targets from the target frame image using the background model;

Step S25: classify the extracted moving targets using a preset classifier;

Step S27: arrange the trajectories of the extracted moving targets according to the classification result to generate an abstract.
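Steps S21 to S27 can be read as a simple pipeline. The skeleton below is a sketch under stated assumptions: every stage is passed in as a callable, and the toy stand-ins (`median_background`, `changed_pixels`, `brightness_label`, `group_by_label`) are invented here purely so that the skeleton runs. They are not the Gaussian mixture model or SVM classifier of the patent, and the optional tracking stage is omitted.

```python
from statistics import median


def generate_summary(frames, build_background, extract_targets, classify, arrange):
    """Run the four steps in order, with each stage supplied by the caller."""
    background = build_background(frames)                    # step S21
    targets = []
    for frame in frames:                                     # step S23
        targets.extend(extract_targets(frame, background))
    labelled = [(classify(t), t) for t in targets]           # step S25
    return arrange(labelled, background)                     # step S27


# Toy stand-ins on 1-D "frames" (lists of pixel values), for illustration only.
def median_background(frames):
    return [median(column) for column in zip(*frames)]


def changed_pixels(frame, background, threshold=50):
    return [(i, v) for i, (v, b) in enumerate(zip(frame, background))
            if abs(v - b) > threshold]


def brightness_label(target):
    return "bright" if target[1] > 200 else "dim"


def group_by_label(labelled, background):
    summary = {}
    for label, target in labelled:
        summary.setdefault(label, []).append(target)
    return summary


frames = [[10, 10, 10], [10, 255, 10], [10, 10, 120]]
summary = generate_summary(frames, median_background, changed_pixels,
                           brightness_label, group_by_label)
```

Passing the stages as parameters mirrors the modular device description (modules 11, 13, 15 and 17), so any one stage can be swapped without touching the others.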

Step S21 may specifically comprise: processing the target frame image with the Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image. Step S21 exploits the characteristics of the Gaussian mixture background algorithm, which smoothly approximates a density distribution function of arbitrary shape by a weighted average of several Gaussian probability density functions. It is suited to processing images of outdoor environments and can identify moving targets in outdoor video quickly and accurately.

Step S25 may specifically comprise: classifying the extracted moving targets using a preset object classifier model trained by a support vector machine. The support vector machine uses a nonlinear mapping p to map the sample space into a high-dimensional feature space, so that a problem that is not linearly separable in the original sample space becomes linearly separable in the feature space; that is, linear separability is achieved by raising the dimension. The object classifier model is obtained by training a support vector machine in advance on images of many kinds of objects, and is then saved.

After step S23, the video abstract generation method of this embodiment may further comprise: performing trajectory tracking between the moving target in the target frame image and the moving target in the frame preceding the target frame, to obtain the motion trajectory of the moving target. Specifically: calculate the intersection area of the moving target in the target frame image and the moving target in the preceding frame; determine whether the intersection area is greater than a preset area value; when the intersection area is greater than the preset area value, update the motion trajectory according to the position of the moving target in the target frame image; when the intersection area is less than or equal to the preset area value, generate a new motion trajectory according to the position of the moving target in the target frame image. The specific calculation was described above for the video abstract generation device and is not repeated here.

Step S27 may specifically comprise: arranging each motion trajectory according to the time at which it appears and its spatial position; and superimposing the arranged motion trajectories on the background image.

The video abstract generation method of this embodiment first performs background modeling and detects moving objects, then tracks the objects to obtain their trajectories and stores the trajectories and the background image. A classifier is then used to classify the detected objects into several common classes such as vehicles, humans and animals, and the trajectories of the objects in each class are arranged separately to generate the abstract. The method is thus mainly divided into four steps: foreground detection, target tracking, object classification and abstract generation. Fig. 3 is a flow chart of the video abstract generation method according to an embodiment of the invention, as shown in the figure:

The foreground detection step mainly comprises: performing background modeling on the image with a Gaussian mixture and extracting the moving foreground, with illumination and shadow handled accordingly during the computation. The number of Gaussian functions in the mixture may be chosen according to the video scene, and a separate Gaussian model may be trained for shadow or illumination.

The target tracking step mainly comprises: tracking the targets detected in each frame, for which a simple nearest-neighbor method may be used, and storing the trajectories and the background image. If a foreground object in the current frame is associated with a trajectory stored from the previous frame, the trajectory is updated; if it is not associated with any trajectory, a new trajectory is created; if a trajectory is not associated with any foreground object detected in the current frame, that trajectory is terminated and processing proceeds to the next iteration. The trajectories are stored for subsequent generation of the video abstract.
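The three outcomes in this step (update, create, terminate) can be sketched as one per-frame update function. The names `update_tracks`, `close_enough` and the dictionary layout of a track are hypothetical, and the association predicate is supplied by the caller; in the example, detections are plain 1-D positions with a distance test, standing in for the intersection-area test described earlier.

```python
def update_tracks(active, finished, detections, frame_index, associated):
    """One tracking step: extend matched tracks, retire unmatched ones, start new ones."""
    still_active = []
    unmatched = list(detections)
    for track in active:
        last = track["boxes"][-1]
        match = next((d for d in unmatched if associated(last, d)), None)
        if match is not None:
            track["boxes"].append(match)      # associated: update the trajectory
            unmatched.remove(match)
            still_active.append(track)
        else:
            finished.append(track)            # no match: the trajectory has ended
    for detection in unmatched:               # unclaimed foreground: new trajectory
        still_active.append({"start": frame_index, "boxes": [detection]})
    return still_active


def close_enough(a, b):
    """Illustrative association test on 1-D positions."""
    return abs(a - b) <= 2


detections_per_frame = [[0, 50], [1, 51], [2], [3, 80]]
active, finished = [], []
for index, detections in enumerate(detections_per_frame):
    active = update_tracks(active, finished, detections, index, close_enough)
```

After the four frames, the target starting at position 0 is still being tracked, the one starting at 50 has been moved to `finished` (ready to be stored for the abstract), and the detection at 80 has opened a new trajectory.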

The object classification step mainly comprises: training models of several kinds of objects in advance with an SVM, and then classifying the object of each detected trajectory. The detected targets are classified with the pre-trained models; the SVM algorithm is used to train in advance models of the required kinds of objects, for example motor vehicles, motorcycles, humans and animals.

The abstract generation step mainly comprises: using the trajectories of the extracted moving targets and the stored background image, arranging the trajectories according to the temporal and spatial relationships in which they appear, and then superimposing the moving-target trajectories on the stored background image to generate the abstract. In other words, the trajectory information of all extracted moving targets and the stored background are ordered by a certain rule, and the trajectories are then superimposed on the background to generate the video abstract.

From the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although the former is the better implementation in many cases. On this understanding, the essence of the technical scheme of the present invention, or the part of it that contributes over the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and comprises instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to carry out the methods described in the embodiments of the present invention or in certain parts of the embodiments.

The above is only the preferred implementation of the present invention. It should be pointed out that those of ordinary skill in the art may make improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims

1. A video abstract generation method, characterized by comprising:

Performing background modeling on a target frame image in an original video to obtain a background model;

Extracting moving targets from the target frame image using the background model;

Classifying the extracted moving targets using a preset classifier;

Arranging the trajectories of the extracted moving targets according to the classification result to generate an abstract.

2. The video abstract generation method according to claim 1, characterized in that performing background modeling on the target frame image in the original video comprises:

Processing the target frame image with a Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.

3. The video abstract generation method according to claim 1, characterized in that classifying the extracted moving targets using a preset classifier comprises:

Classifying the extracted moving targets using a preset object classifier model trained by a support vector machine.

4. The video abstract generation method according to claim 1, characterized in that, after extracting the moving targets from the target frame image using the background model, the method further comprises:

Performing trajectory tracking between the moving target in the target frame image and the moving target in the frame preceding the target frame, to obtain the motion trajectory of the moving target.

5. The video abstract generation method according to claim 4, characterized in that performing trajectory tracking between the moving target in the target frame image and the moving target in the frame preceding the target frame comprises:

Calculating the intersection area of the moving target in the target frame image and the moving target in the frame preceding the target frame;

Determining whether the intersection area is greater than a preset area value;

When the intersection area is greater than the preset area value, updating the motion trajectory according to the position of the moving target in the target frame image;

When the intersection area is less than or equal to the preset area value, generating a new motion trajectory according to the position of the moving target in the target frame image.

6. The video abstract generation method according to claim 5, characterized in that arranging the trajectories of the extracted moving targets according to the classification result comprises:

Arranging each motion trajectory according to the time at which the motion trajectory of the moving target appears and its spatial position;

Superimposing the arranged motion trajectories on a background image.

7. A video abstract generation device, characterized by comprising:

A background modeling module for performing background modeling on a target frame image in an original video to obtain a background model;

A moving target extraction module for extracting moving targets from the target frame image using the background model;

A moving target classification module for classifying the extracted moving targets using a preset classifier;

An abstract generation module for arranging the trajectories of the extracted moving targets according to the classification result to generate an abstract.

8. The video abstract generation device according to claim 7, characterized in that the background modeling module is also used to: process the target frame image with a Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.

9. The video abstract generation device according to claim 7, characterized in that the moving target classification module is also used to: classify the extracted moving targets using a preset object classifier model trained by a support vector machine.

10. The video abstract generation device according to claim 7, characterized by further comprising:

A trajectory tracking module for performing trajectory tracking between the moving target in the target frame image and the moving target in the frame preceding the target frame, to obtain the motion trajectory of the moving target.