Summary of the invention
The present invention provides a video summary generation method and a video summary generation apparatus, to solve the problem that prior-art video summary generation methods cannot generate video summaries separately for different kinds of moving targets.
The technical solution adopted by the present invention to solve this technical problem is as follows:
According to one aspect of the present invention, a video summary generation method is provided. The method comprises: performing background modeling on a target frame image in an original video to obtain a background model; extracting moving targets from the target frame image using the background model; classifying the extracted moving targets using a preset classifier; and arranging the trajectories of the extracted moving targets according to the classification result to generate a summary.
Further, performing background modeling on the target frame image in the original video comprises: processing the target frame image with a Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.
Further, classifying the extracted moving targets using a preset classifier comprises: classifying the extracted moving targets using a preset object classifier model trained by a support vector machine (SVM).
Further, after extracting the moving targets from the target frame image using the background model, the method also comprises:
performing trajectory tracking between the moving target of the target frame image and the moving target of the frame preceding the target frame, to obtain the movement trajectory of the moving target.
Further, performing trajectory tracking between the moving target of the target frame image and the moving target of the preceding frame comprises: calculating the intersection area between the moving target of the target frame image and the moving target of the preceding frame; judging whether the intersection area is greater than a preset area value; when the intersection area is greater than the preset area value, updating the movement trajectory according to the position in the image of the moving target of the target frame image; and when the intersection area is less than or equal to the preset area value, generating a new movement trajectory according to the position in the image of the moving target of the target frame image.
Further, arranging the trajectories of the extracted moving targets according to the classification result comprises: arranging the movement trajectories of the moving targets according to the temporal relationship and spatial position in which the trajectories appear; and superimposing the arranged movement trajectories on the background image.
According to another aspect of the present invention, a video summary generation apparatus is provided. The apparatus comprises: a background modeling module, configured to perform background modeling on a target frame image in an original video to obtain a background model; a moving target extraction module, configured to extract moving targets from the target frame image using the background model; a moving target classification module, configured to classify the extracted moving targets using a preset classifier; and a summary generation module, configured to arrange the trajectories of the extracted moving targets according to the classification result to generate a summary.
Further, the background modeling module is also configured to: process the target frame image with a Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.
Further, the moving target classification module is also configured to: classify the extracted moving targets using a preset object classifier model trained by a support vector machine.
Further, the video summary generation apparatus also comprises: a trajectory tracking module, configured to perform trajectory tracking between the moving target of the target frame image and the moving target of the frame preceding the target frame, to obtain the movement trajectory of the moving target.
With the technical solution of the present invention, after the moving targets are extracted from the image, a preset classifier is used to build models of objects commonly found in video, the extracted foreground (that is, the moving targets) is classified, and a video summary is generated according to the classification result. A video summary that meets the user's needs is thus generated completely and accurately, so that the user can quickly obtain the required video information from the summary, which improves the efficiency with which the video is used. Because the summary is ultimately generated by object class, it is coherent and serves the purpose of fast multi-target browsing.
Embodiment
It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
An embodiment of the invention provides a video summary generation apparatus. Fig. 1 is a schematic diagram of the video summary generation apparatus according to the embodiment of the invention. As shown in Fig. 1, the apparatus comprises: a background modeling module 11, configured to perform background modeling on a target frame image in an original video to obtain a background model; a moving target extraction module 13, configured to extract moving targets from the target frame image using the background model; a moving target classification module 15, configured to classify the extracted moving targets using a preset classifier; and a summary generation module 17, configured to arrange the trajectories of the extracted moving targets according to the classification result to generate a summary.
With the video summary generation apparatus of this embodiment, after the background model is established, a preset classifier is used to build models of objects commonly found in video, the extracted foreground (that is, the moving targets) is classified, and a video summary is generated according to the classification result. A video summary that meets the user's needs is thus generated completely and accurately, so that the user can quickly obtain the required video information from the summary, which improves the efficiency with which the video is used.
The background modeling module 11 may use any of various image background modeling algorithms. After the background model is established, the current image is compared with the background model, and the foreground targets (that is, the moving targets to be extracted) are determined from the comparison result. Specifically, the background modeling algorithm may use a color background model or a texture background model. A color background model models the color value (grayscale or color) of each pixel in the image. If the color value of the pixel at coordinate (x, y) in the current image differs substantially from the color value at (x, y) in the background model, the current pixel is treated as foreground; otherwise it is treated as background.
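By way of illustration only, the per-pixel comparison described above can be sketched as follows for a grayscale image. The function name `foreground_mask` and the threshold value of 30 are assumptions chosen for the example; the text does not specify how large a "substantial" difference is.

```python
def foreground_mask(frame, background, threshold=30):
    """Return True for each pixel whose gray value differs from the
    background model by more than `threshold` (an assumed value)."""
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([abs(p - b) > threshold for p, b in zip(frame_row, bg_row)])
    return mask


# A 2x3 grayscale background and a current frame with one bright column.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 200, 9], [10, 180, 11]]
mask = foreground_mask(frame, background)
```

Only the middle column is marked as foreground here, because the small differences elsewhere stay below the threshold.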
The background modeling module 11 of the video summary generation apparatus of this embodiment may preferably use the Gaussian mixture background algorithm, which is one kind of color background model. The Gaussian mixture background model (Gaussian Mixture Model) is an improvement on the single-Gaussian background model: it smoothly approximates a density distribution function of arbitrary shape by a weighted average of several Gaussian probability density functions, and is particularly suitable for processing images of outdoor environments. By exploiting these characteristics of the Gaussian mixture background algorithm, the background modeling module 11 can identify moving targets in outdoor video quickly and accurately.
When performing background modeling, the background modeling module 11 may apply corresponding filtering to illumination and shadows in the target frame image, so that illumination changes and shadows are not mistaken for moving targets and do not affect the generation of the video summary.
Where the Gaussian mixture background algorithm is used, the background modeling module 11 may also be configured to: process the target frame image with the Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image.
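The weighted combination of Gaussian densities mentioned above can be sketched for a single pixel as follows. This is a minimal illustration, not the method of the embodiment: the component values are invented, and the matching rule in `is_background` (a value within `k = 2.5` standard deviations of some component counts as background) is a common convention in Gaussian mixture background subtraction that is assumed here rather than taken from the text.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Weighted sum of Gaussian densities; components are (weight, mean, std)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


def is_background(x, components, k=2.5):
    """Assumed matching rule: x is background if it lies within k standard
    deviations of the mean of at least one mixture component."""
    return any(abs(x - m) <= k * s for _, m, s in components)


# Hypothetical two-mode background for one pixel, e.g. road and swaying foliage.
components = [(0.7, 100.0, 5.0), (0.3, 180.0, 10.0)]
near_mode = is_background(103.0, components)
between_modes = is_background(140.0, components)
```

A gray value of 103 matches the first component, while 140 falls between the two modes and would be treated as foreground under this rule.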
The preset classifier used by the moving target classification module 15 may be based on any of various classification algorithms. In this embodiment, a support vector machine (Support Vector Machine, SVM) may be used to train models of objects that often appear in video (for example vehicles such as bicycles, automobiles and motorcycles, or animals). A support vector machine starts from the linearly separable case. For the linearly inseparable case, it uses a nonlinear mapping to transform samples that are linearly inseparable in the low-dimensional input space into a high-dimensional feature space in which they become linearly separable, so that a linear algorithm in the high-dimensional feature space can analyze the nonlinear characteristics of the samples. At the same time, based on the theory of structural risk minimization, the support vector machine constructs an optimal separating hyperplane in the feature space, so that the learner attains a global optimum and the expected risk over the whole sample space satisfies a certain upper bound with a certain probability.
The support vector machine algorithm uses a nonlinear mapping p to map the sample space into a high-dimensional or even infinite-dimensional feature space (a Hilbert space), so that a problem that is not linearly separable in the original sample space becomes a linearly separable problem in the feature space; in other words, linear separability is achieved by raising the dimension. To deal with the computational complexity that raising the dimension could bring, the support vector machine applies the expansion theorem of kernel functions, so that the explicit expression of the nonlinear mapping need not be known. Because the linear learning machine is built in the high-dimensional feature space, the computational complexity hardly increases compared with a linear model, and different models can be generated by choosing different kernel functions.
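The point that a kernel function removes the need for an explicit nonlinear mapping can be checked numerically. The sketch below uses the degree-2 polynomial kernel on two-dimensional inputs purely as an example; the text does not say which kernel the embodiment uses, and the names `explicit_map` and `poly_kernel` are illustrative.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def explicit_map(x):
    """Explicit feature map of the kernel (x . y)^2 for 2-D input:
    (x1, x2) -> (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, y):
    """Same inner product, computed in the original 2-D space."""
    return dot(x, y) ** 2


x = (1.0, 2.0)
y = (3.0, -1.0)
via_mapping = dot(explicit_map(x), explicit_map(y))
via_kernel = poly_kernel(x, y)
```

Both routes give the same value, but the kernel route never constructs the three-dimensional feature vectors, which is what keeps the cost low when the feature space is very high-dimensional.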
In this embodiment, SVM training may be performed in advance on images of several kinds of objects, and the resulting typical models stored. The moving target classification module 15 may thus also be configured to classify the extracted moving targets using the preset object classifier model trained by the support vector machine.
The video summary generation apparatus of this embodiment may also comprise a trajectory tracking module, which tracks the moving targets detected in every frame. Tracking may use a nearest-neighbor method. The specific step may be: performing trajectory tracking between the moving target of the target frame image and the moving target of the frame preceding the target frame, to obtain the movement trajectory of the moving target.
Trajectory tracking may involve several decisions, namely trajectory association, trajectory creation and trajectory disappearance. The specific decision procedure is: calculate the intersection area between the moving target of the target frame image and the moving target of the preceding frame; judge whether the intersection area is greater than a preset area value; when the intersection area is greater than the preset area value, update the movement trajectory according to the position in the image of the moving target of the target frame image; when the intersection area is less than or equal to the preset area value, generate a new movement trajectory according to the position in the image of the moving target of the target frame image. These steps are repeated until all moving targets extracted from the current frame have been traversed.
Suppose the area of the moving target in the frame preceding the target frame is $S_{pre}$ and the area of the moving target in the target frame image is $S_{temp}$. The preset area value may be set to $\min(S_{pre}, S_{temp}) \times R$. When the intersection area $S_{cross}$ satisfies $S_{cross} > \min(S_{pre}, S_{temp}) \times R$, the moving target of the target frame image is judged to be associated with the trajectory of the moving target of the preceding frame, and that movement trajectory is updated according to the position in the image of the moving target of the target frame image. In this formula R is the intersection ratio; it may be set from empirical values, and a typical value is 0.4.
The intersection area $S_{cross}$ is computed as $S_{cross} = Width_{cross} \times Height_{cross}$, where

$Width_{cross} = \min(right_{pre}, right_{temp}) - \max(left_{pre}, left_{temp})$.

Here $right_{pre}$ is the maximum abscissa, in image coordinates, of the moving target in the preceding frame, and represents its rightmost position in the preceding frame image; $left_{pre}$ is the minimum abscissa of the moving target in the preceding frame, and represents its leftmost position in the preceding frame image; $right_{temp}$ is the maximum abscissa of the moving target in the target frame, and represents its rightmost position in the target frame image; $left_{temp}$ is the minimum abscissa of the moving target in the target frame, and represents its leftmost position in the target frame image. Accordingly, $\min(right_{pre}, right_{temp})$ is the smaller of $right_{pre}$ and $right_{temp}$, and $\max(left_{pre}, left_{temp})$ is the larger of $left_{pre}$ and $left_{temp}$.
$Height_{cross} = \min(Top_{pre}, Top_{temp}) - \max(Bottom_{pre}, Bottom_{temp})$.

Here $Bottom_{pre}$ is the minimum ordinate, in image coordinates, of the moving target in the preceding frame, and represents its bottom position in the preceding frame image; $Top_{pre}$ is the maximum ordinate of the moving target in the preceding frame, and represents its top position in the preceding frame image; $Top_{temp}$ is the maximum ordinate of the moving target in the target frame, and represents its top position in the target frame image; $Bottom_{temp}$ is the minimum ordinate of the moving target in the target frame, and represents its bottom position in the target frame image. Accordingly, $\min(Top_{pre}, Top_{temp})$ is the smaller of $Top_{pre}$ and $Top_{temp}$, and $\max(Bottom_{pre}, Bottom_{temp})$ is the larger of $Bottom_{pre}$ and $Bottom_{temp}$.
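The width and height formulas can be put together in a short sketch. The `Box` type and function names are illustrative. One detail is an assumption not stated in the text: when the two bounding boxes do not overlap, the width or height given by the formula is negative, so the sketch returns an area of 0 in that case rather than multiplying the two values (two negative values would otherwise give a spurious positive area).

```python
from collections import namedtuple

# Bounding box in the convention of the text: top is the maximum ordinate.
Box = namedtuple("Box", "left right bottom top")


def box_area(box):
    """Area of a bounding box."""
    return (box.right - box.left) * (box.top - box.bottom)


def intersection_area(pre, temp):
    """S_cross = Width_cross * Height_cross, clamped to 0 when disjoint."""
    width = min(pre.right, temp.right) - max(pre.left, temp.left)
    height = min(pre.top, temp.top) - max(pre.bottom, temp.bottom)
    if width <= 0 or height <= 0:
        return 0  # assumed handling of non-overlapping boxes
    return width * height


pre = Box(left=0, right=10, bottom=0, top=10)
temp = Box(left=5, right=15, bottom=5, top=15)
far = Box(left=20, right=30, bottom=20, top=30)
s_cross = intersection_area(pre, temp)
s_far = intersection_area(pre, far)
```

With these boxes the overlapping region is a 5 by 5 square, and the distant box yields no intersection.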
When the intersection area is less than or equal to the preset area value, that is, when $S_{cross} \le \min(S_{pre}, S_{temp}) \times R$, a new movement trajectory is generated according to the position in the image of the moving target of the target frame image, and the earlier movement trajectory is judged to have disappeared.
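One frame of the association, creation and disappearance decisions might look like the sketch below. It is a simplified illustration under stated assumptions: boxes are `(left, right, bottom, top)` tuples, a trajectory is a list of boxes whose last entry comes from the preceding frame, each trajectory takes the first detection that passes the test (the text does not say how competing matches are resolved), and non-overlapping boxes are given an intersection area of 0. Only the ratio `R = 0.4` comes from the text.

```python
R = 0.4  # typical intersection ratio given in the text


def area(b):
    """Area of a (left, right, bottom, top) box."""
    return (b[1] - b[0]) * (b[3] - b[2])


def intersection_area(a, b):
    """Overlap area of two boxes, 0 when they are disjoint."""
    width = min(a[1], b[1]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[2], b[2])
    return width * height if width > 0 and height > 0 else 0


def associate(tracks, detections, ratio=R):
    """Return (active, finished) trajectories after processing one frame."""
    active, finished = [], []
    unmatched = list(detections)
    for track in tracks:
        pre = track[-1]
        match = None
        for det in unmatched:
            # Association test: S_cross > min(S_pre, S_temp) * R
            if intersection_area(pre, det) > min(area(pre), area(det)) * ratio:
                match = det
                break
        if match is None:
            finished.append(track)          # trajectory has disappeared
        else:
            unmatched.remove(match)
            active.append(track + [match])  # trajectory is updated
    for det in unmatched:
        active.append([det])                # a new trajectory is generated
    return active, finished


tracks = [[(0, 10, 0, 10)], [(50, 60, 50, 60)]]
detections = [(2, 12, 1, 11), (100, 110, 100, 110)]
active, finished = associate(tracks, detections)
```

Here the first trajectory is extended, the second has no overlapping detection and is closed, and the distant detection starts a new trajectory.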
The processing performed by the summary generation module 17 may specifically comprise: arranging the movement trajectories of the moving targets according to the temporal relationship and spatial position in which the trajectories appear; and superimposing the arranged movement trajectories on the background image, thereby generating the video summary.
An embodiment of the invention also provides a video summary generation method, which may be carried out by any video summary generation apparatus provided by the above embodiments of the invention. Fig. 2 is a schematic diagram of the video summary generation method according to the embodiment of the invention. The method comprises:
Step S21: perform background modeling on a target frame image in an original video to obtain a background model;
Step S23: extract moving targets from the target frame image using the background model;
Step S25: classify the extracted moving targets using a preset classifier;
Step S27: arrange the trajectories of the extracted moving targets according to the classification result to generate a summary.
Step S21 may specifically comprise: processing the target frame image with the Gaussian mixture background algorithm to obtain a Gaussian mixture model of the target frame image. Step S21 exploits the characteristics of the Gaussian mixture background algorithm, which smoothly approximates a density distribution function of arbitrary shape by a weighted average of several Gaussian probability density functions and is suitable for processing images of outdoor environments, so that moving targets in outdoor video can be identified quickly and accurately.
Step S25 may specifically comprise: classifying the extracted moving targets using a preset object classifier model trained by a support vector machine. The support vector machine uses a nonlinear mapping p to map the sample space into a high-dimensional feature space, so that a problem that is not linearly separable in the original sample space becomes a linearly separable problem in the feature space; in other words, linear separability is achieved by raising the dimension. The object classifier model is obtained by training the support vector machine in advance on images of several kinds of objects, and is then stored.
After step S23, the video summary generation method of this embodiment may also comprise: performing trajectory tracking between the moving target of the target frame image and the moving target of the frame preceding the target frame, to obtain the movement trajectory of the moving target. Specifically, the procedure is: calculate the intersection area between the moving target of the target frame image and the moving target of the preceding frame; judge whether the intersection area is greater than a preset area value; when the intersection area is greater than the preset area value, update the movement trajectory according to the position in the image of the moving target of the target frame image; when the intersection area is less than or equal to the preset area value, generate a new movement trajectory according to the position in the image of the moving target of the target frame image. The specific calculation was described above in connection with the video summary generation apparatus and is not repeated here.
Step S27 may specifically comprise: arranging the movement trajectories of the moving targets according to the temporal relationship and spatial position in which the trajectories appear; and superimposing the arranged movement trajectories on the background image.
The video summary generation method of this embodiment first performs background modeling and detects moving objects; tracks the objects to obtain their trajectories; stores these trajectories and the background image; uses a classifier to classify the detected objects into several common classes such as vehicles, people and animals; and then arranges the trajectories of the objects of each class separately to generate a summary. The method is thus divided into four main steps: foreground detection, target tracking, object classification and summary generation. Fig. 3 is a flowchart of the video summary generation method according to the embodiment of the invention. As shown in the figure:
The foreground detection step mainly comprises: performing background modeling on the image using a Gaussian mixture and extracting the moving foreground, with illumination and shadows handled accordingly during the computation. The number of Gaussian functions in the mixture may be chosen according to the video scene, and a separate Gaussian model may be trained for shadows or illumination.
The target tracking step mainly comprises: tracking the targets detected in every frame, for which a simple nearest-neighbor method may be used, and storing the trajectories and the background image. If a foreground target in the current frame is associated with a trajectory stored from the previous frame, that trajectory is updated; if it is not associated with any trajectory, a new trajectory is created; if a trajectory is not associated with any foreground target detected in the current frame, that trajectory is terminated and processing moves on to the next operation. The trajectories are stored for later use in generating the video summary.
The object classification step mainly comprises: training models of several kinds of objects in advance using an SVM, and then classifying the object of each detected trajectory with the pre-trained models. The SVM algorithm is used here to train in advance the models of the object classes that are needed; for example, models may be trained for motor vehicles, motorcycles, people and animals.
The summary generation step mainly comprises: using the trajectories of the extracted moving targets and the stored background image, arranging the trajectories according to the temporal and spatial relationships in which they appear, and then superimposing the trajectories of the moving targets on the stored background image to generate the summary. In other words, the trajectory information of all extracted moving targets is ordered according to a given rule and superimposed on the stored background to produce the video summary.
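The text does not spell out the rule by which trajectories are ordered, so the following is only one possible sketch of arranging trajectories of a single class by time and space. It assumes a hypothetical trajectory record with a `label`, an original `start` frame and one `(left, right, bottom, top)` box per frame. Trajectories are taken in order of original appearance, and each is placed at the earliest summary frame at which its boxes do not overlap those of trajectories already placed. Superimposing the pixels on the stored background image is not shown.

```python
def boxes_overlap(a, b):
    """True when two (left, right, bottom, top) boxes share positive area."""
    return min(a[1], b[1]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[2], b[2])


def collides(track, start, placed):
    """Check whether `track`, starting at summary frame `start`, overlaps
    any already placed track in any shared summary frame."""
    for other, other_start in placed:
        for i, box in enumerate(track["boxes"]):
            j = start + i - other_start  # index of the same summary frame in `other`
            if 0 <= j < len(other["boxes"]) and boxes_overlap(box, other["boxes"][j]):
                return True
    return False


def arrange(tracks, label):
    """Return [(track, summary_start_frame)] for one object class."""
    chosen = sorted((t for t in tracks if t["label"] == label),
                    key=lambda t: t["start"])
    placed = []
    for track in chosen:
        start = 0
        while collides(track, start, placed):
            start += 1  # delay until the path is free (assumed greedy rule)
        placed.append((track, start))
    return placed


tracks = [
    {"label": "car", "start": 100, "boxes": [(0, 10, 0, 10), (5, 15, 0, 10)]},
    {"label": "car", "start": 500, "boxes": [(0, 10, 0, 10), (5, 15, 0, 10)]},
    {"label": "person", "start": 200, "boxes": [(0, 10, 0, 10)]},
    {"label": "car", "start": 300, "boxes": [(50, 60, 0, 10)]},
]
car_starts = [start for _, start in arrange(tracks, "car")]
person_starts = [start for _, start in arrange(tracks, "person")]
```

In this example the two cars that follow the same path are shown one after the other, while the car in a different part of the image is shown at the same time as the first, which compresses 400 frames of original video into a few summary frames.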
With the technical solution of the present invention, after the moving targets are extracted from the image, a preset classifier is used to build models of objects commonly found in video, the extracted foreground (that is, the moving targets) is classified, and a video summary is generated according to the classification result. A video summary that meets the user's needs is thus generated completely and accurately, so that the user can quickly obtain the required video information from the summary, which improves the efficiency with which the video is used. Because the summary is ultimately generated by object class, it is coherent and serves the purpose of fast multi-target browsing.
From the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software together with the necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is the better implementation. On this understanding, the essence of the technical solution of the present invention, or the part of it that contributes over the prior art, may be embodied in the form of a software product. This computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The above are only preferred implementations of the present invention. It should be pointed out that those of ordinary skill in the art may make further improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the protection scope of the present invention.