Summary of the invention
The object of this invention is to provide a video summary generation and indexing method that overcomes the above shortcomings of the currently available technology.
The object of the invention is achieved through the following technical solution:
A video summary generation and indexing method comprises the following steps:
1) Background modeling: perform background modeling on the frame images of the original video to extract the background and separate it from the original video;
2) Moving-target extraction: compare the current image with the background model and segment it, determining the moving targets according to the comparison result;
3) Moving-target tracking: match the spatial distribution information across frames with the features of the targets segmented from each frame image, track the targets, and record each target's motion trajectory;
4) Moving-target position correction: correct the set of tracked targets, mainly correcting each target's position in the images of its sequence; and
5) Summary synthesis and video index construction: superimpose the moving targets onto the background, so that activities that did not occur simultaneously in the original video are played simultaneously in the video summary without occlusion (or with little occlusion), producing a summary video that is relatively compact in time and space and contains the required activities of the original video; while synthesizing the video, record the video data to form an index data file.
Further, the background modeling in step 1) adopts a color background model.
Preferably, the color background model specifically adopts a mixture-of-Gaussians background algorithm.
Further, in step 2), if the extracted moving targets exhibit the defects of holes or noise interference, morphological opening and closing operations are applied to eliminate the holes and noise. If a single target is split into two or more targets during extraction, the mutual spatial distances between all targets extracted in each frame are calculated, and targets whose distance is less than a threshold Λ are identified as the same target.
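As an illustration of the fragment-merging rule above, the following sketch groups extracted targets whose mutual distance is below the threshold Λ and fuses each group into one target. It is a minimal Python sketch under assumed conventions (targets as axis-aligned boxes (x, y, w, h), nearest-point distance between boxes); the function names are illustrative, not from the original.

```python
from math import hypot

def box_distance(a, b):
    """Nearest-point distance between two axis-aligned boxes (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ah), ay - (by + bh), 0)
    return hypot(dx, dy)

def merge_fragments(boxes, threshold=15):
    """Group boxes whose mutual distance is below `threshold` (union-find),
    then replace each group by the bounding box of its members."""
    parent = list(range(len(boxes)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_distance(boxes[i], boxes[j]) < threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(boxes[i])
    merged = []
    for members in groups.values():
        x1 = min(b[0] for b in members)
        y1 = min(b[1] for b in members)
        x2 = max(b[0] + b[2] for b in members)
        y2 = max(b[1] + b[3] for b in members)
        merged.append((x1, y1, x2 - x1, y2 - y1))
    return merged
```

With Λ = 15 as in the embodiment, two fragments 10 pixels apart become one target while a distant blob stays separate.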
Further, step 3) specifically comprises the following steps:
A) The tracking module matches the moving targets of consecutive frames using their spatial distribution information and color features; a successful match is regarded as the same target and its motion trajectory is recorded, while an unsuccessful match is regarded as a new moving target.
B) The tracking results are stored in a set Ω. An object in Ω is represented as follows:

O_i = {o_i1, o_i2, ..., o_ik}

where {o_i1, o_i2, ..., o_ik} represents the sequence of occurrences of target O_i in the video.
Further, the matching-and-tracking method comprises the following two variants:
First: the targets newly segmented from a frame are matched against the targets in the set Ω, using the following functions.
Time-difference function:

T(β, O_i) = 1 if |t(β) − t(O_i)| < λ, otherwise 0

where β represents a newly extracted target, O_i represents a target in the set Ω, t(β) represents β's timestamp, t(O_i) represents O_i's timestamp, and λ is a defined time-difference threshold.
Distance-difference function:

D(β, O_i) = 1 if d(β, O_i) < δ, otherwise 0

where β represents a newly extracted target, O_i represents a target in the set Ω, d(β, O_i) represents the spatial distance between β and O_i, and δ is a defined distance-difference threshold.
Comparison function (the product of the time-difference and distance-difference functions):

C(β, O_i) = T(β, O_i) × D(β, O_i)

If the comparison function C(β, O_i) is 1, the color-histogram distance between β and O_i is calculated; if it meets the histogram distance threshold, the match succeeds and β is added to O_i's sequence. If the match fails, or C(β, O_i) is 0, β is a new target and is added to the set Ω.
Second: using the first method, the target β is first compared with the most recent frame of a target sequence O_i; if the match fails, it is compared with the previous frame, and so on, up to the M-th most recent frame.
Further, the moving-target position correction in step 4) specifically comprises the following steps:
First step: after the whole video has been processed, for each target O_i in Ω, collect the widths and heights of the targets in its sequence and sort them. The sorted width sequence is expressed as:

W_i = {w_1, w_2, ..., w_k}, with w_1 ≥ w_2 ≥ ... ≥ w_k

The sorted height sequence is expressed as:

H_i = {h_1, h_2, ..., h_k}, with h_1 ≥ h_2 ≥ ... ≥ h_k

Second step: calculate the mean values of the above sequences to obtain the target's width w̄ and height h̄, and correct the position of each target in the target sequence according to w̄ and h̄.
Further, during summary synthesis in step 5), for each frame of the video summary, the codes of the moving targets participating in the merge, their positions, and their first-appearance timestamps are recorded, and these values are saved in the index file.
The beneficial effects of the present invention are: the tracking success rate after moving-target tracking is guaranteed, which greatly improves the quality of the generated video summary; and the video-index function enables users to browse the video quickly and conveniently locate and watch the complete corresponding footage in the original video.
Embodiment
As shown in Figure 1, the method of the embodiment of the present invention consists of background modeling, moving-target extraction, moving-target tracking, moving-target correction, and summary synthesis with video indexing. The concrete steps are as follows:
1, background modeling
The background modeling module can use various image background modeling algorithms, which fall into two classes: color background models and texture background models. The idea of a color background model is to model the color value (grayscale or color) of each pixel in the image. If the pixel color value at coordinate (x, y) of the current image differs significantly from the value at (x, y) in the background model, the current pixel is considered foreground; otherwise it is background.
The background modeling module of the present embodiment uses the mixture-of-Gaussians algorithm among the color background models. The Gaussian mixture background model (Gaussian Mixture Model) was developed on the basis of the single Gaussian model: a weighted sum of several Gaussian probability density functions smoothly approximates a density function of arbitrary shape. The mixture model assumes that the color of each pixel is described by K Gaussian distributions, where K is generally 3 to 5. In the present embodiment, K is 3.
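The per-pixel mixture-of-Gaussians idea described above can be sketched for a single grayscale pixel with K = 3. This is a simplified, illustrative model: the class name, the learning rate alpha, and the 2.5-sigma matching rule are common conventions assumed here, not taken from the original.

```python
from dataclasses import dataclass

@dataclass
class Gaussian:
    mean: float
    var: float
    weight: float

class PixelMixtureModel:
    """Simplified per-pixel mixture-of-Gaussians background model (grayscale).

    K Gaussians per pixel; an observed value matching a component (within
    match_sigma standard deviations) updates it, otherwise the weakest
    component is replaced. A pixel is background when its matched
    component carries a dominant weight."""

    def __init__(self, k=3, alpha=0.05, match_sigma=2.5):
        self.alpha, self.match_sigma = alpha, match_sigma
        self.components = [Gaussian(0.0, 900.0, 1.0 / k) for _ in range(k)]

    def update(self, value):
        """Feed one pixel value; return True if it is classified background."""
        matched = None
        for g in self.components:
            if (value - g.mean) ** 2 <= (self.match_sigma ** 2) * g.var:
                matched = g
                break
        # Decay all weights; reinforce the matched component.
        for g in self.components:
            g.weight = (1 - self.alpha) * g.weight + (self.alpha if g is matched else 0.0)
        if matched is not None:
            rho = self.alpha
            matched.mean = (1 - rho) * matched.mean + rho * value
            matched.var = (1 - rho) * matched.var + rho * (value - matched.mean) ** 2
        else:
            # No component fits: replace the weakest with a wide new one.
            weakest = min(self.components, key=lambda g: g.weight)
            weakest.mean, weakest.var, weakest.weight = value, 900.0, 0.05
        total = sum(g.weight for g in self.components)
        for g in self.components:
            g.weight /= total
        return matched is not None and matched.weight > 0.5
```

A value observed repeatedly becomes background as its component's weight grows, while a sudden different value is flagged as foreground.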
2. Moving-target extraction
After background modeling, the current image is compared with the background model, and the moving targets to be detected are determined according to the comparison result. The foreground obtained in this way generally contains a lot of noise. To eliminate the noise, the present embodiment applies an opening operation and a closing operation to the extracted moving-target image, and then discards the smaller contours.
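The opening and closing operations mentioned above can be sketched in pure Python on a binary foreground mask represented as a set of pixel coordinates. This is a didactic sketch with a 3x3 structuring element; a real system would use an image-processing library.

```python
# Offsets of a 3x3 structuring element (including the center).
SE = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

def dilate(pixels):
    """Add every neighbor of every foreground pixel."""
    return {(x + dx, y + dy) for (x, y) in pixels for dx, dy in SE}

def erode(pixels):
    """Keep only pixels whose entire 3x3 neighborhood is foreground."""
    return {(x, y) for (x, y) in pixels
            if all((x + dx, y + dy) in pixels for dx, dy in SE)}

def opening(pixels):
    """Erosion then dilation: removes isolated noise pixels."""
    return dilate(erode(pixels))

def closing(pixels):
    """Dilation then erosion: fills small holes inside a target."""
    return erode(dilate(pixels))
```

Opening deletes a lone noise pixel while preserving a solid blob; closing fills a one-pixel hole inside a blob.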
After extracting the targets, the present embodiment counts the total number of pixels contained in each target; if a target's pixel count is less than 400 pixels, the present embodiment regards the target as interference, eliminates it, and does not process it further.
To solve the problem of a single target being segmented into two or more targets, the mutual spatial distances between all targets in the current frame are calculated, in units of pixels, and targets whose distance is less than the threshold Λ are identified as the same target. In the present embodiment, Λ is 15 pixels.
3. Moving-target tracking module
For a moving target of the current frame, because the inter-frame time interval is very short, the space occupied by the moving target and its spatial position change little between frames. The present embodiment therefore matches the moving targets of consecutive frames using their spatial distribution information and color features: a successful match is regarded as the same target and its motion trajectory is recorded, while an unsuccessful match is regarded as a new moving target.
The tracking results are stored in a set Ω. An object in Ω is represented as follows:

O_i = {o_i1, o_i2, ..., o_ik}

where {o_i1, o_i2, ..., o_ik} represents the sequence of occurrences of target O_i in the video.
If the extraction module extracts a target poorly in a certain frame, tracking may fail. To improve the tracking success rate, the following two methods are adopted:
1) A target newly segmented from a frame is matched not only against the targets of the previous frame, but against the targets in the set Ω, using the following functions.
Time-difference function:

T(β, O_i) = 1 if |t(β) − t(O_i)| < λ, otherwise 0

where β represents a newly extracted target, O_i represents a target in the set Ω, t(β) represents β's timestamp, t(O_i) represents O_i's timestamp, and λ is a defined time-difference threshold.
Distance-difference function:

D(β, O_i) = 1 if d(β, O_i) < δ, otherwise 0

where β represents a newly extracted target, O_i represents a target in the set Ω, d(β, O_i) represents the spatial distance between β and O_i, and δ is a defined distance-difference threshold.
Comparison function (the product of the time-difference and distance-difference functions):

C(β, O_i) = T(β, O_i) × D(β, O_i)

If the comparison function C(β, O_i) is 1, the color-histogram distance between β and O_i is calculated; if it meets the histogram distance threshold, the match succeeds and β is added to O_i's sequence. If the match fails, or C(β, O_i) is 0, β is a new target and is added to the set Ω.
2) In the method above, the target is compared only with the most recent frame of a target sequence; if that last frame was extracted poorly, tracking can fail. Therefore the target β is first compared with the most recent frame of a target sequence O_i; if the match fails, it is compared with the previous frame, and so on, up to the M-th most recent frame.
In the present embodiment, the frame number at which a target appears in the video stream is used as its timestamp; the first frame is numbered 0 and the numbers increase successively. The time-difference threshold λ of the present embodiment is 15, meaning that the timestamp difference between a target to be matched and a target in the set Ω should be within 15 frames.
In the present embodiment, when calculating the distance between a target to be matched and a target in the set Ω, the pixel distance between the two targets' nearest points is used as their distance. The distance-difference threshold δ is 20, meaning that the distance between a target to be matched and a target in the set Ω should be within 20 pixels.
In the tracking module of the present embodiment, M is 10, meaning that a target to be matched can be compared with the last 10 targets of the sequence of a target in the set Ω; the comparisons are performed in reverse order of the targets' appearance times.
The present embodiment computes the color histograms of the target to be matched and of a target in the set Ω, and calculates the Bhattacharyya distance between the two histograms to describe their similarity. If the Bhattacharyya distance is less than 0.6, the target to be matched and the target in the set Ω match, and the new target is added to that target's sequence. If the target to be matched matches none of the targets in Ω, it is given a new target code and added to the set Ω.
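The matching procedure of this embodiment (time gate within 15 frames, distance gate within 20 pixels, Bhattacharyya histogram distance below 0.6, comparison against the last M = 10 observations in reverse time order) can be sketched as follows. The data layout and function names are illustrative assumptions, and the spatial distance here uses target centers rather than nearest points, for brevity.

```python
from math import sqrt, hypot

LAMBDA_T = 15      # time-difference threshold, in frames
LAMBDA_D = 20.0    # distance threshold, in pixels
M = 10             # how many recent observations of a sequence to try
HIST_THRESH = 0.6  # Bhattacharyya distance threshold

def bhattacharyya(p, q):
    """Bhattacharyya (Hellinger-style) distance between normalized histograms."""
    bc = sum(sqrt(a * b) for a, b in zip(p, q))
    return sqrt(max(0.0, 1.0 - bc))

def matches(beta, obs):
    """Time gate x distance gate, then the color-histogram check."""
    if abs(beta["t"] - obs["t"]) >= LAMBDA_T:
        return False
    if hypot(beta["x"] - obs["x"], beta["y"] - obs["y"]) >= LAMBDA_D:
        return False
    return bhattacharyya(beta["hist"], obs["hist"]) < HIST_THRESH

def track(omega, beta):
    """Append beta to a matching sequence in omega, or start a new one."""
    for seq in omega:
        # Compare against the last M observations, newest first.
        if any(matches(beta, obs) for obs in reversed(seq[-M:])):
            seq.append(beta)
            return omega
    omega.append([beta])  # no match anywhere: beta is a new target
    return omega
```

A nearby detection one frame later joins the existing sequence; a far-away detection opens a new one.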
4. Moving-target position correction module
After the whole video has been processed, the moving-target position correction module collects, for each target in Ω, the widths and heights of the targets in its sequence, and corrects the positions of the targets in each sequence. The widths and heights in a sequence are sorted from large to small. The sorted width sequence is expressed as:

W_i = {w_1, w_2, ..., w_k}, with w_1 ≥ w_2 ≥ ... ≥ w_k

The sorted height sequence is expressed as:

H_i = {h_1, h_2, ..., h_k}, with h_1 ≥ h_2 ≥ ... ≥ h_k

The mean values of the top N widths and heights after sorting give the average width and average height; here N is 20% of the total sequence length. During correction, the target width is adjusted symmetrically left and right, and the target height symmetrically up and down, according to the principle of keeping the target's center aligned.
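The correction described above can be sketched as follows: sort a sequence's widths and heights in descending order, average the top 20%, and resize every box symmetrically about its center. Boxes are assumed to be (x, y, w, h) tuples; the names are illustrative.

```python
def correct_sequence(boxes, top_frac=0.2):
    """Normalize a target sequence's boxes (x, y, w, h) to a stable size.

    The corrected size is the mean of the top `top_frac` fraction of the
    descending-sorted widths and heights; each box is then resized
    symmetrically about its own center."""
    n = max(1, int(len(boxes) * top_frac))  # N = 20% of the sequence length
    w_avg = sum(sorted((b[2] for b in boxes), reverse=True)[:n]) / n
    h_avg = sum(sorted((b[3] for b in boxes), reverse=True)[:n]) / n
    corrected = []
    for x, y, w, h in boxes:
        cx, cy = x + w / 2, y + h / 2  # keep the target's center aligned
        corrected.append((cx - w_avg / 2, cy - h_avg / 2, w_avg, h_avg))
    return corrected
```

An undersized detection (e.g. a partially extracted target) is expanded to the sequence's dominant size without moving its center.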
Figure 2 shows the positions of a target sequence in the image before correction, and Figure 3 shows the positions of the same target sequence after correction. Correcting the moving-target positions alleviates the problem of incompletely extracted targets in the extraction process and improves the quality of the generated video summary.
5. Summary synthesis and video indexing
This module mainly synthesizes the tracked moving targets with the video background: activities that did not occur simultaneously in the original video are played simultaneously in the video summary without occlusion (or with little occlusion), producing a summary video that is relatively compact in time and space and contains the required activities of the original video.
For each frame image of the video summary, selecting which moving targets appear simultaneously is the key to synthesis. The present embodiment decides this by calculating an energy loss function for each moving target. This function is composed of a moving-target time-difference loss term and a moving-target collision loss term; the moving targets whose energy loss values meet the condition are selected for merging.
Before each frame of the video summary is produced, the moving targets are divided into three classes: merged (S1), being merged (S2), and to be merged (S3). Targets are taken from S3 in order of appearance time, the energy loss function between each candidate and the set S2 is calculated, and those whose loss meets the threshold are merged into the same summary frame in turn.
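A greedy selection in the spirit of the above can be sketched as follows. The energy loss here is reduced to the collision term only (bounding-box overlap against targets already being merged); the time-difference loss term described in the text is omitted for brevity, so this is an assumption-laden illustration rather than the embodiment's exact function.

```python
def overlap_area(a, b):
    """Overlap area of two axis-aligned boxes (x, y, w, h)."""
    ox = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    oy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return ox * oy

def select_for_frame(s2, s3, max_loss=0):
    """Move candidates from s3 (to be merged) into s2 (being merged),
    in order of first appearance, keeping only those whose collision
    loss against every target already in s2 stays within max_loss."""
    for cand in sorted(s3, key=lambda t: t["first_t"]):
        loss = sum(overlap_area(cand["box"], cur["box"]) for cur in s2)
        if loss <= max_loss:
            s2.append(cand)
            s3.remove(cand)
    return s2, s3
```

A candidate that would overlap a target already in the frame waits in S3; a candidate landing on free background is merged immediately.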
When merging, a background image must be provided; the background image corresponding to the earliest appearance time among the moving targets in the frame is chosen as the background.
When merging the moving targets, the codes of the moving targets participating in each frame, their positions, and their first-appearance timestamps are recorded, and these values are saved in the index file.
When the user clicks on the video, it is judged whether the mouse position falls within the envelope of a moving target; if the mouse position is within a target's range, the index file is searched to obtain the time at which that target appears in the original video.
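The index-file round trip described above can be sketched as follows: each record stores a target code, its original first-appearance timestamp, and its per-summary-frame bounding boxes, and a click is mapped back to the original time. The JSON layout and field names are illustrative assumptions, not the patent's format.

```python
import json

def build_index(targets):
    """Serialize index records. Each target dict holds 'code', its
    original first-appearance timestamp 'orig_t', and per-summary-frame
    boxes as {frame_number: [x, y, w, h]}."""
    # JSON object keys must be strings, so frame numbers are converted.
    return json.dumps([
        {"code": t["code"], "orig_t": t["orig_t"],
         "boxes": {str(f): b for f, b in t["boxes"].items()}}
        for t in targets
    ])

def lookup(index_json, frame, x, y):
    """Map a click at (x, y) in summary frame `frame` back to the clicked
    target's original timestamp; return None if no target is hit."""
    for t in json.loads(index_json):
        box = t["boxes"].get(str(frame))
        if box:
            bx, by, bw, bh = box
            if bx <= x < bx + bw and by <= y < by + bh:
                return t["orig_t"]
    return None
```

A click inside a target's box in the right frame returns its original time; clicks elsewhere, or in frames where the target is absent, return nothing.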
The present invention is not limited to the above preferred embodiment. Anyone may derive products in various other forms under the enlightenment of the present invention; however, regardless of any change in shape or structure, every technical solution identical or similar to that of the present application falls within the protection scope of the present invention.