Summary of the invention
The present invention provides a video summary generation method and apparatus, which save the storage space occupied by the motion trajectories of irrelevant targets.
To achieve the above object, the technical scheme of the present invention is realized as follows:
The present invention provides a video summary generation method, comprising the steps of:
Step A: obtaining direction information, set by the user, of the moving object to be detected;
Step B: performing foreground detection on each frame of the video image, tracking the detected targets, and extracting the motion trajectories of the targets;
Step C: evaluating the similarity between the direction of an extracted motion trajectory and the direction information set by the user, and storing those motion trajectories whose direction differs from the user-set direction information by less than a predetermined threshold;
Step D: superimposing each frame image of the stored motion trajectories onto the corresponding background to generate the video summary.
Wherein, the direction information in Step A comprises at least one direction vector.
Wherein, Step C comprises the step of:
evaluating the similarity between the direction of the extracted motion trajectory and the direction information set by the user according to the following criterion:
β = (1/n)·Σ_{i=1..n} ∠(a, b_i)
wherein β is the average angle between the extracted motion trajectory and the direction vector set by the user, the sum running over the n segment vectors of the trajectory; T is the predetermined threshold; a is the direction vector, set by the user, of the moving object to be detected; b_i = p_{i+1} - p_i; and p_i is the i-th center point of the extracted motion trajectory;
if the computed value of β is less than the predetermined threshold T, the motion trajectory is stored.
Wherein, performing foreground detection in Step B comprises the step of:
performing background modeling on the image using a mixture-of-Gaussians function and extracting the moving targets.
Wherein, the step of performing background modeling on the image using the mixture-of-Gaussians function and extracting the moving targets further comprises processing illumination and shadow, comprising the steps of:
when the amplitude of illumination variation per unit time in the shooting environment exceeds a predetermined threshold, reducing the value range of pixels judged as background points to 0.4-0.6 times the original;
binarizing the image with a threshold greater than the pixel values of the shadow region, thereby removing the shadow.
Wherein, in Step B, tracking the detected targets comprises the steps of:
traversing all targets detected in the current frame and comparing each with the targets detected in the previous frame image; if the following condition is satisfied:
S_cross > min(S_pre, S_temp) × R
S_cross = Width_cross × Height_cross
Width_cross = min(right_pre, right_temp) - max(left_pre, left_temp)
Height_cross = min(Bottom_pre, Bottom_temp) - max(Top_pre, Top_temp)
wherein S_cross is the intersection area of the contours in the two frames; Width_cross is the length of the intersection projected onto the horizontal direction; Height_cross is the length of the intersection projected onto the vertical direction; right_pre is the value of the right boundary of the previous-frame contour; right_temp is the value of the right boundary of the current-frame contour; left_pre is the value of the left boundary of the previous-frame contour; left_temp is the value of the left boundary of the current-frame contour; Bottom_pre is the value of the lower boundary of the previous-frame contour; Bottom_temp is the value of the lower boundary of the current-frame contour; Top_pre is the value of the upper boundary of the previous-frame contour; Top_temp is the value of the upper boundary of the current-frame contour; and R is the intersection ratio;
the target of the current frame is judged to be associated with the previous frame and the trajectory is updated; if this condition is not satisfied, no association is judged and a new trajectory is created; if the previous frame image contains a trajectory that is not associated with any target detected in the current frame, tracking of that trajectory is terminated and the trajectory is stored.
Wherein, Step B further comprises the steps of:
updating the background;
adjusting the frequency of foreground detection and the frequency of background update according to the number of extracted targets, following the principle that the more targets there are, the higher the foreground-detection frequency and the lower the background-update frequency.
Wherein, the step of adjusting the frequency of foreground detection and the frequency of background update comprises the steps of:
when the number of extracted targets is zero, performing foreground detection once every 3-6 frames and updating the background every frame;
when the number of extracted targets is 1-3, performing foreground detection once every 2 frames and updating the background every two frames;
when the number of extracted targets is more than 3, performing foreground detection every frame and updating the background every three frames.
The present invention also provides a video summary generating apparatus, comprising a user setting module, a trajectory extraction module, a trajectory discrimination module and a generation module. The user setting module is used for obtaining the direction information, set by the user, of the moving object to be detected; the trajectory extraction module is used for performing foreground detection on each frame of the video image, tracking the detected targets, and extracting the motion trajectories of the targets; the trajectory discrimination module is used for evaluating the similarity between the direction of the extracted motion trajectory and the direction information set by the user, and storing those motion trajectories whose direction differs from the user-set direction information by less than a predetermined threshold; the generation module is used for superimposing each frame image of the stored motion trajectories onto the corresponding background to generate the video summary.
As can be seen, the present invention has at least the following beneficial effects:
With the video summary generation method and apparatus of the present invention, the direction information, set by the user, of the moving object to be detected is obtained, the similarity between the direction of each extracted motion trajectory and the user-set direction information is evaluated, and only those motion trajectories whose direction differs from the user-set direction information by less than the predetermined threshold are stored; in this way, motion trajectories whose direction similarity to the user setting is low are not stored, thereby saving that part of the storage space;
in addition, only the images of the stored motion trajectories are superimposed onto the corresponding background to generate the video summary; compared with superimposing all extracted target trajectories onto the background, this also reduces the amount of computation, and since both computation and storage consume time, the reduced computation and storage operations also save video-summary generation time, thereby accelerating the generation of the video summary;
further, the mixture-of-Gaussians function is adopted for foreground detection, which guarantees the precision of foreground detection; at the same time, illumination and shadow are processed separately, preventing large illumination variations from adversely affecting foreground extraction from the video, while the shadow processing also makes the video clearer and easier to observe;
further, the frequencies of foreground detection and background update are also adjusted, so that different situations are treated differently and the amount of computation is reduced as much as possible on the premise of guaranteeing accuracy, thereby further accelerating the generation of the video summary.
Embodiment
To make the purpose, technical scheme and advantages of the embodiments of the present invention clearer, the technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative work fall within the scope of protection of the present invention.
Embodiment one
Embodiment One of the present invention provides a video summary generation method, shown in Figure 1, comprising the steps of:
Step S110: obtaining the direction information, set by the user, of the moving object to be detected.
The direction information refers to the direction of motion, set by the user, of the object to be detected; it comprises at least one direction vector and may preferably comprise multiple direction vectors.
Step S111: performing foreground detection on each frame of the video image, tracking the detected targets, and extracting the motion trajectories of the targets.
Foreground detection may adopt any of several related algorithms, such as the mixture-of-Gaussians background model or SACON (SAmple CONsensus), which the present embodiment does not enumerate one by one.
The tracking process may likewise adopt many algorithms, for example the relatively simple nearest-neighbor method, multi-target tracking algorithms, or contour tracking algorithms.
Step S112: evaluating the similarity between the direction of the extracted motion trajectory and the direction information set by the user, and storing those motion trajectories whose similarity to the user-set direction information satisfies a predetermined threshold.
The higher the similarity between the direction of an extracted motion trajectory and the direction set by the user, the more likely that trajectory is the motion trajectory of a target the user wants to know about.
Those skilled in the art may, according to the technical concept of the present invention, implement various discrimination methods and evaluate the similarity between the direction of the extracted motion trajectory and the user-set direction information according to different criteria.
For example, the following discrimination method may be adopted:
β = (1/n)·Σ_{i=1..n} ∠(a, b_i)
wherein β is the average angle between the extracted motion trajectory and the direction vector set by the user, the sum running over the n segment vectors of the trajectory; T is the predetermined threshold; a is the direction vector, set by the user, of the moving object to be detected; b_i = p_{i+1} - p_i; and p_i is the i-th center point of the extracted motion trajectory;
if the computed value of β is less than the predetermined threshold T, the similarity between this motion trajectory and the user-set direction is high enough to satisfy the predetermined threshold, and the motion trajectory is stored.
The predetermined threshold T may take an empirical value of 0.1-0.4, preferably 0.2. Regarding the value of the predetermined threshold, those skilled in the art may determine various value ranges or specific values according to different criteria, which Embodiment One does not enumerate one by one.
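The discrimination above can be sketched in code. This is a minimal sketch, assuming the criterion is the average angle, in radians, between the user-set vector a and the segment vectors b_i formed from the trajectory's center points; the function names and the sample trajectory are illustrative, not part of the patent:

```python
import math

def average_angle(a, centers):
    """Average angle (radians) between user vector a and the segment
    vectors b_i = p_{i+1} - p_i of the trajectory's center points."""
    angles = []
    for p, q in zip(centers, centers[1:]):
        b = (q[0] - p[0], q[1] - p[1])
        norm = math.hypot(a[0], a[1]) * math.hypot(b[0], b[1])
        if norm == 0:
            continue  # skip zero-length segments
        cos_t = max(-1.0, min(1.0, (a[0] * b[0] + a[1] * b[1]) / norm))
        angles.append(math.acos(cos_t))
    return sum(angles) / len(angles) if angles else math.pi

def keep_trajectory(a, centers, T=0.2):
    """Store the trajectory only when the average angle beta is below T."""
    return average_angle(a, centers) < T

# A nearly eastward track matches an eastward user vector but not a northward one.
east, north = (1.0, 0.0), (0.0, 1.0)
track = [(0, 0), (10, 1), (20, 1), (30, 2)]
```

With these sample points, keep_trajectory(east, track) holds while keep_trajectory(north, track) does not, mirroring the β < T test with the preferred T = 0.2.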
Step S113: superimposing each frame image of the stored motion trajectories onto the corresponding background to generate the video summary.
According to the extracted trajectories of the moving targets and the stored background images, the trajectories are arranged according to the temporal and spatial relationships in which they occur, and the moving-target trajectories are then superimposed onto the stored background images to generate the summary.
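As a rough illustration of this superimposing step (a sketch, not the patent's exact procedure), the code below replays each stored trajectory snippet over a shared background; frames, masks, and positions are represented with plain nested lists, and all names are illustrative:

```python
def paste_object(background, patch, mask, top, left):
    """Overlay the foreground pixels of one trajectory frame onto a copy
    of the background; mask[r][c] is True where the patch holds the target."""
    frame = [row[:] for row in background]
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            if mask[r][c]:
                frame[top + r][left + c] = value
    return frame

def compose_summary(background, snippets):
    """snippets: per trajectory, a list of (patch, mask, top, left) frames,
    assumed already arranged by their temporal and spatial relationships."""
    summary = []
    for snippet in snippets:
        for patch, mask, top, left in snippet:
            summary.append(paste_object(background, patch, mask, top, left))
    return summary

bg = [[0] * 4 for _ in range(4)]
snips = [[([[9]], [[True]], 1, 2)]]  # one one-pixel target at row 1, col 2
frames = compose_summary(bg, snips)
```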
Embodiment two
Embodiment Two of the present invention provides a video summary generation method, shown in Figure 2, comprising the steps of:
Step S210: obtaining the direction information, set by the user, of the moving object to be detected.
In the present Embodiment Two, the direction information comprises two direction vectors; that is, the user may specify that the motion trajectory of the moving object to be detected first advances along one determined direction and afterwards advances along another direction, for example, first along the southeast direction and then turning east.
Step S211: performing background modeling using the mixture-of-Gaussians function and extracting the moving targets.
Background modeling is performed on the image with a mixture of Gaussians and the moving foreground is extracted; the number of Gaussian functions used may be chosen according to the video scene, and a separate Gauss model may be trained for shadow.
The single-Gaussian background modeling function is:
η(x, μ, σ) = (1/(√(2π)·σ))·exp(-(x-μ)²/(2σ²))
The mixture-of-Gaussians background modeling is built on single-Gaussian background modeling and comprises the steps of:
1) First initialize the mixture-model parameters, including the weight occupied by each Gauss model and the mean and standard deviation of each Gauss model.
Initializing the weights amounts to estimating the prior probability of the background distributions; at initialization, the weight of the first Gauss model is generally set relatively large and the others correspondingly small, that is:
w_1(x,y,1) = W, w_k(x,y,1) = (1-W)/(K-1), k = 2,...,K
The mean of the first Gauss model equals the corresponding pixel value (or processing-unit mean value) of the first frame of the input video, that is:
μ_1(x,y,l,1) = I(x,y,l,1)
The initial variances of all K Gauss models are equal, that is:
σ_k²(x,y,1) = var, k = 1,2,...,K
The value of var is directly related to the dynamic characteristics of the video.
2) Update the Gauss model parameters.
Each Gauss model is traversed and the following inequality is evaluated:
(I(x,y,l,f) - μ_k(x,y,l,f-1))² < c·σ_k²(x,y,f-1)
If it holds for all color components, the pixel is attributed to the B-th Gauss model; otherwise the pixel belongs to no Gauss model, which is equivalent to the appearance of a wild point (outlier). Both cases require a corresponding update.
For the case where the inequality holds for all color components, the corresponding update step is as follows.
This case means the value of the current pixel satisfies the B-th Gaussian distribution; the pixel does not necessarily belong to the background, so it must be judged whether the B-th Gaussian distribution satisfies the background condition: if it does, the pixel belongs to a background point; otherwise it belongs to a foreground point.
If the pixel belongs to a background point, the B-th background distribution has output a sample value, and at this moment all distributions need parameter updates.
The parameters of the B-th Gauss model are updated as follows:
w_B(x,y,f) = (1-α)·w_B(x,y,f-1) + α
μ_B(x,y,l,f) = (1-β)·μ_B(x,y,l,f-1) + β·I(x,y,l,f)
σ_B²(x,y,f) = (1-β)·σ_B²(x,y,f-1) + β·(I(:)-μ_B(:))ᵀ·(I(:)-μ_B(:))
The remaining Gauss models change only their weights; their means and variances remain unchanged, that is:
w_k(x,y,f) = (1-α)·w_k(x,y,f-1), k ≠ B
wherein β = α·η(I(x,y,:,f) | μ_B, σ_B)
A wild point means the pixel value fits none of the Gaussian distributions; the pixel is regarded as a new situation appearing in the video, and the K-th Gaussian distribution is replaced with this new situation. Its weight, mean and variance are all determined following the initialization idea, namely it is assigned a smaller weight and a larger variance, that is:
w_K(x,y,f) = (1-W)/(K-1)
μ_K(x,y,l,f) = I(x,y,l,f)
σ_K²(x,y,l,f) = var
At the same time, the point is determined to be a foreground point. The foreground points are exactly the pixels of the targets.
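The update rules above can be sketched per pixel for a single grayscale channel. The constants (K, W, var, α, c) and the background test on the matched model's weight are illustrative assumptions, since the exact background condition is not spelled out here:

```python
import math

K, W, VAR, ALPHA, C = 3, 0.7, 36.0, 0.05, 2.5  # assumed constants

def eta(x, mu, var):
    """Single-Gaussian density, used for the adaptive rate beta = alpha*eta."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def init_models(first_value):
    # First model: large weight W, mean = first-frame value; others small.
    models = [{"w": W, "mu": float(first_value), "var": VAR}]
    models += [{"w": (1 - W) / (K - 1), "mu": 0.0, "var": VAR} for _ in range(K - 1)]
    return models

def update(models, x):
    """One update step for one pixel; returns True if x is a foreground point."""
    for m in models:
        if (x - m["mu"]) ** 2 < C * m["var"]:       # matched model B
            beta = ALPHA * eta(x, m["mu"], m["var"])
            m["w"] = (1 - ALPHA) * m["w"] + ALPHA
            m["mu"] = (1 - beta) * m["mu"] + beta * x
            m["var"] = (1 - beta) * m["var"] + beta * (x - m["mu"]) ** 2
            for other in models:                     # others: weight decay only
                if other is not m:
                    other["w"] = (1 - ALPHA) * other["w"]
            return m["w"] < 0.5                      # assumed background condition
    # Wild point: no model matched; replace the K-th model and mark foreground.
    models[-1] = {"w": (1 - W) / (K - 1), "mu": float(x), "var": VAR}
    return True

models = init_models(100)
stable = update(models, 100)   # a repeated value is classified as background
wild = update(models, 200)     # a far-off value is a wild (foreground) point
```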
Preferably, the method also comprises processing illumination and shadow, comprising the steps of:
when the amplitude of illumination variation per unit time in the shooting environment exceeds a predetermined threshold, that is, when the illumination variation is very large, reducing the value range of pixels judged as background points to 0.4-0.6 times the original, preferably 0.5 times;
wherein the predetermined threshold of the illumination-variation amplitude per unit time may be determined by those skilled in the art according to actual needs; for example, this predetermined threshold may be 10-15 lx/s (lux/second);
for shadow, binarizing the image with a threshold greater than the pixel values of the shadow region, thereby removing the shadow.
Wherein, the frequency of foreground detection and the frequency of background update may be adjusted according to the number of targets.
According to the number of extracted targets, the frequency of foreground detection and the frequency of background update are adjusted following the principle that the more targets there are, the higher the foreground-detection frequency and the lower the background-update frequency.
For example, when the number of extracted targets is zero, foreground detection is performed once every 3-6 frames and the background is updated every frame; when the number of extracted targets is 1-3, foreground detection is performed once every 2 frames and the background is updated every two frames; when the number of extracted targets is more than 3, foreground detection is performed every frame and the background is updated every three frames.
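The example schedule above can be written as a small lookup; the choice of 4 frames within the allowed 3-6 range is arbitrary:

```python
def detection_and_update_intervals(num_targets):
    """Return (foreground-detection interval, background-update interval) in
    frames for a given extracted-target count, per the rules above."""
    if num_targets == 0:
        return 4, 1   # detect every 3-6 frames (4 here), update background each frame
    if num_targets <= 3:
        return 2, 2   # detect every 2 frames, update every two frames
    return 1, 3       # detect every frame, update every three frames
```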
Step S212: tracking the detected targets and extracting the motion trajectories of the targets.
Trajectory association, trajectory generation, and trajectory-disappearance discrimination are performed on the targets detected in the two successive frames. All foregrounds detected in the current frame are traversed and compared with the previous-frame results of all trajectories; if the following condition is satisfied:
S_cross > min(S_pre, S_temp) × R
wherein S_cross = Width_cross × Height_cross is the intersection area of the contours in the two frames, and R is the intersection ratio; in the present embodiment, R may take the empirical threshold value 0.4.
Width_cross = min(right_pre, right_temp) - max(left_pre, left_temp)
Height_cross = min(Bottom_pre, Bottom_temp) - max(Top_pre, Top_temp)
Width_cross is the length of the intersection projected onto the horizontal direction; Height_cross is the length of the intersection projected onto the vertical direction; right_pre is the value of the right boundary of the previous-frame contour; right_temp is the value of the right boundary of the current-frame contour; left_pre is the value of the left boundary of the previous-frame contour; left_temp is the value of the left boundary of the current-frame contour; Bottom_pre is the value of the lower boundary of the previous-frame contour; Bottom_temp is the value of the lower boundary of the current-frame contour; Top_pre is the value of the upper boundary of the previous-frame contour; Top_temp is the value of the upper boundary of the current-frame contour.
If the above condition is satisfied, the foreground of the current frame is judged to be associated with a trajectory stored for the previous frame, and that trajectory is updated; if no association is found, a new trajectory is created; if there is a trajectory not associated with any foreground detected in the current frame, that trajectory is stopped from further operation and is stored for subsequent generation of the video summary.
For example, if the intersection area of two human-body contours satisfies S_cross > min(S_pre, S_temp) × R, they are considered to be the same human-body contour.
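The association test above reduces to an axis-aligned box-overlap check. A sketch, with contour boxes given as (left, top, right, bottom) tuples and names chosen for illustration:

```python
def boxes_associated(prev_box, cur_box, R=0.4):
    """True when the contours' intersection area S_cross exceeds R times the
    smaller of the two contour areas, i.e. S_cross > min(S_pre, S_temp) * R."""
    left_p, top_p, right_p, bottom_p = prev_box
    left_c, top_c, right_c, bottom_c = cur_box
    width_cross = min(right_p, right_c) - max(left_p, left_c)
    height_cross = min(bottom_p, bottom_c) - max(top_p, top_c)
    if width_cross <= 0 or height_cross <= 0:
        return False  # the boxes do not intersect at all
    s_cross = width_cross * height_cross
    s_pre = (right_p - left_p) * (bottom_p - top_p)
    s_temp = (right_c - left_c) * (bottom_c - top_c)
    return s_cross > min(s_pre, s_temp) * R

same = boxes_associated((0, 0, 10, 10), (2, 2, 12, 12))  # large overlap
new = boxes_associated((0, 0, 10, 10), (9, 9, 20, 20))   # tiny overlap
```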
Step S213: evaluating the similarity between the direction of the extracted motion trajectory and the direction information set by the user, and storing those motion trajectories whose similarity to the user-set direction information satisfies the predetermined threshold.
After the trajectory of the object's motion is obtained, it is determined whether the direction of motion of the object is consistent with the direction of motion, set by the user, of the object to be detected.
In Embodiment Two of the present invention, the method of evaluating the trajectory is the same as in Embodiment One, namely by the discriminant β < T.
Here T may take an empirical value of 0.2; that is, when β is less than 0.2, the trajectory is judged to be a matching trajectory and is stored, and if β is not less than 0.2, the trajectory is discarded.
As in Embodiment One, β is the average angle between the extracted motion trajectory and the direction vector set by the user, T is the predetermined threshold, a is the direction vector, set by the user, of the moving object to be detected, b_i = p_{i+1} - p_i, and p_i is the i-th center point of the extracted motion trajectory.
Step S214: superimposing each frame image of the stored motion trajectories onto the corresponding background to generate the video summary.
Embodiment three
Embodiment Three of the present invention provides a video summary generating apparatus, comprising a user setting module, a trajectory extraction module, a trajectory discrimination module and a generation module.
The user setting module is used for obtaining the direction information, set by the user, of the moving object to be detected; the trajectory extraction module is used for performing foreground detection on each frame of the video image, tracking the detected targets, and extracting the motion trajectories of the targets; the trajectory discrimination module is used for evaluating the similarity between the direction of the extracted motion trajectory and the direction information set by the user, and storing those motion trajectories whose direction differs from the user-set direction information by less than a predetermined threshold; the generation module is used for superimposing each frame image of the stored motion trajectories onto the corresponding background to generate the video summary.
Through the above description of the embodiments, those skilled in the art can clearly understand that the present invention may be implemented by software plus a necessary general-purpose hardware platform, or certainly by hardware, but in many cases the former is the better implementation. Based on such an understanding, the part of the technical scheme of the present invention that in essence contributes to the prior art may be embodied in the form of a software product; the computer software product may be stored in a storage medium, such as a ROM/RAM, magnetic disk or optical disc, and comprises instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention or in parts thereof.
With the video summary generation method and apparatus of the embodiments of the present invention, the direction of the moving object to be detected is first set; background modeling is then performed, moving objects are detected and tracked, and their trajectories are obtained; the trajectories are then evaluated to obtain the moving objects whose direction is close to the set direction to be detected; these trajectories and the background images are stored; and finally the video summary of moving objects in the specific desired direction is generated. The storage space occupied by irrelevant moving objects is saved and the related computation is reduced, thereby improving the speed of video generation.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical scheme of the present invention, not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical schemes described in the foregoing embodiments or make equivalent replacements of some of the technical features therein, and such modifications or replacements do not cause the essence of the corresponding technical schemes to depart from the spirit and scope of the technical schemes of the embodiments of the present invention.