CN105488814A - Method for detecting shaking backgrounds in video - Google Patents

Method for detecting shaking backgrounds in video

Info

Publication number
CN105488814A
CN105488814A CN201510836815.8A
Authority
CN
China
Prior art keywords
pixel
matrix
shake
value
local updating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510836815.8A
Other languages
Chinese (zh)
Inventor
张见威
丛子涵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201510836815.8A priority Critical patent/CN105488814A/en
Publication of CN105488814A publication Critical patent/CN105488814A/en
Pending legal-status Critical Current

Links

Abstract

The invention discloses a method for detecting shaking backgrounds in a video. The method comprises the following steps: S1, obtaining a video sequence to be detected; S2, calculating a local update count for each pixel in each frame of the video sequence; S3, calculating the shake measure of each pixel from its local update count, thereby obtaining a shake measure matrix for each frame; S4, clustering all elements of the shake measure matrix into two classes and denoting the class containing fewer elements as S1; and S5, marking all pixels in S1 as shaking background. By identifying shaking backgrounds, the method eliminates their influence on moving foreground extraction and improves the accuracy of moving foreground algorithms in video.

Description

A method for detecting shaking backgrounds in video
Technical field
The present invention relates to the field of computer vision, and in particular to a method for detecting shaking backgrounds in video.
Background technology
Moving object detection in video is the basis of intelligent video analysis and one of the major problems in computer vision; its objective is to extract the Moving Objects of interest from the background in a video sequence. Motion detection is widely used in computer vision, pattern recognition, target recognition and tracking, moving image coding, security monitoring and other fields. However, dynamic changes in the background image, such as weather, illumination, shadows and shaking backgrounds, make moving object detection a rather difficult task. In the present invention, a shaking background refers to a part of the background that shakes regularly, including swinging leaves, flickering ripples, and floating flags and ribbons; these backgrounds are also in motion in the video, so existing object detection algorithms usually misjudge them as foreground, which degrades the detection results.
Existing moving object detection methods can be divided into three classes by their basic principle: frame differencing, background subtraction, and optical flow. Background subtraction is the most widely used: it compares the current frame of the image sequence against a reference background model to detect moving objects. Such methods are fast, accurate and easy to implement, and many researchers have studied background modeling extensively. Among them, the codebook model is a representative and commonly used method that adapts effectively to local and global illumination changes; however, practical detection results show that although it handles static backgrounds well, it cannot remove shaking backgrounds. For this reason, the present invention proposes a method for detecting shaking backgrounds.
Summary of the invention
In order to overcome the shortcomings and deficiencies of the prior art, the present invention provides a method for detecting shaking backgrounds in video.
The present invention adopts following technical scheme:
A method for detecting shaking backgrounds in video comprises the following steps:
S1: obtain the video sequence to be detected;
S2: calculate the local update count of each pixel in every frame of the video sequence;
S3: from the local update count of each pixel, calculate the shake measure of each pixel and thereby obtain the shake measure matrix of each frame;
S4: cluster all elements of the shake measure matrix into 2 classes, and denote the class containing fewer elements as S_1;
S5: mark all pixels in S_1 as shaking background.
Said S2, calculating the local update count of each pixel in every frame of the video sequence, is specifically:
The codebook model includes a codebook training step; during codebook training each pixel has multiple codewords. Let λ_(x,y) denote the minimum, over all codewords of pixel (x, y), of the longest non-matching interval; similarly, λ_[8-n(x,y)] denotes the longest non-matching interval of the n-th pixel in the eight-neighborhood of (x, y), where 8-n(x, y) denotes the 8 neighborhood points of pixel (x, y);
The local update count of each pixel (x, y) is defined by comparing the recent update times of the eight surrounding pixels with that of the central pixel and accumulating the result, which gives the local update count of the pixel at a given moment; the formula is as follows:
LUC_{(x,y)} = \sum_{n=1}^{8} L_n, \quad L_n = \begin{cases} 1, & \lambda_{[8-n(x,y)]} < \lambda_{(x,y)} \\ 0, & \text{otherwise} \end{cases}
The shake measure in said S3 describes the degree of shaking of each pixel; the detailed process is as follows:
S3.1: Let the resolution of the training sample images be H × W; initialize T^(0) and S^(0) as zero matrices of size H × W;
S3.2: From the LUC values at times t and t-1, compute the matrix T^(t) of the image frame at the current time t, defined as
T^{(t)} = |LUC^{(t)} - LUC^{(t-1)}|;
S3.3: Given a threshold T_L, if the λ_(x,y) value of a pixel (x, y), 1 ≤ x ≤ H, 1 ≤ y ≤ W, is greater than the threshold T_L, the pixel has not been updated within the period T_L; each element of T^(t) is then updated to obtain the new matrix T^(t):
T^{(t)}_{(x,y)} = \begin{cases} 0, & \lambda_{(x,y)} \ge T_L \\ T^{(t)}_{(x,y)}, & \text{otherwise} \end{cases}
where T^{(t)}_{(x,y)} is an element of the matrix T^(t).
S3.4: Accumulate the values of T^(t) with the following formula to obtain the matrix S^(t), referred to as the shake measure:
S^{(t)} = S^{(t-1)} + T^{(t)}
The range of said T_L is (15, 30).
Beneficial effects of the present invention:
(1) The concept of local update count is proposed. By comparing and accumulating the recent update times of a pixel and its surrounding neighbors, it reflects the local update behavior of the pixel's color attributes and helps further classify and gather statistics on pixel update characteristics;
(2) The concept of shake measure is proposed. By further processing the local update count, each pixel can be quantitatively identified as moving foreground, static background, or shaking background such as leaves, ribbons or ripples, giving a clear criterion for judging the pixel class;
(3) Classifying pixels in this way supports better scene understanding and benefits other higher-level applications that require scene understanding;
(4) By identifying shaking backgrounds, their influence on moving foreground extraction can be eliminated, improving the accuracy of moving foreground algorithms in video.
Accompanying drawing explanation
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 compares shake measure results of the present invention on shaking leaves;
Fig. 3 compares shake measure results on shaking ribbons.
Embodiment
The present invention is described in further detail below with reference to the embodiments and accompanying drawings, but the embodiments of the present invention are not limited thereto.
Embodiment
The present invention quantitatively describes the degree of shaking of objects in a video sequence and proposes a targeted method for identifying and extracting shaking backgrounds. The concepts of local update count and shake measure are first proposed; during the modeling process the local update counts of a pixel over historical frames are summed, the resulting sum is used to compute the shake measure, and clustering then separates moving foreground, static background and shaking background. The proposed method not only improves the accuracy of foreground detection, but can also be used in research and applications of video scene understanding.
As shown in Fig. 1, a method for detecting shaking backgrounds in video comprises the following steps:
S1: obtain the video sequence to be detected;
S2: calculate the local update count of each pixel in every frame of the video sequence;
To describe the update characteristics of a pixel at a given moment, the present invention proposes the local update count (LUC) concept. To study the update characteristics of a pixel, the recent update times of the eight surrounding pixels are compared with that of the central pixel and accumulated, which gives the local update count of that point. The method is as follows.
The codebook model proposed in the prior art includes a codebook training step; during codebook training each pixel has multiple codewords. In the codebook model, let λ_(x,y) denote the minimum maximum negative run length (MNRL, the longest non-matching interval) over all codewords of pixel (x, y); the smaller λ_(x,y) is, the more recently the pixel was matched, i.e., the more recently it was updated. Similarly, λ_[8-n(x,y)] denotes the longest non-matching interval of the n-th pixel in the eight-neighborhood of (x, y). Define
L_n = \begin{cases} 1, & \lambda_{[8-n(x,y)]} < \lambda_{(x,y)} \\ 0, & \text{otherwise} \end{cases}
Then the formula for the local update count LUC is
LUC_{(x,y)} = \sum_{n=1}^{8} L_n
That is, for pixel (x, y), if the minimum longest non-matching interval of one of the 8 surrounding points is smaller than that of the central point (x, y), then L_n is 1, otherwise 0, and the local update count LUC_(x,y) of point (x, y) is the sum of the L_n values of the 8 neighborhood points. These two formulas quantitatively describe the update characteristic of a pixel: the more pixels in the neighborhood that were updated more recently than the center, the larger LUC is; otherwise LUC is smaller, approaching 0.
In Table 1, the numbers 1 to 8 label the pixels in the eight-neighborhood of pixel (x, y); Table 2 gives the longest non-matching interval λ (MNRL) of each of these eight-neighborhood pixels. The eight-neighborhood pixels of the central point in Table 1 are compared, and the update characteristic is computed from their λ values: if the λ of a neighborhood point is smaller than that of the central point, that neighborhood point was updated more recently than the center and its value is set to 1, otherwise 0. After all comparisons the right-hand table is obtained, and summing the neighborhood values in the rightmost table yields the LUC feature value of the central pixel.
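As a concrete illustration of this computation, the following is a minimal sketch in Python/NumPy. It assumes the per-pixel λ (MNRL) values from the codebook training stage are already available as an H × W array; the codebook model itself is not shown, and the function name local_update_count is only illustrative.

```python
import numpy as np

def local_update_count(lam):
    """Compute the local update count (LUC) for every pixel.

    lam: H x W array holding, for each pixel, the minimum MNRL
         (longest non-matching interval) over its codewords.
    Returns an H x W integer array of LUC values in [0, 8].
    """
    H, W = lam.shape
    # Pad with +inf so border neighbors never count as "updated more recently".
    padded = np.pad(lam, 1, mode="constant", constant_values=np.inf)
    luc = np.zeros((H, W), dtype=np.int32)
    # Offsets of the eight neighbors relative to the central pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    for dy, dx in offsets:
        neighbor = padded[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        # L_n = 1 when the neighbor's lambda is smaller than the center's.
        luc += (neighbor < lam).astype(np.int32)
    return luc

# Worked example in the spirit of Tables 1 and 2: the center (lambda = 5)
# has three neighbors with a smaller lambda (2, 1, 4), so its LUC is 3.
lam = np.array([[9., 2., 7.],
                [6., 5., 1.],
                [8., 4., 9.]])
print(local_update_count(lam)[1, 1])  # -> 3
```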
S3: from the local update count of each pixel, calculate the shake measure of each pixel and obtain the shake measure matrix of each frame;
Let the resolution of the training sample images be H × W. The present invention proposes the concept of the shake measure, which quantitatively describes the degree of shaking of each pixel. It is calculated as follows:
(1) Initialize T^(0) and S^(0) as zero matrices of size H × W.
(2) From the LUC values at times t and t-1, compute the matrix T^(t) of the image frame at the current time t, defined as
T^{(t)} = |LUC^{(t)} - LUC^{(t-1)}|
(3) Given a threshold T_L, if the λ_(x,y) value of pixel (x, y) (1 ≤ x ≤ H, 1 ≤ y ≤ W) is greater than the threshold T_L, the pixel has not been updated within the period T_L. Each element of T^(t) is then updated:
T^{(t)}_{(x,y)} = \begin{cases} 0, & \lambda_{(x,y)} \ge T_L \\ T^{(t)}_{(x,y)}, & \text{otherwise} \end{cases}
This gives the new matrix T^(t); the value range of T_L here is (15, 30). The values of the matrix are accumulated with the following formula to obtain the matrix S^(t), referred to as the shake measure.
S^{(t)} = S^{(t-1)} + T^{(t)}
The accumulation proceeds as S^(1) = S^(0) + T^(1), then S^(2) = S^(1) + T^(2), and so on.
For every t greater than 0, T^(t) is computed by the formula in S3.2, and S^(0) is defined as the zero matrix.
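Steps (2) and (3) can be expressed compactly as a per-frame update. The sketch below is an illustration under the same assumptions as before: the λ map and the LUC matrices of consecutive frames come from the codebook model (not shown), and the function name and the default T_L = 20 are illustrative, with T_L drawn from the suggested range (15, 30).

```python
import numpy as np

def update_shake_measure(S_prev, luc_curr, luc_prev, lam, T_L=20):
    """One accumulation step of the shake measure.

    S_prev   : H x W shake measure S^(t-1) (zero matrix for t = 1).
    luc_curr : H x W LUC matrix at time t.
    luc_prev : H x W LUC matrix at time t-1.
    lam      : H x W map of per-pixel lambda (MNRL) values.
    T_L      : threshold; the description suggests a value in (15, 30).
    Returns the updated shake measure S^(t).
    """
    # T^(t) = |LUC^(t) - LUC^(t-1)|
    T = np.abs(luc_curr.astype(np.int32) - luc_prev.astype(np.int32))
    # Pixels not updated within T_L (lambda >= T_L) contribute nothing.
    T[lam >= T_L] = 0
    # S^(t) = S^(t-1) + T^(t)
    return S_prev + T
```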
Take as an example a video containing a moving person, a static background and shaking leaves. For a static background point, its central λ value is always 0, so its LUC value is 0 and its shake measure value in S is 0. For a rocking leaf, the λ value is generally small, so the T^(t) matrix is not cleared. When a leaf pixel is detected as foreground, the pixels around it are mostly not detected as foreground (the λ values of the background are all 0), so a large LUC value is obtained; because of the shaking, at the next moment the same pixel may be detected as background again, so the LUC value obtained then is 0 and the resulting matrix T is large. After accumulating for a period of time, the matrix S of the shaking background grows noticeably. For the region a person passes through, the λ values of the neighborhood pixels are similar in size and mostly greater than T_L, so T^(t) is set to 0 and the shake measure of the person is much smaller than that of the shaking leaves. The result is shown in Fig. 2, and another example with ribbons is shown in Fig. 3. It can be seen that the shake measure proposed by the present invention accurately reflects the shaking state of the background.
As shown in Fig. 2, as the training of the codebook algorithm proceeds, at about 150 frames the static background, the gently shaking leaves, the violently shaking leaves and the foreground (the person) are separated by the computed shake measure.
As shown in Fig. 3, at about 120 frames the static background, the slightly floating ribbons, the violently floating ribbons and the foreground (the person) are separated by the computed shake measure.
S4: all elements of the shake measure matrix are clustered into 2 classes, and the class containing fewer elements is denoted S_1;
In the present invention, after codebook model training ends, the shake measure matrix S^(t) of the pixels can be clustered, which improves the efficiency of the detection process. The present invention adopts K-means clustering. To eliminate the randomness of the K-means method, and considering that in a real scene the shaking parts always occupy a smaller proportion of the image, the elements of the matrix S^(t) are clustered into 2 classes and the class containing fewer elements is denoted S_1.
S5: all pixels in S_1 are marked as shaking background.
Traverse all pixels in S_1, extract them, and mark them as shaking background.
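A minimal sketch of steps S4 and S5: the shake measure values are clustered into two classes (here with scikit-learn's KMeans, which is an assumed library choice, not one named by the description) and the smaller class is taken as the shaking background S_1. The function name is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def shaking_background_mask(S):
    """Cluster the shake measure values into 2 classes and return a mask
    that is True where a pixel belongs to the smaller class S_1
    (marked as shaking background).

    S: H x W shake measure matrix.
    """
    values = S.reshape(-1, 1).astype(np.float64)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(values)
    # The shaking background occupies the smaller proportion of the image,
    # so the cluster with fewer members is taken as S_1.
    counts = np.bincount(labels, minlength=2)
    shaking_label = int(np.argmin(counts))
    return (labels == shaking_label).reshape(S.shape)
```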
The specific flow is as follows:
(1) Read the video frame x at time t and increment the frame number t by 1;
(2) Compute the LUC_t matrix and the shake measure matrix S of the current frame;
(3) If frame number t % T_K == 0:
(3.1) Perform K-means clustering of the pixels according to the values of the matrix S^(t); the number of clusters here is 2, which separates the shaking background from the non-shaking background;
(3.2) After clustering the pixels are divided into two classes: one is the shaking background, the other comprises the static background and the moving foreground;
(3.3) Obtain the foreground image with the background removed, and reset the S matrix to the zero matrix;
(4) Otherwise (i.e., t % T_K ≠ 0), proceed to the next detection.
In general, the background can change dynamically, so besides performing clustering once after the training stage ends, clustering can also be performed once every T_K frames during the detection process. Because the time cost of the clustering algorithm is large, to meet the real-time requirement of the detection algorithm, T_K is generally chosen according to the frame rate as the number of frames in 6-8 seconds.
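The following sketch ties the flow together. It reuses the functions sketched above and abstracts the codebook background model behind a hypothetical codebook object (supplying the foreground mask and the λ map); OpenCV is used for reading frames, and the parameter values are illustrative, with T_K set to roughly 7 seconds of frames as suggested.

```python
import numpy as np
import cv2

def detect_shaking_background(video_path, codebook, fps=25, seconds=7, T_L=20):
    """Per-frame LUC / shake-measure update with K-means clustering of S
    every T_K frames, as in steps (1)-(4) above.

    codebook is a hypothetical background model exposing:
      codebook.update(frame) -> H x W foreground mask,
      codebook.lam           -> H x W lambda (MNRL) map.
    """
    T_K = fps * seconds                      # cluster every 6-8 seconds of video
    cap = cv2.VideoCapture(video_path)
    S = luc_prev = None
    t = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        t += 1                                          # (1) read frame, t += 1
        fg_mask = codebook.update(frame)
        luc = local_update_count(codebook.lam)          # (2) LUC of current frame
        if S is None:
            S, luc_prev = np.zeros_like(luc), luc
        S = update_shake_measure(S, luc, luc_prev, codebook.lam, T_L)
        luc_prev = luc
        if t % T_K == 0:                                # (3) periodic clustering
            shaking = shaking_background_mask(S)        # (3.1)-(3.2) two classes
            fg_mask[shaking] = 0                        # (3.3) remove shaking background
            S = np.zeros_like(S)                        #       and reset S
        # (4) otherwise continue with the next frame
    cap.release()
```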
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by it; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included within the protection scope of the present invention.

Claims (4)

1. A method for detecting shaking backgrounds in video, characterized in that it comprises the following steps:
S1: obtain the video sequence to be detected;
S2: calculate the local update count of each pixel in every frame of the video sequence;
S3: from the local update count of each pixel, calculate the shake measure of each pixel and thereby obtain the shake measure matrix of each frame;
S4: cluster all elements of the shake measure matrix into 2 classes, and denote the class containing fewer elements as S_1;
S5: mark all pixels in S_1 as shaking background.
2. The method according to claim 1, characterized in that said S2, calculating the local update count of each pixel in every frame of the video sequence, is specifically:
The codebook model includes a codebook training step; during codebook training each pixel has multiple codewords. Let λ_(x,y) denote the minimum, over all codewords of pixel (x, y), of the longest non-matching interval; similarly, λ_[8-n(x,y)] denotes the longest non-matching interval of the n-th pixel in the eight-neighborhood of (x, y), where 8-n(x, y) denotes the 8 neighborhood points of pixel (x, y);
The local update count of each pixel (x, y) is defined by comparing the recent update times of the eight surrounding pixels with that of the central pixel and accumulating the result, which gives the local update count of the pixel at a given moment; the formula is as follows:
LUC_{(x,y)} = \sum_{n=1}^{8} L_n, \quad L_n = \begin{cases} 1, & \lambda_{[8-n(x,y)]} < \lambda_{(x,y)} \\ 0, & \text{otherwise} \end{cases}
3. The method according to claim 1, characterized in that the shake measure in said S3 describes the degree of shaking of each pixel, and the detailed process is as follows:
S3.1: Let the resolution of the training sample images be H × W; initialize T^(0) and S^(0) as zero matrices of size H × W;
S3.2: From the LUC values at times t and t-1, compute the matrix T^(t) of the image frame at the current time t, defined as
T^{(t)} = |LUC^{(t)} - LUC^{(t-1)}|;
S3.3: Given a threshold T_L, if the λ_(x,y) value of a pixel (x, y), 1 ≤ x ≤ H, 1 ≤ y ≤ W, is greater than the threshold T_L, the pixel has not been updated within the period T_L; each element of T^(t) is then updated to obtain the new matrix T^(t):
T^{(t)}_{(x,y)} = \begin{cases} 0, & \lambda_{(x,y)} \ge T_L \\ T^{(t)}_{(x,y)}, & \text{otherwise} \end{cases}
where T^{(t)}_{(x,y)} is an element of the matrix T^(t);
S3.4: Accumulate the values of T^(t) with the following formula to obtain the matrix S^(t), referred to as the shake measure;
S^{(t)} = S^{(t-1)} + T^{(t)}
4. The method according to claim 3, characterized in that the range of said T_L is (15, 30).
CN201510836815.8A 2015-11-25 2015-11-25 Method for detecting shaking backgrounds in video Pending CN105488814A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510836815.8A CN105488814A (en) 2015-11-25 2015-11-25 Method for detecting shaking backgrounds in video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510836815.8A CN105488814A (en) 2015-11-25 2015-11-25 Method for detecting shaking backgrounds in video

Publications (1)

Publication Number Publication Date
CN105488814A true CN105488814A (en) 2016-04-13

Family

ID=55675780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510836815.8A Pending CN105488814A (en) 2015-11-25 2015-11-25 Method for detecting shaking backgrounds in video

Country Status (1)

Country Link
CN (1) CN105488814A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022675A (en) * 2022-07-01 2022-09-06 天翼数字生活科技有限公司 Video playing detection method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100158372A1 (en) * 2008-12-22 2010-06-24 Electronics And Telecommunications Research Institute Apparatus and method for separating foreground and background
US20110142343A1 (en) * 2009-12-11 2011-06-16 Electronics And Telecommunications Research Institute Method and apparatus for segmenting multi-view images into foreground and background based on codebook
CN103578119A (en) * 2013-10-31 2014-02-12 苏州大学 Target detection method in Codebook dynamic scene based on superpixels
CN104715480A (en) * 2015-03-11 2015-06-17 南京邮电大学 Statistical background model based target detection method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100158372A1 (en) * 2008-12-22 2010-06-24 Electronics And Telecommunications Research Institute Apparatus and method for separating foreground and background
US20110142343A1 (en) * 2009-12-11 2011-06-16 Electronics And Telecommunications Research Institute Method and apparatus for segmenting multi-view images into foreground and background based on codebook
CN103578119A (en) * 2013-10-31 2014-02-12 苏州大学 Target detection method in Codebook dynamic scene based on superpixels
CN104715480A (en) * 2015-03-11 2015-06-17 南京邮电大学 Statistical background model based target detection method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
桂姜琴: "Research on Object Detection Algorithms in Intelligent Video Surveillance", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022675A (en) * 2022-07-01 2022-09-06 天翼数字生活科技有限公司 Video playing detection method and system
CN115022675B (en) * 2022-07-01 2023-12-15 天翼数字生活科技有限公司 Video playing detection method and system

Similar Documents

Publication Publication Date Title
Reddy et al. Improved anomaly detection in crowded scenes via cell-based analysis of foreground speed, size and texture
Yang et al. Video scene understanding using multi-scale analysis
CN109635721B (en) Video human body falling detection method and system based on track weighted depth convolution order pooling descriptor
CN109583305A (en) A kind of advanced method that the vehicle based on critical component identification and fine grit classification identifies again
CN103093198B (en) A kind of crowd density monitoring method and device
CN107590427B (en) Method for detecting abnormal events of surveillance video based on space-time interest point noise reduction
CN103810476A (en) Method for re-identifying pedestrians in video monitoring network based on small-group information correlation
CN107886507B (en) A kind of salient region detecting method based on image background and spatial position
CN103679142A (en) Target human body identification method based on spatial constraint
CN106056165B (en) A kind of conspicuousness detection method based on super-pixel relevance enhancing Adaboost classification learning
CN112329656B (en) Feature extraction method for human action key frame in video stream
Du et al. Abnormal event detection in crowded scenes based on structural multi-scale motion interrelated patterns
CN104036526A (en) Gray target tracking method based on self-adaptive window
CN103049749A (en) Method for re-recognizing human body under grid shielding
Xia et al. Vision-based traffic accident detection using matrix approximation
CN106056078A (en) Crowd density estimation method based on multi-feature regression ensemble learning
CN108154089B (en) Size-adaptive-based crowd counting method for head detection and density map
CN104077571A (en) Method for detecting abnormal behavior of throng by adopting single-class serialization model
Zhu et al. Pedestrian detection in low-resolution imagery by learning multi-scale intrinsic motion structures (mims)
CN110909645B (en) Crowd counting method based on semi-supervised manifold embedding
Schulz et al. Object-class segmentation using deep convolutional neural networks
CN105488814A (en) Method for detecting shaking backgrounds in video
CN106874885B (en) Crowd abnormity detection method based on energy level distribution change
KR20120079495A (en) Object detection system for intelligent surveillance system
CN107423695A (en) Dynamic texture identification method based on bipartite graph

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160413

WD01 Invention patent application deemed withdrawn after publication