CN101853513A - Time and space significance visual attention method based on entropy - Google Patents

Time and space significance visual attention method based on entropy

Info

Publication number
CN101853513A
CN101853513A (application CN201010192240A)
Authority
CN
China
Prior art keywords
saliency
dynamic
significance
frame
static
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 201010192240
Other languages
Chinese (zh)
Other versions
CN101853513B (en)
Inventor
魏龙生 (Wei Longsheng)
桑农 (Sang Nong)
王岳环 (Wang Yuehuan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN2010101922408A priority Critical patent/CN101853513B/en
Publication of CN101853513A publication Critical patent/CN101853513A/en
Application granted granted Critical
Publication of CN101853513B publication Critical patent/CN101853513B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an entropy-based spatiotemporal saliency visual attention method, which comprises the following steps: (1) extract a dynamic saliency map and a static saliency map from a short video; (2) combine the static saliency map and the dynamic saliency map to generate a final saliency map; (3) winner-take-all; (4) inhibition of return: set the pixel values of the most salient region in the final saliency map to zero to obtain a new final saliency map; and (5) attention selection. When computing dynamic saliency, the method computes the dynamic saliency across all frames directly and computes a static saliency map only for the current frame, which overcomes the problems of the prior art, saves computation time, and detects the dynamically salient parts well. In addition, the invention uses a multi-scale approach to compute dynamic saliency, so that the dynamic saliency of objects of different sizes in the video is computed more accurately and a good visual effect is obtained.

Description

A spatiotemporal saliency visual attention method based on information entropy
Technical field
The invention belongs to the field of computer vision, and specifically relates to a spatiotemporal saliency visual attention method based on information entropy.
Background technology
Visual attention methods mainly address the problem of data screening in images. In a computer image, the content relevant to the task is usually only a very small part of the image, so it is necessary to give different image regions different processing priorities; this reduces the complexity of processing and avoids wasted computation. In human visual information processing, a few salient objects are always selected quickly for priority processing while other non-salient objects are ignored or discarded, which allows computational resources to be allocated selectively and thereby greatly improves the efficiency of visual information processing. This process is called visual attention.
Through the mechanism of selective visual attention, the human visual system can easily find regions and targets of interest in natural scenes. Visual attention offers computer vision a similar prospect of accelerated processing, and selective attention allows the human visual system to handle the input visual scene more effectively at a higher level of complexity. In a short video, the role of motion rests on the fact that human attention is attracted more easily by moving stimuli than by a static scene. Visual attention therefore clearly involves motion, and the fast detection of moving objects is a key technique for adaptive interaction between humans and the environment. Thus the human visual system handles not only static scenes but also dynamic scenes.
Human visual selective attention resembles the processing performed by the retina: through different cells of interest, the retina produces two outputs, a magnocellular output and a parvocellular output. The magnocellular output responds quickly and can be simulated by low spatial frequencies; the parvocellular output provides detailed information and can be simulated by the high spatial frequencies extracted from an image, which enhance the contrast of a frame and attract visual attention on static frames. The present invention draws its inspiration from these two retinal outputs: two signals corresponding to the two main retinal outputs are extracted from each frame, the input short video being decomposed into a low-frequency band to simulate the dynamic output and a high-frequency band to simulate the static output. This yields a dynamic saliency map and a static saliency map, which are fused to generate the final saliency map.
Spatiotemporal saliency models mainly comprise dynamic models and static models. Most visual attention computations are static and are based on feature integration theory. The most widely used is the static visual attention model proposed by Itti et al. (L. Itti, C. Koch and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), pp. 1254-1259, 1998), which uses elementary visual features such as intensity, orientation and color.
In recent years many publications have introduced dynamic saliency into visual attention mechanisms. Ban et al. proposed a typical dynamic selective attention model (S. Ban, I. Lee and M. Lee, "Dynamic visual selective attention model," Neurocomputing, vol. 71, pp. 853-856, 2008). The procedure is as follows: first, compute a static saliency map for each frame of the video; second, for each point of each static saliency map, compute an optimal scale; third, compute an entropy map from these optimal scales and the static saliency map, so that each frame yields a static entropy map; finally, obtain a new entropy map from this sequence of static entropy maps. This entropy map is the dynamic saliency map.
The model of Ban et al. is attractive in theory, but when the moving target does not lie inside a salient region, the model has difficulty detecting the region of motion.
Summary of the invention
The object of the invention is to propose a spatiotemporal saliency visual attention method based on information entropy; the method has good scale invariance and obtains a good visual effect.
The invention provides a spatiotemporal saliency visual attention method based on information entropy, the specific steps of which are:
Step 1: extract the dynamic saliency map and the static saliency map from the short video. The dynamic saliency map is extracted as follows:
(A.1) For the input short video, take a sequence of n consecutive frames and convert each frame into an image with a reduced number of gray levels;
(A.2) Shrink each frame obtained in step (A.1) to 4 different scales, and combine the n shrunken frames at each scale into 1 dynamic response map; then shrink the 3 larger-scale dynamic response maps to the same scale as the smallest response map, and combine these 4 same-scale maps to generate the dynamic saliency map.
Step 2: combine the static saliency map and the dynamic saliency map to generate the final saliency map.
Step 3: winner-take-all:
For every point ψ in the final saliency map, obtain an optimal size ψ_r by the entropy-maximization approach, then compute the mean of the point over a local region, namely the circular region centered at the point with radius ψ_r. All these means constitute a map; the point with the maximum value in this map is the most salient point, and the most salient point together with its corresponding optimal size constitutes the most salient region.
Step 4: inhibition of return:
Set all pixel values of the most salient region in the final saliency map to zero, obtaining a new final saliency map.
Step 5: attention selection:
Repeat steps 3 to 5 until a predefined number of iterations is reached; the salient points obtained, together with the sizes of their regions, serve as the foci of attention.
The invention proposes a spatiotemporal saliency visual attention method based on information entropy, covering both dynamic and static saliency. When computing dynamic saliency, existing methods first compute a static saliency map for each frame and then compute the dynamic saliency from the static saliency maps of all frames. This has two shortcomings: first, computing a static saliency map for every frame consumes a great deal of time; second, when the moving target is not inside a static salient region, the method fails to detect the dynamically salient part. The present invention computes the dynamic saliency across all frames directly and computes a static saliency map only for the current frame, which solves both problems, saves computation time, and detects dynamically salient parts better. In addition, the invention uses a multi-scale method to compute dynamic saliency, so that the dynamic saliency of objects of different sizes in the video is computed better and a good visual effect is obtained.
Description of drawings
Fig. 1 is a flowchart of the present invention;
Fig. 2: (a) input color frame; (b) grayscale frame; (c) 8-gray-level frame; (d) 4-gray-level frame;
Fig. 3 is the LBP operator;
Fig. 4: (a) the original LBP operator; (b) the extended LBP operator;
Fig. 5: (a) and (b) are the static saliency map and scan path of the first frame; (c) and (d) are the static saliency map and scan path of the last frame;
Fig. 6: (a) and (b) are the dynamic saliency map and its scan path obtained by the method of Ban et al. from static saliency maps; (c) and (d) are the final saliency map and scan path of the method of Ban et al.;
Fig. 7: (a) and (b) are the dynamic saliency map and its scan path obtained by the present invention from consecutive frames; (c) and (d) are the final saliency map of the present invention and its scan path.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and examples.
As shown in Figure 1, the method of the invention comprises the following steps:
(1) Extract the dynamic saliency map and the static saliency map from the short video:
(A) Extract the dynamic saliency map from the short video:
(A.1) For an input short video V, take a sequence of n consecutive frames V_1, V_2, ..., V_n. In general, 3 ≤ n ≤ 8 gives good experimental results. To speed up computation and reduce complexity, each frame is converted into an image with a reduced number of gray levels; in the present invention, we take the number of frames drawn from the short video as the number of gray levels. If the input is a color image, it is first converted to a grayscale image, and each frame is then converted from 256 gray levels to n gray levels (n < 256). Let Max be the maximum pixel value over all frames. For a coordinate point (x, y) in the k-th frame V_k (1 ≤ k ≤ n), divide the corresponding pixel value V_k(x, y) by Max to obtain f(x, y, k), so that f(x, y, k) takes values in the interval [0, 1], as in equation (1). Then divide [0, 1] into n equal subintervals and assign a different integer value g(x, y, k) to the f(x, y, k) falling into each subinterval; these integers take values in [0, n-1], as in equation (2). Fig. 2 shows an example in which a color frame (a) is converted to a grayscale frame (b) and then to an 8-gray-level frame (c) and a 4-gray-level frame (d).

$$f(x,y,k) = V_k(x,y)/\mathrm{Max} \qquad (1)$$

$$g(x,y,k) = \begin{cases} 0, & 0 \le f(x,y,k) \le \tfrac{1}{n} \\ 1, & \tfrac{1}{n} < f(x,y,k) \le \tfrac{2}{n} \\ \;\vdots \\ n-1, & \tfrac{n-1}{n} < f(x,y,k) \le 1 \end{cases} \qquad (2)$$
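The quantization of equations (1)-(2) reduces to a normalize-and-bin operation. Below is a minimal NumPy sketch; the function name and the array layout are illustrative assumptions, not part of the patent:

```python
import numpy as np

def quantize_frames(frames, n):
    """Quantize n grayscale frames to n gray levels (equations 1-2).

    frames: list of 2-D arrays V_1..V_n; returns an (n, H, W) integer
    array g with values in [0, n-1].
    """
    stack = np.stack([f.astype(np.float64) for f in frames])
    f = stack / stack.max()            # equation (1): divide by the global Max
    g = np.ceil(f * n) - 1             # equation (2): n equal bins over [0, 1]
    return np.clip(g, 0, n - 1).astype(np.int32)  # f = 0 falls in the first bin
```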
(A.2) In order to detect regions of motion more effectively, we shrink each frame to 4 different scales. Taking the k-th frame V_k as an example, V_k is reduced to 4 scales V_{k,1}, V_{k,2}, V_{k,3} and V_{k,4}, whose sizes are 1/2, 1/4, 1/8 and 1/16 of the original, respectively. The image sequence thus becomes 4 image sequences V_{1,s}, V_{2,s}, ..., V_{n,s} (where s, 1 ≤ s ≤ 4, indexes the scale), denoted V_{1,1}, V_{2,1}, ..., V_{n,1}; V_{1,2}, V_{2,2}, ..., V_{n,2}; V_{1,3}, V_{2,3}, ..., V_{n,3}; and V_{1,4}, V_{2,4}, ..., V_{n,4}. Let R_s(x, y) be the local region of the s-th image sequence at coordinate point (x, y): the circular region centered at (x, y) whose radius is half the minimum of the height and width of V_{n,4}. For a coordinate point (x, y) in the s-th image sequence, all the values g(x, y, k) in the local region at (x, y) across the sequence constitute a histogram, and the entropy of the point is obtained from the probability mass function of this histogram, as in equation (3). The larger the entropy, the stronger the saliency of the point, and all the entropies constitute the dynamic response map M_{d,s}(x, y) at the current scale:

$$M_{d,s}(x,y) = -\sum_{k\in\{1,2,\ldots,n\}} p_{g(x',y',k)} \log_2 p_{g(x',y',k)} \qquad (3)$$

where

$$(x',y') \in R_s(x,y) \qquad (4)$$

and p_{g(x',y',k)} is the probability mass function produced by the histogram of all the pixel values of the s-th image sequence in the local region R_s(x, y).
The 3 larger-scale response maps are then shrunk to the same scale as the smallest response map and combined to generate the dynamic saliency map M_d(x, y):

$$M_d(x,y) = \sum_{s=1}^{4} M_{d,s}(x,y) \qquad (5)$$
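Under the reading above, the dynamic response at a scale is the entropy of the histogram of quantized values collected over a local region across all n frames. A sketch, assuming a square window in place of the circular region R_s and OpenCV resizing for equation (5), both of which are simplifications:

```python
import numpy as np
import cv2

def dynamic_response(g, n, radius):
    """One scale of equation (3): at each pixel, the entropy of the histogram
    of quantized values over a local window across all n frames.
    g: (n, H, W) integer array at this scale (output of quantize_frames)."""
    _, H, W = g.shape
    M = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            patch = g[:, max(0, y - radius):y + radius + 1,
                         max(0, x - radius):x + radius + 1]
            p = np.bincount(patch.ravel(), minlength=n) / patch.size
            p = p[p > 0]
            M[y, x] = -np.sum(p * np.log2(p))  # larger entropy = stronger saliency
    return M

def dynamic_saliency(g_scales, n, radius):
    """Equation (5): shrink the three larger response maps to the smallest
    scale and sum. g_scales: the 4 quantized sequences, largest scale first."""
    maps = [dynamic_response(g, n, radius) for g in g_scales]
    h, w = maps[-1].shape
    return sum(cv2.resize(m, (w, h)) for m in maps)
```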
(B) Extract the static saliency map of the current frame
The static saliency map covers color contrast, luminance contrast and orientation, and can be produced with the model proposed by Itti et al.
As an improvement of the present invention, the static saliency map may also take texture information into account; this static saliency model is an extension of the model proposed by Itti et al. The details are as follows:
(B.1) Extraction of saliency features
Four kinds of low-level visual features are extracted and fused into the static saliency map: color contrast, luminance contrast, orientation, and texture. Let r, g and b be the three color channels (red, green and blue) of the input image. We create 4 broadly tuned color channels: R = r - (g+b)/2 for red, G = g - (r+b)/2 for green, B = b - (r+g)/2 for blue, and Y = (r+g)/2 - |r-g|/2 - b for yellow (negative values are set to zero). Then RG = |R - G| is the red-green contrast and BY = |B - Y| is the blue-yellow contrast, so the color feature is decomposed into 2 feature types: red-green contrast and blue-yellow contrast.
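A sketch of the broadly tuned color channels and the two color-contrast feature types; the function name and array layout are illustrative:

```python
import numpy as np

def color_contrast_channels(img):
    """Broadly tuned color channels and the RG/BY contrast feature types.
    img: float array of shape (H, W, 3) with channels r, g, b in [0, 1]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    R = np.maximum(r - (g + b) / 2, 0)                       # red
    G = np.maximum(g - (r + b) / 2, 0)                       # green
    B = np.maximum(b - (r + g) / 2, 0)                       # blue
    Y = np.maximum((r + g) / 2 - np.abs(r - g) / 2 - b, 0)   # yellow
    return np.abs(R - G), np.abs(B - Y)                      # RG and BY contrasts
```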
Luminance is divided into 2 types: brightness-on (bright center on dark surround) and brightness-off (dark center on bright surround). This is because the receptive-field cells of human visual perception come in 2 types: on-center cells are excited by a bright center and inhibited by a bright surround, while off-center cells are inhibited by a bright center and excited by a bright surround. If the current frame is a color image, it is first converted to a grayscale image. The brightness-on feature map is obtained by subtracting, at each point, the mean of the surrounding neighborhood pixel values from the pixel value of the point (negative values set to zero); likewise, the brightness-off feature map is obtained by subtracting the pixel value of the point from the mean of the surrounding neighborhood pixel values (negative values set to zero).
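A sketch of the two luminance feature types; the 3×3 mean filter standing in for the surrounding-neighborhood mean is an assumption, since the neighborhood size is not fixed by the text:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def brightness_on_off(gray):
    """Brightness-on and brightness-off feature maps: center minus surround
    mean and surround mean minus center, negatives clipped to zero."""
    surround = uniform_filter(gray.astype(np.float64), size=3)  # assumed 3x3 mean
    on = np.maximum(gray - surround, 0)    # on-center: bright spot on dark surround
    off = np.maximum(surround - gray, 0)   # off-center: dark spot on bright surround
    return on, off
```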
Four orientation feature types, 0°, 45°, 90° and 135°, are detected with Gabor filters. The mathematical expression of the Gabor filter is:

$$h(u,v) = q(u',v')\cos(2\pi\omega_f u') \qquad (6)$$

where

$$(u',v') = (u\cos\phi + v\sin\phi,\; -u\sin\phi + v\cos\phi) \qquad (7)$$

$$q(u,v) = \frac{1}{2\pi\sigma_u\sigma_v}\exp\!\left(-\frac{u^2}{2\sigma_u^2} - \frac{v^2}{2\sigma_v^2}\right) \qquad (8)$$

ω_f is the center frequency of the Gabor filter; it determines the position of the center of the filter passband in frequency, and different scales are obtained by choosing different ω_f. σ_u and σ_v are the space constants of the Gaussian envelope of the Gabor filter along the abscissa and the ordinate respectively, and are related to the frequency bandwidth B_f and the orientation bandwidth B_θ of the Gabor filter by:

$$\sigma_u = \frac{\sqrt{\ln 2/2}}{\pi\omega_f}\cdot\frac{2^{B_f}+1}{2^{B_f}-1} \qquad (9)$$

$$\sigma_v = \frac{\sqrt{\ln 2/2}}{\pi\omega_f}\cdot\frac{1}{\tan(B_\theta/2)} \qquad (10)$$

In general we take ω_f = 0.12, B_f = 1.25 and B_θ = π/6. φ is the angle between the Gaussian coordinate axis and the abscissa axis; taking φ = 0°, 45°, 90° and 135° yields 4 different Gabor filters. When extracting the orientation feature types, if the current frame is a color image it is first converted to a grayscale image, then filtered with each of the 4 Gabor filters, yielding the feature maps of the 4 orientations.
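A sketch of the filter of equations (6)-(10) with the parameter values given above; the kernel support (31×31) and taking the magnitude of the filter response are assumptions of this sketch:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(omega_f=0.12, Bf=1.25, Btheta=np.pi / 6, phi=0.0, size=31):
    """Gabor filter of equations (6)-(10); parameter defaults follow the text."""
    c = np.sqrt(np.log(2) / 2) / (np.pi * omega_f)
    sigma_u = c * (2 ** Bf + 1) / (2 ** Bf - 1)        # equation (9)
    sigma_v = c / np.tan(Btheta / 2)                   # equation (10)
    half = size // 2
    grid = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    v, u = grid[0], grid[1]
    up = u * np.cos(phi) + v * np.sin(phi)             # equation (7)
    vp = -u * np.sin(phi) + v * np.cos(phi)
    q = np.exp(-up ** 2 / (2 * sigma_u ** 2) - vp ** 2 / (2 * sigma_v ** 2)) \
        / (2 * np.pi * sigma_u * sigma_v)              # equation (8)
    return q * np.cos(2 * np.pi * omega_f * up)        # equation (6)

def orientation_maps(gray):
    """Feature maps for the 4 orientations (0, 45, 90, 135 degrees)."""
    return [np.abs(fftconvolve(gray, gabor_kernel(phi=a), mode="same"))
            for a in (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)]
```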
For the texture feature we consider the local binary pattern (LBP). LBP is a texture feature used to describe the local spatial structure of an image and has been widely used to account for human perception. Ojala et al. (T. Ojala, M. Pietikäinen and D. Harwood, "A comparative study of texture measures with classification based on featured distributions," Pattern Recognition, 29(1): 51-59, 1996) first introduced this operator and demonstrated its powerful capability for texture classification. Again, if the current frame is a color image it is first converted to a grayscale image. For a given position (x_c, y_c) in the image, LBP is defined as the set of binary comparisons between the center pixel and its eight surrounding neighborhood pixels (as shown in Fig. 3), and the decimal result is expressed by the following formula:

$$\mathrm{LBP}(x_c,y_c) = \sum_{n=0}^{7} s(i_n - i_c)\,2^n \qquad (11)$$

where i_c is the pixel value of the center (x_c, y_c), i_n are the pixel values of the eight surrounding neighbors, and the function s(x) is defined by:

$$s(x) = \begin{cases}1, & x \ge 0\\ 0, & x < 0\end{cases} \qquad (12)$$

The present invention uses 2 LBP operators: one is the original LBP operator, and the other is an LBP operator with an extended ring radius. The extended operator preserves size and rotation invariance; when a sampling point does not fall at a pixel center, its value is obtained by interpolation. The two LBP operators are shown in Fig. 4. In total, therefore, the present invention uses 10 feature types.
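A sketch of the original 8-neighbor LBP operator of equations (11)-(12); the extended-radius variant with interpolated sampling points is omitted, and the bit ordering of the neighbors is an implementation choice not specified by the text:

```python
import numpy as np

def lbp_map(gray):
    """Original 8-neighbor LBP of equations (11)-(12) for interior pixels."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                 # center pixels i_c
    # neighbor offsets (dy, dx) in a fixed clockwise order
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        out += ((neighbor - c) >= 0).astype(np.int32) << bit  # s(i_n - i_c) * 2^n
    return out
```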
(B.2) Compute the static saliency map of the current frame
Each feature type map of the current frame is decomposed into 9 Gaussian pyramid maps (from scale 0 to scale 8), so that each feature type F has 9 feature maps F(i) (i ∈ {0, 1, ..., 8}). F(0) has the same size as the current frame, F(1) is half the size of F(0), F(2) is half the size of F(1), ..., and F(8) is half the size of F(7). Taking c ∈ {2, 3, 4}, δ ∈ {3, 4} and a = c + δ, let

$$F(c,a) = |F(c) \ominus F(a)| \qquad (13)$$

where ⊖ denotes the point-wise difference across pyramid levels, the coarser map being interpolated to the finer scale before subtraction. Each feature type thus has 6 feature maps, and the 10 feature types produce 60 feature maps in total.
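A sketch of the pyramid decomposition and the six center-surround differences of equation (13), assuming OpenCV's pyrDown for the Gaussian pyramid and bilinear resizing to align scales before the point-wise difference:

```python
import numpy as np
import cv2

def center_surround_maps(feature):
    """Equation (13): 9-level Gaussian pyramid and the 6 center-surround
    differences per feature type (c in {2,3,4}, delta in {3,4})."""
    pyr = [feature.astype(np.float64)]
    for _ in range(8):
        pyr.append(cv2.pyrDown(pyr[-1]))          # scales 0..8
    maps = []
    for c in (2, 3, 4):
        for delta in (3, 4):
            a = c + delta
            h, w = pyr[c].shape[:2]
            coarse = cv2.resize(pyr[a], (w, h))   # interpolate surround to center scale
            maps.append(np.abs(pyr[c] - coarse))  # |F(c) - F(a)| point-wise
    return maps
```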
We use the feature map normalization operator N(·) of Itti et al. to enhance feature maps with a small number of strong salient peaks and to weaken feature maps containing numerous comparable peaks. For each feature map, the operator: 1) normalizes the map to a fixed range [0, ..., M] to eliminate amplitude differences between features, where M is the maximum pixel value in the map; 2) computes the mean m̄ of all local maxima other than the global maximum; 3) multiplies the map by (M - m̄)². All values below 20% of the maximum are then set to zero.
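A sketch of the N(·) operator as just described; the window used to detect local maxima is an assumption, since the text does not specify it:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def N(fmap, M=1.0, win=5):
    """Map-normalization operator N(.) as described above."""
    f = fmap - fmap.min()
    if f.max() > 0:
        f = f * (M / f.max())                     # 1) normalize to the range [0, M]
    is_peak = (maximum_filter(f, size=win) == f) & (f > 0)
    peaks = f[is_peak]
    peaks = peaks[peaks < M]                      # local maxima, global one excluded
    m_bar = peaks.mean() if peaks.size else 0.0   # 2) mean of the local maxima
    f = f * (M - m_bar) ** 2                      # 3) promote maps with a lone peak
    if f.max() > 0:
        f[f < 0.2 * f.max()] = 0.0                # zero values below 20% of the max
    return f
```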
Considering only local maxima allows N(·) to compare the salient regions within a feature map while ignoring homogeneous regions. The difference between the global maximum and the mean of all local maxima reflects the difference between the most interesting region and the average regions of interest: if this difference is large, the most interesting region stands out; if it is small, the feature map contains no region with a distinctive property. The biological basis of N(·) is that it approximates the lateral inhibition mechanism of the cortex, in which similar neighboring features inhibit each other through specific connections. The feature maps are combined into 4 conspicuity maps: the intensity conspicuity map $\bar{I}$, the color conspicuity map $\bar{C}$, the orientation conspicuity map $\bar{O}$, and the texture conspicuity map $\bar{T}$. These can be expressed uniformly as

$$\bar{F} = \bigoplus_{c=2}^{4}\bigoplus_{a=c+3}^{c+4} N(F(c,a)) \qquad (14)$$

where ⊕ denotes point-wise summation, yielding the 4 conspicuity maps $\bar{I}$, $\bar{C}$, $\bar{O}$ and $\bar{T}$. These 4 conspicuity maps are further normalized and added to obtain the static saliency map M_s(x, y), as in formula (15):

$$M_s(x,y) = N(\bar{I}) + N(\bar{C}) + N(\bar{O}) + N(\bar{T}) \qquad (15)$$
(2) Obtain the final saliency map of the short video
With the dynamic saliency map and the static saliency map as above, the final saliency map is their weighted sum. The two maps compete for saliency: the dynamic saliency map emphasizes temporal saliency, while the static saliency map emphasizes spatial saliency. So that they can be compared, another normalization operator Norm(·) maps both into the interval [0, 1]; specifically, each pixel value of the dynamic saliency map is divided by the maximum pixel value of the dynamic saliency map, and each pixel value of the static saliency map is divided by the maximum pixel value of the static saliency map. When fusing them, a weight t ∈ [0, 1] represents the contribution of the dynamic saliency map to the final saliency map; in general 0.4 ≤ t ≤ 0.6 gives good results. The final saliency map M(x, y) can be expressed as:
$$M(x,y) = t \cdot \mathrm{Norm}(M_d(x,y)) + (1-t) \cdot \mathrm{Norm}(M_s(x,y)) \qquad (16)$$
From the computation above, the final saliency map M(x, y) at this point is 1/16 the size of the original input video frame V_1; to match the size of the original video frames, M(x, y) is enlarged to the same size as V_1.
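A sketch of the fusion of equation (16). Resizing the static map down to the dynamic map's scale before fusing, and then enlarging the result, follows the size bookkeeping described above, but the exact resizing scheme is an assumption:

```python
import numpy as np
import cv2

def final_saliency(M_d, M_s, t=0.5, out_size=None):
    """Equation (16): weighted fusion of the normalized dynamic and static
    saliency maps, then upscaling back to the original frame size."""
    norm = lambda m: m / m.max() if m.max() > 0 else m    # Norm(.): divide by max
    M_s_small = cv2.resize(M_s, M_d.shape[::-1])          # align to 1/16 scale
    M = t * norm(M_d) + (1 - t) * norm(M_s_small)
    if out_size is not None:                              # (width, height) of V_1
        M = cv2.resize(M, out_size)                       # enlarge to frame size
    return M
```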
(3) Winner-take-all: for every point ψ in the final saliency map, obtain an optimal size ψ_r by the entropy-maximization approach, as in equation (17); this equation expresses the salient scale at the position.

$$\psi_r = \arg\max_r \{ H_D(r,\psi) \times W_D(r,\psi) \} \qquad (17)$$

where D is the set of all pixel values of the circular local region of the final saliency map centered at ψ with radius r, H_D(r, ψ) is the entropy obtained from equation (18), and W_D(r, ψ) is the inter-scale measure obtained from equation (19):

$$H_D(r,\psi) = -\sum_{d\in D} p_{d,r,\psi} \log_2 p_{d,r,\psi} \qquad (18)$$

$$W_D(r,\psi) = \frac{r^2}{2r-1} \sum_{d\in D} \left| p_{d,r,\psi} - p_{d,r-1,\psi} \right| \qquad (19)$$

where p_{d,r,ψ} is the probability mass function obtained from the histogram of the normalized pixel values in the above local region, and the descriptor value d is an element of the set D.
For every point ψ in the final saliency map, an optimal size ψ_r is thus obtained. The mean of the point is then computed over a local region, namely the circular region centered at the point with radius ψ_r. All these means constitute a map; the point with the maximum value in this map is the most salient point, and the most salient point together with its corresponding optimal size constitutes the most salient region.
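A sketch of the optimal-scale search of equations (17)-(19); square windows and a fixed 16-bin histogram stand in for the circular region D and its distribution, both of which are assumptions of this sketch:

```python
import numpy as np

def optimal_scale(M, y, x, radii=range(2, 16), bins=16):
    """Equations (17)-(19): choose the radius r maximizing H_D * W_D at (y, x).
    M is assumed normalized to [0, 1], as after equation (16)."""
    def pmf(r):
        patch = M[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
        p, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
        return p / max(p.sum(), 1)
    best_r, best = None, -np.inf
    for r in radii:
        p, p_prev = pmf(r), pmf(r - 1)
        nz = p[p > 0]
        H = -np.sum(nz * np.log2(nz))                         # equation (18)
        W = r ** 2 / (2 * r - 1) * np.sum(np.abs(p - p_prev)) # equation (19)
        if H * W > best:
            best_r, best = r, H * W                           # equation (17)
    return best_r
```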
(4) Inhibition of return: the winner-take-all method has produced one most salient region. After gaze has attended to this region, it must be eliminated from the current final saliency map so that attention can shift to the next region. The present invention sets all pixel values of the most salient region in the final saliency map to zero, thereby obtaining a new final saliency map.
(5) Attention selection: repeat steps (3) to (5) until a predefined number of iterations λ is reached; 4 ≤ λ ≤ 10 gives good experimental results. The salient points obtained, together with the sizes of their regions, serve as the foci of attention.
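A sketch of the attend-inhibit-repeat loop of steps (3)-(5), built on the optimal_scale sketch above; for brevity the winner is taken as the argmax of the map itself rather than of the region-mean map described in step (3):

```python
import numpy as np

def attention_selection(M, n_foci=5, radii=range(2, 16)):
    """Steps (3)-(5): winner-take-all, inhibition of return, repeat."""
    M = M.copy()
    foci = []
    for _ in range(n_foci):
        # winner-take-all: most salient point of the current map
        y, x = np.unravel_index(np.argmax(M), M.shape)
        r = optimal_scale(M, y, x, radii)
        foci.append((y, x, r))
        # inhibition of return: zero out the attended circular region
        yy, xx = np.ogrid[:M.shape[0], :M.shape[1]]
        M[(yy - y) ** 2 + (xx - x) ** 2 <= r ** 2] = 0.0
    return foci
```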
Fig. 5 shows the static saliency maps and scan paths of the first and last frames of the short video. Fig. 6(a) and (b) show the dynamic saliency map and its scan path obtained with the method proposed by Ban et al.; Fig. 6(c) and (d) show the final saliency map and its scan path obtained with that method. Fig. 7(a) and (b) show the dynamic saliency map and its scan path obtained with the method of the invention; Fig. 7(c) and (d) show the final saliency map and its scan path obtained with the method of the invention. In the experiments we take t = 0.5, giving the dynamic and static saliency maps equal importance. Fig. 7(d) indicates the size of each salient region with boxes of different scales; the other figures contain no scale information, and the boxes in them indicate only the positions of the salient regions.
The present invention is not limited to the above embodiment. Persons skilled in the art can, according to the content disclosed by the invention, implement the invention in various other embodiments. Therefore, any simple design change or modification that adopts the design structure and ideas of the present invention falls within the scope of protection of the invention.

Claims (3)

1. A spatiotemporal saliency visual attention method based on information entropy, the steps of which comprise:
Step 1: extract the dynamic saliency map and the static saliency map from the short video. The dynamic saliency map is extracted as follows:
(A.1) For the input short video, take a sequence of n consecutive frames and convert each frame into an image with a reduced number of gray levels;
(A.2) Shrink each frame obtained in step (A.1) to 4 different scales, and combine the n shrunken frames at each scale into 1 dynamic response map; then shrink the 3 larger-scale dynamic response maps to the same scale as the smallest response map, and combine these 4 same-scale maps to generate the dynamic saliency map.
Step 2: combine the static saliency map and the dynamic saliency map to generate the final saliency map.
Step 3: winner-take-all:
For every point ψ in the final saliency map, obtain an optimal size ψ_r by the entropy-maximization approach, then compute the mean of the point over a local region, namely the circular region centered at the point with radius ψ_r; all these means constitute a map, the point with the maximum value in this map is the most salient point, and the most salient point together with its corresponding optimal size constitutes the most salient region.
Step 4: inhibition of return:
Set all pixel values of the most salient region in the final saliency map to zero, obtaining a new final saliency map.
Step 5: attention selection:
Repeat steps 3 to 5 until a predefined number of iterations is reached; the salient points obtained, together with the sizes of their regions, serve as the foci of attention.
2. The spatiotemporal saliency visual attention method based on information entropy according to claim 1, characterized in that in step (A.1) each frame is converted into an image with a reduced number of gray levels as follows:
Each frame is converted from 256 gray levels to n gray levels. Let Max be the maximum pixel value over all frames. For a coordinate point (x, y) in the k-th frame V_k, 1 ≤ k ≤ n, divide the corresponding pixel value V_k(x, y) by Max to obtain f(x, y, k), so that f(x, y, k) takes values in the interval [0, 1]. Then divide [0, 1] into n equal subintervals and assign a different integer value g(x, y, k) to the f(x, y, k) falling into each subinterval, these integers taking values in [0, n-1]; g(x, y, k) is taken as the pixel value of coordinate point (x, y) in the k-th frame V_k.
3. The spatiotemporal saliency visual attention method based on information entropy according to claim 2, characterized in that step (A.2) specifically comprises the following processes:
(A.2.1) Shrink each frame to 4 different scales. Taking the k-th frame V_k as an example, V_k is reduced to 4 scales V_{k,1}, V_{k,2}, V_{k,3} and V_{k,4}, respectively 1/2, 1/4, 1/8 and 1/16 of the original size, so that the sequence of n consecutive frames becomes 4 image sequences V_{1,s}, V_{2,s}, ..., V_{n,s}, where s, 1 ≤ s ≤ 4, indexes the scale; these 4 sequences are denoted V_{1,1}, V_{2,1}, ..., V_{n,1}; V_{1,2}, V_{2,2}, ..., V_{n,2}; V_{1,3}, V_{2,3}, ..., V_{n,3}; and V_{1,4}, V_{2,4}, ..., V_{n,4}. Let R_s(x, y) be the local region of the s-th image sequence at coordinate point (x, y): the circular region centered at (x, y) whose radius is half the minimum of the height and width of V_{n,4};
(A.2.2) For a coordinate point (x, y) in the s-th image sequence, all the values g(x, y, k) in the local region at (x, y) across the sequence constitute a histogram, and the entropy of the point is obtained from the probability mass function of this histogram, as in formula I; all the entropies constitute the dynamic response map M_{d,s}(x, y) at the current scale s:

$$M_{d,s}(x,y) = -\sum_{k\in\{1,2,\ldots,n\}} p_{g(x',y',k)} \log_2 p_{g(x',y',k)} \qquad \text{(formula I)}$$

where

$$(x',y') \in R_s(x,y)$$

and p_{g(x',y',k)} is the probability mass function produced by the histogram of all the pixel values of the s-th image sequence in the local region R_s(x, y);
(A.2.3) Shrink the 3 larger-scale response maps M_{d,s}(x, y) to the same scale as the smallest response map, then combine them to generate the dynamic saliency map M_d(x, y):

$$M_d(x,y) = \sum_{s=1}^{4} M_{d,s}(x,y)$$
CN2010101922408A 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy Expired - Fee Related CN101853513B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101922408A CN101853513B (en) 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101922408A CN101853513B (en) 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy

Publications (2)

Publication Number Publication Date
CN101853513A 2010-10-06
CN101853513B CN101853513B (en) 2012-02-29

Family

ID=42804978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101922408A Expired - Fee Related CN101853513B (en) 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy

Country Status (1)

Country Link
CN (1) CN101853513B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129349A (en) * 2011-03-18 2011-07-20 山东大学 Method for adaptively displaying image by taking visual transfer mechanism as guide
CN102831621A (en) * 2012-08-09 2012-12-19 西北工业大学 Video significance processing method based on spectral analysis
CN102867301A (en) * 2012-08-29 2013-01-09 西北工业大学 Mehtod for getting image salient features according to information entropy
CN103034865A (en) * 2012-12-13 2013-04-10 南京航空航天大学 Extraction method of visual salient regions based on multiscale relative entropy
CN104424642A (en) * 2013-09-09 2015-03-18 华为软件技术有限公司 Detection method and detection system for video salient regions
CN105825238A (en) * 2016-03-30 2016-08-03 江苏大学 Visual saliency object detection method
CN107451595A (en) * 2017-08-04 2017-12-08 河海大学 Infrared image salient region detection method based on hybrid algorithm
CN109102467A (en) * 2017-06-21 2018-12-28 北京小米移动软件有限公司 The method and device of picture processing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007003651A1 (en) * 2005-07-06 2007-01-11 Thomson Licensing Method of obtaining a saliency map from a plurality of saliency maps created from visual quantities
CN101651772A (en) * 2009-09-11 2010-02-17 宁波大学 Method for extracting video interested region based on visual attention

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007003651A1 (en) * 2005-07-06 2007-01-11 Thomson Licensing Method of obtaining a saliency map from a plurality of saliency maps created from visual quantities
CN101651772A (en) * 2009-09-11 2010-02-17 宁波大学 Method for extracting video interested region based on visual attention

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Yu-Fei Ma et al., "A Model of Motion Attention for Video Skimming," IEEE ICIP 2002, pp. 129-132, 2002. *
Timor Kadir et al., "Saliency, Scale and Image Description," International Journal of Computer Vision, vol. 45, no. 2, pp. 83-105, 2001. *
Nabil Ouerhani et al., "A Model of Dynamic Visual Attention for Object Tracking in Natural Image Sequences," Lecture Notes in Computer Science, vol. 2686, pp. 702-709, 2003. *
Longsheng Wei et al., "A Spatiotemporal Saliency Model of Visual Attention Based on Maximum Entropy," The 2010 Canadian Geomatics Conference and Symposium of Commission I, ISPRS, 2010. *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129349A (en) * 2011-03-18 2011-07-20 山东大学 Method for adaptively displaying image by taking visual transfer mechanism as guide
CN102129349B (en) * 2011-03-18 2012-09-19 山东大学 Method for adaptively displaying image by taking visual transfer mechanism as guide
CN102831621A (en) * 2012-08-09 2012-12-19 西北工业大学 Video significance processing method based on spectral analysis
CN102867301A (en) * 2012-08-29 2013-01-09 西北工业大学 Mehtod for getting image salient features according to information entropy
CN102867301B (en) * 2012-08-29 2015-01-28 西北工业大学 Mehtod for getting image salient features according to information entropy
CN103034865A (en) * 2012-12-13 2013-04-10 南京航空航天大学 Extraction method of visual salient regions based on multiscale relative entropy
CN104424642A (en) * 2013-09-09 2015-03-18 华为软件技术有限公司 Detection method and detection system for video salient regions
CN105825238A (en) * 2016-03-30 2016-08-03 江苏大学 Visual saliency object detection method
CN105825238B (en) * 2016-03-30 2019-04-30 江苏大学 A kind of vision significance mesh object detection method
CN109102467A (en) * 2017-06-21 2018-12-28 北京小米移动软件有限公司 The method and device of picture processing
CN107451595A (en) * 2017-08-04 2017-12-08 河海大学 Infrared image salient region detection method based on hybrid algorithm

Also Published As

Publication number Publication date
CN101853513B (en) 2012-02-29

Similar Documents

Publication Publication Date Title
CN101853513B (en) Time and space significance visual attention method based on entropy
CN104484667B (en) A kind of contour extraction method based on brightness and integrality of outline
CN104966085B (en) A kind of remote sensing images region of interest area detecting method based on the fusion of more notable features
Tomasi Histograms of oriented gradients
CN111275696B (en) Medical image processing method, image processing method and device
CN103177458B (en) A kind of visible remote sensing image region of interest area detecting method based on frequency-domain analysis
CN104835175B (en) Object detection method in a kind of nuclear environment of view-based access control model attention mechanism
CN107844795A (en) Convolutional neural network feature extraction method based on principal component analysis
CN103247059A (en) Remote sensing image region of interest detection method based on integer wavelets and visual features
CN103927741A (en) SAR image synthesis method for enhancing target characteristics
CN103218832B (en) Based on the vision significance algorithm of global color contrast and spatial distribution in image
CN106446936A (en) Hyperspectral data classification method for spectral-spatial combined data and oscillogram conversion based on convolution neural network
CN103020965A (en) Foreground segmentation method based on significance detection
Zhang et al. Cloud detection method using CNN based on cascaded feature attention and channel attention
CN103295241A (en) Frequency domain significance target detection method based on Gabor wavelet
CN107154044A (en) A kind of dividing method of Chinese meal food image
CN103946868A (en) Processing method and system for medical images
CN104966054A (en) Weak and small object detection method in visible image of unmanned plane
CN105893960A (en) Road traffic sign detecting method based on phase symmetry
CN105426846A (en) Method for positioning text in scene image based on image segmentation model
CN109190456A (en) Pedestrian detection method is overlooked based on the multiple features fusion of converging channels feature and gray level co-occurrence matrixes
CN106296632B (en) A kind of well-marked target detection method based on amplitude spectrum analysis
CN117079097A (en) Sea surface target identification method based on visual saliency
CN101894371B (en) Bio-inspired top-down visual attention method
Ran et al. Sketch-guided spatial adaptive normalization and high-level feature constraints based GAN image synthesis for steel strip defect detection data augmentation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120229

Termination date: 20170606

CF01 Termination of patent right due to non-payment of annual fee