CN101853513A - Time and space significance visual attention method based on entropy - Google Patents

Time and space significance visual attention method based on entropy

Info

Publication number
CN101853513A
CN101853513A (application CN201010192240A)
Authority
CN
China
Prior art keywords
saliency
dynamic
significance
frame
static
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 201010192240
Other languages
Chinese (zh)
Other versions
CN101853513B (en)
Inventor
魏龙生 (Wei Longsheng)
桑农 (Sang Nong)
王岳环 (Wang Yuehuan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN2010101922408A priority Critical patent/CN101853513B/en
Publication of CN101853513A publication Critical patent/CN101853513A/en
Application granted granted Critical
Publication of CN101853513B publication Critical patent/CN101853513B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an entropy-based spatiotemporal saliency visual attention method, which comprises the following steps: (1) extract a dynamic saliency map and a static saliency map from a short video; (2) combine the static saliency map and the dynamic saliency map to generate a final saliency map; (3) winner-take-all; (4) inhibition of return: set the pixel values of the most salient region in the final saliency map to zero to obtain a new final saliency map; and (5) attention selection. When computing dynamic saliency, the method computes the dynamic saliency across all frames directly and computes a static saliency map only for the current frame, which overcomes the problems of the prior art, saves computation time, and detects the dynamically salient parts well. In addition, the invention uses a multi-scale approach to compute dynamic saliency, so that the dynamic saliency of objects of different sizes in the video is computed more accurately and a good visual effect is obtained.

Description

A spatiotemporal saliency visual attention method based on information entropy
Technical field
The invention belongs to the field of computer vision, and specifically relates to a spatiotemporal saliency visual attention method based on information entropy.
Background technology
Visual attention methods mainly address the problem of data screening in images. In a computer image, the content relevant to the task is usually only a very small part of the image, so it is necessary to give different image regions different processing priorities; this reduces the complexity of processing and avoids wasted computation. In human visual information processing, a few salient objects are always selected quickly for priority processing while other non-salient objects are ignored or discarded, which allows computational resources to be allocated selectively and thereby greatly improves the efficiency of visual information processing. This process is called visual attention.
Through the mechanism of selective visual attention, the human visual system can easily find regions and targets of interest in natural scenes. Visual attention offers computer vision a similar prospect of accelerated processing, and selective attention allows the human visual system to handle the input visual scene more effectively at a higher level of complexity. In a short video, the role of motion rests on the fact that human attention is attracted more easily by moving stimuli than by a static scene. Visual attention therefore clearly involves motion, and the fast detection of moving objects is a key technique for adaptive interaction between humans and the environment. Thus the human visual system handles not only static scenes but also dynamic scenes.
Human visual selective attention resembles the processing performed by the retina: through different cells of interest, the retina produces two outputs, a magnocellular output and a parvocellular output. The magnocellular output responds quickly and can be simulated by low spatial frequencies; the parvocellular output provides detailed information and can be simulated by the high spatial frequencies extracted from an image, which enhance the contrast of a frame and attract visual attention on static frames. The present invention draws its inspiration from these two retinal outputs: two signals corresponding to the two main retinal outputs are extracted from each frame, the input short video being decomposed into a low-frequency band to simulate the dynamic output and a high-frequency band to simulate the static output. This yields a dynamic saliency map and a static saliency map, which are fused to generate the final saliency map.
Spatiotemporal saliency models mainly comprise dynamic models and static models. Most visual attention computations are static and are based on feature integration theory. The most widely used is the static visual attention model proposed by Itti et al. (L. Itti, C. Koch and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), pp. 1254-1259, 1998), which uses elementary visual features such as intensity, orientation and color.
In recent years many publications have introduced dynamic saliency into visual attention mechanisms. Ban et al. proposed a typical dynamic selective attention model (S. Ban, I. Lee and M. Lee, "Dynamic visual selective attention model," Neurocomputing, vol. 71, pp. 853-856, 2008). The procedure is as follows: first, compute a static saliency map for each frame of the video; second, for each point of each static saliency map, compute an optimal scale; third, compute an entropy map from these optimal scales and the static saliency map, so that each frame yields a static entropy map; finally, obtain a new entropy map from this sequence of static entropy maps. This entropy map is the dynamic saliency map.
The model of Ban et al. is attractive in theory, but when the moving target does not lie inside a salient region, the model has difficulty detecting the region of motion.
Summary of the invention
The object of the invention is to propose a spatiotemporal saliency visual attention method based on information entropy; the method has good scale invariance and obtains a good visual effect.
The invention provides a spatiotemporal saliency visual attention method based on information entropy, the specific steps of which are:
Step 1: extract the dynamic saliency map and the static saliency map from the short video. The dynamic saliency map is extracted as follows:
(A.1) For the input short video, take a sequence of n consecutive frames and convert each frame into an image with a reduced number of gray levels;
(A.2) Shrink each frame obtained in step (A.1) to 4 different scales, and combine the n shrunken frames at each scale into 1 dynamic response map; then shrink the 3 larger-scale dynamic response maps to the same scale as the smallest response map, and combine these 4 same-scale maps to generate the dynamic saliency map.
Step 2: combine the static saliency map and the dynamic saliency map to generate the final saliency map.
Step 3: winner-take-all:
For every point ψ in the final saliency map, obtain an optimal size ψ_r by the entropy-maximization approach, then compute the mean of the point over a local region, namely the circular region centered at the point with radius ψ_r. All these means constitute a map; the point with the maximum value in this map is the most salient point, and the most salient point together with its corresponding optimal size constitutes the most salient region.
Step 4: inhibition of return:
Set all pixel values of the most salient region in the final saliency map to zero, obtaining a new final saliency map.
Step 5: attention selection:
Repeat steps 3 to 5 until a predefined number of iterations is reached; the salient points obtained, together with the sizes of their regions, serve as the foci of attention.
The invention proposes a spatiotemporal saliency visual attention method based on information entropy, covering both dynamic and static saliency. When computing dynamic saliency, existing methods first compute a static saliency map for each frame and then compute the dynamic saliency from the static saliency maps of all frames. This has two shortcomings: first, computing a static saliency map for every frame consumes a great deal of time; second, when the moving target is not inside a static salient region, the method fails to detect the dynamically salient part. The present invention computes the dynamic saliency across all frames directly and computes a static saliency map only for the current frame, which solves both problems, saves computation time, and detects dynamically salient parts better. In addition, the invention uses a multi-scale method to compute dynamic saliency, so that the dynamic saliency of objects of different sizes in the video is computed better and a good visual effect is obtained.
Description of drawings
Fig. 1 is a flowchart of the present invention;
Fig. 2: (a) input color frame; (b) grayscale frame; (c) 8-gray-level frame; (d) 4-gray-level frame;
Fig. 3 is the LBP operator;
Fig. 4: (a) the original LBP operator; (b) the extended LBP operator;
Fig. 5: (a) and (b) are the static saliency map and scan path of the first frame; (c) and (d) are the static saliency map and scan path of the last frame;
Fig. 6: (a) and (b) are the dynamic saliency map and its scan path obtained by the method of Ban et al. from static saliency maps; (c) and (d) are the final saliency map and scan path of the method of Ban et al.;
Fig. 7: (a) and (b) are the dynamic saliency map and its scan path obtained by the present invention from consecutive frames; (c) and (d) are the final saliency map of the present invention and its scan path.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and examples.
As shown in Figure 1, the method of the invention comprises the following steps:
(1) Extract the dynamic saliency map and the static saliency map from the short video:
(A) Extract the dynamic saliency map from the short video:
(A.1) For an input short video V, take a sequence of n consecutive frames V_1, V_2, ..., V_n. In general, 3 ≤ n ≤ 8 gives good experimental results. To speed up computation and reduce complexity, each frame is converted into an image with a reduced number of gray levels; in the present invention, we take the number of frames drawn from the short video as the number of gray levels. If the input is a color image, it is first converted to a grayscale image, and each frame is then converted from 256 gray levels to n gray levels (n < 256). Let Max be the maximum pixel value over all frames. For a coordinate point (x, y) in the k-th frame V_k (1 ≤ k ≤ n), divide the corresponding pixel value V_k(x, y) by Max to obtain f(x, y, k), so that f(x, y, k) takes values in the interval [0, 1], as in equation (1). Then divide [0, 1] into n equal subintervals and assign a different integer value g(x, y, k) to the f(x, y, k) falling into each subinterval; these integers take values in [0, n-1], as in equation (2). Fig. 2 shows an example in which a color frame (a) is converted to a grayscale frame (b) and then to an 8-gray-level frame (c) and a 4-gray-level frame (d).

$$f(x,y,k) = V_k(x,y)/\mathrm{Max} \qquad (1)$$

$$g(x,y,k) = \begin{cases} 0, & 0 \le f(x,y,k) \le \tfrac{1}{n} \\ 1, & \tfrac{1}{n} < f(x,y,k) \le \tfrac{2}{n} \\ \;\vdots \\ n-1, & \tfrac{n-1}{n} < f(x,y,k) \le 1 \end{cases} \qquad (2)$$
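The quantization of equations (1)-(2) reduces to a normalize-and-bin operation. Below is a minimal NumPy sketch; the function name and the array layout are illustrative assumptions, not part of the patent:

```python
import numpy as np

def quantize_frames(frames, n):
    """Quantize n grayscale frames to n gray levels (equations 1-2).

    frames: list of 2-D arrays V_1..V_n; returns an (n, H, W) integer
    array g with values in [0, n-1].
    """
    stack = np.stack([f.astype(np.float64) for f in frames])
    f = stack / stack.max()            # equation (1): divide by the global Max
    g = np.ceil(f * n) - 1             # equation (2): n equal bins over [0, 1]
    return np.clip(g, 0, n - 1).astype(np.int32)  # f = 0 falls in the first bin
```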
(A.2) In order to detect regions of motion more effectively, we shrink each frame to 4 different scales. Taking the k-th frame V_k as an example, V_k is reduced to 4 scales V_{k,1}, V_{k,2}, V_{k,3} and V_{k,4}, whose sizes are 1/2, 1/4, 1/8 and 1/16 of the original, respectively. The image sequence thus becomes 4 image sequences V_{1,s}, V_{2,s}, ..., V_{n,s} (where s, 1 ≤ s ≤ 4, indexes the scale), denoted V_{1,1}, V_{2,1}, ..., V_{n,1}; V_{1,2}, V_{2,2}, ..., V_{n,2}; V_{1,3}, V_{2,3}, ..., V_{n,3}; and V_{1,4}, V_{2,4}, ..., V_{n,4}. Let R_s(x, y) be the local region of the s-th image sequence at coordinate point (x, y): the circular region centered at (x, y) whose radius is half the minimum of the height and width of V_{n,4}. For a coordinate point (x, y) in the s-th image sequence, all the values g(x, y, k) in the local region at (x, y) across the sequence constitute a histogram, and the entropy of the point is obtained from the probability mass function of this histogram, as in equation (3). The larger the entropy, the stronger the saliency of the point, and all the entropies constitute the dynamic response map M_{d,s}(x, y) at the current scale:

$$M_{d,s}(x,y) = -\sum_{k\in\{1,2,\ldots,n\}} p_{g(x',y',k)} \log_2 p_{g(x',y',k)} \qquad (3)$$

where

$$(x',y') \in R_s(x,y) \qquad (4)$$

and p_{g(x',y',k)} is the probability mass function produced by the histogram of all the pixel values of the s-th image sequence in the local region R_s(x, y).
The 3 larger-scale response maps are then shrunk to the same scale as the smallest response map and combined to generate the dynamic saliency map M_d(x, y):

$$M_d(x,y) = \sum_{s=1}^{4} M_{d,s}(x,y) \qquad (5)$$
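Under the reading above, the dynamic response at a scale is the entropy of the histogram of quantized values collected over a local region across all n frames. A sketch, assuming a square window in place of the circular region R_s and OpenCV resizing for equation (5), both of which are simplifications:

```python
import numpy as np
import cv2

def dynamic_response(g, n, radius):
    """One scale of equation (3): at each pixel, the entropy of the histogram
    of quantized values over a local window across all n frames.
    g: (n, H, W) integer array at this scale (output of quantize_frames)."""
    _, H, W = g.shape
    M = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            patch = g[:, max(0, y - radius):y + radius + 1,
                         max(0, x - radius):x + radius + 1]
            p = np.bincount(patch.ravel(), minlength=n) / patch.size
            p = p[p > 0]
            M[y, x] = -np.sum(p * np.log2(p))  # larger entropy = stronger saliency
    return M

def dynamic_saliency(g_scales, n, radius):
    """Equation (5): shrink the three larger response maps to the smallest
    scale and sum. g_scales: the 4 quantized sequences, largest scale first."""
    maps = [dynamic_response(g, n, radius) for g in g_scales]
    h, w = maps[-1].shape
    return sum(cv2.resize(m, (w, h)) for m in maps)
```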
(B) Extract the static saliency map of the current frame
The static saliency map covers color contrast, luminance contrast and orientation, and can be produced with the model proposed by Itti et al.
As an improvement of the present invention, the static saliency map may also take texture information into account; this static saliency model is an extension of the model proposed by Itti et al. The details are as follows:
(B.1) Extraction of saliency features
Four kinds of low-level visual features are extracted and fused into the static saliency map: color contrast, luminance contrast, orientation, and texture. Let r, g and b be the three color channels (red, green and blue) of the input image. We create 4 broadly tuned color channels: R = r - (g+b)/2 for red, G = g - (r+b)/2 for green, B = b - (r+g)/2 for blue, and Y = (r+g)/2 - |r-g|/2 - b for yellow (negative values are set to zero). Then RG = |R - G| is the red-green contrast and BY = |B - Y| is the blue-yellow contrast, so the color feature is decomposed into 2 feature types: red-green contrast and blue-yellow contrast.
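A sketch of the broadly tuned color channels and the two color-contrast feature types; the function name and array layout are illustrative:

```python
import numpy as np

def color_contrast_channels(img):
    """Broadly tuned color channels and the RG/BY contrast feature types.
    img: float array of shape (H, W, 3) with channels r, g, b in [0, 1]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    R = np.maximum(r - (g + b) / 2, 0)                       # red
    G = np.maximum(g - (r + b) / 2, 0)                       # green
    B = np.maximum(b - (r + g) / 2, 0)                       # blue
    Y = np.maximum((r + g) / 2 - np.abs(r - g) / 2 - b, 0)   # yellow
    return np.abs(R - G), np.abs(B - Y)                      # RG and BY contrasts
```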
Luminance is divided into 2 types: brightness-on (bright center on dark surround) and brightness-off (dark center on bright surround). This is because the receptive-field cells of human visual perception come in 2 types: on-center cells are excited by a bright center and inhibited by a bright surround, while off-center cells are inhibited by a bright center and excited by a bright surround. If the current frame is a color image, it is first converted to a grayscale image. The brightness-on feature map is obtained by subtracting, at each point, the mean of the surrounding neighborhood pixel values from the pixel value of the point (negative values set to zero); likewise, the brightness-off feature map is obtained by subtracting the pixel value of the point from the mean of the surrounding neighborhood pixel values (negative values set to zero).
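A sketch of the two luminance feature types; the 3×3 mean filter standing in for the surrounding-neighborhood mean is an assumption, since the neighborhood size is not fixed by the text:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def brightness_on_off(gray):
    """Brightness-on and brightness-off feature maps: center minus surround
    mean and surround mean minus center, negatives clipped to zero."""
    surround = uniform_filter(gray.astype(np.float64), size=3)  # assumed 3x3 mean
    on = np.maximum(gray - surround, 0)    # on-center: bright spot on dark surround
    off = np.maximum(surround - gray, 0)   # off-center: dark spot on bright surround
    return on, off
```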
Four orientation feature types, 0°, 45°, 90° and 135°, are detected with Gabor filters. The mathematical expression of the Gabor filter is:

$$h(u,v) = q(u',v')\cos(2\pi\omega_f u') \qquad (6)$$

where

$$(u',v') = (u\cos\phi + v\sin\phi,\; -u\sin\phi + v\cos\phi) \qquad (7)$$

$$q(u,v) = \frac{1}{2\pi\sigma_u\sigma_v}\exp\!\left(-\frac{u^2}{2\sigma_u^2} - \frac{v^2}{2\sigma_v^2}\right) \qquad (8)$$

ω_f is the center frequency of the Gabor filter; it determines the position of the center of the filter passband in frequency, and different scales are obtained by choosing different ω_f. σ_u and σ_v are the space constants of the Gaussian envelope of the Gabor filter along the abscissa and the ordinate respectively, and are related to the frequency bandwidth B_f and the orientation bandwidth B_θ of the Gabor filter by:

$$\sigma_u = \frac{\sqrt{\ln 2/2}}{\pi\omega_f}\cdot\frac{2^{B_f}+1}{2^{B_f}-1} \qquad (9)$$

$$\sigma_v = \frac{\sqrt{\ln 2/2}}{\pi\omega_f}\cdot\frac{1}{\tan(B_\theta/2)} \qquad (10)$$

In general we take ω_f = 0.12, B_f = 1.25 and B_θ = π/6. φ is the angle between the Gaussian coordinate axis and the abscissa axis; taking φ = 0°, 45°, 90° and 135° yields 4 different Gabor filters. When extracting the orientation feature types, if the current frame is a color image it is first converted to a grayscale image, then filtered with each of the 4 Gabor filters, yielding the feature maps of the 4 orientations.
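A sketch of the filter of equations (6)-(10) with the parameter values given above; the kernel support (31×31) and taking the magnitude of the filter response are assumptions of this sketch:

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(omega_f=0.12, Bf=1.25, Btheta=np.pi / 6, phi=0.0, size=31):
    """Gabor filter of equations (6)-(10); parameter defaults follow the text."""
    c = np.sqrt(np.log(2) / 2) / (np.pi * omega_f)
    sigma_u = c * (2 ** Bf + 1) / (2 ** Bf - 1)        # equation (9)
    sigma_v = c / np.tan(Btheta / 2)                   # equation (10)
    half = size // 2
    grid = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    v, u = grid[0], grid[1]
    up = u * np.cos(phi) + v * np.sin(phi)             # equation (7)
    vp = -u * np.sin(phi) + v * np.cos(phi)
    q = np.exp(-up ** 2 / (2 * sigma_u ** 2) - vp ** 2 / (2 * sigma_v ** 2)) \
        / (2 * np.pi * sigma_u * sigma_v)              # equation (8)
    return q * np.cos(2 * np.pi * omega_f * up)        # equation (6)

def orientation_maps(gray):
    """Feature maps for the 4 orientations (0, 45, 90, 135 degrees)."""
    return [np.abs(fftconvolve(gray, gabor_kernel(phi=a), mode="same"))
            for a in (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)]
```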
For the texture feature we consider the local binary pattern (LBP). LBP is a texture feature used to describe the local spatial structure of an image and has been widely used to account for human perception. Ojala et al. (T. Ojala, M. Pietikäinen and D. Harwood, "A comparative study of texture measures with classification based on featured distributions," Pattern Recognition, 29(1): 51-59, 1996) first introduced this operator and demonstrated its powerful capability for texture classification. Again, if the current frame is a color image it is first converted to a grayscale image. For a given position (x_c, y_c) in the image, LBP is defined as the set of binary comparisons between the center pixel and its eight surrounding neighborhood pixels (as shown in Fig. 3), and the decimal result is expressed by the following formula:

$$\mathrm{LBP}(x_c,y_c) = \sum_{n=0}^{7} s(i_n - i_c)\,2^n \qquad (11)$$

where i_c is the pixel value of the center (x_c, y_c), i_n are the pixel values of the eight surrounding neighbors, and the function s(x) is defined by:

$$s(x) = \begin{cases}1, & x \ge 0\\ 0, & x < 0\end{cases} \qquad (12)$$

The present invention uses 2 LBP operators: one is the original LBP operator, and the other is an LBP operator with an extended ring radius. The extended operator preserves size and rotation invariance; when a sampling point does not fall at a pixel center, its value is obtained by interpolation. The two LBP operators are shown in Fig. 4. In total, therefore, the present invention uses 10 feature types.
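A sketch of the original 8-neighbor LBP operator of equations (11)-(12); the extended-radius variant with interpolated sampling points is omitted, and the bit ordering of the neighbors is an implementation choice not specified by the text:

```python
import numpy as np

def lbp_map(gray):
    """Original 8-neighbor LBP of equations (11)-(12) for interior pixels."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                 # center pixels i_c
    # neighbor offsets (dy, dx) in a fixed clockwise order
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        out += ((neighbor - c) >= 0).astype(np.int32) << bit  # s(i_n - i_c) * 2^n
    return out
```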
(B.2) Compute the static saliency map of the current frame
Each feature type map of the current frame is decomposed into 9 Gaussian pyramid maps (from scale 0 to scale 8), so that each feature type F has 9 feature maps F(i) (i ∈ {0, 1, ..., 8}). F(0) has the same size as the current frame, F(1) is half the size of F(0), F(2) is half the size of F(1), ..., and F(8) is half the size of F(7). Taking c ∈ {2, 3, 4}, δ ∈ {3, 4} and a = c + δ, let

$$F(c,a) = |F(c) \ominus F(a)| \qquad (13)$$

where ⊖ denotes the point-wise difference across pyramid levels, the coarser map being interpolated to the finer scale before subtraction. Each feature type thus has 6 feature maps, and the 10 feature types produce 60 feature maps in total.
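A sketch of the pyramid decomposition and the six center-surround differences of equation (13), assuming OpenCV's pyrDown for the Gaussian pyramid and bilinear resizing to align scales before the point-wise difference:

```python
import numpy as np
import cv2

def center_surround_maps(feature):
    """Equation (13): 9-level Gaussian pyramid and the 6 center-surround
    differences per feature type (c in {2,3,4}, delta in {3,4})."""
    pyr = [feature.astype(np.float64)]
    for _ in range(8):
        pyr.append(cv2.pyrDown(pyr[-1]))          # scales 0..8
    maps = []
    for c in (2, 3, 4):
        for delta in (3, 4):
            a = c + delta
            h, w = pyr[c].shape[:2]
            coarse = cv2.resize(pyr[a], (w, h))   # interpolate surround to center scale
            maps.append(np.abs(pyr[c] - coarse))  # |F(c) - F(a)| point-wise
    return maps
```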
We use the feature map normalization operator N(·) of Itti et al. to enhance feature maps with a small number of strong salient peaks and to weaken feature maps containing numerous comparable peaks. For each feature map, the operator: 1) normalizes the map to a fixed range [0, ..., M] to eliminate amplitude differences between features, where M is the maximum pixel value in the map; 2) computes the mean m̄ of all local maxima other than the global maximum; 3) multiplies the map by (M - m̄)². All values below 20% of the maximum are then set to zero.
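A sketch of the N(·) operator as just described; the window used to detect local maxima is an assumption, since the text does not specify it:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def N(fmap, M=1.0, win=5):
    """Map-normalization operator N(.) as described above."""
    f = fmap - fmap.min()
    if f.max() > 0:
        f = f * (M / f.max())                     # 1) normalize to the range [0, M]
    is_peak = (maximum_filter(f, size=win) == f) & (f > 0)
    peaks = f[is_peak]
    peaks = peaks[peaks < M]                      # local maxima, global one excluded
    m_bar = peaks.mean() if peaks.size else 0.0   # 2) mean of the local maxima
    f = f * (M - m_bar) ** 2                      # 3) promote maps with a lone peak
    if f.max() > 0:
        f[f < 0.2 * f.max()] = 0.0                # zero values below 20% of the max
    return f
```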
Considering only local maxima allows N(·) to compare the salient regions within a feature map while ignoring homogeneous regions. The difference between the global maximum and the mean of all local maxima reflects the difference between the most interesting region and the average regions of interest: if this difference is large, the most interesting region stands out; if it is small, the feature map contains no region with a distinctive property. The biological basis of N(·) is that it approximates the lateral inhibition mechanism of the cortex, in which similar neighboring features inhibit each other through specific connections. The feature maps are combined into 4 conspicuity maps: the intensity conspicuity map $\bar{I}$, the color conspicuity map $\bar{C}$, the orientation conspicuity map $\bar{O}$, and the texture conspicuity map $\bar{T}$. These can be expressed uniformly as

$$\bar{F} = \bigoplus_{c=2}^{4}\bigoplus_{a=c+3}^{c+4} N(F(c,a)) \qquad (14)$$

where ⊕ denotes point-wise summation, yielding the 4 conspicuity maps $\bar{I}$, $\bar{C}$, $\bar{O}$ and $\bar{T}$. These 4 conspicuity maps are further normalized and added to obtain the static saliency map M_s(x, y), as in formula (15):

$$M_s(x,y) = N(\bar{I}) + N(\bar{C}) + N(\bar{O}) + N(\bar{T}) \qquad (15)$$
(2) Obtain the final saliency map of the short video
With the dynamic saliency map and the static saliency map as above, the final saliency map is their weighted sum. The two maps compete for saliency: the dynamic saliency map emphasizes temporal saliency, while the static saliency map emphasizes spatial saliency. So that they can be compared, another normalization operator Norm(·) maps both into the interval [0, 1]; specifically, each pixel value of the dynamic saliency map is divided by the maximum pixel value of the dynamic saliency map, and each pixel value of the static saliency map is divided by the maximum pixel value of the static saliency map. When fusing them, a weight t ∈ [0, 1] represents the contribution of the dynamic saliency map to the final saliency map; in general 0.4 ≤ t ≤ 0.6 gives good results. The final saliency map M(x, y) can be expressed as:
$$M(x,y) = t \cdot \mathrm{Norm}(M_d(x,y)) + (1-t) \cdot \mathrm{Norm}(M_s(x,y)) \qquad (16)$$
From the computation above, the final saliency map M(x, y) at this point is 1/16 the size of the original input video frame V_1; to match the size of the original video frames, M(x, y) is enlarged to the same size as V_1.
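A sketch of the fusion of equation (16). Resizing the static map down to the dynamic map's scale before fusing, and then enlarging the result, follows the size bookkeeping described above, but the exact resizing scheme is an assumption:

```python
import numpy as np
import cv2

def final_saliency(M_d, M_s, t=0.5, out_size=None):
    """Equation (16): weighted fusion of the normalized dynamic and static
    saliency maps, then upscaling back to the original frame size."""
    norm = lambda m: m / m.max() if m.max() > 0 else m    # Norm(.): divide by max
    M_s_small = cv2.resize(M_s, M_d.shape[::-1])          # align to 1/16 scale
    M = t * norm(M_d) + (1 - t) * norm(M_s_small)
    if out_size is not None:                              # (width, height) of V_1
        M = cv2.resize(M, out_size)                       # enlarge to frame size
    return M
```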
(3) Winner-take-all: for every point ψ in the final saliency map, obtain an optimal size ψ_r by the entropy-maximization approach, as in equation (17); this equation expresses the salient scale at the position.

$$\psi_r = \arg\max_r \{ H_D(r,\psi) \times W_D(r,\psi) \} \qquad (17)$$

where D is the set of all pixel values of the circular local region of the final saliency map centered at ψ with radius r, H_D(r, ψ) is the entropy obtained from equation (18), and W_D(r, ψ) is the inter-scale measure obtained from equation (19):

$$H_D(r,\psi) = -\sum_{d\in D} p_{d,r,\psi} \log_2 p_{d,r,\psi} \qquad (18)$$

$$W_D(r,\psi) = \frac{r^2}{2r-1} \sum_{d\in D} \left| p_{d,r,\psi} - p_{d,r-1,\psi} \right| \qquad (19)$$

where p_{d,r,ψ} is the probability mass function obtained from the histogram of the normalized pixel values in the above local region, and the descriptor value d is an element of the set D.
For every point ψ in the final saliency map, an optimal size ψ_r is thus obtained. The mean of the point is then computed over a local region, namely the circular region centered at the point with radius ψ_r. All these means constitute a map; the point with the maximum value in this map is the most salient point, and the most salient point together with its corresponding optimal size constitutes the most salient region.
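A sketch of the optimal-scale search of equations (17)-(19); square windows and a fixed 16-bin histogram stand in for the circular region D and its distribution, both of which are assumptions of this sketch:

```python
import numpy as np

def optimal_scale(M, y, x, radii=range(2, 16), bins=16):
    """Equations (17)-(19): choose the radius r maximizing H_D * W_D at (y, x).
    M is assumed normalized to [0, 1], as after equation (16)."""
    def pmf(r):
        patch = M[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
        p, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
        return p / max(p.sum(), 1)
    best_r, best = None, -np.inf
    for r in radii:
        p, p_prev = pmf(r), pmf(r - 1)
        nz = p[p > 0]
        H = -np.sum(nz * np.log2(nz))                         # equation (18)
        W = r ** 2 / (2 * r - 1) * np.sum(np.abs(p - p_prev)) # equation (19)
        if H * W > best:
            best_r, best = r, H * W                           # equation (17)
    return best_r
```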
(4) Inhibition of return: the winner-take-all method has produced one most salient region. After gaze has attended to this region, it must be eliminated from the current final saliency map so that attention can shift to the next region. The present invention sets all pixel values of the most salient region in the final saliency map to zero, thereby obtaining a new final saliency map.
(5) Attention selection: repeat steps (3) to (5) until a predefined number of iterations λ is reached; 4 ≤ λ ≤ 10 gives good experimental results. The salient points obtained, together with the sizes of their regions, serve as the foci of attention.
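A sketch of the attend-inhibit-repeat loop of steps (3)-(5), built on the optimal_scale sketch above; for brevity the winner is taken as the argmax of the map itself rather than of the region-mean map described in step (3):

```python
import numpy as np

def attention_selection(M, n_foci=5, radii=range(2, 16)):
    """Steps (3)-(5): winner-take-all, inhibition of return, repeat."""
    M = M.copy()
    foci = []
    for _ in range(n_foci):
        # winner-take-all: most salient point of the current map
        y, x = np.unravel_index(np.argmax(M), M.shape)
        r = optimal_scale(M, y, x, radii)
        foci.append((y, x, r))
        # inhibition of return: zero out the attended circular region
        yy, xx = np.ogrid[:M.shape[0], :M.shape[1]]
        M[(yy - y) ** 2 + (xx - x) ** 2 <= r ** 2] = 0.0
    return foci
```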
Fig. 5 shows the static saliency maps and scan paths of the first and last frames of the short video. Fig. 6(a) and (b) show the dynamic saliency map and its scan path obtained with the method proposed by Ban et al.; Fig. 6(c) and (d) show the final saliency map and its scan path obtained with that method. Fig. 7(a) and (b) show the dynamic saliency map and its scan path obtained with the method of the invention; Fig. 7(c) and (d) show the final saliency map and its scan path obtained with the method of the invention. In the experiments we take t = 0.5, giving the dynamic and static saliency maps equal importance. Fig. 7(d) indicates the size of each salient region with boxes of different scales; the other figures contain no scale information, and the boxes in them indicate only the positions of the salient regions.
The present invention is not limited to the above embodiment. Persons skilled in the art can, according to the content disclosed by the invention, implement the invention in various other embodiments. Therefore, any simple design change or modification that adopts the design structure and ideas of the present invention falls within the scope of protection of the invention.

Claims (3)

1. A spatiotemporal saliency visual attention method based on information entropy, the steps of which comprise:
Step 1: extract the dynamic saliency map and the static saliency map from the short video. The dynamic saliency map is extracted as follows:
(A.1) For the input short video, take a sequence of n consecutive frames and convert each frame into an image with a reduced number of gray levels;
(A.2) Shrink each frame obtained in step (A.1) to 4 different scales, and combine the n shrunken frames at each scale into 1 dynamic response map; then shrink the 3 larger-scale dynamic response maps to the same scale as the smallest response map, and combine these 4 same-scale maps to generate the dynamic saliency map.
Step 2: combine the static saliency map and the dynamic saliency map to generate the final saliency map.
Step 3: winner-take-all:
For every point ψ in the final saliency map, obtain an optimal size ψ_r by the entropy-maximization approach, then compute the mean of the point over a local region, namely the circular region centered at the point with radius ψ_r; all these means constitute a map, the point with the maximum value in this map is the most salient point, and the most salient point together with its corresponding optimal size constitutes the most salient region.
Step 4: inhibition of return:
Set all pixel values of the most salient region in the final saliency map to zero, obtaining a new final saliency map.
Step 5: attention selection:
Repeat steps 3 to 5 until a predefined number of iterations is reached; the salient points obtained, together with the sizes of their regions, serve as the foci of attention.
2. The spatiotemporal saliency visual attention method based on information entropy according to claim 1, characterized in that in step (A.1) each frame is converted into an image with a reduced number of gray levels as follows:
Each frame is converted from 256 gray levels to n gray levels. Let Max be the maximum pixel value over all frames. For a coordinate point (x, y) in the k-th frame V_k, 1 ≤ k ≤ n, divide the corresponding pixel value V_k(x, y) by Max to obtain f(x, y, k), so that f(x, y, k) takes values in the interval [0, 1]. Then divide [0, 1] into n equal subintervals and assign a different integer value g(x, y, k) to the f(x, y, k) falling into each subinterval, these integers taking values in [0, n-1]; g(x, y, k) is taken as the pixel value of coordinate point (x, y) in the k-th frame V_k.
3. The spatiotemporal saliency visual attention method based on information entropy according to claim 2, characterized in that step (A.2) specifically comprises the following processes:
(A.2.1) Shrink each frame to 4 different scales. Taking the k-th frame V_k as an example, V_k is reduced to 4 scales V_{k,1}, V_{k,2}, V_{k,3} and V_{k,4}, respectively 1/2, 1/4, 1/8 and 1/16 of the original size, so that the sequence of n consecutive frames becomes 4 image sequences V_{1,s}, V_{2,s}, ..., V_{n,s}, where s, 1 ≤ s ≤ 4, indexes the scale; these 4 sequences are denoted V_{1,1}, V_{2,1}, ..., V_{n,1}; V_{1,2}, V_{2,2}, ..., V_{n,2}; V_{1,3}, V_{2,3}, ..., V_{n,3}; and V_{1,4}, V_{2,4}, ..., V_{n,4}. Let R_s(x, y) be the local region of the s-th image sequence at coordinate point (x, y): the circular region centered at (x, y) whose radius is half the minimum of the height and width of V_{n,4};
(A.2.2) For a coordinate point (x, y) in the s-th image sequence, all the values g(x, y, k) in the local region at (x, y) across the sequence constitute a histogram, and the entropy of the point is obtained from the probability mass function of this histogram, as in formula I; all the entropies constitute the dynamic response map M_{d,s}(x, y) at the current scale s:

$$M_{d,s}(x,y) = -\sum_{k\in\{1,2,\ldots,n\}} p_{g(x',y',k)} \log_2 p_{g(x',y',k)} \qquad \text{(formula I)}$$

where

$$(x',y') \in R_s(x,y)$$

and p_{g(x',y',k)} is the probability mass function produced by the histogram of all the pixel values of the s-th image sequence in the local region R_s(x, y);
(A.2.3) Shrink the 3 larger-scale response maps M_{d,s}(x, y) to the same scale as the smallest response map, then combine them to generate the dynamic saliency map M_d(x, y):

$$M_d(x,y) = \sum_{s=1}^{4} M_{d,s}(x,y)$$
CN2010101922408A 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy Expired - Fee Related CN101853513B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101922408A CN101853513B (en) 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101922408A CN101853513B (en) 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy

Publications (2)

Publication Number Publication Date
CN101853513A 2010-10-06
CN101853513B CN101853513B (en) 2012-02-29

Family

ID=42804978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101922408A Expired - Fee Related CN101853513B (en) 2010-06-06 2010-06-06 Time and space significance visual attention method based on entropy

Country Status (1)

Country Link
CN (1) CN101853513B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129349A (en) * 2011-03-18 2011-07-20 山东大学 Method for adaptively displaying image by taking visual transfer mechanism as guide
CN102831621A (en) * 2012-08-09 2012-12-19 西北工业大学 Video significance processing method based on spectral analysis
CN102867301A (en) * 2012-08-29 2013-01-09 西北工业大学 Mehtod for getting image salient features according to information entropy
CN103034865A (en) * 2012-12-13 2013-04-10 南京航空航天大学 Extraction method of visual salient regions based on multiscale relative entropy
CN104424642A (en) * 2013-09-09 2015-03-18 华为软件技术有限公司 Detection method and detection system for video salient regions
CN105825238A (en) * 2016-03-30 2016-08-03 江苏大学 Visual saliency object detection method
CN107451595A (en) * 2017-08-04 2017-12-08 河海大学 Infrared image salient region detection method based on hybrid algorithm
CN109102467A (en) * 2017-06-21 2018-12-28 北京小米移动软件有限公司 The method and device of picture processing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007003651A1 (en) * 2005-07-06 2007-01-11 Thomson Licensing Method of obtaining a saliency map from a plurality of saliency maps created from visual quantities
CN101651772A (en) * 2009-09-11 2010-02-17 宁波大学 Method for extracting video interested region based on visual attention

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007003651A1 (en) * 2005-07-06 2007-01-11 Thomson Licensing Method of obtaining a saliency map from a plurality of saliency maps created from visual quantities
CN101651772A (en) * 2009-09-11 2010-02-17 宁波大学 Method for extracting video interested region based on visual attention

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Yu-Fei Ma et al., "A Model of Motion Attention for Video Skimming," IEEE ICIP 2002, pp. 129-132, 2002. *
Timor Kadir et al., "Saliency, Scale and Image Description," International Journal of Computer Vision, vol. 45, no. 2, pp. 83-105, 2001. *
Nabil Ouerhani et al., "A Model of Dynamic Visual Attention for Object Tracking in Natural Image Sequences," Lecture Notes in Computer Science, vol. 2686, pp. 702-709, 2003. *
Longsheng Wei et al., "A Spatiotemporal Saliency Model of Visual Attention Based on Maximum Entropy," The 2010 Canadian Geomatics Conference and Symposium of Commission I, ISPRS, 2010. *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129349A (en) * 2011-03-18 2011-07-20 山东大学 Method for adaptively displaying image by taking visual transfer mechanism as guide
CN102129349B (en) * 2011-03-18 2012-09-19 山东大学 Method for adaptively displaying image by taking visual transfer mechanism as guide
CN102831621A (en) * 2012-08-09 2012-12-19 西北工业大学 Video significance processing method based on spectral analysis
CN102867301A (en) * 2012-08-29 2013-01-09 西北工业大学 Mehtod for getting image salient features according to information entropy
CN102867301B (en) * 2012-08-29 2015-01-28 西北工业大学 Mehtod for getting image salient features according to information entropy
CN103034865A (en) * 2012-12-13 2013-04-10 南京航空航天大学 Extraction method of visual salient regions based on multiscale relative entropy
CN104424642A (en) * 2013-09-09 2015-03-18 华为软件技术有限公司 Detection method and detection system for video salient regions
CN105825238A (en) * 2016-03-30 2016-08-03 江苏大学 Visual saliency object detection method
CN105825238B (en) * 2016-03-30 2019-04-30 江苏大学 A kind of vision significance mesh object detection method
CN109102467A (en) * 2017-06-21 2018-12-28 北京小米移动软件有限公司 The method and device of picture processing
CN107451595A (en) * 2017-08-04 2017-12-08 河海大学 Infrared image salient region detection method based on hybrid algorithm

Also Published As

Publication number Publication date
CN101853513B (en) 2012-02-29

Similar Documents

Publication Publication Date Title
CN101853513B (en) Time and space significance visual attention method based on entropy
CN104484667B (en) A kind of contour extraction method based on brightness and integrality of outline
CN104966085B (en) A kind of remote sensing images region of interest area detecting method based on the fusion of more notable features
Tomasi Histograms of oriented gradients
CN111275696B (en) Medical image processing method, image processing method and device
CN103177458B (en) A kind of visible remote sensing image region of interest area detecting method based on frequency-domain analysis
CN104835175B (en) Object detection method in a kind of nuclear environment of view-based access control model attention mechanism
CN107844795A (en) Convolutional neural network feature extraction method based on principal component analysis
CN103247059A (en) Remote sensing image region of interest detection method based on integer wavelets and visual features
CN103927741A (en) SAR image synthesis method for enhancing target characteristics
CN103218832B (en) Based on the vision significance algorithm of global color contrast and spatial distribution in image
CN106446936A (en) Hyperspectral data classification method for spectral-spatial combined data and oscillogram conversion based on convolution neural network
CN103020965A (en) Foreground segmentation method based on significance detection
Zhang et al. Cloud detection method using CNN based on cascaded feature attention and channel attention
CN103295241A (en) Frequency domain significance target detection method based on Gabor wavelet
CN107154044A (en) A kind of dividing method of Chinese meal food image
CN103946868A (en) Processing method and system for medical images
CN104966054A (en) Weak and small object detection method in visible image of unmanned plane
CN105893960A (en) Road traffic sign detecting method based on phase symmetry
CN105426846A (en) Method for positioning text in scene image based on image segmentation model
CN109190456A (en) Pedestrian detection method is overlooked based on the multiple features fusion of converging channels feature and gray level co-occurrence matrixes
CN106296632B (en) A kind of well-marked target detection method based on amplitude spectrum analysis
CN117079097A (en) Sea surface target identification method based on visual saliency
CN101894371B (en) Bio-inspired top-down visual attention method
Ran et al. Sketch-guided spatial adaptive normalization and high-level feature constraints based GAN image synthesis for steel strip defect detection data augmentation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120229

Termination date: 20170606

CF01 Termination of patent right due to non-payment of annual fee