WO2007125115A1 - Salience estimation for object-based visual attention model - Google Patents
- Publication number
- WO2007125115A1 (application PCT/EP2007/054195)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- salience
- image
- estimated
- segmented
- visual attention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Definitions
- The present invention relates to a method for estimating the salience of an image, and more particularly to a salience estimation method for an object-based visual attention model.
- An attention area is the area of a picture that tends to attract more human attention.
- A system designed to detect the attention area of a picture automatically is called an attention model.
- The detected attention area is widely used in many kinds of applications, such as concentrating limited resources on an attention area, directing retrieval/search, and simplifying analysis.
- Fig.1 shows the general architecture of a commonly used attention model.
- First, an image to be estimated is input into the attention model.
- Features such as intensity, colour and orientation are then obtained in the feature extraction step.
- The salience of each of these features is then estimated.
- Finally, the attention area is obtained after the fusion scheme and post-processing steps.
- MB being the basic unit
- Other models, which direct visual attention in an object-driven way, are called object-based visual attention models.
- Each micro-block may cover many natural objects, so the feature extracted from a micro-block is a mixture of the properties of all these objects, which lowers the precision of attention area detection.
- The key issues of the object-based visual attention model are twofold: one is object grouping before feature extraction; the other is efficient salience estimation of each object relative to all the objects in the image.
- The central idea of the currently used salience estimation scheme is based on the Gauss distance measure presented by Y. Sun et al.
- The Gauss distance is defined by formula (1).
- The attenuation coefficient is measured by $d_{gauss}$, which is consistent with the visual physiology thesis.
- $S_F(x)$ is a useful salience estimate in feature F.
- However, some important properties of human perception are not considered in $S_F(x)$.
- Fig.2a is an original image of Skating to be estimated, and Fig.3a is the salience estimation result of Fig.2a.
- Fig.2b is an original image of Coastguard to be estimated, and Fig.3b is the salience estimation result of Fig.2b.
- White means a highly salient object and black means a non-salient one; the grey levels between white and black represent the degree of salience.
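To make the grey-level convention concrete, the following is a minimal sketch of how per-object salience values could be rendered as grey levels. The linear min-max scaling and the function name `salience_to_grey` are assumptions made for illustration; the source only states that white denotes high salience and black denotes none.

```python
def salience_to_grey(saliences):
    """Map raw salience values linearly onto 0 (black) .. 255 (white).

    Assumption: linear min-max scaling; the source does not specify the mapping.
    """
    lo, hi = min(saliences), max(saliences)
    if hi == lo:
        # No object stands out from the others: render everything black.
        return [0 for _ in saliences]
    return [round(255 * (s - lo) / (hi - lo)) for s in saliences]
```

Each segmented object would then be painted with its grey level to obtain a map in the style of Fig.3a and Fig.3b.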
- In Fig.3a there is a small grey block to the left of the female dancer's head.
- The block consists of a piece of white skating rink surrounded by the black clothing of the male skater and the skin of the female skater, and it is salient within this local area. Viewed as a whole, however, this block is part of the large skating rink and will not attract viewers' attention. This is called the "local effect". Because of the local effect, the accumulated difference between the object and its neighbours is large, and the object is therefore wrongly recognized as salient.
- Object size - Estimating the influence of object size on the degree of salience is a complex problem. For example, (a) if all neighbouring objects $y$ are of the same size $s$ and the size of object $x$ decreases from $s$ to 0, the salience degree of $x$, $S_F(x)$, will decrease gradually; (b) if all neighbouring objects $y$ are of the same size $s$ and the size of object $x$ decreases from $s_1$ to $s_2$ ($s_1 > s$ and $s_1 > s_2 > s$), $S_F(x)$ will increase gradually.
- Video texture - Suppose the object features of an image are uniformly random. A human viewer will usually ignore the details of the whole image and no object in the image is salient, whereas the $S_F(x)$ defined above will be a large number for every object in the image.
- With these limitations, the conventional object-based visual attention model is far from practical. An improved object-based visual attention model is therefore desirable.
- The present invention provides a salience estimation scheme for an object-based visual attention model that employs a multi-level concentric circle scheme, which lowers the computing complexity and makes the model more widely applicable.
- The invention provides a method for estimating the salience of an image. It comprises the steps of segmenting the image into a plurality of objects to be estimated; extracting feature maps for each segmented object; calculating the salience of each segmented object in a set of circles defined around a centre pixel of the object, based on the extracted feature maps; and integrating the saliences of each segmented object over all the circles in order to obtain an overall salience estimate for each segmented object.
- The step of extracting feature maps is based on a measure of image colour variation.
- The step of calculating the salience of each segmented object comprises a sub-step of comparing the colour features of the object to be estimated with those of every other object in each circle defined around the object to be estimated.
- The object-based visual attention model based on the multi-level concentric circle salience estimation scheme of the present invention provides an efficient framework for constructing an object-based visual attention model, which has low computing complexity and is much more consistent with human vision.
- Fig.1 illustrates the general architecture of a commonly used attention model
- Fig.2a illustrates an original image of Skating whose salience is to be estimated
- Fig.2b illustrates an original image of Coastguard whose salience is to be estimated
- Fig.3a is the salience estimation result of Fig.2a using the conventional object-based visual attention model
- Fig.3b is the salience estimation result of Fig.2b using the conventional object-based visual attention model
- Fig.4 illustrates the multi-level concentric circled scheme of the salience estimation according to a preferred embodiment of the present invention
- Fig.5 illustrates an example definition of texture (.) in the invention
- Fig.6a is an example of segmentation result of Fig.2a according to the preferred embodiment of the present invention
- Fig.6b is another example of segmentation result of Fig.2b according to the preferred embodiment of the present invention
- Fig.7a illustrates the estimated salience result of Fig.2a using the salience estimation scheme according to the preferred embodiment of the present invention.
- Fig.7b illustrates the estimated salience result of Fig.2b using the salience estimation scheme according to a preferred embodiment of the present invention.
- The method of the present invention mainly includes three steps, described below:
- Step 1 Pre-processing (image segmentation)
- Image-based segmentation and grouping play a powerful role in human visual perception, and a great deal of research has been carried out in this area.
- P. F. Felzenszwalb et al., "Image Segmentation Using Local Variation", IEEE Computer Society on Computer Vision and Pattern Recognition, Jun. 1998, which is based on measures of image colour variation.
- The precise definition of which pixels are connected by edges in E depends on expression (1-1).
- $Int(C) = \max_{e \in MST(C,E)} weight(e)$ (1-3), where $MST(C, E)$ is a minimum spanning tree of C with respect to the edge set E.
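Expression (1-3) can be illustrated with a short sketch that computes the internal variation of one component as the largest edge weight in its minimum spanning tree, using Kruskal's algorithm. The function name `internal_variation`, the `(weight, u, v)` edge format and the convention that a single-pixel component has internal variation 0 are assumptions made for this example.

```python
def internal_variation(vertices, edges):
    """Return Int(C): the largest edge weight in a minimum spanning tree of C.

    vertices: iterable of hashable pixel ids in component C
    edges: list of (weight, u, v) tuples with both endpoints in C
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    largest = 0.0  # assumed convention for a component with no edges
    # Kruskal: take edges in ascending weight, keeping those that join two trees.
    for weight, u, v in sorted(edges, key=lambda e: e[0]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            largest = max(largest, weight)
    return largest
```

For a triangle with edge weights 1, 2 and 5, the spanning tree keeps the edges of weight 1 and 2, so the internal variation is 2.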
- The segmentation process makes expression (1-5) hold for any two of the segmented objects:
- … of the two objects they belong to, the two objects are merged to form a new single object. As can be seen, this gives an efficient object segmentation scheme that does not cost much computing resource. In the implementation, an 8-connected neighbourhood is used for constructing E, that is, d = 1.
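The merging procedure described above can be sketched as follows. This is not the patent's exact criterion, since expression (1-5) is not reproduced in this text. The sketch assumes a greyscale image, absolute intensity difference as the edge weight, and a Felzenszwalb-style merge test with a hypothetical tuning constant `k`: two components are merged when the connecting edge weighs no more than `Int(C) + k/|C|` for both of them.

```python
def segment(image, k=10.0):
    """Label the pixels of a 2-D greyscale image (list of rows) by region merging."""
    h, w = len(image), len(image[0])
    parent = list(range(h * w))
    size = [1] * (h * w)
    internal = [0.0] * (h * w)  # Int(C) of the component rooted at each index

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # 8-connected neighbourhood: link each pixel to its right, lower-left,
    # lower and lower-right neighbours so every adjacent pair appears once.
    edges = []
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, -1), (1, 0), (1, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    weight = abs(image[y][x] - image[ny][nx])
                    edges.append((weight, y * w + x, ny * w + nx))

    for weight, a, b in sorted(edges):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Assumed merge test (hypothetical threshold k / |C|).
        limit = min(internal[ra] + k / size[ra], internal[rb] + k / size[rb])
        if weight <= limit:
            parent[ra] = rb
            size[rb] += size[ra]
            # Edges arrive in ascending order, so this edge is the largest
            # one in the merged component's spanning tree.
            internal[rb] = weight
    return [[find(y * w + x) for x in range(w)] for y in range(h)]
```

On an image whose left half is uniformly dark and right half uniformly bright, the sketch returns two labels, one per half.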
- Fig.6a and Fig.6b provide the segmentation results of Fig.2a and Fig.2b respectively.
- Step 2 Pre-processing (feature extraction)
- The value returned by Major(f, o) is the representative feature of the object o, which is defined to satisfy the following ($d_1$, $d_2$ and a third constant are set to 2, 64 and 95% respectively in our implementation):
- $B_i = \ldots (Major(r, o_i) + Major(g, o_i))/2$
- $Y_i = (Major(r, o_i) + Major(g, o_i))/2 - |Major(r, o_i) - Major(g, o_i)| \ldots$
- The intensity feature is extracted as in formula (2-1).
- Orientation is a rather complex feature in an object-based visual attention model. Since all the objects are segmented according to colour variation, an object itself does not contain any orientation information apart from its border. Because of this special property of the segmented objects, orientation is not considered in the implementation.
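As an illustration of the feature extraction step, the sketch below derives the three feature values I, RG and BY for one segmented object. Several parts are assumptions, because the corresponding formulas are incomplete in this text: the median is used as a stand-in for Major(f, o), the intensity is taken as the mean of the three representative channel values, and the colour channels follow the widely used opponent-colour form (red, green, blue, yellow, with RG = R - G and BY = B - Y).

```python
from statistics import median


def major(values):
    """Stand-in for Major(f, o): here simply the median of the channel values."""
    return median(values)


def object_features(pixels):
    """Return the I, RG and BY features of one object.

    pixels: list of (r, g, b) tuples belonging to the segmented object.
    """
    r = major([p[0] for p in pixels])
    g = major([p[1] for p in pixels])
    b = major([p[2] for p in pixels])
    intensity = (r + g + b) / 3
    # Assumed opponent-colour channels.
    red = r - (g + b) / 2
    green = g - (r + b) / 2
    blue = b - (r + g) / 2
    yellow = (r + g) / 2 - abs(r - g) / 2 - b
    return {"I": intensity, "RG": red - green, "BY": blue - yellow}
```

Because each object is summarised by one value per feature, the later salience comparison works on objects rather than on individual pixels.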
- The remaining problem is how to estimate the salience map for each feature map $F$ ($F \in \{I, RG, BY\}$), denoted as $Sal_F(o_i)$.
- Each pixel of $o_i$ is treated as indistinguishable from the centre pixel $c_i$, so the object is regarded as $s_i$ duplicated copies of the centre pixel, as shown in Fig.4.
- $SalC_F$ is set as below:
- texture(.) is an empirical function of p for detecting an "audience area", i.e. an area containing objects with random features, such as an audience, which should preferably not be recognized as an attention area.
- The detection function texture(p) is such that the lower the value of p, the larger the value of texture(p), and thus the more likely the area is to be recognized as an "audience area", i.e. the video texture of the image.
- With this detection function texture(.), there is a lower probability that non-attention objects in the area are recognized as attention.
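The multi-level concentric circle estimation described above can be sketched as follows. The exact per-circle formula and the texture(.) function are not reproduced in this text, so the sketch makes its own assumptions: each object is a dictionary with hypothetical keys `centre`, `size` and `feature`; the salience within one circle is the size-weighted mean absolute feature difference to the objects centred inside that circle; and the circles are combined with equal weights unless `level_weights` is given. The texture test is omitted.

```python
import math


def circle_salience(target, others, radius):
    """Size-weighted mean feature difference to objects centred within radius."""
    total, weight = 0.0, 0
    for obj in others:
        if math.dist(target["centre"], obj["centre"]) <= radius:
            # Each neighbour counts as `size` copies of its centre pixel.
            total += abs(target["feature"] - obj["feature"]) * obj["size"]
            weight += obj["size"]
    return total / weight if weight else 0.0


def overall_salience(index, objects, radii, level_weights=None):
    """Integrate the per-circle saliences of objects[index] over all circles."""
    target = objects[index]
    others = [o for i, o in enumerate(objects) if i != index]
    if level_weights is None:
        # Assumption: equal weight for every circle level.
        level_weights = [1.0 / len(radii)] * len(radii)
    return sum(k * circle_salience(target, others, r)
               for k, r in zip(level_weights, radii))
```

With a small bright object next to a small dark one and a large bright region further away, the inner circle reports a large difference while the outer circle reports a small one, which is how wider circles temper the local effect in this sketch.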
- Fig.7a and Fig.7b respectively present the salience estimation experimental results of Fig.2a and Fig.2b by using the salience estimation scheme according to the present invention.
- The audience in Fig.2a and the background in Fig.2b are considered not salient, and the small block to the left of the female dancer's head in Fig.3a is also removed in Fig.7a.
- The present invention is capable of handling the local effect and video texture, and it is more widely applicable.
- The present object-based visual attention model, based on the multi-level concentric circle salience estimation scheme, gives a more accurate understanding of the image and far greater computing efficiency. It has several advantages, as below:
- The invention presents an efficient framework for constructing an object-based visual attention model. It is of low computing complexity.
- The presented framework is much more consistent with human vision. Human vision properties not considered in conventional schemes (such as object size, local effect and video texture) are well addressed.
- The framework is extendable.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biodiversity & Conservation Biology (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Image Analysis (AREA)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN200780015252XA CN101432775B (zh) | 2006-04-28 | 2007-04-27 | Salience estimation method for object-based visual attention model |
| US12/226,386 US8385654B2 (en) | 2006-04-28 | 2007-04-27 | Salience estimation for object-based visual attention model |
| EP07728649.0A EP2013850B1 (en) | 2006-04-28 | 2007-04-27 | Salience estimation for object-based visual attention model |
| JP2009507098A JP4979033B2 (ja) | 2006-04-28 | 2007-04-27 | Salience estimation for object-based visual attention model |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP06300418.8 | 2006-04-28 | ||
| EP06300418 | 2006-04-28 | ||
| EP06300538A EP1862966A1 (en) | 2006-05-31 | 2006-05-31 | Salience estimation for object-based visual attention model |
| EP06300538.3 | 2006-05-31 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2007125115A1 true WO2007125115A1 (en) | 2007-11-08 |
Family
ID=38169248
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2007/054195 Ceased WO2007125115A1 (en) | 2006-04-28 | 2007-04-27 | Salience estimation for object-based visual attention model |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US8385654B2 (en) |
| EP (1) | EP2013850B1 (en) |
| JP (1) | JP4979033B2 (en) |
| WO (1) | WO2007125115A1 (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101894371A (zh) * | 2010-07-19 | 2010-11-24 | 华中科技大学 | A biologically inspired top-down visual attention method |
| CN102227753A (zh) * | 2008-10-03 | 2011-10-26 | 3M创新有限公司 | Systems and methods for evaluating robustness |
Families Citing this family (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP5306940B2 (ja) * | 2009-08-11 | 2013-10-02 | 日本放送協会 | Moving image content evaluation apparatus and computer program |
| EP2515206B1 (en) * | 2009-12-14 | 2019-08-14 | Panasonic Intellectual Property Corporation of America | User interface apparatus and input method |
| AU2011254040B2 (en) * | 2011-12-14 | 2015-03-12 | Canon Kabushiki Kaisha | Method, apparatus and system for determining a saliency map for an input image |
| US9946795B2 (en) | 2014-01-27 | 2018-04-17 | Fujitsu Limited | User modeling with salience |
| US9195903B2 (en) | 2014-04-29 | 2015-11-24 | International Business Machines Corporation | Extracting salient features from video using a neurosynaptic system |
| US9373058B2 (en) | 2014-05-29 | 2016-06-21 | International Business Machines Corporation | Scene understanding using a neurosynaptic system |
| US10115054B2 (en) | 2014-07-02 | 2018-10-30 | International Business Machines Corporation | Classifying features using a neurosynaptic system |
| US9798972B2 (en) | 2014-07-02 | 2017-10-24 | International Business Machines Corporation | Feature extraction using a neurosynaptic system for object classification |
| US10055850B2 (en) * | 2014-09-19 | 2018-08-21 | Brain Corporation | Salient features tracking apparatus and methods using visual initialization |
| US9830529B2 (en) * | 2016-04-26 | 2017-11-28 | Xerox Corporation | End-to-end saliency mapping via probability distribution prediction |
| CN110781846B (zh) * | 2019-10-30 | 2021-02-09 | 江苏开放大学(江苏城市职业学院) | A visual attention computation method incorporating visual span characteristics |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP1017019A2 (en) * | 1998-12-31 | 2000-07-05 | Eastman Kodak Company | Method for automatic determination of main subjects in photographic images |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2003248825A (ja) * | 2002-02-22 | 2003-09-05 | Fuji Xerox Co Ltd | Image processing apparatus, image processing method, image processing program, and storage medium |
| US7471827B2 (en) * | 2003-10-16 | 2008-12-30 | Microsoft Corporation | Automatic browsing path generation to present image areas with high attention value as a function of space and time |
| US7940985B2 (en) * | 2007-06-06 | 2011-05-10 | Microsoft Corporation | Salient object detection |
-
2007
- 2007-04-27 EP EP07728649.0A patent/EP2013850B1/en not_active Ceased
- 2007-04-27 JP JP2009507098A patent/JP4979033B2/ja not_active Expired - Fee Related
- 2007-04-27 WO PCT/EP2007/054195 patent/WO2007125115A1/en not_active Ceased
- 2007-04-27 US US12/226,386 patent/US8385654B2/en not_active Expired - Fee Related
Non-Patent Citations (2)
| Title |
|---|
| ITTI L ET AL: "A SALIENCY-BASED SEARCH MECHANISM FOR OVERT AND COVERT SHIFTS OF VISUAL ATTENTION", VISION RESEARCH, PERGAMON PRESS, OXFORD, GB, vol. 40, no. 10-12, June 2000 (2000-06-01), pages 1489 - 1506, XP008060077, ISSN: 0042-6989 * |
| LUO J ET AL: "On measuring low-level self and relative saliency in photographic images", PATTERN RECOGNITION LETTERS, NORTH-HOLLAND PUBL. AMSTERDAM, NL, vol. 22, no. 2, February 2001 (2001-02-01), pages 157 - 169, XP004315118, ISSN: 0167-8655 * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP2013850A1 (en) | 2009-01-14 |
| JP4979033B2 (ja) | 2012-07-18 |
| EP2013850B1 (en) | 2018-07-25 |
| US20090060267A1 (en) | 2009-03-05 |
| US8385654B2 (en) | 2013-02-26 |
| JP2009535683A (ja) | 2009-10-01 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US8385654B2 (en) | Salience estimation for object-based visual attention model | |
| US8660342B2 (en) | Method to assess aesthetic quality of photographs | |
| Jia et al. | Category-independent object-level saliency detection | |
| Mishra et al. | Active segmentation with fixation | |
| Lu et al. | Salient object detection using concavity context | |
| CN103714181B (zh) | A hierarchical method for retrieving a specific person | |
| CN102968782A (zh) | A method for automatically extracting salient objects from colour images | |
| Russell et al. | Segmenting scenes by matching image composites | |
| Karasulu | Review and evaluation of well-known methods for moving object detection and tracking in videos | |
| CN111310662A (zh) | A flame detection and recognition method and system based on an ensemble deep network | |
| Zohourian et al. | Superpixel-based Road Segmentation for Real-time Systems using CNN. | |
| CN101526955B (zh) | A sketch-based method and system for automatically extracting network graphic elements | |
| Bai et al. | Principal pixel analysis and SVM for automatic image segmentation | |
| Mirghasemi et al. | A new image segmentation algorithm based on modified seeded region growing and particle swarm optimization | |
| CN101432775B (zh) | Salience estimation method for object-based visual attention model | |
| Yang et al. | The large-scale crowd density estimation based on sparse spatiotemporal local binary pattern | |
| Kim et al. | Segmentation of salient regions in outdoor scenes using imagery and 3-d data | |
| Lu et al. | Context-constrained accurate contour extraction for occlusion edge detection | |
| Mustaffa et al. | Content-based image retrieval based on color-spatial features | |
| Liang et al. | Salient object detection based on regions | |
| EP1862966A1 (en) | Salience estimation for object-based visual attention model | |
| Hung et al. | Generalized playfield segmentation of sport videos using color features | |
| Zhang et al. | Image Segmentation Based on Visual Attention Mechanism. | |
| Zhan et al. | Salient object contour detection based on boundary similar region | |
| Li et al. | Object-Based Visual Saliency Computation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 07728649; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2007728649; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 12226386; Country of ref document: US |
| | WWE | Wipo information: entry into national phase | Ref document number: 2009507098; Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 200780015252.X; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |