TWI453659B - Object-based system and method of directing visual attention by a subliminal cue

Info

Publication number: TWI453659B
Application number: TW100131633A
Authority: TW
Inventors: Homer H Chen; Su Ling Yeh; Tai Hsiang Huang; Yung Hao Yang; Hsin I Liao; Ling Hsiu Huang
Original assignee: Univ Nat Taiwan; Himax Tech Ltd
Priority date: 2011-09-02
Filing date: 2011-09-02
Publication date: 2014-09-21
Also published as: TW201312451A

Description

Object-based system and method of directing visual attention by a subliminal cue

The present invention relates to digital image processing, and more particularly to an object-based system and method of directing visual attention by a subliminal cue.

Visual attention is an important feature of the human visual system (HVS), which is itself part of the central nervous system. Visual attention helps the brain filter out redundant visual information and lets the eyes concentrate on a region of interest.

In practice, it is often necessary to direct a viewer's visual attention to a specific area of an image, called the area of interest (AOI), without the viewer becoming aware of the guiding intention. To this end, perceivable image changes have traditionally been used to attract the viewer's perception, for example by changing the color of the target area of the image to direct visual attention to it. However, attracting the viewer's perception by perceivable image changes has the following disadvantages. First, a perceivable change distracts the viewer and thus degrades the viewing experience. Second, a perceivable change makes the viewer's perception of the image differ from what was originally intended. Third, image details in the target area are altered or lost. Furthermore, conventional methods usually involve manual adjustment and are therefore unsuitable for real-time applications.

Therefore, there is a need for a novel mechanism that directs the viewer's visual attention in a non-intrusive and effective manner.

In view of the foregoing, one object of the embodiments of the present invention is to provide an object-based system and method that use a subliminal cue to direct the viewer's visual attention automatically and effectively, without the viewer noticing.

According to an embodiment of the present invention, an object-based system of directing visual attention by a subliminal cue includes an object detector, an enhancement unit, and a mixer. The object detector detects an object in an image, thereby forming a cue object that serves as an object-based subliminal cue. The enhancement unit separately and differently adjusts an image feature of the cue object and of the area other than the cue object to enhance the saliency of the cue object, thereby generating a cue image. The mixer selects between the cue image and the input image, thereby forming a sequence of output images that comprises the input images and the cue images, with each cue image inserted between adjacent input images.

10‧‧‧Object detector

12‧‧‧Enhancement unit

14‧‧‧Mixer

FIG. 1 is a block diagram of an object-based system and method of directing visual attention by a subliminal cue according to an embodiment of the present invention.

FIGS. 2A and 2B illustrate the mixer of FIG. 1 alternating cue images and input images.

FIG. 1 shows a block diagram of an object-based system and method of directing visual attention by a subliminal cue according to an embodiment of the present invention.

First, an input image (or frame) is fed to the object detector 10. In this specification, the term "image" may refer to a still image (e.g., a photo) or a moving image (e.g., video), and the term "image" is used interchangeably with "frame." The object detector 10 uses an object detection (also called object-class detection) technique to obtain the position and size of at least one object belonging to a given class in the image. In one embodiment, a face detection technique (a special case of object detection) is used: facial features are detected in the image by digital image processing, yielding the position, and usually also the size, of at least one face. Many algorithms can be used to detect faces, for example "Face detection using local SMQT features and split up SNoW classifier" by M. Nilsson et al., Proc. ICASSP, vol. 2, pp. 589-592, 2007, which is incorporated as part of this specification. The detected face, or one of several detected faces, forms the area of interest (AOI) and serves as the object-based subliminal cue that directs (or attracts) the viewer's attention. It is worth noting that a subliminal cue is usually defined as a visual stimulus below an individual's absolute threshold of conscious perception.
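When a detector returns several faces, one of them has to be chosen as the cue object. The text does not say how that choice is made, so the following is only a minimal sketch under an assumed rule (the largest bounding box wins). The `Detection` type and `select_cue` function are hypothetical names introduced for illustration, and the detector itself is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Detection:
    """Axis-aligned bounding box reported by some object detector (hypothetical type)."""
    x: int       # left edge, in pixels
    y: int       # top edge, in pixels
    width: int
    height: int


def select_cue(detections: Sequence[Detection]) -> Optional[Detection]:
    """Pick one detection to act as the cue object.

    Assumption: the largest box is chosen. The source only states that
    one of the detected faces forms the area of interest.
    """
    if not detections:
        return None
    return max(detections, key=lambda d: d.width * d.height)


faces = [Detection(10, 20, 40, 40), Detection(200, 50, 90, 100)]
cue = select_cue(faces)
print(cue)
```

Any other selection rule (most central face, highest detector score) would slot into `select_cue` without changing the rest of the pipeline.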

Next, the detection information output by the object detector 10 (i.e., the cue object or face) is fed, together with the input image, to the enhancement unit 12, which enhances the saliency of the cue object and thereby forms the subliminal cue image. In this embodiment, the brightness of the entire input image except the cue object (e.g., the cue face) is reduced so that it is lower than the brightness of the corresponding input image. The brightness of the cue object in the cue image may be maintained or even increased. As a result, the cue object is more salient than the other areas of the cue image. In other embodiments, at least one image feature other than brightness may be adjusted instead. To simplify processing, a simple geometric shape (e.g., a circle or a square) corresponding to the cue object is determined, and the required image processing is performed on that shape.
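The enhancement step described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: the image is a list of rows of brightness values, the circle is derived from a bounding box by taking the longer side as its diameter, and the dimming `factor` of 0.5 is an arbitrary example value. The function names are hypothetical.

```python
def enclosing_circle(x, y, width, height):
    """Return (cx, cy, radius) of a circle centered on the box.

    Assumption: the diameter equals the longer side of the box.
    """
    cx = x + width / 2.0
    cy = y + height / 2.0
    radius = max(width, height) / 2.0
    return cx, cy, radius


def dim_outside(image, cx, cy, radius, factor=0.5):
    """Scale the brightness of every pixel outside the circle by `factor`.

    Pixels inside the circle keep their brightness. `image` is a list of
    rows; the column index is x and the row index is y. A new image is
    returned and the input is left untouched.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must lie in [0, 1]")
    out = []
    for row_index, row in enumerate(image):
        new_row = []
        for col_index, value in enumerate(row):
            inside = (col_index - cx) ** 2 + (row_index - cy) ** 2 <= radius ** 2
            new_row.append(value if inside else value * factor)
        out.append(new_row)
    return out


image = [[100.0] * 5 for _ in range(5)]
cue = dim_outside(image, cx=2, cy=2, radius=1, factor=0.5)
```

A hard mask like this leaves a visible edge at the circle boundary, which is the motivation for the gradual attenuation described next.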

In one embodiment, the brightness of the cue object is gradually attenuated from the center of the cue object toward its boundary. For example, the center of the cue object has the highest brightness and the boundary of the cue object has the lowest. This avoids forming a visible edge at the boundary of the cue object. Specifically, let $(x_f, y_f)$ and $r_f$ denote the center coordinates and the radius of the circle corresponding to the cue object, and let $I$ denote the (original) input image. The cue image $C$ is then generated as follows:

$$
C(x,y) =
\begin{cases}
I(x,y)\, e^{-r^2/r_f^2}, & \text{if } r \le r_f \\
I(x,y)\, e^{-1}, & \text{otherwise}
\end{cases}
$$

where $x$ and $y$ denote the horizontal and vertical positions of the pixel, respectively, and $r^2 = (x - x_f)^2 + (y - y_f)^2$.
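The cue-image formula can be tried out with a short sketch. It assumes the inside-circle branch is the falloff $e^{-r^2/r_f^2}$, which gives a gain of 1 at the center and matches the stated $e^{-1}$ value at the boundary, so no edge appears there. The function name `cue_image` and the list-of-rows image representation are illustrative choices, not part of the source.

```python
import math


def cue_image(image, x_f, y_f, r_f):
    """Build a cue image C from an input image I.

    Assumed gain: exp(-r^2 / r_f^2) inside the circle of radius r_f
    centered at (x_f, y_f), and exp(-1) everywhere else.
    `image` is a list of rows; column index is x, row index is y.
    """
    if r_f <= 0:
        raise ValueError("r_f must be positive")
    outside_gain = math.exp(-1.0)
    cue = []
    for y, row in enumerate(image):
        cue_row = []
        for x, value in enumerate(row):
            r2 = (x - x_f) ** 2 + (y - y_f) ** 2
            if r2 <= r_f ** 2:
                gain = math.exp(-r2 / r_f ** 2)
            else:
                gain = outside_gain
            cue_row.append(value * gain)
        cue.append(cue_row)
    return cue


flat = [[1.0] * 7 for _ in range(7)]
c = cue_image(flat, x_f=3, y_f=3, r_f=2)
```

With a uniform input, the gain at the circle's center is 1, falls smoothly to $e^{-1} \approx 0.368$ at the boundary, and stays at that level outside, so the two branches meet without a step.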

The object detector 10 and the enhancement unit 12 described above together form an object-based cue generation subsystem. Next, the cue image and the (original) input image are fed to the image mixer (or switcher) 14, which selects between the cue image and the input image, thereby forming an output image sequence that contains input images and cue images, with each cue image inserted between adjacent input images (or frames). For example, as shown in FIG. 2A, the mixer 14 alternates one cue image with one input image, so that a cue image is placed after every input image. Alternatively, as shown in FIG. 2B, the mixer 14 alternates one cue image with two (or more) input images, so that a cue image is placed after every two (or more) input images. Each cue image is typically displayed for tens or hundreds of milliseconds (ms). In general, the input images are displayed according to the refresh or frame rate of the display device, and the display period of each cue image must be short enough that the cue image is not perceived by the viewer's conscious mind but can still be perceived unconsciously (or subconsciously).
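The interleaving performed by the mixer can be sketched as below. This is a minimal illustration with hypothetical names: `mix` and `inputs_per_cue` are not from the source, frames are represented by arbitrary Python objects, and the cue shown after a group of input frames is assumed to be the one derived from the last frame of that group.

```python
def mix(inputs, cues, inputs_per_cue=1):
    """Interleave input frames with cue frames.

    After every `inputs_per_cue` input frames, one cue frame is emitted.
    `cues[i]` is taken to be the cue image derived from `inputs[i]`
    (an assumption made for this sketch).
    """
    if inputs_per_cue < 1:
        raise ValueError("inputs_per_cue must be at least 1")
    if len(inputs) != len(cues):
        raise ValueError("inputs and cues must have the same length")
    output = []
    for index, frame in enumerate(inputs):
        output.append(frame)
        if (index + 1) % inputs_per_cue == 0:
            output.append(cues[index])
    return output


frames = ["I0", "I1", "I2", "I3"]
cue_frames = ["C0", "C1", "C2", "C3"]
pattern_2a = mix(frames, cue_frames, inputs_per_cue=1)
pattern_2b = mix(frames, cue_frames, inputs_per_cue=2)
```

Setting `inputs_per_cue=1` corresponds to the alternation of FIG. 2A, and `inputs_per_cue=2` to that of FIG. 2B. How long each cue frame stays on screen is a separate display-timing decision that this sketch does not model.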

With the system and method disclosed in the above embodiments, the viewer's eyes are significantly directed to the cue object in a subliminal manner. In other words, the viewer's visual attention is drawn to the cue object more than to non-cue objects, so the cue object attracts more of the viewer's visual attention than non-cue objects do. Compared with conventional methods that rely on perceivable image changes, the object-based system and method of directing visual attention by a subliminal cue in this embodiment provide a non-intrusive, bio-inspired mechanism that benefits a variety of multimedia applications, such as digital signage, advertising media design, digital art, focus assistance for stereoscopic images, and education.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the scope of its claims. All equivalent changes or modifications made without departing from the spirit disclosed by the invention shall fall within the scope of the claims below.

Claims

1. An object-based system of directing visual attention by a subliminal cue, comprising: an object detector for detecting an object in an image, thereby forming a cue object as an object-based subliminal cue; an enhancement unit that separately and differently adjusts an image feature of the cue object and of the area other than the cue object to enhance the saliency of the cue object, thereby generating a cue image; and a mixer that selects between the cue image and the input image, thereby forming a sequence of output images, the sequence comprising the input images and the cue images, each cue image being inserted between adjacent input images; wherein the cue image $C$ is expressed as follows:
$$
C(x,y) =
\begin{cases}
I(x,y)\, e^{-r^2/r_f^2}, & \text{if } r \le r_f \\
I(x,y)\, e^{-1}, & \text{otherwise}
\end{cases}
$$
where $I$ denotes the input image, $x$ and $y$ denote the horizontal and vertical positions of the pixel, respectively, $r$ is defined by $r^2 = (x - x_f)^2 + (y - y_f)^2$, and $(x_f, y_f)$ and $r_f$ denote the center coordinates and the radius, respectively, of the circle corresponding to the cue object.

2. The object-based system of directing visual attention by a subliminal cue according to claim 1, wherein the object detector uses digital image processing techniques to detect facial features, thereby obtaining the position and size of a face.

3. The object-based system of directing visual attention by a subliminal cue according to claim 1, wherein the enhancement unit reduces the brightness of the area other than the cue object.

4. The object-based system of directing visual attention by a subliminal cue according to claim 3, wherein the enhancement unit maintains or increases the brightness of the cue object.

5. The object-based system of directing visual attention by a subliminal cue according to claim 3, wherein the enhancement unit gradually attenuates the brightness from the center of the cue object toward the boundary of the cue object.

6. The object-based system of directing visual attention by a subliminal cue according to claim 1, wherein the mixer places one cue image after at least one input image so as to alternate the cue image and the input image.

7. An object-based method of directing visual attention by a subliminal cue, comprising: detecting an object in an image, thereby forming a cue object as an object-based subliminal cue; separately and differently adjusting an image feature of the cue object and of the area other than the cue object to enhance the saliency of the cue object, thereby generating a cue image; and mixing by selecting between the cue image and the input image, thereby forming a sequence of output images, the sequence comprising the input images and the cue images, each cue image being inserted between adjacent input images; wherein the cue image $C$ is expressed as follows:
$$
C(x,y) =
\begin{cases}
I(x,y)\, e^{-r^2/r_f^2}, & \text{if } r \le r_f \\
I(x,y)\, e^{-1}, & \text{otherwise}
\end{cases}
$$
where $I$ denotes the input image, $x$ and $y$ denote the horizontal and vertical positions of the pixel, respectively, $r$ is defined by $r^2 = (x - x_f)^2 + (y - y_f)^2$, and $(x_f, y_f)$ and $r_f$ denote the center coordinates and the radius, respectively, of the circle corresponding to the cue object.

8. The object-based method of directing visual attention by a subliminal cue according to claim 7, wherein the detecting step uses digital image processing techniques to detect facial features, thereby obtaining the position and size of a face.

9. The object-based method of directing visual attention by a subliminal cue according to claim 7, wherein the enhancing step reduces the brightness of the area other than the cue object.

10. The object-based method of directing visual attention by a subliminal cue according to claim 9, wherein the enhancing step maintains or increases the brightness of the cue object.

11. The object-based method of directing visual attention by a subliminal cue according to claim 9, wherein the enhancing step gradually attenuates the brightness from the center of the cue object toward the boundary of the cue object.

12. The object-based method of directing visual attention by a subliminal cue according to claim 7, wherein the mixing step places one cue image after at least one input image so as to alternate the cue image and the input image.