JP2016162234A

JP2016162234A - Attention area detection device, attention area detection method and program

Info

Publication number: JP2016162234A
Application number: JP2015040750A
Authority: JP
Inventors: 正雄山中; Masao Yamanaka; 優和真継; Masakazu Matsugi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-03-02
Filing date: 2015-03-02
Publication date: 2016-09-05

Abstract

PROBLEM TO BE SOLVED: To provide an attention area detection device, an attention area detection method, and a program capable of appropriately selecting an attention area in an input image.
SOLUTION: An attention area detection device includes: acquisition means 11 configured to acquire information on a user's line-of-sight position with respect to an input image; setting means 12 configured to set a plurality of candidate areas in the input image; and detection means 13 configured to detect an attention area based on the plurality of candidate areas and a statistical distribution of the acquired line-of-sight position of the user.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for detecting an attention area in an image based on information about a user's line of sight.

Various methods are conventionally known for detecting a user's line of sight and using that line-of-sight information to detect the area the user is paying attention to. In the method described in Patent Document 1, in order to detect which position on the viewfinder screen of a camera the user is looking at, a plurality of local areas are set on the viewfinder screen, and the dwell time of the user's line of sight is measured for each local area. The local area with the longest dwell time is then set as the attention area the user is paying attention to.

Patent Document 1: JP 2000-131599 A
Patent Document 2: JP H01-241511 A

Non-Patent Document 1: Ken Takegami, Toshiyuki Goto, Gen Oyama, "Highly accurate pupil detection algorithm for gaze direction measurement", IEICE Transactions D-II (Information and Systems, II: Pattern Processing).
Non-Patent Document 2: Dong Hyun Yoo and Myung Jin Chung, "Non-intrusive Eye Gaze Estimation without Knowledge of Eye Pose", Proc. IEEE FG 2004.
Non-Patent Document 3: R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, "Frequency-tuned salient region detection", CVPR 2009.
Non-Patent Document 4: Nobuyuki Otsu, "An automatic threshold selection method based on discriminant and least squares criteria", IEICE Transactions, J63-D-4 (1980-4), pp. 349-356.
Non-Patent Document 5: L. Itti, C. Koch, and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis", TPAMI 1998.
Non-Patent Document 6: D. Klein and S. Frintrop, "Center-surround divergence of feature statistics for salient object detection", ICCV 2011.
Non-Patent Document 7: M. Yamanaka, M. Matsugu, and M. Sugiyama, "Salient object detection based on direct density-ratio estimation", IPSJ 2013.
Non-Patent Document 8: Paul Viola and Michael J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", CVPR 2001.
Non-Patent Document 9: Yusuke Mitarai, Katsuhiko Mori, Masakazu Matsugu, "A face detection system robust against variations, based on Convolutional Neural Networks with selective module activation", FIT 2003.

However, in the method described in Patent Document 1, the local areas used for line-of-sight detection are fixed at predetermined positions. Consequently, when the attention area is not contained in any of those local areas, the attention area cannot be selected appropriately.

To solve the above problem, the present invention comprises: acquisition means for acquiring information on a user's line-of-sight position with respect to an input image; setting means for setting a plurality of candidate areas in the input image; and detection means for detecting an attention area based on the set plurality of candidate areas and a statistical distribution of the acquired line-of-sight position of the user.

With the above configuration, the present invention can accurately detect the attention area the user is paying attention to, based on information about the user's line of sight.

FIG. 1: Schematic block diagram showing the configuration of the attention area detection device according to the first embodiment.
FIG. 2: Diagram explaining the function of the candidate area setting unit in the first embodiment.
FIG. 3: Schematic block diagram showing the functional configuration of the attention area detection unit according to the first embodiment.
FIG. 4: Diagram showing an example of the processing of the candidate area setting unit in the first embodiment.
FIG. 5: Graphs of the m-th order function k(t') with respect to time t' in the first embodiment.
FIG. 6: Diagram showing an example of the attention area setting processing by the area setting unit in the first embodiment.
FIG. 7: Diagram showing how the attention area changes over time in the first embodiment.
FIG. 8: Flowchart of the attention area detection method according to the first embodiment.
FIG. 9: Schematic block diagram showing the configuration of the attention area detection device according to the second embodiment.
FIG. 10: Diagram explaining the face area detection processing by the face detection unit in the second embodiment.
FIG. 11: Diagram explaining the processing for setting candidate areas of the attention area in the second embodiment.
FIG. 12: Diagram showing the attention area setting processing by the area setting unit in the second embodiment.
FIG. 13: Flowchart of the attention area detection method according to the second embodiment.

[First Embodiment]
Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is a schematic block diagram showing the configuration of the attention area detection device according to this embodiment. The attention area detection device 1 includes a line-of-sight information acquisition unit 11, a candidate area setting unit 12, and an attention area detection unit 13.

The attention area detection device 1 according to this embodiment is realized using a semiconductor integrated circuit (LSI). Alternatively, the attention area detection device 1 may have a hardware configuration including a CPU, ROM, RAM, HDD, and the like. In that case, the CPU executes programs stored in the ROM, HD, or the like, thereby realizing, for example, the functional configurations and flowchart processing described later. The RAM has a storage area that functions as a work area in which the CPU loads and executes programs. The ROM has a storage area for storing the programs executed by the CPU. The HD has a storage area for storing various programs the CPU needs for its processing and various data, including data on threshold values.

The line-of-sight information acquisition unit 11 acquires information on the user's line-of-sight position (x(t), y(t)) with respect to the input image at time t. The user's line-of-sight position (x(t), y(t)) can be detected, for example, by the technique using the high-precision pupil detection algorithm described in Non-Patent Document 1, or by the non-contact, image-processing-based line-of-sight detection technique described in Non-Patent Document 2. Using these techniques, the line-of-sight information acquisition unit 11 may itself detect and acquire the user's line-of-sight position from a photograph (image) in which the user appears.

The candidate area setting unit 12 acquires an input image supplied from outside and sets candidate areas R_n (n = 1, 2, ..., N) for the attention area in that input image. One possible method is the one described in Non-Patent Document 3, which uses a saliency computed from low-level features (luminance value, edge strength, texture, and so on). FIG. 2 illustrates how the candidate area setting unit 12 sets the candidate areas R_n (n = 1, 2, ..., N) by the method of Non-Patent Document 3. Given an input image such as that shown in FIG. 2(a), the candidate area setting unit 12 calculates the saliency S at each point (x, y) in the input image and creates a saliency image (FIG. 2(b)). It then extracts every connected area of the saliency image in which the saliency S is equal to or greater than a predetermined threshold Th (FIG. 2(c)). Finally, it sets the rectangle enclosing each extracted area as a candidate area R_n (n = 1, 2, ..., N) for the attention area (FIG. 2(d)).

The threshold Th for the saliency S may be calculated, for example, by the binarization technique described in Non-Patent Document 4, which sets the threshold adaptively by minimizing the within-class variance and maximizing the between-class variance. N denotes the number of candidate areas; in the case of FIG. 2, N = 2. The saliency S may also be calculated by the methods described in Non-Patent Documents 5 to 7. The candidate areas R_n (n = 1, 2, ..., N) set in this way serve as the initial values of the range (position and size) in image space over which the statistical distribution of the line-of-sight position (such as the total dwell time) is calculated in the attention area setting processing of the downstream attention area detection unit 13.
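The candidate area steps (threshold the saliency image, extract connected areas, enclose each in a rectangle) can be sketched as follows. This is only an illustration under stated assumptions: the saliency image is taken as a nested list of values in [0, 1], connectivity is 4-neighbour, rectangles are returned as (x_min, y_min, width, height), and the function names `otsu_threshold` and `candidate_regions` are hypothetical. The saliency computation itself is not shown.

```python
from collections import deque


def otsu_threshold(values, bins=256):
    """Otsu-style threshold for values assumed to lie in [0, 1]."""
    hist = [0] * bins
    for v in values:
        hist[min(bins - 1, int(v * bins))] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0
    for i in range(bins):
        w0 += hist[i]
        sum0 += i * hist[i]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalised)
        if between > best_var:
            best_var, best_bin = between, i
    return (best_bin + 1) / bins


def candidate_regions(saliency):
    """Bounding rectangles of connected areas with saliency >= threshold."""
    h, w = len(saliency), len(saliency[0])
    th = otsu_threshold([v for row in saliency for v in row])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or saliency[y][x] < th:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one connected area
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and saliency[ny][nx] >= th):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return boxes
```

A production system would more likely use an image-processing library for both steps; the pure-Python version is used here only to keep the sketch self-contained.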

The attention area detection unit 13 detects the attention area based on the information on the user's line-of-sight position acquired by the line-of-sight information acquisition unit 11 and the candidate areas R set by the candidate area setting unit 12. FIG. 3 shows the functional configuration of the attention area detection unit 13. As shown in the figure, the attention area detection unit 13 comprises an area selection unit 131, a statistic calculation unit 132, an area centroid calculation unit 133, an area size calculation unit 134, and an area setting unit 135.

For each of the candidate areas R_n (n = 1, 2, ..., N) set by the candidate area setting unit 12, the area selection unit 131 calculates the total dwell time of the user's line-of-sight position (x(t), y(t)) over the period t = 0 to T. The area selection unit 131 then selects the candidate area R ∈ {R_n} (n = 1, 2, ..., N) with the largest total and acquires its centroid position (X, Y) and size (W, H). Here, (X, Y) are the horizontal and vertical coordinates of the centroid of the selected candidate area R, and (W, H) are its horizontal and vertical sizes.

FIG. 4 shows an example of the processing of the area selection unit 131. In this example, among all candidate areas R_n (n = 1, 2, ..., N), candidate area R_2 has the largest total dwell time of the user's line-of-sight position (x(t), y(t)), so the area selection unit 131 acquires its centroid (X, Y) and size (W, H). Having acquired the centroid (X, Y) and size (W, H) of the selected candidate area R, the area selection unit 131 outputs this information to the statistic calculation unit 132, the area centroid calculation unit 133, and the area size calculation unit 134.
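The selection by total dwell time can be illustrated with a short sketch. It assumes that the line of sight is sampled at a fixed interval `dt`, that each candidate area is given as (centroid x, centroid y, width, height), and that a sample contributes dwell time to an area whenever it falls inside that rectangle. The name `select_region` and these conventions are assumptions made for illustration, not details taken from the text.

```python
def select_region(gaze_samples, regions, dt=1.0):
    """Return (index of the area with the largest total dwell time, dwell list).

    gaze_samples: list of (x, y) positions sampled every dt seconds over t = 0..T
    regions:      list of (X, Y, W, H) rectangles (centroid and size)
    """
    dwell = [0.0] * len(regions)
    for x, y in gaze_samples:
        for n, (cx, cy, w, h) in enumerate(regions):
            if abs(x - cx) <= w / 2 and abs(y - cy) <= h / 2:
                dwell[n] += dt  # this sample dwells inside area n
    best = max(range(len(regions)), key=dwell.__getitem__)
    return best, dwell


regions = [(10, 10, 10, 10), (40, 40, 20, 20)]
samples = [(9, 11), (38, 42), (45, 35), (41, 41), (100, 100)]
best, dwell = select_region(samples, regions)
X, Y, W, H = regions[best]  # centroid and size passed on to the later stages
```

With ties, `max` returns the first area reaching the maximum; the text does not say how ties should be resolved.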

For the candidate area R selected by the area selection unit 131, the statistic calculation unit 132 calculates the time-averaged position (g_x, g_y) of the user's line-of-sight position (x(t), y(t)) over the period t = 0 to T using Equation (1).

In addition, for the candidate area R selected by the area selection unit 131, the statistic calculation unit 132 calculates the variance (σ_x^2, σ_y^2) of the user's line-of-sight position (x(t), y(t)) over the period t = 0 to T using Equation (2).
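Equations (1) and (2) are not reproduced in this text, so the following sketch only illustrates one plausible reading: a discrete sample mean and a population variance taken over the line-of-sight samples that fall inside the selected candidate area. Restricting the samples to the area, the (X, Y, W, H) rectangle format, and the name `gaze_statistics` are all assumptions.

```python
def gaze_statistics(gaze_samples, region):
    """Mean (g_x, g_y) and variance (var_x, var_y) of gaze samples inside region.

    gaze_samples: list of (x, y) positions over t = 0..T
    region:       (X, Y, W, H), centroid and size of the selected candidate area
    """
    cx, cy, w, h = region
    inside = [(x, y) for x, y in gaze_samples
              if abs(x - cx) <= w / 2 and abs(y - cy) <= h / 2]
    n = len(inside)
    if n == 0:
        raise ValueError("no gaze samples fall inside the selected area")
    gx = sum(x for x, _ in inside) / n
    gy = sum(y for _, y in inside) / n
    var_x = sum((x - gx) ** 2 for x, _ in inside) / n  # population variance
    var_y = sum((y - gy) ** 2 for _, y in inside) / n
    return (gx, gy), (var_x, var_y)


mean, var = gaze_statistics([(1, 2), (3, 2), (1, 4), (3, 4), (50, 50)],
                            (0, 0, 10, 10))
```

The sample at (50, 50) lies outside the area and is ignored under this reading.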

The area centroid calculation unit 133 calculates the centroid (X', Y') of the attention area R' at time t' (> T) based on the centroid (X, Y) of the candidate area R and the time-averaged position (g_x, g_y) of the user's line-of-sight position (x(t), y(t)). The centroid (X', Y') of the attention area R' is set so that the centroid (X, Y) of the candidate area R gradually approaches the centroid of the user's line-of-sight position (x(t), y(t)) as time passes. Specifically, the area centroid calculation unit 133 calculates the centroid (X', Y') of the attention area R' based on Equation (3) below.

Here, k_g(t') is given as an m-th order function of time t' and is expressed by Equation (4) below. FIG. 5 shows graphs of the m-th order function k(t') with respect to time t'; FIGS. 5(a) to 5(d) are the graphs of k(t') for m = 1 to 4. The function k_g(t') in this embodiment is given as an m-th order function of time t' as in FIG. 5.

Here, the coefficient a_g in Equation (4) is a predetermined real value, determined in advance according to the processing speed of the system, the line-of-sight detection accuracy, and the like. Note that k_g(t') may be given as some other function, as long as it is a multi-order function of time t'.
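Equations (3) and (4) are not reproduced in this text. The sketch below therefore assumes simple forms that match the described behaviour: a weight k_g(t') = min(1, a_g (t' - T)^m) that starts at 0 at t' = T and saturates at 1, and a linear blend X' = X + k_g(t') (g_x - X), and likewise for Y'. Both forms, the default values, and the function names are assumptions for illustration only.

```python
def k_poly(t_prime, T, a, m):
    """Hypothetical m-th order weight: 0 at t' = T, clipped to 1 afterwards."""
    if t_prime <= T:
        return 0.0
    return min(1.0, a * (t_prime - T) ** m)


def attention_centroid(center, gaze_mean, t_prime, T, a_g=0.25, m=2):
    """Move the candidate centroid (X, Y) toward the gaze mean (g_x, g_y)."""
    k = k_poly(t_prime, T, a_g, m)
    X, Y = center
    gx, gy = gaze_mean
    return (X + k * (gx - X), Y + k * (gy - Y))


# The centroid drifts from the candidate-area centroid to the gaze mean.
for t in (0, 1, 2, 5):
    print(t, attention_centroid((10, 20), (14, 12), t, T=0))
```

Under these assumptions a larger a_g or a smaller m makes the centroid reach the time-averaged line-of-sight position sooner, which is consistent with a_g being tuned to processing speed and detection accuracy.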

The area size calculation unit 134 calculates the size (W', H') of the attention area R' at time t' (> T) based on the size (W, H) of the candidate area R and the variance (σ_x^2, σ_y^2) of the user's line-of-sight position (x(t), y(t)). When the spatial spread of the user's line-of-sight position (x(t), y(t)) is large relative to the size (W, H) of the candidate area R, the size (W', H') of the attention area R' is set so that the size (W, H) of the candidate area R gradually expands as time passes. Conversely, when the spatial spread of the line-of-sight position is small relative to the size (W, H) of the candidate area R, it is set so that the size (W, H) gradually shrinks as time passes. Specifically, the area size calculation unit 134 calculates the size (W', H') of the attention area R' at time t' (> T) using Equation (5) below.

Here, k_s(t') is given as an m-th order function of time t' and is expressed by Equation (6) below. This function k_s(t') is also an m-th order function of time t' of the kind shown in FIGS. 5(a) to 5(d).

Here, the coefficient a_s in Equation (6) is a predetermined real value, determined in advance according to the processing speed of the system, the line-of-sight detection accuracy, and the like. Note that k_s(t') may be given as some other function, as long as it is a multi-order function of time t'.
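Equations (5) and (6) are likewise not reproduced here. As a hedged illustration, the sketch below assumes that the size is blended toward a target proportional to the standard deviation of the line-of-sight position, W' = W + k_s(t') (c σ_x - W), with k_s(t') = min(1, a_s (t' - T)^m). The proportionality constant `spread` (c), the blend form, and the default values are hypothetical choices, not values from the text.

```python
import math


def attention_size(size, gaze_var, t_prime, T, a_s=0.5, m=1, spread=4.0):
    """Grow or shrink (W, H) toward spread * (sigma_x, sigma_y) over time."""
    k = 0.0 if t_prime <= T else min(1.0, a_s * (t_prime - T) ** m)
    W, H = size
    target_w = spread * math.sqrt(gaze_var[0])  # assumed target width
    target_h = spread * math.sqrt(gaze_var[1])  # assumed target height
    return (W + k * (target_w - W), H + k * (target_h - H))


# Gaze spread is narrow horizontally and wide vertically relative to the
# candidate area, so the width shrinks while the height grows.
print(attention_size((20, 10), (4.0, 9.0), 1, T=0))
print(attention_size((20, 10), (4.0, 9.0), 2, T=0))
```

This reproduces the qualitative behaviour described above: the area expands along an axis where the line-of-sight spread exceeds the candidate size and contracts where it is smaller.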

The area setting unit 135 sets the attention area R' at time t' based on the centroid (X', Y') of the attention area R' at time t' calculated by the area centroid calculation unit 133 and the size (W', H') of the attention area R' at time t' calculated by the area size calculation unit 134. FIG. 6 shows an example of the processing by which the area setting unit 135 sets the attention area R'. As shown in the figure, in this embodiment the rectangular area whose centroid is at coordinates (X', Y') and whose size is (W', H') is set as the attention area R'. The attention area R' set in this way may be displayed superimposed on the input image on display means (a display or the like) provided outside the attention area detection device.

FIG. 7 shows an example of how the attention area R' changes over time. As shown in the figure, the attention area R' is continuously enlarged, reduced, and translated over time according to the values of k_g(t') and k_s(t') at time t'. If the user performs an interrupt operation (for example, pressing a button) at time t' = T_int, the attention area detection device acquires that information, and the area setting unit 135 stops the enlargement, reduction, and translation from t' = T_int onward. The area setting unit 135 then sets the final attention area R' based on the centroid (X', Y') and size (W', H') of the attention area R' at time t' = T_int. If, on the other hand, the user performs no interrupt operation before time t' = T_end, at which k_g(t') = k_s(t') = 1, the final attention area R' is set based on the centroid (X', Y') and size (W', H') of the attention area R' at time t' = T_end.
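The stopping behaviour (freeze at T_int on a user interrupt, otherwise at T_end when the weights reach 1) can be sketched as a simple loop. For brevity the sketch uses a single weight for both centroid and size, takes the final targets as given inputs, and models the interrupt as a time value `interrupt_at`; the weight form min(1, a (t' - T)^m), the linear blend, and all names are assumptions for illustration.

```python
def track_attention_region(center, size, target_center, target_size,
                           T, a, m, dt, interrupt_at=None):
    """Return (t', centroid, size) at which the attention area is fixed."""
    if a <= 0 or dt <= 0:
        raise ValueError("a and dt must be positive")
    t = float(T)
    while True:
        # Assumed m-th order weight, 0 at t' = T and clipped to 1.
        k = min(1.0, a * (t - T) ** m) if t > T else 0.0
        c = tuple(p + k * (q - p) for p, q in zip(center, target_center))
        s = tuple(p + k * (q - p) for p, q in zip(size, target_size))
        if interrupt_at is not None and t >= interrupt_at:
            return t, c, s  # user interrupt at T_int: freeze the current area
        if k >= 1.0:
            return t, c, s  # T_end reached: the area has fully converged
        t += dt


# Without an interrupt the area converges; with one it stops part-way.
print(track_attention_region((0, 0), (10, 10), (8, 4), (6, 14),
                             T=0, a=0.25, m=1, dt=1.0))
print(track_attention_region((0, 0), (10, 10), (8, 4), (6, 14),
                             T=0, a=0.25, m=1, dt=1.0, interrupt_at=2))
```

In a real device the interrupt would arrive as an asynchronous event such as a button press rather than as a preset time.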

The attention area detected in this way is output to a device that performs processing using the attention area detection result. For example, in an imaging device such as a digital camera, it is used for processing such as focusing on the detected attention area and improving the image quality of that area. A semiconductor integrated circuit provided in an imaging device such as a digital camera may also realize the functions of the attention area detection device described above; in that case, the imaging device itself corresponds to the attention area detection device of this embodiment. When the imaging device itself functions as the attention area detection device in this way, the attention area detection device may itself measure and acquire the information on the user's line of sight through the viewfinder, for example by the method disclosed in Patent Document 2.

FIG. 8 shows a flowchart of the attention area detection method according to this embodiment. When the processing starts, first, in step S101, the line-of-sight information acquisition unit 11 acquires information on the user's line-of-sight position (x(t), y(t)). Next, in step S102, the candidate area setting unit 12 sets candidate areas R_n (n = 1, 2, ..., N) for the attention area in the input image.

The subsequent steps S103-1 to S103-5 are the steps in which the attention area detection unit 13 detects the attention area. First, in step S103-1, the area selection unit 131 calculates, for each of the set candidate areas R_n (n = 1, 2, ..., N), the total dwell time of the user's line-of-sight position (x(t), y(t)). Then, in step S103-2, the statistic calculation unit 132 calculates the time-averaged position (g_x, g_y) and the variance (σ_x^2, σ_y^2) of the user's line-of-sight position (x(t), y(t)) over the period t = 0 to T.

In step S103-3, the area centroid calculation unit 133 calculates the centroid (X', Y') of the attention area R' at time t' (> T) based on the centroid (X, Y) of the candidate area R and the time-averaged position (g_x, g_y) of the user's line-of-sight position. In step S103-4, the area size calculation unit 134 calculates the size (W', H') of the attention area R' at time t' (> T) based on the size (W, H) of the candidate area R and the variance (σ_x^2, σ_y^2) of the user's line-of-sight position.

In step S103-5, the area setting unit 135 sets the attention area R' at time t' based on the calculated centroid (X', Y') and size (W', H') of the attention area R' at time t'.

As described above, in this embodiment the centroid (X', Y') and size (W', H') of the attention area R' are set based on the dwell time of the user's line-of-sight position, which is a statistical distribution of the user's line-of-sight position (x(t), y(t)). With this configuration, this embodiment can accurately detect the attention area the user is paying attention to.

[Second Embodiment]
Next, as a second embodiment of the present invention, a configuration is described that detects the attention area to which the m-th person is paying attention, based on the detected face areas of persons. Components already described in the first embodiment are given the same reference numerals, and their description is omitted.

FIG. 9 is a schematic block diagram showing the configuration of the attention area detection device according to this embodiment. In addition to the functional units described in the first embodiment, the attention area detection device 1 of this embodiment includes a face detection unit 21. The face detection unit 21 first acquires an image for detecting the faces of the persons looking at the display means that displays the input image to be processed by the attention area detection device. Here, for example, it acquires an image captured by a camera attached to the digital signage that displays the input image to be processed.

The face detection unit 21 then detects the face areas F_m (m = 1, 2, ..., M) of all persons in the acquired image. FIG. 10 shows an example of the detection of face areas F_m (m = 1, 2, ..., M) by the face detection unit 21. M denotes the number of face areas (M = 4 in the case of FIG. 10). The face areas may be detected, for example, by the methods disclosed in Non-Patent Documents 8 and 9. The technique for detecting the users looking at the input image is not limited to the face-area-based method described above and may instead be based on other parts of the human body.

The line-of-sight information acquisition unit 11 acquires information on the line-of-sight position (x(t), y(t)) of the m-th person (user) at time t. As described in the first embodiment, the methods of Non-Patent Documents 1 and 2 and the like can be used to detect the user's line-of-sight position. The line-of-sight information acquisition unit 11 may also detect and acquire the user's line-of-sight position itself.

The candidate area setting unit 12 acquires an input image supplied from outside and sets candidate areas R_n (n = 1, 2, ..., N) for the attention area in that input image. FIG. 11 illustrates how the candidate area setting unit 12 sets the candidate areas R_n (n = 1, 2, ..., N) in this embodiment. FIG. 11(a) shows an example of the input image of this embodiment, here an advertisement (digital signage) composed of images, text, and the like aimed at an unspecified large number of people. For such an input image, as shown in FIG. 11(b), the candidate area setting unit 12 sets each block that is meaningful as a piece of content (whose position and size are known) as a candidate area R_n (n = 1, 2, ..., N) for the attention area.

The attention area detection unit 13 is composed of the same functional units as in the first embodiment: an area selection unit 131, a statistic calculation unit 132, an area centroid calculation unit 133, an area size calculation unit 134, and an area setting unit 135.

For every candidate area R_n (n = 1, 2, ..., N) set by the candidate area setting unit 12, the area selection unit 131 calculates the total dwell time of the line-of-sight position (x_m(t), y_m(t)) of the m-th person over the period t = 0 to T. It then selects the candidate area R_m ∈ {R_n} (n = 1, 2, ..., N) with the largest total.
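The per-person selection can be sketched as below. The sketch assumes that line-of-sight samples are already grouped by person index m, that sampling is at a fixed interval `dt`, and that candidate areas are given as (centroid x, centroid y, width, height); the name `select_region_per_person` and the data layout are hypothetical.

```python
def select_region_per_person(gaze_by_person, regions, dt=1.0):
    """Map each person index m to the index n of the area with most dwell time.

    gaze_by_person: dict {m: [(x_m, y_m), ...]} of samples over t = 0..T
    regions:        list of (X, Y, W, H) content blocks (centroid and size)
    """
    selected = {}
    for m, samples in gaze_by_person.items():
        dwell = [
            sum(dt for x, y in samples
                if abs(x - cx) <= w / 2 and abs(y - cy) <= h / 2)
            for (cx, cy, w, h) in regions
        ]
        selected[m] = max(range(len(regions)), key=dwell.__getitem__)
    return selected


regions = [(10, 10, 10, 10), (40, 40, 20, 20)]
gaze = {1: [(9, 9), (11, 10), (40, 40)],
        2: [(41, 39), (42, 44), (10, 10)]}
print(select_region_per_person(gaze, regions))
```

Each person's selected area R_m then feeds the same mean, variance, centroid, and size calculations as in the first embodiment, carried out separately per person.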

For the candidate area R_m of the m-th person selected by the area selection unit 131, the statistic calculation unit 132 calculates the time-averaged position (g_x, g_y) of the line-of-sight position (x_m(t), y_m(t)) of the m-th person over the period t = 0 to T using Equation (7).

Furthermore, in the candidate area R_m selected for the m-th person by the area selection unit 131, the statistic calculation unit 132 calculates the variance (σ_x², σ_y²) of the line-of-sight position (x_m(t), y_m(t)) of the m-th person from time t = 0 to T using Equation 8.
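As an illustration of these two statistics, the sketch below computes a mean position and a per-axis variance from gaze samples. Equations 7 and 8 are not reproduced in this text, so the code uses the standard population definitions with uniformly weighted samples; that choice, the restriction to samples that fall inside the selected area, and the function name are assumptions.

```python
from typing import Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, width, height), assumed layout
Point = Tuple[float, float]               # (x, y)


def gaze_mean_and_variance(
    region: Rect, points: Sequence[Point]
) -> Tuple[Point, Point]:
    """Return ((g_x, g_y), (var_x, var_y)) for gaze points inside the region."""
    left, top, width, height = region
    # Assumption: only samples that fall inside the selected area are used.
    inside = [
        (x, y)
        for x, y in points
        if left <= x < left + width and top <= y < top + height
    ]
    if not inside:
        raise ValueError("no gaze samples inside the selected region")
    n = len(inside)
    g_x = sum(x for x, _ in inside) / n
    g_y = sum(y for _, y in inside) / n
    # Population variance (divide by n), computed per axis.
    var_x = sum((x - g_x) ** 2 for x, _ in inside) / n
    var_y = sum((y - g_y) ** 2 for _, y in inside) / n
    return (g_x, g_y), (var_x, var_y)
```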

Based on the centroid (X_m, Y_m) of the candidate area R_m of the m-th person and the time-average position (g_x, g_y) of the line-of-sight position (x_m(t), y_m(t)), the area centroid calculation unit 133 calculates the centroid (X_m', Y_m') of the attention area R_m' of the m-th person at time t' (> T). Specifically, the area centroid calculation unit 133 calculates the centroid (X_m', Y_m') of the attention area R_m' of the m-th person based on Equation 9.

The area size calculation unit 134 calculates the size (W_m', H_m') of the attention area R_m' of the m-th person at time t' (> T) using Equation 10, based on the candidate area R_m of the m-th person and the variance (σ_x², σ_y²) of the line-of-sight position (x_m(t), y_m(t)).

The area setting unit 135 sets the attention area R_m' of the m-th person at time t' based on the centroid (X_m', Y_m') and the size (W_m', H_m') of the attention area R_m' of the m-th person at time t'.

FIG. 12 shows an example of how the area setting unit 135 sets the attention area R_m'. As shown in the figure, in the present embodiment a rectangular area of size (W_m', H_m') whose centroid is at the coordinates (X_m', Y_m') is set as the attention area R_m'. The attention area R_m' set in this way may be displayed superimposed on the input image by display means (a display or the like) provided outside the attention area detection device. When a plurality of face areas are detected as shown in FIG. 10, only the attention area corresponding to the person whose face area is judged to have the highest importance may be displayed, or all of the attention areas may be displayed.
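Turning a centroid and a size into a rectangle is simple arithmetic, sketched below. The function name and the `(left, top, right, bottom)` return layout are assumptions made for this illustration.

```python
from typing import Tuple


def rect_from_centroid(
    cx: float, cy: float, width: float, height: float
) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom) of a rectangle centered at (cx, cy)."""
    half_w = width / 2.0
    half_h = height / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

For example, `rect_from_centroid(100, 50, 40, 20)` gives `(80.0, 40.0, 120.0, 60.0)`. A real display path would probably also clip the rectangle to the image bounds, which the text does not discuss.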

The attention area detected in this way is output to a device that performs processing using the attention area detection result. For example, in an electronic display device such as digital signage, the result is used for processing such as raising the bit rate of only the detected attention area to improve its image quality. A semiconductor integrated circuit provided in an imaging device such as an electronic display device may implement the functions of the attention area detection device described above; in that case, the electronic display device itself corresponds to the attention area detection device of the present embodiment. The electronic display device may also include a camera (imaging means) and capture images of the persons who are looking at the electronic display device.

FIG. 13 is a flowchart of the attention area detection method according to the present embodiment. When the process starts, in step S201 the face detection unit 21 first detects the face areas F_m (m = 1, 2, ..., M) of all persons in the image. Then, in step S101, the line-of-sight information acquisition unit 11 acquires information on the line-of-sight position (x(t), y(t)) of the m-th person. In the subsequent steps S102 to S103-5, the same processing as described in the first embodiment is executed based on the line-of-sight position information of the m-th person. To detect attention areas for all of the detected face areas F_m (m = 1, 2, ..., M), this flow is simply executed repeatedly.

As described above, according to the present embodiment, the face areas of persons are extracted and the attention area that a specific person is looking at is detected, which makes it possible to detect that person's attention area accurately.

[Third Embodiment]
Next, a third embodiment of the present invention will be described. In the present embodiment, the centroid (X', Y') and size (W', H') of the attention area R' are set based on the skewness (S_x, S_y) and kurtosis (K_x, K_y), which characterize the spatial distribution of the user's line-of-sight position (x(t), y(t)). Components already described in the first and second embodiments are given the same reference numerals, and their description is omitted. The attention area detection device 1 of the present embodiment and its processing flow are the same as in the first embodiment, but the processing performed by the statistic calculation unit 132, the area centroid calculation unit 133, and the area size calculation unit 134 differs from that of the first embodiment.

First, in the present embodiment, the statistic calculation unit 132 calculates, using Equation 11, the time-average position (g_x, g_y) of the user's line-of-sight position (x(t), y(t)) from time t = 0 to T in the candidate area R selected by the area selection unit 131.

The statistic calculation unit 132 also calculates, using Equation 12, the variance (σ_x², σ_y²) of the user's line-of-sight position (x(t), y(t)) from time t = 0 to T in the candidate area R selected by the area selection unit 131.

The statistic calculation unit 132 also calculates, using Equation 13, the skewness (S_x, S_y) of the user's line-of-sight position (x(t), y(t)) from time t = 0 to T in the candidate area R selected by the area selection unit 131. The skewness (S_x, S_y) characterizes the statistical distribution of the user's line-of-sight position (x(t), y(t)) from time t = 0 to T: a positive (negative) skewness indicates that the distribution is biased in the positive (negative) direction.

The statistic calculation unit 132 also calculates, using Equation 14, the kurtosis (K_x, K_y) of the user's line-of-sight position (x(t), y(t)) from time t = 0 to T in the candidate area R selected by the area selection unit 131. The kurtosis (K_x, K_y) characterizes the statistical distribution of the user's line-of-sight position (x(t), y(t)) from time t = 0 to T: a positive (negative) kurtosis indicates that the distribution is sharply peaked (flat).
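The two shape statistics can be illustrated with the standard moment-based definitions. Equations 13 and 14 are not reproduced in this text, so the formulas below are an assumption: skewness as the mean cubed standardized deviation, and kurtosis as the excess kurtosis (fourth standardized moment minus 3), which is the convention under which a positive value means a sharper peak than a normal distribution and a negative value a flatter one. The function would be applied separately to the x and y coordinates.

```python
from typing import Sequence, Tuple


def skewness_and_kurtosis(values: Sequence[float]) -> Tuple[float, float]:
    """Return (skewness, excess kurtosis) of one gaze coordinate."""
    n = len(values)
    if n == 0:
        raise ValueError("no samples")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0.0:
        raise ValueError("variance is zero; shape statistics are undefined")
    std = variance ** 0.5
    # Third and fourth standardized moments.
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return skew, kurt
```

A symmetric sample such as `[1, 2, 3, 4, 5]` has zero skewness and negative excess kurtosis, while a sample with a long tail toward larger values, such as `[0, 0, 0, 0, 10]`, has positive skewness.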

Based on the centroid (X, Y) of the candidate area R and the skewness (S_x, S_y) of the user's line-of-sight position (x(t), y(t)), the area centroid calculation unit 133 calculates the centroid (X', Y') of the attention area R' at time t' (> T). Specifically, the area centroid calculation unit 133 calculates the centroid (X', Y') of the attention area R' based on Equation 15. As a result, the centroid (X', Y') of the attention area R' is set to a position at which the centroid (X, Y) of the candidate area R and the centroid of the user's line-of-sight position (x(t), y(t)) gradually approach each other as time passes.

Based on the size (W, H) of the candidate area R and the kurtosis (K_x, K_y) of the user's line-of-sight position (x(t), y(t)), the area size calculation unit 134 calculates the size (W', H') of the attention area R' at time t' (> T). Specifically, the area size calculation unit 134 calculates the size (W', H') of the attention area R' based on Equation 16. When the spread of the statistical distribution of the user's line-of-sight position (x(t), y(t)) is large (small) relative to the size (W, H) of the candidate area R, the size (W', H') of the attention area R' is set so that the size (W, H) of the candidate area R gradually expands (shrinks) over time.

As described above, in the present embodiment the centroid (X', Y') and size (W', H') of the attention area R' are set based on the skewness (S_x, S_y) and kurtosis (K_x, K_y), which characterize the spatial distribution of the user's line-of-sight position (x(t), y(t)). With this configuration, the present embodiment can accurately detect the attention area that the user is looking at.

[Other Embodiments]
The present invention can also be realized by a process in which software (a program) that implements the functions of the above embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of that system or apparatus reads and executes the program. The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. The present invention is not limited to the above embodiments; various modifications (including organic combinations of the embodiments) are possible based on the spirit of the present invention, and they are not excluded from the scope of the present invention. That is, all configurations that combine the above embodiments and their modifications are also included in the present invention.

DESCRIPTION OF SYMBOLS
1 Attention area detection device
11 Line-of-sight information acquisition unit
12 Candidate area setting unit
13 Attention area detection unit

Claims

An attention area detection device comprising:
acquisition means for acquiring information on a user's line-of-sight position with respect to an input image;
setting means for setting a plurality of candidate areas in the input image; and
detection means for detecting an attention area based on the plurality of set candidate areas and a statistical distribution of the acquired line-of-sight position of the user.

The attention area detection device according to claim 1, wherein the detection means selects one candidate area from the plurality of set candidate areas based on a dwell time of the user's line-of-sight position.

The attention area detection device according to claim 2, wherein the detection means determines the position of the attention area based on a centroid position of the selected candidate area and a time-average position of the user's line-of-sight position.

The attention area detection device according to claim 2, wherein the detection means determines the position of the attention area based on a centroid position of the selected candidate area and a skewness of the user's line-of-sight position.

The attention area detection device according to any one of claims 2 to 4, wherein the detection means determines the size of the attention area based on the size of the selected candidate area and a variance of the user's line-of-sight position.

The attention area detection device according to any one of claims 2 to 4, wherein the detection means determines the size of the attention area based on the size of the selected candidate area and a kurtosis of the user's line-of-sight position.

The attention area detection device according to claim 1, further comprising user detection means for detecting users who pay attention to the input image, wherein the detection means detects the attention area that a specific user among the detected users pays attention to.

The attention area detection device according to claim 7, wherein the user detection means detects the users who pay attention to the input image by detecting face areas.

The setting means calculates a saliency from low-level feature quantities at each position of the input image and sets the plurality of candidate areas based on the saliency. The attention area detection device according to item.

The attention area detection device according to claim 1, wherein the setting means sets an area forming one piece of content included in the input image as one candidate area.

The attention area detection device according to claim 1, wherein the attention area detected by the detection means is displayed superimposed on the input image.

The attention area detection device according to claim 1, wherein the attention area detected by the detection means is displayed with high image quality.

An attention area detection method comprising:
acquiring information on a user's line-of-sight position with respect to an input image;
setting a plurality of candidate areas in the input image; and
detecting an attention area based on the plurality of set candidate areas and a statistical distribution of the acquired line-of-sight position of the user.

A program for causing a computer to function as the attention area detection device according to any one of claims 1 to 12.