JP2002282210A

JP2002282210A - Method and apparatus for detecting visual axis

Info

Publication number: JP2002282210A
Application number: JP2001089886A
Authority: JP
Inventors: Hitoshi Hongo; 仁志本郷
Original assignee: Japan Science and Technology Corp
Current assignee: Japan Science and Technology Agency
Priority date: 2001-03-27
Filing date: 2001-03-27
Publication date: 2002-10-02
Anticipated expiration: 2021-03-27
Also published as: JP4729188B2

Abstract

PROBLEM TO BE SOLVED: To provide a gaze detection method and device that can suitably detect a gaze in a large indoor space without requiring the detection subject to wear the device. SOLUTION: Each camera personal computer 14 judges, from the estimated face angle, whether a frontal face within a prescribed angle range is being captured. For image data that satisfies this condition, it normalizes the size of the iris region within the eye region, calculates the iris center and the pupil center, and calculates the amount of deviation between them. A main personal computer 16 then compares the deviation amounts obtained for the cameras and determines that the gaze is directed at the video camera with the smallest deviation.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention

The present invention relates to a gaze detection method and an apparatus for it.

[0002]

2. Description of the Related Art

It has been proposed to infer what a person wants from information obtained by sensing the person, such as gaze and movement, together with a model of the surrounding environment built by object sensing, and to provide a service suited to that person's intention. To realize this, it is important to sense the person and the surrounding environment and to know what the person is looking at and what action the person is performing. Gaze information is indispensable here for estimating the object the person is attending to, or the person's intention and situation.

[0003] The following gaze detection method is known. A goggle-type gaze detection device with a light source for gaze detection is mounted on the head of the detection subject, and the light source irradiates the eye with infrared light. A light-receiving sensor inside the device receives the light reflected by the eye (pupil and cornea), and the gaze is detected from that reflected light.

[0004]

PROBLEMS TO BE SOLVED BY THE INVENTION

With the gaze detection method described above, however, a goggle-type device must be put on the head (over the eyes) every time, which is very troublesome. In addition, a goggle-type gaze detection device is normally wired to a control computer or the like that performs predetermined processing and control based on the detected gaze, so the wearer's range of movement is restricted and the device cannot be used in a large indoor space.

[0005] The present invention has been made to solve the above problems. Its object is to provide a gaze detection method and apparatus that can suitably detect a gaze even in a large indoor space without requiring the detection subject to wear a device.

[0006]

MEANS FOR SOLVING THE PROBLEMS

To solve the above problems, the invention according to claim 1 comprises: imaging means for imaging a detection subject from a predetermined point; face direction estimating means for detecting a face region of the detection subject from the image data captured by the imaging means and estimating the face direction based on the detected face region; determining means for determining whether the face direction estimated by the face direction estimating means is a frontal face, taken to include a predetermined angle range; eye region detecting means for detecting the eye region of the frontal face in the image data that the determining means has determined to be a frontal face; distance measuring means for measuring, in the eye region detected by the eye region detecting means, the distance between a first predetermined part and a second predetermined part within that eye region; and gaze estimating means for estimating the gaze based on the distance measured by the distance measuring means.

[0007] The invention according to claim 2, in claim 1, further comprises iris detecting means for detecting the center of the iris (the dark part of the eye) within the eye region, wherein the first predetermined part is the iris center.

[0008] The invention according to claim 3, in claim 1 or claim 2, further comprises pupil detecting means for detecting the pupil center within the eye region, wherein the second predetermined part is the pupil center.

[0009] The invention according to claim 4, in claim 1 or claim 2, further comprises centroid detecting means for detecting the centroid position of the eye region, wherein the second predetermined part is the centroid of the eye region.

[0010] The invention according to claim 5 comprises: an imaging step of imaging a detection subject from a predetermined point; a face direction estimating step of detecting a face region of the detection subject from the captured image data and estimating the face direction based on the detected face region; a determining step of determining whether the estimated face direction is a frontal face, taken to include a predetermined angle range; an eye region detecting step of detecting the eye region of the frontal face in the image data determined to be a frontal face; a distance measuring step of measuring, in the detected eye region, the distance between a first predetermined part and a second predetermined part within that eye region; and a gaze estimating step of estimating the gaze based on the measured distance.

[0011] The invention according to claim 6, in claim 5, further includes an iris detecting step of detecting the center of the iris (the dark part of the eye) within the eye region, wherein the first predetermined part is the iris center.

[0012] The invention according to claim 7, in claim 5 or claim 6, further includes a pupil detecting step of detecting the pupil center within the eye region, wherein the second predetermined part is the pupil center.

[0013] The invention according to claim 8, in claim 5 or claim 6, further includes a centroid detecting step of detecting the centroid position of the eye region, wherein the second predetermined part is the centroid of the eye region.

[0014] (Operation) According to the invention of claim 1, the face direction estimating means estimates the face direction from the image data captured by the imaging means from the predetermined point, and the determining means determines whether that face direction is a frontal face, taken to include a predetermined angle range. The distance measuring means then measures the distance between the first predetermined part and the second predetermined part within the eye region in the image data determined to be a frontal face, and the gaze estimating means detects the gaze based on that distance.

[0015] According to the invention of claim 2, the gaze estimating means estimates the gaze using the iris center. According to the invention of claim 3, the gaze estimating means estimates the gaze using the pupil center in addition to the iris center.

[0016] According to the invention of claim 4, the gaze estimating means estimates the gaze using the centroid of the eye region in addition to the iris center. According to the invention of claim 5, the face direction is estimated in the face direction estimating step from the image data captured from the predetermined point in the imaging step, and the determining step determines whether that face direction is a frontal face, taken to include a predetermined angle range. The distance measuring step then measures the distance between the first predetermined part and the second predetermined part within the eye region in the image data determined to be a frontal face, and the gaze estimating step detects the gaze based on that distance.

[0017] According to the invention of claim 6, the gaze is estimated in the gaze estimating step using the iris center. According to the invention of claim 7, the gaze is estimated in the gaze estimating step using the pupil center in addition to the iris center.

[0018] According to the invention of claim 8, the gaze is estimated in the gaze estimating step using the centroid of the eye region in addition to the iris center.

[0019]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of the gaze detection apparatus of the present invention is described below with reference to FIGS. 1 to 10.

[0020] The gaze detection apparatus 10 of this embodiment is used when controlling a plurality of electric devices 17 (for example, a television, an audio system, an air conditioner), such as switching them on and off. It issues the corresponding command signal to the electric device 17 that lies on the gaze it has detected.

[0021] For example, when the gaze is directed toward where the television is placed while the television is off (or on), the apparatus detects that gaze and issues an on signal (or off signal) as the command signal.

[0022] The gaze detection apparatus 10 comprises a plurality of (four in this embodiment) video cameras (CCD cameras) 11 serving as imaging means, camera personal computers 14, a main personal computer 16, and so on. The video cameras 11 are placed at the same locations as the electric devices 17 (for example, a television, an audio system, an air conditioner). In this embodiment, the camera personal computer 14 corresponds to the face direction estimating means, determining means, eye region detecting means, pupil detecting means, and iris detecting means, and the main personal computer 16 corresponds to the gaze estimating means. The video cameras 11 are placed at arbitrary positions, and the position of each video camera 11 corresponds to a predetermined point.

[0023] A camera personal computer 14 is connected to each video camera 11. Each frame (image data) shot by the video camera 11 is input to the camera personal computer 14 as a video-rate color image (640 × 480).

[0024] The camera personal computers 14 are connected to the main personal computer 16, which communicates with each camera personal computer 14 by socket communication over Ethernet (registered trademark). A network time server system is used: the main personal computer 16 is set up as the time server, and the clock of each camera personal computer 14 is synchronized to the main personal computer 16. The main personal computer 16 is also electrically connected to each electric device 17 (for example, a television, an audio system, an air conditioner) and performs on/off control according to the gaze detection result of the gaze detection apparatus 10. That is, it outputs a command signal that depends on the current state of the electric device 17: an off signal when the electric device 17 is on, and an on signal when it is off. Instead of a wired connection, the main personal computer 16 may control each electric device 17 by infrared, that is, wirelessly.
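The state-dependent command signal described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the device names, the `devices` table, and the `"ON"`/`"OFF"` strings are hypothetical.

```python
# Hypothetical device table: True means the device is currently on.
devices = {"tv": False, "audio": True}


def command_for(is_on: bool) -> str:
    """Return the command for a gazed-at device given its current state."""
    return "OFF" if is_on else "ON"


def on_gaze_detected(device: str) -> str:
    """Issue the command for the gazed-at device and record its new state."""
    command = command_for(devices[device])
    devices[device] = (command == "ON")
    return command
```

Calling `on_gaze_detected("tv")` twice in a row would first switch the television on and then off again, which is the toggle behavior the paragraph describes.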

[0025] (Operation) The operation of the gaze detection apparatus 10 of this embodiment is described below. First, an outline of the gaze detection performed by the gaze detection apparatus 10 is given.

[0026] Each video camera 11 images the detection subject H and feeds the image to its camera personal computer 14. Each camera personal computer 14 captures the image from the video camera 11, then performs skin color region extraction and face direction estimation, determines whether the face direction estimation result satisfies a predetermined condition, and detects the eye region 32 from image data that satisfies the condition. From the detected eye region 32 it normalizes the size of the iris, calculates the center of the iris (iris center) C1 and the position of the pupil within the iris (pupil center) C2 (see FIG. 9), and calculates (measures) the distance between the two points. The camera personal computer 14 sends the calculated distance to the main personal computer 16, and the main personal computer 16 compares the distances to detect the gaze, that is, to determine which of the video cameras 11 the subject is looking at.

[0027] The details are described below with reference to the flowchart of FIG. 2. The flowchart starts when a start request signal is sent from the main personal computer 16 to the camera personal computers 14. The processing of S1 to S11 is then repeated until an end request signal is sent from the main personal computer 16 to the camera personal computers 14.

[0028] In step (hereinafter abbreviated as "S") 1, the camera personal computer 14 first determines whether to capture an image from the video camera 11. In this embodiment, images are captured from the video camera 11 at a predetermined interval (for example, every 0.3 seconds), and each camera personal computer 14 determines whether that time has come. If it determines that it is time to capture an image (YES in S1), the camera personal computer 14 captures an image from the video camera 11 (S2). If it determines that it is not yet time to capture an image (NO in S1), it repeats this determination. Because the clock of each camera personal computer 14 is synchronized to the main personal computer 16, all camera personal computers 14 capture their images at the same time.
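One way the synchronized capture timing could work is sketched below. The text states only that captures happen at a fixed interval (0.3 seconds in the example) and that the clocks are synchronized; placing the capture instants on a shared grid anchored at time zero is an assumption of this sketch, and `next_capture_ms` is a hypothetical helper name.

```python
CAPTURE_INTERVAL_MS = 300  # the 0.3 s example interval from the text


def next_capture_ms(now_ms: int, interval_ms: int = CAPTURE_INTERVAL_MS) -> int:
    """Next capture instant strictly after now_ms on a grid anchored at 0.

    Assumption: every camera PC reads the same synchronized clock, so all of
    them compute the same grid instants regardless of when they ask.
    """
    return (now_ms // interval_ms + 1) * interval_ms
```

Under this assumption, two camera PCs that evaluate the function at slightly different moments within the same interval still agree on the capture instant.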

[0029] (Face region detection) After capturing a frame (image data; see FIG. 3 for an example) from the video camera 11, each camera personal computer 14 performs face region detection. Face region detection uses a known method based on a skin color reference value derived from color information. This embodiment uses the CIE L*u*v* color system, one of the perceptually uniform color spaces.

[0030] First, a two-dimensional color histogram over the U and V coordinate values is computed from the input image data across the whole image, and the peak (the value with the highest frequency) within a predefined valid skin color range is taken as the skin color reference value. A threshold is determined by applying the known discriminant analysis method to the color difference from that reference value, and the image is binarized into skin color regions and other regions using the threshold (see FIG. 4). Because this embodiment assumes a single detection subject H, when several skin color regions are detected each camera personal computer 14 takes the largest one as the face region 31 (S3). That is, the pixel count (area) of each extracted skin color region is computed, and the region with the maximum area Smax is taken as the face region 31. In the following description, the U and V coordinate values are also called UV values, or the U value and V value, for convenience.
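The face region detection steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: pixels are given as integer (u, v) pairs, the color difference is taken as the squared Euclidean distance in the (u, v) plane, the discriminant analysis method is implemented as an Otsu-style threshold, and regions are 4-connected. All function names are hypothetical.

```python
from collections import Counter, deque


def skin_reference(uv_pixels, u_range, v_range):
    """Most frequent (u, v) value inside the assumed valid skin color range."""
    hist = Counter(
        (u, v) for u, v in uv_pixels
        if u_range[0] <= u <= u_range[1] and v_range[0] <= v <= v_range[1]
    )
    return hist.most_common(1)[0][0]


def otsu_threshold(values):
    """Threshold t that maximizes between-class variance (class 0: v <= t)."""
    levels = sorted(set(values))
    n, total = len(values), sum(values)
    best_t, best_score = levels[0], -1.0
    for t in levels[:-1]:  # the largest level would leave class 1 empty
        low = [v for v in values if v <= t]
        n0, n1 = len(low), n - len(low)
        mean0, mean1 = sum(low) / n0, (total - sum(low)) / n1
        score = n0 * n1 * (mean0 - mean1) ** 2  # proportional to the variance
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def largest_region(mask):
    """Pixels (row, col) of the largest 4-connected True region in mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best


def detect_face_region(uv_image, u_range, v_range):
    """Binarize by color difference from the skin reference, keep largest region."""
    pixels = [p for row in uv_image for p in row]
    ref_u, ref_v = skin_reference(pixels, u_range, v_range)
    diff = [[(u - ref_u) ** 2 + (v - ref_v) ** 2 for u, v in row] for row in uv_image]
    t = otsu_threshold([d for row in diff for d in row])
    mask = [[d <= t for d in row] for row in diff]
    return largest_region(mask)
```

The threshold search is quadratic in the worst case, which is acceptable for a sketch; a production version would accumulate class sums over a histogram instead.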

[0031] (Face direction estimation) Next, in S4, each camera personal computer 14 estimates the face direction from the image data obtained from its video camera 11.

[0032] In this embodiment, the face direction is estimated by building a discriminant space for face direction through linear discriminant analysis of four-direction-plane features. Four-direction-plane feature extraction obtains, from the gradient of the gray values of the image data, a vector field in four directions (vertical, horizontal, right diagonal at 45 degrees, left diagonal at 45 degrees) at each pixel and produces edge images separated by direction. Each resulting edge image is a grayscale image with directionality.

[0033] Specifically, Prewitt filtering, a differential filter, is applied with the Prewitt operator to the image data input in S3 to generate edge images in four directions: horizontal, vertical, 45 degrees rising to the right (right diagonal), and 45 degrees falling to the right (left diagonal). These edge images are hereinafter called direction planes. Next, the image of each of the four direction planes is normalized to the face region 31 and reduced to a resolution of 8 × 8, and the gray values of the pixels of each direction plane are extracted as features (hereinafter called a feature vector).
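A rough sketch of the four-direction-plane feature extraction follows. It assumes the input is a grayscale image already cropped to the face region and at least 8 × 8 pixels, uses the absolute response of each 3 × 3 Prewitt kernel as the direction plane, and reduces each plane to 8 × 8 by block averaging. These choices, the kernel names, and the function names are assumptions of the sketch; the text does not specify them.

```python
# 3x3 Prewitt kernels for the four edge directions (names are illustrative).
KERNELS = {
    "horizontal": [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
    "vertical": [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    "diag_up": [[0, 1, 1], [-1, 0, 1], [-1, -1, 0]],
    "diag_down": [[-1, -1, 0], [-1, 0, 1], [0, 1, 1]],
}


def direction_planes(gray):
    """Absolute Prewitt response per direction; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    planes = {}
    for name, k in KERNELS.items():
        plane = [[0] * w for _ in range(h)]
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                plane[y][x] = abs(sum(
                    k[j][i] * gray[y + j - 1][x + i - 1]
                    for j in range(3) for i in range(3)
                ))
        planes[name] = plane
    return planes


def downsample(plane, size=8):
    """Block-average a plane down to size x size values, row by row."""
    h, w = len(plane), len(plane[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [plane[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out.append(sum(block) / len(block))
    return out


def feature_vector(gray):
    """Concatenate the four 8x8 planes into one 256-element feature vector."""
    planes = direction_planes(gray)
    return [v for name in KERNELS for v in downsample(planes[name])]
```

Four planes of 8 × 8 values give 256 features, which matches the 256-element feature vector x used in the discriminant analysis.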

[0034] Because the resolution of this feature vector is reduced after the image has been split into four direction planes, edge information is preserved at a higher resolution than when the resolution of the input image is reduced directly. As a result, the features are less sensitive to positional shifts and shape changes, and the computational cost falls, which speeds up processing.

[0035] Next, each camera personal computer 14 performs linear discriminant analysis. Linear discriminant analysis determines which class an extracted feature (feature vector xi) belongs to. High discriminating power is obtained by constructing a discriminant space in which the within-class variance is small and the mean feature vectors of the classes lie far apart. FIG. 5 is a conceptual diagram of the classes used in the discriminant analysis.

[0036] In this embodiment, a coefficient matrix A derived from training data is stored in advance in a storage device (not shown) of each camera personal computer 14. The training data is based on image data obtained by imaging several persons as detection subjects H. That is, as shown in FIG. 7, image data from 16 directions is acquired with 16 video cameras 11 arranged radially at equal angular intervals (22.5 degrees in this embodiment) with their optical axes pointing at the center of the room. Face region detection and four-direction-plane feature extraction on the face region 31 are then performed as described above to obtain the feature vector x.

[0037] x = {x1, x2, …, x256}

Instead of using 16 video cameras 11, a single video camera 11 may be used, for example: the detection subject H is imaged each time he or she turns by the same fixed angle about the center of the room, and the resulting image data is used as training data.

[0038] A coefficient matrix A that linearly maps this feature vector x to a feature vector y (= Ax) in the discriminant space has been obtained, the classes have been generated (16 classes in this embodiment, matching the video cameras 11 placed at 22.5-degree intervals when the training data was acquired), and the mean feature vector yj of each class has been calculated. The coefficient matrix A and the mean feature vectors yj of the classes are stored in advance in the storage device of each camera personal computer 14.

[0039] In this embodiment, the class number j takes 16 equally spaced values: 0, 22.5, 45, 67.5, 90, 112.5, 135, 157.5, 180, -157.5, -135, -112.5, -90, -67.5, -45, and -22.5. As shown in FIG. 7, each class number (value) equals the angle between the relative face direction and the optical axis (camera direction) of the video camera 11 attached to the camera personal computer 14. FIG. 7 shows video cameras 11 arranged in 16 directions at 22.5-degree intervals around the detection subject H, and the class assigned to the image data obtained from each camera when the detection subject H is imaged from it. For example, image data of the detection subject H taken from the camera labeled -22.5 in the figure is assigned class -22.5. In this embodiment, class number 0 degrees of the relative face direction corresponds to imaging a frontal face. A minus sign denotes an angle measured counterclockwise from the optical axis of the video camera 11 in FIG. 7.

[0040] In the linear discriminant analysis that classifies unknown data, the feature vector xi of four-direction-plane features extracted from the unknown data is mapped with the coefficient matrix A to produce the feature vector yi (= Axi). Next, the squared Euclidean distance (hereinafter, squared distance) Dij between the generated feature vector yi and the mean feature vector yj of each class is calculated with equation (1) below, and pattern recognition is performed by finding the class with the minimum squared distance Dij (see FIG. 6). Then, using the classes that correspond to the three smallest squared distances Dij, the minimum included, equation (2) below estimates the angle F between the camera direction (the direction in which the optical axis γ of the video camera 11 points; see FIG. 1) and the relative face direction β (the face direction relative to the optical axis γ). Dj in FIG. 6 omits the subscript i and corresponds to Dij in this specification.

[0041] $D_{ij} = \lvert y_i - y_j \rvert^2$ … (1)
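The projection y = Ax and the squared distance of equation (1) can be sketched as follows. The matrix and class means in the usage note are toy values chosen for illustration, not trained data, and the function names are hypothetical.

```python
def project(A, x):
    """Map feature vector x into the discriminant space: y = A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def squared_distances(y, class_means):
    """Squared Euclidean distance from y to each class mean, per equation (1)."""
    return {
        label: sum((a - b) ** 2 for a, b in zip(y, mean))
        for label, mean in class_means.items()
    }


def classify(A, x, class_means):
    """Return the class with the smallest squared distance, plus all distances."""
    distances = squared_distances(project(A, x), class_means)
    return min(distances, key=distances.get), distances
```

With a toy 2 × 3 matrix `A = [[1, 0, 0], [0, 1, 0]]` and class means `{0.0: [0, 0], 22.5: [1, 0], -22.5: [-1, 0]}`, the vector `[0.9, 0.1, 5.0]` projects to `[0.9, 0.1]` and is assigned to class 22.5.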

[0042]

[Math. 1]

In equation (2), i denotes the class number, and this embodiment assumes n = 3. Accordingly, the class numbers corresponding to the three smallest squared distances Dij, the minimum included, are assigned to i in order, starting from the class number of the minimum. θ denotes the relative face angle of each class (the angle of the relative face direction with respect to the camera direction, which equals the class number). In equation (2), the subscript j of the squared distance Dij is omitted.
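Equation (2) itself is not reproduced in this text, so the following sketch cannot claim to implement it. It only illustrates one plausible way to combine the n = 3 nearest classes: an inverse-squared-distance weighted average of their class angles θ. The weighting scheme, the `eps` guard, and the function name are assumptions. The sketch also ignores wrap-around at ±180 degrees, which is acceptable only near the frontal range.

```python
def estimate_angle(distances, n=3, eps=1e-9):
    """Weighted average of the class angles of the n nearest classes.

    distances maps class angle (degrees) to squared distance D.
    ASSUMPTION: weights are 1 / D; the patent's equation (2) may differ.
    """
    nearest = sorted(distances.items(), key=lambda item: item[1])[:n]
    weights = [1.0 / (d + eps) for _, d in nearest]  # eps guards against D = 0
    weighted_sum = sum(w * theta for w, (theta, _) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

For squared distances `{0.0: 1.0, 22.5: 1.0, -22.5: 2.0, 45.0: 10.0}`, the three nearest classes are 0, 22.5, and -22.5, and this weighting yields an estimate of about 4.5 degrees.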

[0043] (Face direction determination) In S5, each camera personal computer 14 uses the face direction estimated in S4 to determine whether the estimated face angle in the relative face direction is within a predetermined angle range (±20 degrees in this embodiment). If it is within the predetermined angle range (YES in S5), processing proceeds to S6. The condition that the estimated angle be within the predetermined angle range (for example, ±20 degrees) is also called the predetermined condition in this embodiment.

【００４４】このとき、ビデオカメラ１１は、一定間隔
毎に配置していないため、相対顔方向の角度Ｆが所定角
度（±２０度）内である画像データ、換言すれば、前述
した所定条件を満たす画像データは１つとは限らない。
従って、本実施形態では、相対顔方向の角度Ｆが所定角
度内の正面顔を撮像したカメラ１１が２つあり、ビデオ
カメラ１１Ａ及びビデオカメラ１１Ｂで捉えた画像デー
タが視線が向けられた候補、即ち、所定条件を満たし、
後述する目領域検出の対象として判断されたものとし
て、以下の説明を続ける。なお、推定された顔向きの角
度Ｆが所定条件を満たしていない（Ｓ５がＮＯ）と判定
したカメラ用パソコン１４は、今回の画像データについ
ては、以下のステップを行わず、このフローチャートを
終了する。At this time, because the video cameras 11 are not arranged at regular intervals, the image data in which the relative face angle F is within the predetermined angle (±20 degrees), in other words the image data that satisfies the predetermined condition described above, is not necessarily limited to one. The following description therefore assumes that, in this embodiment, two cameras 11 have captured a frontal face with the relative face angle F within the predetermined angle, and that the image data captured by the video camera 11A and the video camera 11B have been judged to be candidates at which the line of sight is directed, that is, to satisfy the predetermined condition and to be targets of the eye region detection described later. A camera personal computer 14 that determines that the estimated face angle F does not satisfy the predetermined condition (S5: NO) does not perform the following steps for the current image data and ends this flowchart.
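The S5 decision can be sketched as a simple filter over the per-camera angle estimates. The snippet below is only an illustration: the function name, the dictionary layout, and the third camera id "11C" are hypothetical and do not appear in the specification.

```python
# Illustrative sketch of the S5 check; names and data layout are assumptions.
FRONTAL_LIMIT_DEG = 20.0  # predetermined angle of this embodiment (±20 degrees)

def frontal_candidates(face_angles, limit_deg=FRONTAL_LIMIT_DEG):
    """Return the camera ids whose estimated relative face angle F
    satisfies the predetermined condition |F| <= limit_deg.

    face_angles maps a camera id to its estimated angle F in degrees.
    """
    return [cam for cam, angle in face_angles.items() if abs(angle) <= limit_deg]

# "11C" is a made-up third camera that fails the condition.
angles = {"11A": 5.0, "11B": -15.0, "11C": 40.0}
print(frontal_candidates(angles))
```

More than one camera can pass the check, which is why the later steps compare the results of the remaining candidates.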

【００４５】（視線検出）次のＳ６〜Ｓ１０の概要を説
明すると、ビデオカメラ１１Ａ及びビデオカメラ１１Ｂ
におけるカメラ用パソコン１４は、顔領域３１の中から
目領域３２を検出する（図９参照）。そして、瞳領域３
５を検出すると共に、その瞳領域３５の大きさを正規化
し、さらにそこから瞳孔領域３６を検出し、瞳中心Ｃ１
と瞳孔中心Ｃ２を算出して両位置間の距離を演算（計
測）する。そして、その距離の演算結果をメインパソコ
ン１６に送信する。メインパソコン１６はビデオカメラ
１１Ａ，１１Ｂの各カメラ用パソコン１４から受信した
前記距離の演算結果を比較して視線を検出（推定）す
る。本実施形態では、瞳中心Ｃ１が第１所定部位、瞳孔
中心Ｃ２が第２所定部位にそれぞれ相当する。(Line-of-Sight Detection) To outline S6 to S10: the camera personal computers 14 of the video camera 11A and the video camera 11B each detect the eye region 32 within the face region 31 (see FIG. 9). Each then detects the iris region 35 (瞳, the dark part of the eye), normalizes its size, detects the pupil region 36 (瞳孔) within it, calculates the iris center C1 and the pupil center C2, and computes (measures) the distance between the two positions. The result is transmitted to the main personal computer 16. The main personal computer 16 compares the distances received from the camera personal computers 14 of the video cameras 11A and 11B and thereby detects (estimates) the line of sight. In this embodiment, the iris center C1 corresponds to the first predetermined part and the pupil center C2 to the second predetermined part.

【００４６】（目領域検出）さて、Ｓ６において、ま
ず、カメラ用パソコン１４は、画像データについて肌色
基準値を再算出し、肌色領域を抽出する。抽出された肌
色領域のうち、最大領域を顔領域３１と判定する。(Detection of Eye Area) In S6, first, the camera personal computer 14 recalculates the skin color reference value for the image data and extracts the skin color area. Of the extracted skin color regions, the largest region is determined to be the face region 31.

【００４７】カメラ用パソコン１４は、その顔領域３１
に基づき、４方向面特徴と色差面特徴を用いたテンプレ
ートマッチング手法により、それぞれ目領域３２、並び
に口領域を検出する。Based on the face region 31, the camera personal computer 14 detects the eye regions 32 and the mouth region by a template matching method that uses four-directional surface features and color difference surface features.

【００４８】ところで、今回の画像データの１つ前に本
フローチャートを用いて処理された画像データにおい
て、このＳ６で目領域３２及び口領域が検出されていた
場合は、前回の検出結果に基づいて、今回得られた顔領
域３１を所定領域削除し、顔領域３１が前記所定領域分
狭められた探索範囲として設定されるようになってい
る。そして、今回の画像データに関しては、前記探索範
囲が用いられ、テンプレートマッチング手法により目領
域３２及び口領域の検出が行われる。なお、テンプレー
トマッチングを行った結果、前記探索範囲に対して目領
域３２及び口領域が検出されなかった場合は、再度、顔
領域３１に対して両領域の検出が行われるようになって
いる。If the eye region 32 and the mouth region were detected in S6 for the image data processed by this flowchart immediately before the current image data, a predetermined area is removed from the face region 31 obtained this time on the basis of the previous detection result, and the face region 31 narrowed by that area is set as the search range. For the current image data, the eye region 32 and the mouth region are then detected within this search range by the template matching method. If template matching finds no eye region 32 or mouth region within the search range, detection of both regions is repeated on the whole face region 31.
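The narrowed search with fallback can be sketched as follows. This is a sketch under assumptions: the specification does not say how the predetermined area is chosen, so the window here is simply the previous hit grown by a margin and clipped to the face region, and `match` stands in for the template matcher.

```python
def detect_eyes_and_mouth(face_region, previous_hit, match, margin=10):
    """Search a narrowed window first, then fall back to the whole face.

    face_region and previous_hit are (left, top, right, bottom) boxes.
    match(box) is a hypothetical template matcher that returns a result
    or None. The margin-based window is an assumption for illustration.
    """
    if previous_hit is not None:
        left, top, right, bottom = previous_hit
        f_left, f_top, f_right, f_bottom = face_region
        # Narrowed search range: previous hit plus a margin, clipped to the face.
        window = (max(f_left, left - margin), max(f_top, top - margin),
                  min(f_right, right + margin), min(f_bottom, bottom + margin))
        result = match(window)
        if result is not None:
            return result
    # No previous result, or nothing found in the window: search the full face.
    return match(face_region)
```

Searching the smaller window first keeps the per-frame cost low, while the fallback covers the case where the face has moved between frames.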

【００４９】ここで、前記テンプレートマッチング手法
について説明する。この手法は、得られた画像データか
ら、前述した４方向面特徴抽出にて４方向面特徴（方向
面）、及びＵ，Ｖ座標値による色差面特徴を抽出し、肌
色領域抽出で得られた肌色領域（顔領域３１）又は探索
範囲に対して、右目、左目、口の各テンプレートを用い
て類似度を計算する。The template matching method is as follows. From the obtained image data, four-directional surface features (directional surfaces) are extracted by the four-directional surface feature extraction described above, together with color difference surface features based on the U and V coordinate values. The similarity is then calculated, using the right-eye, left-eye, and mouth templates, over the skin color region (face region 31) obtained by skin color region extraction or over the search range.

【００５０】なお、前記色差面特徴は、肌色基準値から
のＵ値の差、及びＶ値の差を示すものである。また、前
記テンプレートとは、予め、右目、左目、口の画像を複
数枚用意し、４方向面特徴及び色差面特徴を抽出した画
像データを、所定比率で縮小し、横幅を所定ピクセル
（例えば３２ピクセル）に揃え、大きさの正規化を行
う。そして、４方向面特徴に関しては、エッジ方向情報
を４方向に分解し、さらに、４方向面特徴及び色差面特
徴に対してガウシャンフィルタで平滑化し、各画像デー
タを８×８の解像度に変換したものである。このテンプ
レートは、記憶装置（図示しない）に記憶されている。The color difference surface features represent the differences of the U value and the V value from the skin color reference value. The templates are prepared in advance as follows: a plurality of images of the right eye, the left eye, and the mouth are collected; the image data from which the four-directional surface features and color difference surface features have been extracted is reduced at a predetermined ratio so that the width is a predetermined number of pixels (for example, 32 pixels), which normalizes the size; for the four-directional surface features, the edge direction information is decomposed into four directions; the four-directional and color difference surface features are then smoothed with a Gaussian filter; and each image is converted to a resolution of 8 × 8. The templates are stored in a storage device (not shown).
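The smoothing and resolution-reduction steps of template preparation can be sketched as below. This is a minimal sketch, not the method of the specification: the 3 × 3 kernel (1-2-1 weights), the border handling, and block averaging as the way to reach 8 × 8 are all assumptions, and the input is assumed to be a single feature plane whose height and width are multiples of 8 (for example 32 × 32).

```python
def smooth(img):
    """3x3 Gaussian-like smoothing with 1-2-1 weights; borders are replicated."""
    h, w = len(img), len(img[0])
    k = (1, 2, 1)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += k[dy + 1] * k[dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0  # kernel weights sum to 16
    return out

def to_8x8(img):
    """Average equal-sized blocks to obtain an 8x8 grid."""
    h, w = len(img), len(img[0])
    bh, bw = h // 8, w // 8
    return [[sum(img[y][x]
                 for y in range(r * bh, (r + 1) * bh)
                 for x in range(c * bw, (c + 1) * bw)) / (bh * bw)
             for c in range(8)]
            for r in range(8)]

plane = [[float((x + y) % 7) for x in range(32)] for y in range(32)]
template = to_8x8(smooth(plane))  # 8 rows of 8 values
```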

【００５１】そして、前記テンプレートＴと画像データ
（入力画像）Ｉとの４方向面特徴の類似度ａを以下の式
（３）で算出し、色差面特徴の類似度ｂを以下の式
（４）で算出する。The similarity a of the four-directional surface features between the template T and the image data (input image) I is calculated by Expression (3) below, and the similarity b of the color difference surface features is calculated by Expression (4) below.

【００５２】[0052]

【数２】（３）、（４）式中、Ｉは入力画像を示し、Ｔはテンプ
レートを示す。ｉ、ｊは、１〜ｍ、１〜ｎの値であり、
ｍ×ｎ画素のテンプレート及び入力画像に対応してい
る。（ｘ，ｙ）は入力画像の左上座標を示す。また、
（４）式中Ｔｕ，ＴｖはテンプレートのＵＶ値、Ｉｕ，
Ｉｖは画像データのＵＶ値を示し、Ｕmax ，Ｖmax はＵ
Ｖ値の最大範囲を示す。本実施形態では、CIE L*u*v 表
色系を用いており、このＣＩＥＬＵＶ表色系において、
処理の高速化及び記憶装置の空間を節約するため、Ｕma
x ＝２５６，Ｖmax ＝２５６としている。(Equation 2) In Expressions (3) and (4), I denotes the input image and T denotes the template. The indices i and j take the values 1 to m and 1 to n, respectively, corresponding to a template and an input image of m × n pixels. (x, y) denotes the upper-left coordinates of the input image. In Expression (4), Tu and Tv are the UV values of the template, Iu and Iv are the UV values of the image data, and Umax and Vmax denote the maximum range of the UV values. This embodiment uses the CIE L*u*v* color system, and within this CIELUV color system Umax = 256 and Vmax = 256 are used in order to speed up processing and save storage space.

【００５３】次いで、これらの式（３），（４）で算出
した、各類似度ａ，ｂに基づいて、以下の式（５）によ
り、最終的な類似度ｃを算出する。ｃ＝Ｗa ×ａ＋Ｗb ×ｂ …（５）（５）式中Ｗａ，Ｗｂは、重み付けとして、各類似度
ａ，ｂに掛け合わせられる所定の定数であり、Ｗa ＋Ｗ
b ＝１を満たしている。なお、本実施形態では、Ｗa ＝
Ｗb ＝０．５としている。Next, the final similarity c is calculated from the similarities a and b obtained by Expressions (3) and (4), using Expression (5): c = Wa × a + Wb × b ... (5). In Expression (5), Wa and Wb are predetermined constants by which the similarities a and b are multiplied as weights, and they satisfy Wa + Wb = 1. In this embodiment, Wa = Wb = 0.5.
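Expression (5) is a plain weighted sum, which can be written as the following sketch. The function name and the input check are illustrative additions; the similarities a and b are assumed to have been computed already by Expressions (3) and (4).

```python
def combined_similarity(a, b, wa=0.5, wb=0.5):
    """Expression (5): c = Wa * a + Wb * b, with Wa + Wb = 1.

    a is the four-directional surface feature similarity and b is the
    color difference surface feature similarity.
    """
    if abs(wa + wb - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return wa * a + wb * b

c = combined_similarity(0.8, 0.6)  # equal weights, as in this embodiment
```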

【００５４】その演算結果を元に、前記類似度ｃが予め
設定された閾値以上の箇所を、目の候補領域とする。そ
して、入力画像（画像データ）には、左上座標が予め付
与されており、その座標に基づき目、口の位置関係が把
握できる。従って、その座標に基づいて、例えば、目は
口より上にある、右目と左目の配置等、目、口の大まか
な位置関係（座標位置）を満たし、最も類似度ｃの高い
組み合わせを目領域３２並びに口領域として決定する。
この結果、顔領域３１の中で目領域３２が検出される。Based on the calculation result, locations where the similarity c is at or above a preset threshold are taken as eye candidate regions. The input image (image data) is assigned upper-left coordinates in advance, so the positional relationship between the eyes and the mouth can be determined from those coordinates. The combination that satisfies the rough positional relationship (coordinate positions) of the eyes and mouth, for example that the eyes are above the mouth and that the right and left eyes are arranged correctly, and that has the highest similarity c is therefore determined as the eye regions 32 and the mouth region.
As a result, the eye region 32 is detected in the face region 31.
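The candidate selection described above can be sketched as follows. Several details are assumptions made for illustration: each candidate is a tuple (x, y, c) of upper-left coordinates and similarity, the score of a combination is the sum of its three similarities, and the subject's right eye is taken to appear to the left of the left eye in the image.

```python
from itertools import product

def pick_eyes_and_mouth(right_eyes, left_eyes, mouths, threshold):
    """Return the best (right_eye, left_eye, mouth) combination, or None.

    Each candidate is (x, y, c). Only candidates with c >= threshold are
    considered, and y grows downward as in image coordinates.
    """
    def keep(cands):
        return [p for p in cands if p[2] >= threshold]

    best, best_score = None, float("-inf")
    for r, l, m in product(keep(right_eyes), keep(left_eyes), keep(mouths)):
        # Rough layout: both eyes above the mouth, right eye left of left eye.
        if r[1] < m[1] and l[1] < m[1] and r[0] < l[0]:
            score = r[2] + l[2] + m[2]  # assumed scoring rule
            if score > best_score:
                best, best_score = (r, l, m), score
    return best
```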

【００５５】（瞳検出）次にＳ７において、検出された
目領域３２からカメラ用パソコン１４は瞳の中心Ｃ１を
検出する瞳検出を行う。なお、本実施形態では、Ｓ６に
て検出された目領域３２のうち何れか一方（例えば右
目）の目領域３２について、以下に説明する瞳検出及び
瞳孔検出を行う。(Iris Detection) Next, in S7, the camera personal computer 14 performs iris detection, which detects the center C1 of the iris (瞳, the dark part of the eye) from the detected eye region 32. In this embodiment, the iris detection and pupil detection described below are performed on only one of the eye regions 32 detected in S6 (for example, the right eye).

【００５６】まず、目領域画像の彩度値ヒストグラムを
作成して、公知の判別分析法を適用し、顔領域３１を目
領域３２と肌領域（顔領域の目領域３２以外の領域）と
に分離する。一般的に、肌領域の彩度は高く、目領域３
２の彩度は低い。このため、この分離処理はその特性を
利用している。次いで、前記目領域画像の輝度ヒストグ
ラムを作成して、公知の判別分析法を適用し、分離され
た目領域３２を、瞳領域３５と白目領域３４とに分割す
る。First, a saturation histogram of the eye region image is created and a known discriminant analysis method is applied to it, separating the face region 31 into the eye region 32 and the skin region (the part of the face region other than the eye region 32). This separation exploits the fact that the saturation of the skin region is generally high while that of the eye region 32 is low. Next, a luminance histogram of the eye region image is created and the known discriminant analysis method is applied again, dividing the separated eye region 32 into the iris region 35 and the white-of-the-eye region 34.
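The specification only refers to a known discriminant analysis method. One widely used histogram thresholding technique of that kind is Otsu's method, and the sketch below assumes that is what is meant; the function name and the toy histogram are illustrative.

```python
def otsu_threshold(hist):
    """Return the bin index t that maximizes the between-class variance.

    Class 0 holds bins 0..t and class 1 holds the remaining bins.
    hist is a list of non-negative counts (for example, a saturation or
    luminance histogram).
    """
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0
    sum0 = 0.0
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Toy bimodal histogram: low bins (e.g. eye) versus high bins (e.g. skin).
t = otsu_threshold([5, 5, 0, 0, 0, 0, 5, 5])
```

The same function can be applied twice, once to the saturation histogram and once to the luminance histogram, mirroring the two-stage separation in the text.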

【００５７】その後、瞳領域３５の検出結果を元に、瞳
領域３５を縮小又は拡大し、所定の大きさに正規化す
る。そして、瞳領域３５に対して円形状の補完を行う。
この際、前述したように、彩度値ヒストグラム及び輝度
ヒストグラムにそれぞれ判別分析法を適用して分割する
ことで得られた瞳領域３５内には図８（ａ）に示すよう
に、瞼による陰影３５ａの存在が考えられる。このと
き、通常、画像の濃淡値を８ビットで表した場合、濃淡
値０が黒、濃淡値２５６が白となる。従って、領域分割
結果における濃淡値０（黒色）の領域に対して、水平射
影ヒストグラムを作成し（図８（ｂ）参照）、同ヒスト
グラムにおいて縦軸方向の上部に示されるように、極端
なピークをもつ部分を予め設定された閾値に基づいて削
除する。つまり、瞼による陰影３５ａの部分は該ヒスト
グラム上でピークとして現れ、それを削除することで、
図８（ｃ）に示すような、瞳領域３５のみが抽出され
る。なお、本実施形態では、縦軸方向は、図８（ａ）〜
（ｃ）及び図９において上下方向を示し、横軸方向は、
図８（ａ）〜（ｃ）及び図９において左右方向を示す。Thereafter, based on the detection result, the iris region 35 is reduced or enlarged so as to be normalized to a predetermined size, and circular completion is applied to the iris region 35. As noted above, the iris region 35 obtained by applying the discriminant analysis method to the saturation histogram and the luminance histogram may contain a shadow 35a cast by the eyelid, as shown in FIG. 8(a). When the gray value of an image is expressed in 8 bits, gray value 0 is normally black and gray value 255 is white. A horizontal projection histogram is therefore created for the region with gray value 0 (black) in the segmentation result (see FIG. 8(b)), and the part with an extreme peak, shown at the top of the histogram in the vertical-axis direction, is deleted on the basis of a preset threshold. In other words, the eyelid shadow 35a appears as a peak in the histogram, and deleting it leaves only the iris region 35, as shown in FIG. 8(c). In this embodiment, the vertical-axis direction is the up-down direction in FIGS. 8(a) to 8(c) and FIG. 9, and the horizontal-axis direction is the left-right direction in those figures.
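The shadow removal step can be sketched with a row-wise projection. In this sketch the binary mask layout (1 for a black pixel, 0 otherwise), the function name, and the rule of clearing every row whose count exceeds the threshold are assumptions made for illustration.

```python
def remove_eyelid_shadow(mask, max_row_count):
    """Clear rows whose number of dark pixels exceeds max_row_count.

    mask is a list of rows of 0/1 values, where 1 marks a pixel with
    gray value 0 (black) in the segmentation result. Returns the
    horizontal projection histogram and the cleaned mask.
    """
    projection = [sum(row) for row in mask]  # dark pixels per row
    cleaned = [[0] * len(row) if count > max_row_count else list(row)
               for row, count in zip(mask, projection)]
    return projection, cleaned

# Top row imitates a wide eyelid shadow; the rows below imitate the iris.
mask = [
    [1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
]
projection, cleaned = remove_eyelid_shadow(mask, max_row_count=4)
```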

【００５８】次に、目領域３２に対して、白目領域３４
と瞳領域３５の濃淡の違いを利用して、Prewitt オペレ
ータを用い図８（ｃ）に示す瞳領域３５のエッジ画像を
生成することで、輪郭（エッジ）を抽出する。その後、
その輪郭を構成する点群に対して公知のハフ変換を用い
て瞳領域３５の円方程式を求める。この結果、前記円方
程式から瞳中心Ｃ１が検出される（図９参照）。Next, using the difference in shading between the white-of-the-eye region 34 and the iris region 35, the Prewitt operator is applied to the eye region 32 to generate an edge image of the iris region 35 shown in FIG. 8(c), thereby extracting its contour (edge). A known Hough transform is then applied to the points forming the contour to obtain the circle equation of the iris region 35. The iris center C1 is obtained from this circle equation (see FIG. 9).
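The edge extraction step uses the Prewitt operator, which estimates horizontal and vertical gradients with 3 × 3 difference kernels. The sketch below computes the gradient magnitude for a grayscale image stored as a list of rows; leaving border pixels at zero and using the Euclidean magnitude are choices made here for illustration.

```python
def prewitt_magnitude(img):
    """Gradient magnitude using the 3x3 Prewitt kernels.

    Gx = [[-1, 0, 1]] * 3 (column difference summed over three rows) and
    Gy is its transpose. Border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(img[y + d][x + 1] - img[y + d][x - 1] for d in (-1, 0, 1))
            gy = sum(img[y + 1][x + d] - img[y - 1][x + d] for d in (-1, 0, 1))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge between dark (0) and bright (10) columns.
step = [[0, 0, 10, 10] for _ in range(4)]
edges = prewitt_magnitude(step)
```

Pixels whose magnitude exceeds a chosen threshold would form the contour point group passed on to the circle fitting step.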

【００５９】（瞳孔検出）次いで、Ｓ８において、検出
された瞳領域３５からカメラ用パソコン１４は瞳孔の中
心Ｃ２を検出する瞳孔検出を行う。このとき、瞳孔領域
３６は非常に小さいため、瞳領域３４までを検出してい
た画像データでは、瞳孔と虹彩の濃淡の違いを判別して
エッジ抽出を行うことができず、これに伴い瞳孔中心Ｃ
２を検出できない。このため、ビデオカメラ１１Ａ，１
１Ｂがズームアップされ、図９に示すように、目領域３
２を拡大した画像データが取得される。(Pupil Detection) Next, in S8, the camera personal computer 14 performs pupil detection, which detects the center C2 of the pupil from the detected iris region 35. Because the pupil region 36 is very small, the image data used for detection up to the iris region 35 does not allow the difference in shading between the pupil and the iris to be discriminated for edge extraction, and the pupil center C2 therefore cannot be detected. The video cameras 11A and 11B are accordingly zoomed in, and image data in which the eye region 32 is enlarged is acquired, as shown in FIG. 9.

【００６０】そして、瞳領域３５（虹彩）と瞳孔領域３
６の濃淡の違いを利用して、Prewitt オペレータを用
い、瞳孔領域３６のエッジ画像を生成することで、輪郭
（エッジ）を抽出する。その後、瞳の大きさに基づいて
瞳孔の大きさを推定し（例えば、瞳の１／３〜１／
５）、その推定結果を利用して、前記輪郭を構成する点
群に対して公知のハフ変換にて瞳孔領域３６の円方程式
を求める。このとき、瞳には、様々なものが映し出され
るため、前記Prewitt オペレータによる瞳孔領域３６の
エッジ抽出の際には、瞳孔領域３６以外の輪郭（エッ
ジ）が検出されるおそれがある。このため、瞳中心Ｃ１
近辺で検出されたエッジのみを用い、瞳孔領域３６の検
出精度を高めている。そして、前記円方程式から瞳孔中
心Ｃ２が検出される（図９参照）。Then, using the difference in shading between the iris region 35 and the pupil region 36, the Prewitt operator is applied to generate an edge image of the pupil region 36, thereby extracting its contour (edge). The size of the pupil is then estimated from the size of the iris (for example, 1/3 to 1/5 of the iris), and, using this estimate, a known Hough transform is applied to the points forming the contour to obtain the circle equation of the pupil region 36. Because various objects are reflected on the surface of the eye, edge extraction with the Prewitt operator may pick up contours (edges) other than that of the pupil region 36. Only edges detected near the iris center C1 are therefore used, which improves the detection accuracy of the pupil region 36. The pupil center C2 is obtained from the circle equation (see FIG. 9).
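A circle Hough transform restricted to a radius range and to edge points near C1 can be sketched as below. This is a simplified illustration, not the implementation of the specification: the integer accumulator, the 5-degree angular step, and all function and parameter names are assumptions.

```python
import math

def pupil_radius_range(outer_radius):
    """Radius range from the 1/5 to 1/3 size estimate given in the text."""
    return max(1, round(outer_radius / 5)), round(outer_radius / 3)

def hough_circle(edge_points, r_min, r_max, near=None, max_dist=None):
    """Vote for integer circle centers; return (cx, cy, r) with most votes.

    If near and max_dist are given, only edge points within max_dist of
    the reference point (for example, C1) take part in the vote.
    """
    if near is not None and max_dist is not None:
        edge_points = [p for p in edge_points
                       if math.hypot(p[0] - near[0], p[1] - near[1]) <= max_dist]
    votes = {}
    for x, y in edge_points:
        for r in range(r_min, r_max + 1):
            seen = set()  # each point votes at most once per cell
            for deg in range(0, 360, 5):
                a = math.radians(deg)
                key = (round(x - r * math.cos(a)), round(y - r * math.sin(a)), r)
                if key not in seen:
                    seen.add(key)
                    votes[key] = votes.get(key, 0) + 1
    return max(votes, key=votes.get) if votes else None
```

Limiting both the radius range and the spatial neighborhood shrinks the accumulator and keeps reflections elsewhere on the eye from contributing votes.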

【００６１】（視線決定（カメラ決定））次いで、Ｓ９
において、図９に示すように、カメラ用パソコン１４は
演算された瞳中心Ｃ１及び瞳孔中心Ｃ２から、両位置間
の距離、即ち、瞳中心Ｃ１に対する瞳孔中心Ｃ２のズレ
量を算出（計測）する。そして、算出したズレ量の結果
を各カメラ用パソコン１４は、メインパソコン１６に送
信する。なお、各カメラ用パソコン１４の時刻はメイン
パソコン１６に合わされているため、各カメラ用パソコ
ン１４から送信されるズレ量はそれぞれ同時刻にキャプ
チャした画像データから算出されたものになっている。(Gaze Determination (Camera Determination)) Next, in S9, as shown in FIG. 9, the camera personal computer 14 calculates (measures), from the computed iris center C1 and pupil center C2, the distance between the two positions, that is, the amount of deviation of the pupil center C2 from the iris center C1. Each camera personal computer 14 then transmits its calculated deviation to the main personal computer 16. Because the clock of each camera personal computer 14 is synchronized with that of the main personal computer 16, the deviations transmitted by the camera personal computers 14 are all calculated from image data captured at the same time.

【００６２】Ｓ１０において、メインパソコン１６は、
ビデオカメラ１１Ａのカメラ用パソコン１４から受信し
たズレ量と、ビデオカメラ１１Ｂのカメラ用パソコン１
４から受信したズレ量とを比較し、視線が向けられてい
るビデオカメラを決定する。このとき前記ズレ量が小さ
い方を視線が向けられているビデオカメラとする。視線
が決定すると、メインパソコン１６は、視線が向けられ
たビデオカメラに対応する電気機器１７へコマンド信号
を出力する（Ｓ１１）。このようにして視線は検出され
る。In S10, the main personal computer 16 compares the deviation received from the camera personal computer 14 of the video camera 11A with the deviation received from the camera personal computer 14 of the video camera 11B, and determines the video camera at which the line of sight is directed: the camera with the smaller deviation. Once the line of sight is determined, the main personal computer 16 outputs a command signal to the electric device 17 corresponding to that video camera (S11). The line of sight is detected in this way.
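Steps S9 and S10 reduce to a Euclidean distance and a minimum search, as in the sketch below. The function names and the dictionary keyed by camera id are illustrative; the coordinates are assumed to be in pixels of the size-normalized image.

```python
import math

def deviation(c1, c2):
    """S9: distance between center C1 and pupil center C2, each (x, y)."""
    return math.hypot(c2[0] - c1[0], c2[1] - c1[1])

def gazed_camera(deviations):
    """S10: return the camera id with the smallest deviation."""
    return min(deviations, key=deviations.get)

deviations = {
    "11A": deviation((50.0, 40.0), (51.0, 40.0)),  # nearly centered
    "11B": deviation((50.0, 40.0), (53.0, 44.0)),  # clearly off-center
}
print(gazed_camera(deviations))
```

Normalizing the region size before measuring matters here, because otherwise a camera that is simply farther away would report a smaller pixel distance.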

【００６３】従って、上記実施形態によれば、以下のよ
うな効果を得ることができる。（１）上記実施形態では、カメラ用パソコン１４は、推
定した顔向きの角度に基づいて、画像データが所定角度
範囲内の正面顔を捉えているか否かを判定し、その条件
を満たす画像データの目領域３２における瞳領域３５の
大きさを正規化した後に、瞳中心Ｃ１と瞳孔中心Ｃ２を
算出し、両位置のズレ量を算出する。そして、メインパ
ソコン１６は、各ビデオカメラ１１Ａ，１１Ｂに対応し
たそれぞれのズレ量を比較し、そのズレ量が最も小さい
ビデオカメラ１１Ａに視線を向けているという視線推定
を行う。このため、従来と異なり、頭部に装置を装着す
ることなく、広い室内空間でも好適に視線を検出でき
る。また、正面顔を撮像するビデオカメラが複数存在す
る場合でも、ズレ量の比較により、視線が向けられてい
るカメラを正確に推定できる。Therefore, according to the above embodiment, the following effects can be obtained. (1) In the above embodiment, the camera personal computer 14 determines, from the estimated face angle, whether the image data captures a frontal face within the predetermined angle range. For image data that satisfies this condition, it normalizes the size of the iris region 35 in the eye region 32, calculates the iris center C1 and the pupil center C2, and calculates the deviation between the two positions. The main personal computer 16 then compares the deviations for the video cameras 11A and 11B and estimates that the line of sight is directed at the video camera 11A, whose deviation is the smallest. Unlike the related art, the line of sight can thus be suitably detected even in a large indoor space without a device being worn on the head. Moreover, even when a plurality of video cameras capture a frontal face, comparing the deviations allows the camera at which the line of sight is directed to be estimated accurately.

【００６４】（２）上記実施形態では、瞳孔検出を目領
域３２を拡大した画像データを取得した上で行った。こ
のため、瞳孔と虹彩の濃淡の違いを確実に判別すること
ができ、好適に瞳孔検出を実現できる。(2) In the above embodiment, pupil detection is performed after acquiring image data in which the eye region 32 is enlarged. For this reason, the difference between the pupil and the iris can be reliably determined, and pupil detection can be suitably performed.

【００６５】（３）上記実施形態では、視線を検出する
ために行う瞳孔検出を、ズームアップしたビデオカメラ
１１で捉えた画像データに対して、Prewitt オペレータ
を用いて輪郭（エッジ）を抽出し、さらにその点群に対
してハフ変換を行うことで実現した。このため、例えば
各ビデオカメラ１１に光源を設け、その光源からそれぞ
れ赤外光を照射し、瞳領域３５（瞳孔）から反射した反
射光に基づいて、瞳孔中心Ｃ２を検出する場合と異な
り、赤外光が乱れ飛び合い、赤外光同士がノイズとなる
という問題が発生することはなく、簡便に瞳孔中心Ｃ２
の検出ができる。(3) In the above embodiment, the pupil detection performed to detect the line of sight is realized by extracting the contour (edge) with the Prewitt operator from image data captured by the zoomed-in video camera 11 and then applying the Hough transform to the resulting points. Unlike a configuration in which, for example, each video camera 11 is provided with a light source that emits infrared light and the pupil center C2 is detected from the light reflected by the iris region 35 (pupil), this avoids the problem of infrared beams from different sources crossing and becoming noise for one another, so the pupil center C2 can be detected simply.

【００６６】（４）上記実施形態では、視線を検出する
ために、瞳中心Ｃ１を検出し、更に瞳孔中心Ｃ２を検出
する。そして、視線の最終判断において、瞳中心Ｃ１と
瞳孔中心Ｃ２とのズレ量に基づいて、どのビデオカメラ
に視線を向けているかを決定した。このため、目領域３
２内における他の部位同士のズレ量を元に視線を検出す
る場合と異なり、最も正確に視線の方向を検出できる。(4) In the above embodiment, the iris center C1 and then the pupil center C2 are detected in order to detect the line of sight, and the final decision as to which video camera the line of sight is directed at is based on the deviation between the iris center C1 and the pupil center C2. Unlike detection based on the deviation between other parts within the eye region 32, this detects the direction of the line of sight most accurately.

【００６７】（５）上記実施形態では、瞳孔検出に際し
て、瞳中心Ｃ１近辺で検出されたエッジのみを用いて、
ハフ変換で瞳孔領域３６の円方程式を求めている。通
常、瞳孔は、瞳中心Ｃ１の近辺に位置することが多いた
め瞳孔領域３６の検出精度を高めることができる。(5) In the above embodiment, only the edges detected near the iris center C1 are used when the circle equation of the pupil region 36 is obtained by the Hough transform for pupil detection. Because the pupil usually lies near the iris center C1, this improves the detection accuracy of the pupil region 36.

【００６８】なお、上記実施形態は以下のように変更し
てもよい。・上記実施形態において、瞳孔検出を以下のような手法
で行ってもよい。即ち、赤外光を照射するための光源を
ビデオカメラ１１に備える。赤外光を用いた場合、瞳孔
領域は白く映し出される。このとき、輝度の高い範囲が
瞳孔領域３６に相当し、輝度の低い範囲が虹彩領域に相
当する。そして、閾値に基づく２値化により、瞳孔領域
３６（輝度の高い（明るい）範囲）を検出する。そし
て、前記瞳孔領域３６の重心を算出し、その重心を瞳孔
中心Ｃ２とする。なお、この際も、瞳孔領域３６を好適
に捉えるためにビデオカメラ１１によるズームアップは
行われる。The above embodiment may be modified as follows. In the above embodiment, pupil detection may be performed by the following method. That is, the video camera 11 is provided with a light source for irradiating infrared light. When infrared light is used, the pupil region appears white. At this time, the range with high luminance corresponds to the pupil region 36, and the range with low luminance corresponds to the iris region. Then, the pupil region 36 (high-brightness (bright) range) is detected by binarization based on the threshold. Then, the center of gravity of the pupil region 36 is calculated, and the center of gravity is set as the pupil center C2. Also at this time, the zoom-up by the video camera 11 is performed in order to appropriately capture the pupil region 36.
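The infrared variant replaces edge fitting with thresholding and a center-of-gravity calculation, which can be sketched as follows. The function name, the use of `>=` for the binarization, and the toy image are assumptions for illustration.

```python
def bright_centroid(img, threshold):
    """Binarize by threshold and return the centroid (x, y) of bright pixels.

    img is a list of rows of luminance values. Pixels with a value at or
    above threshold are treated as the bright pupil region. Returns None
    if no pixel passes the threshold.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if value >= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count

# Bright 3x2 block (value 200) on a dark background (value 10).
img = [[10] * 5 for _ in range(5)]
for yy in (2, 3):
    for xx in (1, 2, 3):
        img[yy][xx] = 200
c2 = bright_centroid(img, threshold=128)
```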

【００６９】このようにした場合、瞳領域３５に赤外光
が照射されるタイミングはメインパソコン１６によって
制御される。即ち、Ｓ４において、各カメラ用パソコン
１４で推定された相対顔方向の角度Ｆがメインパソコン
１６に入力され、その角度Ｆが所定条件を満たしている
か否かの判断（Ｓ５の処理）がメインパソコン１６で行
われる。そして、メインパソコン１６は所定条件を満た
しているカメラ用パソコン１４に対して制御信号を出力
し、所定のビデオカメラ１１Ａ，１１Ｂにおいて、各光
源から順次赤外光を照射させるとともに、そのカメラ１
１Ａ，１１Ｂに対応するカメラ用パソコン１４にＳ６〜
Ｓ９の処理を再び行わせる。なお、メインパソコン１６
から制御信号出力されなかったカメラ用パソコン１４に
ついては、今回の画像データに関してはＳ６以降の処理
は行わない。In this case, the timing at which the iris region 35 is irradiated with infrared light is controlled by the main personal computer 16. Specifically, the relative face angle F estimated by each camera personal computer 14 in S4 is input to the main personal computer 16, which judges whether the angle F satisfies the predetermined condition (the processing of S5). The main personal computer 16 then outputs a control signal to each camera personal computer 14 that satisfies the condition, causes the light sources of the corresponding video cameras 11A and 11B to emit infrared light one after another, and has the camera personal computers 14 of those cameras 11A and 11B perform the processing of S6 to S9 again. A camera personal computer 14 that receives no control signal from the main personal computer 16 does not perform S6 and the subsequent steps for the current image data.

【００７０】このようにしても、各ビデオカメラ１１
Ａ，１１Ｂの光源からタイミングが制御された赤外光が
照射されるため、赤外光が乱れ飛び合い、赤外光同士が
ノイズとなることはなく、簡便に瞳孔中心Ｃ２の検出が
できる。また、赤外光により、瞳領域内において瞳孔領
域を明確に判別できる。この場合、メインパソコン１６
が判定手段に相当する。With this configuration as well, the light sources of the video cameras 11A and 11B emit infrared light with controlled timing, so the infrared beams do not cross and become noise for one another, and the pupil center C2 can be detected simply. The infrared light also makes the pupil region clearly distinguishable within the iris region. In this case, the main personal computer 16 corresponds to the determination means.

【００７１】・また、赤外光を用いた場合でも、瞳孔領
域３６を、Prewitt オペレータを用いたエッジ抽出及び
ハフ変換にて瞳孔中心Ｃ２を検出してもよい。・上記実施形態では、視線の最終判断は、瞳中心Ｃ１と
瞳孔中心Ｃ２とのズレ量に基づいて行われたが、瞳中心
Ｃ１又は瞳孔中心Ｃ２の代わりに瞳領域３５における他
の部位を用いて、ズレ量を求めてもよい。Even when infrared light is used, the pupil center C2 may be detected from the pupil region 36 by edge extraction with the Prewitt operator and the Hough transform. In the above embodiment, the final determination of the line of sight is based on the deviation between the iris center C1 and the pupil center C2, but the deviation may instead be obtained using another part of the iris region 35 in place of the iris center C1 or the pupil center C2.

【００７２】・上記実施形態では、メインパソコン１６
と各カメラ用パソコン１４との通信をイーサネットを介
したソケット通信にて行っていたが、無線電波にて行っ
てもよい。In the above embodiment, the main personal computer 16 and each camera personal computer 14 communicate by socket communication over Ethernet, but they may instead communicate by radio.

【００７３】・上記実施形態では、瞳領域３５の円方程
式の算出をハフ変換で行ったが、以下の手法で行っても
よい。即ち、Prewitt オペレータを用いて抽出された輪
郭を構成する点群から公知の４点サンプリング法で４点
をサンプリングする。そして、その４点を用いて、公知
の最小二乗法によって瞳領域３５の円方程式を求める。In the above embodiment, the circle equation of the iris region 35 is calculated by the Hough transform, but the following method may be used instead. Four points are sampled, by a known four-point sampling method, from the points forming the contour extracted with the Prewitt operator. The circle equation of the iris region 35 is then obtained from those four points by a known least squares method.
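A least squares circle fit of the kind mentioned here can be sketched with the algebraic form x^2 + y^2 + D*x + E*y + F = 0, which is linear in D, E, and F. The specification does not say which least squares formulation is used, so this algebraic fit is an assumption; the sampling step is not shown, and the function accepts any three or more non-collinear points.

```python
def fit_circle(points):
    """Least squares fit of x^2 + y^2 + D*x + E*y + F = 0.

    Returns (cx, cy, r). Raises ValueError for degenerate input.
    """
    # Normal equations M p = v for p = (D, E, F).
    m = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for x, y in points:
        row = (x, y, 1.0)
        rhs = -(x * x + y * y)
        for i in range(3):
            v[i] += row[i] * rhs
            for j in range(3):
                m[i][j] += row[i] * row[j]
    # Gauss-Jordan elimination with partial pivoting.
    aug = [m[i] + [v[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("points are degenerate (for example, collinear)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    d, e, f = (aug[i][3] / aug[i][i] for i in range(3))
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, (cx * cx + cy * cy - f) ** 0.5

# Four points on a circle of radius 5 centered at (2, 3).
cx, cy, r = fit_circle([(7, 3), (2, 8), (-3, 3), (2, -2)])
```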

【００７４】・上記実施形態では、Ｓ７及びＳ８におけ
る瞳検出、瞳孔検出を、Ｓ６において検出された目領域
３２のうち何れか一方の目領域３２について行ったが、
右・左、両方の目領域３２に対して行ってもよい。この
場合、各目領域３２において算出されたズレ量の平均値
が算出され、その値が、各画像データのズレ量とされ、
比較される。このようにすれば、片目について、ズレ量
を算出する場合と比較して、高精度に視線検出を行うこ
とができる。In the above embodiment, the iris detection and pupil detection of S7 and S8 are performed on only one of the eye regions 32 detected in S6, but they may be performed on both the right and left eye regions 32. In that case, the average of the deviations calculated for the two eye regions 32 is taken as the deviation for each image data and used in the comparison. This allows the line of sight to be detected more accurately than when the deviation is calculated for one eye only.

【００７５】・上記実施形態では、視線検出を瞳中心Ｃ
１と瞳孔中心Ｃ２のズレ量に基づいて行ったが、瞳孔中
心Ｃ２の代わりに、図１０に示すように、目領域３２の
重心Ｃ３を用いてもよい。この場合、Ｓ６において、目
領域３２が検出された後に、その目領域３２を拡大又は
縮小して所定の大きさに正規化し、その正規化後の目領
域３２に対して、重心Ｃ３を求める。そして、Ｓ９にお
いて、瞳中心Ｃ１と目領域３２の重心Ｃ３とのズレ量を
算出し、視線を推定する。このようにすれば、瞳孔中心
Ｃ２を利用する場合と比較して、ビデオカメラ１１をズ
ームアップする必要なしに視線検出ができる。即ち、瞳
孔を検出できない低解像度の画像データからでも簡単な
演算でズレ量を求めることができる。なお、このように
した場合、Ｓ８は必要なくなる。In the above embodiment, the line of sight is detected from the deviation between the iris center C1 and the pupil center C2, but the center of gravity C3 of the eye region 32 may be used instead of the pupil center C2, as shown in FIG. 10. In this case, after the eye region 32 is detected in S6, it is enlarged or reduced so as to be normalized to a predetermined size, and the center of gravity C3 of the normalized eye region 32 is obtained. In S9, the deviation between the iris center C1 and the center of gravity C3 of the eye region 32 is calculated and the line of sight is estimated. Compared with using the pupil center C2, this allows the line of sight to be detected without zooming in the video camera 11; that is, the deviation can be obtained by a simple calculation even from low-resolution image data in which the pupil cannot be detected. In this case, S8 is unnecessary.

【００７６】・上記実施形態では、複数台のビデオカメ
ラ１１が所定角度内の正面顔を撮像したとして、各カメ
ラ用パソコン１４で算出された瞳中心Ｃ１と瞳孔中心Ｃ
２とのズレ量をメインパソコン１６が比較することで、
視線を検出したが、ズレ量同士の比較ではなく、閾値と
の比較で視線を検出してもよい。すなわち、例えば、１
台のビデオカメラ１１に対応するカメラ用パソコン１４
のみが、相対顔方向の角度Ｆが所定角度内であると判断
した場合は、メインパソコン１６はカメラ用パソコン１
４から送信されたズレ量と予め設定された閾値とを比較
する。そして、前記閾値を超えた場合に、検出対象者Ｈ
がビデオカメラ１１に視線を向けているという視線検出
を行う。In the above embodiment, on the assumption that a plurality of video cameras 11 have captured a frontal face within the predetermined angle, the main personal computer 16 detects the line of sight by comparing the deviations between the iris center C1 and the pupil center C2 calculated by the respective camera personal computers 14. The line of sight may instead be detected by comparison with a threshold rather than by comparing the deviations with one another. For example, when only the camera personal computer 14 of a single video camera 11 judges that the relative face angle F is within the predetermined angle, the main personal computer 16 compares the deviation transmitted from that camera personal computer 14 with a preset threshold. When the threshold is exceeded, it is determined that the detection subject H is directing the line of sight at the video camera 11.

【００７７】このようにしても、好適に視線検出を行う
ことができる。また、複数のカメラ用パソコン１４から
ズレ量がメインパソコン１６に送信された場合でも、各
ズレ量をそれぞれ閾値と比較して視線検出を行うことも
可能である。また、上記実施形態では、複数台のビデオ
カメラ１１を設置したが、１台でもよい。Even in such a case, it is possible to preferably perform the gaze detection. Further, even when the displacement amounts are transmitted from the plurality of camera personal computers 14 to the main personal computer 16, it is possible to detect the line of sight by comparing each displacement amount with a threshold value. In the above embodiment, a plurality of video cameras 11 are installed, but one video camera 11 may be installed.

【００７８】次に、上記実施形態及び各別例から把握で
きる技術的思想について、それらの効果と共に以下に記
載する。（１）請求項１乃至請求項４のうちいずれか１項の視線
検出装置において、前記所定ポイントは複数あり、前記
判定手段が複数の画像データを正面顔であると判定した
際は、前記視線推定手段は、距離計測手段が計測した距
離を比較することで視線を検出する視線検出装置。この
ようにすれば、複数のポイントから検出対象者を撮像し
た場合でも、好適に視線検出を行うことができる。Next, technical ideas that can be understood from the above embodiment and its modifications are described below together with their effects. (1) The gaze detection device according to any one of claims 1 to 4, wherein there are a plurality of the predetermined points, and, when the determination means determines that a plurality of image data show a frontal face, the gaze estimation means detects the line of sight by comparing the distances measured by the distance measuring means. With this configuration, the line of sight can be suitably detected even when the detection subject is imaged from a plurality of points.

【００７９】（２）請求項３に記載の視線検出装置にお
いて、前記瞳孔検出手段による瞳孔検出は、前記撮像手
段が検出対象者の目領域を拡大撮像した画像データに基
づいて行われる視線検出装置。このようにすれば、簡便
に瞳孔検出を実現できる。(2) The gaze detection device according to claim 3, wherein the pupil detection by the pupil detection means is performed on image data in which the imaging means has captured an enlarged image of the eye region of the detection subject. With this configuration, pupil detection can be realized simply.

【００８０】[0080]

【発明の効果】以上詳述したように、請求項１の発明に
よれば、装置を検出対象者に装着させることなく、広い
室内空間でも好適に視線を検出することができる。As described above in detail, according to the first aspect of the present invention, it is possible to detect a line of sight suitably even in a large indoor space without mounting the apparatus on a detection target person.

【００８１】請求項２の発明によれば、請求項１の発明
の効果に加えて、第１所定部位を、瞳中心とすることで
視線推定手段は視線の推定を好適に実現できる。請求項
３の発明によれば、請求項１又は請求項２の発明の効果
に加えて、第２所定部位を、瞳孔中心とすることで視線
推定手段は視線の推定を的確にできる。According to the invention of claim 2, in addition to the effect of the invention of claim 1, setting the first predetermined part to the iris center allows the gaze estimation means to suitably estimate the line of sight. According to the invention of claim 3, in addition to the effects of the invention of claim 1 or claim 2, setting the second predetermined part to the pupil center allows the gaze estimation means to estimate the line of sight accurately.

【００８２】請求項４の発明によれば、請求項１又は請
求項２の発明の効果に加えて、第２所定部位を、目領域
の重心とすることで視線推定手段は低解像度の画像デー
タからも視線の推定を好適に実現できる。According to the invention of claim 4, in addition to the effects of the invention of claim 1 or claim 2, setting the second predetermined part to the center of gravity of the eye region allows the gaze estimation means to suitably estimate the line of sight even from low-resolution image data.

【００８３】請求項５の発明によれば、請求項１と同様
の効果を奏す。請求項６の発明によれば、請求項５の発
明の効果に加えて、第２所定部位を、瞳中心とすること
で視線推定行程において視線の推定を好適に実現でき
る。According to the invention of claim 5, the same effect as that of claim 1 is obtained. According to the invention of claim 6, in addition to the effect of the invention of claim 5, setting the second predetermined part to the iris center allows the line of sight to be suitably estimated in the gaze estimation step.

【００８４】請求項７の発明によれば、請求項５又は請
求項６の発明の効果に加えて、第１所定部位を、瞳孔中
心とすることで視線推定行程において視線の推定を的確
にできる。According to the invention of claim 7, in addition to the effects of the invention of claim 5 or claim 6, setting the first predetermined part to the pupil center allows the line of sight to be estimated accurately in the gaze estimation step.

【００８５】請求項８の発明によれば、請求項５又は請
求項６の発明の効果に加えて、第２所定部位を、目領域
の重心とすることで視線推定行程において低解像度の画
像データからも視線の推定を好適に実現できる。According to the invention of claim 8, in addition to the effects of the invention of claim 5 or claim 6, setting the second predetermined part to the center of gravity of the eye region allows the line of sight to be suitably estimated even from low-resolution image data in the gaze estimation step.

[Brief description of the drawings]

【図１】本発明に係る実施形態の視線検出装置の構成を
示すブロック図。FIG. 1 is a block diagram showing a configuration of a gaze detection device according to an embodiment of the present invention.

【図２】同じくフローチャート。FIG. 2 is a flowchart for the same embodiment.

【図３】ビデオカメラが撮像した画像データの説明図。FIG. 3 is an explanatory diagram of image data captured by a video camera.

【図４】肌色基準で抽出した画像データの説明図。FIG. 4 is an explanatory diagram of image data extracted on the basis of skin color.

【図５】判別分析に係るクラスを示した概念図。FIG. 5 is a conceptual diagram showing classes related to discriminant analysis.

【図６】パターン認識の概念図。FIG. 6 is a conceptual diagram of pattern recognition.

【図７】ビデオカメラの光軸（カメラ方向）と相対顔方
向とのなす角度に対する学習データ取得の説明図。FIG. 7 is an explanatory diagram of learning data acquisition for an angle between an optical axis (camera direction) of a video camera and a relative face direction.

【図８】（ａ）、（ｃ）は瞳検出を示す説明図、（ｂ）
は瞳検出における水平射影ヒストグラムを示す説明図。FIGS. 8(a) and 8(c) are explanatory diagrams showing iris (瞳) detection, and FIG. 8(b) is an explanatory diagram showing a horizontal projection histogram used in iris detection.

【図９】目領域を示した説明図。FIG. 9 is an explanatory diagram showing an eye area.

【図１０】別の実施形態における目領域を示した説明
図。FIG. 10 is an explanatory diagram showing an eye region according to another embodiment.

[Explanation of symbols]

Ｈ…検出対象者、Ｃ１…瞳中心、Ｃ２…瞳孔中心、１１
…ビデオカメラ（撮像手段）、１４…カメラ用パソコン
（顔向き推定手段、判定手段、目領域検出手段、瞳孔検
出手段、瞳検出手段）、１６…メインパソコン（視線推
定手段）、３１…顔領域、３２…目領域。H: person to be detected; C1: iris center (瞳中心); C2: pupil center (瞳孔中心); 11: video camera (imaging means); 14: camera personal computer (face direction estimating means, determining means, eye area detecting means, pupil detecting means, iris detecting means); 16: main personal computer (gaze estimating means); 31: face area; 32: eye area.

フロントページの続き Ｆターム(参考） (Continued from the front page: F-terms, reference)
5B057 AA20 BA02 BA13 CA01 CA08 CA13 CA16 CE06 CE09 DA08 DA20 DB03 DB06 DB09 DC03 DC06 DC08 DC16 DC32
5L096 AA02 AA09 BA08 CA05 EA35 FA34 FA37 FA60 FA67 JA03 JA09

Claims

[Claims]

1. A gaze detection device comprising: imaging means for imaging a detection target person from a predetermined point; face direction estimating means for detecting a face area of the detection target person from image data captured by the imaging means and estimating a face direction based on the detected face area; determining means for determining whether or not the face direction estimated by the face direction estimating means is a frontal face, which includes a predetermined angle range; eye area detecting means for detecting an eye area of the frontal face in the image data determined by the determining means to be a frontal face; distance measuring means for measuring a distance between a first predetermined part and a second predetermined part in the eye area detected by the eye area detecting means; and gaze estimating means for estimating a gaze based on the distance measured by the distance measuring means.

2. The gaze detection device according to claim 1, further comprising pupil detecting means for detecting a pupil center in the eye area, wherein the first predetermined part is the pupil center.

3. The gaze detection device according to claim 1, further comprising a pupil detection unit that detects a pupil center in the eye area, wherein the second predetermined part is the pupil center.

4. The gaze detection device according to claim 1, further comprising center-of-gravity detecting means for detecting the position of the center of gravity of the eye area, wherein the second predetermined part is the center of gravity of the eye area.

5. A gaze detection method comprising: an imaging step of imaging a detection target person from a predetermined point; a face direction estimation step of detecting a face area of the detection target person from the captured image data and estimating a face direction based on the detected face area; a determination step of determining whether or not the estimated face direction is a frontal face, which includes a predetermined angle range; an eye area detection step of detecting an eye area of the frontal face in the image data determined to be a frontal face; a distance measurement step of measuring a distance between a first predetermined part and a second predetermined part in the detected eye area; and a gaze estimation step of estimating a gaze based on the measured distance.

6. The gaze detection method according to claim 5, further comprising a pupil detection step of detecting a pupil center in the eye area, wherein the first predetermined part is the pupil center.

7. The gaze detection method according to claim 5, further comprising a pupil detection step of detecting a pupil center in the eye area, wherein the second predetermined part is the pupil center.

8. The gaze detection method according to claim 5, further comprising a center-of-gravity detecting step of detecting the position of the center of gravity of the eye area, wherein the second predetermined part is the center of gravity of the eye area.
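For illustration only, the steps recited in claims 5 to 8, combined with the comparison across cameras described in the abstract, might be sketched as follows. This is a simplified sketch under stated assumptions and not the disclosed implementation: the Observation record, the 10-degree frontal threshold, the normalization by eye width, and all function names are hypothetical. Face, eye, and pupil detection are assumed to have been performed elsewhere.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Observation:
    """Per-camera measurements (hypothetical record for this sketch)."""
    camera_id: int
    face_angle_deg: float  # estimated face direction relative to the camera's optical axis
    first_part: Point      # e.g. pupil center
    second_part: Point     # e.g. eye-region center of gravity
    eye_width_px: float    # used to normalize for apparent eye size (assumption)


def is_frontal(obs: Observation, max_angle_deg: float = 10.0) -> bool:
    """Determination step: accept faces within an assumed +/- angle range."""
    return abs(obs.face_angle_deg) <= max_angle_deg


def normalized_distance(obs: Observation) -> float:
    """Distance measurement step, scaled by eye width so cameras are comparable."""
    if obs.eye_width_px <= 0:
        raise ValueError("eye width must be positive")
    dx = obs.first_part[0] - obs.second_part[0]
    dy = obs.first_part[1] - obs.second_part[1]
    return hypot(dx, dy) / obs.eye_width_px


def estimate_gaze_target(observations: Sequence[Observation],
                         max_angle_deg: float = 10.0) -> Optional[int]:
    """Gaze estimation step: pick the camera with the smallest distance."""
    candidates = [o for o in observations if is_frontal(o, max_angle_deg)]
    if not candidates:
        return None  # no camera currently sees a frontal face
    return min(candidates, key=normalized_distance).camera_id


# Usage with made-up measurements from three cameras.
observations = [
    Observation(1, 5.0, (10.0, 5.0), (12.0, 5.0), 20.0),
    Observation(2, -3.0, (10.0, 5.0), (10.5, 5.0), 25.0),
    Observation(3, 40.0, (10.0, 5.0), (10.0, 5.0), 22.0),  # not frontal, ignored
]
target = estimate_gaze_target(observations)
```

In this example camera 3 is excluded by the frontal-face determination even though its measured distance is zero, and camera 2 is selected because its normalized distance is the smallest among the remaining candidates.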