JP7269617B2

JP7269617B2 - Face image processing device, image observation system, and pupil detection system

Info

Publication number: JP7269617B2
Application number: JP2018225770A
Authority: JP
Inventors: 嘉伸海老澤
Original assignee: Shizuoka University NUC
Current assignee: Shizuoka University NUC
Priority date: 2018-11-30
Filing date: 2018-11-30
Publication date: 2023-05-09
Anticipated expiration: 2038-11-30
Also published as: JP2020081756A

Description

The present invention relates to a face image processing device, an image observation system, and a pupil detection system that process face images.

In recent years, face image processing devices that process face images obtained with a video camera have become widespread (see, for example, Patent Documents 1 and 2 below). Such a device processes the face image to obtain the three-dimensional coordinates of the pupil and the angle of the line of sight relative to the straight line connecting the camera to the pupil center, and detects the line of sight from these. These devices can determine the three-dimensional coordinates of the pupil as long as the subject's eye appears in the images of two or more cameras.

Patent Document 1: Japanese Patent No. 4500992
Patent Document 2: Japanese Patent No. 4517049

However, with the devices described in Patent Documents 1 and 2, it is difficult for the camera to keep capturing the subject's face image when the subject moves suddenly or over a wide range, or when the subject's posture changes suddenly or over a wide range. A camera with a wide field of view could be used, but that imposes constraints such as the need for a higher camera resolution. It is also conceivable to control, in response to the subject's movement or posture, the position of the camera's optical center (three degrees of freedom) or the camera's orientation (expressed by the direction of the optical axis and the rotation angle about the optical axis). In that case, however, image processing that depends on the camera position or orientation tends not to be performed accurately on the camera images. Likewise, when the camera is attached to a movable object such as an automobile steering wheel, image processing that depends on the camera position tends not to be performed accurately.

The present invention has been made in view of the above problems. Its object is to provide a face image processing device, an image observation system, and a pupil detection system capable of performing highly accurate image processing on images of a subject based on the position or orientation of the camera, even when the position or orientation of the camera changes.

To solve the above problems, a face image processing device according to one aspect of the present invention includes at least one camera that acquires a face image by imaging a subject's face and a fixed point independent of the face, and a calculation unit that performs image processing on the face image. The calculation unit detects the position or orientation of the camera based on the position of the fixed point in the face image, and performs the image processing based on that position or orientation.

According to the face image processing device of this aspect, the position or orientation of the camera is detected from a face image that captures both the subject's face and a fixed point independent of the face, so the position or orientation of the camera at the moment the face image was acquired can be detected without any discrepancy. Performing image processing based on the detected position or orientation then allows highly accurate image processing of the subject's image even when the position or orientation of the camera changes.

The device may further include a drive system that controls the position or orientation of the camera. With this configuration, highly accurate image processing based on the position or orientation of the camera can be applied to the subject's image when the drive system is used to control the camera's position or orientation.

The camera may image three or more fixed points. In this case, the position or orientation of the camera at the time the face image was acquired can be detected accurately.

The calculation unit may detect both the position and the orientation of the camera based on the positions of the fixed points, and perform the image processing using both. In this way, highly accurate image processing based on the position and orientation of the camera can be applied to the subject's image even when both the position and the orientation of the camera change.

The calculation unit may use the image processing to calculate the direction or distance of the subject's pupil as seen from the camera. In this case, processing the subject's image based on the position or orientation of the camera allows the direction or distance of the subject's pupil to be calculated with high accuracy.

The calculation unit may use the image processing to calculate the three-dimensional position of the subject's pupil. In this case as well, processing the subject's image based on the position or orientation of the camera allows the three-dimensional position of the subject's pupil to be calculated with high accuracy.

The calculation unit may detect the position of the subject's pupil in the face image by performing position correction on a plurality of face images taken at different times, with the correction reflecting the change in the position or orientation of the camera. In this case, applying position correction to the subject's images based on the position or orientation of the camera allows the position of the pupil in the subject's face image to be detected with high accuracy.

The calculation unit may correct blur in the face image based on the position and/or orientation of the camera. In this case, applying blur correction to the subject's image based on the position and/or orientation of the camera allows blur in the subject's image to be corrected with high accuracy.

An image observation system according to another aspect of the present invention includes the face image processing device of the above aspect, an optical system that detects the position and orientation of the head of an observer other than the subject, and a display device that displays the face image to the observer. The drive system controls the position and orientation of the camera based on the position and orientation of the head, and the calculation unit causes the display device to display the blur-corrected face image.

According to the image observation system of this aspect, the position and orientation of the camera are controlled according to the position and orientation of the observer's head, and the face image of the subject acquired by that camera is displayed on the display device. At this time, the face image processing device detects the position or orientation of the camera and corrects blur in the face image based on it. As a result, the subject's face can be displayed on the display device with its blur stably corrected.

The calculation unit may enlarge or reduce the face image based on the position or orientation of the camera before displaying it on the display device. With this configuration, the subject's face can be displayed on the display device at a size suited to the position or orientation of the camera.

A pupil detection system according to another aspect of the present invention includes the face image processing device of the above aspect. The camera is attached to a movable object operated by the subject and images the subject's face and the fixed point, and the calculation unit detects the subject's pupil.

According to the pupil detection system of this aspect, even when the position of the camera changes because the movable object is operated, the subject's pupil can be detected with high accuracy by applying image processing to the subject's face image based on the position of the camera.

The calculation unit may detect the position or shape of the subject's pupil in the face image by performing position correction on a plurality of face images taken at different times, with the correction reflecting the change in the position or orientation of the camera. In this case, applying position correction to the subject's face images based on the position or orientation of the camera allows the position or shape of the subject's pupil in the face image to be detected with high accuracy.

According to the present invention, highly accurate image processing can be performed on images of a subject based on the position or orientation of the camera, even when the position or orientation of the camera changes.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a side view showing the face image processing device according to the first embodiment.
FIG. 2 is a front view showing the face image processing device according to the first embodiment.
FIG. 3 is a plan view showing the face image processing device according to the first embodiment.
FIG. 4 is a diagram showing the positional relationship between the markers and the subject imaged by the face image processing device according to the first embodiment.
FIG. 5 is a diagram showing the hardware configuration of the computer according to the first embodiment.
FIG. 6 is a block diagram showing the functional configuration of the computer according to the first embodiment.
FIG. 7 is a plan view showing the configuration of a camera drive system according to a modification.
FIG. 8 is a plan view showing the configuration of a camera drive system according to a modification.
FIG. 9 is a side view showing the configuration of the image observation system according to the second embodiment.
FIG. 10 is a block diagram showing the functional configuration of the computer according to the second embodiment.
FIG. 11 is a diagram showing a face image before and after processing by the computer according to the second embodiment.
FIG. 12 is a side view showing another usage form of the image observation system according to the second embodiment.
FIG. 13 is a side view showing the configuration of the face image processing device that constitutes the vehicle monitoring system according to the third embodiment.
FIG. 14 is a diagram showing the arrangement of the face image acquisition stereo camera according to the third embodiment.

Preferred embodiments of the face image processing device according to the present invention are described in detail below with reference to the drawings. In the description of the drawings, identical or corresponding parts are given the same reference signs, and duplicate descriptions are omitted.

[Configuration of the face image processing device according to the first embodiment]
First, the overall configuration of the face image processing device 1 according to the embodiment is described with reference to FIGS. 1 to 3. FIG. 1 is a side view, FIG. 2 is a front view, and FIG. 3 is a plan view of the face image processing device 1. The face image processing device 1 is a group of devices that acquires a face image by imaging a subject Sp and performs image processing on that face image. As an example, it is described here as a device that detects the subject Sp's gaze point on an inspection object So while the subject Sp visually inspects the inspection object So, which is a transparent object such as a glass plate or an opaque object such as an electronic circuit board. Although the face image processing device 1 is illustrated as a device that detects the line of sight from the detection results of the subject Sp's pupils and corneal reflections, it may instead be a device that detects the subject Sp's face itself, various facial parts such as the nostrils, mouth, ears, eyebrows, nose tip, or head, or objects attached to the subject Sp's face such as glasses, goggles, or earphones. The face image processing device 1 includes a marker detection camera 3, a face image acquisition stereo camera 5, a camera drive system 7, and a computer (calculation unit) 9. Each component is described below. When the inspection object So is an opaque object such as an electronic circuit board, its position is adjusted so that it does not obstruct the imaging of the subject Sp by the face image acquisition stereo camera 5.

The marker detection camera 3 is a camera fixed so as to face the subject Sp. It is a TOF (Time of Flight) camera that generates image information by imaging the subject Sp and a plurality of markers (fixed points) fixed behind the subject Sp independently of the subject Sp, and that can include in the image information the distance to the imaged object for each pixel. The marker detection camera 3 is used to generate information for controlling the camera drive system 7, and is a wide-angle camera that covers a wider range than the face image acquisition stereo camera 5 described later. It generates image information in response to commands from the computer 9 and outputs the image information to the computer 9. The marker detection camera 3 may be omitted if the functions of the computer 9 described later can calculate the three-dimensional coordinates of the markers from the face images acquired by the face image acquisition stereo camera 5.
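As an illustration of how per-pixel distance information can yield three-dimensional coordinates, the following minimal sketch back-projects one pixel through a pinhole model. The function name `backproject`, the intrinsic parameters `f`, `cx`, `cy`, and the assumption that the reported distance is measured along the optical axis are all hypothetical; the patent text does not specify them, and a real TOF camera may report radial distance instead.

```python
import numpy as np

def backproject(u, v, depth, f, cx, cy):
    """Return the camera-frame 3D point seen at pixel (u, v).

    Assumptions (not from the source): an ideal pinhole camera with focal
    length f in pixels and principal point (cx, cy), and `depth` measured
    along the optical axis (z) rather than along the viewing ray.
    """
    x = (u - cx) / f * depth
    y = (v - cy) / f * depth
    return np.array([x, y, depth], dtype=float)

# Example: a marker seen at pixel (420, 390) at 2.0 m depth.
marker_xyz = backproject(420, 390, 2.0, f=1000.0, cx=320.0, cy=240.0)
```

The result is expressed in the camera's own coordinate frame; converting it to a shared world frame would require the camera's pose, which is outside this sketch.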

The face image acquisition stereo camera 5 includes two cameras 5a and 5b directed toward the subject Sp. Each of the cameras 5a and 5b is calibrated in advance after being mounted on the camera drive system 7, images the face of the subject Sp together with the markers, and acquires and outputs a face image. In this embodiment, the cameras 5a and 5b are video cameras of the NTSC system, which is an interlaced scanning system. Each camera is fitted with a near-infrared light source 11a, 11b of the configuration described in Japanese Patent No. 4500992 in order to detect feature points of the subject Sp such as the pupils, corneal reflections, or nostrils; however, the near-infrared light sources 11a and 11b may be omitted if the markers can be detected in the face image. The cameras 5a and 5b image the subject Sp and the markers in response to commands from the computer 9 and output the face image data to the computer 9.

The camera drive system 7 includes a pedestal 7a on which the cameras 5a and 5b are mounted, a support portion 7c that supports the pedestal 7a so that it can move vertically, and a support portion 7b that supports the support portion 7c so that it can move horizontally. In response to commands from the computer 9, the camera drive system 7 moves the cameras 5a and 5b on the pedestal 7a vertically and horizontally along a vertical plane while keeping their orientation relative to the subject Sp unchanged. A Robo Cylinder (registered trademark) manufactured by IAI can be used as the camera drive system 7. The drive direction of the camera drive system 7 is not limited to directions along a vertical plane, and may be along a plane inclined with respect to the vertical plane. The camera drive system 7 can move the cameras 5a and 5b at a speed of, for example, 1 m per second.

To use the face image processing device 1, a plurality of markers fixed on a vertical plane Sv such as a blackout curtain or a wall are prepared in front of the marker detection camera 3 and the face image acquisition stereo camera 5, and the inspection object So is fixed between the vertical plane Sv and the face image acquisition stereo camera 5 so that it is inclined with respect to the optical axes of the cameras 5a and 5b. The markers are preferably of a color that stands out in the camera image, such as white. A light emitter such as an LED may also be used as a marker; in that case, an LED with a low-profile light-emitting part and weak directivity is used so that it appears in the camera image with a shape that improves the accuracy of center detection. When near-infrared cameras are used as the marker detection camera 3 and the face image acquisition stereo camera 5, markers made of retroreflective material are preferable. When color cameras are used, markers M of a color that stands out against the background (for example, blue) may be used. The subject Sp stands between the vertical plane Sv and the inspection object So, facing the face image acquisition stereo camera 5, and visually inspects arbitrary positions on the inspection object So.

FIG. 4 shows the arrangement of the markers M on the vertical plane Sv. The markers M are arranged two-dimensionally on the vertical plane Sv, for example at equal intervals along the vertical and horizontal directions. The spacing between the markers M is adjusted so that, when the cameras 5a and 5b move, at least one marker M appears in each face image acquired by the cameras 5a and 5b even if some of the markers M are hidden by the subject Sp. With this adjustment, a marker M can always be captured in the acquired face images even when the cameras 5a and 5b move a long way.

However, the arrangement of the markers M is not limited to a two-dimensional, equally spaced grid. The spacing between the markers M may be non-uniform, the lines connecting the markers M may be oblique, and a single marker M is sufficient if that marker is reliably captured in the face image when the cameras 5a and 5b move. The markers M are also not limited to being placed on the vertical plane Sv, and may be placed on a plane oblique to the vertical plane. The markers M need not be independent of one another; for example, the corners of a frame-shaped object may be used as the markers M. In other words, a marker M may be an object of any shape, or part of an object that can be imaged by the cameras 5a and 5b and whose relative position on the object can be determined. In that case, a marker M can be regarded as existing virtually at a specific position on the object. Furthermore, when two or more cameras are used to acquire face images, the object containing the markers M may be three-dimensional. In that case, the computer 9 may estimate the three-dimensional structure of the object from the images, set a virtual marker M at a position in that structure where position estimation accuracy is unlikely to degrade (for example, a protrusion), and use that marker M.

Returning to FIG. 1, the computer 9 is a data processing device that controls the marker detection camera 3, the face image acquisition stereo camera 5, and the camera drive system 7, and performs image processing on the face images acquired by the face image acquisition stereo camera 5. The computer 9 may be built from a desktop or portable personal computer (PC), a workstation, or another type of computer. Alternatively, the computer 9 may be built by combining multiple computers of any type. When multiple computers are used, they can be connected via a communication network such as the Internet or an intranet.

FIG. 5 shows a general hardware configuration of the computer 9. The computer 9 includes a CPU (processor) 101 that executes an operating system, application programs, and the like; a main storage unit 102 composed of ROM and RAM; an auxiliary storage unit 103 composed of a hard disk, flash memory, or the like; a communication control unit 104 composed of a network card or a wireless communication module; an input device 105 such as a keyboard or mouse; and an output device 106 such as a display or printer.

Each functional element of the computer 9 described later is realized by loading predetermined software onto the CPU 101 or the main storage unit 102, operating the communication control unit 104, the input device 105, the output device 106, and so on under the control of the CPU 101, and reading and writing data in the main storage unit 102 or the auxiliary storage unit 103. The data and databases required for processing are stored in the main storage unit 102 or the auxiliary storage unit 103.

As shown in FIG. 6, the computer 9 includes, as functional components, a camera drive system control unit 21, an imaging control unit 23, a camera position detection unit 25, and an image processing unit 27. The function of each component of the computer 9 is described below.

After the line-of-sight detection process for the subject Sp has started, the camera drive system control unit 21 controls how the camera drive system 7 drives the face image acquisition stereo camera 5, based on the image information acquired by the marker detection camera 3. Specifically, the camera drive system control unit 21 refers to the image information to calculate the three-dimensional position of the subject Sp's head, and controls the drive so that the cameras 5a and 5b are positioned facing that three-dimensional position. In addition, the camera drive system control unit 21 calculates the three-dimensional coordinates of the markers M from the image information and passes them to the camera position detection unit 25. In general, when visually inspecting the inspection object So, and especially when the inspection object So has a large area, the subject Sp moves the head and line of sight up, down, left, and right relative to the inspection object So while shifting the legs and hips, in order to observe and inspect every corner of the inspection object So. Even in such a case, the control by the camera drive system control unit 21 keeps the subject Sp's face within the face images acquired by the face image acquisition stereo camera 5.
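The head-following behavior described here could be sketched roughly as follows. This is only an illustrative control step under stated assumptions, not the patent's control law: the function name `next_stage_position`, the per-axis travel limits, the per-axis speed limit, and the 1/30 s update period are all hypothetical. Only the example speed of 1 m per second comes from the description of the camera drive system 7.

```python
def next_stage_position(current, head_xy, limits, max_speed=1.0, dt=1 / 30):
    """Return the next (horizontal, vertical) stage position in metres.

    The stage moves toward the point facing the head, but each axis is
    clamped to its travel range and to max_speed * dt per update.
    All parameters are illustrative assumptions.
    """
    max_step = max_speed * dt
    result = []
    for cur, target, (lo, hi) in zip(current, head_xy, limits):
        target = min(max(target, lo), hi)                      # stay inside the travel range
        step = max(-max_step, min(max_step, target - cur))     # respect the speed limit
        result.append(cur + step)
    return tuple(result)

# Example: the head is far to the right and slightly above the cameras.
pos = next_stage_position((0.0, 0.0), (1.0, 0.01),
                          limits=((-0.5, 0.5), (-0.5, 0.5)),
                          max_speed=1.0, dt=0.1)
```

Calling this once per frame of the marker detection camera would move the stage gradually toward the head position while never commanding a jump larger than the drive system can follow.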

After the line-of-sight detection processing for the subject Sp has started, the imaging control unit 23 controls imaging by the face image acquisition stereo camera 5. In doing so, as described in Japanese Patent No. 4500992, the imaging control unit 23 controls the lighting timing of the near-infrared light sources 11a and 11b attached to the face image acquisition stereo camera 5 in accordance with the imaging timing of the stereo camera 5. Under this control, each of the cameras 5a and 5b produces, for each face image frame (30 frames per second), an odd-field image composed of the odd-numbered horizontal pixel lines and an even-field image composed of the even-numbered horizontal pixel lines. The odd-field and even-field images are captured alternately at intervals of 1/60 second, one as a face image in which the pupil appears relatively bright (bright pupil image) and the other as a face image in which the pupil appears relatively dark (dark pupil image). These face images also contain the images of the markers M. To prevent blurring of the marker images and to detect the positions of the markers M on the image accurately, the exposure time of each camera 5a, 5b is preferably set as short as possible (for example, 500 μs). Further, to improve the simultaneity of the position detection of the markers M and the pupil, it is also preferable to synchronize the shutter timing of each camera 5a, 5b with the lighting timing of the near-infrared light sources 11a and 11b. When light emitters such as LEDs are used as the markers M, the markers M are preferably controlled to light in synchronization with the lighting timing of the near-infrared light sources 11a and 11b.
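The field structure described above can be illustrated with a minimal sketch. It assumes the interlaced frame is available as a NumPy array whose rows are the horizontal pixel lines; the function name `split_fields` and the synthetic frame are hypothetical, and which field carries the bright pupil image is not fixed here.

```python
import numpy as np

FRAME_RATE_HZ = 30                           # frames per second, from the text
FIELD_INTERVAL_S = 1 / (2 * FRAME_RATE_HZ)   # two fields per frame -> 1/60 s


def split_fields(frame):
    """Split one interlaced frame into its odd and even fields.

    Lines are counted from 1 as in the text, so the odd field holds
    lines 1, 3, 5, ... (array rows 0, 2, 4, ...).
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


# Tiny synthetic 6x4 "frame" whose pixel values encode their position.
frame = np.arange(6 * 4).reshape(6, 4)
odd_field, even_field = split_fields(frame)
print(odd_field.shape, even_field.shape)
```

Each field has half the vertical resolution of the frame, which is one reason the later processing works on small windows around the pupil rather than on the full image.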

The camera position detection unit 25 detects the position of each camera 5a, 5b based on the positions of the markers M on the face images acquired by the cameras 5a and 5b of the face image acquisition stereo camera 5. Here, each camera 5a, 5b is assumed to follow the pinhole camera model, and the distance L between the pinhole (optical center) of each camera 5a, 5b and the vertical plane Sv on which the markers M are located is known and set in the computer 9. This distance L may be obtained by stereo measurement using the cameras 5a and 5b, or may be measured manually and set in advance. From the distance L, the position of a marker M on the face image, and the three-dimensional coordinates of that marker M, the camera position detection unit 25 can calculate the pinhole position (three-dimensional coordinates) Ci of each camera 5a, 5b. In detail, using the pinhole model, a direction vector from the pinhole of each camera 5a, 5b toward the marker M is calculated from the position of the marker M on the face image, and the pinhole position Ci of each camera is derived from that direction vector, the three-dimensional coordinates of the marker M, and the distance L. The camera position detection unit 25 then passes the pinhole position Ci detected for each face image of each camera 5a, 5b to the image processing unit 27.
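One way to read the derivation above is sketched below. It is an illustration under stated assumptions, not the patented implementation: the world z axis is taken as the normal of the plane Sv, the camera orientation is assumed known (identity rotation here, since the first embodiment only translates the cameras), and the intrinsic parameters `fx`, `fy`, `cx`, `cy` and all function names are hypothetical.

```python
import numpy as np


def pixel_to_ray(uv, fx, fy, cx, cy, R=np.eye(3)):
    """Unit vector from the pinhole through pixel uv, in world coordinates.

    R rotates camera coordinates into world coordinates (assumed known).
    """
    d_cam = np.array([(uv[0] - cx) / fx, (uv[1] - cy) / fy, 1.0])
    d_world = R @ d_cam
    return d_world / np.linalg.norm(d_world)


def pinhole_from_marker(marker_xyz, marker_uv, L, fx, fy, cx, cy,
                        plane_normal=(0.0, 0.0, 1.0)):
    """Recover the pinhole position Ci from one marker with known 3D coordinates."""
    d = pixel_to_ray(marker_uv, fx, fy, cx, cy)
    n = np.asarray(plane_normal, dtype=float)
    # marker = C + t * d and n . (marker - C) = L  =>  t = L / (n . d)
    t = L / np.dot(n, d)
    return np.asarray(marker_xyz, dtype=float) - t * d


# Synthetic check: place a camera, project a marker, then recover the camera.
fx = fy = 800.0
cx, cy = 320.0, 240.0
C_true = np.array([0.10, -0.05, 0.0])
marker = np.array([0.40, 0.20, 2.0])
rel = marker - C_true
marker_uv = (fx * rel[0] / rel[2] + cx, fy * rel[1] / rel[2] + cy)

C_est = pinhole_from_marker(marker, marker_uv, L=rel[2],
                            fx=fx, fy=fy, cx=cx, cy=cy)
print(C_est)
```

With several markers visible, the per-marker estimates could be averaged to reduce noise; that choice is also an assumption of this sketch.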

At this time, the camera position detection unit 25 may perform stereo matching using the face images of the two cameras 5a and 5b in order to distinguish the markers M appearing in the face images. For example, if two or more markers M are present in a direction crossing the alignment direction (horizontal direction) of the two cameras 5a and 5b, the method described in Japanese Patent No. 6083761 can be used to determine whether stereo-matched markers M are the same marker. Alternatively, the camera position detection unit 25 may estimate the approximate positions of the cameras 5a and 5b from a signal output by the camera drive system 7 indicating the position of the pedestal 7a, or from the signal output by the camera drive system control unit 21 to control the camera drive system 7, determine which markers M should appear in the images of the cameras 5a and 5b, and identify those markers M from among the plurality of markers M.

Note that the camera position detection unit 25 may take the positions Ci of the cameras 5a and 5b detected from face images obtained simultaneously by the two cameras, compute the average of the changes in those positions, and detect positions reflecting that average as the positions of the cameras 5a and 5b. Further, when the face images obtained by the cameras 5a and 5b contain distortion or the like, the camera position detection unit 25 preferably corrects the direction vector from the pinhole toward the marker M using the camera parameters obtained during camera calibration.

The image processing unit 27 performs image processing on each face image obtained by the cameras 5a and 5b, based on the positions of the cameras 5a and 5b detected by the camera position detection unit 25 for that face image. Specifically, the image processing unit 27 detects the line-of-sight direction of the subject Sp, or the gaze point on the inspection object So, at the acquisition timing of the face image, and outputs the line-of-sight direction or gaze point information to the output device 106 in time series as images or data.

More specifically, based on the position of the pupil detected on the face image, the image processing unit 27 uses the pinhole model to calculate a unit direction vector Vn from the pinhole of the camera 5a, 5b toward the pupil, and from this unit direction vector Vn and the pinhole position Ci of the camera 5a, 5b calculates the pupil position (three-dimensional coordinates) P by the following formula:
P = Ci + k × Vn
where k is an undetermined constant. This undetermined constant k may be obtained from the distance to the subject Sp calculated from the detection result of a TOF camera such as the marker detection camera 3, or may be obtained using the method of detecting the pupils and nostrils to measure the three-dimensional coordinates of the pupils, as described in Japanese Patent No. 4431749. Here too, when the face image contains distortion or the like, the unit direction vector from the pinhole toward the pupil is preferably corrected using the camera parameters obtained during camera calibration. Alternatively, the image processing unit 27 can detect the pupil position P by obtaining, from the face images of the two cameras 5a and 5b, the vector from each pinhole position toward the pupil and finding the intersection of the two vectors. The image processing unit 27 can also detect the point of closest approach of the two vectors as the pupil position P using the method described in International Publication WO2015/190204. In this case, the pupil position P can be obtained even when the two vectors do not intersect because of errors.
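The two ways of fixing the pupil position described above can be sketched as follows. The first function evaluates P = Ci + k × Vn for a known k. The second uses the textbook midpoint of the shortest segment between two rays, which is offered only as a common stand-in and is not claimed to be the method of WO2015/190204. All function names are hypothetical.

```python
import numpy as np


def pupil_from_range(C, Vn, k):
    """P = Ci + k * Vn, with k taken from an external range measurement."""
    return np.asarray(C, dtype=float) + k * np.asarray(Vn, dtype=float)


def closest_point_between_rays(C1, d1, C2, d2):
    """Midpoint of the shortest segment joining lines C1 + s*d1 and C2 + t*d2."""
    C1, d1, C2, d2 = (np.asarray(v, dtype=float) for v in (C1, d1, C2, d2))
    w = C1 - C2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique closest point")
    s = (b * e - c * d) / denom   # parameter along the first ray
    t = (a * e - b * d) / denom   # parameter along the second ray
    return 0.5 * ((C1 + s * d1) + (C2 + t * d2))


# Two pinholes 0.2 m apart looking at the same pupil.
P_true = np.array([0.05, 0.10, 0.60])
C_a = np.array([-0.10, 0.0, 0.0])
C_b = np.array([0.10, 0.0, 0.0])
V_a = (P_true - C_a) / np.linalg.norm(P_true - C_a)
V_b = (P_true - C_b) / np.linalg.norm(P_true - C_b)

P_stereo = closest_point_between_rays(C_a, V_a, C_b, V_b)
P_range = pupil_from_range(C_a, V_a, np.linalg.norm(P_true - C_a))

# Skew rays that never meet still yield a well-defined midpoint.
P_skew = closest_point_between_rays([0, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0])
print(P_stereo, P_range, P_skew)
```

The skew-ray case mirrors the remark in the text that a pupil position can still be obtained when measurement error prevents the two vectors from intersecting.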

Furthermore, the image processing unit 27 calculates the line-of-sight vector of the subject Sp and the gaze point on the inspection object So using the method described in Japanese Patent No. 4500992, based on the pupil position P calculated from the face images and on the positions of the pupil and the corneal reflection detected on the face image. Here, the line of sight is the straight line connecting the pupil and the gaze point, and the gaze point is the intersection of the line of sight with the plane being viewed (the viewing target plane).

When detecting the pupil position P described above, the image processing unit 27 detects the position of the pupil on the face image from a difference image obtained by taking the difference between the bright pupil image and the dark pupil image. When this difference image is obtained, position correction (alignment) may be applied to at least one of the images before the subtraction. Such differential position correction improves the robustness of detection and the accuracy of pupil position detection, for example when the subject Sp is dazzled and the pupils have constricted. Specifically, the image processing unit 27 may use the method described in Japanese Patent No. 4452832 to detect the nostrils in each of the bright pupil image and the dark pupil image obtained at different times, and shift the image within a small window according to the movement of the nostrils between frames before subtraction; may use the method described in Japanese Patent No. 4452836 to correct the position according to the movement of the corneal reflection between frames before subtraction; or may use the method described in Japanese Patent No. 5429885 to treat the two pupils and the midpoint between the nostrils as a single group of three points, estimate their positions, and correct the position based on the estimate before subtraction. The image processing unit 27 may also use the method described in Japanese Patent No. 6083761 to track the two pupils in three dimensions and correct the position based on the result before subtraction, or the method described in Japanese Unexamined Patent Publication No. 2016-095584 to track the corneal reflection in two-dimensional coordinates and correct the position based on the result before subtraction. In the present embodiment, however, the cameras 5a and 5b move relatively quickly, so unless the change in their positions is taken into account, the motion of the pupils, nostrils, corneal reflections, and so on cannot be estimated correctly, and the position of the pupil on the face image cannot be detected accurately. The image processing unit 27 of the present embodiment therefore performs differential position correction as follows.

That is, the image processing unit 27 estimates the pupil position (three-dimensional coordinates) P in the frame being processed, using the pupil positions (three-dimensional coordinates) P detected from the face images of past frames and the positions Ci of the cameras 5a and 5b detected for the face image of the frame being processed. More specifically, from the pupil position P_{i-1} in the frame immediately preceding the frame being processed and the pupil position P_{i-2} in the frame before that, the image processing unit 27 calculates the pupil position P_i in the frame being processed with a constant-velocity model, by the following formula:
P_i = P_{i-1} + (P_{i-1} - P_{i-2})
A linear Kalman filter may be used instead of the constant-velocity model. Then, from the positions Ci of the cameras 5a and 5b corresponding to the frame being processed and the calculated pupil position P_i, the image processing unit 27 uses the pinhole model to estimate the position P_{Ci} of the pupil in the face image being processed. During differential position correction, the image processing unit 27 then corrects the image position so that the pupil position P_{C(i-1)} in the frame immediately preceding the target frame coincides with the pupil position P_{Ci} in the target frame. At this point, if a corneal reflection was detected in the preceding frame, the image processing unit 27 detects the corneal reflection in the target frame within a small window centered on the pupil position P_{Ci}, shifts a small window of the same size in the preceding frame so that the corneal reflection in the preceding frame coincides with the corneal reflection in the target frame, and subtracts it from the image in the small window of the target frame to generate a difference image. The image processing unit 27 further extracts the pupil from the difference image and detects the position (center position) of the pupil on the face image of the target frame. By repeating this processing, the image processing unit 27 detects the position of the pupil on the face image in every frame.
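The prediction and reprojection steps above can be sketched as follows. This is a minimal illustration assuming a distortion-free pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and a known camera-to-world rotation `R` (identity here); the function names are not from the source.

```python
import numpy as np


def predict_pupil(P_prev1, P_prev2):
    """Constant-velocity prediction: P_i = P_{i-1} + (P_{i-1} - P_{i-2})."""
    return P_prev1 + (P_prev1 - P_prev2)


def project(P, C, fx, fy, cx, cy, R=np.eye(3)):
    """Pinhole projection of world point P for a camera whose pinhole is at C.

    R rotates camera coordinates into world coordinates.
    """
    p_cam = R.T @ (P - C)
    return np.array([fx * p_cam[0] / p_cam[2] + cx,
                     fy * p_cam[1] / p_cam[2] + cy])


fx = fy = 800.0
cx, cy = 320.0, 240.0

P_im2 = np.array([0.00, 0.0, 1.0])   # pupil two frames ago
P_im1 = np.array([0.01, 0.0, 1.0])   # pupil one frame ago
C_im1 = np.array([0.00, 0.0, 0.0])   # camera pinhole one frame ago
C_i = np.array([0.02, 0.0, 0.0])     # camera pinhole in the current frame

P_i = predict_pupil(P_im1, P_im2)
p_prev = project(P_im1, C_im1, fx, fy, cx, cy)   # P_{C(i-1)} in pixels
p_pred = project(P_i, C_i, fx, fy, cx, cy)       # P_{Ci} in pixels
shift = p_pred - p_prev                          # offset for the position correction
print(P_i, p_pred, shift)
```

In this synthetic case the pupil drifts 1 cm per frame while the camera jumps 2 cm, so the predicted image shift is opposite to the head motion. That is the situation the text warns about when camera motion is ignored.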

Here, during differential position correction, the image processing unit 27 may, as described above, track both pupils as a single group on the assumption that the interpupillary distance is constant; may track the center of the corneal sphere instead of the pupil; or may detect the nostrils as well as the pupils and track the pupils on the assumption that the head pose changes at a constant rate between frames. Furthermore, using the method described in Japanese Unexamined Patent Publication No. 2018-099174, it may track the rotation center of each eyeball on the assumption that the rotation center moves at constant velocity, or may track on the assumption that the distance between the rotation centers of the two eyeballs is constant. When detecting the pupil position, instead of subtracting the position-corrected images, the image processing unit 27 may detect the pupil position by another operation on the images such as division or multiplication, as described in Japanese Patent No. 5145555.
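The window-shift-and-subtract step of the differential position correction can be sketched as below. The sketch assumes 8-bit grayscale field images as NumPy arrays, corneal reflection (glint) positions already detected in both frames, and windows that lie fully inside the images. The centroid-based pupil center and all names are illustrative assumptions rather than the method of the cited patents.

```python
import numpy as np


def window(img, center, half):
    """Square window of side 2*half+1 around center=(row, col), as signed ints."""
    r, c = center
    return img[r - half:r + half + 1, c - half:c + half + 1].astype(np.int32)


def aligned_difference(cur_img, cur_glint, prev_img, prev_glint, half):
    """Subtract the previous window from the current one after aligning the glints."""
    diff = window(cur_img, cur_glint, half) - window(prev_img, prev_glint, half)
    return np.clip(diff, 0, None)   # keep only pixels brighter in the current frame


def pupil_center(diff, threshold):
    """Centroid (row, col) of above-threshold pixels, in window coordinates."""
    rows, cols = np.nonzero(diff > threshold)
    return rows.mean(), cols.mean()


# Previous frame: dark pupil image with a glint at (10, 12).
prev_img = np.full((30, 30), 20, dtype=np.uint8)
prev_img[10, 12] = 200
# Current frame: bright pupil image; 3x3 pupil and glint moved to (11, 14).
cur_img = np.full((30, 30), 20, dtype=np.uint8)
cur_img[10:13, 13:16] = 120
cur_img[11, 14] = 200

half = 4
diff = aligned_difference(cur_img, (11, 14), prev_img, (10, 12), half)
center = pupil_center(diff, threshold=50)
print(diff[half, half], center)
```

Because the glints are aligned before subtraction, they cancel in the difference image and only the bright pupil region survives, even though the eye moved between the two fields.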

According to the face image processing device 1 of the first embodiment described above, the positions of the cameras 5a and 5b are detected from face images that capture both the face of the subject Sp and the markers M, which are independent of the face, so the positions of the cameras 5a and 5b at the time each face image was acquired can be detected without any timing offset. By executing image processing based on the detected positions of the cameras 5a and 5b, image processing that depends on the camera positions can be applied to the images of the subject Sp with high accuracy even when the positions of the cameras 5a and 5b change. This is because the camera that acquires the face image itself captures the markers at the same time as the feature points that are the targets of image processing, such as the pupils and corneal reflections, so complete simultaneity is obtained.

In particular, the face image processing device 1 further includes the camera drive system 7 that controls the positions of the cameras 5a and 5b. With this configuration, the face of the subject can be reliably captured in the face image in accordance with the movement or posture of the subject Sp. Moreover, when the positions of the cameras 5a and 5b are controlled according to the position or posture of the subject Sp, image processing based on the positions of the cameras 5a and 5b can be applied to the images of the subject Sp with high accuracy. When, as in this embodiment, the inspection object So has a large area and the subject Sp moves a great deal, one conceivable way to detect the line of sight of the subject Sp reliably is to weaken the directivity of the light source attached to the camera and use a wide-angle camera. In that case, however, the light source needs more power to compensate for its weaker directivity, and the wider the camera's viewing angle, the higher the camera resolution must be in order to detect small feature points such as corneal reflections, so the constraints multiply. It is also conceivable to widen the field of view by increasing the number of cameras arranged side by side to three or more, but even then the imaging range is difficult to widen. In particular, at short distances from the cameras, it is difficult to obtain a range that can be captured by two cameras. Even if three cameras are placed close together, the accuracy of three-dimensional measurement in the depth direction generally falls, and the detection accuracy of the line of sight or gaze point falls sharply as a result. The present embodiment is free of such constraints.

Further, the computer 9 of the face image processing device 1 can calculate the direction, distance, and three-dimensional position of the pupils of the subject Sp with high accuracy. In particular, by performing differential position correction on the face images based on the camera positions, the computer 9 can detect the positions of the feature points of the subject Sp on the face image with high accuracy. For example, when the camera moves, and especially when it moves with acceleration, the pupil image in the face image is displaced in a way that does not follow the head movement of the subject Sp, so unless the position of the camera or of the camera pedestal in each frame of the face image is known, the position or direction of the pupil cannot be determined accurately. In this embodiment, the position or direction of the pupil can be detected accurately regardless of the camera's movement, and as a result the line of sight and gaze point can be detected accurately.

The present invention is not limited to the embodiment described above. Modifications of the first embodiment and the configurations of other embodiments are described below.

[Modifications of First Embodiment]
In the computer 9 according to the first embodiment, the image processing unit 27 detects the line of sight and gaze point of the subject Sp as its image processing, but the image processing is not limited to this: the movement of the head of the subject Sp may be detected from the three-dimensional positions of the pupils, or the head pose may be detected by also using the nostril detection results. In this case, the image processing unit 27 has a function of executing the method described in Japanese Patent No. 4765008 or Japanese Patent No. 4431749.

Further, although the face image processing device 1 according to the first embodiment acquires face images of the subject Sp with a stereo camera and performs line-of-sight detection, pupil detection, head pose detection, and the like, it may instead be configured to acquire face images with a single camera and perform these detection processes. For example, the method described in Japanese Patent No. 4431749 can be adopted for detecting the head pose from the images of a single camera.

Further, although the face image processing device 1 according to the first embodiment is configured to change the positions of the cameras 5a and 5b using the camera drive system 7, it may be configured to change the orientations of the cameras 5a and 5b. Such a configuration prevents the situation in which, when the subject Sp turns far sideways relative to the cameras 5a and 5b, the corneal reflection captured in the face image becomes a reflection on the white of the eye and can no longer be detected, or in which the shape of the pupil in the face image is deformed and the line-of-sight detection accuracy falls.

FIG. 7 shows the configuration of a camera drive system 207 according to a modification. As shown, the camera drive system 207 is configured to rotate the pedestal 7a about a rotation axis 207a provided at the midpoint between the two cameras 5a and 5b on the pedestal 7a. In other words, the camera drive system 207 drives the cameras 5a and 5b so that their orientations rotate in the horizontal plane. In this configuration, the camera position detection unit 25 of the computer 9 detects the position Ci and orientation of each camera 5a, 5b as follows. Because the cameras 5a and 5b are offset from the rotation center of the pedestal 7a, the camera position detection unit 25 uses the relative position from the rotation axis 207a to each camera pinhole, determined during camera calibration, to calculate the rotation angle of the pedestal 7a and the pinhole position Ci of each camera 5a, 5b. More specifically, the optical axis passing through the pinhole of each camera 5a, 5b is a tangent to the circle of rotation, so the rotation angle of the pedestal 7a can be calculated using the equation of the tangent to that circle. Taking the x axis and z axis in the horizontal plane, the circle of rotation with radius d_p is given by
x^2 + z^2 = d_p^2
and the tangent passing through a point C(x_c, z_c) on the circle is given by
x x_c + z z_c = d_p^2
Since the direction vector from each camera 5a, 5b toward the marker M is known, the camera position detection unit 25 calculates the direction vector of the optical axis of each camera from that direction vector and the position of the marker M on the face image, obtains the coordinates of the point C from the condition that the optical-axis direction vector coincides with the tangent given by the above equation, and takes those coordinates as the pinhole position Ci of each camera 5a, 5b. The camera position detection unit 25 can further obtain the rotation angle of each camera 5a, 5b from the pinhole position Ci, and from that rotation angle obtain the orientation of each camera.
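The tangent condition above can be made concrete with a short sketch. It assumes the horizontal optical-axis direction (u_x, u_z) has already been obtained and only shows the final geometric step: the pinhole C lies on the circle of radius d_p and is perpendicular to the optical axis. The `side` parameter (which of the two cameras) and the sign convention of the rotation angle are assumptions of this sketch, not details given in the source.

```python
import math


def pinhole_on_rotation_circle(u_x, u_z, d_p, side=1):
    """Point C on the circle x^2 + z^2 = d_p^2 whose tangent runs along (u_x, u_z).

    side=+1 selects the camera that sits at (+d_p, 0) when the pedestal is
    unrotated (optical axis along +z); side=-1 selects the other camera.
    Returns ((x_c, z_c), theta), theta being the pedestal rotation in radians.
    """
    norm = math.hypot(u_x, u_z)
    u_x, u_z = u_x / norm, u_z / norm
    # The radius to C is perpendicular to the tangent direction.
    x_c = side * d_p * u_z
    z_c = -side * d_p * u_x
    theta = math.atan2(u_x, u_z)   # angle of the optical axis from the z axis
    return (x_c, z_c), theta


theta_true = math.radians(30.0)
u = (math.sin(theta_true), math.cos(theta_true))
(x_c, z_c), theta = pinhole_on_rotation_circle(u[0], u[1], d_p=0.1)

# A second point on the optical axis must also satisfy x*x_c + z*z_c = d_p^2.
x_t, z_t = x_c + u[0], z_c + u[1]
tangent_residual = x_t * x_c + z_t * z_c - 0.1 ** 2
print((x_c, z_c), math.degrees(theta), tangent_residual)
```

The residual check confirms that every point along the recovered optical axis satisfies the tangent equation quoted in the text.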

In this modification, the image processing unit 27 of the computer 9 performs image processing using the positions and orientations of the cameras 5a and 5b, which enables highly accurate image processing. Specifically, by performing differential position correction that reflects the changes in the positions and orientations of the cameras 5a and 5b, the position of the pupil on the face image can be detected with high accuracy even when the cameras 5a and 5b are rotated at high speed.

FIG. 8 shows the configuration of a camera drive system 257 according to another modification. As shown, the camera drive system 257 is configured to rotate the cameras 5a and 5b independently on the pedestal 7a about rotation axes 257a and 257b provided for the respective cameras. In other words, the camera drive system 257 drives the cameras 5a and 5b so that their orientations rotate in the horizontal plane, changing only their orientations without changing their positions. In this configuration, since the direction vector from each camera 5a, 5b toward the marker M is known, the camera position detection unit 25 of the computer 9 can calculate the direction vector of the optical axis of each camera from that direction vector and the position of the marker M on the face image, and detect the orientation of each camera 5a, 5b from the optical-axis direction vector. The image processing unit 27 of the computer 9 then performs image processing such as differential position correction using the orientations of the cameras 5a and 5b, which enables highly accurate image processing. If, however, the pinholes of the cameras 5a and 5b are offset from the rotation axes 257a and 257b, the positions and orientations of the cameras 5a and 5b can be detected in the same way as in the modification shown in FIG. 7 and the image processing performed on that basis. This modification has the advantage that the device including the camera drive system 257 can be made shallow in depth.

[Configuration of Image Observation System According to Second Embodiment]
FIG. 9 shows the configuration of an image observation system 300 according to the second embodiment. The image observation system 300 is an image processing system that allows an observer to observe, from a remote location, the face of a subject who is a different person from the observer. One envisaged use is as a communication system between an observer and a subject located remotely from each other. As shown in FIG. 9, the image observation system 300 includes an observer detection camera (optical system) 305a, a computer 309a, and a display device 311 arranged on the observer Sp1 side, and a face image processing device 301 arranged on the side of the subject Sp2, who is the person being observed. The face image processing device 301 includes a face image acquisition camera 305b, a camera drive system 307, and a computer (calculation unit) 309b. The computers 309a and 309b have the same hardware configuration as that shown in FIG. 5 and are configured to exchange data such as image data with each other via a communication network (not shown).

The observer detection camera 305a is a fixed video camera with the same configuration as the cameras 5a and 5b; in response to commands from the computer 309a, it images the face (head) of the observer Sp1 and outputs a face image. Using the face image output from the observer detection camera 305a, the computer 309a detects, in the same way as the computer 9 and for each frame, the coordinates of the midpoint between the positions (three-dimensional coordinates) of the left and right pupils of the observer Sp1 and the face pose of the observer Sp1, and transmits the midpoint coordinates and face pose information to the computer 309b. In addition, the computer 309a has a function of displaying on the display device 311 the face image of the subject Sp2 received from the computer 309b. In doing so, the computer 309a may use the method described in Japanese Unexamined Patent Publication No. 2017-026893 to convert the face image of the subject Sp2, according to the face pose of the observer Sp1, into an image that appears as if seen from the front even when viewed obliquely, before displaying it on the display device 311.

The face image acquisition camera 305b is a video camera that simultaneously captures the subject Sp2 and a plurality of markers M arranged two-dimensionally on a vertical plane Sv behind the subject Sp2. In response to a command from the computer 309b, the face image acquisition camera 305b images the subject Sp2 and the markers M and outputs the resulting face image to the computer 309b.

The camera drive system 307 consists of a pan-tilt base 319 that supports the face image acquisition camera 305b so that it can pan and tilt, and actuators 313, 315, and 317 that support the face image acquisition camera 305b so that it can move together with the pan-tilt base 319. As viewed from the subject Sp2 side, the actuator 313 drives the face image acquisition camera 305b in the left-right direction, the actuator 315 drives it in the up-down direction, and the actuator 317 drives it in the front-rear direction. The pan-tilt base 319 and the actuators 313, 315, and 317 are driven in response to commands from the computer 309b so as to change the three-dimensional position and orientation of the face image acquisition camera 305b.

As shown in FIG. 10, the computer 309b includes, as functional components, a camera drive system control unit 321, an imaging control unit 323, a camera position/orientation detection unit 325, and an image processing unit 327. The function of each component of the computer 309b is described below.

When the process of acquiring the face image of the subject Sp2 is started, the camera drive system control unit 321 controls the camera drive system 307, based on the information on the position and face posture of the observer Sp1 received from the computer 309a, so that the three-dimensional position and orientation of the face image acquisition camera 305b correspond to them. This virtually creates a state in which the observer Sp1 appears to be facing the subject Sp2, and a face image corresponding to that state can be acquired. Specifically, when the observer Sp1 moves his or her head to view the subject Sp2 displayed on the display device 311 from a different distance and angle, the camera is controlled so that it moves to the corresponding position and orientation.

When the process of acquiring the face image of the subject Sp2 is started, the imaging control unit 323 causes the face image acquisition camera 305b to start acquiring the face image of the subject Sp2. The imaging control unit 323 also passes the face image of each frame acquired by the face image acquisition camera 305b to the camera position/orientation detection unit 325 and the image processing unit 327.

The camera position/orientation detection unit 325 detects, for each frame, the position (three-dimensional coordinates) and orientation of the face image acquisition camera 305b based on the positions of the markers M on the face image acquired by that camera. Specifically, it detects the positions of three or more markers M on the face image and, using the fact that the distances between those markers M are known, applies the method described in Japanese Patent No. 4431749 to detect the position and orientation of the face image acquisition camera 305b. The method described in that patent calculates the three-dimensional coordinates, in the camera coordinate system, of points whose mutual distances are known. By applying this method, the position and orientation of the face image acquisition camera 305b can be derived from the positions on the face image of three markers M whose three-dimensional coordinates are known.

Based on the position and orientation of the face image acquisition camera 305b detected for each frame by the camera position/orientation detection unit 325, the image processing unit 327 applies image processing to the face image of each frame acquired by the face image acquisition camera 305b and transmits the processed face image to the computer 309a. Specifically, the image processing unit 327 performs the following processing for each frame. A visual target coordinate system, which is a three-dimensional coordinate system that defines the center and orientation of the subject Sp2, is defined in advance based on the three-dimensional coordinates of three markers M detected on the face image. Based on the position and orientation of the face image acquisition camera 305b, the image processing unit 327 defines a camera coordinate system, which is a three-dimensional coordinate system referenced to the position and direction of the optical axis of the face image acquisition camera 305b, and obtains the position and orientation of the visual target coordinate system in the camera coordinate system. The image processing unit 327 then corrects blur in the face image of each frame by transforming it into the image that would be obtained if the vector from the optical center of the face image acquisition camera 305b to the center of the visual target coordinate system coincided with the optical axis of the camera. As a result, a face image is obtained in which the center of the subject Sp2 (the visual target) coincides with the center of the image and the image of the visual target is not distorted. Here, the plane whose normal vector is the vector from the optical center of the face image acquisition camera 305b to the center of the visual target coordinate system is set as the ideal imaging plane (ideal image plane), and the blur-corrected face image is obtained by projecting the face image onto the ideal image plane by projective transformation.
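The projection onto the ideal image plane can be illustrated with a small sketch. It assumes a pinhole camera with focal length `f` in pixels and principal point `(cx, cy)`, and it chooses the smallest rotation that brings the direction of the visual target center onto the optical axis; the description does not fix the rotation about that axis, so this choice, like the function names, is an assumption made only for illustration.

```python
import math

def rotation_to_axis(target):
    """Rotation matrix (3x3, row-major) that maps the unit vector pointing
    at `target` (camera coordinates, z > 0) onto the optical axis (0, 0, 1)."""
    n = math.sqrt(sum(c * c for c in target))
    a = [c / n for c in target]
    # v = a x z and c = a . z for z = (0, 0, 1)
    v = (a[1], -a[0], 0.0)
    c = a[2]
    k = 1.0 / (1.0 + c)
    # Rodrigues form: R = I + [v]x + k * [v]x^2
    vx = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    vx2 = [[sum(vx[i][m] * vx[m][j] for m in range(3)) for j in range(3)]
           for i in range(3)]
    return [[(1.0 if i == j else 0.0) + vx[i][j] + k * vx2[i][j]
             for j in range(3)] for i in range(3)]

def rectify_pixel(u, v, f, cx, cy, target):
    """Map pixel (u, v) of the captured image to its position on the ideal
    image plane, whose normal points at the visual target center."""
    R = rotation_to_axis(target)
    d = ((u - cx) / f, (v - cy) / f, 1.0)   # viewing ray of the pixel
    x, y, z = (sum(R[i][j] * d[j] for j in range(3)) for i in range(3))
    return (f * x / z + cx, f * y / z + cy)
```

With this mapping, the pixel at which the visual target center appears is moved to the principal point, which corresponds to the statement that the center of the visual target coincides with the image center after correction. A full implementation would apply the same mapping to every pixel with interpolation.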

In addition, the image processing unit 327 may apply the following image processing to the face image. That is, it may enlarge or reduce the face image based on the center coordinates of the ideal image plane and the center coordinates of the visual target coordinate system. For example, let Z_I be the distance from the optical center of the face image acquisition camera 305b to the center of the ideal image plane, and let Z_V be the distance from the optical center of the face image acquisition camera 305b to the center of the visual target coordinate system. Applying the perspective projection model, the face image is processed by enlarging or reducing it by a factor of Z_V/Z_I.
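As a hedged illustration of the scaling step: under the perspective projection model, an object of size S at distance Z_V appears with size S·Z_I/Z_V on an image plane at distance Z_I, so multiplying by Z_V/Z_I cancels the dependence on the camera distance. The function names below are hypothetical and are not taken from the patent.

```python
def projected_size(actual_size, z_v, z_i):
    """Size of an object on the ideal image plane under perspective projection.
    z_v: optical center to visual target center, z_i: optical center to plane."""
    return actual_size * z_i / z_v

def scale_factor(z_v, z_i):
    """Magnification Z_V / Z_I applied to the face image."""
    if z_v <= 0 or z_i <= 0:
        raise ValueError("distances must be positive")
    return z_v / z_i
```

For instance, if the camera moves twice as far from the subject, `projected_size` halves while `scale_factor` doubles, so the displayed size of the subject stays the same.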

FIG. 11 shows a face image before and after processing by the image processing unit 327 of the computer 309b. Thus, even if driving the face image acquisition camera 305b causes the size, position, posture, and so on of the image of the subject Sp2 to shift, or causes the subject to appear small in the face image, the image processing corrects the blur and produces an image in which the subject appears at an appropriate size.

According to the image observation system 300 described above, the position and orientation of the face image acquisition camera 305b are controlled according to the position and posture of the head of the observer Sp1, and the face image of the subject Sp2 acquired by the face image acquisition camera 305b is displayed on the display device 311. At this time, the face image processing device 301 detects the position or orientation of the face image acquisition camera 305b and corrects blur in the face image based on that position or orientation. As a result, the face of the subject can be displayed with the blur stably corrected.

Further, the face image processing device 301 enlarges or reduces the face image based on the position or orientation of the face image acquisition camera 305b before the image is displayed on the display device 311. With this configuration, the face of the subject Sp2 can be displayed on the display device 311 at a size suited to the position or orientation of the face image acquisition camera 305b.

Note that the visual target in the image observation system 300 of the second embodiment is not limited to a human face and may be any of various tangible objects, such as objects for viewing (typically works of art) or commercial products. FIG. 12 shows a usage pattern in which the observation target of the image observation system 300 according to the second embodiment is changed to a tangible object Sp3 such as a work of art.

In the image observation system 300 of the second embodiment, the computer 309a may also use the face image output from the observer detection camera 305a to detect the line-of-sight vector of the observer Sp1 or the gaze point on the display device 311, and transmit the resulting detection information to the computer 309b. The computer 309b may use this detection information to control the three-dimensional position and orientation of the face image acquisition camera 305b. For example, when the observer Sp1 directs his or her line of sight to a certain part of the face of the subject Sp2 displayed on the display device 311, the face image acquisition camera 305b is controlled so as to correspond to the face posture of the observer Sp1. The face image acquisition camera 305b can thus send to the observer Sp1 side a face image captured in the shooting direction corresponding to the line of sight or gaze point of the observer Sp1, and the observer Sp1 can observe a face image of the subject Sp2 as if actually looking closely at the real person. In this case, even if the line of sight of the observer Sp1 moves frequently during observation and the face image acquisition camera 305b therefore moves frequently, the present embodiment uses the markers to suppress shaking of the image displayed on the display device 311.

[Configuration of Vehicle Monitoring System According to Third Embodiment]
FIG. 13 shows the configuration of a face image processing device 400 that constitutes a vehicle monitoring system, which is a pupil detection system according to the third embodiment, and FIG. 14 shows the arrangement of a face image acquisition stereo camera 405 included in the face image processing device 400. The face image processing device 400 is a group of devices that is installed inside an automobile and detects the line of sight, face posture, or eye open/closed state of a subject Sp, who is the driver of the automobile.

As shown in FIG. 13, the face image processing device 400 includes a face image acquisition stereo camera 405 and a computer (calculation unit) 409. The face image acquisition stereo camera 405 has the same configuration as the face image acquisition stereo camera 5 of the first embodiment and includes two cameras 405a and 405b. The cameras 405a and 405b are embedded in the central portion of a steering wheel H, which is a movable object operated by the subject Sp, with their optical axes directed toward the subject Sp. The computer 409 is a computer installed in the automobile and having the hardware configuration shown in FIG. 5. It has functional units similar to the imaging control unit 323 and the camera position/orientation detection unit 325 of the second embodiment and the image processing unit 27 of the first embodiment. In addition, a plurality of markers M are provided at positions that always remain within the fields of view of the cameras 405a and 405b, such as the ceiling of the vehicle interior or the headrest of a seat. A single marker M may be used, but providing a plurality of markers is preferable from the viewpoint of image processing accuracy.

The computer 409 detects the position and orientation of each of the cameras 405a and 405b, which change as the steering wheel rotates, based on the positions of the markers M on the obtained face images, and uses the detection result to apply differential position correction to the face images obtained by the cameras 405a and 405b, thereby acquiring the pupil position or pupil shape of the subject Sp. This is necessary because, when the steering wheel H rotates, each of the cameras 405a and 405b rotates and translates, the face image captured by each camera rotates and translates accordingly, and this motion must be taken into account to obtain the correct pupil position or pupil shape. When the cameras 405a and 405b rotate and translate, the position and direction of the vector r connecting the corneal reflection and the pupil center on the face image also change. For example, when performing differential position correction, the computer 409 predicts, as in the first embodiment, the pupil position in the face image of the current frame and sets a small window, and detects the corneal reflection within that small window. It then rotates a predetermined part of the image in the small window of the previous frame by an angle corresponding to the rotation angle of the vector r on the face image, while shifting it so that the corneal reflection position in the previous frame coincides with the corneal reflection position in the current frame, and takes the difference between the previous frame and the current frame for that part of the image. Alternatively, as in the first embodiment, the computer 409 may perform differential position correction using the nostrils (the midpoint between the nostrils) or the head posture.
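The rotate-and-shift step before differencing can be sketched as follows. This is a minimal illustration and not the patented implementation: images are plain nested lists of the same size, sampling is nearest-neighbour, the absolute difference is used, and the sign convention of the rotation angle `theta` (in radians, image coordinates) as well as the function names are assumptions.

```python
import math

def align_point(p, g_prev, g_cur, theta):
    """Map point p of the previous frame into the current frame: rotate by
    theta about the previous corneal reflection g_prev, then shift so that
    g_prev lands on the current corneal reflection g_cur."""
    dx, dy = p[0] - g_prev[0], p[1] - g_prev[1]
    c, s = math.cos(theta), math.sin(theta)
    return (g_cur[0] + c * dx - s * dy, g_cur[1] + s * dx + c * dy)

def aligned_difference(prev, cur, g_prev, g_cur, theta):
    """Absolute difference between the current window and the previous
    window after the previous window has been rotated and shifted."""
    h, w = len(cur), len(cur[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the previous-frame pixel that lands on (x, y).
            qx, qy = align_point((x, y), g_cur, g_prev, -theta)
            ix, iy = round(qx), round(qy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = abs(cur[y][x] - prev[iy][ix])
            # Pixels with no counterpart in the previous window stay 0.
    return out
```

When the alignment is correct, static features cancel in the difference and only the pupil, whose brightness differs between the two frames, remains. Without alignment, the same features would appear twice in the difference image.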

According to the vehicle monitoring system of the above embodiment, even when the positions of the cameras 405a and 405b change due to rotation of the steering wheel, applying image processing to the face image of the subject Sp based on the positions and orientations of the cameras 405a and 405b makes it possible to detect with high accuracy the driver's pupil position, line of sight, and face posture, or the eye open/closed state based on the pupil shape.

In particular, the computer 409 of the vehicle monitoring system detects the position or shape of the subject's pupil on the face image by difference processing of face images that reflects changes in the positions and orientations of the cameras 405a and 405b. In this case, the position or shape of the pupil of the subject Sp on the face image can be detected with high accuracy.

Furthermore, because the face image acquisition stereo camera 405 of the vehicle monitoring system is provided on the steering wheel, the range of what can be measured is wider than when it is provided on the steering column, the console, or the like. In particular, when a camera is mounted on, for example, the console beyond the steering wheel, the steering column or the driver's arm comes between the camera and the subject's pupils, for instance when the driver turns the steering wheel. The face may then be hidden by the steering column, or light from the light source may be reflected and disturb the image, which hinders continuous acquisition of undisturbed, ideal face images. In this embodiment, the cameras 405a and 405b can be placed, for example, lower than the steering column so that they look up at the subject Sp, which makes the nostrils of the subject Sp easier to detect. Once the nostrils are detected, the two pupils and the midpoint between the nostrils can be tracked as a single cluster, so the face posture can be tracked robustly even during fast head movements. As a result, robust pupil detection and, further, accurate line-of-sight detection become possible. This effect is particularly large when the video rate of the cameras 405a and 405b is low. To detect the position and orientation of a camera mounted on the steering wheel, one could also use a steering signal indicating the steering angle output from a control device in the automobile. However, such a signal contains noise or is delayed, which lowers the accuracy of image processing. In contrast, the present embodiment uses the markers M detected simultaneously in the face image, so this problem can also be solved.

The pupil detection system according to the third embodiment is not limited to use in automobiles and can also be applied as a system in any device provided with a movable object operated by a subject. For example, it can be applied to other types of vehicles, operation simulators, amusement machines, game devices, remote control consoles, and similar equipment.

Note that the markers M imaged in the first to third embodiments do not have to be markers that are actually installed individually. For example, feature points that can be regarded as markers may be extracted from the background included in the face image and recognized as markers.

1, 301, 400: face image processing device; 5, 405: face image acquisition stereo camera; 305b: face image acquisition camera; 5a, 5b, 405a, 405b: camera; 7, 207, 257, 307: camera drive system; 9, 309b, 409: computer (calculation unit); 300: image observation system; 305a: observer detection camera (optical system); 311: display device; H: steering wheel; Sp, Sp2: subject; Sp1: observer.

Claims

at least one camera that acquires facial images by imaging a subject's face and fixed points independent of the face;
A calculation unit that performs image processing on the face image,
The calculation unit detects the position or orientation of the camera based on the position of the fixed point on the face image, and executes the image processing based on the position or orientation of the camera,
Continuously executing a position calculation process for calculating the three-dimensional position of the pupil of the subject for a plurality of frames of the face image as the image processing,
The position calculation processing for the frame to be processed includes:
A process of predicting the three-dimensional position of the pupil in the frame to be processed based on the three-dimensional position of the pupil calculated by the position calculation process for the previous frame;
a process of estimating the position of the pupil on the face image of the frame to be processed based on the predicted three-dimensional position of the pupil and the position of the camera detected in the frame to be processed;
a process of correcting the position of the face image based on the position of the pupil on the face image of the previous frame and the estimated position of the pupil on the face image of the frame to be processed, and then detecting the position of the subject's pupil on the face image of the frame to be processed by generating a processed image between the previous frame and the frame to be processed;
and calculating the three-dimensional position of the pupil in the frame to be processed based on the position.
Face image processing device.

further comprising a drive system for controlling the position or orientation of the camera,
The face image processing device according to claim 1.

The camera captures images of the fixed points at three or more locations,
The face image processing device according to claim 1 or 2.

The calculation unit detects the position and orientation of the camera based on the positions of the fixed points, and executes the image processing using the position and orientation of the camera.
The facial image processing device according to any one of claims 1 to 3.

The calculation unit uses the image processing to calculate the direction or distance of the subject's pupil as viewed from the camera,
The facial image processing device according to any one of claims 1 to 4.

The calculation unit corrects blurring of the face image based on the position and/or orientation of the camera.
The face image processing device according to claim 2.

a face image processing device according to claim 6;
an optical system for detecting the position and orientation of the head of an observer different from the subject;
A display device for displaying the face image to the observer,
the drive system controls the position and orientation of the camera based on the position and orientation of the head;
The calculation unit causes the display device to display the face image whose blur has been corrected.
Image observation system.

The calculation unit enlarges or reduces the facial image based on the position and orientation of the camera, and then causes the display device to display the facial image.
The image observation system according to claim 7.

Equipped with the face image processing device according to claim 1,
The camera is attached to a movable object operated by the subject, and images the subject's face and the fixed point,
The calculation unit detects a pupil of the subject.
Pupil detection system.

The calculation unit detects the position or shape of the subject's pupils on the face image by correcting the positions of the plurality of face images at different times reflecting changes in the position or posture of the camera.
The pupil detection system according to claim 9.