JP2022165239A - Imaging apparatus and control method for the same, and program

Info

Publication number: JP2022165239A
Application number: JP2021070512A
Authority: JP
Inventors: 信行堀江; Nobuyuki Horie; 見寺澤; Ken Terasawa
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-04-19
Filing date: 2021-04-19
Publication date: 2022-10-31

Abstract

To provide an imaging apparatus that can improve the accuracy of detecting a user's line-of-sight position, a control method for the same, and a program. SOLUTION: An imaging apparatus 1 displays a through image on an internal finder 10 and comprises an eye image sensor 17 that images the eyeball 14 of a user looking into the finder 10 to generate eye image data, and a line-of-sight detection circuit 201 that detects, on the basis of the eye image data, the position of the user's line of sight directed at the through image on the finder 10. When the user uses an operation member 42 to move the detected line-of-sight position (position C) displayed on the finder 10 to a position B and then presses a release button 5 to shoot with position B as the focus position, position B is collected as a correct-answer position and used for learning to create an inference unit that estimates the user's line-of-sight position with the eye image data as input data. SELECTED DRAWING: Figure 9

Description

The present invention relates to an imaging apparatus, a control method for the same, and a program, and more particularly to an imaging apparatus that performs focus control based on information about a detected line-of-sight position, a control method for the same, and a program.

In recent years, imaging apparatuses have become increasingly automated and intelligent, and imaging apparatuses have been proposed that recognize the subject intended by the user and perform focus control based on information about the line-of-sight position of the user looking through the finder, without the subject position being entered manually. When such an imaging apparatus detects the user's line-of-sight position, however, a deviation can arise between the line-of-sight position the user intends and the position the apparatus recognizes, and the apparatus may fail to focus on the subject the user intends.

To address this, a technique is known in which an index is displayed in the finder before shooting, the user is instructed to gaze at the index, the user's line-of-sight position is detected in that gazing state, and calibration is executed to detect the amount of deviation from the index position. At the time of shooting, the line-of-sight position recognized by the imaging apparatus is then corrected by the detected deviation so that the corrected position is closer to the position the user intends (see, for example, Patent Document 1).

A technique is also known in which the detection accuracy of the line-of-sight position is determined, and display objects are displayed sparsely where the determined accuracy is low and densely where it is high. This prevents selection of a line-of-sight position the user did not intend (see, for example, Patent Document 2).

Patent Document 1: Japanese Patent Application Laid-Open No. 2004-008323
Patent Document 2: Japanese Patent Application Laid-Open No. 2015-152938

In the conventional technique disclosed in Patent Document 1, however, the detection accuracy of the line-of-sight position falls when conditions such as the pupil diameter or the distance of the eye of the user looking through the finder differ between shooting and calibration. Recalibrating every time the accuracy falls places a heavy burden on the user.

As in Patent Document 2, erroneous detection of the line-of-sight position can be prevented by, for example, enlarging the focus frame where the detection accuracy is poor. With an enlarged focus frame, however, the focus control the user wants may not be achievable.

An object of the present invention is therefore to provide an imaging apparatus, a control method for the same, and a program that can improve the accuracy of detecting a user's line-of-sight position.

An imaging apparatus according to claim 1 of the present invention is an imaging apparatus that displays a through image on an internal finder, comprising: generating means for imaging an eyeball of a user looking through the finder to generate eye image data; line-of-sight detecting means for acquiring the eye image data and detecting, based on the acquired eye image data, the position of the user's line of sight directed at the through image on the finder; display control means for displaying the detected line-of-sight position on the finder such that it can be moved to another position by a first user operation; and collecting means for collecting the other position as a correct-answer position when a second user operation determines the other position as a focus position, wherein the correct-answer position is used for learning to create an inference unit that estimates the line-of-sight position with the eye image data acquired by the line-of-sight detecting means as input data.
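The collecting means described above can be pictured with a minimal Python sketch. Everything named here (`TrainingSample`, `CorrectPositionCollector`, `on_focus_confirmed`, the pixel-tuple position format) is hypothetical and not taken from the patent. The sketch also assumes that a sample is recorded only when the displayed position was actually moved before the focus position was confirmed, which is the case the claim describes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # assumed (x, y) position on the finder display


@dataclass
class TrainingSample:
    eye_image: bytes         # eye image data used as the input for learning
    correct_position: Point  # position the user confirmed as the focus position


@dataclass
class CorrectPositionCollector:
    samples: List[TrainingSample] = field(default_factory=list)

    def on_focus_confirmed(
        self, eye_image: bytes, detected: Point, displayed: Point
    ) -> Optional[TrainingSample]:
        """Call when the second user operation fixes the focus position.

        `detected` is the line-of-sight position found by the detector and
        `displayed` is where the marker sits after any first user operation.
        """
        if displayed == detected:
            # Assumption: nothing to collect when the user did not move the marker.
            return None
        sample = TrainingSample(eye_image, displayed)
        self.samples.append(sample)
        return sample
```

Pairing each eye image with the position the user ended up choosing is what lets the samples serve later as input data and correct-answer labels for training the inference unit.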

According to the present invention, the accuracy of detecting a user's line-of-sight position can be improved.

FIG. 1 is a diagram showing an outline of the internal configuration of an imaging apparatus according to Example 1.
FIG. 2 is a diagram showing the external appearance of the imaging apparatus.
FIG. 3 is a block diagram showing the electrical configuration built into the imaging apparatus.
FIG. 4 is a diagram showing the field of view in the finder while the finder in FIG. 3 is operating.
FIG. 5 is a diagram for explaining the principle of the line-of-sight detection method.
FIG. 6 is a diagram for explaining a method of detecting, from eye image data, the coordinates corresponding to the corneal reflection images and the pupil center.
FIG. 7 is a flowchart of the line-of-sight detection processing.
FIG. 8 is a diagram showing an example of eye image data at calibration and at shooting that the line-of-sight detection reliability determination circuit in FIG. 3 determines to have low reliability.
FIG. 9 is a flowchart of processing for collecting the correct-answer data used for learning when creating the inference unit.
FIG. 10 is a flowchart of focus processing at the time of shooting.
FIG. 11 is a flowchart of processing, according to Example 2, for determining the factor that lowered the reliability of the first estimated gaze point position.
FIG. 12 is a diagram for explaining the method of generating difference eye image data in step S1104 of FIG. 11.
FIG. 13 is a schematic diagram showing an example of the overall configuration of a CNN according to Example 1.
FIG. 14 is a schematic diagram showing an example of a partial configuration of the CNN of FIG. 13.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. The following embodiments do not limit the invention as set out in the claims, and not all combinations of the features described in the embodiments are essential to the solution provided by the invention.

(Example 1)
A learning method and an inference method executed to improve the detection accuracy of the line-of-sight position in the imaging apparatus 1 according to Example 1 of the present invention are described below with reference to FIGS. 1 to 10, 13, and 14.

The configuration of the imaging apparatus 1 is described with reference to FIGS. 1 to 3.

FIG. 2 shows the external appearance of the imaging apparatus 1: FIG. 2(a) is a front perspective view, FIG. 2(b) is a rear perspective view, and FIG. 2(c) is a diagram for explaining the operation member 42 of FIG. 2(b).

In this example, the imaging apparatus 1 consists of a camera housing 1B and a photographing lens 1A detachably attached to it.

A release button 5 is provided on the front of the camera housing 1B, as shown in FIG. 2(a).

The release button 5 is an operation member that receives an imaging operation from the user.

As shown in FIG. 2(b), an eyepiece window 6 and operation members 41 to 43 are provided on the back of the camera housing 1B.

The eyepiece window 6 is a window through which the user looks at the viewing image displayed on the finder 10 inside the camera housing 1B (described later with reference to FIG. 1).

The operation member 41 is a touch-panel liquid crystal display, the operation member 42 is a lever-type operation member, and the operation member 43 is a button-type cross key. In this example, the operation members 41 to 43, which are used for camera operations such as manually moving the estimated gaze point position described later, are provided on the camera housing 1B, but the configuration is not limited to this. For example, another operation member such as an electronic dial may be provided on the camera housing 1B in addition to, or instead of, the operation members 41 to 43.

FIG. 1 is a cross-sectional view of the camera housing 1B cut along the YZ plane formed by the Y axis and the Z axis shown in FIG. 2(a), and shows an outline of the internal configuration of the imaging apparatus 1. In FIG. 1, components identical to those in FIG. 2 carry the same reference numerals.

In FIG. 1, the photographing lens 1A is a photographing lens detachably attached to the camera housing 1B. For convenience, only two lenses 101 and 102 are shown inside the photographing lens 1A in this example; as is well known, an actual photographing lens consists of many more lenses.

The camera housing 1B contains an image sensor 2, a CPU 3, a memory unit 4, a finder 10, a finder drive circuit 11, an eyepiece lens 12, light sources 13a and 13b, a light splitter 15, a light-receiving lens 16, and an eye image sensor 17.

The image sensor 2 is placed on the intended image plane of the photographing lens 1A and captures images. The image sensor 2 also serves as a photometric sensor.

The CPU 3 is the central processing unit of a microcomputer that controls the entire imaging apparatus 1.

The memory unit 4 records images captured by the image sensor 2. The memory unit 4 also stores imaging signals from the image sensor 2 and the eye image sensor 17, as well as line-of-sight correction data, described later, for correcting individual differences in the line of sight.

The finder 10 consists of a liquid crystal display or the like for displaying the image captured by the image sensor 2 (the through image).

The finder drive circuit 11 is a circuit that drives the finder 10.

The eyepiece lens 12 is a lens through which the user, looking in through the eyepiece window 6 (FIG. 2), observes the viewing image displayed on the finder 10.

The light sources 13a and 13b are infrared light-emitting diodes that illuminate the user's eyeball 14 so that the user's line-of-sight direction can be detected, and are arranged around the eyepiece window 6 (FIG. 2). When the light sources 13a and 13b are lit, their corneal reflection images (Purkinje images) Pd and Pe (FIG. 5) are formed on the eyeball 14. In this state, light from the eyeball 14 passes through the eyepiece lens 12, is reflected by the light splitter 15, and is focused by the light-receiving lens 16 onto the eye image sensor 17 (generating means), in which rows of photoelectric elements such as CMOS elements are arranged two-dimensionally. An eye image containing the eyeball image is formed there, and eye image data is generated. The light-receiving lens 16 places the pupil of the user's eyeball 14 and the eye image sensor 17 in a conjugate imaging relationship. Using a predetermined algorithm described later, the line-of-sight detection circuit 201 (line-of-sight detecting means; FIG. 3) detects the line-of-sight direction (the user's viewpoint directed at the viewing image, hereinafter called the first estimated gaze point position) from the positions of the corneal reflection images in the eyeball image formed on the eye image sensor 17.

The light splitter 15 reflects the light that has passed through the eyepiece lens 12 so that it is imaged on the eye image sensor 17 via the light-receiving lens 16, and transmits the light from the finder 10 so that the user can see the viewing image displayed on the finder 10.

The photographing lens 1A includes a diaphragm 111, a diaphragm drive device 112, a lens drive motor 113, a lens drive member 114 consisting of a drive gear and the like, a photocoupler 115, a pulse plate 116, a mount contact 117, and a focus adjustment circuit 118.

The photocoupler 115 detects the rotation of the pulse plate 116, which is interlocked with the lens drive member 114, and reports it to the focus adjustment circuit 118.

Based on the information from the photocoupler 115 and the lens drive amount information from the camera housing 1B, the focus adjustment circuit 118 drives the lens drive motor 113 by a predetermined amount to move the photographing lens 1A to the in-focus position.

The mount contact 117 is the interface between the camera housing 1B and the photographing lens 1A and has a known configuration. Signals pass between the camera housing 1B and the photographing lens 1A through the mount contact 117. The CPU 3 of the camera housing 1B acquires type information, optical information, and the like for the photographing lens 1A and thereby determines the range over which the photographing lens 1A attached to the camera housing 1B can focus.

FIG. 3 is a block diagram showing the electrical configuration built into the imaging apparatus 1. In FIG. 3, components identical to those in FIGS. 1 and 2 carry the same reference numerals.

The camera housing 1B includes a line-of-sight detection circuit 201, a photometry circuit 202, an automatic focus detection circuit 203, a signal input circuit 204, the finder drive circuit 11, a light source drive circuit 205, a line-of-sight detection reliability determination circuit 31, and a communication circuit 32, each of which is connected to the CPU 3. The photographing lens 1A includes the focus adjustment circuit 118 and a diaphragm control circuit 206 contained in the diaphragm drive device 112 (FIG. 1), each of which exchanges signals with the CPU 3 of the camera housing 1B through the mount contact 117.

The line-of-sight detection circuit 201 A/D-converts the eye image data imaged on and output from the eye image sensor 17 and sends the eye image data to the CPU 3. The CPU 3 extracts from the eye image data, according to a predetermined algorithm described later, the feature points of the eye image needed for line-of-sight detection, and calculates the user's line-of-sight position (the first estimated gaze point position) estimated from the positions of the extracted feature points.

Based on the signal obtained from the image sensor 2, which also serves as the photometric sensor, the photometry circuit 202 amplifies the luminance signal output corresponding to the brightness of the field, applies logarithmic compression and A/D conversion, and sends the result to the CPU 3 as field luminance information.

The automatic focus detection circuit 203 A/D-converts the signal voltages from a plurality of pixels in the image sensor 2 that are used for phase difference detection and sends them to the CPU 3. From these signal voltages, the CPU 3 calculates the distance to the subject corresponding to each focus detection point. This is a known technique called imaging-plane phase-difference AF. In this example, there are 180 focus detection points on the imaging surface of the finder 10, as shown in the in-finder field-of-view image (viewing image) of FIG. 4.

The signal input circuit 204 is connected to switches SW1 and SW2 (not shown). The switch SW1 turns ON at the first stroke of the release button 5 (FIG. 2(a)) and starts the photometry, distance measurement, line-of-sight detection, and similar operations of the imaging apparatus 1. The switch SW2 turns ON at the second stroke of the release button 5 and starts the release operation. Signals from the switches SW1 and SW2 are input to the signal input circuit 204 and sent to the CPU 3.

The line-of-sight detection reliability determination circuit 31 (reliability determining means) determines the reliability of the first estimated gaze point position calculated by the CPU 3. The determination is based on the difference between two sets of eye image data: the eye image data acquired during calibration, described later, and the eye image data acquired during shooting. Specifically, the differences are differences in pupil diameter, in the number of corneal reflection images, and in the intrusion of external light, each detected from the two sets of eye image data. More specifically, pupil edges are calculated by the line-of-sight detection method described later with reference to FIGS. 5 to 7; for example, the reliability is determined to be high when the number of extracted pupil edges is at or above a threshold, and low otherwise. This is because the pupil 141 (FIG. 5) of the user's eyeball 14 is estimated by joining the pupil edges, so the more pupil edges can be extracted, the higher the estimation accuracy. Alternatively, the reliability may be determined from how far the pupil 141 calculated by joining the pupil edges is distorted from a circle. As yet another method, the reliability may be determined to be high near the index the user was made to gaze at during calibration, described later, and lower with increasing distance from the index. When the user's line-of-sight position information calculated by the line-of-sight detection circuit 201 is sent to the CPU 3, the line-of-sight detection reliability determination circuit 31 sends the reliability of that information to the CPU 3.
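Two of the reliability criteria above, the pupil-edge count and the distortion of the pupil from a circle, can be sketched in Python as follows. The function names, the default thresholds, and the particular distortion measure (relative spread of edge radii about the centroid) are illustrative assumptions; the text presents the criteria as alternatives and does not specify thresholds or a formula, and combining both checks in `is_reliable` is a choice made only for this sketch.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # assumed (x, y) of one extracted pupil edge


def enough_pupil_edges(edges: List[Point], min_edges: int) -> bool:
    """High reliability when the number of extracted edges reaches the threshold."""
    return len(edges) >= min_edges


def circle_distortion(edges: List[Point]) -> float:
    """Relative spread of edge radii about their centroid (0.0 for a circle).

    A crude measure: the centroid approximates the pupil center only when
    the edges are spread fairly evenly around the pupil.
    """
    cx = sum(x for x, _ in edges) / len(edges)
    cy = sum(y for _, y in edges) / len(edges)
    radii = [math.hypot(x - cx, y - cy) for x, y in edges]
    mean_r = sum(radii) / len(radii)
    if mean_r == 0.0:
        return math.inf  # degenerate input: all edges at one point
    variance = sum((r - mean_r) ** 2 for r in radii) / len(radii)
    return math.sqrt(variance) / mean_r


def is_reliable(edges: List[Point], min_edges: int = 8,
                max_distortion: float = 0.1) -> bool:
    """Assumed combination of both checks; thresholds are placeholders."""
    if not enough_pupil_edges(edges, min_edges):
        return False
    return circle_distortion(edges) <= max_distortion
```

Checking the edge count first keeps the distortion measure from being evaluated on too few points to be meaningful.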

Under the control of the CPU 3, the communication circuit 32 communicates with a PC (not shown) on a server through a network (not shown) such as a LAN or the Internet.

The operation members 41 to 43 described above are configured to pass their operation signals to the CPU 3, which accordingly performs operations such as the manually operated movement of the first estimated gaze point position, described later.

FIG. 4 shows the field of view in the finder while the finder 10 is operating (displaying the viewing image).

As shown in FIG. 4, the field of view in the finder contains a field mask 300, a focus detection area 400, 180 distance-measuring point indices 4001 to 4180, and so on.

Each of the distance-measuring point indices 4001 to 4180 is superimposed on the through image (live view image) displayed on the finder 10 at a position corresponding to one of the focus detection points on the imaging surface of the finder 10. Of the indices 4001 to 4180, the one that coincides with position A, the current first estimated gaze point position, is highlighted on the finder 10.

Next, the line-of-sight detection method used by the imaging apparatus 1 is described with reference to FIGS. 5 to 7.

FIG. 5 is a diagram for explaining the principle of the line-of-sight detection method and is a schematic view of the optical system used for line-of-sight detection.

In FIG. 5, the light sources 13a and 13b are light sources such as light-emitting diodes that emit infrared light the user cannot perceive; they are arranged approximately symmetrically about the optical axis of the light-receiving lens 16 and illuminate the user's eyeball 14. Part of the illumination light emitted from the light sources 13a and 13b and reflected by the eyeball 14 is focused onto the eye image sensor 17 by the light-receiving lens 16.

FIG. 6(a) is a schematic view of the eye image captured by the eye image sensor 17 (the eye image projected onto the eye image sensor 17), and FIG. 6(b) shows the output intensity of the photoelectric element rows of the eye image sensor 17.

FIG. 7 is a flowchart of the line-of-sight detection processing. The CPU 3 executes this processing by reading out a program recorded in a ROM not shown in FIG. 3.

When the line-of-sight detection processing of FIG. 7 starts, in step S701 the CPU 3 causes the light sources 13a and 13b to emit infrared light toward the user's eyeball 14. The image of the user's eye illuminated by the infrared light is formed on the eye image sensor 17 through the light-receiving lens 16 and photoelectrically converted by the eye image sensor 17. This yields a processable electrical signal of the eye image (eye image data).

In step S702, the CPU 3 acquires the eye image data obtained as described above from the eye image sensor 17.

In step S703, the CPU 3 detects, from the eye image data obtained in step S702, the coordinates corresponding to the corneal reflection images Pd and Pe of the light sources 13a and 13b and to the pupil center c.

The infrared light emitted from the light sources 13a and 13b illuminates the cornea 142 of the user's eyeball 14. The corneal reflection images Pd and Pe, formed by part of the infrared light reflected at the surface of the cornea 142, are focused by the light-receiving lens 16 and imaged on the eye image sensor 17 as corneal reflection images Pd' and Pe'. Similarly, the light beams from the edges a and b of the pupil 141 are imaged on the eye image sensor 17 as pupil edge images a' and b'.

FIG. 6(b) shows the luminance information (luminance distribution) of region α in the eye image of FIG. 6(a). In FIG. 6(b), with the horizontal direction of the eye image as the X axis and the vertical direction as the Y axis, the luminance distribution along the X axis is shown. In this example, the X-axis (horizontal) coordinates of the corneal reflection images Pd' and Pe' are Xd and Xe, and the X-axis coordinates of the pupil edge images a' and b' are Xa and Xb. As shown in FIG. 6(b), an extremely high luminance level is obtained at the coordinates Xd and Xe of the corneal reflection images Pd' and Pe'. In the range greater than Xa and less than Xb, which corresponds to the region of the pupil 141 (the region of the pupil image 141' obtained when the light beam from the pupil 141 is imaged on the eye image sensor 17), an extremely low luminance level is obtained except at the coordinates Xd and Xe. In the region of the iris 143 outside the pupil 141 (the region of the iris image 143' outside the pupil image 141', obtained when the light beam from the iris 143 is imaged), a luminance intermediate between these two levels is obtained. Specifically, this intermediate luminance is obtained where the X coordinate is less than Xa and where the X coordinate is greater than Xb.

From a luminance distribution like that of FIG. 6(b), the X coordinates Xd and Xe of the corneal reflection images Pd' and Pe' and the X coordinates Xa and Xb of the pupil edge images a' and b' can be obtained. Specifically, the coordinates with extremely high luminance are taken as the coordinates of the corneal reflection images Pd' and Pe', and the coordinates with extremely low luminance are taken as the coordinates of the pupil edge images a' and b'. When the rotation angle θx of the optical axis of the eyeball 14 relative to the optical axis of the light-receiving lens 16 is small, the coordinate Xc of the pupil center image c' (the center of the pupil image 141'), obtained when the light beam from the pupil center c is imaged on the eye image sensor 17, can be expressed as Xc ≈ (Xa + Xb) / 2. That is, the X coordinate Xc of the pupil center image c' can be calculated from the X coordinates Xa and Xb of the pupil edge images a' and b'. In this way, the X coordinates of the corneal reflection images Pd' and Pe' and of the pupil center image c' can be estimated.
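As a rough illustration of the feature extraction described for FIG. 6(b), the following Python sketch scans one row of luminance values. The names `detect_features` and `EyeFeatures` and the two thresholds are hypothetical; the patent gives no numeric thresholds. The sketch assumes exactly two bright runs (one per light source), takes the first and last dark pixel as the pupil edges, and arbitrarily assigns the left-hand reflection to Xd, which does not affect the midpoint of the two reflections.

```python
from typing import List, NamedTuple, Optional


class EyeFeatures(NamedTuple):
    xd: float  # X coordinate of one corneal reflection image
    xe: float  # X coordinate of the other corneal reflection image
    xa: int    # X coordinate of one pupil edge image
    xb: int    # X coordinate of the other pupil edge image
    xc: float  # estimated pupil center, (xa + xb) / 2


def detect_features(profile: List[int], high_threshold: int,
                    low_threshold: int) -> Optional[EyeFeatures]:
    """Locate reflections and pupil edges in one row of luminance values."""
    # Corneal reflections: contiguous runs of extremely bright pixels.
    bright_runs = []
    run_start = None
    for x, value in enumerate(profile):
        if value >= high_threshold:
            if run_start is None:
                run_start = x
        elif run_start is not None:
            bright_runs.append((run_start, x - 1))
            run_start = None
    if run_start is not None:
        bright_runs.append((run_start, len(profile) - 1))

    # Pupil: extremely dark pixels; the outermost ones approximate the edges.
    dark = [x for x, value in enumerate(profile) if value <= low_threshold]
    if len(bright_runs) != 2 or not dark:
        return None  # features could not be found in this row

    xd, xe = ((start + end) / 2 for start, end in bright_runs)
    xa, xb = dark[0], dark[-1]
    return EyeFeatures(xd, xe, xa, xb, (xa + xb) / 2)
```

Returning `None` when the expected features are missing gives the caller a simple signal that this row cannot be used, for example when a reflection is hidden by the eyelid.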

図７に戻り、ステップＳ７０４では、ＣＰＵ３は、眼球像の結像倍率βを算出する。結像倍率βは、受光レンズ１６に対する眼球１４の位置により決まる倍率で、角膜反射像Ｐｄ‘、Ｐｅ’の間隔（Ｘｄ－Ｘｅ）の関数として求めることができる。 Returning to FIG. 7, in step S704, the CPU 3 calculates the imaging magnification β of the eyeball image. The imaging magnification β is a magnification determined by the position of the eyeball 14 with respect to the light receiving lens 16, and can be obtained as a function of the interval (Xd-Xe) between the corneal reflection images Pd' and Pe'.

In step S705, the CPU 3 calculates the rotation angle of the optical axis of the eyeball 14 relative to the optical axis of the light receiving lens 16. The X coordinate of the midpoint between the corneal reflection image Pd and the corneal reflection image Pe substantially coincides with the X coordinate of the center of curvature O of the cornea 142. Therefore, if the standard distance from the center of curvature O of the cornea 142 to the center c of the pupil 141 is Oc, the rotation angle θx of the eyeball 14 in the Z-X plane (the plane perpendicular to the Y axis) can be calculated by Equation 1 below. The rotation angle θy of the eyeball 14 in the Z-Y plane (the plane perpendicular to the X axis) can be calculated in the same way as θx.

β × Oc × sin θx ≈ {(Xd + Xe)/2} − Xc ... (Equation 1)

In step S706, the CPU 3 acquires the correction coefficients (the coefficient m and the line-of-sight correction coefficients Ax, Bx, Ay, By) from the memory unit 4. The coefficient m is a constant determined by the configuration of the finder optical system (the light receiving lens 16 and so on) of the imaging device 1. It is a conversion coefficient that converts the rotation angles θx and θy into the coordinates corresponding to the pupil center c in the image for visual recognition, and it is determined in advance and stored in the memory unit 4. The line-of-sight correction coefficients Ax, Bx, Ay, and By are parameters that correct for individual differences between eyeballs; they are obtained through the calibration work described later and are stored in the memory unit 4 before this processing starts.

In step S707, the CPU 3 instructs the line-of-sight detection circuit 201 to calculate the position of the user's viewpoint on the image for visual recognition displayed on the finder 10 (the first estimated gaze point position). Specifically, the line-of-sight detection circuit 201 calculates the first estimated gaze point position from the rotation angles θx and θy of the eyeball 14 calculated in step S705 and the correction coefficient data acquired in step S706. Assuming that the coordinates (Hx, Hy) of the first estimated gaze point position are the coordinates corresponding to the pupil center c, they can be calculated by Equations 2 and 3 below.

Hx = m × (Ax × θx + Bx) ... (Equation 2)
Hy = m × (Ay × θy + By) ... (Equation 3)

In step S708, the CPU 3 stores the coordinates (Hx, Hy) of the first estimated gaze point position calculated in step S707 in the memory unit 4 and ends this processing.
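Steps S705 to S707 can be sketched as follows. This is a sketch only: the function names are hypothetical, angles are assumed to be in radians, and clamping the sine value is an added safeguard against measurement noise that the source does not mention.

```python
import math
from typing import Tuple


def rotation_angle(xd: float, xe: float, xc: float, beta: float, oc: float) -> float:
    """Solve Equation 1 for the rotation angle: beta * Oc * sin(theta) ≈ (Xd + Xe)/2 - Xc."""
    s = ((xd + xe) / 2 - xc) / (beta * oc)
    s = max(-1.0, min(1.0, s))  # clamp so asin stays defined (assumption, not in the source)
    return math.asin(s)


def gaze_point(theta_x: float, theta_y: float, m: float,
               ax: float, bx: float, ay: float, by: float) -> Tuple[float, float]:
    """Equations 2 and 3: convert rotation angles into coordinates (Hx, Hy)."""
    hx = m * (ax * theta_x + bx)
    hy = m * (ay * theta_y + by)
    return hx, hy
```

The same `rotation_angle` function serves for θy when it is given the corresponding Y-axis coordinates.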

As described above, the line-of-sight detection processing of this embodiment calculates the first estimated gaze point position from the rotation angles θx and θy of the eyeball 14 and the correction coefficients (the coefficient m and the line-of-sight correction coefficients Ax, Bx, Ay, By) obtained in advance, for example through the calibration work described later.

However, because of factors such as individual differences in the shape of the human eyeball, the first estimated gaze point position cannot always be estimated with high accuracy. Specifically, unless the line-of-sight correction coefficients Ax, Ay, Bx, and By are adjusted to values suited to the user, a deviation arises, as shown in FIG. 4(b), between position B, where the user is actually gazing, and position C, the first estimated gaze point position calculated in step S707. In FIG. 4(b), the user is gazing at the person at position B, but the imaging device 1 incorrectly estimates that the user is gazing at the background at position C, so appropriate focus detection and adjustment are not possible.

Therefore, the CPU 3 (calibration means) performs calibration work before the imaging device 1 captures an image (performs focus detection), acquires line-of-sight correction coefficients Ax, Ay, Bx, and By suited to the user, and stores them in the memory unit 4.

Conventionally, calibration work is performed before image capture by highlighting, in the image for visual recognition, a plurality of indices D1 to D5 at different positions as shown in FIG. 4(c) and having the user look at those indices. A known technique performs line-of-sight detection processing while the user gazes at each index and obtains line-of-sight correction coefficients Ax, Ay, Bx, and By suited to the user from the coordinates of the calculated first estimated gaze point positions and the coordinates of the indices. As long as the position the user should look at is indicated, an index need not be displayed; the position may instead be emphasized by a change in luminance or color.
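The source does not say how the coefficients are derived from the index coordinates. One common choice, shown here purely as an assumption, is an ordinary least-squares fit of the linear relation H = m × (A × θ + B) along one axis. The function name `fit_correction` is hypothetical.

```python
from typing import Sequence, Tuple


def fit_correction(thetas: Sequence[float], targets: Sequence[float], m: float) -> Tuple[float, float]:
    """Least-squares fit of (A, B) in target = m * (A * theta + B) for one axis.

    thetas:  rotation angles measured while the user gazes at each index
    targets: known coordinate of each index along the same axis
    """
    n = len(thetas)
    ys = [t / m for t in targets]          # divide out m so the model is y = A*theta + B
    mean_t = sum(thetas) / n
    mean_y = sum(ys) / n
    var = sum((t - mean_t) ** 2 for t in thetas)
    if var == 0:
        raise ValueError("need at least two distinct rotation angles")
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(thetas, ys))
    a = cov / var
    b = mean_y - a * mean_t
    return a, b
```

Calling it once with X-axis data and once with Y-axis data would give (Ax, Bx) and (Ay, By) respectively.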

しかしながら、先述したように撮影時とキャリブレーション時の条件の違いによっては、視線検出の精度が落ちてしまう。例えば外光の入り込みや、ファインダ１０を覗くユーザの目の距離が撮影時とキャリブレーション時で異なる時などである。 However, as described above, the accuracy of line-of-sight detection is degraded depending on the difference in conditions at the time of photographing and at the time of calibration. For example, this is the case when external light enters, or when the distance of the user's eyes looking into the finder 10 differs between when shooting and when performing calibration.

FIG. 8 shows an example of eye image data, at calibration and at shooting, for which the line-of-sight detection reliability determination circuit 31 judges the reliability to be low. FIG. 8(a) is an example of eye image data acquired from the line-of-sight detection circuit 201 during calibration, and FIG. 8(b) is an example acquired during shooting. FIG. 8(b) shows a case in which the user's eye is farther from the eyepiece window 6 during shooting than during calibration, so that the eye detected in the eye image data appears smaller. Eye image data like that in FIG. 8 may be obtained when, for example, calibration is performed with the optical axis of the imaging device 1 held horizontal and the imaging device 1 is then pointed downward to photograph a flower blooming on the ground.

In such a case, the reliability of the first estimated gaze point position output from the line-of-sight detection reliability determination circuit 31 is low. In this embodiment, the CPU 3 therefore estimates a second estimated gaze point position with a reasoner that uses a neural network, more specifically a CNN. CNN stands for convolutional neural network, a type of network widely used for image recognition.

In this embodiment, the CPU 3 performs the calculations of the reasoner that uses the CNN. The basic configuration of the CNN is described with reference to FIGS. 13 and 14.

FIG. 13 shows the basic configuration of the CNN that estimates the second estimated gaze point position from the eye image data output from the line-of-sight detection circuit 201 to the CPU 3.

処理の流れは、左端を入力とし、右方向に処理が進んでいく。ＣＮＮは、特徴検出層（Ｓ層）と特徴統合層（Ｃ層）と呼ばれる２つの層をひとつのセットとし、それが階層的に構成されている。 In the flow of processing, the left end is the input and the processing proceeds to the right. The CNN has a set of two layers called a feature detection layer (S layer) and a feature integration layer (C layer), which are hierarchically configured.

In the CNN, each S layer first detects the next feature based on the features detected in the preceding layer. The features detected in the S layer are then integrated in the C layer and passed to the next layer as the detection result of that layer.

Ｓ層は特徴検出細胞面からなり、特徴検出細胞面ごとに異なる特徴を検出する。また、Ｃ層は、特徴統合細胞面からなり、前段の特徴検出細胞面での検出結果をプーリングする。以下では、特に区別する必要がない場合、特徴検出細胞面および特徴統合細胞面を総称して特徴面と呼ぶ。本実施形態では、最終段階層である出力層ではＣ層は用いずＳ層のみで構成している。 The S layer consists of feature detection cell planes, and detects different features for each feature detection cell plane. Also, the C layer consists of a feature integration cell plane, and pools the detection results of the feature detection cell plane in the preceding stage. Hereinafter, the feature detection cell plane and the feature integration cell plane will be collectively referred to as feature planes when there is no particular need to distinguish them. In this embodiment, the output layer, which is the final stage layer, is composed only of the S layer without using the C layer.

The feature detection processing in a feature detection cell plane and the feature integration processing in a feature integration cell plane are described in detail with reference to FIG. 14. A feature detection cell plane consists of a plurality of feature detection neurons, which are connected in a predetermined structure to the C layer of the preceding layer. A feature integration cell plane consists of a plurality of feature integration neurons, which are connected in a predetermined structure to the S layer of the same layer. As shown in FIG. 14, the output value of the feature detection neuron at position (ξ, ζ) in the M-th cell plane of the S layer of the L-th layer is written y_M^LS(ξ, ζ), and the output value of the feature integration neuron at position (ξ, ζ) in the M-th cell plane of the C layer of the L-th layer is written y_M^LC(ξ, ζ). With the coupling coefficients of the respective neurons written w_M^LS(n, u, v) and w_M^LC(u, v), the output values can be expressed by Equations 4 and 5 below.
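The images of Equations 4 and 5 are not present in this text. Based only on the symbol definitions given in the surrounding paragraphs (an activation function f applied to an internal state that is a sum of products over n, u, v for the S layer, and a plain linear sum over u, v for the C layer), they would plausibly take the following form. This is a reconstruction, not a copy of the published equations.

```latex
y_M^{LS}(\xi,\zeta) \equiv f\!\left(u_M^{LS}(\xi,\zeta)\right)
  \equiv f\!\left(\sum_{n,u,v} w_M^{LS}(n,u,v)\, y_n^{(L-1)C}(\xi+u,\zeta+v)\right) \tag{4}

y_M^{LC}(\xi,\zeta) \equiv u_M^{LC}(\xi,\zeta)
  \equiv \sum_{u,v} w_M^{LC}(u,v)\, y_M^{LS}(\xi+u,\zeta+v) \tag{5}
```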

In Equation 4, f is an activation function; any sigmoid function, such as the logistic function or the hyperbolic tangent, may be used. u_M^LS(ξ, ζ) in Equation 4 is the internal state of the feature detection neuron at position (ξ, ζ) in the M-th cell plane of the S layer of the L-th layer. Equation 5, by contrast, takes a simple linear sum without an activation function, so the internal state u_M^LC(ξ, ζ) of the feature integration neuron at position (ξ, ζ) in the M-th cell plane of the C layer of the L-th layer is equal to the output value y_M^LC(ξ, ζ) calculated by Equation 5. The terms y_n^(L−1)C(ξ+u, ζ+v) in Equation 4 and y_M^LS(ξ+u, ζ+v) in Equation 5 are called the connection-destination output value of the feature detection neuron and the connection-destination output value of the feature integration neuron, respectively.

式４，５中のξ，ζ，ｕ，ｖ，ｎについて説明する。 ξ, ζ, u, v, and n in Equations 4 and 5 will be explained.

The position (ξ, ζ) corresponds to position coordinates in the input image. For example, a high output value y_M^LS(ξ, ζ) from Equation 4 means that the feature detected by the M-th cell plane of the S layer of the L-th layer is likely to be present at pixel position (ξ, ζ) of the input image.

In Equation 4, n refers to the n-th cell plane of the C layer of the (L−1)-th layer and is called the integration-destination feature number. Basically, the sum-of-products operation is performed over all cell planes in the C layer of the (L−1)-th layer.

（ｕ，ｖ）は、結合係数の相対位置座標であり、検出する特徴のサイズに応じて有限の範囲（ｕ，ｖ）において積和演算を行う。このような有限な（ｕ，ｖ）の範囲を受容野と呼ぶ。また受容野の大きさを、以下では受容野サイズと呼び、結合している範囲の横画素数×縦画素数で表す。 (u, v) are the relative position coordinates of the coupling coefficient, and the sum-of-products operation is performed in a finite range (u, v) according to the size of the feature to be detected. Such a finite range of (u, v) is called a receptive field. The size of the receptive field is hereinafter referred to as the size of the receptive field, and is represented by the number of horizontal pixels×the number of vertical pixels in the combined range.

In Equation 4, when L = 1, that is, in the very first S layer, y_n^(L−1)C(ξ+u, ζ+v) is the input image y^in_image(ξ+u, ζ+v). Because neurons and pixels are distributed discretely and the connection-destination feature numbers are also discrete, ξ, ζ, u, v, and n take discrete rather than continuous values. Here ξ and ζ are non-negative integers, n is a natural number, and u and v are integers, all within finite ranges.

w_M^LS(n, u, v) in Equation 4 is a coupling coefficient distribution for detecting a predetermined feature; adjusting it to appropriate values makes detection of that feature possible. This adjustment of the coupling coefficient distribution is learning. In constructing the CNN, various test patterns are presented and the coupling coefficients are corrected repeatedly and gradually so that the output value y_M^LS(ξ, ζ) calculated by Equation 4 becomes appropriate.

Next, w_M^LC(u, v) in Equation 5 uses a two-dimensional Gaussian function and can be expressed as in Equation 6 below.
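The image of Equation 6 is not present in this text. A standard two-dimensional Gaussian with feature size factor σ would read as follows; the normalization constant and the absence of layer or plane indices on σ are assumptions, since the source only states that a two-dimensional Gaussian function is used.

```latex
w_M^{LC}(u,v) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{u^{2}+v^{2}}{2\sigma^{2}}\right) \tag{6}
```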

Here too, (u, v) has a finite range; as in the description of the feature detection neuron, this finite range is called the receptive field and its size the receptive field size. The receptive field size may be set to an appropriate value according to the size of the M-th feature of the S layer of the L-th layer. σ in Equation 6 is a feature size factor and may be set to an appropriate constant according to the receptive field size; specifically, it is best set so that the outermost values of the receptive field can be regarded as approximately 0. By performing these calculations in each layer, the CNN of this embodiment is configured to estimate the second estimated gaze point position in the S layer of the final layer.
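The S-layer and C-layer calculations described above (a sigmoid of a sum of products over n, u, v for feature detection, and a Gaussian-weighted linear sum for feature integration) can be sketched in plain Python as follows. This is an illustration under assumptions: the function names are hypothetical, the logistic function is picked as the activation, positions outside the plane are treated as zero, ξ is taken as the column index and ζ as the row index, and the Gaussian is normalized in the standard way.

```python
import math
from typing import List

Plane = List[List[float]]


def s_layer(prev_planes: List[Plane], weights: List[Plane], half: int) -> Plane:
    """Feature detection for one cell plane: logistic function of a sum over n, u, v.

    prev_planes[n][row][col] holds the preceding C-layer outputs (or the input
    image when L = 1); weights[n][v + half][u + half] holds w(n, u, v).
    """
    rows, cols = len(prev_planes[0]), len(prev_planes[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for zeta in range(rows):
        for xi in range(cols):
            state = 0.0  # internal state u(xi, zeta)
            for n, plane in enumerate(prev_planes):
                for v in range(-half, half + 1):
                    for u in range(-half, half + 1):
                        r, c = zeta + v, xi + u
                        if 0 <= r < rows and 0 <= c < cols:  # zero padding (assumption)
                            state += weights[n][v + half][u + half] * plane[r][c]
            out[zeta][xi] = 1.0 / (1.0 + math.exp(-state))
    return out


def gaussian_weights(half: int, sigma: float) -> Plane:
    """Two-dimensional Gaussian coupling coefficients over the receptive field."""
    norm = 2.0 * math.pi * sigma * sigma
    return [[math.exp(-(u * u + v * v) / (2.0 * sigma * sigma)) / norm
             for u in range(-half, half + 1)]
            for v in range(-half, half + 1)]


def c_layer(s_plane: Plane, half: int, sigma: float) -> Plane:
    """Feature integration: plain weighted sum, no activation function."""
    g = gaussian_weights(half, sigma)
    rows, cols = len(s_plane), len(s_plane[0])
    out = [[0.0] * cols for _ in range(rows)]
    for zeta in range(rows):
        for xi in range(cols):
            for v in range(-half, half + 1):
                for u in range(-half, half + 1):
                    r, c = zeta + v, xi + u
                    if 0 <= r < rows and 0 <= c < cols:
                        out[zeta][xi] += g[v + half][u + half] * s_plane[r][c]
    return out
```

Here the receptive field size is (2 × half + 1) × (2 × half + 1) pixels; choosing `sigma` small relative to `half` makes the outermost Gaussian weights close to 0, as the text recommends.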

When creating a reasoner that takes eye image data of the user looking through the finder 10 as input data and outputs the second estimated gaze point position as its inference result, what matters is how the correct data (correct position) is defined. When the reliability of the first estimated gaze point position calculated by the line-of-sight detection reliability determination circuit 31 is high, the first estimated gaze point position may be used as the correct position. Otherwise, using the first estimated gaze point position as the correct position would prevent learning from raising the accuracy rate. In this embodiment, therefore, when the reliability of the first estimated gaze point position is low, other information obtained from the imaging device 1 is collected as the correct data.

図９は、推論器を作成する際の学習に用いる正解データの収集処理のフローチャートである。本処理はＣＰＵ３が、図３において不図示のＲＯＭに記録されるプログラムを読み出すことにより実行される。 FIG. 9 is a flow chart of processing for collecting correct data used for learning when creating a reasoner. This process is executed by the CPU 3 reading out a program recorded in a ROM (not shown in FIG. 3).

ステップＳ９０１において、ＣＰＵ３は、ユーザがファインダ１０を覗いているかどうかを監視する。これは例えば、視線検出回路２０１より出力される画像データが眼画像データであるか否かで判断することが可能である。尚、ユーザがファインダ１０を覗いているかどうかが監視できる方法であれば特にこれに限るものではなく、接眼レンズ１２の周囲に設けられる不図示の光センサを用いて、接眼レンズ１２への接眼の有無を検知してもよい。ユーザがファインダ１０を覗いていると判断した場合、ステップＳ９０２へと進む。 In step S901, CPU 3 monitors whether the user is looking through finder 10 or not. This can be determined, for example, by whether or not the image data output from the line-of-sight detection circuit 201 is eye image data. The method is not particularly limited to this as long as it is possible to monitor whether or not the user is looking through the viewfinder 10. An optical sensor (not shown) provided around the eyepiece lens 12 is used to detect eye contact with the eyepiece lens 12. Presence/absence may be detected. If it is determined that the user is looking through the finder 10, the process proceeds to step S902.

ステップＳ９０２において、ＣＰＵ３は、視線検出回路２０１より出力される眼画像データから第１の推定注視点位置を算出すると共に、視線検出信頼度判別回路３１より出力される信頼度を取得する。その後ステップＳ９０３へと進む。 In step S902 , the CPU 3 calculates the first estimated gazing point position from the eye image data output from the line-of-sight detection circuit 201 and acquires the reliability output from the line-of-sight detection reliability determination circuit 31 . After that, the process proceeds to step S903.

ステップＳ９０３において、ＣＰＵ３は、視線検出信頼度判別回路３１より出力された信頼度が高い場合には、ステップＳ９０４へと進む。また信頼度が低い場合には、ステップＳ９０５へと進む。 In step S903, when the reliability output from the line-of-sight detection reliability determination circuit 31 is high, the CPU 3 proceeds to step S904. If the reliability is low, the process proceeds to step S905.

ステップＳ９０４において、ＣＰＵ３は、第１の推定注視点位置を正解位置として収集した後、本処理を終了する。 In step S904, CPU 3 completes this process after collecting the first estimated gazing point position as the correct position.

ステップＳ９０５において、ＣＰＵ３は、第１の推定注視点位置の付近に被写体があるかどうかを判別する。この判別では、被写体として人物を検出してもよいし、瞳を検出してもよい。この判別の結果、第１の推定注視点位置の付近に被写体が検出された場合、ステップＳ９０６へと進む一方、そうでない場合、ステップＳ９０７へと進む。 In step S905, CPU 3 determines whether or not there is a subject near the first estimated gazing point position. In this determination, a person may be detected as a subject, or eyes may be detected. As a result of this determination, if the subject is detected near the first estimated gazing point position, the process proceeds to step S906. Otherwise, the process proceeds to step S907.

ステップＳ９０６において、ＣＰＵ３（収集手段）は、第１の推定注視点位置の付近に被写体の座標を正解位置として収集した後、本処理を終了する。 In step S906, CPU 3 (collecting means) collects the coordinates of the subject near the first estimated gaze point position as the correct position, and then terminates this process.

In step S907, the CPU 3 (display control means) highlights the first estimated gaze point position (in this processing, position C in FIG. 4(b)) on the finder 10 so that it can be moved to another position (in this processing, position B in FIG. 4(b)) by a first user operation. The first user operation is a manual operation by the user with any of the operation members 41 to 43. The CPU 3 then determines whether, after the first user operation has moved the highlighted position from position C to position B, the imaging device 1 has captured an image with position B as the focus position (that is, whether the release button 5 has been pressed, which is the second user operation). The processing proceeds to step S908 only when such an image has been captured. The second user operation need only be a user operation that fixes the other position selected by the user on the finder 10 as the focus position; it may be an operation other than pressing the release button 5.

ステップＳ９０８において、ＣＰＵ３（収集手段）は、撮影画像のフォーカス位置（他の位置）の座標を正解位置として収集した後、本処理を終了する。 In step S908, the CPU 3 (collecting means) collects the coordinates of the focus position (other position) of the captured image as the correct position, and then terminates this process.
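The branching of steps S903 to S908 can be summarized in a short sketch. The function and parameter names are hypothetical; subject detection and the user interaction of step S907 are assumed to have happened elsewhere and are passed in as already-resolved values, with `None` meaning "not available".

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def collect_correct_position(
    first_estimate: Point,
    reliability_is_high: bool,
    subject_near_estimate: Optional[Point],
    shot_focus_position: Optional[Point],
) -> Optional[Point]:
    """Return the correct position for one eye image, or None if nothing is collected."""
    if reliability_is_high:
        # S903 -> S904: the first estimate itself is trusted.
        return first_estimate
    if subject_near_estimate is not None:
        # S905 -> S906: a subject (person or eye) was detected near the estimate.
        return subject_near_estimate
    # S907 -> S908: the user moved the marker and shot with that focus position.
    # None here means no such shot was taken, so no sample is collected.
    return shot_focus_position
```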

The CPU 3 uses the communication circuit 32 to transmit the eye image data (input data) collected in the processing of FIG. 9 and the corresponding correct positions (correct data) to a PC on a server (not shown) over a network such as a LAN or the Internet. The PC on the server performs machine learning of the CNN with these data and transmits the "reasoner" generated as the learning result to the imaging device 1. Alternatively, the imaging device 1 may have a high-performance GPU, and the machine learning of the CNN may be performed by that GPU, alone or in cooperation with the CPU 3.

次に、サーバ上のＰＣでＣＮＮの機械学習が行われ、生成された推論器の使い方に関して説明する。 Next, a description will be given of how to use an inference device generated by performing CNN machine learning on a PC on the server.

図１０は、撮影時のフォーカス処理のフローチャートである。本処理はＣＰＵ３が、図３において不図示のＲＯＭに記録されるプログラムを読み出すことにより実行される。 FIG. 10 is a flowchart of focus processing during shooting. This process is executed by the CPU 3 reading out a program recorded in a ROM (not shown in FIG. 3).

図１０において、まず、ＣＰＵ３は、ステップＳ９０１～Ｓ９０３の処理を行う。これらの処理は、図９の説明で前述しているため、重複した説明を割愛する。 In FIG. 10, the CPU 3 first performs steps S901 to S903. Since these processes have already been described in the description of FIG. 9, redundant description will be omitted.

ステップＳ９０３において、ＣＰＵ３は、第１の推定注視点位置の信頼度が高い場合はステップＳ１００４へと進み、信頼度が低い場合はステップＳ１００５と進む。 In step S903, CPU 3 proceeds to step S1004 if the reliability of the first estimated gaze point position is high, and proceeds to step S1005 if the reliability is low.

ステップＳ１００４において、ＣＰＵ３（第１のフォーカス手段）は、第１の推定注視点位置がユーザの望むフォーカス位置であると判断し、第１の推定注視点位置に基づいてフォーカスを行う。この処理はＣＰＵ３の指示により、自動焦点検出回路２０３と焦点調節回路１１８を動作させることで実現される。具体的には、まず、自動焦点検出回路２０３が、第１の推定注視点位置と一致する焦点検出ポイントに対応する被写体までの距離を演算する。その後、焦点調節回路１１８が、この情報にもとづいてレンズ駆動用モーター１１３を所定量駆動させ、撮影レンズ１Ａを合焦点位置に移動させる。その後、本処理を終了する。 In step S1004, CPU 3 (first focusing means) determines that the first estimated position of the point of gaze is the focus position desired by the user, and performs focusing based on the first estimated position of the point of gaze. This processing is realized by operating the automatic focus detection circuit 203 and the focus adjustment circuit 118 according to instructions from the CPU 3 . Specifically, first, the automatic focus detection circuit 203 calculates the distance to the subject corresponding to the focus detection point that matches the first estimated gazing point position. After that, the focus adjustment circuit 118 drives the lens driving motor 113 by a predetermined amount based on this information to move the photographing lens 1A to the in-focus position. After that, this process is terminated.

ステップＳ１００５において、ＣＰＵ３が、サーバ上のＰＣから送信された推論器に、ステップＳ９０２で視線検出回路２０１より出力された眼画像データを入力し、第２の推定注視点位置を推定する。尚、本実施例では、ＣＰＵ３が第２の推定注視点位置の推定を行っているが、これに限定されない。例えば、サーバ上のＰＣが第２の推定注視点位置の推定を行ってもよい。この場合、ＣＰＵ３は、ステップＳ９０２で視線検出回路２０１より出力された眼画像データをサーバ上のＰＣに送信し、サーバ上のＰＣが推論器を用いて第２の推定注視点位置の推定を行い、推論結果を撮像装置１のＣＰＵ３に出力する。その後ステップＳ１００６へと進む。 In step S1005, the CPU 3 inputs the eye image data output from the line-of-sight detection circuit 201 in step S902 to the inference unit transmitted from the PC on the server, and estimates the second estimated gazing point position. In this embodiment, the CPU 3 estimates the second estimated gaze point position, but the present invention is not limited to this. For example, a PC on the server may estimate the second estimated point-of-regard position. In this case, the CPU 3 transmits the eye image data output from the line-of-sight detection circuit 201 in step S902 to the PC on the server, and the PC on the server uses the inference device to estimate the second estimated gaze point position. , outputs the inference result to the CPU 3 of the imaging device 1 . After that, the process proceeds to step S1006.

ステップＳ１００６において、ＣＰＵ３は、ステップＳ１００５で推定された第２の推定注視点位置の信頼度が高いと判断した場合はステップＳ１００７へと進む。信頼度が低いと判断した場合はステップＳ１００８へと進む。尚、本実施例の推論器においては、１８０か所の焦点検出ポイントの夫々についてその尤度が算出される。よって、最も高い尤度が算出された焦点検出ポイントを第２の推定注視点位置とする。また最も高い尤度の値が閾値以上である場合、第２の推定注視点位置の信頼度が高いと判別される。 In step S1006, when CPU 3 determines that the reliability of the second estimated gaze point position estimated in step S1005 is high, the process proceeds to step S1007. If it is determined that the reliability is low, the process proceeds to step S1008. In the inference unit of this embodiment, the likelihood is calculated for each of the 180 focus detection points. Therefore, the focus detection point for which the highest likelihood is calculated is set as the second estimated gazing point position. Also, when the highest likelihood value is equal to or greater than the threshold, it is determined that the reliability of the second estimated point-of-regard position is high.

ステップＳ１００７において、ＣＰＵ３（第２のフォーカス手段）は、第２の推定注視点位置がユーザの望むフォーカス位置であると判断し、第２の推定注視点位置に基づいてフォーカスを行う。この処理はＣＰＵ３の指示により、自動焦点検出回路２０３と焦点調節回路１１８を動作させることで実現される。その後、本処理を終了する。 In step S1007, CPU 3 (second focus means) determines that the second estimated position of the point of interest is the focus position desired by the user, and performs focus based on the second estimated position of the point of interest. This processing is realized by operating the automatic focus detection circuit 203 and the focus adjustment circuit 118 according to instructions from the CPU 3 . After that, this process is terminated.

ステップＳ１００８において、ＣＰＵ３は、第１の推定注視点位置と第２の推定注視点位置のどちらの信頼度が高いかを判断する。第１の推定注視点位置の信頼度の方が高いと判断した場合はステップＳ１００９へと進む一方、第２の推定注視点位置の信頼度の方がが高いと判断した場合はステップＳ１０１０へと進む。 In step S1008, the CPU 3 determines which of the first estimated gazing point position and the second estimated gazing point position is more reliable. If it is determined that the reliability of the first estimated point-of-regard position is higher, the process proceeds to step S1009. If it is determined that the reliability of the second estimated point-of-regard position is higher, the process proceeds to step S1010. move on.

In step S1009, the CPU 3 determines that the focus position desired by the user is near the first estimated gaze point position, performs subject detection near the first estimated gaze point position, and focuses with the detected subject as the focus point. This processing then ends.

In step S1010, the CPU 3 determines that the focus position desired by the user is near the second estimated gaze point position, performs subject detection near the second estimated gaze point position, and focuses with the detected subject as the focus point. This processing then ends.
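The decision flow of FIG. 10 from step S903 onward can be sketched as follows. The names are hypothetical, and two points are assumptions not fixed by the source: the reliability of the first estimate and the best likelihood of the reasoner are taken to be comparable on a common numeric scale, and a tie in step S1008 is sent to step S1010.

```python
from typing import Sequence, Tuple


def second_estimate(likelihoods: Sequence[float]) -> Tuple[int, float]:
    """Pick the focus detection point with the highest likelihood (S1005/S1006)."""
    best = max(range(len(likelihoods)), key=lambda i: likelihoods[i])
    return best, likelihoods[best]


def choose_focus(first_reliability: float, first_is_high: bool,
                 likelihoods: Sequence[float], threshold: float) -> Tuple[str, str]:
    """Return (which estimate to use, how to use it) for one shot."""
    if first_is_high:
        return ("first", "direct")            # S903 -> S1004
    _, best = second_estimate(likelihoods)
    if best >= threshold:
        return ("second", "direct")           # S1006 -> S1007
    if first_reliability > best:
        return ("first", "nearby_subject")    # S1008 -> S1009
    return ("second", "nearby_subject")       # S1008 -> S1010 (ties land here, assumption)
```

In the embodiment `likelihoods` would hold one value per focus detection point, 180 in total; "direct" means focusing at the estimated position itself and "nearby_subject" means focusing on a subject detected near it.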

As machine learning of the CNN on the PC on the server progresses, the reliability of the second estimated gaze point position estimated by the reasoner may become equal to or higher than that of the first estimated gaze point position even when the latter is high. In that case, focus processing during shooting may always be performed with the reasoner. Specifically, steps S902, S903, S1004, S1008, and S1009 in FIG. 10 become unnecessary: if YES in step S901, the processing proceeds directly to step S1005, and if NO in step S1006, it proceeds directly to step S1010. The focus position is then determined from the second estimated gaze point position alone.

本実施例では、撮影時のファインダ１０を覗くユーザの目の位置が、キャリブレーション時よりも接眼窓６から離れた場合を説明したが、これに限るものではない。すなわち、視線検出信頼度判別回路３１により出力される信頼度が下がる種々の条件でフォーカス処理が行う場合において、本実施例は適用することが可能である。 In this embodiment, a case has been described in which the user's eyes looking through the viewfinder 10 at the time of photographing are positioned farther from the eyepiece window 6 than at the time of calibration, but the present invention is not limited to this. In other words, this embodiment can be applied when focus processing is performed under various conditions in which the reliability output by the line-of-sight detection reliability determination circuit 31 is lowered.

以上説明したように本実施例においては、学習時に撮像装置１の情報を用いて正解位置を定義することで、視線検出の精度が落ちているときにおいても有意な学習を行うことができる。その上で第１及び第２の推定注視点位置の信頼度に応じて、視線検出回路２０１と推論器を切り替えることにより、キャリブレーション時と差異があるような撮影状況においても、視線検出の精度を向上することができる。 As described above, in this embodiment, by defining the correct position using the information of the imaging device 1 at the time of learning, significant learning can be performed even when the accuracy of line-of-sight detection is degraded. Then, by switching the line-of-sight detection circuit 201 and the inference device according to the reliability of the first and second estimated gaze point positions, the accuracy of line-of-sight detection can be improved even in shooting situations where there is a difference from the time of calibration. can be improved.

(Example 2)
Hereinafter, with reference to FIGS. 11 and 12, a method according to the second embodiment of the present invention for collecting optimal input data for learning when the reliability of line-of-sight detection has dropped under various conditions will be described. In this embodiment, the same components as in the first embodiment are denoted by the same reference numerals, and redundant description is omitted.

As described in the first embodiment, the drop in the reliability of the first estimated gaze point position is most pronounced when the eye information differs between calibration and shooting. However, several causes are possible, such as a difference in the distance between the eyepiece lens 12 and the eye of the user looking through the finder 10, or a difference in how external light enters. Therefore, in this embodiment, the CPU 3 (reliability reduction factor determination means) determines which of these factors lowered the reliability and collects, for each factor, the input data best suited to training the neural network.

FIG. 11 is a flowchart of the processing for determining the factor that lowered the reliability of the first estimated gaze point position according to this embodiment. This processing is executed by the CPU 3 reading out a program recorded in a ROM that is not shown in FIG. 3.

In FIG. 11, when shooting is performed by the imaging apparatus 1, the CPU 3 first performs the processing of steps S901 to S903. These steps have already been described with reference to FIG. 9, so redundant description is omitted.

In step S903, if the reliability of the first estimated gaze point position is low, the CPU 3 proceeds to step S1101; if the reliability is high, the CPU 3 ends this processing.

In step S1101, the CPU 3 determines whether the size of the eye differs from that at the time of calibration, that is, whether the reliability has dropped because the distance between the eyepiece lens 12 and the eye of the user looking through the finder 10 has changed. The distance between the eyepiece lens 12 and the user's eye can be calculated from the current eye size relative to the eye size at the time of calibration. For example, when the pupil is detected by the line-of-sight detection method shown in FIGS. 5 to 7, the calculated pupil diameter is used as the eye size, and from it one can estimate how far the distance between the eyepiece lens 12 and the user's eye has changed from the time of calibration. The method is not limited to this as long as it can determine whether the eye size differs from that at the time of calibration; for example, an optical sensor provided around the eyepiece lens 12 may be used to calculate the distance between the eyepiece lens 12 and the user's eye at the time of calibration and at the time of shooting. If the eye size is determined to be the same as at the time of calibration, the process proceeds to step S1103; if it is determined to be different, the process proceeds to step S1102.
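The relation used in step S1101 can be illustrated with a minimal sketch. It assumes a simple pinhole-camera model in which the apparent pupil diameter in the eye image is inversely proportional to the distance between the eye and the eyepiece lens; this model, the function names, and the tolerance value are assumptions made for illustration and are not specified in the embodiment.

```python
def distance_ratio(pupil_px_calibration: float, pupil_px_now: float) -> float:
    """Estimate (current eye distance) / (eye distance at calibration).

    Assumption: apparent pupil diameter is inversely proportional to the
    eye-to-eyepiece distance (pinhole model), so a smaller pupil image
    means the eye has moved farther away.
    """
    if pupil_px_calibration <= 0 or pupil_px_now <= 0:
        raise ValueError("pupil diameters must be positive")
    return pupil_px_calibration / pupil_px_now


def eye_size_differs(pupil_px_calibration: float, pupil_px_now: float,
                     tolerance: float = 0.1) -> bool:
    """Hypothetical S1101 decision: True if the eye size differs from calibration.

    `tolerance` is an assumed threshold on the relative change.
    """
    ratio = distance_ratio(pupil_px_calibration, pupil_px_now)
    return abs(ratio - 1.0) > tolerance
```

Under these assumptions, a pupil that measured 40 pixels at calibration and 32 pixels at shooting gives a ratio of 1.25, meaning the eye is estimated to be about 25% farther from the eyepiece lens than at calibration.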

In step S1102, the CPU 3 stores in the memory unit 4, as a reliability reduction factor, information indicating that the distance between the finder 10 and the photographer (the user's eye) differs. The process then proceeds to step S1103.

In step S1103, the CPU 3 determines whether the luminance is equal to or higher than a predetermined value. This can be determined by checking the luminance of the eye image data, acquired by the eye imaging element 17, of the user looking through the finder 10. If the luminance is equal to or higher than the predetermined value, the CPU 3 determines that external light has entered and proceeds to step S1104. If the luminance is lower than the predetermined value, the CPU 3 determines that no external light has entered and ends this processing.

In step S1104, the CPU 3 stores in the memory unit 4, as a reliability reduction factor, information indicating that external light has entered. The CPU 3 also generates, as input data, the difference eye image data of FIG. 12(c), which is described later. This processing then ends.
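The branch structure of steps S903 and S1101 to S1104 can be summarized in the following sketch. The enum, the function name, the use of a mean luminance value, and the default threshold are hypothetical choices for illustration; the embodiment only states that the luminance is compared with a predetermined value.

```python
from enum import Enum
from typing import List


class ReliabilityFactor(Enum):
    """Hypothetical labels for the factors stored in the memory unit 4."""
    EYE_DISTANCE_DIFFERS = "eye_distance_differs"  # stored in S1102
    EXTERNAL_LIGHT = "external_light"              # stored in S1104


def determine_reliability_factors(reliability_is_low: bool,
                                  eye_size_differs: bool,
                                  mean_luminance: float,
                                  luminance_threshold: float = 200.0
                                  ) -> List[ReliabilityFactor]:
    """Return the reliability reduction factors found by the FIG. 11 flow."""
    factors: List[ReliabilityFactor] = []
    # S903: if the reliability is high, the processing ends with no factor.
    if not reliability_is_low:
        return factors
    # S1101 -> S1102: eye size differs from the time of calibration.
    if eye_size_differs:
        factors.append(ReliabilityFactor.EYE_DISTANCE_DIFFERS)
    # S1103 -> S1104: luminance at or above the threshold means external light.
    if mean_luminance >= luminance_threshold:
        factors.append(ReliabilityFactor.EXTERNAL_LIGHT)
    return factors
```

Because step S1102 continues to step S1103, both factors can be recorded for a single shot, which is why the sketch returns a list rather than a single value.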

In general, improving the accuracy of inference by a reasoner requires learning with a large amount of input data. For this reason, when input data is scarce, a technique called data augmentation, which increases the amount of data by applying transformations to the original input data, is often used with CNNs. Augmentation methods include adding noise, enlarging or reducing the image, masking part of the image, and flipping the image, but depending on the method, augmentation can degrade the accuracy of inference by the reasoner. This is because the augmented training data may be of poor quality, for example data that could not occur in practice or data from which the reasoner cannot infer. In this embodiment, storing the factor that lowered the reliability in accordance with the processing of FIG. 11 makes it possible to perform augmentation appropriate to the original input data that was acquired.

The CPU 3 uses the communication circuit 32 to transmit the eye image data collected in the processing of FIG. 9, the corresponding correct position (correct answer data), and the reliability reduction factor acquired in the processing of FIG. 11 to the PC on the server via a network such as a LAN or the Internet. The PC on the server performs machine learning of the CNN using these data and transmits the "reasoner" generated as the learning result to the imaging apparatus 1.

When the PC on the server has received, as the reliability reduction factor, information that the distance between the finder 10 and the photographer differs, and the eye image data captured at the time of shooting (the original input data) is transmitted from the imaging apparatus 1, it uses enlarged or reduced versions of that data as augmented data for learning. More specifically, the PC on the server also acquires the eye image data from the time of calibration from the imaging apparatus 1 and detects the pupil diameter from the eye image data at the time of calibration and at the time of shooting, respectively. It then creates the augmented data by enlarging or reducing the original input data so that the pupil diameter detected from the augmented data falls within the range between these two detected pupil diameters. Creating the augmented data in this way allows optimal augmentation while suppressing the generation of poor-quality augmented data. It also avoids generating augmented data that could not occur in practice, such as data in which part of the user's pupil image is masked.
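The constraint on enlargement and reduction can be expressed as a choice of scale factors. The following sketch computes only the scale factors, not the image resampling itself; the function name, the even spacing, and the default count are assumptions for illustration and are not taken from the embodiment.

```python
from typing import List


def augmentation_scales(pupil_px_calibration: float,
                        pupil_px_shot: float,
                        count: int = 5) -> List[float]:
    """Return scale factors for the image captured at the time of shooting.

    Each factor, applied to the original input image, yields a pupil
    diameter between the diameter at calibration and the diameter at
    shooting (inclusive). Even spacing is an assumed design choice.
    """
    if pupil_px_calibration <= 0 or pupil_px_shot <= 0:
        raise ValueError("pupil diameters must be positive")
    if count < 2:
        raise ValueError("count must be at least 2")
    lo = min(pupil_px_calibration, pupil_px_shot)
    hi = max(pupil_px_calibration, pupil_px_shot)
    scales = []
    for i in range(count):
        target_diameter = lo + (hi - lo) * i / (count - 1)
        scales.append(target_diameter / pupil_px_shot)
    return scales
```

For example, with a 40-pixel pupil at calibration and a 20-pixel pupil at shooting, three samples give target diameters of 20, 30, and 40 pixels, so the original image would be scaled by 1.0, 1.5, and 2.0.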

On the other hand, when the reliability has dropped because external light has entered, either the upper part or the lower part of the eye image captured by the eye imaging element 17 may be blown out to white, depending on the external light conditions. The former occurs, for example, when sunlight enters during daytime outdoor shooting, and the latter results from reflected sunlight, such as from snow at a ski resort. In addition, depending on the shooting orientation of the imaging apparatus 1, sunlight may enter from the side of the eyepiece lens 12. In other words, the part of the eye image in which blown-out highlights occur varies widely with the external light conditions.

Even if eye image data with such blown-out highlights is used as input data for learning, the accuracy of inference by the reasoner is slow to improve when the external light conditions differ. Therefore, in such a case, this embodiment collects as input data the difference eye image data generated by the method shown in FIG. 12.

The eye image data of FIG. 12(a) is eye image data output from the eye imaging element 17, with external light 1200 entering its upper part. At the time of this image capture, the light sources 13a and 13b are lit, and corneal reflection images 1201a to 1201c are formed on the user's cornea (eyeball).

The eye image data of FIG. 12(b) is eye image data output from the eye imaging element 17 and, as in FIG. 12(a), external light 1200 enters its upper part. At the time of this image capture, however, the light sources 13a and 13b are off, and the corneal reflection images 1201a to 1201c are not formed on the user's cornea (eyeball). Because the light sources 13a and 13b are off, the eye image shown in FIG. 12(b) is captured slightly darker overall than the eye image shown in FIG. 12(a).

The difference eye image data of FIG. 12(c) is difference data obtained by subtracting FIG. 12(b) from FIG. 12(a), from which the external light 1200 that had entered the upper part of the eye image has been removed.
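The subtraction of FIG. 12(b) from FIG. 12(a) can be sketched as a per-pixel difference. The representation of an image as a list of rows of 8-bit integer values and the clamping of negative results to zero are assumptions for illustration; the embodiment does not specify the data format or how negative differences are handled.

```python
from typing import List

Image = List[List[int]]  # assumed format: rows of 8-bit luminance values


def difference_eye_image(lit: Image, unlit: Image) -> Image:
    """Subtract the light-source-off image from the light-source-on image.

    External light that appears in both frames largely cancels, while the
    corneal reflection images, which appear only in the lit frame, remain.
    Negative differences are clamped to 0 (an assumed choice).
    """
    if len(lit) != len(unlit) or any(
            len(row_a) != len(row_b) for row_a, row_b in zip(lit, unlit)):
        raise ValueError("images must have the same dimensions")
    return [
        [max(a - b, 0) for a, b in zip(row_lit, row_unlit)]
        for row_lit, row_unlit in zip(lit, unlit)
    ]
```

In this sketch a pixel dominated by external light (for example 255 lit and 250 unlit) drops to a small value, whereas a pixel inside a corneal reflection (for example 200 lit and 40 unlit) keeps most of its brightness.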

Thus, when the reliability reduction factor is information indicating that external light has entered, the PC on the server receives as input data not eye image data in which corneal reflection images are formed, as in FIG. 12(a), but difference eye image data as in FIG. 12(c). This enables learning in which the external light conditions are separated out to some extent. However, because the difference eye image data of FIG. 12(c) is obtained by taking a difference against strong external light, the edges around the pupil are blurred, which degrades the accuracy of line-of-sight detection. By performing machine learning of the CNN using the difference eye image data and augmented data created from it, the PC on the server can create a reasoner that is robust to edge blur around the pupil, so that inference remains possible even when the external light conditions change. In this case, the PC on the server creates data in which part of the pupil boundary of the difference eye image data transmitted from the imaging apparatus 1 is masked, and data in which the noise around the pupil is increased, and uses these as augmented data for learning.
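The two augmentations named for the external-light case, masking part of the pupil boundary and increasing the noise around the pupil, can be sketched as follows. The circular pupil model, the ring width, the angular-sector mask, the Gaussian noise, and all function and parameter names are assumptions for illustration; the embodiment does not specify how the mask or the noise is generated.

```python
import math
import random
from typing import List

Image = List[List[int]]  # assumed format: rows of 8-bit luminance values


def _on_pupil_boundary(x: int, y: int, cx: float, cy: float,
                       radius: float, width: float) -> bool:
    """True if pixel (x, y) lies within `width` of the assumed pupil circle."""
    return abs(math.hypot(x - cx, y - cy) - radius) <= width


def mask_pupil_boundary(image: Image, cx: float, cy: float, radius: float,
                        width: float, start_deg: float, end_deg: float,
                        fill: int = 0) -> Image:
    """Return a copy with one angular sector of the pupil boundary masked."""
    out = [row[:] for row in image]
    for y, row in enumerate(out):
        for x in range(len(row)):
            if not _on_pupil_boundary(x, y, cx, cy, radius, width):
                continue
            angle = math.degrees(math.atan2(y - cy, x - cx)) % 360.0
            if start_deg <= angle <= end_deg:
                row[x] = fill
    return out


def add_noise_around_pupil(image: Image, cx: float, cy: float, radius: float,
                           width: float, sigma: float,
                           rng: random.Random) -> Image:
    """Return a copy with Gaussian noise added on the pupil boundary ring."""
    out = [row[:] for row in image]
    for y, row in enumerate(out):
        for x in range(len(row)):
            if _on_pupil_boundary(x, y, cx, cy, radius, width):
                noisy = row[x] + rng.gauss(0.0, sigma)
                row[x] = min(255, max(0, round(noisy)))
    return out
```

Both functions leave the input image untouched and modify only pixels near the pupil boundary, which matches the stated aim of training the reasoner to tolerate degraded pupil edges rather than arbitrary image corruption.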

As described above, in this embodiment, when the reliability of the first estimated gaze point position has dropped during the collection of input data for learning, the CPU 3 transmits a flag indicating the factor that lowered the reliability to the PC on the server together with the input data. This allows the PC on the server to create appropriate augmented data from the input data and to improve the accuracy of the reasoner with a small number of learning iterations.

Although preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications and changes are possible within the scope of its gist.

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

REFERENCE SIGNS LIST
1 imaging apparatus
3 CPU
10 finder
13a-13b light source
17 eye imaging element
31 line-of-sight detection reliability determination circuit
41-43 operating member
118 focus adjustment circuit
201 line-of-sight detection circuit
203 automatic focus detection circuit

Claims

1. An imaging apparatus that displays a through image on an internal finder, comprising:
generation means for capturing an image of an eyeball of a user looking through the finder to generate eye image data;
line-of-sight detection means for acquiring the eye image data and detecting, based on the acquired eye image data, a position of the user's line of sight fixed on the through image on the finder;
display control means for displaying the detected line-of-sight position on the finder so that it can be moved to another position by a first user operation; and
collection means for collecting the other position as a correct position when there is a second user operation that determines the other position as a focus position,
wherein the correct position is used for learning to create a reasoner that estimates the line-of-sight position using the eye image data acquired by the line-of-sight detection means as input data.

2. The imaging apparatus according to claim 1, wherein, when a subject is present near the detected line-of-sight position, the collection means collects the position of the subject as the correct position.

3. The imaging apparatus according to claim 1, further comprising:
reliability determination means for determining a reliability of the detected line-of-sight position;
first focusing means for focusing the imaging apparatus using the detected line-of-sight position as the focus position when the reliability is high; and
second focusing means for focusing the imaging apparatus using the line-of-sight position estimated by the reasoner as the focus position when the reliability is low.

4. The imaging apparatus according to claim 3, further comprising calibration means for acquiring the eye image data before the line-of-sight position is detected by the line-of-sight detection means and correcting for individual differences between eyeballs based on the acquired eye image data,
wherein the reliability determination means determines the reliability based on a difference between two pieces of eye image data, namely the eye image data acquired by the calibration means and the eye image data acquired by the line-of-sight detection means.

5. The imaging apparatus according to claim 4, wherein the difference is a difference in pupil diameter detected from each of the two eye image data.

6. The imaging apparatus according to claim 5, wherein the difference is a difference in external light detected from each of the two eye image data.

7. The imaging apparatus according to claim 6, further comprising a light source for illuminating the eye of the user,
wherein the difference is a difference in the number of corneal reflection images, formed on the eyeball of the user by lighting of the light source, detected from each of the two eye image data.

8. The imaging apparatus according to claim 7, wherein, when the reliability determination means determines that the reliability has dropped due to the difference in the external light, the learning is performed using, as the input data in place of the eye image data acquired by the line-of-sight detection means, difference data between eye image data of the eyeball of the user on which the corneal reflection images are formed with the light source lit and eye image data of the eyeball of the user on which the corneal reflection images are not formed with the light source off.

9. The imaging apparatus according to claim 8, further comprising reliability reduction factor determination means for determining a factor that lowered the reliability when the reliability determination means determines that the reliability is low,
wherein augmented data corresponding to the reliability reduction factor is created at the time of the learning.

10. The imaging apparatus according to claim 9, wherein, when the reliability reduction factor is that the distance between the finder and the user differs, the pupil diameter is detected from each of the two eye image data, and the augmented data is created by enlarging or reducing the input data so that the pupil diameter detected from the augmented data falls within the range of the detected pupil diameters.

11. The imaging apparatus according to claim 9, wherein, when the reliability reduction factor is the entry of external light, the augmented data is created by masking part of the pupil boundary of the difference data serving as the input data, or by increasing the noise around the pupil.

12. A control method for an imaging apparatus that displays a through image on an internal finder, comprising:
a generation step of capturing an image of an eyeball of a user looking through the finder to generate eye image data;
a line-of-sight detection step of acquiring the eye image data and detecting, based on the acquired eye image data, a position of the user's line of sight fixed on the through image on the finder;
a display control step of displaying the detected line-of-sight position on the finder so that it can be moved to another position by a first user operation; and
a collection step of collecting the other position as a correct position when there is a second user operation that determines the other position as a focus position,
wherein the correct position is used for learning to create a reasoner that estimates the line-of-sight position using the eye image data acquired in the line-of-sight detection step as input data.

13. A computer-executable program that causes a computer to function as each means of the imaging apparatus according to any one of claims 1 to 11.