JP2021043368A

JP2021043368A - Electronic apparatus and control method thereof

Info

Publication number: JP2021043368A
Application number: JP2019166234A
Authority: JP
Inventors: 江幡　裕也; Hironari Ehata; 裕也江幡
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-09-12
Filing date: 2019-09-12
Publication date: 2021-03-18
Anticipated expiration: 2039-09-12
Also published as: JP7358130B2

Abstract

To provide an electronic apparatus capable of accurately identifying which frame the user was looking at when a line-of-sight detection result was obtained. SOLUTION: The electronic apparatus includes: display means that displays images; imaging means that captures an image of an eye looking at a display image displayed on the display means; detection means that detects a line of sight using the eye image captured by the imaging means; and identification means that identifies a line-of-sight detection frame, which is the display frame corresponding to the line-of-sight detection result of the detection means. The identification means identifies the line-of-sight detection frame based on drive mode information of the imaging means and drive mode information of the display means. SELECTED DRAWING: Figure 11

Description

The present invention relates to an electronic device, and more particularly to an electronic device that detects a line of sight.

In recent years, cameras have become increasingly automated and intelligent. Patent Document 1 proposes a technique that recognizes the subject intended by the photographer based on the line-of-sight position of the photographer looking into the finder, and performs focus control without requiring the subject position to be entered manually. Patent Document 2 describes a technique that records an image together with line-of-sight information by associating the time information of the recorded image with the time information of the line-of-sight detection.

Patent Document 1: Japanese Unexamined Patent Publication No. 2004-8323
Patent Document 2: Japanese Unexamined Patent Publication No. 2005-252732

With the prior art, it is difficult to identify the display image that was used for line-of-sight detection. In other words, it is difficult to identify which frame image the user was looking at when a given line-of-sight detection result was obtained.

Patent Document 1 does not disclose identifying the display image used for line-of-sight detection in the first place. Patent Document 2 associates an image with a line-of-sight detection result using time information, but depending on the synchronization relationship between image display and eyeball imaging, the display image shown at the line-of-sight position when the eyeball is imaged may be an earlier or a later frame. That is, Patent Document 2 may fail to identify the display image used for line-of-sight detection.

For example, when a line-of-sight detection result is used for subject tracking or focus control, a subject the user did not intend may be selected as the tracking or focusing target unless the frame image to which the detection result applies has been identified. In particular, when the subject moves quickly, the accuracy of subject tracking and focus detection may decrease.

The present invention has been made in view of the above problem, and its object is to provide an electronic device capable of accurately identifying which frame the user was looking at when a line-of-sight detection result was obtained.

A first aspect of the present invention is an electronic device comprising:
display means for displaying an image;
imaging means for imaging an eye viewing a display image displayed on the display means;
detection means for detecting a line of sight using an eye image captured by the imaging means; and
identification means for identifying a line-of-sight detection frame, which is a display frame corresponding to a line-of-sight detection result of the detection means,
wherein the identification means identifies the line-of-sight detection frame based on drive mode information of the imaging means and drive mode information of the display means.

A second aspect of the present invention is a control method for an electronic device, comprising:
a display step of displaying an image on display means;
an imaging step of imaging, with imaging means, an eye viewing a display image displayed on the display means;
a detection step of detecting a line of sight using an eye image captured in the imaging step; and
an identification step of identifying a line-of-sight detection frame, which is a display frame corresponding to a line-of-sight detection result of the detection step,
wherein, in the identification step, the line-of-sight detection frame is identified based on drive mode information of the imaging means and drive mode information of the display means.

According to the present invention, it is possible to accurately identify which frame the user was looking at when a line-of-sight detection result was obtained.

External view of the camera according to the embodiment.
Cross-sectional view of the camera according to the embodiment.
Diagram showing how a virtual image is formed according to the embodiment.
Block diagram of the camera according to the embodiment.
Diagram showing the field of view in the finder according to the embodiment.
Diagram for explaining the principle of the line-of-sight detection method according to the embodiment.
Diagram showing an eye image according to the embodiment.
Flowchart of the line-of-sight detection operation according to the embodiment.
Diagram of the relationship between display frames and eyeball imaging frames in the embodiment.
Diagram of the relationship between display frames and eyeball imaging frames in the embodiment.
Timing chart up to the output of the line-of-sight detection result in the embodiment.
Flowchart of the line-of-sight detection frame identification process in the embodiment.
Diagram of the relationship between display frames and line-of-sight detection positions in the embodiment.
Flowchart of the gaze point focusing process in the embodiment.
Flowchart of the tracking correction process in the embodiment.

Hereinafter, preferred embodiments of the present invention will be described with reference to the accompanying drawings. The present invention can be applied to any electronic device; an imaging apparatus (camera) is described below as an example.

<Explanation of configuration>
FIGS. 1(A) and 1(B) show the appearance of the camera 1 (a digital still camera; an interchangeable-lens camera) according to the present embodiment. FIG. 1(A) is a front perspective view, and FIG. 1(B) is a rear perspective view. As shown in FIG. 1(A), the camera 1 has a photographing lens unit 1A and a camera housing 1B. A release button 5, an operation member that accepts imaging operations from the user (photographer), is arranged on the camera housing 1B. As shown in FIG. 1(B), an eyepiece window frame 121 and an eyepiece lens 12 (eyepiece optical system), through which the user looks into the display device 10 (display panel, described later) contained in the camera housing 1B, are arranged on the back of the camera housing 1B. The eyepiece window frame 121 surrounds the eyepiece lens 12 and projects outward (toward the rear) from the camera housing 1B relative to the eyepiece lens 12. The eyepiece optical system may include a plurality of lenses. Operation members 41 to 43, which accept various operations from the user, are also arranged on the back of the camera housing 1B. For example, the operation member 41 is a touch panel that accepts touch operations, the operation member 42 is an operation lever that can be tilted in each direction, and the operation member 43 is a four-direction key that can be pressed in each of four directions. The operation member 41 (touch panel) includes a display panel such as a liquid crystal panel and has a function of displaying images on that panel.

FIG. 2 is a cross-sectional view of the camera 1 cut along the YZ plane formed by the Y axis and the Z axis shown in FIG. 1(A), and shows the rough internal configuration of the camera 1.

The photographing lens unit 1A contains two lenses 101 and 102, an aperture 111, an aperture drive unit 112, a lens drive motor 113, a lens drive member 114, a photocoupler 115, a pulse plate 116, a mount contact 117, a focus adjustment circuit 118, and the like. The lens drive member 114 consists of a drive gear and the like, and the photocoupler 115 detects the rotation of the pulse plate 116, which is linked to the lens drive member 114, and reports it to the focus adjustment circuit 118. The focus adjustment circuit 118 drives the lens drive motor 113 based on the information from the photocoupler 115 and the information from the camera housing 1B (lens drive amount information), and moves the lens 101 to change the in-focus position. The mount contact 117 is the interface between the photographing lens unit 1A and the camera housing 1B. Although two lenses 101 and 102 are shown for simplicity, the photographing lens unit 1A actually contains more than two lenses.

The camera housing 1B contains an image sensor 2, a CPU 3, a memory unit 4, a display device 10, a display device drive circuit 11, and the like. The image sensor 2 is arranged on the planned image plane of the photographing lens unit 1A. The CPU 3 is the central processing unit of a microcomputer and controls the entire camera 1. The memory unit 4 stores images captured by the image sensor 2 and the like. The display device 10 is composed of a liquid crystal panel or the like and displays captured images (subject images) and the like. The display device drive circuit 11 drives the display device 10.

The display device 10 corresponds to the display means, and the image sensor 2 corresponds to the second imaging means.

The user can see the image (viewing image) displayed on the display device 10 through the eyepiece window frame 121 and the eyepiece lens 12. Specifically, as shown in FIG. 3, the eyepiece lens 12 forms a magnified virtual image 300 of the display device 10 at a position about 50 cm to 2 m away from the eyepiece lens 12. In FIG. 3, the virtual image 300 is formed at a position 1 m away from the eyepiece lens 12. The user views this virtual image 300 by looking into the eyepiece window frame 121.

The camera housing 1B also contains light sources 13a and 13b, a light splitter 15, a light receiving lens 16, an eyeball image sensor 17, and the like. The light sources 13a and 13b illuminate the user's eyeball 14; light sources of this kind have long been used in single-lens reflex cameras and the like to detect the line-of-sight direction from the relationship between the pupil and the reflected image produced by corneal reflection of light (corneal reflection image). Specifically, the light sources 13a and 13b are infrared light emitting diodes or the like that emit infrared light imperceptible to the user, and they are arranged around the eyepiece lens 12. The optical image of the illuminated eyeball 14 (eyeball image; the image formed by light emitted from the light sources 13a and 13b and reflected by the eyeball 14) passes through the eyepiece lens 12 and is reflected by the light splitter 15. The eyeball image is then formed by the light receiving lens 16 on the eyeball image sensor 17, in which photoelectric elements such as a CCD are arranged two-dimensionally. The light receiving lens 16 places the pupil of the eyeball 14 and the eyeball image sensor 17 in a conjugate imaging relationship. A predetermined algorithm, described later, detects the line-of-sight direction (the viewpoint in the viewing image) from the position of the corneal reflection images in the eyeball image formed on the eyeball image sensor 17.

FIG. 4 is a block diagram showing the electrical configuration of the camera 1. A line-of-sight detection circuit 201, a photometric circuit 202, an automatic focus detection circuit 203, a signal input circuit 204, the display device drive circuit 11, a light source drive circuit 205, and the like are connected to the CPU 3. The CPU 3 also sends signals, via the mount contact 117, to the focus adjustment circuit 118 arranged in the photographing lens unit 1A and to an aperture control circuit 206 included in the aperture drive unit 112 in the photographing lens unit 1A. The memory unit 4 attached to the CPU 3 has a function of storing imaging signals from the image sensor 2 and the eyeball image sensor 17, and a function of storing line-of-sight correction parameters, described later, that correct for individual differences in the line of sight.

The line-of-sight detection circuit 201 corresponds to the detection means, the eyeball image sensor 17 corresponds to the imaging means, and the CPU 3 corresponds to the identification means.

The line-of-sight detection circuit 201 A/D-converts the output of the eyeball image sensor 17 (CCD-EYE) obtained while the eyeball image is formed on it (that is, the eye image obtained by imaging the eye) and sends the result to the CPU 3. The CPU 3 extracts the feature points required for line-of-sight detection from the eye image according to a predetermined algorithm described later, and calculates the user's line of sight (the viewpoint in the viewing image) from the positions of the feature points.

The photometric circuit 202 amplifies, logarithmically compresses, and A/D-converts the signal obtained from the image sensor 2, which also serves as the photometric sensor (specifically, a luminance signal corresponding to the brightness of the subject field), and sends the result to the CPU 3 as subject field luminance information.

The automatic focus detection circuit 203 A/D-converts the signal voltages from a plurality of detection elements (a plurality of pixels) that are included in the CCD of the image sensor 2 and used for phase difference detection, and sends them to the CPU 3. From the signals of the detection elements, the CPU 3 calculates the distance to the subject corresponding to each focus detection point. This is a known technique called imaging-plane phase-difference AF. In the present embodiment, as an example, a focus detection point is assumed to exist at each of the 180 locations on the imaging plane that correspond to the 180 locations shown in the finder field-of-view image (viewing image) of FIG. 5(A).

Connected to the signal input circuit 204 are a switch SW1, which is turned ON by the first stroke of the release button 5 to start photometry, distance measurement, line-of-sight detection, and other operations of the camera 1, and a switch SW2, which is turned ON by the second stroke of the release button 5 to start the shooting operation. The ON signals from the switches SW1 and SW2 are input to the signal input circuit 204 and sent to the CPU 3.

The tracking circuit 207 tracks a subject in the input image and sends information on a tracking frame representing the subject position to the CPU 3. The tracking process is performed, for example, by computing the similarity between two images using SAD (Sum of Absolute Differences). The tracking circuit 207 may also use a tracking process other than SAD.
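As a rough illustration of SAD-based tracking, the following minimal Python sketch searches a frame for the window most similar to a template. The function names (`sad`, `crop`, `track_by_sad`), the exhaustive search, and the toy image are assumptions made for this example; the source does not describe how the tracking circuit 207 is actually implemented.

```python
def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(a - b)
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))


def crop(image, top, left, height, width):
    """Return the height x width window of `image` starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def track_by_sad(frame, template):
    """Exhaustively search `frame` and return (score, top, left) of the
    window with the smallest SAD against `template` (lower = more similar)."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best = None
    for top in range(fh - th + 1):
        for left in range(fw - tw + 1):
            score = sad(crop(frame, top, left, th, tw), template)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best


# Toy data (hypothetical): a 3x3 subject placed at row 4, column 6
# of an otherwise dark 8x10 frame.
template = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
frame = [[0] * 10 for _ in range(8)]
for dy, row in enumerate(template):
    for dx, value in enumerate(row):
        frame[4 + dy][6 + dx] = value

result = track_by_sad(frame, template)  # (score, top, left)
```

A SAD of zero means an exact match; in practice the minimum is taken as the new subject position and the tracking frame is moved there.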

The recognition circuit 208 recognizes subjects in the input image; for example, it detects human faces and animals.

The operation members 41 to 43 send their operation signals to the CPU 3. The CPU 3 controls movement of the estimated gaze point frame position according to the operation signals received from the operation members 41 to 43.

FIG. 5(A) shows the field of view in the finder with the display device 10 operating (displaying the viewing image). As shown in FIG. 5(A), the finder field of view contains a focus detection region 500, 180 AF point indexes 501, a field-of-view mask 502, and the like. Each of the 180 AF point indexes 501 is superimposed on the through image (live view image) displayed on the display device 10 so that it appears at the position corresponding to a focus detection point on the imaging plane. Of the 180 AF point indexes 501, the one corresponding to the current estimated gaze point A (estimated position) is highlighted with a frame or the like.
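One way to pick the AF point index to highlight is a nearest-neighbour lookup from the estimated gaze point. The sketch below is only illustrative: the 18 x 10 grid layout, the 1800 x 1000 coordinate space, and the function names are assumptions, since the source states only that there are 180 AF point indexes.

```python
def make_af_grid(cols, rows, width, height):
    """Centres of cols * rows AF points spread evenly over a width x height
    view, listed row by row (hypothetical layout)."""
    return [((c + 0.5) * width / cols, (r + 0.5) * height / rows)
            for r in range(rows)
            for c in range(cols)]


def nearest_af_point(points, gaze):
    """Index of the AF point closest to the estimated gaze point (x, y)."""
    gx, gy = gaze
    return min(range(len(points)),
               key=lambda i: (points[i][0] - gx) ** 2 + (points[i][1] - gy) ** 2)


# Assumed 18 x 10 grid (180 points) over an assumed 1800 x 1000 view.
points = make_af_grid(18, 10, 1800.0, 1000.0)
idx = nearest_af_point(points, (960.0, 540.0))
highlighted = points[idx]
```

With these assumed dimensions the AF point centres fall every 100 units, so a gaze point at (960, 540) selects the point centred at (950, 550).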

<Explanation of line-of-sight detection operation>
The line-of-sight detection method will be described with reference to FIGS. 6, 7, and 8. FIG. 6 is a diagram for explaining the principle of the line-of-sight detection method and is a schematic view of the optical system used for line-of-sight detection. As shown in FIG. 6, the light sources 13a and 13b are arranged substantially symmetrically about the optical axis of the light receiving lens 16 and illuminate the user's eyeball 14. Part of the light emitted from the light sources 13a and 13b and reflected by the eyeball 14 is focused on the eyeball image sensor 17 by the light receiving lens 16. FIG. 7 shows a schematic view 701 of the eye image captured by the eyeball image sensor 17 (the eyeball image projected on the eyeball image sensor 17) and a distribution 702 of the CCD output intensity of the eyeball image sensor 17. FIG. 8 is a schematic flowchart of the line-of-sight detection operation.

When the line-of-sight detection operation starts, in step S801 of FIG. 8 the light sources 13a and 13b emit infrared light toward the user's eyeball 14. The image of the user's eyeball illuminated by the infrared light is formed on the eyeball image sensor 17 through the light receiving lens 16 and photoelectrically converted by the eyeball image sensor 17. This yields an electrical signal of the eye image that can be processed.

In step S802, the line-of-sight detection circuit 201 sends the eye image obtained from the eyeball image sensor 17 (eye image signal; the electrical signal of the eye image) to the CPU 3.

In step S803, the CPU 3 obtains, from the eye image obtained in step S802, the coordinates of the points corresponding to the corneal reflection images Pd and Pe of the light sources 13a and 13b and to the pupil center c.

The infrared light emitted from the light sources 13a and 13b illuminates the cornea 142 of the user's eyeball 14. The corneal reflection images Pd and Pe, formed by part of the infrared light reflected at the surface of the cornea 142, are collected by the light receiving lens 16 and imaged on the eyeball image sensor 17, where they become the corneal reflection images Pd' and Pe' in the eye image. Similarly, the light fluxes from the edges a and b of the pupil 141 are imaged on the eyeball image sensor 17 and become the pupil edge images a' and b' in the eye image.

The distribution 702 shows the luminance information (luminance distribution) of the region α' in the eye image 701. Taking the horizontal direction of the eye image 701 as the X-axis direction and the vertical direction as the Y-axis direction, the distribution 702 shows the luminance distribution in the X-axis direction. In the present embodiment, the X-axis (horizontal) coordinates of the corneal reflection images Pd' and Pe' are denoted Xd and Xe, and the X-axis coordinates of the pupil edge images a' and b' are denoted Xa and Xb. As shown in the luminance distribution 702 of FIG. 7, an extremely high luminance level is obtained at the coordinates Xd and Xe of the corneal reflection images Pd' and Pe'. In the region from Xa to Xb, which corresponds to the region of the pupil 141 (the region of the pupil image obtained when the light flux from the pupil 141 is imaged on the eyeball image sensor 17), an extremely low luminance level is obtained everywhere except at Xd and Xe. In the region of the iris 143 outside the pupil 141 (the region of the iris image outside the pupil image, obtained when the light flux from the iris 143 is imaged), a luminance between these two levels is obtained. Specifically, this intermediate luminance is obtained in the region where the X coordinate is smaller than Xa and in the region where the X coordinate is larger than Xb.

From the luminance distribution 702, the X coordinates Xd and Xe of the corneal reflection images Pd' and Pe' and the X coordinates Xa and Xb of the pupil edge images a' and b' can be obtained. Specifically, the coordinates with extremely high luminance are taken as the coordinates of the corneal reflection images Pd' and Pe', and the coordinates with extremely low luminance are taken as the coordinates of the pupil edge images a' and b'. When the rotation angle θx of the optical axis of the eyeball 14 with respect to the optical axis of the light receiving lens 16 is small, the coordinate Xc of the pupil center image c' (the center of the pupil image), obtained when the light flux from the pupil center c is imaged on the eyeball image sensor 17, can be expressed as Xc ≈ (Xa + Xb) / 2. That is, the coordinate Xc of the pupil center image c' can be calculated from the X coordinates Xa and Xb of the pupil edge images a' and b'. In this way, the coordinates of the corneal reflection images Pd' and Pe' and the coordinate of the pupil center image c' can be estimated.
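The feature extraction described above can be sketched in Python as a simple thresholding pass over one row of luminance values. This is a minimal sketch under stated assumptions: the two thresholds, the function name `extract_eye_features`, and the choice to take the centre of each bright run are illustrative and not taken from the source. Because the source does not say which reflection lies on which side, the two reflection coordinates are returned in left-to-right order rather than labelled Xd and Xe.

```python
def extract_eye_features(profile, high_thresh, low_thresh):
    """Estimate feature coordinates from a 1-D luminance profile.

    Returns (reflections, (xa, xb), xc), where `reflections` holds the X
    coordinates of the two corneal reflection images, xa and xb are the
    pupil edge coordinates, and xc = (xa + xb) / 2 is the pupil centre.
    """
    bright = [x for x, v in enumerate(profile) if v >= high_thresh]
    dark = [x for x, v in enumerate(profile) if v <= low_thresh]

    # Group consecutive bright pixels into runs [start, end].
    runs = []
    for x in bright:
        if runs and x == runs[-1][1] + 1:
            runs[-1][1] = x
        else:
            runs.append([x, x])
    if len(runs) != 2 or not dark:
        raise ValueError("expected two corneal reflections and a pupil region")

    reflections = tuple((start + end) / 2 for start, end in runs)
    xa, xb = dark[0], dark[-1]   # outermost extremely dark pixels
    xc = (xa + xb) / 2           # valid when the rotation angle is small
    return reflections, (xa, xb), xc


# Hypothetical profile: iris (120), pupil (10), two reflections (250).
profile = ([120] * 5 + [10] * 4 + [250] * 2 + [10] * 3
           + [250] * 2 + [10] * 4 + [120] * 5)
features = extract_eye_features(profile, high_thresh=200, low_thresh=50)
```

For this toy profile the pupil spans X = 5 to 19, so the estimated pupil centre is 12.0, and the reflections are found at 9.5 and 14.5.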

In step S804, the CPU 3 calculates the imaging magnification β of the eyeball image. The imaging magnification β is determined by the position of the eyeball 14 relative to the light receiving lens 16 and can be obtained as a function of the interval (Xd − Xe) between the corneal reflection images Pd' and Pe'.

In step S805, the CPU 3 calculates the rotation angle of the optical axis of the eyeball 14 with respect to the optical axis of the light receiving lens 16. The X coordinate of the midpoint between the corneal reflection images Pd and Pe substantially coincides with the X coordinate of the center of curvature O of the cornea 142. Therefore, if Oc denotes the standard distance from the center of curvature O of the cornea 142 to the center c of the pupil 141, the rotation angle θx of the eyeball 14 in the Z-X plane (the plane perpendicular to the Y axis) can be calculated by the following equation (1). The rotation angle θy of the eyeball 14 in the Z-Y plane (the plane perpendicular to the X axis) can be calculated in the same way as θx.

β × Oc × sin θx ≈ {(Xd + Xe) / 2} − Xc ... (1)
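Equation (1) can be solved for θx by dividing by β × Oc and taking the arcsine. The sketch below illustrates this; the function name, the clamping of the sine to [-1, 1] as a guard against measurement noise, and all numeric values are assumptions for illustration, not values from the source.

```python
import math


def rotation_angle(xd, xe, xc, beta, oc):
    """Solve equation (1) for the eyeball rotation angle, in radians.

    xd, xe : X coordinates of the corneal reflection images Pd', Pe'
    xc     : X coordinate of the pupil centre image c'
    beta   : imaging magnification of the eyeball image
    oc     : standard distance from corneal curvature centre O to pupil centre c
    """
    sin_theta = ((xd + xe) / 2 - xc) / (beta * oc)
    # Clamp so that noisy measurements cannot push asin out of its domain
    # (an assumption of this sketch, not part of the source).
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.asin(sin_theta)


# Hypothetical values: reflection midpoint 12.0, pupil centre 11.0.
theta_x = rotation_angle(xd=9.5, xe=14.5, xc=11.0, beta=0.5, oc=4.0)
theta_x_deg = math.degrees(theta_x)
```

Here the right-hand side of equation (1) is 1.0 and β × Oc is 2.0, so sin θx = 0.5 and θx is about 30 degrees. The same function applies to θy when Y coordinates are supplied.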

In step S806, the CPU 3 uses the rotation angles θx and θy calculated in step S805 to obtain (estimate) the user's viewpoint (the position on which the line of sight falls; the position the user is looking at) in the viewing image displayed on the display device 10. Assuming that the viewpoint coordinates (Hx, Hy) are the coordinates corresponding to the pupil center c, they can be calculated by the following equations (2) and (3).

Hx = m × (Ax × θx + Bx) ... (2)
Hy = m × (Ay × θy + By) ... (3)

The parameter m in equations (2) and (3) is a constant determined by the configuration of the finder optical system of the camera 1 (the light receiving lens 16 and the like). It is a conversion coefficient that converts the rotation angles θx and θy into the coordinates corresponding to the pupil center c in the viewing image, and it is assumed to be determined in advance and stored in the memory unit 4. The parameters Ax, Bx, Ay, and By are line-of-sight correction parameters that correct for individual differences in the line of sight. They are obtained through the calibration work described later and are assumed to be stored in the memory unit 4 before the line-of-sight detection operation starts.
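Equations (2) and (3) translate directly into code. The following sketch shows the mapping from rotation angles to viewpoint coordinates; the `GazeCorrection` container, the function name, and the numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GazeCorrection:
    """Per-user line-of-sight correction parameters Ax, Bx, Ay, By."""
    ax: float
    bx: float
    ay: float
    by: float


def viewpoint(theta_x, theta_y, m, corr):
    """Apply equations (2) and (3): map rotation angles to (Hx, Hy)."""
    hx = m * (corr.ax * theta_x + corr.bx)
    hy = m * (corr.ay * theta_y + corr.by)
    return hx, hy


# Hypothetical values for illustration only.
corr = GazeCorrection(ax=2.0, bx=0.5, ay=1.5, by=-0.25)
hx, hy = viewpoint(theta_x=0.25, theta_y=-0.5, m=100.0, corr=corr)
```

With these values Hx = 100 × (2.0 × 0.25 + 0.5) = 100 and Hy = 100 × (1.5 × (−0.5) − 0.25) = −100. Keeping the per-user parameters separate from the optical constant m mirrors the split described above, where m is fixed by the finder optics and Ax, Bx, Ay, By come from calibration.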

In step S807, the CPU 3 stores the coordinates (Hx, Hy) of the viewpoint in the memory unit 4 and ends the line-of-sight detection operation.
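As an illustration of steps S805 and S806, the following minimal Python sketch solves formula (1) for the rotation angle and then applies equations (2) and (3). The function and argument names are hypothetical, the approximate equality in formula (1) is treated as an exact one, and the clamping of the sine value is an added safeguard that the description does not mention.

```python
import math


def rotation_angle(x_d, x_e, x_c, beta, o_c):
    """Solve formula (1) for the rotation angle (radians).

    x_d, x_e: X coordinates of the corneal reflex images Pd and Pe.
    x_c: X coordinate of the pupil center c.
    beta, o_c: the factor beta and the standard distance Oc of formula (1).
    """
    s = ((x_d + x_e) / 2.0 - x_c) / (beta * o_c)
    # Assumption: clamp to [-1, 1] so measurement noise cannot break asin().
    s = max(-1.0, min(1.0, s))
    return math.asin(s)


def viewpoint(theta_x, theta_y, m, a_x, b_x, a_y, b_y):
    """Equations (2) and (3): map rotation angles to viewpoint coordinates."""
    h_x = m * (a_x * theta_x + b_x)
    h_y = m * (a_y * theta_y + b_y)
    return h_x, h_y
```

With the same value used for the midpoint of Pd and Pe and for the pupil center, the sketch returns a rotation angle of zero, so the estimated viewpoint reduces to (m × Bx, m × By).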

<Explanation of calibration work>
As described above, the viewpoint can be estimated by acquiring the rotation angles θx and θy of the eyeball 14 from the eye image in the line-of-sight detection operation and converting the position of the pupil center c into a position on the visual image.

However, the viewpoint may not be estimated with high accuracy because of factors such as individual differences in the shape of the human eyeball. Specifically, unless the line-of-sight correction parameters Ax, Ay, Bx, and By are adjusted to values suitable for the user, a deviation arises between the actual viewpoint B and the estimated viewpoint C, as shown in FIG. 5B. In FIG. 5B, the user is gazing at a person, but the camera 1 erroneously estimates that the user is gazing at the background, so appropriate focus detection and adjustment cannot be performed.

Therefore, before the camera 1 performs imaging, it is necessary to perform calibration work, acquire viewpoint correction parameters (eyeball characteristics) suitable for the user, and store them in the camera 1. The viewpoint correction parameters (eyeball characteristics) include delay information indicating the degree of delay in eye movement. The CPU 3 that executes the calibration process corresponds to the eyeball characteristic acquisition means.

Conventionally, the calibration work has been performed before imaging by highlighting a plurality of indexes at different positions in the visual image, as shown in FIG. 5C, and having the user look at each index. A known technique performs the line-of-sight detection operation while the user gazes at each index and obtains viewpoint correction parameters suitable for the user from the plurality of calculated viewpoints (estimated positions) and the coordinates of the indexes. As long as the position the user should look at is indicated, an index need not be displayed; the position may instead be emphasized by changing the brightness or color.
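The description does not say how the correction parameters are derived from the index coordinates. Purely as an assumption, the sketch below fits Ax and Bx by ordinary least squares so that equation (2) best reproduces the known index positions from the measured rotation angles; the same function would apply to Ay and By. All names are hypothetical.

```python
def fit_correction(thetas, index_coords, m):
    """Least-squares fit of (A, B) in H = m * (A * theta + B).

    thetas: rotation angles measured while the user gazed at each index.
    index_coords: known coordinate of each index along the same axis.
    m: the conversion coefficient of equations (2) and (3).
    """
    n = len(thetas)
    if n < 2 or n != len(index_coords):
        raise ValueError("need at least two matching samples")
    mean_t = sum(thetas) / n
    mean_h = sum(index_coords) / n
    var_t = sum((t - mean_t) ** 2 for t in thetas)
    if var_t == 0:
        raise ValueError("indexes must produce different rotation angles")
    cov = sum((t - mean_t) * (h - mean_h) for t, h in zip(thetas, index_coords))
    slope = cov / var_t                 # equals m * A
    intercept = mean_h - slope * mean_t  # equals m * B
    return slope / m, intercept / m
```

Two indexes are the minimum for a line fit; additional indexes average out measurement noise.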

Further, during the calibration work, the CPU 3 stores the number of delay frames (delay information) C of the human eye movement. The number of delay frames C is the number of frames between the frame in which the calibration index of FIG. 5C is displayed and the frame in which the line of sight is detected at the position of that index in the displayed image. The number of delay frames C is stored in the memory unit 4 for use in S110 of the line-of-sight detection frame identification process described later with reference to FIG. 11. Since it is sufficient to acquire delay information indicating the delay of the eye movement, a delay time may be acquired and stored instead of the number of delay frames.
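A minimal sketch of how the number of delay frames C might be measured is shown below. The per-frame gaze log, the tolerance used to decide that the gaze has reached the index, and all names are assumptions made for illustration only.

```python
def delay_frames(index_frame, index_pos, gaze_by_frame, tol):
    """Frames between showing an index and first detecting the gaze on it.

    index_frame: display frame number in which the index appeared.
    index_pos: (x, y) position of the index in the displayed image.
    gaze_by_frame: dict mapping display frame number to detected (x, y) gaze.
    tol: assumed tolerance (same units as the coordinates).
    Returns None if the gaze never reaches the index.
    """
    for frame in sorted(f for f in gaze_by_frame if f >= index_frame):
        gx, gy = gaze_by_frame[frame]
        if abs(gx - index_pos[0]) <= tol and abs(gy - index_pos[1]) <= tol:
            return frame - index_frame
    return None
```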

<Explanation of the display frame used for line-of-sight detection>
As mentioned at the beginning, it is desirable to identify which display frame the user was actually looking at when the gazing point of a line-of-sight detection result was obtained. In the present disclosure, the display frame that the user was actually looking at is referred to as the display frame used for line-of-sight detection, or the line-of-sight detection frame. The line-of-sight detection frame can also be referred to as the display frame corresponding to the line-of-sight detection result or the display frame corresponding to the gazing point.

The relationship between the frames displayed on the display device 10 (display frames) and the frames of the eye image (eyeball imaging frames) will be described with reference to FIGS. 9A to 9F. In these figures, the vertical direction represents the rows in a frame and the horizontal direction represents time.

FIGS. 9A to 9F differ in the combination of the synchronization relationship between the display frame and the eyeball imaging frame and the readout time of the eyeball image sensor 17. To allow the display scanning time of the display device 10 to be compared with the readout time of the eyeball image sensor 17, the display scanning time is kept constant in all cases, but it may be changeable. FIGS. 9A to 9D and 9G assume imaging by the eyeball image sensor 17 with the rolling shutter method, and FIGS. 9E and 9F assume imaging with the global shutter method.

In FIGS. 9A to 9F, 900 and 901 indicate display frames, and the display frame 901 is the frame image displayed next after the display frame 900. The dotted lines 904 and 905 indicate the display synchronization timings. 902 is the exposure period of the eyeball image sensor 17, and 903 is an eyeball imaging frame. The image captured by the eyeball image sensor 17 during the exposure period 902 is the eyeball imaging frame 903.

906 and 907 represent positions of the user's corneal reflex image (Purkinje image); when there are four corneal reflex images, they represent the position of the corneal reflex image detected lowest in the image. Since the gazing point (the viewpoint coordinates calculated in S807) is detected based on the position of the corneal reflex image and the position of the pupil center, 906 and 907 are also referred to below as gazing points for simplicity. The tip of the arrow extending from each of the corneal reflex image positions 906 and 907 represents the gazing point calculated from that position, that is, the position in the display frame at which the user is actually gazing. Where necessary, the position of the corneal reflex image in the eye image is referred to as the gazing point in the eye image, and the position at which the user is actually gazing is referred to as the gazing point in the display frame. For convenience of explanation, the figures show two gazing points in the eye image during one exposure period, but there is actually only one gazing point.

FIG. 9A shows a case where the display device 10 and the eyeball image sensor 17 are synchronized and the readout time of the eyeball image sensor 17 is equal to or shorter than the display scanning time of the display device 10.

In the present embodiment, the display device 10 and the eyeball image sensor 17 being synchronized means that the timing at which the display device 10 starts displaying a frame image coincides with the timing at which the eyeball image sensor 17 starts reading out an eyeball imaging frame. The display device 10 starts displaying a new frame image in response to the input of a display synchronization signal. The eyeball image sensor 17 starts reading out the signal charge accumulated during the exposure period in response to the input of an imaging synchronization signal. Synchronization of the display device 10 and the eyeball image sensor 17 can therefore also be expressed as the display synchronization signal and the imaging synchronization signal being input at the same timing.

In the case of FIG. 9A, regardless of the position of the gazing point 906 in the eye image, the display frame 900 is displayed at the gazing point at the timing when the corneal reflex image is captured. Therefore, regardless of the position of the gazing point, the line-of-sight detection frame is the display frame 900, which is displayed when the exposure of the eyeball imaging frame 903 starts. The display frame 900 can also be identified as the display frame immediately before the display frame 901, whose display starts at the input timing of the imaging synchronization signal.

FIG. 9B shows a case where the display device 10 and the eyeball image sensor 17 are synchronized and the readout time of the eyeball image sensor 17 is longer than the display scanning time of the display device 10. In this case, the line-of-sight detection frame may differ depending on the position of the gazing point in the eye image. In this example, the gazing point 906 corresponds to the display frame 900 and the gazing point 907 corresponds to the display frame 901.

FIG. 9C shows a case where the display device 10 and the eyeball image sensor 17 are asynchronous and the readout time of the eyeball image sensor 17 is equal to or shorter than the display scanning time of the display device 10. In this case, the line-of-sight detection frame may differ depending on the position of the gazing point in the eye image. In this example, the gazing point 906 corresponds to the display frame 901 and the gazing point 907 corresponds to the display frame 900.

FIG. 9D shows a case where the display device 10 and the eyeball image sensor 17 are asynchronous and the readout time of the eyeball image sensor 17 is longer than the display scanning time of the display device 10. In this case, the line-of-sight detection frame differs depending on the position of the gazing point in the eye image. The display frame corresponding to the line-of-sight detection result is obtained in the same manner as in FIG. 9B, except that an adjustment must be made for the asynchrony between the display frame and the imaging frame, that is, for the difference between the display start timing and the imaging start timing.

FIG. 9E shows a case where the display device 10 and the eyeball image sensor 17 are synchronized and the eyeball image sensor 17 captures images with the global shutter method. This case can be treated as the case of FIG. 9A with the readout time of the eyeball image sensor 17 set to zero. Therefore, regardless of the position of the gazing point 906 in the eye image, the line-of-sight detection frame can be determined to be the display frame 900.

FIG. 9F shows a case where the display device 10 and the eyeball image sensor 17 are asynchronous and the eyeball image sensor 17 captures images with the global shutter method. This case can be treated as the case of FIG. 9C with the readout time of the eyeball image sensor 17 set to zero. Therefore, the line-of-sight detection frame differs depending on the position of the gazing point in the eye image.

FIG. 9G shows a case where the display device 10 and the eyeball image sensor 17 are synchronized and the ratio of the imaging frame rate of the eyeball image sensor 17 to the display frame rate of the display device 10 is 2:1. If the eyeball imaging frame 908 is not used for line-of-sight detection and only the eyeball imaging frames 903 and 909 are used, the line-of-sight detection frame can be obtained in the same manner as in FIG. 9A. If only the eyeball imaging frame 908 is used for line-of-sight detection, the line-of-sight detection frame can be obtained in the same manner as in FIG. 9C. In other words, when the ratio of the frame rate of the eyeball image sensor 17 to the frame rate of the display device 10 is n:1 (n is an integer of 2 or more), the line-of-sight detection frame can be identified easily by using only the eye images captured in synchronization with the display.

FIGS. 9A to 9G are only some examples. They show that the display frame corresponding to the line-of-sight detection position differs depending on the synchronization relationship between the display frame and the eyeball imaging frame, the display scanning time of the display device 10, the readout time of the eyeball image sensor 17, and the vertical position of the gazing point.

<Explanation of line-of-sight detection frame identification process>
The line-of-sight detection frame identification process will be described below with reference to FIGS. 10 and 11. This process identifies the line-of-sight detection frame based on the drive mode information of the display device 10 and the eyeball image sensor 17.

FIG. 10 is a timing chart showing the sequence in which image data captured by the image sensor 2 is displayed on the display device 10, the eyeball of the user viewing the display is imaged by the eyeball image sensor 17, and the line of sight is detected from the eye image. The horizontal direction of FIG. 10 is the time axis. 1000 to 1003 indicate display frames, and 1004 indicates the user's gazing point. T0 is the synchronization timing of the image sensor 2. T1 is the synchronization timing of the display device 10, that is, the timing at which the display synchronization signal is input (display scanning start timing). T2 is the synchronization timing of the eyeball image sensor 17, that is, the timing at which the imaging synchronization signal is input (readout start timing; drive start timing). T3 is the display scanning completion timing of the display device 10. T4 is the readout completion timing of the eyeball image sensor 17. T5 is the output timing of the line-of-sight detection result, that is, the timing at which the position of the gazing point in the displayed image is output. The difference between T3 and T1 corresponds to the display scanning time of the display device 10, and the difference between T4 and T2 corresponds to the readout time of the eyeball image sensor 17. The symbols α, Td, Ts, and Tr in FIG. 10 are described below.

FIG. 11 is a flowchart showing the flow of the line-of-sight detection frame identification process (hereinafter also referred to as the frame identification process). This process is described as being controlled by the CPU 3.

In S101, the CPU 3 determines whether the eyeball image sensor 17 and the display device 10 are synchronized and the readout time of the eyeball image sensor 17 is equal to or shorter than the display scanning time of the display device 10. When the eyeball image sensor 17 captures images with the global shutter method (FIGS. 9E and 9F), the readout time of the eyeball image sensor 17 is regarded as zero. If the condition is satisfied, the CPU 3 proceeds to S108; otherwise, it proceeds to S102.
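The S101 decision can be sketched as a small predicate. The function and parameter names are hypothetical; the logic follows the condition stated above, with the readout time treated as zero for a global shutter.

```python
def s101_condition(synchronized, readout_time, scan_time, global_shutter=False):
    """Return True when S101 is satisfied (the process goes to S108).

    synchronized: True if display and eyeball image sensor share sync timing.
    readout_time: readout time of the eyeball image sensor (T4 - T2).
    scan_time: display scanning time of the display device (T3 - T1).
    global_shutter: True for global shutter imaging (readout regarded as zero).
    """
    if global_shutter:
        readout_time = 0.0
    return synchronized and readout_time <= scan_time
```

Under these assumptions the predicate is true for the situations of FIGS. 9A and 9E and false for those of FIGS. 9B, 9C, 9D, and 9F.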

In S102, the CPU 3 determines whether the eyeball image sensor 17 was exposed during the display scan of the display device 10. If it was, the CPU 3 proceeds to S103; if not, it proceeds to S108.

In S103, the CPU 3 calculates the time difference α between the readout start time T2 of the eyeball image sensor 17 and the display start time T1 of the display device 10 by equation (4).

α = T2 − T1 ・・・ (4)

In S104, the CPU 3 calculates, by equation (5), the time Ts from when the eyeball image sensor 17 starts readout until it starts reading out the line in which the gazing point (corneal reflex image) appears. With global shutter exposure, Ts = 0.

Ts = (readout time of one horizontal line of the eyeball image sensor 17) × (number of vertical lines up to and including the gazing point on the eyeball image sensor 17) ・・・ (5)
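Equation (5) can be written as a short function. This is a sketch with hypothetical names; time units are arbitrary as long as they are consistent.

```python
def time_to_gaze_line_readout(line_readout_time, gaze_lines, global_shutter=False):
    """Equation (5): time Ts from readout start to the gazing-point line.

    line_readout_time: readout time of one horizontal sensor line.
    gaze_lines: number of vertical lines up to and including the gazing point.
    global_shutter: if True, Ts is zero as stated for global exposure.
    """
    if global_shutter:
        return 0.0
    return line_readout_time * gaze_lines
```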

In S105, the CPU 3 calculates, by equation (6), the time Td from when the display device 10 starts displaying a frame image until it displays the line containing the gazing point.

Td = (time of one horizontal line of the display scan) × (total number of vertical lines of the display device 10) × (number of vertical lines up to the gazing point on the eyeball image sensor 17) / (total number of vertical lines of the eyeball image sensor 17) ・・・ (6)
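Equation (6) scales the gazing-point line on the sensor to the corresponding display line and converts it to time. A minimal sketch with hypothetical names follows.

```python
def time_to_gaze_line_display(display_line_time, display_total_lines,
                              eye_gaze_lines, eye_total_lines):
    """Equation (6): time Td from display start to the gazing-point line.

    display_line_time: scan time of one horizontal display line.
    display_total_lines: total number of vertical lines of the display.
    eye_gaze_lines: vertical lines up to the gazing point on the eye sensor.
    eye_total_lines: total number of vertical lines of the eye sensor.
    """
    if eye_total_lines <= 0:
        raise ValueError("eye_total_lines must be positive")
    return (display_line_time * display_total_lines
            * eye_gaze_lines / eye_total_lines)
```

For example, a gazing point halfway down the sensor yields half of the full display scanning time.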

In S107, the CPU 3 calculates the frame difference X (adjustment information) between the line-of-sight detection frame and the display frame at the start of driving of the eyeball image sensor 17. The line-of-sight detection frame is considered to be the display frame that is displayed when exposure starts on the line of the eyeball image sensor 17 in which the corneal reflex image is captured. Therefore, the value X can be calculated from the difference between the timing at which the display device 10 starts displaying the line containing the gazing point after starting to display a frame image (Td) and the timing at which exposure starts on the line of the eyeball image sensor 17 in which the corneal reflex image is captured (α + Ts − exposure time). Specifically, the CPU 3 calculates the value of X by the following equation (7). In the example of FIG. 10, X = 0.

X = Floor[(Ts + α − Td − exposure time) / display cycle] ・・・ (7)

Here, Floor(x) is the floor function, that is, the function that returns the largest integer not exceeding the real number x. The exposure time is the exposure time of one line of the eyeball image sensor 17. The display cycle is the input interval of the display synchronization signal of the display device 10.

Since equation (7) uses the full exposure time, the value X represents the frame difference between the display frame displayed at the start of exposure and the display frame displayed at the start of driving of the eyeball image sensor 17. If half of the exposure time is used instead, the value X is obtained for the display frame displayed midway between the start and the end of exposure. If the exposure time is omitted, the value X is obtained for the display frame displayed at the end of exposure. The value X may be calculated in either of these ways instead of by equation (7).
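Equation (7) and its two variants can be covered by one function. In this sketch the parameter exposure_fraction is a hypothetical knob, not part of the description: 1.0 reproduces equation (7) (start of exposure), 0.5 uses half the exposure time (midpoint), and 0.0 omits it (end of exposure).

```python
import math


def frame_difference_x(ts, alpha, td, exposure_time, display_cycle,
                       exposure_fraction=1.0):
    """Equation (7): frame difference X between the line-of-sight detection
    frame and the display frame at the start of driving of the eye sensor."""
    if display_cycle <= 0:
        raise ValueError("display_cycle must be positive")
    numerator = ts + alpha - td - exposure_fraction * exposure_time
    return math.floor(numerator / display_cycle)
```

With α = 0 and Ts slightly smaller than Td, the numerator is a small negative number, so the function returns −1.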

In S108, the CPU 3 sets X = −1. This means that the line-of-sight detection frame is the display frame immediately before the frame displayed at the drive timing of the eyeball image sensor 17.

In the flowchart of FIG. 11, whether the value X is calculated by S107 or by S108 depends on the determination results of S101 and S102. However, the same result is obtained even if the determinations of S101 and S102 are skipped and the value X is always calculated by S107. For example, when the determination in S101 is affirmative, α = 0 and Ts ≤ Td, and the difference between Ts and Td and the exposure time are sufficiently small compared with the display cycle, so equation (7) also gives X = −1.

In the cases of FIGS. 9A and 9G, however, it is convenient that the value of X can be determined without performing S103 to S107. That is, when the display device 10 and the eyeball image sensor 17 are synchronized and the imaging frame rate of the eyeball image sensor 17 is an integral multiple of the display frame rate of the display device 10, the following approach can be used. The line-of-sight detection circuit 201 performs line-of-sight detection using only the eye images whose readout started at the same timing as the display start timing of the display device 10. In this way, without performing S103 to S107, X = −1 is obtained; that is, the line-of-sight detection frame can be identified as the frame immediately before the display frame displayed on the display device 10 at the timing when the eyeball image sensor 17 starts readout.
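One way to picture this shortcut is as a filter over eyeball imaging frames. The sketch below assumes, purely for illustration, that eye frames are numbered from 0 and that frame 0 starts readout at a display start timing, so every n-th frame is synchronized with the display. The names are hypothetical.

```python
X_FOR_SYNCHRONIZED_FRAMES = -1  # value of X when only synchronized frames are used


def synchronized_eye_frames(eye_frame_indices, n):
    """Keep only eye frames whose readout starts at a display start timing.

    eye_frame_indices: iterable of eye frame numbers (frame 0 assumed aligned).
    n: ratio of the eye sensor frame rate to the display frame rate (n:1).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [i for i in eye_frame_indices if i % n == 0]
```

For a 2:1 ratio as in FIG. 9G, every other eye frame is kept, and the frame difference X for each kept frame is −1 without evaluating S103 to S107.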

In S109, the CPU 3 calculates the number of frames Y between the display frame at the drive start timing (T2) of the eyeball image sensor 17 and the display frame at the line-of-sight detection completion timing (T5). Specifically, the value Y is the number of frames corresponding to the sum of the time Tr and the time α, and is calculated by equation (8). In the example of FIG. 10, Y = 2.

Y = Floor((Tr + α) / display cycle) ・・・ (8)

Here, Tr = T5 − T2, and Floor(x) is the floor function.
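Equation (8) in code form, as a sketch with hypothetical names. Because α = T2 − T1 and Tr = T5 − T2, the numerator equals T5 − T1, the time elapsed since the display start timing.

```python
import math


def frames_until_output(t1, t2, t5, display_cycle):
    """Equation (8): number of frames Y between the display frame at the
    drive start timing T2 and the display frame at the output timing T5."""
    if display_cycle <= 0:
        raise ValueError("display_cycle must be positive")
    alpha = t2 - t1   # equation (4)
    tr = t5 - t2      # time from drive start to detection result output
    return math.floor((tr + alpha) / display_cycle)
```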

In S110, the CPU 3 calculates, by equation (9), a value Z indicating how many frames the line-of-sight detection frame precedes the display frame at the line-of-sight result output timing (T5). Specifically, the value Z is calculated by adjusting the value Y using the adjustment information X based on the drive mode information and the delay information C of the eyeball characteristics.

Z = Y − (X + C) ・・・ (9)

Here, C is the number of delay frames acquired at the time of calibration.
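Equation (9) and the resulting frame lookup can be sketched as follows. The function names and the idea of addressing display frames by a running index are assumptions for illustration.

```python
def frames_before_output(y, x, c):
    """Equation (9): Z = Y - (X + C)."""
    return y - (x + c)


def detection_frame_index(output_frame_index, y, x, c):
    """Index of the line-of-sight detection frame, Z frames before the
    display frame shown when the detection result is output."""
    return output_frame_index - frames_before_output(y, x, c)
```

For instance, with Y = 2, X = −1, and C = 0, the sketch gives Z = 3, so the line-of-sight detection frame is three frames before the frame displayed at the output timing.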

Through the above process, the line-of-sight detection frame is identified as the frame Z frames before the display frame that is displayed at the timing when the line-of-sight detection result is obtained. The value Z is the adjusted frame difference obtained by adjusting the frame difference Y, which is based on the timing at which the line-of-sight detection result is obtained, using the adjustment information X based on the drive mode information of the eyeball image sensor 17 and the display device 10 and the delay information C of the eyeball. In the adjustment based on the drive mode information (S107), the synchronization timing (T2), the image readout time from the eyeball image sensor 17 (T4 − T2), and the exposure time per line are used as the drive mode information of the eyeball image sensor 17. The synchronization timing (T1) and the display scanning time (T3 − T1) are used as the drive mode information of the display device 10.

By adjusting the value Y obtained by equation (8) using the value X (adjustment information) calculated from the drive mode information of the display device 10 and the eyeball image sensor 17, the line-of-sight detection frame can be identified more accurately.

When the display frame corresponding to the line-of-sight detection result is identified, the object at which the user was actually gazing can be identified, so various processes for that object can be performed more appropriately. This is particularly effective when the subject moves quickly. Focusing control and tracking correction processing are described below as examples of processing for the object at which the user was actually gazing.

<Explanation of focusing control>
Focusing control using the gazing point obtained by line-of-sight detection will be described below with reference to FIGS. 12 and 13. The image pickup apparatus 1 performs focusing control so that the subject at which the user is actually gazing is brought into focus.

FIG. 12 is a diagram showing images 1201 to 1204 obtained when images captured by the image pickup apparatus 1 according to the present embodiment are displayed on the display device 10. Images 1201 to 1203 are in chronological order, and the automobile 1200 moves from the upper right to the lower left across them. The image 1204 is an image of the same frame as the image 1203, and the position and size of the automobile 1200 are the same. The tracking frames 1211 to 1214 are images representing the region in which the automobile 1200, the tracking target, exists. The information of the tracking frames 1211 to 1214 is output from the tracking circuit 207, superimposed on the captured image, and displayed on the display device 10.

1221, 1222, and 1223 indicate gazing points. A gazing point is actually a point, but as described later, a rectangular image centered on the gazing point is used for the focusing control process. Therefore, FIG. 12 shows the gazing points 1221, 1222, and 1223 as rectangular images. In the present disclosure, a rectangular image centered on a gazing point is also referred to as a gazing point image. In the following, a gazing point and its gazing point image are referred to by the same reference numeral. For example, the rectangular image centered on the gazing point 1221 is referred to as the gazing point image 1221. Although the gazing points 1221, 1222, and 1223 are drawn in FIG. 12, they may or may not be displayed on the display device 10. When a gazing point is displayed, a display frame centered on the gazing point and having the same size as the gazing point image (rectangular image) is displayed on the image.

注視点１２２１は、表示デバイス１０に画像１２０１のフレームが表示されているときのユーザーの注視点を表す。画像１２０２は、画像１２０１の次のフレームであり、自動車１２００が移動している。画像１２０３および１２０４は、画像１２０２の次のフレームである。ここでは、ユーザーが画像１２０１を見ていたときの注視点１２２１がその２フレーム後に検出された、すなわち上述のＺ＝２であるとして説明する。 The gazing point 1221 represents the gazing point of the user when the frame of the image 1201 is displayed on the display device 10. The image 1202 is the frame next to the image 1201, and the automobile 1200 is moving. Images 1203 and 1204 are the next frames of image 1202. Here, it is assumed that the gazing point 1221 when the user is looking at the image 1201 is detected two frames later, that is, Z = 2 described above.
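The relationship between the current display frame and the line-of-sight detection frame (Z frames earlier) can be illustrated with a small frame history. This is a minimal sketch, not the patent's implementation: the class `DisplayFrameHistory`, its capacity, and the use of reference numerals as frame identifiers are assumptions made for illustration.

```python
from collections import deque


class DisplayFrameHistory:
    """Hypothetical helper: remembers recent display frames so that the
    frame shown Z frames before the newest one can be looked up."""

    def __init__(self, capacity=8):
        # Oldest entries are dropped automatically once capacity is reached.
        self._frames = deque(maxlen=capacity)

    def push(self, frame_id):
        """Record a newly displayed frame."""
        self._frames.append(frame_id)

    def gaze_detection_frame(self, z):
        """Return the frame displayed z frames before the newest frame,
        or None if that frame is no longer (or not yet) in the history."""
        if z < 0 or z >= len(self._frames):
            return None
        return self._frames[-1 - z]


history = DisplayFrameHistory()
for frame_id in (1201, 1202, 1203):
    history.push(frame_id)

# With Z = 2, the gaze detected while frame 1203 is shown belongs to frame 1201.
detected_in = history.gaze_detection_frame(2)
```

The history only needs to hold as many frames as the largest expected value of Z.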

注視点１２２２は、画像内における水平位置および垂直位置が注視点１２２１と同じである。撮像装置１は、ユーザーが注視している被写体に焦点が合うように合焦制御を行う。しかしながら、画像１２０３内の注視点１２２２にある被写体に焦点を合わせることはユーザーの意図に反する。なぜならば、ユーザーは画像１２０１内の注視点１２２１にある被写体への合焦を意図して注視しているためである。そこで、撮像装置１は、視線検出フレーム特定処理によって特定された検出フレームに基づく補正処理により、注視点１２２２を、注視点１２２１に対応する注視点１２２３へと補正する。注視点１２２３に対する合焦制御を行えば、本来ユーザーが望む位置への合焦が行える。注視点の補正処理については、注視点合焦処理の中で説明する。 The gazing point 1222 has the same horizontal and vertical positions as the gazing point 1221 in the image. The image pickup apparatus 1 performs focusing control so as to focus on the subject that the user is gazing at. However, focusing on the subject at the gazing point 1222 in the image 1203 is contrary to the user's intention. This is because the user intends to focus on the subject at the gazing point 1221 in the image 1201. Therefore, the image pickup apparatus 1 corrects the gaze point 1222 to the gaze point 1223 corresponding to the gaze point 1221 by the correction process based on the detection frame specified by the line-of-sight detection frame identification process. By performing focusing control on the gazing point 1223, it is possible to focus on the position originally desired by the user. The gazing point correction process will be described in the gazing point focusing process.

図１３（Ａ）は合焦制御処理の流れを示すフローチャートである。合焦制御処理は、ＣＰＵ３が実行する。合焦制御処理が開始されると、ＣＰＵ３は、Ｓ２０１以降の処理を実行する。 FIG. 13A is a flowchart showing the flow of focusing control processing. The focusing control process is executed by the CPU 3. When the focusing control process is started, the CPU 3 executes the processes after S201.

In S201, the CPU 3 performs the gazing point correction process. This process corrects the gazing point 1222, obtained as the line-of-sight detection result, to the position in the current display frame that corresponds to the gazing point 1221 of the line-of-sight detection frame 1201 at which the user was actually gazing. More specifically, the gazing point is corrected to the position (gazing point 1223) in the current display frame 1204 where the subject located at the gazing point 1221 of the line-of-sight detection frame 1201 now exists.

Ｓ２０２で、ＣＰＵ３は、補正後の注視点（補正注視点）の位置にある被写体に対して合焦するように焦点調節回路１１８を制御する。 In S202, the CPU 3 controls the focus adjustment circuit 118 so as to focus on the subject at the corrected gazing point (corrected gazing point) position.

図１３（Ｂ）は注視点補正処理の流れを示すフローチャートである。図１３（Ｂ）を参照して、Ｓ２０１の注視点補正処理についてより詳細に説明する。 FIG. 13B is a flowchart showing the flow of the gazing point correction process. The gaze point correction process of S201 will be described in more detail with reference to FIG. 13 (B).

Ｓ２１１で、ＣＰＵ３は、視線検出フレーム特定処理で特定した視線検出フレームから、注視点位置を中心とする矩形画像を抽出する。以下では、この画像を注視点画像Ａと称する。図１２の例では、注視点画像Ａは注視点画像１２２１である。 In S211 the CPU 3 extracts a rectangular image centered on the gazing point position from the line-of-sight detection frame specified by the line-of-sight detection frame identification process. Hereinafter, this image will be referred to as a gaze point image A. In the example of FIG. 12, the gazing point image A is the gazing point image 1221.

Ｓ２１２では、ＣＰＵ３は、現在表示している表示フレームに対して、注視点位置の画像を矩形画像として抽出する。以下では、この画像を注視点画像Ｂと称する。図１２の例は、注視点画像Ｂは注視点画像１２２２である。ＣＰＵ３は、追尾枠１２１１に対する注視点画像１２２１の大きさと、追尾枠１２１３に対する注視点画像１２２２の大きさが一致するように、注視点画像１２２２の大きさを設定する。 In S212, the CPU 3 extracts the image at the gazing point position as a rectangular image with respect to the currently displayed display frame. Hereinafter, this image will be referred to as a gaze point image B. In the example of FIG. 12, the gazing point image B is the gazing point image 1222. The CPU 3 sets the size of the gazing point image 1222 so that the size of the gazing point image 1221 with respect to the tracking frame 1211 and the size of the gazing point image 1222 with respect to the tracking frame 1213 match.

In S213, the CPU 3 computes the SAD (Sum of Absolute Differences), that is, the sum of the absolute values of the pixel-value differences, to evaluate the similarity between the gazing point images A and B. A smaller SAD value means a higher similarity. If the gazing point images A and B differ in size, they are brought to the same size before the SAD is computed. Although SAD is used in the present embodiment, the similarity may be calculated by another measure, for example SSD (Sum of Squared Differences) or NCC (Normalized Cross-Correlation).
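The three similarity measures named above can be sketched as follows. This is an illustrative example only: the function names and the representation of an image patch as a list of rows of grayscale values are assumptions, and both patches are assumed to have already been resized to the same dimensions.

```python
import math


def _flatten(patch):
    """Flatten a 2-D patch (list of rows) into a list of floats."""
    return [float(v) for row in patch for v in row]


def _paired(a, b):
    fa, fb = _flatten(a), _flatten(b)
    if len(fa) != len(fb):
        raise ValueError("patches must have the same size")
    return fa, fb


def sad(a, b):
    """Sum of absolute differences: smaller means more similar."""
    fa, fb = _paired(a, b)
    return sum(abs(x - y) for x, y in zip(fa, fb))


def ssd(a, b):
    """Sum of squared differences: smaller means more similar."""
    fa, fb = _paired(a, b)
    return sum((x - y) ** 2 for x, y in zip(fa, fb))


def ncc(a, b):
    """Normalized cross-correlation: closer to 1 means more similar."""
    fa, fb = _paired(a, b)
    denom = math.sqrt(sum(x * x for x in fa) * sum(y * y for y in fb))
    if denom == 0:
        return 0.0  # at least one patch is all zeros; correlation undefined
    return sum(x * y for x, y in zip(fa, fb)) / denom


patch_a = [[10, 20], [30, 40]]
patch_b = [[12, 18], [30, 45]]
sad_value = sad(patch_a, patch_b)
ssd_value = ssd(patch_a, patch_b)
```

Note that SAD and SSD are dissimilarity scores (lower is better), whereas NCC is a similarity score (higher is better), so the direction of any threshold comparison depends on the measure chosen.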

In S214, the CPU 3 determines whether the SAD value calculated in S213 is within the threshold value. If it is, the process proceeds to S215; otherwise, the process proceeds to S216. In other words, S214 is a branch that proceeds to S215 when the similarity between the gazing point images A and B is at or above the threshold and to S216 when the similarity is below it.

Ｓ２１５では、ＣＰＵ３は、注視点の補正処理を行わずに処理を終了する。注視点画像ＡとＢの類似度が高ければ、被写体が移動しておらず、注視点画像Ａ（１２２１）と注視点画像Ｂ（１２２２）は一致しており補正処理が不要なためである。 In S215, the CPU 3 ends the process without performing the gazing point correction process. If the degree of similarity between the gaze point images A and B is high, the subject has not moved, and the gaze point image A (1221) and the gaze point image B (1222) match, and no correction process is required.

On the other hand, if the similarity between the gazing point images A and B is low, the subject has moved. In that case, the processing from S216 onward searches the current frame for the position of the object the user was actually gazing at, and the position found is used as the corrected gazing point 1223.

In S216, the CPU 3 sets the area around the gazing point 1222 as the search area. The search area may be, for example, a rectangular area centered on the gazing point 1222. Alternatively, taking the movement of the subject into account, it may be a rectangular area centered on the position where the object at the gazing point 1221 of the line-of-sight detection frame 1201 is most likely to exist.

Ｓ２１７で、ＣＰＵ３は、現表示フレーム１２０３（１２０４）の追尾枠１２１３内の画像に対して、評価枠を設定する。評価枠内の画像を評価枠画像と称する。評価枠の大きさは、注視点画像Ｂと同じ大きさとする。Ｓ２１８では、ＣＰＵ３は、評価枠画像と注視点画像ＡのあいだのＳＡＤ値を演算する。Ｓ２１９では、ＣＰＵ３は、算出したＳＡＤ値が閾値以内（類似度が閾値以上）であればＳ２２０に進み、そうでなければＳ２２１へ進む。 In S217, the CPU 3 sets an evaluation frame for the image in the tracking frame 1213 of the current display frame 1203 (1204). The image in the evaluation frame is called an evaluation frame image. The size of the evaluation frame is the same as that of the gazing point image B. In S218, the CPU 3 calculates the SAD value between the evaluation frame image and the gazing point image A. In S219, the CPU 3 proceeds to S220 if the calculated SAD value is within the threshold value (similarity is equal to or higher than the threshold value), and proceeds to S221 otherwise.

Ｓ２２０では、ＣＰＵ３は、現在の評価枠の位置を補正後の注視点位置として設定する。これにより、現在フレームのうち、ユーザーが実際に注視していた表示フレーム１２０１の注視点１２２１にある物体が存在する位置を補正後の注視点とすることができる。 In S220, the CPU 3 sets the position of the current evaluation frame as the corrected gazing point position. As a result, among the current frames, the position where the object at the gazing point 1221 of the display frame 1201 that the user was actually gazing at exists can be set as the gazing point after correction.

Ｓ２２１では、ＣＰＵ３は、追尾枠１２１３内の全ての位置での評価枠画像の類似度評価が完了したか判定し、完了している場合はＳ２２２へ進み、完了していない場合はＳ２２３へ進む。 In S221, the CPU 3 determines whether the similarity evaluation of the evaluation frame images at all the positions in the tracking frame 1213 is completed, and if it is completed, proceeds to S222, and if not, proceeds to S223.

In S222, the CPU 3 controls the focus adjustment circuit 118 so as to focus on the subject at the center position of the tracking frame 1213. When no evaluation frame has a similarity above the threshold, assuming that the user is gazing at the center of the tracking frame 1213 is considered to give the smallest error. In S222, a position other than the center of the tracking frame 1213 may also be brought into focus. For example, the CPU 3 may obtain the corrected gazing point position 1223 by moving the gazing point position 1222 according to the difference in position between the tracking frame 1211 and the tracking frame 1213, and then focus on the corrected gazing point position 1223.
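The alternative fallback in S222, moving the gazing point by the displacement of the tracking frame, amounts to a simple translation. The sketch below is an assumption-laden illustration: the function name, the use of tracking-frame centers as reference points, and the pixel coordinates are all hypothetical.

```python
def shift_gaze_by_tracking(gaze_xy, tracking_center_then, tracking_center_now):
    """Translate the detected gazing point by the displacement of the
    tracking frame between the line-of-sight detection frame and the
    current display frame (hypothetical helper)."""
    dx = tracking_center_now[0] - tracking_center_then[0]
    dy = tracking_center_now[1] - tracking_center_then[1]
    return (gaze_xy[0] + dx, gaze_xy[1] + dy)


# Illustrative coordinates: the subject moved from the upper right to the lower left.
corrected = shift_gaze_by_tracking(
    gaze_xy=(300, 100),
    tracking_center_then=(320, 120),
    tracking_center_now=(120, 340),
)
```

This keeps the gazing point at the same offset from the subject as in the detection frame, which is reasonable only if the subject has not changed size much between the two frames.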

In S223, the CPU 3 shifts the evaluation frame by a fixed number of pixels within the image in the tracking frame 1213, sets the shifted frame as the new evaluation frame, and returns the process to S218.

以上の処理により、ユーザーが実際に注視している被写体位置に対して焦点を合わせることができる。 By the above processing, it is possible to focus on the subject position that the user is actually gazing at.
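The flow of FIG. 13(B) can be summarized as a template search. The following is a minimal sketch under stated assumptions, not the patented implementation: frames are lists of rows of grayscale values, positions are the top-left corners of patches (rather than their centers), the evaluation frame is scanned over the whole tracking frame, all rectangles are assumed to lie inside the frame, and the names `crop`, `sad`, and `correct_gaze_point` are hypothetical.

```python
def crop(frame, x, y, w, h):
    """Cut a w-by-h patch whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def sad(a, b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def correct_gaze_point(frame, patch_a, gaze_xy, tracking_rect, threshold, step=1):
    """Return the corrected gaze position (top-left of the matching patch).

    patch_a       -- gazing point image A from the line-of-sight detection frame
    gaze_xy       -- detected gaze position in the current frame (top-left)
    tracking_rect -- (x, y, width, height) of the tracking frame
    """
    ph, pw = len(patch_a), len(patch_a[0])
    gx, gy = gaze_xy

    # S212 to S215: if the patch at the detected position already matches,
    # the subject has not moved and no correction is needed.
    if sad(crop(frame, gx, gy, pw, ph), patch_a) <= threshold:
        return (gx, gy)

    # S217 to S223: slide the evaluation frame over the tracking frame and
    # accept the first position whose SAD is within the threshold (S220).
    tx, ty, tw, th = tracking_rect
    for y in range(ty, ty + th - ph + 1, step):
        for x in range(tx, tx + tw - pw + 1, step):
            if sad(crop(frame, x, y, pw, ph), patch_a) <= threshold:
                return (x, y)

    # S222: no match found, so fall back to the center of the tracking frame.
    return (tx + (tw - pw) // 2, ty + (th - ph) // 2)


# A 6x6 frame with a 2x2 pattern whose top-left corner is at (3, 4).
frame = [[0] * 6 for _ in range(6)]
frame[4][3], frame[4][4] = 9, 8
frame[5][3], frame[5][4] = 7, 6
pattern = [[9, 8], [7, 6]]
found = correct_gaze_point(frame, pattern, (0, 0), (2, 2, 4, 4), threshold=0)
```

Accepting the first position within the threshold keeps the search short; scanning every position and keeping the best match trades extra computation for robustness.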

なお、注視点補正処理により、補正後の注視点画像の大きさも補正される。具体的には、視線検出フレーム１２０１における追尾枠１２１１に対する注視点画像１２２１の大きさの比と、現在フレーム１２０４における追尾枠１２１３に対する注視点画像１２２３の比が同じなるように、補正後の注視点画像の大きさが設定される。 The gaze point correction process also corrects the size of the gaze point image after correction. Specifically, the corrected gazing point so that the ratio of the size of the gazing point image 1221 to the tracking frame 1211 in the line-of-sight detection frame 1201 and the ratio of the gazing point image 1223 to the tracking frame 1213 in the current frame 1204 are the same. The size of the image is set.
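The size correction described above keeps the ratio of gazing point image to tracking frame constant. A minimal sketch, assuming sizes are given as (width, height) in pixels and that width and height are scaled independently; the function name and the rounding to whole pixels are assumptions:

```python
def scaled_gaze_image_size(gaze_size_then, tracking_size_then, tracking_size_now):
    """Scale the gazing point image so that its ratio to the tracking frame
    in the current frame equals the ratio in the line-of-sight detection frame."""
    gaze_w, gaze_h = gaze_size_then
    then_w, then_h = tracking_size_then
    now_w, now_h = tracking_size_now
    if then_w <= 0 or then_h <= 0:
        raise ValueError("tracking frame size must be positive")
    return (round(gaze_w * now_w / then_w), round(gaze_h * now_h / then_h))


# The tracking frame doubled in size, so the gazing point image doubles too.
new_size = scaled_gaze_image_size((20, 10), (100, 50), (200, 100))
```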

In the flowchart of FIG. 13(B), the process proceeds to S220 as soon as an evaluation frame whose similarity is at or above the threshold is found. Alternatively, evaluation frames may be set at all positions in the tracking frame 1213, and the focus may be placed on the evaluation frame position that yields the highest similarity. In this alternative, if the highest similarity obtained is below the threshold, the processing of S222 may be used to focus on the center position of the tracking frame.

<Explanation of tracking correction processing>
The tracking correction process, which uses the gazing point obtained by line-of-sight detection, is described below with reference to FIG. 14. When the deviation between the tracking position (tracking frame) obtained by the tracking circuit 207 and the gazing point is large, the image pickup apparatus 1 resets the tracking position to the subject at which the user is actually gazing.

図１４は、視線検出した注視点を利用した被写体の追尾位置の補正処理の流れを示すフローチャートである。図１４の追尾補正処理のフローは、ＣＰＵ３で制御されるものとして説明するが、追尾補正処理は追尾回路２０７で行っても良い。 FIG. 14 is a flowchart showing a flow of correction processing of the tracking position of the subject using the gazing point detected by the line of sight. The flow of the tracking correction processing of FIG. 14 will be described as being controlled by the CPU 3, but the tracking correction processing may be performed by the tracking circuit 207.

Ｓ３０１では、ＣＰＵ３は、視線検出フレーム特定処理で特定した表示フレームの注視点位置と最新の追尾処理用画像の追尾位置を比較して、位置のずれを求める。ＣＰＵ３は、位置のずれが閾値以内であれば、Ｓ３０２へ進み、そうでなければＳ３０３へ進む。用いられる閾値は、あらかじめ定められた値である。 In S301, the CPU 3 compares the gazing point position of the display frame specified by the line-of-sight detection frame specifying process with the tracking position of the latest tracking image, and obtains the position deviation. If the positional deviation is within the threshold value, the CPU 3 proceeds to S302, and if not, proceeds to S303. The threshold used is a predetermined value.

Ｓ３０２では、ＣＰＵ３は、最新の追尾位置の信頼度が高いと判断して、追尾位置を維持して処理を終える。つまり、Ｓ３０１において位置のずれが閾値以内で判定された場合は、追尾位置の補正は行われない。 In S302, the CPU 3 determines that the reliability of the latest tracking position is high, maintains the tracking position, and finishes the process. That is, if the position deviation is determined within the threshold value in S301, the tracking position is not corrected.

In S303, the CPU 3 obtains the moving speed of the tracked subject. The moving speed can be calculated from the difference in tracking position between the line-of-sight detection frame and a temporally earlier frame (that is, the amount of subject movement between frames) and the image frame rate. The moving speed of the tracked subject may also be obtained by other known methods.
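The speed calculation in S303 can be written out directly. This is a sketch under assumptions: positions are pixel coordinates of the tracking position, speed is expressed in pixels per second, and the function name and the `frames_apart` parameter are hypothetical.

```python
import math


def subject_speed(pos_earlier, pos_later, frame_rate_hz, frames_apart=1):
    """Estimate subject speed in pixels per second from the tracking
    positions in two frames that are `frames_apart` frames apart."""
    if frame_rate_hz <= 0 or frames_apart <= 0:
        raise ValueError("frame rate and frame distance must be positive")
    distance = math.hypot(pos_later[0] - pos_earlier[0],
                          pos_later[1] - pos_earlier[1])
    # Moving `distance` pixels in frames_apart / frame_rate_hz seconds.
    return distance * frame_rate_hz / frames_apart


speed = subject_speed((0, 0), (3, 4), frame_rate_hz=60)
```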

Ｓ３０４では、ＣＰＵ３は、被写体速度に基づいてずれ量の閾値を決定（更新）する。ＣＰＵ３は、具体的には、被写体速度が速いほど大きな閾値を採用する。速度と閾値の関係は線形であっても非線形であってもよい。 In S304, the CPU 3 determines (updates) the threshold value of the deviation amount based on the subject speed. Specifically, the CPU 3 adopts a larger threshold value as the subject speed is faster. The relationship between velocity and threshold may be linear or non-linear.

In S305, the CPU 3 determines whether the deviation between the gazing point position and the tracking position is within the threshold value determined in S304. If the deviation is within the threshold, the process proceeds to S302; otherwise, it proceeds to S306.
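The two-stage decision of S301 to S305 can be sketched as below. The text only states that a faster subject gets a larger threshold and that the relationship may be linear or non-linear; the linear form `base_threshold + gain * speed`, the parameter names, and the function name are assumptions chosen for illustration.

```python
import math


def needs_tracking_correction(gaze_xy, tracking_xy, speed, base_threshold, gain):
    """Return True if the tracking position should be re-searched (S306),
    False if the current tracking position is kept (S302)."""
    deviation = math.hypot(gaze_xy[0] - tracking_xy[0],
                           gaze_xy[1] - tracking_xy[1])

    # S301: small deviation, so the tracking position is trusted as is.
    if deviation <= base_threshold:
        return False

    # S304: widen the threshold for fast subjects (linear relation assumed).
    speed_threshold = base_threshold + gain * speed

    # S305: correct only if the deviation exceeds the widened threshold.
    return deviation > speed_threshold


slow = needs_tracking_correction((0, 0), (30, 40), speed=100,
                                 base_threshold=20, gain=0.1)
fast = needs_tracking_correction((0, 0), (30, 40), speed=400,
                                 base_threshold=20, gain=0.1)
```

With these illustrative numbers the same 50-pixel deviation triggers a re-search for the slow subject but is tolerated for the fast one, since a fast subject is expected to have moved further between the gaze detection frame and the latest tracking image.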

In S306, the CPU 3 searches for the subject again using the gazing point image of the line-of-sight detection frame. The search range is the region where the subject is most likely to exist, as predicted from the subject speed. The CPU 3 evaluates the similarity between each evaluation frame within the search range and the gazing point image of the line-of-sight detection frame, and obtains the position of the evaluation frame with the highest similarity. SAD, for example, can be used for the similarity evaluation, but other similarity measures may be used. If the highest similarity is at or above the threshold, the CPU 3 sets that evaluation frame as the tracking frame. If the highest similarity is below the threshold, the CPU 3 treats the result as having no tracking frame (tracking lost). When tracking is lost, the CPU 3 may detect a subject (for example, a face) with the recognition circuit 208 and update the tracking frame using the result.
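The re-search in S306 differs from a first-match scan in that every candidate is evaluated and only the best one is kept. A minimal sketch, assuming SAD as the measure (so "highest similarity" means lowest SAD), frames as lists of rows of grayscale values, a search rectangle that lies inside the frame, and hypothetical names throughout:

```python
def crop(frame, x, y, w, h):
    """Cut a w-by-h patch whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]


def sad(a, b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def research_subject(frame, gaze_patch, search_rect, max_sad):
    """Return the top-left corner of the best-matching evaluation frame,
    or None when even the best match is too dissimilar (tracking lost)."""
    ph, pw = len(gaze_patch), len(gaze_patch[0])
    sx, sy, sw, sh = search_rect
    best_score, best_pos = None, None
    for y in range(sy, sy + sh - ph + 1):
        for x in range(sx, sx + sw - pw + 1):
            score = sad(crop(frame, x, y, pw, ph), gaze_patch)
            if best_score is None or score < best_score:
                best_score, best_pos = score, (x, y)
    if best_score is None or best_score > max_sad:
        return None  # caller may fall back to subject recognition
    return best_pos


frame = [[0] * 6 for _ in range(6)]
frame[1][1], frame[1][2] = 9, 8
frame[2][1], frame[2][2] = 7, 5   # slightly changed compared with the template
template = [[9, 8], [7, 6]]
new_tracking_pos = research_subject(frame, template, (0, 0, 6, 6), max_sad=3)
```

Returning `None` lets the caller decide how to recover, for example by running subject detection as the text suggests.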

As described above, according to the present embodiment, it is possible to accurately identify which display frame the user was looking at when a given line-of-sight detection result was obtained. Because the subject image at the gazing point position in the line-of-sight detection frame can thereby be identified, control can be applied to the object at which the user was actually gazing. For example, when applied to focusing control, the focus can be placed on the subject the user was actually gazing at. When applied to tracking correction, the subject the user was actually gazing at can be set as the tracking position.

以上、本発明を実施例に基づき具体的に説明したが、本発明は、上記実施例に限定されるものではなく、その要旨を逸脱しない範囲において種々の変更が可能であることは言うまでもない。 Although the present invention has been specifically described above based on the examples, it goes without saying that the present invention is not limited to the above examples and various modifications can be made without departing from the gist thereof.

例えば、上記の実施形態では、表示デバイス１０および眼球撮像素子１７がカメラのファインダ内に配置されているが、本発明はカメラ以外の任意の電子機器に適用可能である。例えば、表示デバイス１０はパーソナルコンピュータから出力されるモニタであり、眼球撮像素子１７はこのモニタに取り付けられたカメラであってもよい。表示デバイス１０はＶＲ（仮想現実）等を体感するために頭部に装着されるＨＭＤ（ヘッドマウントディスプレイ）であり、眼球撮像素子１７はＨＭＤに取り付けられたカメラであってもよい。また、表示デバイス１０はＡＲ（拡張現実）グラス等のメガネ型デバイスであり、眼球撮像素子１７はこのメガネ型デバイスに取り付けられたカメラであってよい。表示デバイス１０は、虚像投影方式であっても網膜投影方式のいずれでもよい。 For example, in the above embodiment, the display device 10 and the eyeball image sensor 17 are arranged in the viewfinder of the camera, but the present invention can be applied to any electronic device other than the camera. For example, the display device 10 may be a monitor output from a personal computer, and the eyeball image sensor 17 may be a camera attached to the monitor. The display device 10 is an HMD (head-mounted display) worn on the head to experience VR (virtual reality) or the like, and the eyeball image sensor 17 may be a camera attached to the HMD. Further, the display device 10 may be a glasses-type device such as an AR (augmented reality) glass, and the eyeball image sensor 17 may be a camera attached to the glasses-type device. The display device 10 may be either a virtual image projection method or a retinal projection method.

<Other Embodiments>
The present invention can also be realized by supplying a program that implements one or more functions of the above-described embodiment to a system or apparatus via a network or a storage medium, and having one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

1: Camera; 3: CPU; 10: Display device; 17: Eyeball image sensor; 201: Line-of-sight detection circuit

Claims

An electronic device comprising:
display means for displaying an image;
imaging means for capturing an image of an eye viewing a display image displayed on the display means;
detection means for detecting a line of sight using the eye image captured by the imaging means; and
identifying means for identifying a line-of-sight detection frame, which is the display frame corresponding to the line-of-sight detection result of the detection means,
wherein the identifying means identifies the line-of-sight detection frame based on drive mode information of the imaging means and drive mode information of the display means.

The drive mode information of the image pickup means includes synchronization timing, reading time of image data from the image pickup device, and exposure time.
The drive mode information of the display means includes a synchronization timing and a display scanning time.
The electronic device according to claim 1.

The electronic device according to claim 1 or 2, wherein the identifying means obtains adjustment information indicating a frame difference between the line-of-sight detection frame and the display frame displayed at the read start timing of the imaging means, using:
the difference between the display start timing of a frame image by the display means and the read start timing of the imaging means;
the time from when the imaging means starts reading until it reads the line in which the corneal reflection image appears;
the exposure time of the imaging means; and
the time from when the display means starts displaying the frame image until the line including the gazing point is displayed,
and identifies the line-of-sight detection frame using the adjustment information.

The electronic device according to claim 3, wherein the identifying means:
uses the adjustment information to adjust the number of frames corresponding to the sum of the time from when the display means starts displaying a frame image until the imaging means starts reading and the time from when the imaging means starts reading until the line-of-sight detection result is obtained; and
determines, as the line-of-sight detection frame, the display frame that precedes the display frame at the timing when the line-of-sight detection result is obtained by the adjusted number of frames.

The electronic device according to claim 1 or 2, wherein:
the drive mode information of the imaging means includes a synchronization timing and a frame rate;
the drive mode information of the display means includes a synchronization timing and a frame rate;
the imaging means and the display means are synchronized, and the frame rate of the imaging means is an integral multiple of the frame rate of the display means;
the detection means detects the line of sight using only the eye image whose reading is started at the same timing as the display start timing of the display means; and
the frame immediately before the display frame displayed on the display means at the timing when the imaging means starts reading is identified as the line-of-sight detection frame.

Further provided with a characteristic acquisition means for acquiring eye characteristics including at least delay information of the user's eye movements,
The identifying means also identifies the line-of-sight detection frame using the delay information.
The electronic device according to any one of claims 1 to 4.

Further provided with a gazing point correction means for determining a position in the current display frame corresponding to the gazing point of the line-of-sight detection frame as a correction gazing point.
The electronic device according to any one of claims 1 to 6.

The position in the current display frame corresponding to the gazing point of the line-of-sight detection frame is the position in the current display frame where the subject located at the gazing point of the line-of-sight detection frame exists.
The electronic device according to claim 7.

The electronic device according to claim 7 or 8, wherein the gazing point correction means:
sets an area around the gazing point in the current display frame as a search area;
moves an evaluation frame within the search area to obtain the similarity between the image in the evaluation frame of the current display frame and the frame image including the gazing point of the line-of-sight detection frame; and
determines, as the corrected gazing point, the position of an evaluation frame whose similarity to the frame image at the gazing point of the line-of-sight detection frame is equal to or higher than a predetermined threshold value.

The gazing point correction means determines the position of the evaluation frame having the highest degree of similarity as the corrected gazing point.
The electronic device according to claim 9.

The search area is set according to the moving speed of the subject located at the gazing point.
The electronic device according to claim 9 or 10.

The display means displays a display frame for representing the position of the user's gaze at the position of the correction gaze point of the display frame.
The electronic device according to any one of claims 7 to 11.

The electronic device according to claim 12, wherein the display frame has a size based on the ratio of the size of the object detected at the position of the gazing point of the line-of-sight detection frame to the size of the object detected at the position of the corrected gazing point of the display frame.

The electronic device according to any one of claims 7 to 13, further comprising:
second imaging means; and
focus adjusting means for controlling the focus of the second imaging means,
wherein the display means displays an image captured by the second imaging means, and
the focus adjusting means controls the focus so as to focus on the corrected gazing point.

The electronic device according to claim 14, wherein:
when the similarity between the image at the gazing point position of the line-of-sight detection frame and the image at the gazing point position of the current display frame is lower than a predetermined threshold value, the focus adjusting means controls the focus so as to focus on the corrected gazing point; and
when the similarity between the image at the gazing point position of the line-of-sight detection frame and the image at the gazing point position of the current display frame is higher than the predetermined threshold value, the focus adjusting means controls the focus so as to focus on the gazing point.

The electronic device according to any one of claims 7 to 15, further comprising tracking means for tracking a subject,
wherein, if the deviation between the tracking position and the position of the gazing point detected by the detection means is equal to or greater than a predetermined threshold value, the tracking means corrects the tracking position to the position of the corrected gazing point.

A method for controlling an electronic device, the method comprising:
a display step of displaying an image on display means;
an imaging step of capturing, with imaging means, an image of an eye viewing a display image displayed on the display means;
a detection step of detecting a line of sight using the eye image captured in the imaging step; and
an identifying step of identifying a line-of-sight detection frame, which is the display frame corresponding to the line-of-sight detection result of the detection step,
wherein, in the identifying step, the line-of-sight detection frame is identified based on drive mode information of the imaging means and drive mode information of the display means.

A program for causing a computer to function as each means of the electronic device according to any one of claims 1 to 16.

A computer-readable storage medium containing a program for causing the computer to function as each means of the electronic device according to any one of claims 1 to 16.