JP2023065313A

JP2023065313A - Imaging apparatus, image processing device, and method

Info

Publication number: JP2023065313A
Application number: JP2022165023A
Authority: JP
Inventors: 友美高尾; Tomomi Takao; 彰宏西尾; Teruhiro Nishio
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-10-27
Filing date: 2022-10-13
Publication date: 2023-05-12

Abstract

PROBLEM TO BE SOLVED: To provide an imaging apparatus and a method that generate image data for display that helps a user quickly gaze at an intended location or subject. SOLUTION: An imaging apparatus is capable of detecting a user's gaze position in an image being displayed. When detection of the gaze position is enabled, the imaging apparatus applies, in generating the image data to be displayed, processing that visually emphasizes a feature area over other areas. The feature area is an area of a subject of a type determined on the basis of a setting of the imaging apparatus. SELECTED DRAWING: Figure 6

Description

The present invention relates to an imaging device, an image processing device, and a method.

Patent Document 1 discloses an imaging device that detects a user's gaze position in a displayed image and enlarges and displays an area including the gaze position.

Patent Document 1: JP 2004-215062 A

The technique described in Patent Document 1 makes it easier for the user to confirm whether he or she is gazing at the intended position in the displayed image. However, it cannot shorten the time the user needs to bring the line of sight onto the intended position (subject) in the displayed image.

The present invention has been made in view of such problems of the prior art. In one aspect, the present invention provides an imaging apparatus and a method that generate image data for display to help a user quickly gaze at an intended position or subject.

The above object is achieved by an imaging device comprising: detection means capable of detecting a user's gaze position in an image displayed by the imaging device; and generation means for generating image data for display, wherein, for image data generated while the detection means is enabled, the generation means applies processing that visually emphasizes a feature area over other areas, and the feature area is an area of a subject of a type determined based on a setting of the imaging device.
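As an illustration only, the behavior of the generation means might be sketched as follows. The function name, the grayscale list-of-lists image, the box format, and the choice of dimming the other areas as the form of emphasis are all assumptions made for this sketch and are not taken from the embodiments.

```python
def emphasize_feature_areas(image, feature_boxes, gaze_detection_enabled,
                            dim_factor=0.5):
    """Return display image data in which feature areas stand out.

    image: 2D list of grayscale values (0..255).
    feature_boxes: list of (top, left, height, width) tuples (hypothetical format).
    Dimming everything outside the boxes is an assumed form of emphasis.
    """
    # When gaze detection is not enabled, the image is returned unprocessed.
    if not gaze_detection_enabled:
        return [row[:] for row in image]

    out = []
    for y, row in enumerate(image):
        new_row = []
        for x, value in enumerate(row):
            inside = any(t <= y < t + h and l <= x < l + w
                         for (t, l, h, w) in feature_boxes)
            # Keep feature-area pixels, darken all other pixels.
            new_row.append(value if inside else int(value * dim_factor))
        out.append(new_row)
    return out
```

Other forms of emphasis (for example, changing color or sharpness) would fit the same structure by replacing the per-pixel operation.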

According to one aspect of the present invention, it is possible to provide an imaging apparatus and a method that generate image data for display to help a user quickly gaze at an intended position or subject.

Block diagram showing the configuration of an imaging device according to an embodiment of the present invention
Diagram showing the correspondence between the pupil plane and the photoelectric conversion units of a pixel of the imaging device according to the embodiment of the present invention
Diagram showing the configuration of the line-of-sight input unit according to the embodiment of the present invention
Flowchart of the first embodiment of the present invention
Shooting mode setting method according to the embodiment of the present invention
Image processing example 1 of the first embodiment of the present invention
Image processing example 2 of the first embodiment of the present invention
Imaging device configuration of the second embodiment of the present invention
Flowchart of the second embodiment of the present invention
Image processing example 3 of the second embodiment of the present invention
Image processing example 4 of the second embodiment of the present invention
Diagram showing an example of a calibration screen presented by the imaging device according to the third embodiment
Diagram showing an example of a scene for which processing according to visual characteristics is appropriate, and an example of the processing
Diagram showing an example of a scene for which processing according to visual characteristics is appropriate, and an example of the processing
Diagram showing an example of a scene for which processing according to visual characteristics is appropriate, and an example of the processing
Diagram showing an example of a scene for which processing according to visual characteristics is appropriate, and an example of the processing
Flowchart relating to the display image data generation operation in the third embodiment
Diagram showing an example of a virtual space presented in the fourth embodiment
Diagram showing an example of highlighting in the fourth embodiment
Diagram showing an example of the relationship between the type of virtual space and the types of subject that can be highlighted in the fourth embodiment
Diagram showing an example of a main subject type selection GUI in the fourth embodiment
Diagram showing an example in which the metadata is provided as image data in the fourth embodiment
Diagram showing an example of indicators displayed together with a virtual space image in the fourth embodiment
Diagram showing an example of a display system according to the fifth embodiment and its operation
Block diagram showing a functional configuration example of a computer device that can be used as a server in the fifth embodiment
Diagram showing an example of a display area in the fifth embodiment
Diagram showing another example of the display system according to the fifth embodiment and its operation
Diagram showing a configuration example of a camera used in the fifth embodiment

The present invention will now be described in detail based on exemplary embodiments with reference to the accompanying drawings. The following embodiments do not limit the invention according to the claims. Although a plurality of features are described in the embodiments, not all of them are essential to the invention, and the features may be combined arbitrarily. Furthermore, in the accompanying drawings, the same or similar configurations are denoted by the same reference numerals, and redundant description is omitted.

The following describes a case in which the present invention is implemented in an imaging device such as a digital camera. However, the present invention can be implemented in any electronic device capable of detecting a gaze position on a display screen. In addition to imaging devices, such electronic devices include computer devices (personal computers, tablet computers, media players, PDAs, etc.), mobile phones, smartphones, game machines, robots, vehicle-mounted devices, and the like. These are examples, and the present invention can also be implemented in other electronic devices.

● (First Embodiment)
[Description of the configuration of the imaging device]
FIG. 1 is a block diagram showing a functional configuration example of an imaging device 1 as an example of an image processing device according to an embodiment. The imaging device 1 has a main body 100 and a lens unit 150. Here the lens unit 150 is an interchangeable lens unit detachable from the main body 100, but it may instead be a lens unit integrated with the main body 100.

The lens unit 150 and the main body 100 are mechanically and electrically connected via a lens mount. Communication terminals 6 and 10 provided on the lens mount are contacts that electrically connect the lens unit 150 and the main body 100. The lens unit control circuit 4 and the system control circuit 50 can communicate through the communication terminals 6 and 10. The power required to operate the lens unit 150 is also supplied from the main body 100 to the lens unit 150 through the communication terminals 6 and 10.

The lens unit 150 constitutes an imaging optical system that forms an optical image of a subject on the imaging surface of the imaging unit 22. The lens unit 150 has an aperture 102 and a plurality of lenses 103 including a focus lens. The aperture 102 is driven by an aperture drive circuit 2, and the focus lens is driven by an AF drive circuit 3. The operations of the aperture drive circuit 2 and the AF drive circuit 3 are controlled by the lens system control circuit 4 in accordance with instructions from the system control circuit 50.

A focal plane shutter 101 (hereinafter simply referred to as the shutter 101) is driven under the control of the system control circuit 50. When a still image is shot, the system control circuit 50 controls the operation of the shutter 101 so that the imaging unit 22 is exposed in accordance with the shooting conditions.

The imaging unit 22 is an image sensor having a plurality of pixels arranged two-dimensionally. The imaging unit 22 converts the optical image formed on the imaging surface into a group of pixel signals (an analog image signal) by means of the photoelectric conversion unit of each pixel. The imaging unit 22 may be, for example, a CCD image sensor or a CMOS image sensor.

The imaging unit 22 of the present embodiment can generate a pair of image signals used for phase-difference automatic focus detection (hereinafter referred to as phase-difference AF). FIG. 2 shows the correspondence between the pupil plane of the lens unit 150 and the photoelectric conversion units of the pixels of the imaging unit 22. FIG. 2(a) shows an example of a configuration in which a pixel has a plurality of (here, two) photoelectric conversion units 201a and 201b, and FIG. 2(b) shows an example of a configuration in which a pixel has a single photoelectric conversion unit 201.

Each pixel is provided with one microlens 251 and one color filter 252. The color of the color filter 252 differs from pixel to pixel, and the colors are arranged in a predetermined pattern. Here, as an example, the color filters 252 are assumed to be arranged in a primary color Bayer pattern. In this case, the color of the color filter 252 of each pixel is red (R), green (G), or blue (B).
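For reference, the color assigned to a pixel in a primary color Bayer pattern can be looked up from the pixel coordinates as in the minimal sketch below. The function name and the RGGB starting phase are assumptions; the description does not state which color occupies the top-left pixel.

```python
def bayer_color(row, col, pattern=("R", "G", "G", "B")):
    """Return the color filter color at (row, col) of a Bayer array.

    pattern lists the 2x2 repeating cell in row-major order.
    The default RGGB phase is an assumption for illustration.
    """
    return pattern[(row % 2) * 2 + (col % 2)]
```

In any 2x2 cell of such a pattern, two pixels are green, one is red, and one is blue.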

In the configuration of FIG. 2(a), light entering the pixel from region 253a of the pupil plane 253 is incident on the photoelectric conversion unit 201a, and light entering the pixel from region 253b is incident on the photoelectric conversion unit 201b. Phase-difference AF can be performed by using, for a plurality of pixels, the group of signals obtained from the photoelectric conversion units 201a and the group of signals obtained from the photoelectric conversion units 201b as a pair of image signals.

When the signals obtained by the photoelectric conversion units 201a and 201b are handled individually, each signal functions as a focus detection signal. When the signals obtained by the photoelectric conversion units 201a and 201b of the same pixel are handled together (added), the summed signal functions as a pixel signal. A pixel having the configuration of FIG. 2(a) therefore functions both as a focus detection pixel and as an imaging pixel. All pixels of the imaging unit 22 are assumed to have the configuration shown in FIG. 2(a).
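The two uses of the signals from the photoelectric conversion units 201a and 201b can be sketched as follows. This is a simplified illustration, not the method of the embodiment: the function names are hypothetical, the signals are one-dimensional lists, and the shift between the pair of image signals is found by minimizing the mean absolute difference, which is one common correlation measure. Converting the shift into a defocus amount is outside this sketch.

```python
def sum_signals(a_signal, b_signal):
    """Add the 201a and 201b signals of each pixel to obtain pixel signals."""
    return [a + b for a, b in zip(a_signal, b_signal)]


def estimate_shift(a_signal, b_signal, max_shift):
    """Estimate the relative shift (in pixels) between the pair of image signals.

    Returns the shift s that minimizes the mean absolute difference
    between a_signal[i] and b_signal[i + s] over the overlapping samples.
    """
    n = len(a_signal)
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        total, count = 0, 0
        for i in range(n):
            j = i + s
            if 0 <= j < n:
                total += abs(a_signal[i] - b_signal[j])
                count += 1
        if count == 0:
            continue  # no overlap at this shift
        cost = total / count
        if cost < best_cost:
            best_cost, best_shift = cost, s
    return best_shift
```

A shift of zero corresponds to the two image signals coinciding; the sign and magnitude of a nonzero shift indicate the direction and amount by which they are displaced from each other.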

FIG. 2(b), on the other hand, shows a configuration example of a dedicated focus detection pixel. In the pixel shown in FIG. 2(b), a light-shielding mask 254 that restricts the light incident on the photoelectric conversion unit 201 is provided between the color filter 252 and the photoelectric conversion unit 201. Here, the light-shielding mask 254 has an opening such that only light from region 253b of the pupil plane 253 is incident on the photoelectric conversion unit 201. The pixel is thus substantially equivalent to a pixel having only the photoelectric conversion unit 201b of FIG. 2(a). Similarly, by configuring the opening of the light-shielding mask 254 so that only light from region 253a of the pupil plane 253 is incident on the photoelectric conversion unit 201, the pixel can be made substantially equivalent to a pixel having only the photoelectric conversion unit 201a of FIG. 2(a). Signal pairs for phase-difference AF can also be generated by arranging a plurality of pairs of these two types of pixels in the imaging unit 22.

Contrast-detection automatic focus detection (hereinafter referred to as contrast AF) may be performed instead of, or in combination with, phase-difference AF. When only contrast AF is performed, the pixels can have the configuration of FIG. 2(b) with the light-shielding mask 254 omitted.
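Contrast AF in general searches for the focus lens position at which an image contrast measure is highest. A minimal sketch of that idea is shown below; the contrast measure (sum of squared differences between adjacent pixels), the function names, and the dictionary of images keyed by lens position are assumptions for illustration and are not specified in this description.

```python
def contrast_score(image):
    """Sum of squared differences between adjacent pixels of a 2D list.

    A sharper image has stronger local differences and a higher score.
    """
    height, width = len(image), len(image[0])
    score = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                score += (image[y][x + 1] - image[y][x]) ** 2
            if y + 1 < height:
                score += (image[y + 1][x] - image[y][x]) ** 2
    return score


def best_focus_position(images_by_position):
    """Return the lens position whose image has the highest contrast score.

    images_by_position: dict mapping a lens position to the image captured there.
    """
    return max(images_by_position,
               key=lambda pos: contrast_score(images_by_position[pos]))
```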

The A/D converter 23 converts the analog image signal output from the imaging unit 22 into a digital image signal. If the imaging unit 22 can output a digital image signal, the A/D converter 23 can be omitted.

The image processing unit 24 applies predetermined image processing to the digital image signal from the A/D converter 23 or the memory control unit 15, generates signals and image data according to the application, and acquires and/or generates various kinds of information. The image processing unit 24 may be, for example, a dedicated hardware circuit such as an ASIC designed to implement a specific function, or may be configured so that a programmable processor such as a DSP implements a specific function by executing software.

The image processing applied by the image processing unit 24 includes preprocessing, color interpolation processing, correction processing, detection processing, data processing, evaluation value calculation processing, special effect processing, and the like. Preprocessing includes signal amplification, reference level adjustment, defective pixel correction, and the like. Color interpolation processing interpolates the values of color components that are not obtained at the time of shooting, and is also called demosaicing or synchronization processing. Correction processing includes white balance adjustment, gradation correction (gamma processing), processing for correcting the effects of optical aberration and peripheral light falloff of the lenses 103, color correction, and the like. Detection processing includes detection of feature areas (for example, face areas and human body areas) and their movement, person recognition, and the like. Data processing includes compositing, scaling, encoding and decoding, header information generation, and the like. Evaluation value calculation processing includes generation of signals and evaluation values used for automatic focus detection (AF), calculation of evaluation values used for automatic exposure control (AE), and the like. Special effect processing includes addition of blur, change of color tone, relighting, the processing described later that is applied when gaze position detection is enabled, and the like. These are examples of image processing that the image processing unit 24 can apply, and do not limit the image processing applied by the image processing unit 24.

A specific example of the feature area detection processing will be described. The image processing unit 24 applies horizontal and vertical band-pass filters to the image data subject to detection (for example, live view image data) and extracts edge components. The image processing unit 24 then applies, to the edge components, matching processing using templates prepared in advance according to the type of feature area to be detected, and detects image areas similar to the templates. For example, when detecting a human face area as a feature area, the image processing unit 24 applies the matching processing using templates of facial parts (for example, eyes, nose, mouth, and ears).
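The two steps described above, edge extraction followed by template matching, can be sketched as follows. This is a simplified stand-in, not the filters or matching criterion of the embodiment: a horizontal and vertical difference filter replaces the band-pass filters, similarity is measured by the sum of absolute differences (SAD), and the function names are hypothetical.

```python
def edge_map(image):
    """Extract edge components with horizontal and vertical difference filters.

    image: 2D list of grayscale values. Border pixels are left at 0.
    The [-1, 0, 1] difference filter is an assumed stand-in for a band-pass filter.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal response
            gy = image[y + 1][x] - image[y - 1][x]  # vertical response
            edges[y][x] = abs(gx) + abs(gy)
    return edges


def match_template(edges, template):
    """Find the position where the template best matches the edge map.

    Returns (top, left, sad) for the window with the smallest SAD.
    """
    height, width = len(edges), len(edges[0])
    t_height, t_width = len(template), len(template[0])
    best = None
    for top in range(height - t_height + 1):
        for left in range(width - t_width + 1):
            sad = sum(abs(edges[top + dy][left + dx] - template[dy][dx])
                      for dy in range(t_height) for dx in range(t_width))
            if best is None or sad < best[2]:
                best = (top, left, sad)
    return best
```

In practice one template per facial part would be matched, and positions whose SAD falls below a threshold would be kept as candidates rather than only the single best position.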

The matching processing detects groups of candidate areas for the eyes, nose, mouth, and ears. The image processing unit 24 narrows the eye candidates down to those that, together with another eye candidate, satisfy preset conditions (for example, the distance between the two eyes and their tilt). The image processing unit 24 then associates other parts (nose, mouth, ears) that satisfy a positional relationship with the narrowed-down eye candidates. Furthermore, the image processing unit 24 detects face areas by applying a preset non-face condition filter to exclude combinations of parts that do not correspond to a face. The image processing unit 24 outputs the total number of detected face areas and information on each face area (position, size, detection reliability, etc.) to the system control circuit 50. The system control circuit 50 stores the feature area information obtained from the image processing unit 24 in the system memory 52.
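The narrowing of eye candidates by distance and tilt can be illustrated with the sketch below. The function name, the representation of candidates as center coordinates, and the threshold parameters are hypothetical; the description only states that preset conditions such as distance and tilt are used.

```python
import math


def pair_eye_candidates(eyes, min_dist, max_dist, max_tilt_deg):
    """Return index pairs of eye candidates that plausibly form a pair of eyes.

    eyes: list of (x, y) candidate centers in pixels.
    A pair is kept when the distance between the two centers lies in
    [min_dist, max_dist] and the line joining them is tilted from the
    horizontal by at most max_tilt_deg degrees.
    """
    pairs = []
    for i in range(len(eyes)):
        for j in range(i + 1, len(eyes)):
            dx = eyes[j][0] - eyes[i][0]
            dy = eyes[j][1] - eyes[i][1]
            dist = math.hypot(dx, dy)
            tilt = abs(math.degrees(math.atan2(dy, dx)))
            tilt = min(tilt, 180 - tilt)  # independent of candidate order
            if min_dist <= dist <= max_dist and tilt <= max_tilt_deg:
                pairs.append((i, j))
    return pairs
```

The remaining steps (associating nose, mouth, and ears and applying the non-face condition filter) would operate on the pairs returned here.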

The method of detecting a human face area described here is an example, and any other known method, such as a method using machine learning, can be used. Detection is not limited to human faces; other types of feature areas such as human torsos, limbs, animal faces, landmarks, characters, automobiles, airplanes, and railway vehicles may also be detected.

A detected feature area can be used, for example, to set a focus detection area. For example, a main face area can be determined from among the detected face areas, and the focus detection area can be set in the main face area. AF can thereby be performed so as to focus on a face area present within the shooting range. The main face area may be selected by the user.

Output data from the A/D converter 23 is stored in the memory 32 via the image processing unit 24 and the memory control unit 15, or via the memory control unit 15 alone. The memory 32 is used as a buffer memory for still image data and moving image data, a working memory for the image processing unit 24, a video memory for the display unit 28, and the like.

The D/A converter 19 converts the display image data stored in the video memory area of the memory 32 into an analog signal and supplies it to the display unit 28. The display unit 28 performs display on a display device such as a liquid crystal display in accordance with the analog signal from the D/A converter 19.

By continuously generating and displaying display image data while shooting a moving image, the display unit 28 can be made to function as an electronic viewfinder (EVF). An image displayed in order to make the display unit 28 function as an EVF is called a through image or a live view image. The display unit 28 may be arranged inside the main body 100 so as to be observed through the eyepiece, may be arranged on the housing surface (for example, the back) of the main body 100, or may be provided in both places.

In the present embodiment, the display unit 28 is assumed to be arranged at least inside the main body 100 in order to detect the user's gaze position.

The nonvolatile memory 56 is an electrically rewritable memory, for example an EEPROM. The nonvolatile memory 56 stores programs executable by the system control circuit 50, various setting values, GUI data, and the like.

The system control circuit 50 has one or more processors (also called CPUs, MPUs, etc.) capable of executing programs. The system control circuit 50 implements the functions of the imaging device 1 by loading a program recorded in the nonvolatile memory 56 into the system memory 52 and executing it on the processor.

The system memory 52 is used to hold programs executed by the system control circuit 50 and constants, variables, and the like used during execution of the programs.
The system timer 53 measures the time used for various kinds of control and the time of the built-in clock.

The power switch 72 is an operation member for switching the power of the imaging device 1 ON and OFF.
The mode changeover switch 60, the first shutter switch 62, the second shutter switch 64, and the operation unit 70 are operation members for inputting instructions to the system control circuit 50.

The mode changeover switch 60 switches the operation mode of the system control circuit 50 to one of a still image recording mode, a moving image shooting mode, a playback mode, and the like. Modes included in the still image recording mode include an auto shooting mode, an auto scene determination mode, a manual mode, an aperture priority mode (Av mode), and a shutter speed priority mode (Tv mode). There are also various scene modes that provide shooting settings for each shooting scene, a program AE mode, custom modes, and the like. The mode changeover switch 60 switches directly to any of these modes included in the menu button. Alternatively, after switching once to the menu button with the mode changeover switch 60, another operation member may be used to switch to any of these modes included in the menu button. Similarly, the moving image shooting mode may also include a plurality of modes.

The first shutter switch 62 is turned ON when the shutter button 61 is pressed halfway, and generates a first shutter switch signal SW1. The system control circuit 50 recognizes the first shutter switch signal SW1 as a still image shooting preparation instruction and starts a shooting preparation operation. The shooting preparation operation includes, for example, AF processing, automatic exposure control (AE) processing, auto white balance (AWB) processing, and EF (flash pre-emission) processing, but these are not essential, and other processing may be included.

The second shutter switch 64 is turned ON when the shutter button 61 is fully pressed, and generates a second shutter switch signal SW2. The system control circuit 50 recognizes the second shutter switch signal SW2 as a still image shooting instruction and executes shooting processing and recording processing.
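The two-stage behavior of the shutter button 61 can be sketched as a small state holder that generates SW1 and SW2 when the corresponding switch turns ON. The class, the numeric press levels, and the assumption that the first shutter switch stays ON while the button is fully pressed are hypothetical choices for this illustration.

```python
class ShutterButton:
    """Tracks the first and second shutter switches of a two-stage button."""

    def __init__(self):
        self.sw1_on = False  # first shutter switch 62 (half press)
        self.sw2_on = False  # second shutter switch 64 (full press)

    def update(self, press_level):
        """Update the switch states and return newly generated signals.

        press_level: 0 = released, 1 = half press, 2 = full press.
        Assumes a full press also keeps the first switch ON.
        """
        signals = []
        sw1_now = press_level >= 1
        sw2_now = press_level >= 2
        if sw1_now and not self.sw1_on:
            signals.append("SW1")
        if sw2_now and not self.sw2_on:
            signals.append("SW2")
        self.sw1_on, self.sw2_on = sw1_now, sw2_now
        return signals


def dispatch(signal):
    """Map a shutter switch signal to the operation it instructs."""
    return {
        "SW1": "start shooting preparation",        # e.g. AF, AE, AWB, EF
        "SW2": "execute shooting and recording",
    }[signal]
```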

The operation unit 70 is a collective term for operation members other than the shutter button 61, the mode changeover switch 60, and the power switch 72. The operation unit 70 includes, for example, direction keys, a set (execute) button, a menu button, and a moving image shooting button. If the display unit 28 is a touch display, software keys implemented by display and touch operation also constitute the operation unit 70. When the menu button is operated, the system control circuit 50 causes the display unit 28 to display a menu screen that can be operated using the direction keys and the set button. The user can change the settings of the imaging device 1 by operating the software keys and the menu screen.

FIG. 3(a) is a side view schematically showing a configuration example of the line-of-sight input unit 701. The line-of-sight input unit 701 is a unit that acquires an image (an image for line-of-sight detection) used to detect the rotation angle of the optical axis of the eyeball 501a of a user looking, through the eyepiece, at the display unit 28 provided inside the main body 100.

The image for line-of-sight detection is processed by the image processing unit 24 to detect the rotation angle of the optical axis of the eyeball 501a. Since the rotation angle represents the direction of the line of sight, the gaze position on the display unit 28 can be estimated based on the rotation angle and a preset distance from the eyeball 501a to the display unit 28. In estimating the gaze position, user-specific information acquired by a calibration operation performed in advance may be taken into account. The gaze position may be estimated by either the image processing unit 24 or the system control circuit 50. The line-of-sight input unit 701 and the image processing unit 24 (or the system control circuit 50) constitute detection means capable of detecting the user's gaze position in the image that the imaging device 1 displays on the display unit 28.
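The estimation of a gaze position from a rotation angle and a distance can be illustrated with simple trigonometry. The sketch below assumes a flat display perpendicular to the optical axis at zero rotation, and models the user-specific calibration information as an angular offset; both the geometric model and all names and parameters are assumptions, not the method of the embodiment.

```python
import math


def gaze_position(theta_x_deg, theta_y_deg, eye_to_display,
                  center=(0.0, 0.0), offset_deg=(0.0, 0.0)):
    """Estimate the gaze position on the display from eyeball rotation angles.

    theta_x_deg, theta_y_deg: horizontal and vertical rotation angles in degrees.
    eye_to_display: preset distance from the eyeball to the display
                    (the result is in the same unit).
    center: display position gazed at when both angles are zero.
    offset_deg: per-user angular correction from calibration (assumed form).
    """
    ax = math.radians(theta_x_deg - offset_deg[0])
    ay = math.radians(theta_y_deg - offset_deg[1])
    x = center[0] + eye_to_display * math.tan(ax)
    y = center[1] + eye_to_display * math.tan(ay)
    return (x, y)
```

For small angles the displacement on the display is approximately proportional to the rotation angle, which is why a fixed preset distance is sufficient for this estimate.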

The image displayed on the display unit 28 is viewed by the user through the eyepiece lens 701d and the dichroic mirror 701c. The illumination light source 701e emits infrared light toward the outside of the housing through the eyepiece. The infrared light reflected by the eyeball 501a is incident on the dichroic mirror 701c. The dichroic mirror 701c reflects the incident infrared light upward. A light receiving lens 701b and an image sensor 701a are arranged above the dichroic mirror 701c. The image sensor 701a captures the infrared image formed by the light receiving lens 701b. The image sensor 701a may be a monochrome image sensor.

撮像素子７０１ａは、撮影により得られたアナログ画像信号を、Ａ／Ｄ変換器２３に出力する。Ａ／Ｄ変換器２３は、得られたデジタル画像信号を画像処理部２４に出力する。画像処理部２４は、画像データから眼球像を検出し、さらに、眼球像内で瞳孔領域を検出する。画像処理部２４は、眼球像における瞳孔領域の位置から、眼球の回転角（視線方向）を算出する。眼球像を含んだ画像から視線方向を検出する処理は、公知の方法によって実施することができる。 The imaging element 701 a outputs an analog image signal obtained by photographing to the A/D converter 23 . The A/D converter 23 outputs the obtained digital image signal to the image processing section 24 . The image processing unit 24 detects an eyeball image from the image data, and further detects a pupil region within the eyeball image. The image processing unit 24 calculates the eyeball rotation angle (line-of-sight direction) from the position of the pupil region in the eyeball image. A known method can be used to detect the line-of-sight direction from an image containing an eyeball image.

図３（ｂ）は、表示部２８が撮像装置１の背面に設けられている場合の視線入力部７０１の構成例を模式的に示す側面図である。この場合も、表示部２８を観察しているユーザの顔５００が存在するであろう方向に赤外光を照射する。そして、撮像装置１の背面に設けられたカメラ７０１ｆで撮影することにより、ユーザの顔５００の赤外像を取得し、眼球５０１ａおよび／または５０１ｂの像から瞳孔領域を検出することにより、視線方向を検出する。 FIG. 3B is a side view schematically showing a configuration example of the line-of-sight input unit 701 when the display unit 28 is provided on the back surface of the imaging device 1. In this case as well, infrared light is emitted in the direction in which the face 500 of the user observing the display unit 28 is expected to be. An infrared image of the user's face 500 is then captured by a camera 701f provided on the back surface of the imaging device 1, and the line-of-sight direction is detected by detecting the pupil region from the image of the eyeball 501a and/or 501b.

なお、表示部２８上の注視位置を最終的に検出可能であれば、視線入力部７０１の構成および画像処理部２４（またはシステム制御回路５０）の処理に特に制限はなく、他の任意の構成および処理を採用しうる。 Note that, as long as the gaze position on the display unit 28 can ultimately be detected, the configuration of the line-of-sight input unit 701 and the processing of the image processing unit 24 (or the system control circuit 50) are not particularly limited, and any other configuration and processing may be employed.

図１に戻り、電源制御部８０は、電池検出回路、ＤＣ－ＤＣコンバータ、通電するブロックを切り替えるスイッチ回路等により構成される。電源制御部８０は、電源部３０が電池の場合、装着の有無、種類、残量を検出する。また、電源制御部８０は、これらの検出結果およびシステム制御回路５０の指示に基づいてＤＣ－ＤＣコンバータを制御し、必要な電圧を必要な期間、記録媒体２００を含む各部へ供給する。 Returning to FIG. 1, the power supply control unit 80 includes a battery detection circuit, a DC-DC converter, a switch circuit for switching blocks to be energized, and the like. If the power supply unit 30 is a battery, the power supply control unit 80 detects the presence/absence of attachment, the type, and the remaining amount. Also, the power supply control unit 80 controls the DC-DC converter based on these detection results and instructions from the system control circuit 50, and supplies necessary voltage to each unit including the recording medium 200 for a necessary period.

電源部３０は、アルカリ電池やリチウム電池などの一次電池、ＮｉＣｄ電池、ＮｉＭＨ電池、Ｌｉ電池などの二次電池、および／またはＡＣアダプターなどの１つ以上を有しうる。 The power supply unit 30 may include one or more of a primary battery such as an alkaline or lithium battery, a secondary battery such as a NiCd, NiMH, or Li battery, and/or an AC adapter.

記録媒体Ｉ／Ｆ１８は、メモリカードやハードディスク等の記録媒体２００とのインターフェースである。記録媒体２００は着脱できても、できなくてもよい。記録媒体２００は撮影によって得られた画像データの記録先である。 A recording medium I/F 18 is an interface with a recording medium 200 such as a memory card or hard disk. The recording medium 200 may or may not be detachable. A recording medium 200 is a recording destination of image data obtained by photographing.

通信部５４は、無線または有線によって接続された外部装置と、画像信号や音声信号を送受信する。通信部５４は無線ＬＡＮ(Local Area Network)、ＵＳＢ(Universal Serial Bus)など１つ以上の通信規格をサポートする。システム制御回路５０は、通信部５４を通じて、撮像部２２による撮影で得られた画像データ（スルー画像を含む）や、記録媒体２００に記録された画像データを外部装置に送信することができる。また、システム制御回路５０は、通信部５４を通じて外部機器から画像データやその他の各種情報を受信することができる。 The communication unit 54 transmits and receives image signals and audio signals to and from an external device connected wirelessly or by wire. The communication unit 54 supports one or more communication standards such as wireless LAN (Local Area Network) and USB (Universal Serial Bus). The system control circuit 50 can transmit image data (including through images) captured by the imaging unit 22 and image data recorded on the recording medium 200 to an external device through the communication unit 54 . The system control circuit 50 can also receive image data and other various information from an external device through the communication unit 54 .

姿勢検知部５５は重力方向に対する撮像装置１の姿勢を検知する。姿勢検知部５５で検知された姿勢に基づいて、撮影時に撮像装置１を横向きであったか縦向きであったかを判別できる。システム制御回路５０は、撮影時の撮像装置１の姿勢を画像データファイルへ付加したり、画像の向きを揃えてから記録したりすることができる。姿勢検知部５５としては、加速度センサやジャイロセンサなどを用いることができる。 The orientation detection unit 55 detects the orientation of the imaging device 1 with respect to the direction of gravity. Based on the posture detected by the posture detection unit 55, it is possible to determine whether the imaging device 1 was oriented horizontally or vertically at the time of photographing. The system control circuit 50 can add the attitude of the imaging device 1 at the time of photographing to the image data file, or can record the image after aligning the orientation of the image. An acceleration sensor, a gyro sensor, or the like can be used as the posture detection unit 55 .
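
The landscape/portrait decision described above can be sketched in Python as follows, assuming an acceleration sensor that reports the gravity components along the camera body's horizontal and vertical axes. The axis convention and the function name `classify_orientation` are hypothetical; the description only states that an acceleration sensor or gyro sensor can be used.

```python
def classify_orientation(gravity_x, gravity_y):
    """Classify the camera orientation from two gravity components.

    gravity_x: gravity along the body's horizontal axis (assumed convention).
    gravity_y: gravity along the body's vertical axis (assumed convention).
    """
    # When the camera is held level, gravity lies mostly along the body's
    # vertical axis; when it is rotated by 90 degrees, along the horizontal.
    if abs(gravity_y) >= abs(gravity_x):
        return "landscape"
    return "portrait"
```

The returned label could then be attached to the image data file or used to align the image before recording.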

［注視位置検出動作］ [Gaze position detection operation]
図４は、撮像装置１の注視位置検出動作に関するフローチャートである。注視位置検出動作は、視線検出機能が有効に設定されている場合に実行される。また、注視位置検出動作は、ライブビュー表示動作と並行して実施することができる。 FIG. 4 is a flowchart of the gaze position detection operation of the imaging device 1. The gaze position detection operation is executed when the line-of-sight detection function is enabled, and can be performed in parallel with the live view display operation.

Ｓ２で、システム制御回路５０は、現在設定されている撮影モードを取得する。撮影モードはモード切替スイッチ６０で設定可能である。なお、モード切替スイッチ６０でシーン選択モードが設定されている場合には、シーン選択モード内で設定されているシーンの種類も撮影モードとして取り扱う。 In S2, the system control circuit 50 acquires the currently set shooting mode. The shooting mode can be set with the mode changeover switch 60 . When the scene selection mode is set by the mode changeover switch 60, the type of scene set in the scene selection mode is also treated as the shooting mode.

図５は、撮像装置１の外観例を示す図である。図５（ａ）はモード切替スイッチ６０の配置例を示している。また、図５（ｂ）は、モード切替スイッチ６０の上面図であり、選択可能な撮影モードの例を示している。例えば、Ｔｖはシャッタスピード優先モード、ＡｖはＦ値優先モード、Ｍはマニュアル設定モード、Ｐはプログラムモード、ＳＣＮはシーン選択モードを示している。所望の撮影モードを示す文字がマーク６３の位置にくるようにモード切替スイッチ６０を回転させることで、所望の撮影モードを設定することができる。図５（ｂ）は、シーン選択モードが設定されている状態を示している。 FIG. 5 is a diagram showing an example of the appearance of the imaging device 1. FIG. 5A shows an arrangement example of the mode changeover switch 60. FIG. 5B is a top view of the mode changeover switch 60 and shows examples of selectable shooting modes. For example, Tv indicates shutter speed priority mode, Av indicates F-number priority mode, M indicates manual setting mode, P indicates program mode, and SCN indicates scene selection mode. A desired shooting mode can be set by rotating the mode changeover switch 60 so that the character indicating that mode is aligned with the mark 63. FIG. 5B shows a state in which the scene selection mode is set.

シーン選択モードは、特定のシーンや特定の被写体を撮影するための撮影モードである。そのため、シーン選択モードでは、シーンや被写体の種類が設定される必要がある。システム制御回路５０は、設定されたシーンや被写体の種類に適した撮影条件（シャッタスピード、絞り値、感度など）やＡＦモードを設定する。 Scene selection mode is a shooting mode for shooting a specific scene or a specific subject. Therefore, in the scene selection mode, it is necessary to set the type of scene and subject. The system control circuit 50 sets shooting conditions (shutter speed, aperture value, sensitivity, etc.) and AF mode suitable for the set scene and subject type.

本実施形態では、シーン選択モードにおけるシーンや被写体の種類は、図５（ｃ）に示すように、表示部２８に表示されるメニュー画面の操作を通じて設定することができる。ここでは、一例としてポートレート、風景、キッズ、スポーツのいずれかを設定可能であるが、より多くの選択肢が存在してもよい。上述したように、シーン選択モードでは、設定されているシーンや被写体の種類を、撮影モードとして取り扱う。 In this embodiment, the type of scene and subject in the scene selection mode can be set by operating the menu screen displayed on the display unit 28, as shown in FIG. 5(c). Here, one of portrait, landscape, kids, and sports can be set as an example, but more options may exist. As described above, in the scene selection mode, the set scene and subject type are treated as the shooting mode.

Ｓ３で、システム制御回路５０は、表示用画像データを取得する。システム制御回路５０は、メモリ３２のビデオメモリ領域に格納されている、これから表示されるライブビュー表示用の画像データを読み出し、画像処理部２４に供給する。 In S3, the system control circuit 50 acquires image data for display. The system control circuit 50 reads the image data for live view display to be displayed from now stored in the video memory area of the memory 32 and supplies the data to the image processing section 24 .

Ｓ４で、生成手段としての画像処理部２４は、システム制御回路５０から供給された表示用画像データに対して加工処理を適用し、注視位置検出用の表示用画像データを生成する。そして、画像処理部２４は、生成した表示用画像データを、メモリ３２のビデオメモリ領域に格納し直す。なお、ここではＳ３でビデオメモリ領域から取得した表示用画像データを加工するものとしたが、表示用画像データを画像処理部２４で生成する際に、加工処理を適用し、最初から注視位置検出用の表示用画像データを生成してもよい。 In S4, the image processing unit 24, serving as generating means, applies processing to the display image data supplied from the system control circuit 50 to generate display image data for gaze position detection. The image processing unit 24 then stores the generated display image data back in the video memory area of the memory 32. Although the display image data acquired from the video memory area in S3 is processed here, the processing may instead be applied when the image processing unit 24 generates the display image data, so that display image data for gaze position detection is generated from the outset.

ここで、注視位置検出のために画像処理部２４が適用する加工処理の例についていくつか説明する。注視位置検出のために適用する加工処理は、撮像装置１の設定情報（ここでは一例として撮影モード）に基づいて判定される特徴領域が他の領域より視覚的に強調される加工処理である。この加工処理により、ユーザが所望の被写体に素早く視線を合わせやすくなる。設定情報に対応してどのような特徴情報を検出すべきか、特徴情報を検出するために必要なパラメータなどは、設定情報ごとに、例えば不揮発性メモリ５６に予め記憶しておくことができる。例えば、特定のシーンを撮影するための撮影モードに関連付けて、その特定のシーンに応じた主被写体の種類や、その主被写体の特徴領域を検出するためのテンプレートやパラメータを予め記憶しておくことができる。 Here, some examples of processing applied by the image processing unit 24 for gaze position detection will be described. The processing applied for gaze position detection is processing that visually emphasizes a characteristic region determined based on setting information (here, as an example, a shooting mode) of the imaging device 1 over other regions. This processing makes it easier for the user to quickly match the line of sight with the desired subject. What kind of feature information should be detected in correspondence with the setting information, parameters necessary for detecting the feature information, etc. can be stored in advance in, for example, the non-volatile memory 56 for each setting information. For example, in association with a shooting mode for shooting a specific scene, pre-storing the type of the main subject according to the specific scene and the template and parameters for detecting the characteristic region of the main subject. can be done.
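
The association between setting information and the characteristic region to detect could be held in a small lookup table, as in the Python sketch below. The three entries follow Examples 1 to 3 of this description; the key names, the dictionary layout, and the function name `feature_spec_for` are assumptions made for illustration, not the stored format of the non-volatile memory 56.

```python
# Hypothetical table: shooting mode -> what to detect as the characteristic
# region. Entries mirror Examples 1 to 3; the layout is an assumption.
FEATURE_TABLE = {
    "sports": {"subject": "person", "moving_only": True},
    "kids": {"subject": "child", "moving_only": False},
    "text": {"subject": "text", "moving_only": False},
}


def feature_spec_for(shooting_mode):
    """Return the detection spec for a mode, or None if none is registered."""
    return FEATURE_TABLE.get(shooting_mode)
```

A mode with no registered entry returns `None`, in which case no emphasis processing would be applied in this sketch.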

（例１） (Example 1)
図６は、シーン選択モードでシーンが「スポーツ」に設定されている場合に適用しうる加工処理の例を模式的に示している。図６（ａ）が加工前の表示用画像データが表す画像を示し、図６（ｂ）～図６（ｄ）がそれぞれ加工後の表示用画像データが表す画像を示している。 FIG. 6 schematically shows an example of processing that can be applied when the scene is set to "sports" in the scene selection mode. FIG. 6(a) shows an image represented by the display image data before processing, and FIGS. 6(b) to 6(d) respectively show images represented by the display image data after processing.

シーン選択モードで「スポーツ」が設定されている場合、ユーザはスポーツシーンを撮影することを意図していると推測できる。この場合、画像処理部２４は、動いている人物被写体の領域を強調すべき特徴領域と判定し、特徴領域を強調する加工処理を適用する。 If "sports" is set in the scene selection mode, it can be inferred that the user intends to shoot a sports scene. In this case, the image processing unit 24 determines that the area of the moving human subject is a characteristic area to be emphasized, and applies processing to emphasize the characteristic area.

具体的には、画像処理部２４は、特徴領域として人体領域を検出するとともに、１つ前の検出結果（例えば１フレーム前のライブビュー画像における検出結果）との比較により、移動している人物領域を特定する。そして、画像処理部２４は、現フレームのライブビュー画像に対して、移動している人物領域を強調する加工処理を適用する。 Specifically, the image processing unit 24 detects human body regions as characteristic regions and identifies moving person regions by comparison with the previous detection result (for example, the detection result in the live view image one frame earlier). The image processing unit 24 then applies, to the live view image of the current frame, processing that emphasizes the moving person regions.
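
One simple way to realize the comparison with the previous detection result is to compare the centres of the detected regions between frames, as in the Python sketch below. The description does not specify the comparison method; the nearest-centre matching, the `(x, y, w, h)` box format, the `min_shift` threshold, and the function names are assumptions for illustration.

```python
def center(box):
    """Centre point of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def moving_regions(prev_boxes, curr_boxes, min_shift=5.0):
    """Return the current boxes judged to have moved since the previous frame.

    A box counts as moving when its centre is at least min_shift pixels
    away from the nearest centre detected in the previous frame.
    """
    if not prev_boxes:
        return []  # nothing to compare against, so movement is unknown
    prev_centers = [center(b) for b in prev_boxes]
    moving = []
    for box in curr_boxes:
        cx, cy = center(box)
        nearest = min(((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
                      for px, py in prev_centers)
        if nearest >= min_shift:
            moving.append(box)
    return moving
```

A practical detector would also need to track identities across frames; this sketch only shows the frame-to-frame comparison step.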

ここでは、図６（ａ）に示す現フレームの画像において、移動している人物領域Ｐ１、Ｐ２、Ｐ３が検出されたものとする。そして、図６（ｂ）は、人物領域Ｐ１～Ｐ３を強調する加工処理として、人物領域を囲う枠Ａ１～Ａ３を重畳する処理を適用した例を示している。 Here, it is assumed that moving person areas P1, P2, and P3 are detected in the image of the current frame shown in FIG. 6(a). FIG. 6B shows an example in which processing for superimposing frames A1 to A3 surrounding the person regions is applied as the processing for emphasizing the person regions P1 to P3.

また、図６（ｃ）は、人物領域Ｐ１～Ｐ３を強調する加工処理として、人物領域Ｐ１～Ｐ３を囲う領域の表示は変更せず、他の領域Ａ４の輝度を下げる処理を適用した例を示している。また、図６（ｄ）は、人物領域Ｐ１～Ｐ３を強調する加工処理として、人物領域Ｐ１～Ｐ３の全てを囲う矩形領域Ａ５の表示は変更せず、他の領域の輝度を下げる処理を適用した例を示している。 FIG. 6(c) shows an example in which, as processing for emphasizing the person regions P1 to P3, the display of the regions surrounding the person regions P1 to P3 is left unchanged and the luminance of the remaining region A4 is lowered. FIG. 6(d) shows an example in which, as processing for emphasizing the person regions P1 to P3, the display of a rectangular region A5 that encloses all of the person regions P1 to P3 is left unchanged and the luminance of the other regions is lowered.
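
The two luminance-lowering variants of FIG. 6(c) and FIG. 6(d) can be sketched in Python on a grayscale image held as a list of rows. The rectangular `(x, y, w, h)` approximation of each person region, the `factor` value, and the function names are assumptions for illustration; the description does not state how much the luminance is lowered.

```python
def bounding_box(boxes):
    """Smallest rectangle (x, y, w, h) enclosing every box, as in FIG. 6(d)."""
    x0 = min(x for x, y, w, h in boxes)
    y0 = min(y for x, y, w, h in boxes)
    x1 = max(x + w for x, y, w, h in boxes)
    y1 = max(y + h for x, y, w, h in boxes)
    return (x0, y0, x1 - x0, y1 - y0)


def dim_outside(image, boxes, factor=0.4):
    """Keep pixels inside any box unchanged and scale all others by factor."""
    def inside(px, py):
        return any(x <= px < x + w and y <= py < y + h
                   for x, y, w, h in boxes)

    return [
        [v if inside(px, py) else int(v * factor)
         for px, v in enumerate(row)]
        for py, row in enumerate(image)
    ]
```

Calling `dim_outside(image, boxes)` corresponds to the per-region variant of FIG. 6(c), while `dim_outside(image, [bounding_box(boxes)])` corresponds to the single enclosing rectangle of FIG. 6(d).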

このように、設定されているシーンや主被写体の種類に応じた特徴領域を検出し、検出された特徴領域を強調する加工処理を適用することにより、ユーザが意図している主被写体を見つけやすくすることが期待できる。ユーザが意図している主被写体を見つけやすくなることで、ユーザの視線が主被写体を注視する様になるまでの時間を短縮する効果が期待できる。 In this way, detecting a characteristic region that corresponds to the set scene or main subject type and applying processing that emphasizes the detected region can be expected to make the main subject intended by the user easier to find. This in turn can be expected to shorten the time until the user's line of sight settles on the main subject.

なお、特徴領域を強調する加工処理は上述した例に限定されない。例えば、特徴領域として検出された人体領域Ｐ１、Ｐ２、Ｐ３のエッジを強調する加工処理であってもよい。また、枠Ａ１～Ａ３を点滅させたり、特定の色で表示したりすることもできる。また、図６（ｃ）および図６（ｄ）において輝度を下げる代わりにモノクロ表示としてもよい。また、特徴領域が人間や動物の領域である場合には、画像全体をサーモグラフィ風の擬似カラー画像に変換することで、人物や動物の領域を強調してもよい。 Note that the processing for emphasizing the characteristic region is not limited to the example described above. For example, processing may be performed to emphasize the edges of the human body regions P1, P2, and P3 detected as characteristic regions. Also, the frames A1 to A3 can be blinked or displayed in a specific color. Further, instead of lowering the brightness in FIGS. 6(c) and 6(d), monochrome display may be used. Moreover, when the feature region is a human or animal region, the human or animal region may be emphasized by converting the entire image into a thermography-like pseudo-color image.

（例２） (Example 2)
図７（ａ）、図７（ｂ）は、シーン選択モードで主被写体が「キッズ」に設定されている場合に適用しうる加工処理の例を模式的に示している。図７（ａ）が加工前の表示用画像データが表す画像を示し、図７（ｂ）が加工後の表示用画像データが表す画像を示している。 FIGS. 7A and 7B schematically show examples of processing that can be applied when the main subject is set to "kids" in the scene selection mode. FIG. 7A shows an image represented by the display image data before processing, and FIG. 7B shows an image represented by the display image data after processing.

シーン選択モードで「キッズ」が設定されている場合、ユーザは子供を主被写体として撮影することを意図していると推測できる。この場合、画像処理部２４は、子供と推定される人物被写体の領域を強調すべき特徴領域と判定し、特徴領域を強調する加工処理を適用する。 If "Kids" is set in the scene selection mode, it can be inferred that the user intends to shoot a child as the main subject. In this case, the image processing unit 24 determines that the area of the human subject presumed to be a child is a characteristic area to be emphasized, and applies processing to emphasize the characteristic area.

特徴領域として検出された人物の領域が、大人であるか子供であるかは、例えば胴体の長さもしくは身長に対する頭部の長さの割合が閾値以下なら子供と判定したり、機械学習を利用して判定したりすることができるが、これらに限定されない。事前に子供として登録されている人物だけを顔認証によって検出してもよい。 Whether a person region detected as a characteristic region is an adult or a child can be determined, for example, by judging the person to be a child when the ratio of head length to torso length or to body height is at or below a threshold, or by using machine learning, although the method is not limited to these. Alternatively, only persons registered in advance as children may be detected by face authentication.

ここでは、図７（ａ）に示す現フレームの画像において、人物領域Ｐ１、Ｋ１、Ｋ２が検出され、領域Ｋ１、Ｋ２が子供と判定されたとする。そして、図７（ｂ）は、子供の領域Ｋ１、Ｋ２を強調する加工処理として、子供の領域Ｋ１、Ｋ２のエッジを強調し、さらに子供の領域Ｋ１、Ｋ２以外の領域の階調を削減する処理を適用した例を示している。階調の削減は、最大輝度の低減（輝度の圧縮）、輝度階調数の削減（２５６階調から１６階調にする）などであってよいが、これらに限定されない。例１で示した様な、輝度の低下やモノクロ表示を適用してもよい。 Here, it is assumed that person areas P1, K1, and K2 are detected in the image of the current frame shown in FIG. 7A, and areas K1 and K2 are determined to be children. In FIG. 7B, as processing for emphasizing the child areas K1 and K2, the edges of the child areas K1 and K2 are emphasized, and the gradation of areas other than the child areas K1 and K2 is reduced. An example of applying the treatment is shown. Reduction of gradation may be reduction of maximum luminance (compression of luminance), reduction of number of luminance gradations (from 256 gradations to 16 gradations), etc., but is not limited to these. Reduction in luminance or monochrome display as shown in Example 1 may be applied.
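
The two gradation-reduction options named above, reducing the number of luminance levels from 256 to 16 and compressing the maximum luminance, can be sketched in Python for 8-bit values as follows. The function names, the requirement that `levels` divides 256, and the default `max_out` of 128 are assumptions for illustration.

```python
def reduce_levels(value, levels=16):
    """Quantize an 8-bit luminance value (0-255) to the given number of levels.

    Assumes levels divides 256 evenly (for example 16).
    """
    step = 256 // levels
    return (value // step) * step


def compress_luminance(value, max_out=128):
    """Linearly rescale an 8-bit luminance value so that 255 maps to max_out."""
    return value * max_out // 255
```

Either function would be applied only to pixels outside the child regions K1 and K2, leaving the emphasized regions at full gradation.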

（例３） (Example 3)
図７（ａ）、図７（ｃ）は、シーン選択モードで主被写体が「文字」に設定されている場合に適用しうる加工処理の例を模式的に示している。図７（ａ）が加工前の表示用画像データが表す画像を示し、図７（ｃ）が加工後の表示用画像データが表す画像を示している。 FIGS. 7A and 7C schematically show examples of processing that can be applied when the main subject is set to "text" in the scene selection mode. FIG. 7A shows an image represented by the display image data before processing, and FIG. 7C shows an image represented by the display image data after processing.

シーン選択モードで「文字」が設定されている場合、ユーザはシーン内に存在する文字に注目して撮影することを意図していると推測できる。この場合、画像処理部２４は、特徴領域として文字と推定される領域を強調すべき特徴領域と判定し、特徴領域を強調する加工処理を適用する。 When "text" is set in the scene selection mode, it can be inferred that the user intends to shoot with attention on text present in the scene. In this case, the image processing unit 24 determines that a region estimated to be text is the characteristic region to be emphasized, and applies processing that emphasizes that region.

ここでは、図７（ａ）に示す現フレームの画像において、文字領域ＭＯが検出されたものとする。そして、図７（ｃ）は、文字領域ＭＯを強調する加工処理として、文字領域ＭＯのエッジを強調するとともに、文字領域ＭＯ以外の領域の階調を削減する処理を適用した例を示している。諧調の削減方法については例２と同様であってよい。 Here, it is assumed that the character area MO is detected in the image of the current frame shown in FIG. 7(a). FIG. 7(c) shows an example in which processing for emphasizing the edges of the character area MO and reducing the gradation of areas other than the character area MO is applied as processing for emphasizing the character area MO. . The gradation reduction method may be the same as in Example 2.

例２および例３で説明した様に、同じ元画像（図７（ａ））に対して、設定されている撮影モードによって異なる加工処理が適用されうる。なお、例２および例３においても、例１と同様の加工処理を適用してもよい。また、強調すべき領域に対するエッジの強調と他の領域に対する輝度や階調の低減とは、一方のみ適用してもよい。 As described in Examples 2 and 3, different processing can be applied to the same original image (FIG. 7A) depending on the set shooting mode. In addition, in Examples 2 and 3, processing similar to that in Example 1 may be applied. Further, only one of the edge enhancement for the region to be enhanced and the luminance or gradation reduction for the other region may be applied.

本実施形態において注視位置検出用の画像データに対して適用しうる加工処理は、撮像装置の設定情報に基づいて決定される強調すべき領域（特徴領域）が他の領域より視覚的に強調される加工処理である。加工処理は例えば以下の４通りのいずれかであってよい。 The processing that can be applied to image data for gaze position detection in the present embodiment is processing in which the region to be emphasized (characteristic region), determined based on setting information of the imaging device, is visually emphasized over other regions. The processing may be, for example, any one of the following four types.
（１）強調すべき領域は加工せず、他の領域を（輝度や階調を削減するなどして）目立たなくなるように加工する処理 (1) Processing that leaves the region to be emphasized unprocessed and processes the other regions so that they become less noticeable (for example, by reducing luminance or gradation)
（２）強調すべき領域を強調し（エッジの強調など）、他の領域は加工しない処理 (2) Processing that emphasizes the region to be emphasized (for example, by edge enhancement) and leaves the other regions unprocessed
（３）強調すべき領域を強調し（エッジの強調など）、さらに、他の領域を（輝度や階調を削減するなどして）目立たなくするように加工する処理 (3) Processing that emphasizes the region to be emphasized (for example, by edge enhancement) and additionally processes the other regions so that they become less noticeable (for example, by reducing luminance or gradation)
（４）画像全体を加工して、強調すべき領域を強調する処理（擬似カラー画像への変換など） (4) Processing that processes the entire image so as to emphasize the region to be emphasized (for example, conversion to a pseudo-color image)
なお、これらは例示であり、強調すべき領域が他の領域に対して視覚的に強調される（目立つ）ような任意の加工処理を適用可能である。 These are examples, and any processing by which the region to be emphasized is visually emphasized over (stands out from) the other regions can be applied.

図４に戻り、Ｓ５でシステム制御回路５０は、Ｓ４で画像処理部２４が生成した表示用の画像データを表示部２８に表示させる。また、システム制御回路５０は、画像処理部２４が視線入力部７０１からの視線検出用画像に基づいて検出した眼球の光軸の回転角を、画像処理部２４から取得する。システム制御回路５０は、取得した回転角に基づいて、ユーザが注視している、表示部２８に表示されている画像内の座標（注視位置）を求める。なお、システム制御回路５０は、得られた注視位置を示すマークなどをライブビュー画像に重畳表示させることで、注視位置をユーザに通知もしくはフィードバックしてもよい。 Returning to FIG. 4, in S5, the system control circuit 50 causes the display unit 28 to display the display image data generated by the image processing unit 24 in S4. The system control circuit 50 also acquires from the image processing unit 24 the rotation angle of the optical axis of the eyeball detected by the image processing unit 24 based on the sight line detection image from the sight line input unit 701 . Based on the acquired rotation angle, the system control circuit 50 obtains the coordinates (gaze position) within the image displayed on the display unit 28 where the user is gazing. Note that the system control circuit 50 may notify or feed back the gaze position to the user by superimposing a mark indicating the obtained gaze position on the live view image.

以上で、注視位置検出動作は終了する。注視位置検出動作によって得られた注視位置は、焦点検出領域の設定、主被写体の選択などに用いることができるが、これらに限定されない。なお、撮影で得られた画像データを記録する場合、撮影時に検出した注視位置の情報を画像データと関連づけて記録してもよい。例えば、画像データを格納するデータファイルのヘッダなどに記録される付随情報として、撮影時の注視位置の情報を記録することができる。画像データに関連づけて記録された注視位置の情報は、画像データを取り扱うアプリケーションプログラムなどにおいて、主被写体の特定などに利用することができる。 Thus, the gaze position detection operation ends. The gaze position obtained by the gaze position detection operation can be used for setting the focus detection area, selecting the main subject, etc., but is not limited to these. When recording image data obtained by photographing, information on the gaze position detected at the time of photographing may be recorded in association with the image data. For example, it is possible to record the gaze position information at the time of photographing as accompanying information recorded in the header of a data file that stores image data. The gaze position information recorded in association with the image data can be used to identify the main subject in an application program or the like that handles the image data.
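
As one possible use of the detected gaze position for main subject selection, the Python sketch below picks the detected region that contains the gaze position. The description only states that the gaze position can be used for such purposes; the containment rule, the `(x, y, w, h)` box format, and the function name `select_subject` are assumptions for illustration.

```python
def select_subject(gaze, boxes):
    """Return the first box (x, y, w, h) containing the gaze point, else None.

    gaze: (x, y) gaze position in image coordinates.
    boxes: candidate subject regions detected in the displayed image.
    """
    gx, gy = gaze
    for box in boxes:
        x, y, w, h = box
        if x <= gx < x + w and y <= gy < y + h:
            return box
    return None  # the user is not looking at any candidate region
```

The selected region could then serve as the focus detection area, for example.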

なお、視線入力機能が有効に設定されていない場合、画像処理部２４は表示用の画像データに対して視線入力を支援するための加工処理は適用しないが、目的の異なる加工処理は適用しうる。 Note that when the line-of-sight input function is not set to be valid, the image processing unit 24 does not apply processing for supporting line-of-sight input to image data for display, but may apply processing for different purposes. .

以上説明したように、本実施形態では、視線入力が有効な際に表示する画像に対し、撮像装置の設定情報に基づいて判定された特徴領域が他の領域より視覚的に強調される加工処理を適用するようにした。これにより、ユーザが主被写体として意図している可能性の高い領域が視認しやすくなり、主被写体を注視するまでの時間が短縮される効果が期待できる。 As described above, in the present embodiment, in the image displayed when line-of-sight input is valid, processing is performed in which the characteristic region determined based on the setting information of the imaging device is visually emphasized over the other regions. applied. As a result, it becomes easier for the user to visually recognize the area that is likely to be intended as the main subject, and the effect of shortening the time required to gaze at the main subject can be expected.

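なお、本実施形態では、強調すべき領域を、撮影モードの設定に基づいて判定したが、ユーザが意図している可能性の高い主被写体の種類が判定可能であれば、他の設定を用いてもよい。 Note that, although the region to be emphasized is determined based on the shooting mode setting in the present embodiment, other settings may be used as long as they allow the type of main subject most likely intended by the user to be determined.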

●（第２実施形態） ● (Second embodiment)
次に、第２実施形態について説明する。第２実施形態は第１の実施形態における表示部２８としてＸＲゴーグル（頭部装着型の表示装置もしくはＨＭＤ）を用いる場合の実施形態である。なお、ＸＲとは、ＶＲ（仮想現実）、ＡＲ（拡張現実）、ＭＲ（複合現実）の総称である。 Next, a second embodiment will be described. The second embodiment uses XR goggles (a head-mounted display device, or HMD) as the display unit 28 of the first embodiment. XR is a general term for VR (virtual reality), AR (augmented reality), and MR (mixed reality).

図８（ａ）の左図はＸＲゴーグル８００の外観例を示す斜視図である。ＸＲゴーグル８００は、図８（ａ）の右図に示す顔の領域ＳＯに装着するのが一般的である。図８（ｂ）は、ＸＲゴーグル８００の装着面（顔が接する面）を模式的に示した図である。また、図８（ｃ）は、ＸＲゴーグル８００の装着時における、ＸＲゴーグル８００の接眼レンズ７０１ｄと表示部２８Ａ、２８Ｂ、ユーザの右目５０１ａ、左目５０１ｂの位置関係を模式的に示す上面図である。 The left view of FIG. 8(a) is a perspective view showing an example of the appearance of the XR goggles 800. The XR goggles 800 are generally worn on the face region SO shown in the right view of FIG. 8(a). FIG. 8(b) is a diagram schematically showing the mounting surface of the XR goggles 800 (the surface in contact with the face). FIG. 8(c) is a top view schematically showing the positional relationship between the eyepiece lenses 701d and display units 28A and 28B of the XR goggles 800 and the user's right eye 501a and left eye 501b when the XR goggles 800 are worn.

ＸＲゴーグル８００は、右目５０１ａ用の表示部２８Ａと左目５０１ｂ用の表示部２８Ｂを有し、視差画像対を構成する右目用画像を表示部２８Ａに、左目用画像を表示部２８Ｂに表示させることにより、立体視を可能としている。そのため、第１実施形態で説明した接眼レンズ７０１ｄが表示部２８Ａおよび２８Ｂのそれぞれについて設けられている。 The XR goggles 800 have a display unit 28A for the right eye 501a and a display unit 28B for the left eye 501b, and enable stereoscopic viewing by displaying the right-eye image of a parallax image pair on the display unit 28A and the left-eye image on the display unit 28B. For this purpose, the eyepiece lens 701d described in the first embodiment is provided for each of the display units 28A and 28B.

なお、本実施形態では撮像部２２が図２（ａ）に示した構成の画素を有するものとする。この場合、光電変換部２０１ａから得られる画素信号群から右目用画像を、光電変換部２０１ｂから得られる画素信号群から左目用画像を生成することができる。レンズユニット１５０をステレオ映像を撮影可能なレンズとするなど、他の構成を用いて右目用画像および左目用画像を生成してもよい。また、視線入力部７０１は、ＸＲゴーグルの接眼部に設けられ、右目、左目の一方について視線検出用画像を生成するものとする。 In this embodiment, it is assumed that the imaging unit 22 has pixels configured as shown in FIG. 2(a). In this case, a right-eye image can be generated from the pixel signal group obtained from the photoelectric conversion unit 201a, and a left-eye image can be generated from the pixel signal group obtained from the photoelectric conversion unit 201b. The right-eye image and the left-eye image may be generated using another configuration, such as using a lens capable of capturing stereo images as the lens unit 150 . Also, the line-of-sight input unit 701 is provided in the eyepiece of the XR goggles, and generates a line-of-sight detection image for either the right eye or the left eye.

それ以外の構成は、図１に示した撮像装置１と同様の構成により実施可能であるため、以下では撮像装置１の構成要素を用いて説明する。なお、本実施形態では、ライブビュー画像ではなく、予め記録媒体２００に記録済みの右目用画像および左目用画像を用いて表示用画像を生成するものとする。 The rest of the configuration can be implemented in the same way as the imaging device 1 shown in FIG. 1, so the following description uses the components of the imaging device 1. Note that in the present embodiment, the display image is generated not from a live view image but from a right-eye image and a left-eye image recorded in advance on the recording medium 200.

図９は、本実施形態における注視位置検出動作に関するフローチャートであり、第１実施形態と同様の処理を行うステップには、図４と同じ参照符号を付与することにより、重複する説明を省略する。 FIG. 9 is a flowchart relating to gaze position detection operation in this embodiment, and redundant description is omitted by assigning the same reference numerals as in FIG. 4 to steps that perform the same processing as in the first embodiment.

Ｓ９１で、システム制御回路５０は、現在設定されている体験モードを取得する。本実施形態では撮影を行わないため、ＸＲに関する体験モードを取得する。体験モードは、例えばＸＲ体験を行う仮想環境の種類であり、例えば「美術館」、「博物館」、「動物園」、「ダイビング」といった選択肢が用意される。なお、体験モードは、リモートコントローラを用いたり、ＸＲゴーグルに設けられた入力デバイスを用いたり、メニュー画面を表示して視線で選択したりする方法によって設定可能である。なお、記録媒体２００には、体験モードとして選択可能な仮想環境のそれぞれに対応した表示用画像データが記憶されているものとする。 In S91, the system control circuit 50 acquires the currently set experience mode. Since no shooting is performed in this embodiment, an experience mode related to XR is acquired. The experience mode is, for example, the type of virtual environment in which the XR experience is performed, and options such as "art museum", "museum", "zoo", and "diving" are prepared. Note that the experience mode can be set by using a remote controller, using an input device provided in the XR goggles, or by displaying a menu screen and making a selection with the line of sight. It is assumed that the recording medium 200 stores display image data corresponding to each virtual environment that can be selected as the experience mode.

Ｓ３でシステム制御回路５０は、Ｓ９１で選択された体験モードに対応した表示用画像データを記録媒体２００から読み出すことによって取得し、画像処理部２４に供給する。 In S3, the system control circuit 50 acquires the display image data corresponding to the experience mode selected in S91 by reading it from the recording medium 200, and supplies it to the image processing unit 24.

Ｓ９２で、画像処理部２４は、システム制御回路５０から供給された表示用画像データに対して加工処理を適用し、注視位置検出用の表示用画像データを生成する。本実施形態では表示画像データが右目用画像と左目用画像とを含んだステレオ画像データであるため、画像処理部２４は、右目用画像と左目用画像の両方に対して加工処理を適用する。 In S92, the image processing unit 24 applies processing to the display image data supplied from the system control circuit 50 to generate display image data for gaze position detection. In the present embodiment, the display image data is stereo image data including a right-eye image and a left-eye image, so the image processing unit 24 applies processing to both the right-eye image and the left-eye image.

Ｓ９２で画像処理部２４が適用する加工処理は、ＸＲ体験を提供する装置（ここでは撮像装置１）の設定情報（ここでは一例として体験モード）に基づいて判定される特徴領域が他の領域より視覚的に強調される加工処理である。この加工処理により、ＸＲ体験の没入感が増す効果が期待できる。 The processing applied by the image processing unit 24 in S92 is processing in which a characteristic region, determined based on setting information (here, the experience mode as an example) of the device providing the XR experience (here, the imaging device 1), is visually emphasized over other regions. This processing can be expected to increase the sense of immersion in the XR experience.

Ｓ９２で画像処理部２４が適用する加工処理の例について説明する。 An example of processing applied by the image processing unit 24 in S92 will be described.
（例４） (Example 4)
図１０は、体験モードが「ダイビング」に設定されている場合に適用しうる加工処理の例を模式的に示している。図１０（ａ）は加工前の表示用画像データが表す画像を示している。 FIG. 10 schematically shows an example of processing that can be applied when the experience mode is set to "diving". FIG. 10(a) shows an image represented by the display image data before processing.

体験モードで「ダイビング」が設定されている場合、ユーザは海中の生物に興味を有していると推測できる。この場合、画像処理部２４は、動いている海中の生物の領域を強調すべき特徴領域と判定し、特徴領域を強調する加工処理を適用する。 When "diving" is set in the experience mode, it can be inferred that the user is interested in underwater creatures. In this case, the image processing unit 24 determines that the area of moving underwater creatures is a characteristic area to be emphasized, and applies processing to emphasize the characteristic area.

具体的には、画像処理部２４は、特徴領域として魚類および海獣などの領域を検出するとともに、過去の検出結果との比較により、移動している特徴領域を特定する。そして、画像処理部２４は、処理対象のフレーム画像に対して、移動している特徴領域を強調する加工処理を適用する。 Specifically, the image processing unit 24 detects areas of fish, sea animals, and the like as characteristic areas, and identifies moving characteristic areas by comparing them with past detection results. Then, the image processing unit 24 applies a processing process for emphasizing a moving characteristic region to the frame image to be processed.

ここでは、図１０（ａ）に示す処理対象のフレーム画像において、移動している魚および人間の領域である特徴領域ｆ１～ｆ４が検出されたものとする。この場合、画像処理部２４は、特徴領域ｆ１～ｆ４を強調する加工処理として、特徴領域ｆ１～ｆ４の表示は維持し、他の領域の色数を低減する（例えばモノクロにする）加工処理を適用する。なお、特徴領域を強調する加工処理は、第１実施形態で説明したものを含む、他の加工処理であってもよい。 Here, it is assumed that characteristic regions f1 to f4, which are regions of moving fish and humans, are detected in the frame image to be processed shown in FIG. 10(a). In this case, the image processing unit 24 maintains the display of the characteristic regions f1 to f4 and performs processing to reduce the number of colors in other regions (for example, to monochrome) as the processing to emphasize the characteristic regions f1 to f4. Apply. The processing for emphasizing the characteristic region may be other processing including the one described in the first embodiment.
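
The colour-reduction processing described above, keeping the characteristic regions in colour and rendering everything else in monochrome, can be sketched in Python on an RGB image held as rows of `(r, g, b)` tuples. The rectangular `(x, y, w, h)` approximation of the characteristic regions, the use of the common 0.299/0.587/0.114 luma weights, and the function names are assumptions for illustration.

```python
def to_gray(rgb):
    """Convert an (r, g, b) pixel to a gray pixel using integer luma weights."""
    r, g, b = rgb
    y = (299 * r + 587 * g + 114 * b) // 1000
    return (y, y, y)


def monochrome_outside(image, boxes):
    """Keep colour inside any box and convert all other pixels to gray."""
    def inside(px, py):
        return any(x <= px < x + w and y <= py < y + h
                   for x, y, w, h in boxes)

    return [
        [pixel if inside(px, py) else to_gray(pixel)
         for px, pixel in enumerate(row)]
        for py, row in enumerate(image)
    ]
```

For stereo display, the same function would be applied to both the right-eye and left-eye images with the regions detected in each.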

図９に戻り、Ｓ５でシステム制御回路５０は、システム制御回路５０は、画像処理部２４が視線入力部７０１からの視線検出用画像に基づいて検出した眼球の光軸の回転角を、画像処理部２４から取得する。システム制御回路５０は、取得した回転角に基づいて、ユーザが注視している、表示部２８Ａまたは２８Ｂに表示されている画像内の座標（注視位置）を求める。そして、システム制御回路５０は、Ｓ９２で画像処理部２４が生成した右目用および左目用の画像データに注視位置を示すマークを重畳して、表示部２８Ａおよび２８Ｂ表示させる。 Returning to FIG. 9, in S5 the system control circuit 50 acquires, from the image processing unit 24, the rotation angle of the optical axis of the eyeball that the image processing unit 24 detected based on the line-of-sight detection image from the line-of-sight input unit 701. Based on the acquired rotation angle, the system control circuit 50 obtains the coordinates (gaze position) within the image displayed on the display unit 28A or 28B at which the user is gazing. Then, the system control circuit 50 superimposes a mark indicating the gaze position on the right-eye and left-eye image data generated by the image processing unit 24 in S92, and displays them on the display units 28A and 28B.

Ｓ９３でシステム制御回路５０は、Ｓ５で検出した注視位置情報を用いて表示用画像にさらなる加工処理を適用するか否かを判定する。この判定は、例えば注視位置情報の利用に関するユーザ設定に基づいて実行するなど、任意の判定条件に基づいて実行することができる。 In S93, the system control circuit 50 determines whether or not to apply further processing to the display image using the gaze position information detected in S5. This determination can be performed based on arbitrary determination conditions, for example, based on user settings regarding the use of gaze position information.

システム制御回路５０は、注視位置情報を用いないと判定されれば注視位置検出動作を終了する。一方、システム制御回路５０は、視線位置情報を用いると判定されればＳ９４を実行する。 If it is determined that the gaze position information is not used, the system control circuit 50 ends the gaze position detection operation. On the other hand, the system control circuit 50 executes S94 if it is determined that the line-of-sight position information is to be used.

Ｓ９４でシステム制御回路５０は、メモリ３２のビデオメモリ領域に格納されている、表示用画像データを読み出し、画像処理部２４に供給する。画像処理部２４は、Ｓ５で検出された注視位置を用いて、表示用画像データに対してさらに加工処理を適用する。 In S94, the system control circuit 50 reads the display image data stored in the video memory area of the memory 32 and supplies it to the image processing unit 24. The image processing unit 24 further applies processing to the display image data using the gaze position detected in S5.

Ｓ９４で行う、注視位置情報を用いた加工処理の例を図１０（ｂ）および図１０（ｃ）に示す。表示用画像データには、Ｓ５で検出された注視位置を示すマーカーＰ１が重畳されている。ここでは、検出された注視位置ｐ１が特徴領域ｆ１内であるため、ユーザは特徴領域ｆ１に興味を持っている可能性が高い。そのため、Ｓ９２で加工処理を適用して強調した特徴領域ｆ１～ｆ４のうち、特徴領域ｆ１が他の特徴領域ｆ２～ｆ４より視覚的に強調されるような加工処理を適用する。 FIGS. 10(b) and 10(c) show examples of processing using gaze position information performed in S94. A marker P1 indicating the gaze position detected in S5 is superimposed on the display image data. Here, since the detected gaze position p1 is within the characteristic region f1, there is a high possibility that the user is interested in the characteristic region f1. Therefore, among the characteristic regions f1 to f4 emphasized by the processing applied in S92, processing is applied such that the characteristic region f1 is visually emphasized more than the other characteristic regions f2 to f4.

例えば、Ｓ９２では特徴領域ｆ１～ｆ４はカラー表示を維持し、他の領域をモノクロ表示することにより、特徴領域ｆ１～ｆ４を強調したとする。この場合、Ｓ９４で画像処理部２４は、特徴領域ｆ２～ｆ４もモノクロ表示とし、特徴領域ｆ１、あるいは注視位置と注視位置に最も近い特徴領域（ここでは特徴領域ｆ１）を包含する領域はカラー表示を維持するようにする。図１０（ｂ）は、注視位置ｐ１と特徴領域ｆ１を包含する領域Ｃ１はカラー表示が維持され、特徴領域ｆ２～ｆ４を含む他の領域はモノクロ表示とする加工処理が行われた状態を模式的に示している。ここでは、特徴領域ｆ２～ｆ４を、特徴量域以外の領域と同じ表示形態に変更するものとしたが、特徴領域ｆ２～ｆ４は特徴領域ｆ１よりは目立たず、特徴領域以外の領域よりは目立つような表示形態としてもよい。 For example, assume that in S92 the characteristic regions f1 to f4 were emphasized by maintaining their color display and displaying the other regions in monochrome. In this case, in S94 the image processing unit 24 also displays the characteristic regions f2 to f4 in monochrome, and maintains color display for the characteristic region f1, or for a region that includes the gaze position and the characteristic region closest to the gaze position (here, the characteristic region f1). FIG. 10(b) schematically shows a state in which color display is maintained in a region C1 including the gaze position p1 and the characteristic region f1, and the other regions, including the characteristic regions f2 to f4, are displayed in monochrome. Here, the characteristic regions f2 to f4 are changed to the same display form as the regions other than the characteristic regions, but they may instead be given a display form that is less conspicuous than the characteristic region f1 and more conspicuous than the regions other than the characteristic regions.
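
The selection of the characteristic region to keep emphasized can be sketched as follows. This is a sketch under stated assumptions, not the disclosed implementation: characteristic regions are assumed to be axis-aligned rectangles, "closest" is assumed to mean the smallest Euclidean distance from the gaze position to the rectangle (zero when the gaze position is inside it), and the region C1 is assumed to be the bounding box of the gaze position and the selected region. The function names are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def distance_to_rect(point: Point, rect: Rect) -> float:
    """Euclidean distance from a point to a rectangle (0 if the point is inside)."""
    px, py = point
    left, top, right, bottom = rect
    dx = max(left - px, 0.0, px - right)
    dy = max(top - py, 0.0, py - bottom)
    return math.hypot(dx, dy)


def select_gazed_region(gaze: Point, regions: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the region containing, or otherwise nearest to, the gaze position."""
    if not regions:
        return None
    return min(regions, key=lambda name: distance_to_rect(gaze, regions[name]))


def enclosing_region(gaze: Point, rect: Rect) -> Rect:
    """Bounding box that includes both the gaze position and the selected region."""
    gx, gy = gaze
    left, top, right, bottom = rect
    return (min(left, gx), min(top, gy), max(right, gx), max(bottom, gy))


# Usage with two hypothetical regions.
regions = {"f1": (10, 10, 50, 40), "f2": (100, 10, 140, 40)}
inside = select_gazed_region((30, 20), regions)    # gaze inside f1
nearest = select_gazed_region((90, 20), regions)   # gaze between regions, nearer to f2
c1 = enclosing_region((90, 20), regions[nearest])
```

The returned bounding box plays the role of the region in which color display is maintained; all other regions, including the non-selected characteristic regions, would then be rendered in the de-emphasized form.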

このように、注視位置を用いて強調すべき特徴領域を絞り込むことにより、注視位置を用いない場合よりもユーザが興味を持っている特徴領域をより正確に把握し、強調する加工処理を適用することができる。そのため、ＸＲ体験時の没入感が増す効果が一層期待できる。また、注視位置を利用したアプリケーションにおいて、意図している被写体を注視していることをユーザが確認することが容易になるという効果も実現できる。 In this way, by using the gaze position to narrow down the characteristic regions to be emphasized, the characteristic region in which the user is interested can be identified more accurately than when the gaze position is not used, and processing that emphasizes it can be applied. Therefore, a further increase in the sense of immersion during the XR experience can be expected. In addition, in an application that uses the gaze position, it becomes easier for the user to confirm that he or she is gazing at the intended subject.

検出された注視位置の時間変化を利用することもできる。図１０（ｃ）において、時刻Ｔ＝０で検出された注視位置がＰ１、Ｔ＝１（単位は任意）で検出された注視位置がＰ２であったとする。この場合、時刻Ｔ＝０から１の間に、注視位置がＰ１からＰ２へ移動しているため、ユーザは視線を左方向に移動させていることがわかる。 It is also possible to use the temporal change in the detected gaze position. In FIG. 10C, it is assumed that the gaze position detected at time T=0 is P1, and the gaze position detected at T=1 (arbitrary unit) is P2. In this case, since the gaze position moves from P1 to P2 between times T=0 and 1, it can be seen that the user moves the line of sight to the left.

この場合、注視位置の移動方向に存在する特徴領域ｆ４について強調することで、ユーザが新しい特徴領域を注視しやすくなる効果が期待できる。ここでは、カラー表示を維持するＣ２を、注視位置の移動方向における距離が最短である特徴領域ｆ４を包含するように拡張した例を示している。ここでは強調する領域を注視位置の移動方向に拡張する例を示したが、領域を拡張せずに、注視位置の移動に合わせて移動させてもよい。 In this case, by emphasizing the characteristic region f4 existing in the movement direction of the gaze position, an effect that the user can easily gaze at a new characteristic region can be expected. Here, an example is shown in which C2 that maintains color display is expanded to include the characteristic region f4 that has the shortest distance in the movement direction of the gaze position. Although an example in which the region to be emphasized is expanded in the moving direction of the gaze position is shown here, the region may be moved according to the movement of the gaze position without expanding the region.
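
The use of the temporal change in the gaze position can be sketched as follows. This is an illustrative sketch only: each characteristic region is represented by its center point, the distance "in the movement direction" is taken to be the projection of the vector from the current gaze position to that center onto the unit movement vector, and a hypothetical cosine threshold `min_cos` is introduced to ignore regions lying far off to the side of the movement direction. None of these choices is specified in the description.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def next_region_along_gaze(
    p_prev: Point, p_now: Point, centers: Dict[str, Point], min_cos: float = 0.5
) -> Optional[str]:
    """Return the region lying ahead of the gaze movement with the shortest along-track distance."""
    dx, dy = p_now[0] - p_prev[0], p_now[1] - p_prev[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        return None  # the gaze did not move, so no direction is defined
    ux, uy = dx / norm, dy / norm
    best_name, best_along = None, math.inf
    for name, (cx, cy) in centers.items():
        vx, vy = cx - p_now[0], cy - p_now[1]
        dist = math.hypot(vx, vy)
        if dist == 0:
            continue  # the gaze is already on this center
        along = vx * ux + vy * uy  # signed distance along the movement direction
        if along / dist < min_cos:
            continue  # behind the gaze or too far off-axis (assumed tolerance)
        if along < best_along:
            best_name, best_along = name, along
    return best_name


# Usage: the gaze moves to the left from P1 to P2 (coordinates are hypothetical).
centers = {"f1": (210, 100), "f2": (20, 100), "f3": (150, 300), "f4": (80, 110)}
target = next_region_along_gaze((200, 100), (150, 100), centers)
```

With these hypothetical coordinates the region f4 is selected, because it is the nearest region ahead of the leftward movement; the region in which color display is maintained could then be expanded, or moved, to include it.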

このように、視線位置の経時変化を考慮して強調する領域を決定することにより、ユーザがこれから注視するであろう被写体について強調することができ、ユーザが所望の被写体を容易に注視することを支援できる効果が期待できる。 In this way, by determining the region to be emphasized in consideration of the temporal change in the line-of-sight position, the subject that the user is likely to gaze at next can be emphasized, which can be expected to help the user gaze at the desired subject easily.

Ｓ９２で画像処理部２４が適用する加工処理の別の例について説明する。 Another example of processing applied by the image processing unit 24 in S92 will be described.
（例５） (Example 5)
図１１は、体験モードが「美術館」に設定されている場合に適用しうる加工処理の例を模式的に示している。図１１（ａ）は加工前の表示用画像データが表す画像を示している。 FIG. 11 schematically shows an example of processing that can be applied when the experience mode is set to "museum". FIG. 11(a) shows an image represented by display image data before processing.

体験モードで「美術館」が設定されている場合、ユーザは絵画や彫刻などの美術品に興味を有していると推測できる。この場合、画像処理部２４は、美術品の領域を強調すべき特徴領域と判定し、特徴領域を強調する加工処理を適用する。 If "museum" is set in the experience mode, it can be inferred that the user is interested in works of art such as paintings and sculptures. In this case, the image processing unit 24 determines that the area of the artwork is a characteristic area to be emphasized, and applies processing to emphasize the characteristic area.

ここでは、図１１（ａ）に示す処理対象のフレーム画像において、美術品の領域として特徴領域Ｂ１～Ｂ５が検出されたものとする。この場合、Ｓ９２で画像処理部２４は、例えば図１１（ｂ）に示すように、特徴領域Ｂ１～Ｂ５を強調する加工処理として、特徴領域Ｂ１～Ｂ５の表示は維持し、他の領域の輝度を低減する加工処理を適用する。なお、特徴領域を強調する加工処理は、第１実施形態で説明したものを含む、他の加工処理であってもよい。 Here, it is assumed that characteristic regions B1 to B5 are detected as regions of works of art in the frame image to be processed shown in FIG. 11(a). In this case, in S92, as shown in FIG. 11(b) for example, the image processing unit 24 applies, as the processing for emphasizing the characteristic regions B1 to B5, processing that maintains the display of the characteristic regions B1 to B5 and reduces the luminance of the other regions. The processing for emphasizing the characteristic regions may be other processing, including that described in the first embodiment.

注視位置情報を用いて表示用画像にさらなる加工処理を適用する場合、Ｓ９４で画像処理部２４は、例えば図１１（ｃ）に示すように、（マーカｐ３で示される）注視位置を含んだ特徴領域Ｂ２に、予め記憶された付随情報ＣＭ１を重畳表示することができる。付随情報ＣＭ１に特に制限は無く、絵画であれば例えば絵画の名称、作者、制作年などの書誌的情報など、特徴領域の種類に応じた情報であってよい。なお、本実施形態では表示用画像データは予め用意されているため、画像における美術品の位置に関する情報や、美術品に関する付随情報についても予め用意しておくことができる。従って、画像処理部２４は注視位置に存在する美術品を特定し、その付随情報を取得することが可能である。 When further processing is applied to the display image using the gaze position information, in S94 the image processing unit 24 can superimpose accompanying information CM1 stored in advance on the characteristic region B2 that includes the gaze position (indicated by marker p3), as shown in FIG. 11(c) for example. The accompanying information CM1 is not particularly limited and may be information corresponding to the type of characteristic region; for a painting, for example, bibliographic information such as the title, the artist, and the year of production. Note that, in this embodiment, since the display image data is prepared in advance, information on the positions of the works of art in the image and accompanying information on the works of art can also be prepared in advance. Therefore, the image processing unit 24 can identify the work of art present at the gaze position and acquire its accompanying information.
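
The lookup of accompanying information for the work of art at the gaze position can be sketched as follows. This is a minimal sketch that assumes the positions of the works of art are stored as rectangles together with their bibliographic information; the `Artwork` record, the function `info_at_gaze`, and all sample values are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Artwork:
    """Hypothetical record prepared in advance together with the display image data."""
    region: Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive
    title: str
    artist: str
    year: str


def info_at_gaze(gaze: Tuple[int, int], artworks: List[Artwork]) -> Optional[str]:
    """Return the text to superimpose for the artwork containing the gaze position."""
    gx, gy = gaze
    for art in artworks:
        left, top, right, bottom = art.region
        if left <= gx < right and top <= gy < bottom:
            return f"{art.title} / {art.artist} / {art.year}"
    return None  # the gaze position is not on any registered artwork


# Usage with placeholder entries (not real bibliographic data).
catalog = [
    Artwork((0, 0, 100, 80), "Title B1", "Artist B1", "Year B1"),
    Artwork((120, 0, 220, 80), "Title B2", "Artist B2", "Year B2"),
]
overlay_text = info_at_gaze((150, 40), catalog)
```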

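ここでは、注視位置に存在する美術品の付随情報を追加表示してさらに強調するようにしたが、注視位置に存在する美術品の拡大映像を重畳するなど、他の方法で強調するようにしてもよい。 Here, the accompanying information of the work of art present at the gaze position is additionally displayed for further emphasis, but the work of art may be emphasized by other methods, such as superimposing an enlarged image of the work of art present at the gaze position.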

以上説明したように、本実施形態では、第１の実施形態で説明した加工処理に加え、注視位置を考慮した加工処理を適用することにより、ユーザが興味を有している可能性の高い特徴領域についてより効果的に強調することができる。そのため、ユーザが所望の被写体を素早く注視することを支援したり、より没入感のあるＸＲ体験を提供したりすることが可能になる。 As described above, in this embodiment, applying processing that takes the gaze position into account, in addition to the processing described in the first embodiment, makes it possible to emphasize more effectively the characteristic region in which the user is likely to be interested. Therefore, it is possible to assist the user in quickly gazing at a desired subject and to provide a more immersive XR experience.

●（第３実施形態） ● (Third Embodiment)
次に、第３実施形態について説明する。視線入力機能はユーザの視覚を用いる機能であるが、ユーザの視覚特性には個人差がある。そのため、本実施形態では、表示用画像データに対してユーザの視覚特性を考慮した加工処理を適用することにより、視線入力機能の使い勝手を向上する。 Next, a third embodiment will be described. The line-of-sight input function is a function that uses the user's vision, and there are individual differences in users' visual characteristics. Therefore, in this embodiment, the usability of the line-of-sight input function is improved by applying, to the display image data, processing that takes the user's visual characteristics into account.

視覚特性の個人差の例としては、 Examples of individual differences in visual characteristics include:
（１）明るさの違いが識別できる輝度範囲（ダイナミックレンジ）の個人差 (1) Individual differences in the luminance range (dynamic range) within which differences in brightness can be discerned
（２）中心視（注視点の周囲１～２°）や有効視野（中心視の周囲４～２０°）の個人差 (2) Individual differences in central vision (1 to 2 degrees around the point of gaze) and in the effective visual field (4 to 20 degrees around central vision)
（３）色相差の認識能力の個人差 (3) Individual differences in the ability to recognize hue differences
などがある。これらの個人差は、先天的に、また後天的に（典型的には加齢によって）生じうる。 These individual differences can be congenital or acquired (typically due to aging).

したがって、本実施形態ではこれら（１）～（３）の個人差を反映した視覚情報をユーザごとに登録し、視覚情報を反映した加工処理を表示用画像データに適用することにより、個々のユーザにとって利用しやすい視線入力機能を提供する。 Therefore, in this embodiment, visual information reflecting the individual differences (1) to (3) is registered for each user, and processing reflecting the visual information is applied to the display image data, thereby providing a line-of-sight input function that is easy for each individual user to use.

以下、視覚情報を取得するための具体的なキャリブレーション機能の例について説明する。キヤリブレーション機能は、例えばメニュー画面を通じてユーザから実行が指示された場合や、ユーザの視覚特性が登録されていない場合にシステム制御回路５０が実行することができる。 A specific example of a calibration function for acquiring visual information will be described below. The calibration function can be executed by the system control circuit 50, for example, when the execution is instructed by the user through a menu screen or when the user's visual characteristics are not registered.

（１）の輝度ダイナミックレンジは、ユーザが不快に感じない最大輝度と最低輝度の範囲とすることができる。例えば、システム制御回路５０は、図１２（ａ）に示すような、最大輝度から最低輝度までを所定の階調数で表した無彩色のグラデーションチャートを表示部２８に表示させる。そして、ユーザが不快に感じない輝度範囲を、例えば操作部７０の操作を通じて選択させる。ユーザは、例えば４方向キーの上下キーを用いてバー１２０１の上端および下端の位置を調整し、眩しく感じない最大輝度と、隣接する階調との差が識別できる（あるいは暗すぎると感じない）最小輝度とを設定することができる。 The luminance dynamic range of (1) can be the range between the maximum luminance and the minimum luminance that the user does not find uncomfortable. For example, the system control circuit 50 causes the display unit 28 to display an achromatic gradation chart that represents the range from the maximum luminance to the minimum luminance with a predetermined number of gradations, as shown in FIG. 12(a). Then, the user is asked to select, for example by operating the operation unit 70, a luminance range that does not feel uncomfortable. The user can adjust the positions of the upper and lower ends of the bar 1201 using, for example, the up and down keys of the four-way key, and thereby set the maximum luminance that does not feel dazzling and the minimum luminance at which the difference from the adjacent gradation can still be discerned (or which does not feel too dark).

システム制御回路５０は、例えばセット（決定）ボタンが押下された際のバー１２０１の上端および下端の位置に基づいて、ユーザに対して使用しないことが好ましい輝度範囲ＫＨおよびＫＬを登録する。なお、バー１２０１の上端および下端の位置に相当する輝度を登録してもよい。 The system control circuit 50 registers luminance ranges KH and KL that are preferably not used for the user, based on the positions of the upper and lower ends of the bar 1201 when the set (determine) button is pressed, for example. It should be noted that luminance corresponding to the positions of the upper end and the lower end of the bar 1201 may be registered.

あるいは、システム制御回路５０は、例えば４方向キーの上キーの押下に応じて画面全体の輝度を増加させ、下キーの押下に応じて画面全体の輝度を低下させるようにして、ユーザに最大輝度と最小輝度を設定させてもよい。そして、システム制御回路５０は、ユーザに、眩しいと感じない最大の輝度で表示されている状態にしてセットボタンを押下するように促す。そして、システム制御回路５０は、セットボタンの押下を検出した際の表示輝度を最大輝度として登録する。また、システム制御回路５０は、ユーザに、隣接する階調との差が識別できる（あるいは暗すぎると感じない）最小の輝度で表示されている状態にしてセットボタンを押下するように促す。そして、システム制御回路５０は、セットボタンの押下を検出した際の表示輝度を最小輝度として登録する。この場合も、最大輝度と最小輝度の代わりに、ユーザに対して使用しないことが好ましい高輝度側の輝度範囲ＫＨおよび低輝度側の輝度範囲ＫＬを登録してもよい。 Alternatively, the system control circuit 50 may let the user set the maximum and minimum luminance by, for example, increasing the luminance of the entire screen when the up key of the four-way key is pressed and decreasing it when the down key is pressed. In that case, the system control circuit 50 prompts the user to press the set button while the screen is displayed at the maximum luminance that does not feel dazzling, and registers the display luminance at the time the press of the set button is detected as the maximum luminance. The system control circuit 50 also prompts the user to press the set button while the screen is displayed at the minimum luminance at which the difference from the adjacent gradation can be discerned (or which does not feel too dark), and registers the display luminance at the time the press of the set button is detected as the minimum luminance. In this case as well, instead of the maximum and minimum luminance, a luminance range KH on the high-luminance side and a luminance range KL on the low-luminance side that are preferably not used for the user may be registered.
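
The registration based on the gradation chart and the bar 1201 can be sketched as follows. This is only a sketch under assumptions: the chart is represented as a list of luminance values ordered from brightest to darkest, the bar is represented by the indices of its upper and lower ends, and the result is returned as a plain dictionary. The function name and data layout are hypothetical.

```python
from typing import Dict, List


def register_luminance_range(levels: List[int], top_index: int, bottom_index: int) -> Dict[str, object]:
    """Derive the usable luminance range and the ranges KH/KL from the bar position.

    levels: luminance of each gradation step, brightest first (assumed layout).
    top_index / bottom_index: steps at the upper and lower ends of the bar.
    """
    if not 0 <= top_index <= bottom_index < len(levels):
        raise ValueError("bar ends must lie on the chart, upper end first")
    return {
        "max": levels[top_index],         # brightest step that does not feel dazzling
        "min": levels[bottom_index],      # darkest step that is still distinguishable
        "KH": levels[:top_index],         # steps brighter than the bar (preferably unused)
        "KL": levels[bottom_index + 1:],  # steps darker than the bar (preferably unused)
    }


# Usage with a hypothetical 9-step chart.
chart = [255, 224, 192, 160, 128, 96, 64, 32, 0]
profile = register_luminance_range(chart, top_index=1, bottom_index=6)
```

Registering either the pair of limits or the two excluded ranges carries the same information in this sketch, which matches the two alternatives mentioned in the description.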

輝度ダイナミックレンジに関するユーザの視覚特性は、輝度の調整が必要か否かの判定や、輝度調整時のパラメータの決定に用いることができる。 The user's visual characteristics regarding luminance dynamic range can be used to determine whether or not luminance adjustment is necessary, and to determine parameters for luminance adjustment.

（２）の有効視野は、中心視を含む、情報を識別可能な範囲である。有効視野は例えばUseful Field of View (UFOV)と呼ばれる視野であってよい。システム制御回路５０は、例えば図１２（ｂ）に示すような、比較的細かなパターンを背景として、大きさが可変である円１２０２が表示される画像を表示部２８に表示させる。そして、円１２０２の中心を注視した状態で、背景のパターンが鮮明に判別できる範囲に円１２０２の大きさを調節するようにユーザに促す。ユーザは、背景パターンを鮮明に認識できる最大の範囲に対応するように例えば４方向キーの上下キーを用いて円１２０２の大きさを調整し、セットキーを押下することにより、有効視野の大きさを設定することができる。システム制御回路５０は、上下キーの押下を検出すると円１２０２の大きさを変更し、セットキーの押下を検出すると、その時点の円１２０２の大きさに応じた有効視野の範囲を登録する。 The effective visual field of (2) is the range, including central vision, within which information can be identified. The effective visual field may be, for example, the field of view called the Useful Field of View (UFOV). The system control circuit 50 causes the display unit 28 to display an image in which a circle 1202 of variable size is shown against a background of a relatively fine pattern, as shown in FIG. 12(b) for example. Then, the user is prompted to adjust the size of the circle 1202, while gazing at the center of the circle 1202, to the range in which the background pattern can be clearly discerned. The user adjusts the size of the circle 1202 using, for example, the up and down keys of the four-way key so that it corresponds to the maximum range in which the background pattern can be clearly recognized, and presses the set key to set the size of the effective visual field. The system control circuit 50 changes the size of the circle 1202 when it detects a press of the up or down key and, when it detects a press of the set key, registers the range of the effective visual field corresponding to the size of the circle 1202 at that time.

有効視野に関するユーザの視覚特性は、注視範囲の抽出に用いることができる。 The user's visual characteristics regarding the effective field of view can be used to extract the gaze range.

（３）は、同系色において違いが認識できる色相差の大きさである。システム制御回路５０は、例えば図１２（ｃ）に示すような、同系色で色相を徐々に変化させた複数の色見本を選択可能に並べた画像を表示部２８に表示させる。ここで表示する色見本は、例えば緑、黄、青のような、被写体の背景で大きな面積を占めることがある色とすることができる。また、緑系統、黄系統、青系統など、複数の色系統について情報を取得してもよい。 (3) is the magnitude of the hue difference at which a difference can be recognized between similar colors. The system control circuit 50 causes the display unit 28 to display an image in which a plurality of color samples of similar colors with gradually changing hues are arranged in a selectable manner, for example, as shown in FIG. 12(c). The color samples displayed here can be colors such as green, yellow, and blue, which may occupy a large area in the background of the subject. Also, information may be obtained for a plurality of color systems such as green, yellow, and blue.

図１２（ｃ）は色鉛筆のイメージを用いて色見本を並べているが、短冊状の色見本などであってもよい。左端の色鉛筆が基準色であり、右方向に色相を一定量ずつ異らせた色見本を並べている。システム制御回路５０は、左端の色鉛筆と色が異なると認識できる、色鉛筆のうち、一番左の色鉛筆を選択するようユーザに促す。ユーザは、例えば４方向キーの左右キーを用いて該当する色鉛筆を選択し、セットキーを押下する。システム制御回路５０は、左右キーの押下を検出すると選択状態の色鉛筆を移動させ、セットキーの押下を検出すると、その時点で選択状態にある色鉛筆に対応する色相と、基準色の色相との差を、ユーザが認識可能な最小の色相差として登録する。複数の色系統について情報を登録する場合には、色系統ごとに同じ動作を繰り返し実行する。 In FIG. 12(c), the color samples are arranged using images of colored pencils, but strip-shaped color samples or the like may be used. The leftmost colored pencil is the reference color, and color samples whose hue differs by a fixed amount at each step are arranged to its right. The system control circuit 50 prompts the user to select, among the colored pencils whose color can be recognized as different from the leftmost colored pencil, the one furthest to the left. The user selects the corresponding colored pencil using, for example, the left and right keys of the four-way key, and presses the set key. When the system control circuit 50 detects a press of the left or right key, it moves the selection to another colored pencil; when it detects a press of the set key, it registers the difference between the hue of the colored pencil selected at that time and the hue of the reference color as the minimum hue difference recognizable by the user. When information is registered for a plurality of color systems, the same operation is repeated for each color system.
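
The registration of the minimum recognizable hue difference can be sketched as follows. This is a sketch under assumptions not stated in the description: hues are expressed in degrees on a 0 to 360 color wheel, the first entry of the sample list is the reference color, and the hue difference is measured as the shorter way around the wheel. The function names are hypothetical.

```python
from typing import List


def hue_difference(h1: float, h2: float) -> float:
    """Absolute hue difference in degrees, measured the short way around the wheel."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


def register_min_hue_difference(sample_hues: List[float], selected_index: int) -> float:
    """Return the hue difference between the selected sample and the reference sample.

    sample_hues[0] is the reference color; selected_index is the leftmost sample
    that the user can distinguish from it.
    """
    if not 1 <= selected_index < len(sample_hues):
        raise ValueError("the selected sample must be one of the non-reference samples")
    return hue_difference(sample_hues[selected_index], sample_hues[0])


# Usage: hypothetical green-family samples stepping by 3 degrees of hue.
green_samples = [120.0, 123.0, 126.0, 129.0, 132.0]
min_diff_green = register_min_hue_difference(green_samples, selected_index=3)
```

When several color systems are registered, the same function could be called once per system and the results stored per system.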

色相差の認識能力に関するユーザの視覚特性は、色相の調整が必要か否かの判定や、色相調整時のパラメータの決定に用いることができる。 The user's visual characteristics related to hue difference recognition ability can be used to determine whether hue adjustment is necessary or not, and to determine parameters for hue adjustment.

上述した、個人差のある視覚特性（１）～（３）および、視覚特性（１）～（３）に関するユーザの固有情報の取得方法は単なる例示である。他の視覚特性に関してユーザの情報を登録すること、および／または視覚得性（１）～（３）に関する情報を他の方法で登録することができる。 The visual characteristics (1) to (3) that vary between individuals, and the methods of acquiring user-specific information on the visual characteristics (1) to (3), described above are merely examples. User information may be registered for other visual characteristics, and/or information on the visual characteristics (1) to (3) may be registered by other methods.

次に、登録したユーザの視覚特性（１）～（３）を用いた加工処理の具体例について説明する。なお、視覚特性を複数のユーザに対して登録可能な場合には、例えば設定画面を通じて選択されているユーザに関する視覚特性を用いる。 Next, a specific example of processing using the registered user's visual characteristics (1) to (3) will be described. Note that when visual characteristics can be registered for a plurality of users, the visual characteristics related to the user selected through the setting screen, for example, are used.

図１３（ａ）は、逆光状態で高輝度な空を背景にして、複数の飛行機Ｅ１が存在するシーンを示している。このように背景が高輝度な場合、ユーザの視覚特性によっては背景が眩しく、飛行機Ｅ１を注視するのが難しくなる。 FIG. 13A shows a scene in which a plurality of airplanes E1 are present against a background of a backlit sky with high brightness. If the background has such high luminance, the background may be dazzling depending on the user's visual characteristics, making it difficult to gaze at the airplane E1.

このような状況に対処するため、画像処理部２４は、視線入力機能が有効な際に表示用画像データを生成する場合、背景の輝度値（例えば平均輝度値）がユーザの視覚特性（輝度ダイナミックレンジ）に適切か否かを判定することができる。画像処理部２４は、背景の輝度値がユーザの輝度ダイナミックレンジを外れている場合（図１２における輝度範囲ＫＨに含まれる場合）には、輝度がユーザの視覚特性に対して適切でないと判定する。そして、画像処理部２４は、画像の背景領域の輝度値がユーザの輝度ダイナミックレンジ内（図１２のバー１２０１で表される輝度範囲）に含まれるように輝度を低下させる加工処理を表示画像データに適用する。 To deal with such a situation, when generating display image data while the line-of-sight input function is enabled, the image processing unit 24 can determine whether the luminance value of the background (for example, the average luminance value) is appropriate for the user's visual characteristics (luminance dynamic range). When the luminance value of the background is outside the user's luminance dynamic range (when it falls within the luminance range KH in FIG. 12), the image processing unit 24 determines that the luminance is not appropriate for the user's visual characteristics. Then, the image processing unit 24 applies to the display image data processing that lowers the luminance so that the luminance value of the background region of the image falls within the user's luminance dynamic range (the luminance range represented by the bar 1201 in FIG. 12).

図１３（ｂ）は背景領域の輝度を低減する加工処理を適用した状態を模式的に示している。Ｍ１は主被写体領域である。画像のうち、主被写体領域Ｍ１を除いた領域を背景領域とする。ここでは、画像処理部２４は、ユーザの注視位置から一定範囲に存在する特徴領域（ここでは飛行機）を包含する大きさの領域を主被写体領域Ｍ１として、背景領域と分離している。なお、主被写体領域の大きさは、ユーザの有効視野の大きさとしてもよい。また、ユーザの注視位置に基づく主被写体領域の決定は他の方法に基づいてもよい。 FIG. 13(b) schematically shows a state in which the processing for reducing the brightness of the background area is applied. M1 is the main subject area. A region of the image excluding the main subject region M1 is assumed to be a background region. Here, the image processing unit 24 separates the main subject area M1 from the background area by defining an area having a size that includes a characteristic area (here, an airplane) existing within a certain range from the gaze position of the user. Note that the size of the main subject area may be the size of the user's effective field of view. Also, determination of the main subject area based on the gaze position of the user may be based on other methods.

なお、ユーザの輝度ダイナミックレンジに適した輝度に調整するための加工処理を適用する場合、目標とする輝度値は輝度ダイナミックレンジ内で適宜定めることができる。例えば、輝度ダイナミックレンジの中央値としてもよい。なお、ここでは背景領域の輝度値を調整（補正）する加工処理についてのみ説明したが、主被写体領域の輝度値についても同様に調整することができる。なお、背景領域と主被写体領域の両方について輝度を調整する場合、背景領域よりも主被写体領域の目標輝度が高くなるようにすることで、主被写体領域の視認性を向上することができる。 It should be noted that in the case of applying the processing for adjusting the luminance to suit the luminance dynamic range of the user, the target luminance value can be appropriately determined within the luminance dynamic range. For example, it may be the median value of the luminance dynamic range. Although only the processing for adjusting (correcting) the luminance value of the background area has been described here, the luminance value of the main subject area can be similarly adjusted. When adjusting the brightness of both the background area and the main subject area, the visibility of the main subject area can be improved by setting the target brightness of the main subject area higher than that of the background area.
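
The luminance adjustment of the background region can be sketched as follows. This is an illustrative sketch, not the disclosed implementation: the image is assumed to be a two-dimensional list of luminance values, the main subject region is assumed to be a rectangle, the target is the midpoint of the user's luminance dynamic range (one of the options mentioned above), and the adjustment is a uniform gain on the background pixels. Only the high-luminance case (range KH) is handled; the function name is hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def lower_background_luminance(
    luma: List[List[float]], subject: Rect, user_min: float, user_max: float
) -> List[List[float]]:
    """Scale background luminance toward the middle of the user's range if it is too bright."""
    left, top, right, bottom = subject
    background = [
        (x, y)
        for y, row in enumerate(luma)
        for x in range(len(row))
        if not (left <= x < right and top <= y < bottom)
    ]
    out = [list(row) for row in luma]
    if not background:
        return out
    mean = sum(luma[y][x] for x, y in background) / len(background)
    if mean <= user_max:
        return out  # background is not in the too-bright range; leave it unchanged
    target = (user_min + user_max) / 2.0  # midpoint of the user's dynamic range
    gain = target / mean
    for x, y in background:
        out[y][x] = luma[y][x] * gain
    return out


# Usage: a bright background (240) around a single subject pixel (100).
frame = [[240, 240, 240], [240, 100, 240]]
adjusted = lower_background_luminance(frame, (1, 1, 2, 2), user_min=40, user_max=200)
```

A uniform gain keeps the relative contrast within the background; other tone mappings would serve equally well for the purpose described here.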

図１４（ａ）は、主被写体を見失いやすいシーンの一例として、例えば集団競技や遊戯のように、類似かつ多数の被写体が様々な方向に移動するシーンを示している。図１４（ａ）において、ユーザが意図している主被写体がＥ２であるとする。 FIG. 14(a) shows, as an example of a scene in which the main subject is easily lost, a scene in which a large number of similar subjects move in various directions, such as a team sport or a game. In FIG. 14(a), it is assumed that the main subject intended by the user is E2.
ユーザが主被写体Ｅ２を見失い、主被写体がユーザの有効視野から外れると、主被写体が他の被写体と同様にボケて認識されるため、一層区別が付きにくくなる。 If the user loses sight of the main subject E2 and the main subject moves out of the user's effective visual field, the main subject is perceived as blurred like the other subjects, making it even more difficult to distinguish.

このような状況に対処するため、画像処理部２４は、図１４（ｂ）に示すように、主被写体領域Ｍ２以外の領域（背景領域）の解像度を低下させる（ぼかす）加工処理を適用する。これにより、主被写体領域Ｍ２の鮮鋭度が相対的に高まるため、仮にユーザが主被写体Ｅ２を見失ったとしても、容易に見つけることができる。主被写体領域Ｍ２は、輝度調整に関して説明した方法と同様にして決定することができる。 In order to cope with such a situation, the image processing unit 24 applies a process of lowering (blurring) the resolution of the area (background area) other than the main subject area M2, as shown in FIG. 14(b). As a result, since the sharpness of the main subject area M2 is relatively increased, even if the user loses sight of the main subject E2, he/she can easily find it. The main subject area M2 can be determined in the same manner as described for brightness adjustment.
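
The resolution-lowering (blurring) processing applied outside the main subject region can be sketched as follows. This is only a sketch: a simple box blur on a grayscale image stands in for whatever filter is actually used, the main subject region is assumed to be a rectangle, and the function name and the `radius` parameter are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def box_blur_outside(gray: List[List[float]], subject: Rect, radius: int = 1) -> List[List[float]]:
    """Blur every pixel outside the subject rectangle with a box filter; keep the subject sharp."""
    height, width = len(gray), len(gray[0])
    left, top, right, bottom = subject
    out = [list(row) for row in gray]
    for y in range(height):
        for x in range(width):
            if left <= x < right and top <= y < bottom:
                continue  # main subject region: leave untouched
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [gray[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)  # mean of the (clipped) neighborhood
    return out


# Usage: a 3x3 image whose center pixel is the main subject.
image = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
blurred = box_blur_outside(image, (1, 1, 2, 2))
```

In this sketch the blur window also samples subject pixels near the boundary, so some subject color bleeds into the adjacent background; a mask-aware filter would avoid this if it mattered.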

なお、主被写体領域の大きさが中心視野の範囲より大きい場合には、主被写体領域のうち、中心視野の範囲外の領域についても背景領域として加工処理を適用してもよい。このように、主被写体の鮮鋭度を相対的に高めることにより、ユーザの注意が自然に主被写体へ向かうため、結果的に注視位置に基づく被写体追尾を支援する効果も実現できる。 If the size of the main subject area is larger than the range of the central field of view, processing may be applied to the area outside the range of the central field of view in the main subject area as a background area. By relatively increasing the sharpness of the main subject in this way, the user's attention is naturally directed to the main subject, and as a result, it is possible to achieve the effect of supporting subject tracking based on the gaze position.

図１５（ａ）は、主被写体の輝度が低いことにより、ユーザが主被写体を認識しづらいシーンの一例として、暗い場所を移動している動物が主被写体であるシーンを示している。図１５（ａ）において、ユーザが意図している主被写体がＥ３であるとする。 FIG. 15A shows a scene in which the main subject is an animal moving in a dark place, as an example of a scene in which it is difficult for the user to recognize the main subject due to the low brightness of the main subject. In FIG. 15A, it is assumed that the main subject intended by the user is E3.

このような状況に対処するため、画像処理部２４は、視線入力機能が有効な際に表示用画像データを生成する場合、注視位置の周辺領域の輝度値（例えば平均輝度値）がユーザの視覚特性（輝度ダイナミックレンジ）に適切か否かを判定することができる。画像処理部２４は、注視位置の周辺領域の輝度値がユーザの輝度ダイナミックレンジを外れている場合（図１２における輝度範囲ＫＬに含まれる場合）には、輝度がユーザの視覚特性に対して適切でないと判定する。そして、画像処理部２４は、注視位置の周辺領域の輝度値がユーザの輝度ダイナミックレンジ内（図１２のバー１２０１で表される輝度範囲）に含まれるように輝度を上昇させる加工処理を表示画像データに適用する。図１５（ｂ）は、注視位置の周辺領域Ｍ３の輝度を上昇させる加工処理を適用した状態を模式的に示している。なお、注視位置の周辺領域は、例えば有効視野に対応する領域としてもよいし、注視位置を含む特徴領域や、追尾用のテンプレートとして用いる領域などとしてもよい。 To deal with such a situation, when generating display image data while the line-of-sight input function is enabled, the image processing unit 24 can determine whether the luminance value (for example, the average luminance value) of the region around the gaze position is appropriate for the user's visual characteristics (luminance dynamic range). When the luminance value of the region around the gaze position is outside the user's luminance dynamic range (when it falls within the luminance range KL in FIG. 12), the image processing unit 24 determines that the luminance is not appropriate for the user's visual characteristics. Then, the image processing unit 24 applies to the display image data processing that raises the luminance so that the luminance value of the region around the gaze position falls within the user's luminance dynamic range (the luminance range represented by the bar 1201 in FIG. 12). FIG. 15(b) schematically shows a state in which processing for raising the luminance of the region M3 around the gaze position has been applied. The region around the gaze position may be, for example, a region corresponding to the effective visual field, a characteristic region including the gaze position, or a region used as a template for tracking.

ここで、画像全体ではなく、注視位置の周辺領域についてのみ輝度を調整（上昇）させるのは、暗いシーンの画像の輝度を上昇させるとノイズ成分によって画像の視認性が低下するためである。画面全体の輝度を上昇させると、ノイズの影響でフレーム間における移動被写体の検出精度が低下しやすくなる。また、ノイズが画面全体で視認されるようになると、ノイズのチラツキによってユーザの目が疲労しやすくなる。 Here, the reason why the brightness is adjusted (increased) only for the area around the gaze position, not for the entire image, is that if the brightness of the image in a dark scene is increased, the visibility of the image is reduced due to noise components. When the brightness of the entire screen is increased, the accuracy of detecting a moving object between frames tends to decrease due to the influence of noise. In addition, when noise is visible on the entire screen, the user's eyes are likely to get tired due to the flickering of the noise.

なお、シーンが暗い場合、注視位置に主被写体が存在しないことも十分考えられる。そのため、注視位置が安定するまでは画面全体の輝度を上昇させ、注視位置が安定したら注視位置の周辺領域以外の領域については輝度を元に戻す（加工処理を適用しないようにする）ようにしてもよい。システム制御回路５０は、例えば注視位置の移動量が一定時間にわたって閾値以下であれば、注視位置が安定したと判定することができる。 Note that when the scene is dark, it is quite possible that the main subject is not at the gaze position. Therefore, the luminance of the entire screen may be increased until the gaze position stabilizes and, once it has stabilized, the luminance of the regions other than the region around the gaze position may be restored (so that the processing is no longer applied to them). The system control circuit 50 can determine that the gaze position has stabilized if, for example, the amount of movement of the gaze position remains at or below a threshold for a certain period of time.
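
The stability determination for the gaze position can be sketched as follows. This is a sketch under assumptions: gaze samples are given as (time in milliseconds, x, y) tuples in chronological order, the "amount of movement" is taken to be the distance between consecutive samples, and the gaze is treated as not yet stable until the history covers the whole observation window. The function name and parameters are hypothetical.

```python
import math
from typing import List, Tuple

Sample = Tuple[int, float, float]  # (time in ms, x, y), oldest first


def is_gaze_stable(samples: List[Sample], max_step: float, window_ms: int) -> bool:
    """True if every step between consecutive samples in the last window_ms is <= max_step."""
    if not samples:
        return False
    t_end = samples[-1][0]
    if t_end - samples[0][0] < window_ms:
        return False  # not enough history to cover the whole window
    recent = [s for s in samples if s[0] >= t_end - window_ms]
    return all(
        math.hypot(b[1] - a[1], b[2] - a[2]) <= max_step
        for a, b in zip(recent, recent[1:])
    )


# Usage: small jitter around (100, 100) for 500 ms, then a large jump.
steady = [(0, 100, 100), (100, 101, 100), (200, 101, 101),
          (300, 100, 101), (400, 100, 100), (500, 101, 100)]
stable_now = is_gaze_stable(steady, max_step=2.0, window_ms=500)
stable_after_jump = is_gaze_stable(steady + [(600, 300, 300)], max_step=2.0, window_ms=500)
```

While the function returns False the whole screen could be brightened, and once it returns True the brightening could be limited to the region around the gaze position.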

図１６（ａ）は、主被写体と背景の色が似ており、主被写体を見失いやすいシーンの一例として、草むらを背景として類似色の鳥Ｅ４が移動しているシーンを示している。図１６（ａ）において、ユーザが意図している主被写体が鳥Ｅ４であるとする。ユーザが鳥Ｅ４を見失った場合、背景と鳥Ｅ４の色が類似しているために鳥Ｅ４を見つけづらい。 FIG. 16A shows a scene in which a similar-colored bird E4 is moving against a background of grass, as an example of a scene in which the main subject and the background are similar in color and the main subject is easy to lose sight of. In FIG. 16A, it is assumed that the main subject intended by the user is the bird E4. When the user loses sight of the bird E4, it is difficult to find the bird E4 because the color of the background and the bird E4 are similar.

そのため、画像処理部２４は、主被写体領域（鳥Ｅ４の領域）の色相と、少なくとも主被写体の周辺の背景領域の色相との差が、ユーザの視覚特性のうち色相差の認識能力に照らして適切であるか否かを判定することができる。そして、主被写体領域と背景領域の色相との差が、ユーザが認識できる色相の差以下である場合、画像処理部２４は不適切であると判定する。この場合、画像処理部２４は、主被写体領域とその周辺の背景領域との色相の差が、ユーザが認識できる色相の差より大きくなるように、主被写体領域の色相を変更する加工処理を表示画像データに適用する。図１６（ｂ）は主被写体領域Ｍ４の色相を変更する加工処理を適用した状態を模式的に示している。 Therefore, the image processing unit 24 can determine whether the difference between the hue of the main subject region (the region of the bird E4) and the hue of at least the background region around the main subject is appropriate in light of the user's ability to recognize hue differences. If the hue difference between the main subject region and the background region is less than or equal to the hue difference that the user can recognize, the image processing unit 24 determines that it is inappropriate. In this case, the image processing unit 24 applies to the display image data processing that changes the hue of the main subject region so that the hue difference between the main subject region and the surrounding background region becomes greater than the hue difference that the user can recognize. FIG. 16(b) schematically shows a state in which processing for changing the hue of the main subject region M4 has been applied.
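
The hue determination and hue change can be sketched as follows. This is an illustrative sketch only: hues are in degrees, each region is summarized by a single representative hue, the subject hue is pushed away from the background hue in the direction it already differs, and a hypothetical `margin` of one degree is added so that the result is strictly greater than the user's minimum recognizable difference. The per-pixel shift uses the HSV conversion of the Python standard library `colorsys` module.

```python
import colorsys
from typing import Tuple


def signed_hue_delta(h_from: float, h_to: float) -> float:
    """Shortest signed rotation in degrees from h_from to h_to, in (-180, 180]."""
    d = (h_to - h_from) % 360.0
    return d - 360.0 if d > 180.0 else d


def adjusted_subject_hue(subject_hue: float, background_hue: float,
                         min_diff: float, margin: float = 1.0) -> float:
    """Return a subject hue whose difference from the background exceeds min_diff."""
    delta = signed_hue_delta(background_hue, subject_hue)
    if abs(delta) > min_diff:
        return subject_hue % 360.0  # already distinguishable for this user
    direction = 1.0 if delta >= 0 else -1.0
    return (background_hue + direction * (min_diff + margin)) % 360.0


def shift_pixel_hue(rgb: Tuple[int, int, int], shift_deg: float) -> Tuple[int, int, int]:
    """Rotate the hue of one 8-bit RGB pixel by shift_deg degrees."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + shift_deg / 360.0) % 1.0
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


# Usage: a subject at hue 100 against a background at hue 98, user minimum 9 degrees.
new_hue = adjusted_subject_hue(100.0, 98.0, min_diff=9.0)
shift = signed_hue_delta(100.0, new_hue)  # rotation to apply to each subject pixel
```

The computed rotation would then be applied with `shift_pixel_hue` to every pixel of the main subject region, leaving the background unchanged.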

なお、ここで例示した加工処理に限らず、ユーザの視覚特性を利用した加工処理を適用することが可能である。また、主被写体領域や背景領域の輝度や色相に応じて複数の加工処理を組み合わせて適用することもできる。 Note that the processing is not limited to the examples given here; other processing that makes use of the user's visual characteristics can also be applied. A plurality of processing operations can also be applied in combination, according to the luminance and hue of the main subject area and the background area.

図１７は、本実施形態に係る表示画像データの生成動作に関するフローチャートである。この動作は、視線入力機能が有効である際に、注視位置の検出と並行して実行することができる。 FIG. 17 is a flowchart relating to the display image data generation operation according to the present embodiment. This operation can be executed in parallel with the gaze position detection when the line-of-sight input function is enabled.
Ｓ１７０１においてシステム制御回路５０は、撮像部２２により１フレームの画像を撮影し、Ａ／Ｄ変換器２３を通じて画像処理部２４にデジタル画像信号を供給する。 In S1701, the system control circuit 50 captures an image of one frame with the imaging unit 22 and supplies a digital image signal to the image processing unit 24 through the A/D converter 23.

Ｓ１７０２で画像処理部２４は、直近に検出された注視位置に基づいて、主被写体領域とする特徴領域を検出する。ここで、画像処理部２４は、第１実施形態で説明したように撮影モードから判定した種類の特徴領域を検出してから、注視位置を含む特徴領域、あるいは注視位置からの距離が最も近い特徴領域を主被写体領域としてもよい。 In S1702, the image processing unit 24 detects a feature area to serve as the main subject area, based on the most recently detected gaze position. Here, the image processing unit 24 may first detect feature areas of the type determined from the shooting mode, as described in the first embodiment, and then take as the main subject area the feature area that contains the gaze position or the feature area closest to the gaze position.
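
The selection rule in S1702 can be sketched as below. The `Region` type, its rectangular shape, and measuring distance to the region center are assumptions made for illustration; the description only states that the area containing the gaze position, or the area closest to it, may be used.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Region:
    """Hypothetical rectangular feature area (top-left corner plus size)."""
    kind: str
    x: float
    y: float
    w: float
    h: float

    def contains(self, px, py):
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h

    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def pick_main_subject(regions, gaze):
    """Return the region containing the gaze position, else the nearest one."""
    if not regions:
        return None
    gx, gy = gaze
    for region in regions:
        if region.contains(gx, gy):
            return region  # assumption: the first containing region wins
    # Assumption: "closest" is measured from the gaze to the region center.
    return min(regions, key=lambda r: hypot(r.center()[0] - gx, r.center()[1] - gy))
```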

Ｓ１７０３で画像処理部２４は、Ｓ１７０２で検出した特徴領域（主被写体領域）を抽出する。これにより、主被写体領域と他の領域（背景領域）とが分離される。 In S1703, the image processing unit 24 extracts the feature area (main subject area) detected in S1702. As a result, the main subject area and the other area (background area) are separated.

Ｓ１７０４で画像処理部２４は、例えば不揮発性メモリ５６に記憶されている、ユーザの視覚特性に関する情報を取得する。 In S1704, the image processing unit 24 acquires information about the user's visual characteristics stored in the nonvolatile memory 56, for example.

Ｓ１７０５で画像処理部２４は、主被写体領域と背景領域とについて、平均輝度や色相の差を算出する。そして、画像処理部２４は、算出した平均輝度や色相の差と、ユーザの視覚特性とを比較することにより、主被写体領域に加工処理を適用する必要があるか否かを判定する。上述したように、画像処理部２４は、主被写体の輝度や主被写体領域と背景領域との色相の差がユーザの視覚特性にとって適切でない場合には、主被写体領域に加工処理を適用する必要があると判定する。画像処理部２４は、主被写体領域に加工処理を適用する必要があると判定されればＳ１７０６を、判定されなければＳ１７０７を実行する。 In S1705, the image processing unit 24 calculates the average luminance and the difference in hue for the main subject area and the background area. The image processing unit 24 then compares the calculated average luminance and hue difference with the user's visual characteristics to determine whether processing needs to be applied to the main subject area. As described above, the image processing unit 24 determines that processing needs to be applied to the main subject area when the luminance of the main subject, or the difference in hue between the main subject area and the background area, is not appropriate for the user's visual characteristics. The image processing unit 24 executes S1706 if it determines that processing needs to be applied to the main subject area, and executes S1707 otherwise.

Ｓ１７０６で画像処理部２４は、適切でないと判定された内容に応じた加工処理を主被写体領域に適用したのち、Ｓ１７０７を実行する。 In S1706, the image processing unit 24 applies processing to the main subject area according to the content determined as inappropriate, and then executes S1707.

Ｓ１７０７で画像処理部２４は、Ｓ１７０５と同様にして、他の領域（背景領域）に加工処理を適用する必要があるか否かを判定する。画像処理部２４は、背景領域に加工処理を適用する必要があると判定されればＳ１７０８を実行する。また、背景領域に加工処理を適用する必要があると判定されなければ、画像処理部２４はＳ１７０１を実行し、次フレームについての動作を開始する。 In S1707, the image processing unit 24 determines, in the same manner as in S1705, whether processing needs to be applied to the other area (background area). The image processing unit 24 executes S1708 if it determines that processing needs to be applied to the background area. Otherwise, the image processing unit 24 executes S1701 and starts the operation for the next frame.

Ｓ１７０８で画像処理部２４は、適切でないと判定された内容に応じた加工処理を背景領域に適用したのち、Ｓ１７０１を実行する。 In S1708, the image processing unit 24 executes S1701 after applying processing to the background area according to the content determined to be inappropriate.

なお、ユーザの視覚特性に対して何が適切でないかに応じて、主被写体領域と背景領域のそれぞれについてどのような加工処理を適用するのかは、予め定めておくことができる。したがって、Ｓ１７０５における判定結果に応じて、主被写体領域だけに加工処理を適用するのか、背景領域だけに加工処理を適用するのか、主被写体領域と背景領域の両方に加工処理を適用するのかと、適用する処理の内容が特定される。 Note that the processing to be applied to each of the main subject area and the background area can be determined in advance according to what is not appropriate for the user's visual characteristics. Therefore, the determination result in S1705 specifies whether processing is applied to the main subject area only, to the background area only, or to both, as well as the content of the processing to be applied.
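
One way to picture such a predetermined correspondence is a lookup table from each detected problem to the processing applied per area. The sketch below is purely illustrative: the issue names, the processing names, and the `traits` keys (`min_luma`, `max_luma`, `min_hue_diff_deg`) are hypothetical and are not taken from the description.

```python
# Hypothetical rule table: each issue maps to (area, processing) pairs.
RULES = {
    "subject_too_dark": [("subject", "raise_luminance")],
    "subject_too_bright": [("subject", "lower_luminance")],
    "hue_too_close": [("subject", "shift_hue")],
    "background_too_bright": [("background", "lower_luminance")],
}


def find_issues(subject_luma, background_luma, hue_diff_deg, traits):
    """Compare measured values with the user's visual characteristics."""
    issues = []
    if subject_luma < traits["min_luma"]:
        issues.append("subject_too_dark")
    elif subject_luma > traits["max_luma"]:
        issues.append("subject_too_bright")
    if hue_diff_deg <= traits["min_hue_diff_deg"]:
        issues.append("hue_too_close")
    if background_luma > traits["max_luma"]:
        issues.append("background_too_bright")
    return issues


def plan_processing(issues):
    """Resolve the issues into the processing to apply to each area."""
    plan = {"subject": [], "background": []}
    for issue in issues:
        for area, process in RULES[issue]:
            plan[area].append(process)
    return plan
```

An empty list for an area corresponds to skipping S1706 or S1708 for that area.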

以上説明したように、本実施形態によれば、視線入力機能が有効であるときは、ユーザの視覚特性を考慮した加工処理を適用して表示用画像データを生成するようにした。そのため、個々のユーザの視覚特性に対して適切な表示用画像データを生成することができ、よりユーザにとって使いやすい視線入力機能を提供することができる。 As described above, according to the present embodiment, when the line-of-sight input function is effective, the display image data is generated by applying the processing that takes into consideration the user's visual characteristics. Therefore, display image data suitable for the visual characteristics of individual users can be generated, and a user-friendly line-of-sight input function can be provided.

なお、第１実施形態で説明した、視線による主被写体の選択をし易くするための加工処理と、本実施形態で説明した、ユーザの視覚特性に適した画像にするための加工処理とは、組み合わせて適用することもできる。 Note that the processing for making it easier to select the main subject based on the line of sight described in the first embodiment and the processing for making an image suitable for the user's visual characteristics described in the present embodiment are: They can also be applied in combination.

●（第４実施形態） ● (Fourth Embodiment)
次に、第４実施形態について説明する。本実施形態は、撮像装置１の構成要素を内蔵したＸＲゴーグル（頭部装着型の表示装置もしくはＨＭＤ）を用いて体験する仮想空間の視認性向上に関する。ＸＲゴーグルを通じて視認する仮想空間の画像は、仮想空間ごとに予め用意された表示用画像データをＸＲゴーグルの向きや姿勢に応じて描画することにより生成される。表示用画像データは記録媒体２００に予め記憶されていてもよいし、外部装置から取得してもよい。 Next, a fourth embodiment will be described. The present embodiment relates to improving the visibility of a virtual space experienced using XR goggles (a head-mounted display device or HMD) incorporating the components of the imaging device 1. An image of the virtual space viewed through the XR goggles is generated by drawing display image data prepared in advance for each virtual space according to the orientation and posture of the XR goggles. The display image data may be pre-stored in the recording medium 200, or may be obtained from an external device.

ここでは例として体験モード「ダイビング」および「美術館」を仮想空間で提供するための表示データが記録媒体２００に記憶されているものとする。しかし、提供する仮想空間の種類および数に特段の制限はない。 Here, as an example, it is assumed that the display data for providing the experience modes “diving” and “art museum” in the virtual space is stored in the recording medium 200 . However, there are no particular restrictions on the types and number of virtual spaces to be provided.

体験モード「ダイビング」および「美術館」を提供するための仮想空間画像の例を図１８（ａ）および（ｂ）に模式的に示す。ここでは、説明および理解を容易にするため、仮想空間の全体がＣＧ画像で表現されるものとする。したがって、強調表示する主被写体は、ＣＧ画像の一部である。仮想空間画像に含まれる主被写体を強調表示することにより、主被写体の視認性を向上させることができる。主被写体は少なくとも初期状態において撮像装置１（システム制御回路５０）が設定する。撮像装置１が設定した主被写体はユーザが変更してもよい。 Examples of virtual space images for providing the experience modes "diving" and "museum" are schematically shown in FIGS. 18(a) and (b). Here, in order to facilitate explanation and understanding, it is assumed that the entire virtual space is represented by a CG image. Therefore, the main subject to be highlighted is part of the CG image. By highlighting the main subject included in the virtual space image, the visibility of the main subject can be improved. The main subject is set by the imaging apparatus 1 (system control circuit 50) at least in the initial state. The main subject set by the imaging device 1 may be changed by the user.

なお、例えばビデオシースルー型のＨＭＤのように、現実空間を撮影した画像にＣＧを仮想空間画像として重畳した合成画像を表示する場合、強調表示する主被写体領域（特徴領域）は実写画像部分に含まれる場合もあれば、ＣＧ部分に含まれる場合もある。 For example, when displaying a composite image in which CG is superimposed as a virtual space image on an image of the real space, such as a video see-through HMD, the main subject region (feature region) to be highlighted is included in the actual image portion. In some cases, it is included in the CG part.

図１９（ａ）および（ｂ）はそれぞれ、図１８（ａ）および（ｂ）に示すシーンに対して主被写体を強調する加工処理を適用した例を模式的に示した図である。ここでは、主被写体以外の彩度を低減することによって主被写体を強調し、主被写体の視認性を向上させている。なお、加工処理は他の方法で主被写体を強調してもよい。 FIGS. 19(a) and 19(b) are diagrams schematically showing an example of applying processing for emphasizing the main subject to the scenes shown in FIGS. 18(a) and 18(b), respectively. Here, the main subject is emphasized by reducing the chroma saturation of the subjects other than the main subject, and the visibility of the main subject is improved. It should be noted that the main subject may be emphasized by another method for processing.

図１９に示す例は、主被写体の領域は加工せず、他の領域を目立たなくなるようにする加工処理である。このほか、主被写体を強調し、他の領域は加工しない加工処理や、主被写体を強調し、他の領域を目立たなくする加工処理であってもよい。あるいは、画像全体を加工して対象物を強調する加工処理であってもよいし、他の方法で主被写体の領域を強調する加工処理であってもよい。 The example shown in FIG. 19 is processing for not processing the area of the main subject and making the other areas inconspicuous. In addition, processing may be processing that emphasizes the main subject and does not process other areas, or processing that emphasizes the main subject and makes other areas inconspicuous. Alternatively, the processing may be processing that emphasizes the object by processing the entire image, or processing that emphasizes the area of the main subject by another method.
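
The FIG. 19 style of processing, in which the main subject area is left untouched and the other areas are made less conspicuous by reducing saturation, could look roughly like the following. This is a sketch under assumptions: the image is a nested list of 8-bit RGB tuples, `mask` marks main subject pixels, and `saturation_scale` is a hypothetical strength parameter. Saturation is reduced in HSV space, which preserves the HSV value rather than perceived luminance.

```python
import colorsys


def deemphasize_background(pixels, mask, saturation_scale=0.3):
    """Return a copy of `pixels` with non-subject pixels desaturated.

    pixels: rows of (r, g, b) tuples with 0-255 components.
    mask:   rows of booleans, True where the pixel belongs to the main subject.
    """
    out = []
    for row, mask_row in zip(pixels, mask):
        new_row = []
        for (r, g, b), is_subject in zip(row, mask_row):
            if is_subject:
                new_row.append((r, g, b))  # main subject area is not processed
                continue
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            r2, g2, b2 = colorsys.hsv_to_rgb(h, s * saturation_scale, v)
            new_row.append((round(r2 * 255), round(g2 * 255), round(b2 * 255)))
        out.append(new_row)
    return out
```

The other variants mentioned in the text (emphasizing the subject only, or doing both) would swap or combine the branch that is processed.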

ダイビングを仮想空間で体験する場合、主被写体は「生き物」であると考えられる。美術館を仮想空間で体験する場合、主被写体は「展示物」（絵画や彫刻など）や特徴的な色（ここでは極彩色としている）を有する物体であると考えられる。つまり、提示する仮想空間や体験の種類によって、強調表示すべき主被写体が異なりうる。 When diving is experienced in a virtual space, the main subject is likely to be a "creature." When an art museum is experienced in a virtual space, the main subject is likely to be an "exhibit" (a painting, a sculpture, or the like) or an object with a characteristic color (here, vivid colors). In other words, the main subject to be highlighted may differ depending on the type of virtual space or experience presented.

図２０は、提供する仮想空間（または体験）の種類と、強調表示することが可能な被写体の種類（特徴領域の種類）との関係を示した図である。ここでは、強調表示することが可能な被写体の種類を仮想空間の種類にメタデータとして関連付けている。また、デフォルトで強調表示する主被写体の種類も仮想空間の種類に関連付けている。ここで、メタデータとして一覧表示されている被写体の種類は、画像処理部２４が検出可能な被写体の種類に対応している。また、仮想空間の種類ごとに、主被写体として設定可能な被写体の種類を○で示し、デフォルトで主被写体として選択される被写体の種類を◎で示している。したがってユーザは、○で示された被写体から新たな主被写体を選択することができる。 FIG. 20 is a diagram showing the relationship between the type of virtual space (or experience) to be provided and the type of subject (type of characteristic region) that can be highlighted. Here, the type of subject that can be highlighted is associated with the type of virtual space as metadata. In addition, the type of main subject highlighted by default is also associated with the type of virtual space. Here, the types of subjects listed as metadata correspond to the types of subjects detectable by the image processing unit 24 . For each type of virtual space, the type of subject that can be set as the main subject is indicated by ◯, and the type of subject that is selected as the main subject by default is indicated by ⊚. Therefore, the user can select a new main subject from the subjects indicated by ◯.
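
The association between virtual space types, selectable subject types (○), and the default main subject (◎) can be pictured as a small table. The entries below are hypothetical and do not reproduce the actual contents of FIG. 20; only "creature" for diving and "exhibit" for the art museum are suggested by the surrounding text, and the "OFF" option follows the GUI described for FIG. 21.

```python
# Hypothetical stand-in for the FIG. 20 metadata table.
SPACE_TABLE = {
    "diving": {"default": "creature", "selectable": {"creature", "landscape", "person"}},
    "museum": {"default": "exhibit", "selectable": {"exhibit", "vivid_color", "person"}},
}
OFF = "OFF"  # selecting this disables highlighting


def resolve_main_subject(space_type, requested=None):
    """Return the subject type to highlight, or None when highlighting is off."""
    entry = SPACE_TABLE[space_type]
    if requested is None:
        return entry["default"]  # no user choice: use the default type
    if requested == OFF:
        return None
    if requested not in entry["selectable"]:
        raise ValueError(f"{requested!r} cannot be selected for {space_type!r}")
    return requested
```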

ユーザが主被写体の種類を変更する方法に特に制限はない。例えば、システム制御回路５０は、操作部７０を通じたメニュー画面の操作などに応答して、撮像装置１が有する表示部２８やＸＲゴーグルの表示部に主被写体を変更するためのＧＵＩを表示する。そして、システム制御回路５０は、このＧＵＩに対する操作部７０を通じた操作に応じて、現在提供している仮想空間の種類に対する主被写体の設定を変更することができる。 There are no particular restrictions on how the user can change the type of main subject. For example, the system control circuit 50 displays a GUI for changing the main subject on the display section 28 of the imaging device 1 or the display section of the XR goggles in response to the operation of the menu screen through the operation section 70 . The system control circuit 50 can change the setting of the main subject for the type of virtual space currently provided according to the operation of the GUI through the operation unit 70 .

図２１は主被写体を変更するために表示するＧＵＩの例を示す図である。図２１（ａ）はモードダイヤルを模したＧＵＩであり、操作部７０に含まれるダイヤルの操作により、選択肢の１つを主被写体に設定することができる。図２１（ａ）では、風景が主被写体に設定された状態を示している。なお、主被写体を変更するためのＧＵＩに表示される選択肢は、図２０において○が付されたメタデータの種類と対応している。なお、図２１（ａ）の例では、メタデータの種類とは別に、強調表示を行わないことを設定するための「ＯＦＦ」を選択肢に含めている。図２１（ｂ）は主被写体の種類を変更するためのＧＵＩの別の例を示している。ダイヤルを模した形態の代わりに一覧表示形態としたことを除き、図２１（ａ）に示したＧＵＩと同じである。ユーザは操作部７０を用いて所望の選択肢を選択することにより、強調表示する主被写体を変更すること（および強調表示をＯＦＦすること）ができる。なお、視線を用いた選択肢の選択を可能としてもよい。 FIG. 21 is a diagram showing examples of a GUI displayed for changing the main subject. FIG. 21(a) shows a GUI that imitates a mode dial; by operating a dial included in the operation unit 70, one of the options can be set as the main subject. FIG. 21(a) shows a state in which a landscape is set as the main subject. The options displayed on the GUI for changing the main subject correspond to the metadata types marked with ○ in FIG. 20. In the example of FIG. 21(a), the options also include "OFF," which disables highlighting, in addition to the metadata types. FIG. 21(b) shows another example of the GUI for changing the type of main subject. It is the same as the GUI shown in FIG. 21(a), except that it takes the form of a list instead of imitating a dial. By selecting a desired option using the operation unit 70, the user can change the main subject to be highlighted (and turn highlighting off). Selection of an option using the line of sight may also be enabled.

図２２は、仮想空間の種類「ダイビング」「美術館」「サファリ」について、メタデータの例を画像で示した図である。例えば、提示する仮想空間画像について画像処理部２４が検出した被写体領域を被写体の種類ごとにメタデータとして抽出し、メモリ３２に格納することができる。これにより、強調表示する主被写体の変更に対して容易に対応することができる。なお、ＸＲゴーグルに表示する仮想空間の画像を予め生成することが可能な場合には、メタデータについても予め記録しておくことができる。また、メタデータは被写体領域を表す数値情報（例えば、中心位置と大きさ、外縁の座標データなど）であってもよい。 FIG. 22 is an image showing an example of metadata for the virtual space types "diving", "museum", and "safari". For example, the subject area detected by the image processing unit 24 in the virtual space image to be presented can be extracted as metadata for each type of subject and stored in the memory 32 . This makes it possible to easily deal with a change in the main subject to be highlighted. Note that if the image of the virtual space to be displayed on the XR goggles can be generated in advance, metadata can also be recorded in advance. Also, the metadata may be numerical information representing the subject area (for example, center position and size, outer edge coordinate data, etc.).

また、第２実施形態で説明した注視位置情報を用いてユーザが関心を示している被写体の種類を特定し、特定した種類の被写体領域を強調表示してもよい。この場合、注視位置に応じて強調表示する主被写体の種類が変化するため、ユーザは明示的に設定を変更することなく主被写体を変更することができる。 Alternatively, the type of subject that the user is interested in may be specified using the gaze position information described in the second embodiment, and the specified type of subject area may be highlighted. In this case, since the type of the main subject to be highlighted changes according to the gaze position, the user can change the main subject without explicitly changing the settings.

また、現在の視野に主被写体が存在しない場合や、主被写体領域の数や大きさが閾値以下の場合に、より多くの主被写体が視野に入る方向を示す指標を仮想空間画像に重畳してもよい。 In addition, when no main subject is present in the current field of view, or when the number or size of the main subject areas is at or below a threshold, an index indicating the direction in which more main subjects would enter the field of view may be superimposed on the virtual space image.
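
The condition for superimposing the index can be written as a short predicate. In this sketch the thresholds and the choice of measuring "size" as the total area fraction of the visible main subject areas are assumptions; the description leaves both open.

```python
def needs_direction_hint(visible_area_fractions, count_threshold=0, area_threshold=0.0):
    """Decide whether to superimpose an index pointing toward main subjects.

    visible_area_fractions: area of each visible main subject region,
    as a fraction (0.0 to 1.0) of the displayed image (assumed measure).
    """
    if not visible_area_fractions:
        return True  # no main subject in the current field of view
    if len(visible_area_fractions) <= count_threshold:
        return True  # too few main subject areas
    return sum(visible_area_fractions) <= area_threshold  # areas too small
```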

図２３（ａ）は、体験モード「ダイビング」において現在ＸＲゴーグルに提示中の仮想空間画像の例を示している。提示中の仮想空間画像には主被写体である魚の領域が存在しない。この場合、システム制御回路５０は、仮想空間画像に主被写体が存在する方向の指標Ｐ１を重畳することができる。システム制御回路５０は、例えば表示用画像データを生成するための仮想空間データにおける魚オブジェクトの位置情報に基づいて、魚がＸＲゴーグル視野に入る方向を特定することができる。 FIG. 23(a) shows an example of a virtual space image currently being presented to the XR goggles in the experience mode "diving". The virtual space image being presented does not include a fish area, which is the main subject. In this case, the system control circuit 50 can superimpose the index P1 in the direction in which the main subject exists on the virtual space image. The system control circuit 50 can specify the direction in which the fish enters the field of view of the XR goggles, for example, based on the position information of the fish object in the virtual space data for generating display image data.

ユーザは指標Ｐ１で示される方向を見るように首を振るなどすることにより、図２３（ｂ）に示すように魚を視認することが可能になる。なお、主被写体の存在する方向を示す指標は複数重畳させてもよい。この場合、システム制御回路５０は、主被写体を視野に含めるために必要な視線の移動距離が最も短い方向を示す指標、あるいは最も多くの主被写体を視野に含めることができる方向を示す指標を最も目立つ（例えば大きく）ように表示することができる。 By turning their head, for example, to look in the direction indicated by the index P1, the user can see the fish as shown in FIG. 23(b). Note that a plurality of indices indicating directions in which main subjects exist may be superimposed. In this case, the system control circuit 50 can display most prominently (for example, largest) the index indicating the direction that requires the shortest line-of-sight movement to bring a main subject into the field of view, or the index indicating the direction in which the largest number of main subjects can be brought into the field of view.
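
Choosing the direction that needs the shortest line-of-sight movement can be sketched in one dimension. The sketch assumes that only the yaw angle matters, that positive yaw offsets lie to the user's right, and that the horizontal field of view is `fov_deg`; these are simplifications not stated in the description, which obtains object positions from the virtual space data.

```python
def signed_angle(from_deg, to_deg):
    """Signed shortest rotation from one yaw to another, in (-180, 180]-style range."""
    return ((to_deg - from_deg + 180.0) % 360.0) - 180.0


def indicator_direction(view_yaw_deg, subject_yaws_deg, fov_deg=90.0):
    """Return "left" or "right" toward the nearest main subject,
    or None if a main subject is already inside the field of view."""
    offsets = [signed_angle(view_yaw_deg, yaw) for yaw in subject_yaws_deg]
    if not offsets or any(abs(o) <= fov_deg / 2 for o in offsets):
        return None
    nearest = min(offsets, key=abs)  # shortest required head rotation
    return "right" if nearest > 0 else "left"
```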

本実施形態によれば、提供する仮想空間に応じた種類の被写体領域を強調表示するようにした。そのため、仮想空間画像について、ユーザが主被写体として意図している可能性の高い領域が視認しやすくなり、主被写体を注視するまでの時間が短縮される効果が期待できる。 According to this embodiment, the type of subject area corresponding to the provided virtual space is highlighted. As a result, it becomes easier for the user to visually recognize a region of the virtual space image that is likely to be the main subject, and the effect of shortening the time required to gaze at the main subject can be expected.

●（第５実施形態） ● (Fifth Embodiment)
次に、第５実施形態について説明する。本実施形態は、第４実施形態でＸＲゴーグルに表示する仮想空間画像を、サーバなどＸＲゴーグルの外部装置から取得する表示システムに関する。 Next, a fifth embodiment will be described. This embodiment relates to a display system that acquires the virtual space image to be displayed on the XR goggles in the fourth embodiment from a device external to the XR goggles, such as a server.

図２４（ａ）は、ＸＲゴーグルＤＰ１とサーバＳＶ１とが通信可能に接続された表示システムの模式図である。ＸＲゴーグルＤＰ１とサーバＳＶ１との間にＬＡＮやインターネットなどのネットワークが存在してもよい。 FIG. 24(a) is a schematic diagram of a display system in which the XR goggles DP1 and the server SV1 are communicably connected. A network such as a LAN or the Internet may exist between the XR goggles DP1 and the server SV1.

一般的に、仮想空間画像の生成には、大容量となる仮想空間データと、仮想空間データから仮想空間画像を生成（描画）する演算能力とが必要である。そのため、ＸＲゴーグルからは姿勢検出部５５で検出した姿勢情報のような、仮想空間画像を生成するために必要な情報をサーバに出力する。そして、サーバでＸＲゴーグルに表示する仮想空間画像を生成し、ＸＲゴーグルに送信する。 In general, generation of a virtual space image requires large-capacity virtual space data and computing power to generate (render) a virtual space image from the virtual space data. Therefore, the XR goggles output information necessary for generating a virtual space image, such as posture information detected by the posture detection unit 55, to the server. Then, the server generates a virtual space image to be displayed on the XR goggles and transmits it to the XR goggles.

仮想空間データ（３次元データ）をサーバＳＶ１で持つことにより、サーバに接続された複数のＸＲゴーグルで同一の仮想空間を共有することが可能になる。 By having the virtual space data (three-dimensional data) in the server SV1, it becomes possible for a plurality of XR goggles connected to the server to share the same virtual space.

図２５はサーバＳＶ１として利用可能なコンピュータ装置の構成例を示すブロック図である。図において、ディスプレイ２５０１はアプリケーションプログラムによって処理中のデータの情報、各種メッセージメニューなどを表示し、ＬＣＤ(Liquid Crystal Display)等から構成される。ビデオＲＡＭ（ＶＲＡＭ）ディスプレイコントローラとしてのＣＲＴＣ２５０２は、ディスプレイ２５０１への画面表示制御を行う。キーボード２５０３及びポインティングデバイス２５０４は、文字などを入力したり、ＧＵＩ（Graphical User Interface）におけるアイコンやボタンなどを操作するためなどに用いられる。ＣＰＵ２５０５はコンピュータ装置全体の制御を司る。 FIG. 25 is a block diagram showing a configuration example of a computer device that can be used as the server SV1. In the figure, a display 2501 displays information on data being processed by an application program, various message menus, etc., and is composed of an LCD (Liquid Crystal Display) or the like. A CRTC 2502 as a video RAM (VRAM) display controller controls screen display on the display 2501 . A keyboard 2503 and a pointing device 2504 are used for inputting characters and operating icons and buttons in a GUI (Graphical User Interface). A CPU 2505 controls the entire computer apparatus.

ＲＯＭ（Read Only Memory）２５０６はＣＰＵ２５０５が実行するプログラムやパラメータ等を記憶している。ＲＡＭ（Random Access Memory）２５０７は各種プログラムをＣＰＵ２５０５が実行する時のワークエリア、各種データのバッファ等として用いられる。 A ROM (Read Only Memory) 2506 stores programs executed by the CPU 2505, parameters, and the like. A RAM (Random Access Memory) 2507 is used as a work area when the CPU 2505 executes various programs, a buffer for various data, and the like.

ハードディスクドライブ（ＨＤＤ）２５０８、リムーバブルメディアドライブ（ＲＭＤ）２５０９は、外部記憶装置として機能する。リムーバブルメディアドライブは、着脱可能な記録媒体の読み書き又は読み出しを行う装置であり、光ディスクドライブ、光磁気ディスクドライブ、メモリカードリーダなどであってもよい。 A hard disk drive (HDD) 2508 and a removable media drive (RMD) 2509 function as external storage devices. The removable media drive is a device that reads and writes, or only reads, a removable recording medium, and may be an optical disk drive, a magneto-optical disk drive, a memory card reader, or the like.

なお、サーバＳＶ１の各種機能を実現するプログラムを始め、ＯＳや、ブラウザ等のアプリケーションプログラム、データ、ライブラリなどは、その用途に応じてＲＯＭ２５０６、ＨＤＤ２５０８、ＲＭＤ２５０９（の記録媒体）の１つ以上に記憶されている。 Note that the programs that implement the various functions of the server SV1, as well as the OS, application programs such as a browser, data, libraries, and the like, are stored in one or more of the ROM 2506, the HDD 2508, and (the recording medium of) the RMD 2509, according to their use.

拡張スロット２５１０は、例えばＰＣＩ(Peripheral Component Interconnect)バス規格に準拠した拡張カード装着用スロットである。拡張スロット２５１０には、ビデオキャプチャボードや、サウンドボードなど、様々な拡張ボードを装着することが可能である。 The expansion slot 2510 is, for example, a slot for mounting an expansion card conforming to the PCI (Peripheral Component Interconnect) bus standard. Various expansion boards, such as a video capture board and a sound board, can be installed in the expansion slot 2510.

ネットワークインタフェース２５１１はサーバＳＶ１をローカルネットワークや外部ネットワークと接続するためのインタフェースである。また、サーバ装置ＳＶ１はネットワークインタフェース２５１１の他に、規格に準拠した外部機器との通信インタフェースを１つ以上有している。規格の例にはＵＳＢ(Universal Serial Bus)、ＨＤＭＩ(High-Definition Multimedia Interface)（登録商標）、無線ＬＡＮ、Bluetooth（登録商標）などが含まれる。 A network interface 2511 is an interface for connecting the server SV1 to a local network or an external network. In addition to the network interface 2511, the server device SV1 has one or more communication interfaces with external devices conforming to the standard. Examples of standards include USB (Universal Serial Bus), HDMI (High-Definition Multimedia Interface) (registered trademark), wireless LAN, Bluetooth (registered trademark), and the like.

バス２５１２はアドレスバス、データバスおよび制御バスからなり、上述した各ブロック間を接続する。 A bus 2512 consists of an address bus, a data bus and a control bus, and connects the blocks described above.

次に、図２４（ｂ）に示すフローチャートを用いて、サーバＳＶ１とＸＲゴーグルＤＰ１の動作に関して説明する。サーバＳＶ１の動作は、ＣＰＵ２５０５が所定のアプリケーションを実行することによって実現される。 Next, the operations of the server SV1 and the XR goggles DP1 will be described using the flowchart shown in FIG. 24(b). The operation of the server SV1 is implemented by the CPU 2505 executing a predetermined application.

Ｓ２４０２でＸＲゴーグルＤＰ１からサーバＳＶ１に対し、仮想空間の種類（図２０）を指定する。システム制御回路５０は例えばＸＲゴーグルＤＰ１の表示部２８に仮想空間の種類を指定するＧＵＩを表示する。システム制御回路５０は、操作部７０を通じた選択操作を検出すると、選択された種類を示すデータを通信部５４を通じてサーバＳＶ１に送信する。 In S2402, the XR goggles DP1 specify the type of virtual space (FIG. 20) to the server SV1. The system control circuit 50 displays, for example, a GUI for designating the type of virtual space on the display unit 28 of the XR goggles DP1. When the system control circuit 50 detects a selection operation through the operation unit 70, it transmits data indicating the selected type to the server SV1 through the communication unit 54.

ここではＸＲゴーグルＤＰ１に表示する仮想空間の範囲が固定されているものとする。したがって、サーバＳＶ１は指定された種類の仮想空間の特定のシーンの画像データ（仮想空間画像データ）を、付随するメタデータとともにＸＲゴーグルＤＰ１に送信する。 Here, it is assumed that the range of the virtual space displayed on the XR goggles DP1 is fixed. Therefore, the server SV1 transmits image data of a specific scene of the specified type of virtual space (virtual space image data) to the XR goggles DP1 together with the accompanying metadata.

Ｓ２４０３でシステム制御回路５０は、仮想空間画像データと付随するメタデータとをサーバＳＶ１から受信する。 In S2403, the system control circuit 50 receives the virtual space image data and the accompanying metadata from the server SV1.

Ｓ２４０４でシステム制御回路５０は、サーバＳＶ１から受信した仮想空間画像データとメタデータとをメモリ３２に保存する。 In S2404, the system control circuit 50 stores the virtual space image data and metadata received from the server SV1 in the memory 32.

Ｓ２４０５でシステム制御回路５０は、画像処理部２４を用い、仮想空間画像データに対し、図１９を用いて説明したような主被写体領域の強調処理を適用する。そして、強調処理を行った仮想空間画像データを表示部２８に表示させる。なお、仮想空間画像が右眼用の画像および左目用の画像から構成される場合、個々の画像について強調処理を適用する。 In S2405, the system control circuit 50 uses the image processing unit 24 to apply the main subject region enhancement processing described with reference to FIG. 19 to the virtual space image data. Then, the virtual space image data subjected to the enhancement processing is displayed on the display unit 28 . Note that when the virtual space image is composed of an image for the right eye and an image for the left eye, enhancement processing is applied to each image.

図２４（ｃ）は、サーバＳＶ１でＸＲゴーグルＤＰ１の姿勢（視線方向）に応じた仮想空間画像データの生成と、仮想空間データに対する強調処理とを適用する場合のサーバＳＶ１の動作に関するフローチャートである。サーバＳＶ１の動作は、ＣＰＵ２５０５が所定のアプリケーションを実行することによって実現される。 FIG. 24(c) is a flowchart relating to the operation of the server SV1 in the case where the server SV1 both generates virtual space image data according to the posture (line-of-sight direction) of the XR goggles DP1 and applies the enhancement processing to the virtual space data. The operation of the server SV1 is implemented by the CPU 2505 executing a predetermined application.

Ｓ２４１１でサーバＳＶ１は、ＸＲゴーグルＤＰ１から仮想空間の種類を指定するデータを受信する。 In S2411, the server SV1 receives data designating the type of virtual space from the XR goggles DP1.

Ｓ２４１２以降の動作は、ＸＲゴーグルＤＰ１に表示する動画の１フレームごとに実行される。 The operations from S2412 onward are executed for each frame of the moving image displayed on the XR goggles DP1.
Ｓ２４１２でサーバＳＶ１は、ＸＲゴーグルＤＰ１から姿勢情報を受信する。 In S2412, the server SV1 receives posture information from the XR goggles DP1.
Ｓ２４１３でサーバＳＶ１は、ＸＲゴーグルＤＰ１の姿勢に応じた仮想空間画像データを生成する。仮想空間データは、３次元データのレンダリング、全周画像からの切りだしなど、公知の任意の方法で生成することができる。例えばサーバＳＶ１は図２６に示すように、ＸＲゴーグルＤＰ１の姿勢情報に基づいて、仮想空間画像からＸＲゴーグルＤＰ１の表示領域を決定し、表示領域に対応する範囲を切り出すことができる。なお、ＸＲゴーグルＤＰ１は、姿勢情報の代わりに表示領域を特定する情報（例えば中心座標）を送信してもよい。 In S2413, the server SV1 generates virtual space image data corresponding to the posture of the XR goggles DP1. The virtual space data can be generated by any known method, such as rendering three-dimensional data or cutting out a range from an omnidirectional image. For example, as shown in FIG. 26, the server SV1 can determine the display area of the XR goggles DP1 in the virtual space image based on the posture information of the XR goggles DP1, and cut out the range corresponding to the display area. Note that the XR goggles DP1 may transmit information specifying the display area (for example, center coordinates) instead of the posture information.
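
As a rough illustration of determining a display area from posture information, the sketch below maps a yaw and pitch to a crop window in an omnidirectional image. It assumes an equirectangular layout (360 degrees of yaw across the width, 180 degrees of pitch across the height, yaw 0 at the left edge) and a flat rectangular crop; the function name, parameters, and field-of-view values are hypothetical, and an actual viewer would typically reproject rather than crop.

```python
def display_window(yaw_deg, pitch_deg, pano_w, pano_h, fov_h_deg=90.0, fov_v_deg=60.0):
    """Return (left, top, width, height) of the crop window in pixels.

    `left` may be close to the right edge, in which case the window
    wraps around horizontally; `top` is clamped to stay inside the image.
    """
    center_x = (yaw_deg % 360.0) / 360.0 * pano_w
    center_y = (90.0 - pitch_deg) / 180.0 * pano_h  # pitch +90 is the top row
    win_w = round(fov_h_deg / 360.0 * pano_w)
    win_h = round(fov_v_deg / 180.0 * pano_h)
    left = round(center_x - win_w / 2) % pano_w
    top = min(max(round(center_y - win_h / 2), 0), pano_h - win_h)
    return left, top, win_w, win_h
```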

Ｓ２４１５でサーバＳＶ１は、ＸＲゴーグルＤＰ１から主被写体の種類を受信する。なお、Ｓ２４１５での主被写体の種類の受信は、ＸＲゴーグルＤＰ１において主被写体の種類が変更された場合に実行され、変更のない場合にスキップされる。 In S2415, the server SV1 receives the type of main subject from the XR goggles DP1. Note that reception of the main subject type in S2415 is executed when the main subject type is changed in the XR goggles DP1, and is skipped when there is no change.

Ｓ２４１６でサーバＳＶ１は、Ｓ２４１３で生成した仮想空間画像データに対し、主被写体領域の強調処理を適用する。主被写体の種類に変更がない場合、サーバＳＶ１は仮想空間の種類に応じたデフォルトの主被写体領域に対して強調処理を適用する。 In S2416, the server SV1 applies main subject area enhancement processing to the virtual space image data generated in S2413. If there is no change in the type of main subject, the server SV1 applies enhancement processing to the default main subject area corresponding to the type of virtual space.

Ｓ２４１７でサーバＳＶ１は、強調処理を適用した仮想空間画像データをＸＲゴーグルＤＰ１に送信する。ＸＲゴーグルＤＰ１では、受信した仮想空間画像データを表示部２８に表示させる。 In S2417, the server SV1 transmits the virtual space image data to which the enhancement processing is applied to the XR goggles DP1. The XR goggles DP1 causes the display unit 28 to display the received virtual space image data.
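
The per-frame portion of this flow (S2412 to S2417) can be summarized in a small sketch. The class `EnhancementSession` and the injected `render` and `enhance` callables are hypothetical placeholders for the server's rendering and enhancement processing; network reception and transmission are left to the caller.

```python
class EnhancementSession:
    """Holds the state the server keeps between frames for one pair of goggles."""

    def __init__(self, space_type, default_subject_types, render, enhance):
        self.space_type = space_type                            # received in S2411
        self.subject_type = default_subject_types[space_type]   # default main subject
        self._render = render    # callable(space_type, posture) -> frame
        self._enhance = enhance  # callable(frame, subject_type) -> frame

    def process_frame(self, posture, changed_subject_type=None):
        """Handle one frame; `posture` is the S2412 input."""
        frame = self._render(self.space_type, posture)          # S2413
        if changed_subject_type is not None:                    # S2415 (skipped if unchanged)
            self.subject_type = changed_subject_type
        return self._enhance(frame, self.subject_type)          # S2416; caller sends it (S2417)
```

Keeping the current subject type in the session reflects the statement that S2415 is skipped when nothing has changed, so the previous or default type continues to be used.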

図２７（ａ）は、図２６（ａ）の構成に対し、ＶＲ画像を生成可能なカメラＣＡを追加した表示システムの模式図である。ここでは、仮想空間の種類として、図２０の例で挙げた体験シェアの場合を想定している。カメラＣＡでＸＲ情報を付加して記録した画像をＸＲゴーグルに表示することにより、カメラＣＡのユーザの体験をＸＲゴーグルＤＰ１の装着者も疑似体験することができる。 FIG. 27(a) is a schematic diagram of a display system in which a camera CA capable of generating a VR image is added to the configuration of FIG. 26(a). Here, as a type of virtual space, it is assumed that the case of experience sharing mentioned in the example of FIG. 20 is used. By displaying an image recorded with XR information added by the camera CA on the XR goggles, the wearer of the XR goggles DP1 can also simulate the experience of the user of the camera CA.

図２８は、カメラＣＡの構成例を示すブロック図である。カメラＣＡは本体１００’と、本体１００’に装着されたレンズユニット３００を有する。レンズユニット３００と本体１００’はレンズマウント３０４、３０５によって着脱可能である。また、レンズユニット３００が有するレンズシステム制御回路３０３と、本体１００’のシステム制御回路５０（不図示）とは、レンズマウント３０４、３０５に設けられた通信端子６、１０を通じて相互に通信することができる。 FIG. 28 is a block diagram showing a configuration example of the camera CA. The camera CA has a main body 100' and a lens unit 300 attached to the main body 100'. The lens unit 300 can be attached to and detached from the main body 100' by means of lens mounts 304 and 305. The lens system control circuit 303 of the lens unit 300 and the system control circuit 50 (not shown) of the main body 100' can communicate with each other through communication terminals 6 and 10 provided on the lens mounts 304 and 305.

レンズユニット３００は、ステレオ魚眼レンズであり、カメラＣＡは視野角が１８０°のステレオ円周魚眼画像を撮影することができる。具体的には、レンズユニット３００の２つの光学系３０１Ｌ、３０１Ｒのそれぞれは、左右方向（水平角度、方位角、ヨー角）１８０度、上下方向（垂直角度、仰俯角、ピッチ角）１８０度の視野を円形の２次元平面に投影した円周魚眼像を生成する。 The lens unit 300 is a stereo fisheye lens, and the camera CA can capture a stereo circular fisheye image with a viewing angle of 180°. Specifically, each of the two optical systems 301L and 301R of the lens unit 300 generates a circular fisheye image by projecting a field of view of 180 degrees in the left-right direction (horizontal angle, azimuth angle, yaw angle) and 180 degrees in the up-down direction (vertical angle, elevation/depression angle, pitch angle) onto a circular two-dimensional plane.

本体１００’は、一部の構成しか示していないが、図１に示した撮像装置１の本体１００と同様の構成を有するものとする。このような構成のカメラＣＡで撮影した画像（例えばＶＲ１８０規格に準拠した動画像）をＸＲ画像として記録媒体２００に記録しておく。 The main body 100' has a configuration similar to that of the main body 100 of the imaging apparatus 1 shown in FIG. 1, although only a part of the configuration is shown. An image (for example, a moving image conforming to the VR180 standard) captured by the camera CA having such a configuration is recorded in the recording medium 200 as an XR image.

図２７（ｂ）に示すフローチャートを用いて、図２７（ａ）に示した表示システムの動作について説明する。なお、サーバＳＶ１は、ＸＲゴーグルＤＰ１およびカメラＣＡと通信可能な状態にあるものとする。 The operation of the display system shown in FIG. 27(a) will be described using the flowchart shown in FIG. 27(b). It is assumed that the server SV1 is in a state of being able to communicate with the XR goggles DP1 and the camera CA.

Ｓ２６０２でカメラＣＡからサーバＳＶ１へ画像データを送信する。画像データには撮影日、撮影条件などのＥｘｉｆ情報、撮影時に記録された撮影者の視線情報、撮影時に検出された主被写体情報などを含む付加情報が付随している。なお、カメラＣＡとサーバＳＶ１とで通信する代わりに、カメラＣＡの記録媒体２００をサーバＳＶ１に装着して画像データを読み出してもよい。 In S2602, image data is transmitted from the camera CA to the server SV1. Image data is accompanied by additional information including Exif information such as shooting date and shooting conditions, photographer's line-of-sight information recorded at the time of shooting, main subject information detected at the time of shooting, and the like. Note that instead of communicating between the camera CA and the server SV1, the recording medium 200 of the camera CA may be attached to the server SV1 to read image data.

Ｓ２６０３でサーバＳＶ１は、カメラＣＡから受信した画像データから、ＸＲゴーグルＤＰ１に表示する画像データおよびメタデータを生成する。本実施形態ではカメラＣＡがステレオ円周魚眼画像を記録するため、公知の方法で表示範囲を切りだし、矩形状の画像に変換することにより、表示用の画像データを生成する。また、サーバＳＶ１は、表示用画像データから予め定められた種類の被写体領域を検出し、検出された被写体領域の情報をメタデータとして生成する。サーバＳＶ１は生成した表示用画像データとメタデータとをＸＲゴーグルＤＰ１に送信する。また、サーバＳＶ１はカメラＣＡから取得した主被写体情報、視線情報など、撮影時の付加情報もＸＲゴーグルＤＰ１に送信する。 In S2603, the server SV1 generates image data and metadata to be displayed on the XR goggles DP1 from the image data received from the camera CA. In this embodiment, since the camera CA records a stereo circular fisheye image, image data for display is generated by cutting out the display range by a known method and converting it into a rectangular image. The server SV1 also detects a predetermined type of subject area from the display image data, and generates information about the detected subject area as metadata. The server SV1 transmits the generated display image data and metadata to the XR goggles DP1. The server SV1 also transmits additional information at the time of photographing, such as main subject information and line-of-sight information acquired from the camera CA, to the XR goggles DP1.

Ｓ２６０４およびＳ２６０５でＸＲゴーグルＤＰ１のシステム制御回路５０が行う動作はＳ２４０４およびＳ２４０５の動作と同様であるため、説明を省略する。システム制御回路５０は、Ｓ２６０５で強調処理を適用する主被写体の種類を、サーバＳＶ１から受信した主被写体情報に基づいて決定することができる。また、システム制御回路５０は、撮影者の視線情報に基づいて特定した主被写体領域に強調処理を適用してもよい。この場合、撮影者が撮影時に注視していた被写体が強調表示されるため、撮影者の体験をより一層共有することができる。 The operations performed by the system control circuit 50 of the XR goggles DP1 in S2604 and S2605 are the same as those in S2404 and S2405, so their description is omitted. In S2605, the system control circuit 50 can determine the type of main subject to which enhancement processing is applied based on the main subject information received from the server SV1. The system control circuit 50 may also apply enhancement processing to a main subject area identified from the photographer's line-of-sight information. In that case, the subject the photographer was gazing at during shooting is highlighted, so the photographer's experience can be shared more fully.
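As an illustration of identifying the main subject area from the photographer's line-of-sight information, the following is a minimal sketch and not the method of this embodiment. It assumes that the metadata lists each detected subject region as a bounding box in display-image pixels and that the line-of-sight information is a single gaze point in the same coordinates. The names `SubjectRegion` and `region_for_gaze`, and the tie-breaking rule for overlapping regions, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SubjectRegion:
    """Hypothetical metadata entry: subject type plus a bounding box in pixels."""
    kind: str
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: float, y: float) -> bool:
        # Half-open box test: the right and bottom edges are excluded.
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)

    def center_distance_sq(self, x: float, y: float) -> float:
        # Squared distance from the box centre to the point (x, y).
        cx = self.left + self.width / 2
        cy = self.top + self.height / 2
        return (cx - x) ** 2 + (cy - y) ** 2


def region_for_gaze(regions: Sequence[SubjectRegion],
                    gaze_x: float, gaze_y: float) -> Optional[SubjectRegion]:
    """Return the region containing the gaze point, or None if there is none.

    Assumption: when several regions overlap at the gaze point, the one whose
    centre is nearest to the gaze point is chosen.
    """
    hits = [r for r in regions if r.contains(gaze_x, gaze_y)]
    if not hits:
        return None
    return min(hits, key=lambda r: r.center_distance_sq(gaze_x, gaze_y))
```

When no region contains the gaze point, the function returns `None`, and a caller could then fall back to the main subject information received from the server SV1.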

図２７（ｃ）は、図２７（ａ）に示した表示システムにおいて、図２４（ｃ）と同様に強調処理をサーバＳＶ１で実行する場合のサーバＳＶ１の動作に関するフローチャートである。 FIG. 27(c) is a flowchart relating to the operation of the server SV1 in the display system shown in FIG. 27(a) when the server SV1 executes the highlighting process in the same manner as in FIG. 24(c).

Ｓ２６１２はＳ２６０２と同様であるため説明を省略する。 Since S2612 is the same as S2602, its description is omitted.
また、Ｓ２６１３～Ｓ２６１７はＳ２４１２、Ｓ２４１３、Ｓ２４１５～Ｓ２４１７とそれぞれ同様であるため、説明を省略する。なお、強調表示を適用する主被写体の種類は、ＸＲゴーグルＤＰ１から指定があれば指定された種類とし、指定がなければ撮影時の主被写体情報に基づいて決定する。 S2613 to S2617 are the same as S2412, S2413, and S2415 to S2417, respectively, so their description is also omitted. The type of main subject to which highlighting is applied is the type designated by the XR goggles DP1 if one is designated; otherwise it is determined based on the main subject information recorded at the time of shooting.
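The rule for choosing the main subject type can be sketched as follows. This is only an illustration under stated assumptions: subject types are represented as strings, a missing designation is represented as `None`, and the function name `choose_emphasis_type` is hypothetical.

```python
from typing import Optional


def choose_emphasis_type(designated_type: Optional[str],
                         capture_main_subject_type: Optional[str]) -> Optional[str]:
    """Pick the main subject type to which highlighting is applied.

    designated_type: type designated by the XR goggles, or None if none was sent.
    capture_main_subject_type: type taken from the main subject information
        recorded at the time of shooting, or None if unavailable.
    """
    # A designation from the goggles takes priority.
    if designated_type is not None:
        return designated_type
    # Otherwise fall back to the main subject information from shooting.
    return capture_main_subject_type
```

For example, `choose_emphasis_type(None, "person")` returns `"person"`, while `choose_emphasis_type("vehicle", "person")` returns `"vehicle"`.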

本実施形態によれば、仮想空間画像やＶＲ画像に対しても適切な強調処理を適用することが可能になる。また、不可の大きな処理をサーバなどの外部装置で実行することにより、ＸＲゴーグルに必要なリソースが軽減できるほか、複数のユーザが同一の仮想空間を共有することが容易である。 According to this embodiment, appropriate enhancement processing can also be applied to virtual space images and VR images. In addition, executing heavy-load processing on an external device such as a server reduces the resources required of the XR goggles and makes it easy for multiple users to share the same virtual space.

（その他の実施形態） (Other embodiments)
本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。 The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. It can also be implemented by a circuit (for example, an ASIC) that implements one or more functions.

本実施形態の開示は、以下の撮像装置、方法、画像処理装置、画像処理方法、およびプログラムを含む。
（項目１）
撮像装置であって、
前記撮像装置が表示している画像におけるユーザの注視位置を検出可能な検出手段と、
前記表示のための画像データを生成する生成手段と、を有し、
前記生成手段は、前記検出手段が有効な際に生成する前記画像データについては、特徴領域を他の領域より視覚的に強調する加工処理を適用し、
前記特徴領域が、前記撮像装置の設定に基づいて判定される種類の被写体の領域である、
ことを特徴とする撮像装置。
（項目２）
前記設定が、特定のシーンもしくは特定の被写体を撮影するための設定であり、
前記特徴領域が、前記特定のシーンに応じた種類の被写体の領域、あるいは前記特定の被写体の領域であることを特徴とする項目１に記載の撮像装置。
（項目３）
前記加工処理が、
前記特徴領域については加工せず、他の領域を目立たなく加工する処理、
前記特徴領域を強調し、他の領域は加工しない処理、
前記特徴領域を強調するとともに、他の領域を目立たなく加工する処理、
前記特徴領域を含む画像全体を加工して前記特徴領域を強調する処理、
のいずれか１つであることを特徴とする項目１または２に記載の撮像装置。
（項目４）
前記生成手段が、ライブビュー表示のための画像データとして前記画像データを生成することを特徴とする項目１から３のいずれか１項に記載の撮像装置。
（項目５）
さらに、前記検出手段が検出した注視位置に基づいて焦点検出領域を設定する設定手段を有することを特徴とする項目１から４のいずれか１項に記載の撮像装置。
（項目６）
表示している画像におけるユーザの注視位置を検出可能な検出手段を有する撮像装置が実行する方法であって、
前記表示のための画像データを生成する生成工程と、を有し、
前記生成工程では、
前記検出手段が有効な際に生成する前記画像データについては、特徴領域を他の領域より視覚的に強調する加工処理を適用し、
前記検出手段が有効でない際に生成する前記画像データについては、特徴領域を他の領域より視覚的に強調する加工処理を適用せず、
前記特徴領域が、前記撮像装置の設定に基づいて判定される種類の被写体の領域である、
ことを特徴とする方法。
（項目７）
撮像装置が有するコンピュータを、項目１から５のいずれか１項に記載の撮像装置が有する各手段として機能させるためのプログラム。
（項目８）
頭部装着型の表示装置に表示するための画像データを生成する生成手段を有し、
前記生成手段は、前記表示装置を通じてユーザに提供する仮想環境の種類に応じた特徴領域を他の領域より視覚的に強調する加工処理を適用することにより、前記画像データを生成する、
ことを特徴とする画像処理装置。
（項目９）
前記加工処理が、
前記特徴領域については加工せず、他の領域を目立たなく加工する処理、
前記特徴領域を強調し、他の領域は加工しない処理、
前記特徴領域を強調するとともに、他の領域を目立たなく加工する処理、
前記特徴領域を含む画像全体を加工して前記特徴領域を強調する処理、
のいずれか１つであることを特徴とする項目８に記載の画像処理装置。
（項目１０）
さらに、前記表示装置が表示している画像におけるユーザの注視位置を検出可能な検出手段を有し、
前記生成手段は、前記加工処理を適用したのち、前記検出手段が検出した前記注視位置に基づくさらなる加工処理を適用することにより、前記画像データを生成することを特徴とする項目８または９に記載の画像処理装置。
（項目１１）
前記さらなる加工処理が、前記特徴領域のうち、前記注視位置を含む特徴領域を、他の特徴領域よりも視覚的に強調する加工処理であることを特徴とする項目１０に記載の画像処理装置。
（項目１２）
前記さらなる加工処理が、前記特徴領域のうち、前記注視位置を含む特徴領域に関する付随情報を重畳表示する加工処理であることを特徴とする項目１０に記載の画像処理装置。
（項目１３）
前記さらなる加工処理が、前記注視位置の移動方向に存在する特徴領域を視覚的に強調する加工処理であることを特徴とする項目１０に記載の画像処理装置。
（項目１４）
前記仮想環境の種類ごとに、前記加工処理を適用可能な特徴領域の種類と、デフォルトで前記加工処理を適用する特徴領域の種類とが対応づけられていることを特徴とする項目８に記載の画像処理装置。
（項目１５）
前記生成手段は、ユーザに提供中の仮想環境に対応づけられた特徴領域の種類から前記ユーザが指定した種類に基づいて前記加工処理を適用することを特徴とする項目１４に記載の画像処理装置。
（項目１６）
前記生成手段は、ユーザの指定がない場合、前記ユーザに提供中の仮想環境に対応づけられた、デフォルトで前記加工処理を適用する特徴領域の種類に基づいて前記加工処理を適用することを特徴とする項目１４または１５に記載の画像処理装置。
（項目１７）
さらに、前記表示装置が表示している画像におけるユーザの注視位置を検出可能な検出手段を有し、
前記生成手段は、前記検出手段が検出した前記注視位置に基づく特徴領域に前記加工処理を適用することを特徴とする項目１４から１６のいずれか１項に記載の画像処理装置。
（項目１８）
前記生成手段は、生成した前記画像データに前記特徴領域が含まれていない場合、前記画像データに、特徴領域が存在する方向を示す指標を含めることを特徴とする項目１４から１７のいずれか１項に記載の画像処理装置。
（項目１９）
前記頭部装着型の表示装置が、前記画像処理装置と通信可能な外部装置であることを特徴とする項目１４から１８のいずれか１項に記載の画像処理装置。
（項目２０）
前記画像処理装置が、前記頭部装着型の表示装置の一部であることを特徴とする項目１４から１８のいずれか１項に記載の画像処理装置。
（項目２１）
前記仮想環境を表すＶＲ画像のデータを取得する取得手段をさらに有し、
前記生成手段は、前記ＶＲ画像から前記画像データを生成する、
ことを特徴とする項目１４から２０のいずれか１項に記載の画像処理装置。
（項目２２）
前記取得手段は前記ＶＲ画像の撮影時に得られた主被写体情報および／または視線情報をさらに取得し、
前記生成手段は、前記主被写体情報または前記視線情報に基づいて、前記加工処理を適用する前記特徴領域を決定することを特徴とする項目２１に記載の画像処理装置。
（項目２３）
画像処理装置が実行する画像処理方法であって、
頭部装着型の表示装置に表示するための画像データを生成する生成工程を有し、
前記生成工程では、前記表示装置を通じてユーザに提供する仮想環境の種類に応じた特徴領域を他の領域より視覚的に強調する加工処理を適用することにより、前記画像データを生成する、
ことを特徴とする画像処理方法。
（項目２４）
コンピュータを、項目８から２２のいずれか１項に記載の画像処理装置が有する各手段として機能させるためのプログラム。
The disclosure of this embodiment includes the following imaging device, method, image processing device, image processing method, and program.
(Item 1)
An imaging device comprising:
detection means capable of detecting a gaze position of a user in an image displayed by the imaging device; and
generating means for generating image data for the display,
wherein the generating means applies, to the image data generated while the detection means is enabled, processing that visually emphasizes a characteristic region over other regions, and
wherein the characteristic region is a region of a subject of a type determined based on settings of the imaging device.
(Item 2)
The imaging device according to item 1, wherein the settings are settings for shooting a specific scene or a specific subject, and
the characteristic region is a region of a subject of a type corresponding to the specific scene, or a region of the specific subject.
(Item 3)
The imaging device according to item 1 or 2, wherein the processing is any one of:
processing that leaves the characteristic region unprocessed and makes other regions less conspicuous;
processing that emphasizes the characteristic region and leaves other regions unprocessed;
processing that emphasizes the characteristic region and makes other regions less conspicuous; and
processing that processes the entire image including the characteristic region so as to emphasize the characteristic region.
(Item 4)
The imaging device according to any one of items 1 to 3, wherein the generating means generates the image data as image data for live view display.
(Item 5)
The imaging device according to any one of items 1 to 4, further comprising setting means for setting a focus detection area based on the gaze position detected by the detection means.
(Item 6)
A method executed by an imaging device having detection means capable of detecting a gaze position of a user in a displayed image, the method comprising:
a generating step of generating image data for the display,
wherein, in the generating step,
processing that visually emphasizes a characteristic region over other regions is applied to the image data generated while the detection means is enabled, and
the processing is not applied to the image data generated while the detection means is not enabled, and
wherein the characteristic region is a region of a subject of a type determined based on settings of the imaging device.
(Item 7)
A program for causing a computer of an imaging device to function as each means of the imaging device according to any one of items 1 to 5.
(Item 8)
An image processing device comprising generating means for generating image data to be displayed on a head-mounted display device,
wherein the generating means generates the image data by applying processing that visually emphasizes, over other regions, a characteristic region corresponding to the type of virtual environment provided to a user through the display device.
(Item 9)
The image processing device according to item 8, wherein the processing is any one of:
processing that leaves the characteristic region unprocessed and makes other regions less conspicuous;
processing that emphasizes the characteristic region and leaves other regions unprocessed;
processing that emphasizes the characteristic region and makes other regions less conspicuous; and
processing that processes the entire image including the characteristic region so as to emphasize the characteristic region.
(Item 10)
The image processing device according to item 8 or 9, further comprising detection means capable of detecting a gaze position of the user in an image displayed by the display device,
wherein the generating means generates the image data by applying the processing and then applying further processing based on the gaze position detected by the detection means.
(Item 11)
The image processing device according to item 10, wherein the further processing visually emphasizes, among the characteristic regions, a characteristic region that includes the gaze position more than other characteristic regions.
(Item 12)
The image processing device according to item 10, wherein the further processing superimposes accompanying information about a characteristic region that includes the gaze position among the characteristic regions.
(Item 13)
The image processing device according to item 10, wherein the further processing visually emphasizes a characteristic region located in the direction in which the gaze position is moving.
(Item 14)
The image processing device according to item 8, wherein each type of the virtual environment is associated with types of characteristic region to which the processing can be applied and a type of characteristic region to which the processing is applied by default.
(Item 15)
The image processing device according to item 14, wherein the generating means applies the processing based on a type specified by the user from among the types of characteristic region associated with the virtual environment being provided to the user.
(Item 16)
The image processing device according to item 14 or 15, wherein, when there is no specification by the user, the generating means applies the processing based on the type of characteristic region to which the processing is applied by default and which is associated with the virtual environment being provided to the user.
(Item 17)
The image processing device according to any one of items 14 to 16, further comprising detection means capable of detecting a gaze position of the user in an image displayed by the display device,
wherein the generating means applies the processing to a characteristic region based on the gaze position detected by the detection means.
(Item 18)
The image processing device according to any one of items 14 to 17, wherein, when the generated image data does not include the characteristic region, the generating means includes in the image data an indicator showing the direction in which the characteristic region exists.
(Item 19)
The image processing device according to any one of items 14 to 18, wherein the head-mounted display device is an external device capable of communicating with the image processing device.
(Item 20)
The image processing device according to any one of items 14 to 18, wherein the image processing device is part of the head-mounted display device.
(Item 21)
The image processing device according to any one of items 14 to 20, further comprising acquisition means for acquiring data of a VR image representing the virtual environment,
wherein the generating means generates the image data from the VR image.
(Item 22)
The image processing device according to item 21, wherein the acquisition means further acquires main subject information and/or line-of-sight information obtained when the VR image was captured, and
the generating means determines the characteristic region to which the processing is applied based on the main subject information or the line-of-sight information.
(Item 23)
An image processing method executed by an image processing device, the method comprising:
a generating step of generating image data to be displayed on a head-mounted display device,
wherein, in the generating step, the image data is generated by applying processing that visually emphasizes, over other regions, a characteristic region corresponding to the type of virtual environment provided to a user through the display device.
(Item 24)
A program for causing a computer to function as each means of the image processing device according to any one of items 8 to 22.
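The first three processing options listed in items 3 and 9 can be sketched as follows. This is a minimal sketch, not the processing of the embodiments: it assumes an 8-bit grayscale image held as nested lists, a boolean mask marking the characteristic region, and simple brightness gains standing in for "emphasize" and "make less conspicuous". The names `Mode` and `apply_emphasis` and the gain values are hypothetical. The fourth option, processing the entire image, is left out because this section does not say which whole-image operation is used.

```python
from enum import Enum
from typing import List


class Mode(Enum):
    DIM_OTHERS = "dim_others"        # region untouched, other areas toned down
    BOOST_FEATURE = "boost_feature"  # region emphasized, other areas untouched
    BOOST_AND_DIM = "boost_and_dim"  # region emphasized and other areas toned down


# Hypothetical gains; real emphasis could instead change colour, edges, or blur.
FEATURE_GAIN = 1.25
OTHER_GAIN = 0.5


def apply_emphasis(image: List[List[int]],
                   mask: List[List[bool]],
                   mode: Mode) -> List[List[int]]:
    """Return a new image with per-pixel gains chosen by the mask and the mode."""
    feature_gain = FEATURE_GAIN if mode in (Mode.BOOST_FEATURE, Mode.BOOST_AND_DIM) else 1.0
    other_gain = OTHER_GAIN if mode in (Mode.DIM_OTHERS, Mode.BOOST_AND_DIM) else 1.0
    out = []
    for row, mask_row in zip(image, mask):
        out.append([
            # Clamp to the 8-bit range after applying the gain.
            min(255, round(pixel * (feature_gain if in_region else other_gain)))
            for pixel, in_region in zip(row, mask_row)
        ])
    return out
```

All three modes share one code path and differ only in which of the two gains is active, which mirrors how the items describe them as combinations of "process the region" and "process the rest".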

本発明は上述した実施形態の内容に制限されず、発明の精神および範囲から離脱することなく様々な変更及び変形が可能である。したがって、発明の範囲を公にするために請求項を添付する。 The present invention is not limited to the content of the above-described embodiments, and various modifications and variations are possible without departing from the spirit and scope of the invention. Accordingly, the claims are appended to make public the scope of the invention.

１…撮像装置、２２…撮像部、２４…画像処理部、２８…表示部、５０…システム制御回路、７０…操作部、１００…本体、１５０…レンズユニット Description of symbols: 1: imaging device; 22: imaging unit; 24: image processing unit; 28: display unit; 50: system control circuit; 70: operation unit; 100: main body; 150: lens unit

Claims

An imaging device comprising:
detection means capable of detecting a gaze position of a user in an image displayed by the imaging device; and
generating means for generating image data for the display,
wherein the generating means applies, to the image data generated while the detection means is enabled, processing that visually emphasizes a characteristic region over other regions, and
wherein the characteristic region is a region of a subject of a type determined based on settings of the imaging device.

The imaging device according to claim 1, wherein the settings are settings for shooting a specific scene or a specific subject, and
the characteristic region is a region of a subject of a type corresponding to the specific scene, or a region of the specific subject.

The imaging device according to claim 1, wherein the processing is any one of:
processing that leaves the characteristic region unprocessed and makes other regions less conspicuous;
processing that emphasizes the characteristic region and leaves other regions unprocessed;
processing that emphasizes the characteristic region and makes other regions less conspicuous; and
processing that processes the entire image including the characteristic region so as to emphasize the characteristic region.

The imaging device according to claim 1, wherein the generating means generates the image data as image data for live view display.

The imaging device according to claim 1, further comprising setting means for setting a focus detection area based on the gaze position detected by the detection means.

A method executed by an imaging device having detection means capable of detecting a gaze position of a user in a displayed image, the method comprising:
a generating step of generating image data for the display,
wherein, in the generating step,
processing that visually emphasizes a characteristic region over other regions is applied to the image data generated while the detection means is enabled, and
the processing is not applied to the image data generated while the detection means is not enabled, and
wherein the characteristic region is a region of a subject of a type determined based on settings of the imaging device.

A program for causing a computer of an imaging device to function as each means of the imaging device according to claim 1.

An image processing device comprising generating means for generating image data to be displayed on a head-mounted display device,
wherein the generating means generates the image data by applying processing that visually emphasizes, over other regions, a characteristic region corresponding to the type of virtual environment provided to a user through the display device.

The image processing device according to claim 8, wherein the processing is any one of:
processing that leaves the characteristic region unprocessed and makes other regions less conspicuous;
processing that emphasizes the characteristic region and leaves other regions unprocessed;
processing that emphasizes the characteristic region and makes other regions less conspicuous; and
processing that processes the entire image including the characteristic region so as to emphasize the characteristic region.

The image processing device according to claim 8, further comprising detection means capable of detecting a gaze position of the user in an image displayed by the display device,
wherein the generating means generates the image data by applying the processing and then applying further processing based on the gaze position detected by the detection means.

The image processing device according to claim 10, wherein the further processing visually emphasizes, among the characteristic regions, a characteristic region that includes the gaze position more than other characteristic regions.

The image processing device according to claim 10, wherein the further processing superimposes accompanying information about a characteristic region that includes the gaze position among the characteristic regions.

The image processing device according to claim 10, wherein the further processing visually emphasizes a characteristic region located in the direction in which the gaze position is moving.

The image processing device according to claim 8, wherein each type of the virtual environment is associated with types of characteristic region to which the processing can be applied and a type of characteristic region to which the processing is applied by default.

The image processing device according to claim 14, wherein the generating means applies the processing based on a type specified by the user from among the types of characteristic region associated with the virtual environment being provided to the user.

The image processing device according to claim 14, wherein, when there is no specification by the user, the generating means applies the processing based on the type of characteristic region to which the processing is applied by default and which is associated with the virtual environment being provided to the user.

The image processing device according to claim 14, further comprising detection means capable of detecting a gaze position of the user in an image displayed by the display device,
wherein the generating means applies the processing to a characteristic region based on the gaze position detected by the detection means.

The image processing device according to claim 14, wherein, when the generated image data does not include the characteristic region, the generating means includes in the image data an indicator showing the direction in which the characteristic region exists.

The image processing device according to claim 14, wherein the head-mounted display device is an external device capable of communicating with the image processing device.

The image processing device according to claim 14, wherein the image processing device is part of the head-mounted display device.

The image processing device according to claim 14, further comprising acquisition means for acquiring data of a VR image representing the virtual environment,
wherein the generating means generates the image data from the VR image.

The image processing device according to claim 21, wherein the acquisition means further acquires main subject information and/or line-of-sight information obtained when the VR image was captured, and
the generating means determines the characteristic region to which the processing is applied based on the main subject information or the line-of-sight information.

An image processing method executed by an image processing device, the method comprising:
a generating step of generating image data to be displayed on a head-mounted display device,
wherein, in the generating step, the image data is generated by applying processing that visually emphasizes, over other regions, a characteristic region corresponding to the type of virtual environment provided to a user through the display device.

A program for causing a computer to function as each means of the image processing device according to claim 8.