JP6294054B2 - Video display device, video presentation method, and program

Info

Publication number: JP6294054B2
Application number: JP2013239137A
Authority: JP
Inventors: 真治木村; 美木子中西; 堀越　力; 力堀越
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2013-11-19
Filing date: 2013-11-19
Publication date: 2018-03-14
Anticipated expiration: 2033-11-19
Also published as: JP2015100032A

Description

The present invention relates to a technique for superimposing video content on a real environment and presenting it to a user.

Techniques for presenting images using augmented reality (AR) are known. Patent Document 1 and Non-Patent Document 1 disclose techniques for presenting images such as advertisements without advance preparation such as placing markers in the real environment. Patent Document 1 discloses estimating, from a captured image, regions of the background image that contain a surface, and selecting, as the region on which a presentation image is superimposed, an estimated region whose pixel values show a single color or a single-color gradation and whose size is at least a predetermined size. Non-Patent Document 1 discloses detecting, from an urban image (an image of an urban area), a group of planes that are candidates for advertisement presentation, selecting a region suitable for displaying the presentation image based on the calculated height of the wall surfaces, and superimposing a CG image on the urban image.

Patent Document 1: JP 2013-109469 A

Non-Patent Document 1: Hiroyuki Uchiyama, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase, Takahito Kawanishi, Kunio Kashino, "Augmented Reality Advertising on Urban Structures", [online], 2011, Proceedings of the ViEW Workshop on Practical Applications of Vision Technology, [retrieved October 25, 2013], Internet <URL: http://www.murase.nuie.nagoya-u.ac.jp/publications/887-pdf.pdf>

In the techniques described in Patent Document 1 and Non-Patent Document 1, the display region of the image superimposed on the real environment is determined by the features of a captured image of that environment.
In contrast, an object of the present invention is to enable a user to place video content in the real environment that the user observes.

To solve the above problem, a video display device according to the present invention includes: an imaging unit that captures the image observed by a user and generates captured data; a display unit that displays video content superimposed on the observed image; a gesture recognition unit that recognizes, based on the captured data generated by the imaging unit, a gesture performed by the user in a spatial region; a setting unit that sets, based on the gesture recognized by the gesture recognition unit, a video presentation area on an extension line of the spatial region as viewed from the position of the user; and a display control unit that causes the display unit to display the video content so that the user observes the video content in the video presentation area set by the setting unit.

In the video display device of the present invention, the gesture recognition unit may recognize a gesture, performed by the user, that encloses the spatial region in a plane.
The video display device of the present invention may further include a planar region detection unit that detects a planar region existing on the extension line, and the setting unit may set the video presentation area on the planar region detected by the planar region detection unit.

In the video display device of the present invention, when the video presentation area is no longer observed by the user, the display control unit may suspend playback of the video content in that video presentation area, and may resume the playback when the video presentation area is observed again.

In the video display device of the present invention, when a plurality of the video presentation areas are observed by the user, the display control unit may perform control so that the video content of one of the video presentation areas is not observed by the user, or is harder to observe than the video content of the other video presentation areas.

The video display device of the present invention may include a setting information acquisition unit that acquires, from another video display device, setting information indicating the setting of the video presentation area. In that case, the setting unit may set the video presentation area based on the setting information acquired by the setting information acquisition unit, and the display control unit may display the same video content as the other video display device.

The video display device of the present invention may include a correction unit that corrects the video presentation area set by the setting unit based on a gesture recognized by the gesture recognition unit, and the gesture recognition unit may recognize a gesture for correcting the video presentation area while the set video presentation area is displayed on the display unit.

A video presentation method according to the present invention is a method for a video display device that includes an imaging unit that captures the image observed by a user and generates captured data, and a display unit that displays video content superimposed on the observed image. The method includes: recognizing, based on the captured data generated by the imaging unit, a gesture performed by the user in a spatial region; setting, based on the recognized gesture, a video presentation area on an extension line of the spatial region as viewed from the position of the user; and causing the display unit to display the video content so that the user observes the video content in the set video presentation area.

A program according to the present invention causes a computer of a video display device, which includes an imaging unit that captures the image observed by a user and generates captured data and a display unit that displays video content superimposed on the observed image, to execute: recognizing, based on the captured data generated by the imaging unit, a gesture performed by the user in a spatial region; setting, based on the recognized gesture, a video presentation area on an extension line of the spatial region as viewed from the position of the user; and causing the display unit to display the video content so that the user observes the video content in the set video presentation area.

According to the present invention, a user can place video content in the real environment that the user observes.

A diagram showing the external configuration of a glasses-type terminal according to an embodiment of the present invention.
An explanatory diagram outlining the video presentation function of the glasses-type terminal.
A block diagram showing the hardware configuration of the glasses-type terminal.
A diagram showing a configuration example of the setting table of the glasses-type terminal.
A functional block diagram showing the functional configuration of the control unit of the glasses-type terminal.
An explanatory diagram of the gesture performed by the user of the glasses-type terminal.
A flowchart showing the flow of processing performed by the glasses-type terminal.
An explanatory diagram of the user's gesture recognized by the glasses-type terminal.
An explanatory diagram of a method of setting the video presentation area in the glasses-type terminal.
An explanatory diagram of another method of setting the video presentation area in the glasses-type terminal.
A flowchart showing the flow of processing performed by the glasses-type terminal.
A diagram showing a display example of video content on the glasses-type terminal.
A flowchart showing the flow of processing performed by the glasses-type terminal of Modification 3.
A flowchart showing the flow of processing performed by the glasses-type terminal of Modification 4.

[Embodiment]
Embodiments of the present invention are described below with reference to the drawings.
FIG. 1 is a diagram showing the external configuration of a glasses-type terminal 1 according to an embodiment of the present invention. FIG. 1 shows the glasses-type terminal 1 as worn by a user 2. The glasses-type terminal 1 is a video display device that is used while worn on the head of the user 2. It is worn so as to clasp the head of the user 2, with parts called arms resting on the user's ears 2a. The user 2 wearing the glasses-type terminal 1 observes images through the lenses of a glasses unit 10. Unless otherwise noted, the user 2 described below is assumed to be wearing the glasses-type terminal 1.

As its electrical configuration, the glasses unit 10 of the glasses-type terminal 1 includes an imaging unit 11, a display unit 12, a distance sensor 13, and a position/direction sensor 14. The imaging unit 11 includes an image sensor such as a CCD (Charge Coupled Device) and can capture both moving images and still images. The imaging unit 11 captures the image that the user observes through the glasses unit 10. It is placed at a position where the position (range) of the captured image can be regarded as matching the position (range) of the image that the user 2 observes through the glasses unit 10. When the imaging unit 11 captures a moving image or a still image, it generates captured data representing the captured image.

The display unit 12 includes an optical see-through head mounted display (HMD) and displays video content superimposed on the image that the user observes through the glasses unit 10. Here, the display unit 12 displays the video content over the left-eye lens of the glasses unit 10 so that the video is seen by the user's left eye.

The distance sensor 13 is, for example, an optical or ultrasonic distance sensor, and serves as distance detection means that detects the distance to a target in front of the user 2 (for example, in the line-of-sight direction of the user 2).
The position/direction sensor 14 is, for example, a nine-axis sensor covering three-axis acceleration, three-axis angular velocity, and three-axis geomagnetism, and serves as position/direction detection means that detects the position of the glasses-type terminal 1 and the direction in which it faces.

FIG. 2 is a diagram outlining the video presentation function of the glasses-type terminal 1.
The glasses-type terminal 1 sets (defines) a virtual area for presenting video content (hereinafter, a "video presentation area") in the real environment observed by the user 2. The glasses-type terminal 1 then displays video content on the display unit 12 so that the user 2 observes the video content in the set video presentation area. Only the user 2 wearing the glasses-type terminal 1 can observe the video content presented in a video presentation area. FIG. 2 shows an example in which video presentation areas SC1 to SC4 are set in a room space 100 where the user 2 is present. The room space 100 is enclosed by a ceiling C, side walls W, and a floor F. For example, when the user 2 looks toward the side walls W, the user 2 can observe the video content of video presentation area SC1, SC2, or SC3, and when the user 2 looks toward the ceiling C, the user 2 can observe the video content of video presentation area SC4. Likewise, when a video presentation area is set on the floor F or on a desk or other furniture placed in the room space 100, the user 2 can observe the video content by looking toward that video presentation area. As the video presentation areas SC1 to SC4 in FIG. 2 show, a video presentation area in this embodiment is a rectangular (including square) area.

FIG. 3 is a block diagram showing the hardware configuration of the glasses-type terminal 1. As shown in FIG. 3, the glasses-type terminal 1 includes the glasses unit 10, a control unit 20, a storage unit 30, and a communication unit 40.
When the imaging unit 11 of the glasses unit 10 captures an image and generates captured data, it supplies the captured data to the control unit 20. The display unit 12 displays video content under the display control of the control unit 20. The distance sensor 13 detects the distance to a target, for example when the control unit 20 sets a video presentation area, and supplies the detection result to the control unit 20. The position/direction sensor 14 repeatedly detects the position of the glasses-type terminal 1 and the direction in which it faces, for example while the glasses-type terminal 1 is operating, and supplies the detection results to the control unit 20.

The control unit 20 includes a microcomputer having a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory). The CPU controls each part of the glasses-type terminal 1 by reading a program stored in the ROM or the storage unit 30 into the RAM, which serves as a work area, and executing it. The storage unit 30 is storage means having, for example, an EEPROM (Electrically Erasable Programmable ROM), and stores the application programs executed by the control unit 20 and a setting table 31. The application programs stored in the storage unit 30 are, for example, an application program for playing moving-image or still-image content, a browser, and a mailer. All of these are application programs that display video. The communication unit 40 is an interface for communicating (typically wirelessly) with external devices.

FIG. 4 is a diagram showing a configuration example of the setting table 31. The setting table 31 is a data table that stores information about the video presentation areas that have been set.
As shown in FIG. 4, the setting table 31 associates a "record number" field, a "function" field, and a "setting information" field with one another. The "record number" field stores a number that identifies a record in the setting table 31. The records with record numbers "1" to "4" in FIG. 4 each store information about the one of the video presentation areas SC1 to SC4 described with FIG. 2 whose reference sign ends in the same digit. The "function" field stores information identifying the function assigned to each video presentation area. In this embodiment, the "function" field stores an identifier of the application program to be executed in order to use the function (hereinafter, an "application identifier"). To make the assigned functions easier to follow, the "function" field in FIG. 4 shows a character string describing each function.

The "setting information" field stores setting information indicating the setting of a video presentation area. Specifically, the "setting information" field contains a "direction information" field, a "size information" field, and a "tag information" field. The "direction information" field stores direction information indicating the direction in which the glasses-type terminal 1 was facing when the video presentation area was set. The "size information" field stores size information indicating the vertical and horizontal size of the video presentation area. The "tag information" field stores tag information associated with the video presentation area as a tag. Specifically, the tag information is captured data representing a captured image (here, a still image) of the video presentation area.
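The record layout described above can be sketched as a small in-memory data structure. This is only an illustration: the class names, the representation of the direction as a 3-component vector, the size units, and the example application identifiers are all assumptions, since the source specifies only which fields exist.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SettingInfo:
    """Contents of the "setting information" field for one video presentation area."""
    direction: Tuple[float, float, float]  # direction the terminal faced when the area was set (assumed: unit vector)
    size: Tuple[float, float]              # (height, width) of the area; units are an assumption
    tag: bytes                             # captured still image of the area, kept as tag information


@dataclass
class SettingRecord:
    record_number: int   # identifies the record within the table
    function: str        # application identifier of the assigned function
    setting: SettingInfo


@dataclass
class SettingTable:
    records: List[SettingRecord] = field(default_factory=list)

    def add(self, function: str, setting: SettingInfo) -> SettingRecord:
        """Append a record, numbering records from 1 in insertion order."""
        record = SettingRecord(len(self.records) + 1, function, setting)
        self.records.append(record)
        return record

    def find(self, record_number: int) -> Optional[SettingRecord]:
        """Return the record with the given number, or None if it does not exist."""
        for record in self.records:
            if record.record_number == record_number:
                return record
        return None


# Hypothetical example entries; the identifiers and values are made up.
table = SettingTable()
table.add("video_player", SettingInfo((0.0, 0.0, 1.0), (0.9, 1.6), b"<jpeg bytes>"))
table.add("browser", SettingInfo((1.0, 0.0, 0.0), (1.0, 1.0), b"<jpeg bytes>"))
```

Keeping the tag image next to the direction and size mirrors the table in FIG. 4, where all three are sub-fields of a single "setting information" field.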

FIG. 5 is a functional block diagram showing the functional configuration of the control unit 20 of the glasses-type terminal 1. As shown in FIG. 5, the control unit 20 implements functions corresponding to an imaging control unit 21, a gesture recognition unit 22, a setting unit 23, and a display control unit 24. The setting information acquisition unit 25 and the correction unit 26 shown in FIG. 5 are functions used in the modifications described later and are not involved in this embodiment.
The imaging control unit 21 controls image capture by the imaging unit 11. For example, the imaging control unit 21 causes the imaging unit 11 to capture a moving image when a video presentation area is being set. The imaging control unit 21 also causes the imaging unit 11 to capture a still image of the video presentation area set by the setting unit 23, and generates the tag information from it.

The gesture recognition unit 22 recognizes, based on the captured data generated by the imaging unit 11, a gesture performed by the user 2 that encloses a spatial region in a plane. FIG. 6 is a diagram illustrating the gesture performed by the user 2. As shown in FIG. 6, the user 2 forms an "L" shape with the thumb FR1 and index finger FR2 of the right hand, forms another "L" shape with the thumb FL1 and index finger FL2 of the left hand, and combines the two to form a planar, roughly rectangular spatial region. The gesture recognition unit 22 recognizes the gesture of the user 2 described with FIG. 6 and detects the spatial region designated by the gesture. The spatial region detected by the gesture recognition unit 22 is hereinafter called the "gesture region".

Based on the gesture recognized by the gesture recognition unit 22, the setting unit 23 sets a video presentation area on the extension line of the gesture region as viewed from the position of the user 2. The position of the user 2 is, for example, the position of the viewpoint of the user 2, but unless otherwise noted it is treated as the position of the imaging lens of the imaging unit 11. When the setting unit 23 sets a video presentation area, it stores setting information indicating the setting in the setting table 31 in association with the function assigned to that video presentation area. The setting unit 23 also stores, in the setting table 31, the direction information determined from the detection result of the position/direction sensor 14.

The setting unit 23 also includes a planar region detection unit 23a. The planar region detection unit 23a detects a planar region that physically exists in front of the user 2, based on the detection result of the distance sensor 13 or the captured data generated by the imaging unit 11. This planar region is, for example, the ceiling C, a side wall W, or the floor F of the room space 100. The setting unit 23 sets the video presentation area on the planar region detected by the planar region detection unit 23a.
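One way to read "on the extension line of the gesture region as viewed from the position of the user" is as a ray cast from the viewpoint through each corner of the gesture region until it meets the detected planar region. The sketch below illustrates that reading with a standard ray-plane intersection. The function names, the coordinate system (metres, viewpoint at the origin), and the representation of the plane by a point and a normal are assumptions for illustration and are not taken from the source.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def project_onto_plane(viewpoint: Vec3, corner: Vec3,
                       plane_point: Vec3, plane_normal: Vec3) -> Optional[Vec3]:
    """Extend the ray viewpoint -> corner until it hits the plane.

    Returns None when the ray is parallel to the plane or the plane lies behind the viewpoint.
    """
    direction = _sub(corner, viewpoint)
    denom = _dot(direction, plane_normal)
    if abs(denom) < 1e-9:
        return None
    t = _dot(_sub(plane_point, viewpoint), plane_normal) / denom
    if t <= 0:
        return None
    return (viewpoint[0] + t * direction[0],
            viewpoint[1] + t * direction[1],
            viewpoint[2] + t * direction[2])


def presentation_area(viewpoint: Vec3, gesture_corners: List[Vec3],
                      plane_point: Vec3, plane_normal: Vec3) -> Optional[List[Vec3]]:
    """Project every corner of the gesture region; fail if any corner misses the plane."""
    projected = [project_onto_plane(viewpoint, c, plane_point, plane_normal)
                 for c in gesture_corners]
    if any(p is None for p in projected):
        return None
    return projected


# Gesture region held 0.5 m in front of the eye; wall at z = 2 m facing the user.
corners = [(-0.1, -0.05, 0.5), (0.1, -0.05, 0.5), (0.1, 0.05, 0.5), (-0.1, 0.05, 0.5)]
area = presentation_area((0.0, 0.0, 0.0), corners, (0.0, 0.0, 2.0), (0.0, 0.0, 1.0))
```

In this example the wall is four times farther away than the hands, so the projected area is four times larger in each dimension than the gesture region.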

Based on the setting table 31, the display control unit 24 causes the display unit 12 to display video content so that the user 2 observes the video content in the video presentation area set by the setting unit 23. The display control unit 24 displays the video content on the display unit 12 in a way that gives the user 2 the sensation (for example, a sense of perspective) that the video content is projected onto the video presentation area.

FIG. 7 is a flowchart showing the flow of the processing that the glasses-type terminal 1 performs to set a video presentation area.
First, the control unit 20 of the glasses-type terminal 1 executes an application program (step S1). Here, the control unit 20 executes, in accordance with an instruction from the user 2, an application program for displaying video content on the display unit 12.

Next, the control unit 20 recognizes the three-dimensional space around the glasses-type terminal 1 (step S2). To recognize the three-dimensional space in front of the user 2, the control unit 20 obtains a three-dimensional point cloud by SfM (Structure from Motion), based on, for example, captured data representing a moving image generated by the imaging unit 11. Specifically, the control unit 20 detects feature points in the captured image represented by the captured data, using, for example, the known Harris operator. Next, the control unit 20 tracks the feature points across captured data of multiple frames that are consecutive in time, using, for example, a KLT (Kanade-Lucas-Tomasi) tracker. The control unit 20 then estimates the three-dimensional positions of the tracked feature points by SfM.
Alternatively, the control unit 20 may recognize the three-dimensional space based on, for example, the distances detected by the distance sensor 13, or by a method that uses a stereo camera.
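To make the feature-point detection step more concrete, the following sketch computes the standard Harris corner response R = det(M) - k * trace(M)^2 at a single pixel, where M is the sum of gradient products over a small window. It is a minimal illustration under several assumptions that the source does not state: central-difference gradients, an unweighted 3x3 window, and k = 0.04. Tracking and SfM are not shown.

```python
from typing import List


def harris_response(img: List[List[float]], y: int, x: int, k: float = 0.04) -> float:
    """Harris corner response at pixel (y, x).

    Uses central-difference gradients summed over an unweighted 3x3 window,
    so (y, x) must lie at least 2 pixels away from the image border.
    """
    sxx = syy = sxy = 0.0
    for wy in range(y - 1, y + 2):
        for wx in range(x - 1, x + 2):
            ix = (img[wy][wx + 1] - img[wy][wx - 1]) / 2.0  # horizontal gradient
            iy = (img[wy + 1][wx] - img[wy - 1][wx]) / 2.0  # vertical gradient
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace


# Synthetic 12x12 image: a bright square occupying the lower-right quadrant.
SIZE = 12
image = [[1.0 if (x >= 6 and y >= 6) else 0.0 for x in range(SIZE)] for y in range(SIZE)]

corner = harris_response(image, 6, 6)  # at the corner of the square
edge = harris_response(image, 9, 6)    # on the vertical edge of the square
flat = harris_response(image, 2, 2)    # in the uniform background
```

The response is positive at the corner, negative along the edge, and zero in the flat region, which is why thresholding R selects points that can be tracked reliably from frame to frame.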

Next, the control unit 20 recognizes the gesture performed by the user 2 (step S3). Based on, for example, the captured data of a single frame, the control unit 20 recognizes the gesture described with FIG. 6 as performed by the user 2 at a particular moment. As shown in FIG. 8(a), the user 2 performs this gesture in front of the eyes, on the far side of the lenses of the glasses unit 10. By performing a gesture that forms the hatched region in FIG. 8(a), the user 2 designates the position and size of the video presentation area.

In step S3, the control unit 20 may recognize the gesture performed by the user 2 using a known gesture recognition technique. One example is the gesture recognition technique called SixthSense (see http://www.pranavmistry.com/projects/sixthsense/ or http://kissaten-no-heya.blogspot.jp/2012/08/tedmitpattie-maessixth-sense.html). In this technique, the user attaches a marker to the thumb and index finger of each hand and performs the gesture described with FIG. 6. A camera carried by the user recognizes the gesture by detecting each marker in the captured image. Another example is a gesture recognition technique that uses Ubi-Camera (see http://japanese.engadget.com/2012/03/30/ubi-camera/). In this technique, the user wears a small camera on one finger (typically the index finger of the right hand) and performs the gesture described with FIG. 6. When the camera on the user's finger detects that its pressure sensor has been pressed by another finger (typically the thumb of the left hand), it recognizes the gesture performed by the user. When this technique is applied to the glasses-type terminal 1, the control unit 20 recognizes the gesture by, for example, acquiring from the camera the captured data or information indicating that the pressure sensor has been pressed.
The control unit 20 may also recognize the gesture by yet another method. For example, the control unit 20 may detect image regions that are skin-colored or close to skin color in the captured image represented by the captured data, and recognize the gesture by image analysis.
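The skin-color approach mentioned last can be illustrated with a simple per-pixel test followed by a bounding box over the matching pixels. The fixed RGB rule below is one heuristic of the kind found in the skin-detection literature; its thresholds, the function names, and the nested-list image format are assumptions for illustration, not values or interfaces from the source, and a real system would need something more robust to lighting and skin tone.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def is_skin(r: int, g: int, b: int) -> bool:
    """Illustrative fixed-threshold RGB rule for skin-like colors (thresholds are assumptions)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15
            and r > g and r > b)


def skin_bounding_box(pixels: List[List[Pixel]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) of all skin-like pixels, or None if there are none."""
    xs, ys = [], []
    for y, row in enumerate(pixels):
        for x, (r, g, b) in enumerate(row):
            if is_skin(r, g, b):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# 4x4 test image: bluish background with two skin-like pixels in row 2.
BACKGROUND = (30, 90, 200)
SKIN = (220, 170, 140)
frame = [[BACKGROUND] * 4 for _ in range(4)]
frame[2][1] = SKIN
frame[2][2] = SKIN
box = skin_bounding_box(frame)
```

The resulting region would then be passed to a shape analysis step (for example, finding the two "L" shapes of FIG. 6), which this sketch does not attempt.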

図７の説明に戻る。
制御部２０は、眼鏡型端末１が向く方向、及び、ジェスチャ領域のサイズを認識する（ステップＳ４）。まず、制御部２０は、ジェスチャを認識したときに位置方向センサ１４が検出した方向に基づいて、眼鏡型端末１が向く方向を認識する。また、制御部２０は、図８（ｂ）に示すように、右手の親指ＦＲ１と左手の人差し指ＦＬ２とが交差する位置にある点Ｐ１と、右手の人差し指ＦＲ２と左手の親指ＦＬ１とが交差する位置にある点Ｐ２とを認識する。そして、制御部２０は、点Ｐ１と点Ｐ２とを結ぶ線分を対角線とする矩形（図８（ｂ）の斜線部）を、ジェスチャ領域Ｔとして検出する。ここでは、制御部２０は、点Ｐ１と点Ｐ２とを結ぶ線分の長さ「Ａ」を、ジェスチャ領域Ｔのサイズとして認識する。
別の方法として、制御部２０は、例えば、図８（ａ）に示した斜線部の領域（図形）の外接矩形又は内接矩形を特定して、ジェスチャ領域Ｔを検出してもよい。 Returning to the description of FIG. 7.
The control unit 20 recognizes the direction in which the glasses-type terminal 1 faces and the size of the gesture region (step S4). First, the control unit 20 recognizes the direction in which the glasses-type terminal 1 faces, based on the direction detected by the position/direction sensor 14 when the gesture was recognized. Further, as shown in FIG. 8B, the control unit 20 recognizes a point P1 at the position where the right thumb FR1 and the left index finger FL2 intersect, and a point P2 at the position where the right index finger FR2 and the left thumb FL1 intersect. Then, the control unit 20 detects, as the gesture region T, the rectangle (the hatched portion in FIG. 8B) whose diagonal is the line segment connecting the points P1 and P2. Here, the control unit 20 recognizes the length “A” of that line segment as the size of the gesture region T.
As another method, the control unit 20 may detect the gesture region T by identifying, for example, the circumscribed rectangle or the inscribed rectangle of the hatched region (figure) shown in FIG. 8A.
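The rectangle construction of step S4 can be sketched as follows. This is a minimal illustration, assuming P1 and P2 are already available as 2D image coordinates; the names `GestureRegion` and `gesture_region_from_points` are hypothetical and do not appear in the description.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GestureRegion:
    """Axis-aligned rectangle in image coordinates (hypothetical type)."""
    left: float
    top: float
    width: float   # corresponds to Ax
    height: float  # corresponds to Ay

    @property
    def diagonal(self) -> float:
        """Length A of the segment P1-P2, used as the size of the region."""
        return math.hypot(self.width, self.height)


def gesture_region_from_points(p1, p2):
    """Build the rectangle whose diagonal is the segment connecting P1 and P2."""
    (x1, y1), (x2, y2) = p1, p2
    return GestureRegion(
        left=min(x1, x2),
        top=min(y1, y2),
        width=abs(x2 - x1),
        height=abs(y2 - y1),
    )
```

The order in which P1 and P2 are passed does not matter, because the corner and the side lengths are derived with `min` and `abs`.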

次に、制御部２０は、ユーザ２の前方にある平面領域を検出する（ステップＳ５）。制御部２０は、例えば、ステップＳ２の処理で認識したユーザ２の前方における三次元空間の認識結果（特徴点群）に基づいて平面領域を検出する（非特許文献１の記載参照）。別の方法として、制御部２０は、距離センサ１３による距離の検出結果に基づいて、平面領域を検出してもよい。例えば、制御部２０は、ユーザ２の前方にある目標物までの距離が閾値以下である場合に、その距離だけ前方に離れた位置に平面領域を検出する。 Next, the control unit 20 detects a plane area in front of the user 2 (step S5). For example, the control unit 20 detects a planar area based on the recognition result (feature point group) of the three-dimensional space in front of the user 2 recognized in the process of step S2 (see the description of Non-Patent Document 1). As another method, the control unit 20 may detect the planar area based on the distance detection result by the distance sensor 13. For example, when the distance to the target in front of the user 2 is equal to or less than the threshold value, the control unit 20 detects the planar area at a position that is further forward by that distance.
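The distance-based variant of step S5, together with the branch of step S6, might be sketched as below. The function name, the use of metres, and the convention of returning `None` when no planar area is detected are assumptions made only for illustration.

```python
from typing import Optional


def detect_plane_distance(measured_distance_m: Optional[float],
                          threshold_m: float) -> Optional[float]:
    """Return the distance at which a planar area is assumed to lie, or None.

    A planar area is reported only when the distance sensor returned a valid
    reading that is at or below the threshold (hypothetical convention).
    """
    if measured_distance_m is None or measured_distance_m <= 0:
        return None  # no valid reading: treated as "no planar area" (step S6; NO)
    if measured_distance_m <= threshold_m:
        return measured_distance_m  # planar area at that distance (step S6; YES)
    return None
```

A caller would return to step S2 when the result is `None` and proceed to step S7 otherwise.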

次に、制御部２０は、平面領域を検出したかどうかを判断する（ステップＳ６）。制御部２０は、ステップＳ５の処理で平面領域を検出しなかったと判断した場合には（ステップＳ６；ＮＯ）、ステップＳ２の処理に戻る。他方、制御部２０は、ステップＳ５の処理で平面領域を検出したと判断した場合には（ステップＳ６；ＹＥＳ）、検出した平面領域上に映像提示領域を設定する（ステップＳ７）。 Next, the control unit 20 determines whether or not a planar area has been detected (step S6). When the control unit 20 determines that the plane area is not detected in the process of step S5 (step S6; NO), the control unit 20 returns to the process of step S2. On the other hand, if the control unit 20 determines that a plane area has been detected in the process of step S5 (step S6; YES), it sets a video presentation area on the detected plane area (step S7).

図９は、映像提示領域の設定方法を説明する図である。図９（ａ）には、側壁部Ｗに映像提示領域ＳＣが設定される場合の撮影部１１、ジェスチャ領域Ｔ及び側壁部Ｗの位置関係を説明する図が示されている。図９（ｂ）には、図９（ａ）の矢印Ｉ方向、すなわち、水平方向に平面視したときの撮影部１１、ジェスチャ領域Ｔ及び側壁部Ｗの位置関係が示されている。
以下に説明する映像提示領域の設定においては、以下の３つの事項を仮定する。
（仮定１）ユーザ２の位置（つまり視点の位置）と、撮影部１１の位置（レンズの位置）とが同一。
（仮定２）ユーザ２の位置（つまり視点の位置）と、ジェスチャ領域の重心とが同一の高さ。
（仮定３）側壁部Ｗとジェスチャ領域の平面方向とが平行。 FIG. 9 is a diagram illustrating a method for setting a video presentation area. FIG. 9A shows a diagram for explaining the positional relationship among the imaging unit 11, the gesture region T, and the side wall part W when the video presentation area SC is set on the side wall part W. FIG. 9B shows the positional relationship between the imaging unit 11, the gesture region T, and the side wall W when viewed in a plan view in the direction of arrow I in FIG. 9A, that is, in the horizontal direction.
In setting the video presentation area described below, the following three items are assumed.
(Assumption 1) The position of the user 2 (that is, the position of the viewpoint) is the same as the position of the photographing unit 11 (the position of the lens).
(Assumption 2) The position of the user 2 (that is, the position of the viewpoint) and the center of gravity of the gesture area are the same height.
(Assumption 3) The side wall W and the plane direction of the gesture region are parallel.

図９（ａ）に示すように、ジェスチャ領域Ｔの横方向（ここでは水平方向）の長さを「Ａｘ」とし、縦方向（ここでは鉛直方向）の長さを「Ａｙ」とする。また、図９（ｂ）に示すように、撮影部１１の位置とジェスチャ領域Ｔの重心との間の長さを「Ｂ」とする。長さＢは、図示せぬ距離センサを用いて実測されてもよいし、見込みに基づいて予め決められていてもよい。ジェスチャ領域Ｔの重心と側壁部Ｗとの間の長さを「Ｃ」とする。長さＣは、距離センサ１３の検出結果が使用される。 As shown in FIG. 9A, the length of the gesture region T in the lateral direction (here, the horizontal direction) is “Ax”, and its length in the longitudinal direction (here, the vertical direction) is “Ay”. Further, as shown in FIG. 9B, the length between the position of the photographing unit 11 and the center of gravity of the gesture region T is “B”. The length B may be measured using a distance sensor (not shown), or may be determined in advance based on an estimate. The length between the center of gravity of the gesture region T and the side wall W is “C”. The detection result of the distance sensor 13 is used as the length C.

制御部２０は、ユーザ２の位置から見てジェスチャ領域Ｔの延長線上に、映像提示領域ＳＣを設定する。具体的には、制御部２０は、撮影部１１の位置と点Ｐ１とを結ぶ線分の延長線上で、側壁部Ｗと交差する点に点Ｐｅ１を設定し、撮影部１１の位置と点Ｐ２とを結ぶ線分の延長線上で、側壁部Ｗと交差する点に点Ｐｅ２を設定する。そして、制御部２０は、点Ｐｅ１と点Ｐｅ２とを結ぶ線分を対角線とする矩形の領域を、映像提示領域ＳＣとして設定する。ここで、映像提示領域ＳＣの横方向の長さを「Ｄｘ」とし、縦方向の長さを「Ｄｙ」とし、点Ｐｅ１と点Ｐｅ２とを結ぶ線分の長さを「Ｄ」とした場合、下記式（１）の関係を満たすように、制御部２０は映像提示領域ＳＣを設定する。
Ｄｘ＝Ａｘ＊Ｃ／Ｂ、Ｄｙ＝Ａｙ＊Ｃ／Ｂ、Ｄ＝Ａ＊Ｃ／Ｂ・・・（１） The control unit 20 sets the video presentation area SC on the extension of the gesture region T as viewed from the position of the user 2. Specifically, the control unit 20 sets a point Pe1 where the extension of the line segment connecting the position of the photographing unit 11 and the point P1 intersects the side wall W, and sets a point Pe2 where the extension of the line segment connecting the position of the photographing unit 11 and the point P2 intersects the side wall W. The control unit 20 then sets, as the video presentation area SC, the rectangular area whose diagonal is the line segment connecting the points Pe1 and Pe2. Here, when the horizontal length of the video presentation area SC is “Dx”, the vertical length is “Dy”, and the length of the line segment connecting the points Pe1 and Pe2 is “D”, the control unit 20 sets the video presentation area SC so as to satisfy the relationship of the following formula (1).
Dx = Ax * C / B, Dy = Ay * C / B, D = A * C / B (1)
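As a worked illustration, formula (1) can be evaluated as in the following sketch. It applies the formula exactly as written in the description, with the scale factor C / B; the function name and the choice of units are assumptions, and A is computed here as the diagonal of the Ax by Ay rectangle.

```python
import math


def presentation_size(ax: float, ay: float, b: float, c: float):
    """Return (Dx, Dy, D) according to formula (1): each length is scaled by C / B."""
    if b <= 0 or c <= 0:
        raise ValueError("lengths B and C must be positive")
    scale = c / b
    a = math.hypot(ax, ay)  # diagonal A of the gesture region
    return ax * scale, ay * scale, a * scale
```

For example, a gesture region of 0.3 by 0.4 with B = 0.5 and C = 2.0 is scaled by a factor of 4 in both directions, so the aspect ratio of the gesture region is preserved.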

図９で説明した映像提示領域ＳＣの設定方法によると、ユーザ２の位置とジェスチャ領域Ｔとで構成される四角錘と、ユーザ２の位置と映像提示領域ＳＣとで構成される四角錘とが相似となる。また、式（１）から分かるように、制御部２０は、長さＢと長さＣとの比に応じた倍率で、ジェスチャ領域Ｔを縦横に等倍することによって、映像提示領域ＳＣのサイズを決定していることになる。 According to the method for setting the video presentation area SC described with reference to FIG. 9, the quadrangular pyramid formed by the position of the user 2 and the gesture region T is similar to the quadrangular pyramid formed by the position of the user 2 and the video presentation area SC. Further, as can be seen from formula (1), the control unit 20 determines the size of the video presentation area SC by scaling the gesture region T by the same factor vertically and horizontally, the factor depending on the ratio of the length B to the length C.

ところで、図９で説明した映像提示領域の設定方法では、（仮定１）を設けることにより、撮影部１１の位置とユーザ２の視点の位置とのずれを無視していた。しかし、撮影部１１の位置とユーザ２の視点の位置とは厳密には一致しない。そこで、制御部２０は、ユーザ２が眼鏡型端末１を装着した際の撮影部１１の位置と、ユーザ２の視点の位置との相対的な位置関係から得られるパラメータ（例えば、撮影部１１の外部パラメータ）に基づいて、撮影部１１の撮影画像を変換する処理を、撮影データに施すことが望ましい。この場合、制御部２０は、撮影部１１の撮影画像が、ユーザ２の視点で観察される像と同じとなるように変換処理を施すとよい。 In the video presentation area setting method described with reference to FIG. 9, (Assumption 1) means that the offset between the position of the photographing unit 11 and the position of the viewpoint of the user 2 is ignored. Strictly speaking, however, the two positions do not coincide. It is therefore desirable that the control unit 20 apply to the captured data a process that converts the captured image of the photographing unit 11, based on parameters obtained from the relative positional relationship between the position of the photographing unit 11 when the user 2 wears the glasses-type terminal 1 and the position of the viewpoint of the user 2 (for example, the extrinsic parameters of the photographing unit 11). In this case, the control unit 20 may perform the conversion so that the captured image of the photographing unit 11 matches the image observed from the viewpoint of the user 2.

また、（仮定２）及び（仮定３）に関し、撮影部１１の位置とジェスチャ領域Ｔの重心とが同一の高さになく、撮影部１１が上下に傾く場合がある。この場合、映像提示領域が設定される平面領域（側壁部Ｗ）と、ジェスチャ領域Ｔの平面方向とが非平行となる。この場合、（仮定２）及び（仮定３）が成り立たない。そこで、図１０に示すように、撮影部１１が水平方向から角度θで下向きに傾いた場合、制御部２０は、下記式（２）の関係を満たすように映像提示領域ＳＣ'を設定する。
Ｄｘ'＝Ｄｘ、Ｄｙ'＝Ｄｙ＊ｃｏｓθ ・・・（２） Regarding (Assumption 2) and (Assumption 3), the position of the photographing unit 11 and the center of gravity of the gesture region T are not at the same height, and the photographing unit 11 may be tilted up and down. In this case, the plane area (side wall W) where the video presentation area is set and the plane direction of the gesture area T are not parallel. In this case, (Assumption 2) and (Assumption 3) do not hold. Therefore, as shown in FIG. 10, when the photographing unit 11 is tilted downward at an angle θ from the horizontal direction, the control unit 20 sets the video presentation area SC ′ so as to satisfy the relationship of the following formula (2).
Dx ′ = Dx, Dy ′ = Dy * cos θ (2)

式（２）において、映像提示領域ＳＣ'の横方向の長さを「Ｄｘ'」とし、縦方向の長さを「Ｄｙ'」とする。θの値は、例えば、眼鏡型端末１に設けられ、撮影部１１又は眼鏡型端末１の姿勢を検出するセンサを用いて算出される。撮影部１１又は眼鏡型端末１の姿勢を検出するセンサは、例えば、加速度センサやジャイロセンサ等である。
なお、撮影部１１が上向きに傾いた場合も、制御部２０は、図１０で説明した方法で映像提示領域ＳＣ'を設定してよい。 In Expression (2), the horizontal length of the video presentation area SC ′ is “Dx ′”, and the vertical length is “Dy ′”. The value of θ is calculated using, for example, a sensor that is provided in the glasses-type terminal 1 and detects the posture of the photographing unit 11 or the glasses-type terminal 1. The sensor that detects the posture of the photographing unit 11 or the glasses-type terminal 1 is, for example, an acceleration sensor or a gyro sensor.
Even when the photographing unit 11 is tilted upward, the control unit 20 may set the video presentation area SC ′ by the method described with reference to FIG. 10.
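Formula (2) can be illustrated with the short sketch below. It assumes that θ has already been obtained in radians from a posture sensor; the function name is hypothetical. Because the cosine is an even function, the same expression gives the same result for an upward tilt of the same magnitude.

```python
import math


def tilt_corrected_size(dx: float, dy: float, theta_rad: float):
    """Return (Dx', Dy') per formula (2): Dx' = Dx, Dy' = Dy * cos(theta)."""
    return dx, dy * math.cos(theta_rad)
```

With θ = 0 the area is unchanged, and as the tilt grows the vertical length shrinks while the horizontal length stays the same.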

ところで、表示部１２の視野角は、一般に人間の視界や撮影部１１の画角に比べて小さい。このため、図８（ｂ）で説明したジェスチャ領域Ｔと、ユーザ２の視線とを結ぶ四角錐で要求される視野角（この場合、水平視野角＝２ａｒｃｔａｎ（Ａｘ／２Ｃ）、垂直視野角=２ａｒｃｔａｎ（Ａｙ／２Ｃ））が、表示部１２の視野角を超える場合がある。この場合、制御部２０は、表示部１２の視野角を上限として映像提示領域を設定してもよい。この設定方法とした場合、ユーザ２からは、要求した大きさよりも小さい画面で映像コンテンツが観察されるが、画面全体の映像コンテンツを一度に観察することができる。又は、制御部２０は、映像提示領域をユーザの要求どおりに表示部１２の視野角以上に設定し、表示部１２に表示する映像コンテンツを、画面全体ではなく、画面の一部としてもよい。この設定方法とした場合、ユーザ２からは、要求した大きさとおりで映像コンテンツが観察され、固定した場所からでは画面の一部の映像コンテンツしか観察されない。ユーザ２は、画面全体の映像コンテンツを観察するためには、自身が移動して視界に入る映像提示領域の範囲を移動させることとなる。 By the way, the viewing angle of the display unit 12 is generally smaller than the human field of view and the angle of view of the photographing unit 11. For this reason, the viewing angle required in the quadrangular pyramid connecting the gesture region T described in FIG. 8B and the line of sight of the user 2 (in this case, horizontal viewing angle = 2 arctan (Ax / 2C), vertical viewing angle = 2 arctan (Ay / 2C)) may exceed the viewing angle of the display unit 12 in some cases. In this case, the control unit 20 may set the video presentation area with the viewing angle of the display unit 12 as an upper limit. In this setting method, the video content is observed from the user 2 on a screen smaller than the requested size, but the video content on the entire screen can be observed at a time. Alternatively, the control unit 20 may set the video presentation area to be larger than the viewing angle of the display unit 12 as requested by the user, and the video content displayed on the display unit 12 may be a part of the screen instead of the entire screen. In the case of this setting method, the video content is observed from the user 2 in the requested size, and only a part of the video content on the screen is observed from a fixed location. In order to observe the video content on the entire screen, the user 2 moves the range of the video presentation area in which the user 2 moves and enters the field of view.
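The viewing-angle check in this paragraph might be sketched as follows. The first function evaluates the expressions given in the text, 2 arctan(Ax / 2C) and 2 arctan(Ay / 2C), as written. The second function is an assumption of this sketch rather than something stated in the description: it derives a single shrink factor so that both angles fit the display while the aspect ratio is kept. All names are hypothetical, and angles are in radians.

```python
import math


def required_view_angles(ax: float, ay: float, c: float):
    """Horizontal and vertical viewing angles as given in the text."""
    return 2 * math.atan(ax / (2 * c)), 2 * math.atan(ay / (2 * c))


def fit_scale(required_h: float, required_v: float,
              display_h: float, display_v: float) -> float:
    """Uniform shrink factor (at most 1.0) so both required angles fit the display.

    All angles must lie strictly between 0 and pi.
    """
    scale_h = math.tan(display_h / 2) / math.tan(required_h / 2)
    scale_v = math.tan(display_v / 2) / math.tan(required_v / 2)
    return min(1.0, scale_h, scale_v)
```

A factor of 1.0 means that the requested area already fits. A smaller factor corresponds to the first option in the text, where the area is capped by the viewing angle of the display unit 12 and the whole content remains visible at once.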

図７に戻り、次に、制御部２０は、設定した映像提示領域を撮影部１１に撮影させる。ここでは、制御部２０は、ユーザ２の手等が映り込まないように、側壁部Ｗにおける映像提示領域ＳＣに相当する部分を、撮影部１１に撮影させる。そして、制御部２０は、撮影部１１により生成された静止画像を表す撮影データをタグ情報として、設定テーブル３１のステップＳ７の処理と同じレコードに格納する（ステップＳ８）。
そして、制御部２０は、ステップＳ４の処理で位置方向センサ１４が検出した方向を示す方向情報を、設定テーブル３１のステップＳ７，Ｓ８の処理と同じレコードに格納する（ステップＳ９）。
以上が、映像提示領域を設定するときの眼鏡型端末１の動作の説明である。 Returning to FIG. 7, next, the control unit 20 causes the photographing unit 11 to photograph the set video presentation area. Here, the control unit 20 causes the photographing unit 11 to photograph the portion of the side wall W corresponding to the video presentation area SC so that the hand of the user 2 or the like does not appear in the image. The control unit 20 then stores the captured data representing the still image generated by the photographing unit 11, as tag information, in the same record of the setting table 31 as in the process of step S7 (step S8).
The control unit 20 then stores direction information indicating the direction detected by the position/direction sensor 14 in the process of step S4 in the same record of the setting table 31 as in the processes of steps S7 and S8 (step S9).
The above is the description of the operation of the glasses-type terminal 1 when setting the video presentation area.

図１１は、眼鏡型端末１が映像コンテンツをユーザ２に提示するときに行う処理の流れを示すフローチャートある。以下、図２で説明した映像提示領域ＳＣ１の映像コンテンツを提示するときの眼鏡型端末１の動作を説明する。以下の動作中において、制御部２０は、撮影部１１に動画像を撮影させている。
制御部２０は、ユーザ２が映像提示領域を観察中か否かを判断する（ステップＳ１１）。制御部２０は、撮影部１１が生成した撮影データが表す１フレームの撮影画像と、設定テーブル３１に格納された各タグ情報が表す撮影画像とを、例えば公知のパターンマッチングの技術を用いて照合する。そして、制御部２０は、両者の同一性に基づいて（例えば一致度）、どのタグ情報に対応する映像提示領域が、ユーザ２により観察されているかを判断する。
ステップＳ１１の処理で、制御部２０は、設定テーブル３１の全てのレコードのタグ情報を検索対象としてもよいが、位置方向センサ１４の検出結果に基づいて、検索対象とするレコード（タグ情報）を絞り込んでもよい。例えば、制御部２０は、眼鏡型端末１が現在向いている方向を基準として所定の範囲内を示す方向情報を含むレコードを、検索対象とする。この検索対象の絞込みにより、ステップＳ１の処理に関する制御部２０の処理量が減る。 FIG. 11 is a flowchart showing a flow of processing performed when the glasses-type terminal 1 presents video content to the user 2. Hereinafter, the operation of the glasses-type terminal 1 when presenting video content in the video presentation area SC1 described in FIG. 2 will be described. During the following operations, the control unit 20 causes the imaging unit 11 to capture a moving image.
The control unit 20 determines whether the user 2 is observing a video presentation area (step S11). The control unit 20 collates the one-frame captured image represented by the captured data generated by the photographing unit 11 with the captured image represented by each piece of tag information stored in the setting table 31, using, for example, a known pattern matching technique. Then, based on the similarity between the two (for example, the degree of matching), the control unit 20 determines which tag information corresponds to the video presentation area being observed by the user 2.
In the process of step S11, the control unit 20 may search the tag information of all records in the setting table 31, or it may narrow down the records (tag information) to be searched based on the detection result of the position/direction sensor 14. For example, the control unit 20 searches only records whose direction information falls within a predetermined range of the direction in which the glasses-type terminal 1 is currently facing. Narrowing down the search targets reduces the processing load on the control unit 20 for the process of step S11.
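The narrowing of search targets could look like the sketch below. It assumes, purely for illustration, that the direction information is stored as a compass heading in degrees under a key named `direction_deg`; the description does not specify the record layout, and both function names are hypothetical.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two headings, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def candidate_records(records, current_heading_deg: float, tolerance_deg: float):
    """Keep only records whose stored heading is within the tolerance of the current one."""
    return [
        record for record in records
        if angular_difference(record["direction_deg"], current_heading_deg) <= tolerance_deg
    ]
```

Only the surviving records would then be passed to the pattern matching of step S11. The wrap-around handling matters near north, where headings of 350 and 5 degrees are only 15 degrees apart.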

制御部２０は、ユーザ２が映像提示領域を観察中と判断すると（ステップＳ１１；ＹＥＳ）、観察中の映像提示領域における映像コンテンツの表示位置を特定する（ステップＳ１２）。この表示位置は、表示部１２における位置である。制御部２０は、設定テーブル３１に格納されたタグ情報が示す画像とこのタグ情報に対応する映像提示領域との位置関係と、撮影部１１の現在の撮影画像と当該映像提示領域との位置関係とが同じとなる（維持する）ように、映像コンテンツの表示位置を特定する。すなわち、制御部２０は、映像提示領域を観察するユーザ２に対し、この映像提示領域が側壁部Ｗ等の平面領域に固定されているような感覚を与えるような位置関係に維持する。 When the control unit 20 determines that the user 2 is observing a video presentation area (step S11; YES), it specifies the display position of the video content in the video presentation area being observed (step S12). This display position is a position on the display unit 12. The control unit 20 specifies the display position of the video content so that the positional relationship between the current captured image of the photographing unit 11 and the video presentation area stays the same as the positional relationship between the image indicated by the tag information stored in the setting table 31 and the video presentation area corresponding to that tag information. That is, the control unit 20 maintains a positional relationship that gives the user 2 observing the video presentation area the impression that the area is fixed to a planar area such as the side wall W.

そして、制御部２０は、映像提示領域においてユーザ２により映像コンテンツが観察されるように、ステップＳ１２の処理で特定した表示部１２の表示位置に、映像コンテンツを表示させる（ステップＳ１３）。制御部２０は、設定テーブル３１に基づいて観察中の映像提示領域に割り当てられた機能を特定し、特定した機能を利用するためのアプリケーションプログラムに基づいて、映像コンテンツを表示させる。制御部２０は、映像コンテンツの表示に使用するアプリケーションプログラムを予め実行していてもよいし、ステップＳ１３の処理で実行してもよい。図１２（ａ）に示すように、ユーザ２が映像提示領域ＳＣ１の全体を観察している場合、ユーザ２は、映像提示領域ＳＣ１に割り当てられた機能の映像コンテンツの全体を、映像提示領域ＳＣ１（側壁部Ｗ）に表示されているような感覚で観察することができる。 Then, the control unit 20 displays the video content at the display position on the display unit 12 specified in the process of step S12, so that the user 2 observes the video content in the video presentation area (step S13). The control unit 20 identifies, based on the setting table 31, the function assigned to the video presentation area being observed, and displays the video content based on the application program for using the identified function. The control unit 20 may start the application program used for displaying the video content in advance, or may start it in the process of step S13. As shown in FIG. 12A, when the user 2 is observing the entire video presentation area SC1, the user 2 can observe the entire video content of the function assigned to the video presentation area SC1 as if it were displayed on the video presentation area SC1 (the side wall W).

次に、制御部２０は、ユーザ２が映像提示領域を観察中か否かを判断する（ステップＳ１４）。制御部２０は、ユーザ２が映像提示領域を観察中と判断すると（ステップＳ１４；ＹＥＳ）、ステップＳ１２の処理に戻る。映像提示領域の観察中においては、制御部２０は、ステップＳ１２〜Ｓ１４の処理ステップを繰り返し実行する。 Next, the control unit 20 determines whether or not the user 2 is observing the video presentation area (step S14). When the control unit 20 determines that the user 2 is observing the video presentation area (step S14; YES), the control unit 20 returns to the process of step S12. During observation of the video presentation area, the control unit 20 repeatedly executes the processing steps of steps S12 to S14.

ここで、ユーザ２が視線の方向を変えた場合の眼鏡型端末１の動作を説明する。
図１２（ａ）で説明した方向を見ていたユーザ２が、例えば右上方向に視線を変更したとする。この場合、制御部２０は、映像提示領域ＳＣ１の映像コンテンツの表示位置を変更する。図１２（ｂ）に示すように、映像提示領域ＳＣ１の右上部分のみをユーザ２が観察している場合には、制御部２０は、その観察部分に対応する部分の映像コンテンツだけが観察されるように、映像コンテンツの表示位置を変更する。図１２（ｂ）に示す方向を見ていたユーザ２が、例えば更に視線を変更して、映像提示領域ＳＣ１を全く観察しなかったとする。この場合、制御部２０は、図１２（ｃ）に示すように映像提示領域ＳＣ１の映像コンテンツを表示させない。このとき、ユーザ２は、実空間のみを観察している状態にある。
その後、ステップＳ１４の処理で、制御部２０は、映像提示領域の観察中でないと判断すると（ステップＳ１４；ＮＯ）、映像コンテンツの表示を終了する（ステップＳ１５）。 Here, the operation of the glasses-type terminal 1 when the user 2 changes the direction of the line of sight will be described.
Suppose that the user 2, who was looking in the direction described with reference to FIG. 12A, shifts his or her line of sight toward the upper right, for example. In this case, the control unit 20 changes the display position of the video content in the video presentation area SC1. As shown in FIG. 12B, when the user 2 is observing only the upper right part of the video presentation area SC1, the control unit 20 changes the display position so that only the part of the video content corresponding to the observed part is seen. Suppose that the user 2, who was looking in the direction shown in FIG. 12B, shifts his or her line of sight further and no longer observes the video presentation area SC1 at all. In this case, the control unit 20 does not display the video content of the video presentation area SC1, as shown in FIG. 12C. At this time, the user 2 is observing only the real space.
Thereafter, when the control unit 20 determines in the process of step S14 that the video presentation area is not being observed (step S14; NO), the display of the video content is terminated (step S15).
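The three situations of FIG. 12 (whole area visible, part visible, nothing visible) can be modelled as a rectangle intersection, as in the following sketch. It assumes that both the video presentation area and the field of view of the display are expressed as `(left, top, right, bottom)` rectangles in one common coordinate system; that representation and the function name are hypothetical.

```python
def visible_part(area, view):
    """Return the part of the presentation area inside the view, or None if there is none.

    Both arguments are (left, top, right, bottom) tuples with left < right and top < bottom.
    """
    left = max(area[0], view[0])
    top = max(area[1], view[1])
    right = min(area[2], view[2])
    bottom = min(area[3], view[3])
    if left >= right or top >= bottom:
        return None  # no overlap: corresponds to ending the display (step S15)
    return (left, top, right, bottom)
```

A result equal to the full area corresponds to the case of FIG. 12A, a smaller rectangle to FIG. 12B, and `None` to FIG. 12C.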

以上説明した実施形態の眼鏡型端末１を装着したユーザ２は、眼前で行ったジェスチャにより、ユーザ２が観察する実環境に映像を配置させることができる。このため、ユーザ２は、直感的で、且つ簡単なジェスチャによって仮想的な映像提示領域を配置し、更に、その映像提示領域にあるディスプレイで映像コンテンツが表示されているような感覚で、この映像コンテンツを観察することができる。
また、眼鏡型端末１は、図２で説明したように複数の映像提示領域を同時に設定して、その各々で異なる映像コンテンツを提示しうる。このため、ユーザ２は、マルチタスクで実行される複数の機能の映像コンテンツを、視線の方向や自身の位置を変えながら観察することができる。
また、汎用のＨＭＤでは、ユーザが装着した状態では眼前に常に映像が表示されるため、例えば、ユーザ２にとっては眼精疲労の原因となることがある。また、ユーザ２が一時的に映像を視聴しないようにする場合には、ユーザは映像の再生を中断させるか、又は、装着していたＨＭＤを一旦取り外す必要がある。これに対し、眼鏡型端末１によれば、ユーザ２は映像提示領域を設定した場所を観察しなければ、映像コンテンツが観察されない（図１２（ｃ）参照）。このため、ユーザ２にとっての使用負担の軽減の効果も期待できる。 The user 2 wearing the glasses-type terminal 1 of the embodiment described above can place an image in the real environment observed by the user 2 by a gesture performed in front of the eyes. For this reason, the user 2 arranges the virtual video presentation area with an intuitive and simple gesture, and further, this video is displayed as if the video content is displayed on the display in the video presentation area. Content can be observed.
Further, as described with reference to FIG. 2, the glasses-type terminal 1 can simultaneously set a plurality of video presentation areas and present different video contents. For this reason, the user 2 can observe the video content of a plurality of functions executed by multitasking while changing the direction of the line of sight and the position of the user.
Moreover, in a general-purpose HMD, an image is always displayed in front of the eyes when worn by the user, and for example, it may cause eyestrain for the user 2. Further, when the user 2 is temporarily prevented from viewing the video, the user needs to interrupt the playback of the video or to remove the attached HMD once. On the other hand, according to the glasses-type terminal 1, the video content is not observed unless the user 2 observes the place where the video presentation area is set (see FIG. 12C). For this reason, the effect of reducing the use burden for the user 2 can also be expected.

［変形例］
本発明は、上述した実施形態と異なる形態で実施することが可能である。本発明は、例えば、以下のような形態で実施することも可能である。また、以下に示す変形例は、各々を適宜に組み合わせてもよい。
（変形例１）
制御部２０は、動画コンテンツを表示する場合には、ユーザ２が観察中の期間にのみ、動画コンテンツを再生してもよい。すなわち、眼鏡型端末１の制御部２０は、動画コンテンツの再生中に、ユーザ２により映像提示領域の全体が観察されなくなった場合には、その映像提示領域において観察される動画コンテンツの再生を中断（一時停止）する。その後、制御部２０は、ユーザ２により映像提示領域の少なくとも一部でも再び観察されたときに、中断していた動画コンテンツの再生を再開する。別の方法として、制御部２０は、ユーザ２により映像提示領域の少なくとも一部が観察されなくなった場合には、動画コンテンツの再生を中断し、映像提示領域の全体が再び観察されたときに、動画コンテンツの再生を再開してもよい。この変形例の眼鏡型端末１によれば、ユーザ２が動画コンテンツを部分的に見逃す可能性を低くすることができる。
この変形例において、制御部２０は、動画コンテンツ以外の映像コンテンツについて、ユーザ２が観察中の期間にのみ再生してもよい。例えば、制御部２０は、指定されたアプリケーションプログラムに関連付けられた映像提示領域を、ユーザ２が観察していない期間においては、そのアプリケーションプログラムを利用した処理（作業）を中断させ、また、映像コンテンツの表示を中断させる。ユーザ２が映像提示領域を観察中でないときに処理を中断するか否かについては、アプリケーションプログラム毎に予め設定されていてもよいし、ユーザ２が逐次明示的に指定してもよい。ユーザ２が指定する場合の指定方法は、例えば、手や指を用いたジェスチャを用いる方法や音声入力を用いる方法等がある。 [Modification]
The present invention can be implemented in a form different from the above-described embodiment. The present invention can also be implemented in the following forms, for example. Further, the following modifications may be combined as appropriate.
(Modification 1)
When displaying moving image content, the control unit 20 may play back the moving image content only while the user 2 is observing it. That is, if the user 2 no longer observes any part of the video presentation area during playback of the moving image content, the control unit 20 of the glasses-type terminal 1 interrupts (pauses) playback of the moving image content observed in that video presentation area. Thereafter, when the user 2 again observes at least a part of the video presentation area, the control unit 20 resumes the interrupted playback. As another method, the control unit 20 may interrupt playback of the moving image content when at least a part of the video presentation area is no longer observed by the user 2, and resume playback when the entire video presentation area is observed again. According to the glasses-type terminal 1 of this modification, the possibility that the user 2 misses part of the moving image content can be reduced.
In this modification, the control unit 20 may likewise present video content other than moving image content only while the user 2 is observing it. For example, during a period in which the user 2 is not observing the video presentation area associated with a designated application program, the control unit 20 suspends the processing (work) that uses that application program and suspends the display of the video content. Whether to suspend processing when the user 2 is not observing the video presentation area may be set in advance for each application program, or may be explicitly specified by the user 2 each time. The user 2 may specify this, for example, by a gesture using a hand or fingers or by voice input.
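The two pause and resume policies of Modification 1 reduce to a simple rule on how much of the video presentation area is currently observed. The sketch below is illustrative only; the `Visibility` enumeration and the flag name are hypothetical.

```python
from enum import Enum


class Visibility(Enum):
    NONE = 0      # no part of the video presentation area is observed
    PARTIAL = 1   # only a part is observed
    FULL = 2      # the entire area is observed


def should_play(visibility: Visibility, require_full_view: bool) -> bool:
    """Decide whether moving image content should be playing.

    require_full_view=False: pause only when nothing is observed (first method).
    require_full_view=True:  pause as soon as any part is lost (alternative method).
    """
    if require_full_view:
        return visibility is Visibility.FULL
    return visibility is not Visibility.NONE
```

The two policies differ only in the partially observed case: the first keeps playing, while the alternative pauses until the entire area is observed again.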

（変形例２）
制御部２０は、ユーザ２によって複数の映像提示領域が同時に観察される場合には、いずれか１つの映像提示領域で映像コンテンツが観察されるように表示させ、他の映像提示領域では映像コンテンツを表示させないようにしてもよい。制御部２０は、どのような条件に基づいて映像コンテンツを表示させる映像提示領域を決定してもよいが、例えば、ユーザ２によって観察される複数の映像提示領域のうち、最も面積が大きい映像提示領域、撮影部１１の撮影画像において最も中心に近い位置に配置されている映像提示領域、又は、優先度が最も高い機能の映像提示領域で映像コンテンツが観察されるようにする。 (Modification 2)
When a plurality of video presentation areas are observed simultaneously by the user 2, the control unit 20 displays the video content so that the video content is observed in any one of the video presentation areas, and displays the video content in the other video presentation areas. You may make it not display. The control unit 20 may determine the video presentation area in which the video content is displayed based on any condition. For example, the video presentation with the largest area among the plurality of video presentation areas observed by the user 2 is provided. The video content is observed in the area, the video presentation area arranged at the position closest to the center in the captured image of the imaging unit 11, or the video presentation area having the highest priority function.
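The three example criteria of Modification 2 (largest area, closest to the center of the captured image, highest priority) can be sketched as follows. The `VisibleArea` type, its fields, and the rule names are hypothetical and serve only to make the selection concrete.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VisibleArea:
    name: str
    width: float
    height: float
    center: tuple   # (x, y) position in the captured image
    priority: int   # larger value means higher priority (assumed convention)


def choose_area(areas, image_center, rule: str) -> VisibleArea:
    """Pick the single area in which video content is shown, using one of three rules."""
    if rule == "largest":
        return max(areas, key=lambda a: a.width * a.height)
    if rule == "most_central":
        return min(areas, key=lambda a: math.dist(a.center, image_center))
    if rule == "priority":
        return max(areas, key=lambda a: a.priority)
    raise ValueError(f"unknown rule: {rule}")
```

The areas not chosen would then either be left without content or, as described later in this modification, be shown in a less noticeable form.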

また、制御部２０は、ユーザ２によって複数の映像提示領域が同時に観察される場合に、いずれか１つの映像提示領域で映像コンテンツを表示させ、他の映像提示領域の映像コンテンツをユーザ２により観察されにくくする制御を行ってもよい。例えば、制御部２０は、ユーザ２により観察されにくくする映像コンテンツの透過度を高くする映像処理を施す。これ以外にも、制御部２０は、ユーザ２により観察されにくくする映像コンテンツについては、解像度を低くしたり、映像提示領域のサイズを一時的に小さくしたり、色を変化させたりする映像処理を施してもよい。
この変形例の眼鏡型端末１によれば、ユーザ２が複数の映像コンテンツが同時に観察したことを原因して、提示中の映像コンテンツへの注意力が低下することを抑制することができる。 In addition, when a plurality of video presentation areas are observed simultaneously by the user 2, the control unit 20 may display video content in any one of the video presentation areas and perform control that makes the video content in the other video presentation areas less noticeable to the user 2. For example, the control unit 20 applies video processing that increases the transparency of the video content to be made less noticeable. Alternatively, for such video content, the control unit 20 may apply video processing that lowers the resolution, temporarily reduces the size of the video presentation area, or changes the color.
According to the glasses-type terminal 1 of this modification, it is possible to suppress a reduction in attention to the video content being presented due to the user 2 observing a plurality of video content at the same time.

（変形例３）
眼鏡型端末１は、自装置でない他の映像表示装置（ここでは他の眼鏡型端末。以下「他装置」という。）と映像提示領域、及び、映像コンテンツを共有する機能を有していてもよい。他の眼鏡型端末は、眼鏡型端末１と同じ構成を有する。
この変形例の眼鏡型端末１の制御部２０は、図５で説明した機能構成に加え、更に、設定情報取得部２５に相当する機能を実現する。設定情報取得部２５は、他装置で設定された映像提示領域の設定を示す設定情報を、通信部４０を介して取得する。設定情報取得部２５が取得する設定情報は、図４で説明した「設定情報」フィールドに格納される情報と同じでよい。設定情報取得部２５は、取得した設定情報とアプリケーション識別子とを対応付けて、設定テーブル３１の同じレコードに格納する。
表示制御部２４は、設定情報取得部２５が取得した設定情報に基づいて映像提示領域を設定し、この映像提示領域に他装置と同じ映像コンテンツを表示させる。 (Modification 3)
The glasses-type terminal 1 may have a function of sharing a video presentation area and video content with another video display device (here, another glasses-type terminal; hereinafter referred to as “other device”) that is not its own device. Good. Other glasses-type terminals have the same configuration as the glasses-type terminal 1.
In addition to the functional configuration described with reference to FIG. 5, the control unit 20 of the glasses-type terminal 1 according to this modification further realizes a function corresponding to the setting information acquisition unit 25. The setting information acquisition unit 25 acquires, via the communication unit 40, setting information indicating the settings of a video presentation area set by another device. The setting information acquired by the setting information acquisition unit 25 may be the same as the information stored in the “setting information” field described with reference to FIG. 4. The setting information acquisition unit 25 stores the acquired setting information in association with the application identifier in the same record of the setting table 31.
The display control unit 24 sets a video presentation area based on the setting information acquired by the setting information acquisition unit 25, and displays the same video content as that of other devices in the video presentation area.

図１３は、眼鏡型端末１が他装置から取得した設定情報に基づいて映像提示領域を設定するときに行う処理の流れを示すフローチャートある。
制御部２０は、他装置から設定情報とアプリケーション識別子とを取得して、それらを対応付けて設定テーブル３１に格納する（ステップＳ２１）。次に、制御部２０は、ステップＳ２１の処理で取得した設定情報に基づいて映像提示領域を設定し、この映像提示領域に他装置と同じ映像コンテンツを表示させる（ステップＳ２２）。制御部２０は、他装置と同じ映像コンテンツが記憶部３０に記憶されている場合には、記憶部３０から読み出した映像コンテンツを表示部１２に表示させる。また、制御部２０は、他装置等から通信部４０を介して取得した映像コンテンツを表示部１２に表示させてもよい。以降、制御部２０は、上述した実施形態と同じ手順で、ステップＳ１１〜Ｓ１５の処理ステップを実行する。 FIG. 13 is a flowchart showing a flow of processing performed when the glasses-type terminal 1 sets a video presentation area based on setting information acquired from another device.
The control unit 20 acquires setting information and an application identifier from another device, stores them in the setting table 31 in association with each other (step S21). Next, the control unit 20 sets a video presentation area based on the setting information acquired in step S21, and displays the same video content as that of the other device in the video presentation area (step S22). When the same video content as that of the other device is stored in the storage unit 30, the control unit 20 causes the display unit 12 to display the video content read from the storage unit 30. Further, the control unit 20 may cause the display unit 12 to display video content acquired from another device or the like via the communication unit 40. Thereafter, the control unit 20 executes the processing steps of Steps S11 to S15 in the same procedure as the above-described embodiment.

この変形例において、眼鏡型端末１と他装置とで、ユーザにより観察される映像コンテンツが同期していることが望ましい。そこで、動画コンテンツを再生する場合には、制御部２０は、他装置と同期するように動画コンテンツを再生してもよい。同期再生を実現するために、表示制御部２４は、例えば、他装置との間で共有する同期信号を通信部４０を介して取得し、取得した同期信号に基づいて映像コンテンツを再生する。この眼鏡型端末１によれば、複数のユーザで同時に同じ映像を観察しているような使用感を与えることができる。 In this modification, it is desirable that the video content observed by the user is synchronized between the glasses-type terminal 1 and the other device. Therefore, when reproducing moving image content, the control unit 20 may reproduce the moving image content so as to synchronize with other devices. In order to realize synchronized playback, for example, the display control unit 24 acquires a synchronization signal shared with another device via the communication unit 40, and plays back video content based on the acquired synchronization signal. According to the glasses-type terminal 1, it is possible to give a feeling of use as if a plurality of users are observing the same image at the same time.
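The description does not specify the form of the synchronization signal. As one possible sketch, assume that the signal carries a timestamp on a clock shared by both devices together with the playback position at that timestamp; each device can then derive its own playback position locally. The `SyncSignal` type and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncSignal:
    reference_time_s: float   # time on the shared clock when the signal was issued
    media_position_s: float   # playback position at that time


def local_playback_position(signal: SyncSignal, now_s: float, duration_s: float) -> float:
    """Playback position this device should show at time now_s on the shared clock."""
    elapsed = max(0.0, now_s - signal.reference_time_s)
    return min(signal.media_position_s + elapsed, duration_s)
```

The result is clamped to the duration of the content so that a late-arriving signal never places playback past the end.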

(Modification 4)
The control unit 20 may have a function of correcting (for example, finely adjusting) the set video presentation area based on a gesture performed by the user 2.
In addition to the functional configuration described with reference to FIG. 5, the control unit 20 of the glasses-type terminal 1 according to this modification implements a function corresponding to a correction unit 26. The correction unit 26 corrects the video presentation area set by the setting unit 23 based on the gesture recognized by the gesture recognition unit 22. When correcting the video presentation area, the correction unit 26 rewrites the setting information stored in the setting table 31. The gesture recognition unit 22 recognizes a gesture for correcting the video presentation area while the video presentation area set by the setting unit 23 is displayed on the display unit 12.

FIG. 14 is a flowchart showing the flow of processing performed when the glasses-type terminal 1 corrects the video presentation area.
After executing steps S1 to S7 to set the video presentation area, the control unit 20 displays the set video presentation area on the display unit 12 (step S101). The control unit 20 may display video content in the video presentation area, or may display an image indicating the range of the video presentation area. The control unit 20 then corrects the video presentation area based on a gesture of the user 2 recognized while the video presentation area is displayed (step S102). For example, when the user 2 performs the gesture described with reference to FIG. 6 or FIG. 8 over the already-set video presentation area, the control unit 20 recognizes this gesture and corrects the video presentation area. In doing so, the control unit 20 may set the video presentation area again by the same method as in the above-described embodiment. The control unit 20 then rewrites the information in the setting table 31 according to the result of correcting the video presentation area.
The control unit 20 may also correct the video presentation area based on a different gesture by the user 2. For example, when a finger placed over the video presentation area to be corrected is moved in a given direction, the control unit 20 performs a correction that moves the video presentation area in the direction of the finger's movement.
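The finger-drag correction just described can be sketched as follows. This is only an illustrative model: the `Area` type, the two-dimensional display coordinates, and the function names are assumptions that do not appear in the source, which does not specify how the area or the finger position is represented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Area:
    """Hypothetical video presentation area in display coordinates."""
    x: float
    y: float
    width: float
    height: float


def contains(area: Area, px: float, py: float) -> bool:
    """True if the point (px, py) lies on the area."""
    return (area.x <= px <= area.x + area.width
            and area.y <= py <= area.y + area.height)


def correct_by_drag(area: Area, start: tuple, end: tuple) -> Area:
    """Move the area by the finger displacement, but only if the drag started on the area."""
    if not contains(area, start[0], start[1]):
        return area  # finger was not placed over the area: no correction
    dx, dy = end[0] - start[0], end[1] - start[1]
    return replace(area, x=area.x + dx, y=area.y + dy)
```

In this sketch the corrected `Area` would then be written back to the stored settings, corresponding to the rewriting of the setting table 31 described above.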

(Modification 5)
In the embodiment described above, the control unit 20 detects a plane area in front of the user 2 and sets the video presentation area on the detected plane area. However, there may be no plane area nearby, for example when the user 2 is in a large room. In this case, the control unit 20 may set the video presentation area at a predetermined distance in front of the user 2. The method by which the glasses-type terminal 1 sets the video presentation area in this case may be the same as in the above-described embodiment: the video presentation area is set on the assumption that a plane area exists at a position separated by the predetermined distance in real space. For example, the control unit 20 sets the video presentation area at the infinity position of the lens of the glasses unit 10 or at a predetermined fixed distance.
In this modification, the configuration related to the detection of the plane area in the glasses-type terminal 1 (for example, the function of the plane area detection unit 23a) may be omitted.
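The fallback to an assumed plane at a fixed distance can be sketched as below. This is not the relationship of formulas (1) and (2) of the embodiment, which are not reproduced here; it is a plain similar-triangles scaling chosen for illustration, and the constant `DEFAULT_DISTANCE_M` and all function and parameter names are hypothetical.

```python
DEFAULT_DISTANCE_M = 2.0  # hypothetical predetermined distance in front of the user


def presentation_size(gesture_w, gesture_h, gesture_dist, plane_dist=None):
    """Scale a gesture region to a plane farther away along the same line of sight.

    gesture_w, gesture_h: size of the region enclosed by the gesture (meters)
    gesture_dist: distance from the user's eye to the gesture (meters)
    plane_dist: distance to a detected plane, or None if no plane was found
    Returns (width, height, distance) of the presentation area.
    """
    if gesture_dist <= 0:
        raise ValueError("gesture_dist must be positive")
    # No plane nearby: assume one exists at the predetermined distance.
    distance = DEFAULT_DISTANCE_M if plane_dist is None else plane_dist
    scale = distance / gesture_dist  # similar triangles
    return gesture_w * scale, gesture_h * scale, distance
```

With this scaling the area subtends the same visual angle as the gesture region, whichever distance is used.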

(Modification 6)
The glasses-type terminal 1 need not include the distance sensor 13. In this case, the control unit 20 may analyze the imaging data generated by the imaging unit 11 to recognize the three-dimensional space, detect a plane area, or detect the distance to the plane area.
The glasses-type terminal 1 may also be configured not to store tag information. In this case, the glasses-type terminal 1 specifies, based on the detection result of the position/direction sensor 14, the position and direction at the time the video presentation area was set, and stores the position information and the direction information in the setting table 31. The glasses-type terminal 1 then collates the current detection result of the position/direction sensor 14 with the position information and direction information stored in the setting table 31, and specifies the position of the video presentation area based on whether the two are identical.
Instead of the position/direction sensor 14, the glasses-type terminal 1 may include a direction sensor that detects the direction in which the glasses-type terminal 1 faces. This direction sensor is, for example, an acceleration sensor, a geomagnetic sensor, or a gyro sensor.
The glasses-type terminal 1 also need not include the position/direction sensor 14. In this case, the glasses-type terminal 1 specifies the position of the video presentation area using the tag information, without using the direction information.
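The collation of stored and current position/direction information, for the configuration without tag information, might look like the following sketch. The source only says the two are compared for identity; the planar position in meters, the heading in degrees, and the tolerance values are all assumptions introduced here for illustration.

```python
import math


def heading_diff_deg(a: float, b: float) -> float:
    """Smallest absolute difference between two headings, in degrees (handles wrap-around)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def matches(stored_pos, stored_heading, cur_pos, cur_heading,
            max_dist_m=1.0, max_angle_deg=15.0) -> bool:
    """True if the current pose is close enough to the pose stored at setting time.

    The tolerances are hypothetical; the source does not give thresholds.
    """
    dist = math.dist(stored_pos, cur_pos)
    angle = heading_diff_deg(stored_heading, cur_heading)
    return dist <= max_dist_m and angle <= max_angle_deg
```

A tolerance is used rather than exact equality because sensor readings taken at different times will rarely be bit-for-bit identical.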

The method by which the control unit 20 of the glasses-type terminal 1 assigns a function (application program) to the video presentation area, as described in the above embodiment, is merely an example. For example, after setting the video presentation area, the control unit 20 may assign a function according to a predetermined operation performed by the user 2. The control unit 20 may also change the assigned function after setting the video presentation area and storing the information in the setting table 31.

The control unit 20 may also determine the video presentation area using a relationship other than that described by formulas (1) and (2). For example, although in the embodiment the control unit 20 detects the rectangular gesture region by recognizing the two points (P1, P2) at which the fingers of both hands of the user 2 intersect, it may instead detect the gesture region by recognizing the four points that form the vertices of the rectangular region. The control unit 20 may also set a video presentation area of the same size as the gesture region. In this case, the user 2 specifies the desired size of the video presentation area with a gesture using the right and left hands, and the control unit 20 recognizes this gesture and detects the gesture region.

(Modification 7)
The control unit 20 of the glasses-type terminal 1 according to the above-described embodiment sets the video presentation area by recognizing a gesture, performed by the user 2, that encloses a spatial region in a plane. The control unit 20 may instead recognize a different gesture to set the video presentation area. For example, when the user 2 performs a gesture that designates the presentation position of the video content by a point in the spatial region, the control unit 20 calculates a size of the video presentation area that the user 2 can visually recognize. The control unit 20 then sets a video presentation area of the calculated size on the extension line of the presentation position designated by the user 2, as viewed from the position of the user 2. In doing so, the control unit 20 may calculate the size of the video presentation area using the distance to a plane area such as the side surface W.
In other words, the control unit 20 need only set the video presentation area, at a size at which the user 2 can observe the video content, on the extension line of the spatial region designated by the gesture of the user 2, as viewed from the position of the user 2. Various modifications are possible regarding the specific gesture and the relationship between the gesture and the video presentation area.
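One way to compute a viewable size from the distance to the plane area is to fix the visual angle the area should subtend. The source does not state how the size is calculated, so the following is only a sketch under that assumption; the default angle, the aspect ratio, and the function name are hypothetical.

```python
import math


def size_for_distance(distance_m: float, h_angle_deg: float = 30.0,
                      aspect: float = 16 / 9):
    """Return (width, height) in meters of an area subtending h_angle_deg horizontally.

    distance_m: distance from the user to the plane on the extension line
    h_angle_deg: assumed horizontal visual angle for comfortable viewing
    aspect: assumed width/height ratio of the video content
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    # An object of width w at distance d subtends angle 2*atan(w / (2*d)).
    width = 2.0 * distance_m * math.tan(math.radians(h_angle_deg) / 2.0)
    return width, width / aspect
```

Under this model the area grows in proportion to the distance, so the content appears the same size to the user whether the plane is near or far.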

(Modification 8)
The specific shape of the glasses-type terminal 1 (glasses unit 10) is not limited to the shape shown in FIG. 1.
The video display device of the present invention is not limited to a glasses-type video display device, and may be any video display device having a function of displaying video superimposed on an image observed by a user. For example, the video display device of the present invention may be a video display device that the user holds while using it, or a video display device that is used attached to an item worn by the user, such as a helmet.

(Modification 9)
The functions of the control unit 20 of the glasses-type terminal 1 described above may be realized by hardware resources, software resources, or a combination of these. When the functions of the control unit 20 are realized using a program, the program may be provided stored on a computer-readable recording medium such as a magnetic recording medium (magnetic tape, magnetic disk (HDD (Hard Disk Drive), FD (Flexible Disk)), etc.), an optical recording medium (optical disc, etc.), a magneto-optical recording medium, or a semiconductor memory, or may be distributed via a network. The present invention can also be implemented as a video presentation method performed by a computer.

DESCRIPTION OF SYMBOLS
1 ... glasses-type terminal, 2 ... user, 10 ... glasses unit, 11 ... imaging unit, 12 ... display unit, 13 ... distance sensor, 14 ... position/direction sensor, 20 ... control unit, 21 ... imaging control unit, 22 ... gesture recognition unit, 23 ... setting unit, 23a ... plane area detection unit, 24 ... display control unit, 25 ... setting information acquisition unit, 26 ... correction unit, 30 ... storage unit, 31 ... setting table, 40 ... communication unit.

Claims

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit that displays video content superimposed on the observed image;
A gesture recognition unit for recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
Based on the gesture recognized by the gesture recognition unit, a setting unit that sets a video presentation region on an extension line of the spatial region specified by the gesture as viewed from the user position;
A display control unit for displaying the video content on the display unit so that the video content is observed by the user in the video presentation area set by the setting unit;
A video display device comprising the above units, wherein the display control unit, when a plurality of the video presentation areas are observed by the user, performs control so that the video content of one video presentation area is not observed by the user or is less easily observed than the video content of the other video presentation areas.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit that displays video content superimposed on the observed image;
A gesture recognition unit for recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
Based on the gesture recognized by the gesture recognition unit, a setting unit that sets a video presentation region on an extension line of the spatial region specified by the gesture as viewed from the user position;
A display control unit for displaying the video content on the display unit so that the video content is observed by the user in the video presentation area set by the setting unit;
A setting information acquisition unit that acquires setting information indicating the setting of the video presentation area from another video display device that is not the device itself;
A video display device comprising the above units, wherein the setting unit sets the video presentation area based on the setting information acquired by the setting information acquisition unit, and the display control unit displays the same video content as that of the other video display device.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit that displays video content superimposed on the observed image;
A gesture recognition unit for recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
Based on the gesture recognized by the gesture recognition unit, a setting unit that sets a video presentation region on an extension line of the spatial region specified by the gesture as viewed from the user position;
A display control unit for displaying the video content on the display unit so that the video content is observed by the user in the video presentation area set by the setting unit;
A correction unit that corrects the video presentation area set by the setting unit, based on a gesture recognized by the gesture recognition unit;
A video display device comprising the above units, wherein the gesture recognition unit recognizes a gesture for correcting the video presentation area when the video presentation area set by the setting unit is displayed on the display unit.

The video display device according to any one of claims 1 to 3, wherein the gesture recognition unit recognizes a gesture, made by the user, that encloses the spatial region in a plane.

The video display device according to any one of claims 1 to 4, further comprising a plane area detection unit that detects a plane area existing on the extension line, wherein the setting unit sets the video presentation area on the plane area detected by the plane area detection unit.

The video display device according to any one of claims 1 to 5, wherein the display control unit interrupts playback of the video content in the video presentation area when the video presentation area is no longer observed by the user, and resumes the playback when the video presentation area is observed again.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit for displaying video content superimposed on the observed image;
A video presentation method for a video display device comprising:
A first step of recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
A second step of setting a video presentation area on an extension line of a spatial area specified by the gesture as viewed from the user's position based on the recognized gesture;
A third step of displaying the video content on the display unit so that the video content is observed by the user in the set video presentation area;
A video presentation method having the above steps, wherein in the third step, when a plurality of the video presentation areas are observed by the user, control is performed so that the video content of one video presentation area is not observed by the user or is less easily observed than the video content of the other video presentation areas.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit for displaying video content superimposed on the observed image;
A video presentation method for a video display device comprising:
A first step of recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
A second step of setting a video presentation area on an extension line of a spatial area specified by the gesture as viewed from the user's position based on the recognized gesture;
A third step of displaying the video content on the display unit so that the video content is observed by the user in the set video presentation area;
A fourth step of acquiring setting information indicating the setting of the video presentation area from another video display device that is not its own device;
Have
The second step is a step of setting the video presentation area based on the acquired setting information,
The third step is a step of displaying the same video content as that of the other video display device.
A video presentation method characterized by the above.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit for displaying video content superimposed on the observed image;
A video presentation method for a video display device comprising:
A first step of recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
A second step of setting a video presentation area on an extension line of a spatial area specified by the gesture as viewed from the user's position based on the recognized gesture;
A third step of displaying the video content on the display unit so that the video content is observed by the user in the set video presentation area;
A fourth step of correcting the set video presentation area based on the recognized gesture;
With
The first step is a step of recognizing a gesture for correcting the video presentation area when the set video presentation area is displayed on the display unit.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit for displaying video content superimposed on the observed image;
In a computer of a video display device comprising
A first step of recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
A second step of setting a video presentation area on an extension line of a spatial area specified by the gesture as viewed from the user's position based on the recognized gesture;
A third step of displaying the video content on the display unit so that the video content is observed by the user in the set video presentation area;
A program for executing
A program, wherein in the third step, when a plurality of the video presentation areas are observed by the user, control is performed so that the video content of one video presentation area is not observed by the user or is less easily observed than the video content of the other video presentation areas.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit for displaying video content superimposed on the observed image;
In a computer of a video display device comprising
A first step of recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
A second step of setting a video presentation area on an extension line of a spatial area specified by the gesture as viewed from the user's position based on the recognized gesture;
A third step of displaying the video content on the display unit so that the video content is observed by the user in the set video presentation area;
A fourth step of acquiring setting information indicating the setting of the video presentation area from another video display device that is not its own device;
A program for executing
The second step is a step of setting the video presentation area based on the acquired setting information,
The third step is a step of displaying the same video content as that of the other video display device.
A program characterized by that.

An imaging unit that captures an image observed by the user and generates imaging data;
A display unit for displaying video content superimposed on the observed image;
In a computer of a video display device comprising
A first step of recognizing a gesture made in a spatial region by the user based on the imaging data generated by the imaging unit;
A second step of setting a video presentation area on an extension line of a spatial area specified by the gesture as viewed from the user's position based on the recognized gesture;
A third step of displaying the video content on the display unit so that the video content is observed by the user in the set video presentation area;
A fourth step of correcting the set video presentation area based on the recognized gesture;
A program for executing
The first step is a step of recognizing a gesture for correcting the video presentation area when the set video presentation area is displayed on the display unit.