JP6168597B2

JP6168597B2 - Information terminal equipment

Info

Publication number: JP6168597B2
Application number: JP2013132732A
Authority: JP
Inventors: 加藤　晴久; 晴久加藤
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2013-06-25
Filing date: 2013-06-25
Publication date: 2017-07-26
Anticipated expiration: 2033-06-25
Also published as: JP2015008394A

Description

The present invention relates to an information terminal device that presents information three-dimensionally, and more particularly to an information terminal device that can control the information shown on its display unit according to changes in the relative positional relationship between the device and a user.

A device that presents information three-dimensionally can give the user a sense of presence. The following methods for realizing such a device have been published.

Patent Document 1 proposes an autostereoscopic display that prepares a right-eye image and a left-eye image offset from each other by an amount inversely proportional to the distance to the subject, and presents each image to the corresponding eye by a parallax barrier method, a lenticular lens method, or the like, thereby enabling stereoscopic expression.

Patent Document 2 proposes an autostereoscopic display that generates an image for each viewpoint by estimating the viewpoint position, thereby increasing the number of viewpoints from which stereoscopic viewing is possible. In particular, even when a large depth or pop-out relative to the surface of the display device is expressed, degradation of the perceived resolution can be suppressed, so that a stereoscopic image with high perceived resolution can be displayed.

Patent Document 1: JP2012-249060A
Patent Document 2: JP2013-13055A

In Patent Document 1, the viewpoints from which the user views the display are set discretely, so the positions from which a stereoscopic effect can be perceived are limited.

Patent Document 2 partially solves the problem of Patent Document 1. However, because it varies the parallax between adjacent viewpoint images by devising how the multiple viewpoint images are captured, dedicated autostereoscopic video data must be generated.

Moreover, Patent Documents 1 and 2 both require a special optical member such as a lenticular lens, and therefore cannot be realized by software alone.

In view of the above problems of the prior art, an object of the present invention is to provide an information terminal device that can present displayed information three-dimensionally according to the user's position relative to the information terminal device, by a simple method that requires neither dedicated autostereoscopic video data nor special members.

To achieve the above object, the present invention is an information terminal device having an imaging unit and a display unit, comprising: a storage unit that stores display information as a predetermined solid defined with reference to the display plane of the display unit; an estimation unit that detects the user's viewpoint coordinates from a captured image captured by the imaging unit and estimates the positional relationship between the viewpoint coordinates and the predetermined solid defined with reference to the display plane in the display information; and a control unit that processes the display information read from the storage unit according to the estimated positional relationship and then causes the display unit to display it, so that when the user views the processed display information from the detected viewpoint coordinates, it appears as the predetermined solid defined with reference to the display plane.

According to the present invention, the positional relationship between the predetermined solid defined with reference to the display plane of the display unit and the user's viewpoint coordinates detected from the captured image is obtained, and the display information is processed according to that positional relationship when it is actually displayed on the display unit. Stereoscopic display is therefore possible by a simple method that requires neither dedicated autostereoscopic video data nor special members.

FIG. 1 is a functional block diagram of an information terminal device according to an embodiment.
FIG. 2 is a diagram showing an example comparing how the display unit of the present invention appears to the intended user and to a non-user.
FIG. 3 is a functional block diagram of the estimation unit.
FIG. 4 is a diagram showing that the viewpoint coordinate detection unit and the viewpoint coordinate tracking unit are applied alternately along the time series.
FIG. 5 is a diagram for explaining the calculation of the conversion coefficient by the conversion coefficient calculation unit, and how the solid appears as originally intended under the control of the control unit.
FIG. 6 is a diagram for explaining the shadow setting process performed by the control unit.
FIG. 7 is a diagram for explaining the calculation of the sun azimuth.

FIG. 1 is a functional block diagram of an information terminal device according to an embodiment. The information terminal device 1 includes an imaging unit 2, an estimation unit 3, a control unit 4, a storage unit 5, and a display unit 6.

In one example, the information terminal device 1 can be configured as a mobile terminal such as a mobile phone or a smartphone. However, the information terminal device 1 of the present invention is not limited to a mobile terminal. It may be any information terminal device provided with an imaging unit 2 having an imaging function, and may be configured, for example, as a computer.

A projector is assumed as the display unit 6, but any display unit whose display surface is planar suffices, so a display may be used instead of a projector. When a projector is used, an external output or the like may be used as appropriate, depending on the configuration of the information terminal device 1.

Furthermore, a 3D-CG (three-dimensional computer graphics) model composed of multiple polygons is assumed as the display information, but the display information is not limited to a 3D-CG model. It may be information expressed on a plane, such as an image (a photograph, an icon, or the like), a Web page, or a document. (When such information is displayed on the display unit 6, however, it is displayed as information on a plane placed in space, so it can in effect be regarded as a solid in the same way as a 3D-CG model.)

The outline of each unit in FIG. 1 is as follows.

The imaging unit 2 images the user in real time and obtains a time series of captured images. The estimation unit 3, as a first function, detects the user's viewpoint coordinates from the captured images in real time and, as a second function, obtains the positional relationship between the viewpoint coordinates and the predetermined solid stored in the storage unit 5. The storage unit 5 stores the information to be displayed on the display unit 6 as a predetermined solid defined with reference to the display plane of the display unit 6. The control unit 4 processes the information read from the storage unit 5 in real time according to the viewpoint coordinates and positional relationship estimated by the estimation unit 3, and causes the display unit 6 to display it.

Under this control by the control unit 4, the user of the information terminal device 1, looking from his or her own viewpoint at the display unit 6 (which displays on a plane), sees what appears to be a solid actually present there. When the user looks at the display unit 6 from a different position, the real-time processing by the imaging unit 2, the estimation unit 3, and the control unit 4 changes the appearance of the solid in the same way as if it actually existed in three-dimensional space.

For example, when the display unit 6 is configured as a wall-mounted display, it can show a wall clock so that, viewed by the user, a real wall clock appears to be present. When the user looks at the wall from the front, the clock is seen from the front. When the user moves and looks at the wall from an angle, the clock appears tilted as it would when viewed from that angle, so that a solid clock seems to actually exist. Likewise, when a photograph or other information expressed on a plane is shown, that plane appears to exist in three-dimensional space (for example, in the form of a signboard).

The user thus perceives a solid. The information specifying which solid is to be perceived, and with what position, size, orientation, color, and so on, is the display information stored in the storage unit 5. In other words, the display information is in effect a three-dimensional model, which is set in advance by an administrator or the like and stored in the storage unit 5.

Note also that, under the control according to the present invention, when the display unit 6 is viewed by a "non-user" whose viewpoint coordinates have not been estimated, what is seen is generally perceived as a distorted "pattern" that differs from the intended appearance of the three-dimensional model displayed for the intended user. FIG. 2 shows an example.

FIG. 2 shows, relative to the display unit 6, (1) the viewpoint P1 of the intended user, whose viewpoint has been estimated, and (2) the viewpoint P2 of a "non-user", whose viewpoint has not been estimated. From the intended user's viewpoint P1, the display unit 6 looks like R1: a sheet of paper appears to lie on a desk, with several domino-shaped solids bearing a predetermined pattern lined up on it. From the non-user's viewpoint P2, however, the display unit 6, showing the same content at the same time, looks distorted, as in R2.

The following additional processing may also be performed. The estimation unit 3 may estimate the user's gesture from the captured image in addition to the user's viewpoint coordinates, and the control unit 4 may make the solid displayed on the display unit 6 perform a predetermined motion corresponding to the gesture. In the wall clock example above, the clock may be made to swing in response to the movement of the user's hand.

Each unit in FIG. 1 is described in detail below.

The imaging unit 2 continuously captures the user's face at a predetermined sampling period and outputs the captured images to the estimation unit 3. A digital camera provided as standard equipment on a mobile terminal can be used as the imaging unit 2. Alternatively, a stereo camera, a depth sensor, an infrared camera, or the like may be used, and these cameras and sensors may also be used in combination.

As its first function, the estimation unit 3 detects the coordinates of the user's eyes in the image input from the imaging unit 2 as the viewpoint coordinates. As its second function, the estimation unit 3 obtains the positional relationship between the viewpoint coordinates and the display information stored in the storage unit 5 as a predetermined solid (three-dimensional model) defined with reference to the display coordinates of the display unit 6.

The positional relationship is obtained in the form of a conversion coefficient. This is the coefficient used when a preset conversion formula (projective transformation) projects the display information onto coordinates on the display plane of the display unit 6, so that the display information is presented three-dimensionally when the user looks at the display unit 6, that is, so that the user perceives the three-dimensional model as the solid intended by the administrator or other person who set up the model.

The conversion formula (projective transformation) used by the estimation unit 3 to estimate the conversion coefficient, together with the estimated conversion coefficient, is output to the control unit 4. The estimation unit 3 is described in detail later.

In the present invention, gesture input with a finger, a pointer stick, or the like can be realized as additional processing. In this case, the imaging unit 2 captures the finger, pointer stick, or the like in addition to the face, and outputs the captured image to the estimation unit 3. In addition to the conversion coefficient and other values obtained from the viewpoint coordinates, the estimation unit 3 recognizes the finger or the like, and thereby the motion of the gesture, to obtain operation information, which it outputs to the control unit 4.

The control unit 4 applies the projective transformation defined by the conversion coefficient input from the estimation unit 3 to the display information read from the storage unit 5, and thereby processes it into the display information to be shown on the display unit 6. As a result, what is shown on the display unit 6 changes with the viewing angle and position in the same way as a real solid would, giving the user the perception that a real solid is present.
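The geometric idea behind such a viewpoint-dependent projection can be sketched as follows. This is only an illustration under assumed conventions, not the conversion formula of the embodiment: the display plane is taken to be z = 0, the eye is at a position with z > 0, and the hypothetical function `project_to_display` maps a model point to the place where the line of sight through it crosses the display plane.

```python
import numpy as np

def project_to_display(point, eye):
    """Return the (x, y) position on the display plane z = 0 at which a model
    point must be drawn so that, seen from `eye`, it lines up with `point`.

    Assumed coordinate frame (not taken from the source): the display lies in
    the plane z = 0 and the eye has z > 0. Points with z < 0 are behind the
    display, points with 0 < z < eye z pop out in front of it.
    """
    p = np.asarray(point, dtype=float)
    e = np.asarray(eye, dtype=float)
    denom = e[2] - p[2]
    if denom <= 0:
        raise ValueError("the point must lie on the display side of the eye")
    t = e[2] / denom          # parameter at which the line eye -> point hits z = 0
    s = e + t * (p - e)
    return s[:2]
```

For instance, with the eye at (0, 0, 2), a point at (1, 0, -2) behind the display would be drawn at (0.5, 0), while a point lying on the display plane itself is drawn at its own position regardless of the eye.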

To further enhance the realism of the perceived solid, the control unit 4 may additionally perform either or both of the following first and second processes when processing the display information. The first process can be applied after the projective transformation, and the second process before it.

As the first process, the stereoscopic effect may be enhanced by simulating defocus: an image effect is applied to the display information after the projective transformation that blurs it more strongly in proportion to the distance given by the depth information. That is, within the display information configured as a solid, the parts that lie farther from the user's viewpoint coordinates (as modeled in advance) may be blurred more strongly.

Because the display information is configured as a solid, its spatial relationship to the viewpoint coordinates estimated by the estimation unit 3 yields depth information, such as the relative front-to-back ordering within the display information and the depth of three-dimensional objects, and the blurring can be performed using this information.
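A minimal sketch of how such a depth-dependent blur strength might be derived is given below. The function name `blur_sigmas`, the `gain` parameter, and the use of the Euclidean distance from the eye as the depth measure are assumptions made for illustration. The source states only that the blur grows in proportion to the depth distance.

```python
import numpy as np

def blur_sigmas(points, eye, gain=0.5):
    """Return one blur strength per model point, proportional to its distance
    from the viewpoint.

    `gain` is a hypothetical tuning constant. The returned values could be
    used, for example, as the standard deviation of a Gaussian blur applied
    to the corresponding part of the projected image.
    """
    pts = np.asarray(points, dtype=float)
    e = np.asarray(eye, dtype=float)
    distances = np.linalg.norm(pts - e, axis=1)
    return gain * distances
```

With the eye at (0, 0, 2), points at (0, 0, 0) and (0, 0, -2) are at distances 2 and 4, so with `gain=0.5` they receive strengths 1.0 and 2.0. The farther point is blurred more strongly.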

As the second process, the control unit 4 may add shadows to the display information so that the user can easily perceive the three-dimensional position at which the display information is shown. In this case, the newly generated shadows are included in the display information, and the projective transformation is applied to the display information including the shadows, so that the shadows are also subject to the control by the control unit 4. In the example of R2 in FIG. 2 described above, the shadows of the domino-shaped solids are also part of the display information.

The first and second processes are described again in the detailed description of the estimation unit 3 and the control unit 4.

The storage unit 5 stores in advance multiple items of display information to be shown on the display unit 6, each in the form of a predetermined solid defined with reference to the display plane of the display unit 6. Through an input operation on the control unit 4 (an operation on an input interface, such as a touch panel, that is configured as part of the control unit 4 and is not shown), the user can select the desired display information from among the items stored in the storage unit 5 and have it shown on the display unit 6. The user can thus choose, for example, which of several 3D-CG models to display.

When information is displayed on the display unit 6, the control unit 4, as described above, processes the display information by applying the projective transformation defined by the conversion coefficient input from the estimation unit 3, optionally together with the processing that further enhances realism. The information shown on the display unit 6 is thereby controlled according to the estimation result of the estimation unit 3, giving the user the perception that a real solid is present.

The display information stored as a solid in the storage unit 5 may be set so that it changes with time while it is actually shown on the display unit 6. In the wall clock example above, the pendulum may swing at a predetermined period and the hands may indicate the time.

The estimation unit 3 is now described in detail. FIG. 3 is a functional block diagram of the estimation unit 3. The estimation unit 3 includes, as a first configuration that detects the viewpoint coordinates from the captured image and obtains the conversion coefficient, a viewpoint coordinate detection unit 31, a viewpoint coordinate tracking unit 32, a switch 33, and a conversion coefficient calculation unit 34. As a second configuration for the additional processing, it includes an operation information calculation unit 35 that obtains operation information for a gesture made with a finger or the like from the captured image. The first configuration is described first.

The viewpoint coordinate detection unit 31 detects the coordinates of the user's eyes from the captured image as the viewpoint coordinates (spatial coordinates). To do so, the eyes must first be detected in image coordinates. For eye detection, well-known detection techniques based on various feature quantities can be used, such as the technique disclosed in Non-Patent Document 2 below.

[Non-Patent Document 2] P. Viola and M. Jones, "Robust real time object detection," in IEEE ICCV Workshop on Statistical and Computational Theories of Vision, July 2001.

Eye detection based on these feature quantities yields the coordinates of the eyes within the captured image. The actual eye lies somewhere on a straight line extending from the imaging unit 2 in a direction d(x, y), which is determined by applying the relationship of the predetermined optical system of the imaging unit 2 to the image coordinates (x, y). The viewpoint coordinate detection unit 31 therefore also obtains the distance from the imaging unit 2 in the depth direction along the direction determined by the image coordinates, and thereby obtains the viewpoint coordinates as coordinates in real space referenced to the imaging unit 2.

The distance in the depth direction may be estimated as a distance inversely proportional to the length between the two eyes found by eye detection. The constant of inverse proportionality can be set in advance using configuration information of the predetermined optical system of the imaging unit 2, a measured value of the distance between the eyes, or the like.

If a depth sensor corresponding to each pixel position of the imaging unit 2 is available, the value of the depth sensor may be adopted as the distance in the depth direction when measuring the viewpoint coordinates. If a stereo camera is available as the imaging unit 2, the viewpoint coordinates may be calculated by triangulation.

The viewpoint coordinates may be calculated as the position of a predetermined one of the two eyes, or as a predetermined position defined from both eyes, such as their midpoint.
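The monocular estimate described above can be sketched as follows, assuming a simple pinhole camera model. The focal length in pixels `focal_px`, the principal point, and the default eye separation of 0.063 m are assumed values standing in for the "configuration information of the optical system" and the "measured value of the distance between the eyes". The function name `viewpoint_from_eyes` is hypothetical, and the midpoint of the eyes is used as the viewpoint.

```python
import numpy as np

def viewpoint_from_eyes(left_px, right_px, focal_px, principal_point,
                        eye_separation_m=0.063):
    """Estimate camera-frame viewpoint coordinates (x, y, z) in metres from
    the pixel positions of the two detected eyes.

    Depth is inversely proportional to the pixel distance between the eyes:
    z = focal_px * eye_separation_m / pixel_gap (pinhole model, face assumed
    roughly parallel to the image plane).
    """
    left = np.asarray(left_px, dtype=float)
    right = np.asarray(right_px, dtype=float)
    pixel_gap = np.linalg.norm(right - left)
    if pixel_gap == 0:
        raise ValueError("the two eye positions coincide")
    depth = focal_px * eye_separation_m / pixel_gap
    mid = (left + right) / 2.0            # viewpoint taken as the midpoint
    cx, cy = principal_point
    # Back-project the midpoint pixel along its viewing direction d(x, y).
    x = (mid[0] - cx) * depth / focal_px
    y = (mid[1] - cy) * depth / focal_px
    return np.array([x, y, depth])
```

Halving the pixel gap between the eyes doubles the estimated depth, which is the inverse proportionality stated in the text.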

In another embodiment, the viewpoint coordinate detection unit 31 may detect a predetermined position on the face as the viewpoint coordinates. In this case, a face region is detected by applying a well-known face detection technique based on feature extraction or the like. A predetermined point, such as the center of the face region, is then taken as the counterpart of the in-image coordinates in the eye detection described above. A distance inversely proportional to the size of the face region (the length of a predetermined part of it) is likewise taken as the counterpart of the depth-direction distance in the eye detection, and the viewpoint coordinates are calculated from these.

As described above, the viewpoint coordinate detection unit 31 detects the viewpoint coordinates with high accuracy by extracting feature quantities from the captured image. The operation along the time axis is described next.

In one embodiment, the viewpoint coordinate tracking unit 32 and the switch 33 in FIG. 3 are omitted. The viewpoint coordinate detection unit 31 detects the viewpoint coordinates in every captured image of the input time series and outputs each detection result to the conversion coefficient calculation unit 34.

In another embodiment, to reduce the load caused by applying the viewpoint coordinate detection unit 31 continuously, the viewpoint coordinate detection unit 31 is applied only intermittently, to part of the time series. While it is not being applied, the viewpoint coordinate tracking unit 32 is applied instead and reduces the load by tracking based on previously estimated viewpoint coordinates.

The switch 33 is drawn in the figure to make explicit that, in this other embodiment, the processing switches between the viewpoint coordinate detection unit 31 and the viewpoint coordinate tracking unit 32 when obtaining the viewpoint coordinates to be output to the conversion coefficient calculation unit 34.

FIG. 4 schematically illustrates this switching. Here, each captured image in the time series is called a "frame", and the frame (captured image) with frame number t is written as frame F(t). As shown in (1), at a certain time t = n, the viewpoint coordinate detection unit 31 is applied to frame F(n), and the viewpoint coordinates are detected as D(n). Likewise, as shown in (3), at a later time t = n+N, the viewpoint coordinate detection unit 31 is applied to frame F(n+N), and the viewpoint coordinates are detected as D(n+N).

In FIG. 4, the viewpoint coordinates D(n) and so on lie at a predetermined position inside the rectangle drawn with a dotted line. This rectangle is an example of the small region used as a template in the tracking process described later. To simplify the illustration, FIG. 4 treats the small region and the viewpoint coordinates as one, on the premise that the viewpoint coordinates lie at a predetermined position inside the small region, and labels the dotted rectangular small region itself with the reference "viewpoint coordinates D(n)" and so on.

At each time t = n+k (k = 1, 2, ..., N-1) between (1) and (3), on the other hand, the viewpoint coordinate tracking unit 32 is applied to frame F(n+k), as shown in (2), and the viewpoint coordinates D(n+k) are detected (inside the dotted rectangle, following the simplified notation described above). The tracking may be restricted to the neighborhood of a past detection location D_n. For this past detection location D_n, either the most recent detection D(n) by the viewpoint coordinate detection unit 31, shown in (1), or the result D(n+k-1) of the viewpoint coordinate tracking unit 32 at the immediately preceding time t = n+k-1 (not shown) may be used.

The tracking performed by the viewpoint coordinate tracking unit 32 is as follows. The viewpoint coordinates in the current captured image are detected by tracking, using template matching or the like, a local region that contains the viewpoint coordinates detected in a past captured image. Specifically, a small region containing the viewpoint coordinates detected in a past captured image is used as a template, and the current captured image is searched for the region that minimizes the sum of squared differences. The corresponding point of the region that attains the minimum is taken as the viewpoint coordinates in the current captured image and output to the conversion coefficient calculation unit 34. (For example, if the viewpoint coordinates lay at a predetermined point such as the centroid of the past template, the corresponding point is the same predetermined point, such as the centroid, of the small region found by the search.)

図４の例であれば、(2)の各時刻t=n+kでは、テンプレートとしては、(1)に示す視線推定部31が適用された直近の過去t=nにおいて検出された視点座標D(n)を含む小領域(矩形等の所定形状及び所定サイズを有する小領域)を利用することができる。探索範囲も前述のように、過去の検出箇所D_n付近の所定範囲に限定してよい。 In the example of FIG. 4, at each time t = n + k in (2), as a template, viewpoint coordinates detected in the latest past t = n to which the line-of-sight estimation unit 31 shown in (1) is applied. A small area including D (n) (a small area having a predetermined shape such as a rectangle and a predetermined size) can be used. As described above, the search range may also be limited to a predetermined range near the past detection location D_n.

また、時系列上の各撮像画像に対して、視点座標追跡部32と視点座標検出部31とのいずれを適用するかの判断については、次のように下すことができる。なお、時系列の最初に読み込まれる撮像画像については、過去の検出結果が存在しないので、視点座標検出部31を適用する。 In addition, the determination as to which of the viewpoint coordinate tracking unit 32 and the viewpoint coordinate detection unit 31 is applied to each captured image in time series can be made as follows. Since there is no past detection result for the first captured image read in time series, the viewpoint coordinate detection unit 31 is applied.

一実施形態では、所定の周期Nを決めておき、当該N回に1回のみ視点座標検出部31を適用するようにしてよい。一実施形態では、視点座標追跡部32の結果が不適切であると判定された時刻tの結果は放棄し、当該時刻tについては視点座標検出部31を適用し、次の時刻t+1以降の撮像画像には、同様の不適切か否かの判定(及び不適切な場合の放棄)を行いながら、視点座標追跡部32の適用を試みるようにしてもよい。この際、視点座標追跡部32が適用される連続回数に上限を設け、上限に達した場合は強制的に視点座標検出部31の適用を行うようにしてもよい。 In one embodiment, a predetermined period N may be fixed, and the viewpoint coordinate detection unit 31 may be applied only once in every N frames. In another embodiment, when the result of the viewpoint coordinate tracking unit 32 at a time t is determined to be inappropriate, that result is discarded and the viewpoint coordinate detection unit 31 is applied for the time t instead; for the captured images from the next time t+1 onward, application of the viewpoint coordinate tracking unit 32 is attempted again, with the same appropriateness check (and discarding of inappropriate results). In this case, an upper limit may be set on the number of consecutive applications of the viewpoint coordinate tracking unit 32, and the viewpoint coordinate detection unit 31 may be forcibly applied once the limit is reached.

ここで、視点座標追跡部32の結果が不適切である判定は、次のようにして下すことができる。すなわち、テンプレートマッチングにおいて求まった最小の差分二乗和が所定の第一閾値を超えること、及び／又は、当該時刻tにつき追跡にて求まった視点座標と、直前の時刻t-1において求まった視点座標と、の距離が所定の第二閾値を超えること、が満たされた場合に、不適切であると判定することができる。 Here, whether the result of the viewpoint coordinate tracking unit 32 is inappropriate can be determined as follows. The result can be determined to be inappropriate when the minimum sum of squared differences obtained in the template matching exceeds a predetermined first threshold, and/or when the distance between the viewpoint coordinates obtained by tracking at the time t and the viewpoint coordinates obtained at the immediately preceding time t-1 exceeds a predetermined second threshold.
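The choice between detection and tracking, together with the appropriateness check, might be sketched as below. The function names, the string return values, and the use of the "or" variant of the two threshold conditions are assumptions made for illustration only.

```python
import math

def tracking_is_valid(min_ssd, xy, prev_xy, ssd_threshold, jump_threshold):
    """Return False when either condition marks the tracked result as inappropriate.

    Assumption: the "or" variant is used, i.e. exceeding either the first
    threshold (SSD) or the second threshold (distance) rejects the result.
    """
    jump = math.dist(xy, prev_xy)  # distance to the viewpoint at time t-1
    return min_ssd <= ssd_threshold and jump <= jump_threshold

def choose_unit(frame_index, tracked_in_a_row, max_tracked):
    """Pick which unit to apply to the next captured image.

    The first frame has no past result, so detection is required; detection is
    also forced once tracking has run `max_tracked` times consecutively.
    """
    if frame_index == 0 or tracked_in_a_row >= max_tracked:
        return "detect"
    return "track"
```

When `tracking_is_valid` returns False for a frame, the tracked result would be discarded and detection applied to that same frame.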

以下、視点座標追跡部32の追跡処理における補足事項を述べる。 Hereinafter, supplementary matters in the tracking processing of the viewpoint coordinate tracking unit 32 will be described.

視点座標追跡部32によりテンプレートマッチングで追跡する際の「視点座標」は、「奥行き方向の距離」に関する情報を含まず、視点座標検出部31の説明における、特徴量抽出による撮像画像上での目検出（又は顔検出）の際の座標を意味する。テンプレートにおける小領域は両目を含む1つの小領域として定義してもよいし、目ごとにそれぞれ1つの小領域を定義してもよい。 The "viewpoint coordinates" tracked by template matching in the viewpoint coordinate tracking unit 32 do not include information on the "distance in the depth direction"; they mean the coordinates on the captured image obtained by eye detection (or face detection) through feature extraction, as described for the viewpoint coordinate detection unit 31. The small region used as the template may be defined as a single small region containing both eyes, or one small region may be defined for each eye.

一方、視点座標追跡部32を適用する際の「奥行き方向の距離」については、当該小領域を用いた目検出（又は顔検出）にて検出された座標を用いて、視点座標検出部31の場合と同様の手法で算出すればよい。視点座標追跡部32は、当該算出した奥行き方向の距離の情報を加えて得た実空間の「視点座標」を、最終的な出力として変換係数算出部34へと出力する。 On the other hand, the "distance in the depth direction" when the viewpoint coordinate tracking unit 32 is applied may be calculated by the same method as for the viewpoint coordinate detection unit 31, using the coordinates detected by the eye detection (or face detection) based on the small region. The viewpoint coordinate tracking unit 32 adds the calculated depth-direction distance to obtain the "viewpoint coordinates" in real space and outputs them to the conversion coefficient calculation unit 34 as its final output.

なおまた、前述の視点座標追跡部32の結果が不適切である判定における、第二閾値の判定については、当該奥行き方向の距離を含めた実空間内の「視点座標」に対して下すようにしてもよい。 In addition, the second-threshold check used to determine whether the result of the viewpoint coordinate tracking unit 32 is inappropriate may be applied to the "viewpoint coordinates" in real space, including the distance in the depth direction.

変換係数算出部34は、以上のように検出された視点座標と、（ユーザ視点において表示部6の表示平面を基準とした見かけ上）立体表示させることを意図して、管理者により予め設定され記憶部5に格納された表示情報と、の位置関係を、射影変換における変換係数の形で算出する。当該算出された変換係数は、推定部3の第一構成（第二構成をなす操作情報算出部35以外の構成）における出力として、制御部4へ出力される。 The conversion coefficient calculation unit 34 calculates, in the form of conversion coefficients of a projective transformation, the positional relationship between the viewpoint coordinates detected as described above and the display information that the administrator has set in advance and stored in the storage unit 5 with the intention of displaying it stereoscopically (that is, so that from the user's viewpoint it appears as a solid relative to the display plane of the display unit 6). The calculated conversion coefficients are output to the control unit 4 as the output of the first configuration of the estimation unit 3 (the configuration other than the operation information calculation unit 35, which forms the second configuration).

制御部4は、当該出力された変換係数によって予め設定された変換式（射影変換の変換式）を適用する。当該適用により、見かけ上において立体表示すべく予め設定して記憶部5に格納されていた表示情報が、表示部6で実際に表示する座標へと変換される。この結果、予め記憶部5に格納する際に管理者が意図した通りに、表示部6を見るユーザには立体が見えることとなる。 The control unit 4 applies a conversion expression (projection conversion conversion expression) preset by the output conversion coefficient. With this application, the display information that has been preset and stored in the storage unit 5 for apparent stereoscopic display is converted into coordinates that are actually displayed on the display unit 6. As a result, the user who sees the display unit 6 can see a solid as intended by the administrator when storing in the storage unit 5 in advance.

図５は、変換係数算出部34における当該変換係数の算出と、制御部4の制御によって当初意図された通りの立体が見えることを説明するための図である。 FIG. 5 is a diagram for explaining that a solid as originally intended can be seen by the calculation of the conversion coefficient in the conversion coefficient calculation unit 34 and the control of the control unit 4.

図５では、視点座標Eと、ユーザが立体であると知覚する仮想的な3次元座標Xと、表示部6の表示平面Pと、当該仮想的な3次元座標Xの表示平面Pにおける実際の表示座標Yと、が示されている。ここで、ユーザに知覚させることを意図した仮想的な3次元座標Xにおける立体は、表示平面Pを基準として予め設定されており、表示情報として記憶部5に記憶されている。なお、図５の(1)及び(2)については補足事項として後述する。 FIG. 5 shows the viewpoint coordinates E, the virtual three-dimensional coordinates X that the user perceives as a solid, the display plane P of the display unit 6, and the actual display coordinates Y, on the display plane P, that correspond to the virtual three-dimensional coordinates X. Here, the solid at the virtual three-dimensional coordinates X that the user is intended to perceive is set in advance with reference to the display plane P and is stored in the storage unit 5 as display information. Items (1) and (2) in FIG. 5 are described later as supplementary matters.

実線矢印にて図示するように、変換係数算出部34は、視点座標Eから見て表示情報の仮想的な3 次元座標Xを表示平面P上の座標Yに射影（投影）する際の射影変換行列Mを、以下に述べる式(1)〜式(9)にて算出することができる。当該射影により例えば実線矢印上に示すように、仮想3次元座標Xにおける点q1〜q4がそれぞれ、表示平面P上の座標Yにおける点p1〜p4へと投影される。 As illustrated by the solid arrows, the conversion coefficient calculation unit 34 can calculate, by equations (1) to (9) described below, the projective transformation matrix M that projects the virtual three-dimensional coordinates X of the display information, as seen from the viewpoint coordinates E, onto the coordinates Y on the display plane P. By this projection, as shown along the solid arrows for example, the points q1 to q4 at the virtual three-dimensional coordinates X are projected onto the points p1 to p4, respectively, at the coordinates Y on the display plane P.

一方、当該射影の向きとは逆向きに光線がユーザの目に届くことで、ユーザは仮想3次元座標Xにおける点q1〜q4の箇所に、表示平面P上に垂直して浮かんだ長方形の看板等があたかも実際に存在しているかのように知覚（錯覚）するが、その実体は、制御部4によって制御された表示部6において、表示平面P自身に表示される歪んだ形状p1〜p4となっている。なお、図５では射影の関係を簡素に例示すべく長方形q1〜q4を示しているが、3D-CGモデル等の所望の立体を表示可能である。 On the other hand, because light rays reach the user's eyes in the direction opposite to this projection, the user perceives (as an illusion) a rectangular signboard or the like standing perpendicular to the display plane P at the points q1 to q4 of the virtual three-dimensional coordinates X, as if it actually existed there. In reality, what is shown is the distorted shape p1 to p4 displayed on the display plane P itself by the display unit 6 under the control of the control unit 4. FIG. 5 shows the rectangle q1 to q4 only to illustrate the projection relationship simply; any desired solid, such as a 3D-CG model, can be displayed.

射影変換行列Mの算出を説明する。まず、座標Yは平面P 上にあるので、次式(1)が成り立つ。 The calculation of the projective transformation matrix M will be described. First, since the coordinate Y is on the plane P, the following equation (1) is established.

なお、式(1)にて座標X 及びY はそれぞれの3 次元座標(x , y , z) の斉次座標を表し、平面P は平面方程式ax + by + cz + d = 0 の係数ベクトルを表す。 In Equation (1), the coordinates X and Y represent the homogeneous coordinates of the respective three-dimensional coordinates (x, y, z), and the plane P represents the coefficient vector of the plane equation ax + by + cz + d = 0. Represent.

平面座標Y は3 次元座標X と視点座標E を結ぶ直線上にあるので、次式(2),(3)が成り立つ。 Since the plane coordinate Y is on a straight line connecting the three-dimensional coordinate X and the viewpoint coordinate E, the following equations (2) and (3) hold.

なお、視点座標E も3 次元座標(x , y , z) の斉次座標を表す。上式(2),(3)での0 はスカラーなので解は次式(4),(5)が得られる。 Note that the viewpoint coordinates E also represent the homogeneous coordinates of the three-dimensional coordinates (x, y, z). Since 0 in the above equations (2) and (3) is a scalar, the following equations (4) and (5) can be obtained.

後者を代入すると、次式(6),(7),(8)が得られる。 Substituting the latter gives the following equations (6), (7), (8).

よって、射影変換行列M は次式(9)で得られる。ただし、I は単位行列である。 Therefore, the projective transformation matrix M is obtained by the following equation (9). Where I is the identity matrix.
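The numbered equations are not reproduced in this text. The following is a hedged reconstruction that is consistent with the definitions above (homogeneous 4-vectors X, Y, E, plane coefficient vector P, identity matrix I); the scalar names s and t are assumptions, and the intermediate steps may differ in form from the original equations (1) to (9).

```latex
% Y lies on the plane P:
P^{\top} Y = 0
% Y lies on the line through X and E (s, t are scalars):
Y = s\,X + t\,E
% Substituting into the plane condition:
s\,P^{\top}X + t\,P^{\top}E = 0
% Homogeneous coordinates are defined up to scale, so one solution is
s = P^{\top}E, \qquad t = -P^{\top}X
% which gives
Y = (P^{\top}E)\,X - (P^{\top}X)\,E = \left[(P^{\top}E)\,I - E\,P^{\top}\right] X
% and therefore
M = (P^{\top}E)\,I - E\,P^{\top}
```

As a check, multiplying the last expression for Y by P^T gives (P^T E)(P^T X) - (P^T X)(P^T E) = 0, so Y does lie on the plane.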

なお、次式(10)に示すように、制御部4では、表示情報の仮想的な3 次元座標X（すなわち、記憶部5に格納された立体モデル)に射影変換行列M を適用することで、平面P 上の座標Y を算出する。 As shown in the following equation (10), the control unit 4 applies the projection transformation matrix M to the virtual three-dimensional coordinates X of the display information (that is, the three-dimensional model stored in the storage unit 5). Then, the coordinate Y on the plane P is calculated.
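A minimal numerical sketch of applying a projective transformation matrix of this kind is shown below. It assumes the form M = (P·E) I − E Pᵀ, which follows from the plane and line conditions stated above but is not quoted from the original equations; the function names are hypothetical.

```python
import numpy as np

def projection_matrix(plane, eye):
    """Build a 4x4 matrix projecting points onto `plane` as seen from `eye`.

    plane: coefficients (a, b, c, d) of ax + by + cz + d = 0.
    eye:   viewpoint coordinates (x, y, z) in the same coordinate system.
    Assumed form: M = (P . E) I - E P^T with E in homogeneous coordinates.
    """
    P = np.asarray(plane, dtype=float)
    E = np.append(np.asarray(eye, dtype=float), 1.0)
    return np.dot(P, E) * np.eye(4) - np.outer(E, P)

def project_point(M, point):
    """Apply Y = M X to a 3-D point and return the dehomogenized result."""
    Y = M @ np.append(np.asarray(point, dtype=float), 1.0)
    return Y[:3] / Y[3]

# Example: display plane z = 0, viewpoint 10 units above the origin.
M = projection_matrix((0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 10.0))
p = project_point(M, (1.0, 2.0, 5.0))
```

In this example the ray from the viewpoint through (1, 2, 5) meets the plane z = 0 at (2, 4, 0), which is what `project_point` returns.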

なお、立体モデルが複数の立体で構成され、図２のR2で説明した一連のドミノのような、オクルージョンの関係が存在する場合は、視点座標Eから見える最も手前の立体モデルの要素についてのみ、射影変換を適用すればよい。 When the solid model consists of a plurality of solids and an occlusion relationship exists among them, as with the row of dominoes described for R2 in FIG. 2, the projective transformation need only be applied to the frontmost elements of the solid model that are visible from the viewpoint coordinates E.

図５の(1),(2)に関連する補足事項は以下の通りである。 Supplementary matters related to (1) and (2) in FIG. 5 are as follows.

本発明においては図５の(1)に示すように、視点座標Eはまず、撮像画像を解析することにより、撮像部2を基準とした空間座標として推定部3（視点座標検出部31及び視点座標追跡部32）により求められる。 In the present invention, as shown in (1) of FIG. 5, the viewpoint coordinates E are first obtained by the estimation unit 3 (the viewpoint coordinate detection unit 31 and the viewpoint coordinate tracking unit 32), through analysis of the captured image, as spatial coordinates referenced to the imaging unit 2.

一方、(2)に示すように、撮像部2と表示部6(の表示平面P)との空間座標上における位置関係も、本発明においては予め既知のものとして与えておく。例えば、情報端末装置1の筐体において固定して撮像部2及び表示部6を設けておき、それらの位置関係の情報を取得しておく。 On the other hand, as shown in (2), the positional relationship on the spatial coordinates between the imaging unit 2 and the display unit 6 (the display plane P thereof) is also given in advance as a known one in the present invention. For example, the imaging unit 2 and the display unit 6 are provided fixed in the casing of the information terminal device 1, and information on their positional relationship is acquired.

以上、図５の(1),(2)の前提により、情報端末装置1においては視点座標Eと、仮想的な3次元座標Xと、表示平面P及びその表示座標Yと、を共通の3次元座標系によって扱うことが可能となり、上記式(1)〜(10)の算出が可能となる。 Given the premises (1) and (2) of FIG. 5 above, the information terminal device 1 can handle the viewpoint coordinates E, the virtual three-dimensional coordinates X, and the display plane P with its display coordinates Y in a common three-dimensional coordinate system, which makes the calculation of equations (1) to (10) possible.

すなわち、前述の際には説明を省略したが、推定部3（視点座標検出部31及び視点座標追跡部32）では、撮像画像より撮像部2を基準とした視点座標を前述の各処理によって求めた後、さらに、当該図５の(2)の関係を適用して共通座標系に変換することで、視点座標Eを求めている。 That is, although the description is omitted in the above-described case, the estimation unit 3 (the viewpoint coordinate detection unit 31 and the viewpoint coordinate tracking unit 32) obtains the viewpoint coordinates based on the imaging unit 2 from the captured image by the above-described processes. Thereafter, the viewpoint coordinate E is obtained by applying the relationship (2) in FIG. 5 and converting it into the common coordinate system.
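The conversion into the common coordinate system can be illustrated with a rigid transform. This is only a sketch: the rotation `R` and translation `t` stand for the known positional relationship between the imaging unit 2 and the display plane P, and both their representation and the function name are assumptions.

```python
import numpy as np

def camera_to_display(eye_cam, R, t):
    """Convert viewpoint coordinates from the camera frame to the display frame.

    eye_cam: viewpoint (x, y, z) measured relative to the imaging unit.
    R, t:    assumed 3x3 rotation and 3-vector translation describing where the
             camera sits relative to the display plane (known in advance).
    """
    return np.asarray(R, dtype=float) @ np.asarray(eye_cam, dtype=float) + np.asarray(t, dtype=float)

# Example: camera axes rotated 90 degrees about z, mounted 0.1 above the display origin.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.1, 0.0]
E = camera_to_display([1.0, 0.0, 0.5], R, t)
```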

なお、当該共通座標系は任意に定めうるが、表示平面Pを基準として設定しておけばよい。この場合、ユーザに立体として提示する仮想3次元座標X上での各種の3D-CGモデル等を利用した表示情報を、当該表示平面Pを基準としてどのような位置にどのような大きさでどのような向きに見せるかといったことを考慮して、予めマニュアル設定しておくことができる。当該マニュアル設定の際には、周知の3D-CAD（3次元コンピュータ支援設計）を利用して表示情報を用意することができる。当該マニュアル設定された立体情報は前述のように記憶部5に格納され、利用される。 The common coordinate system can be chosen arbitrarily, but it may simply be set with the display plane P as the reference. In this case, the display information presented to the user as a solid at the virtual three-dimensional coordinates X, using various 3D-CG models or the like, can be set manually in advance, taking into account at what position, in what size, and in what orientation it should appear relative to the display plane P. For this manual setting, the display information can be prepared with well-known 3D-CAD (three-dimensional computer-aided design) tools. The manually set solid information is stored in the storage unit 5 and used as described above.

なおまた、図５の(2)に示した撮像部2と表示部6との位置関係は、固定値を予め与えておく代わりに、変化することを許容して次のようにして求めてもよい。すなわち、表示平面P上の所定位置に周知の正方マーカ等のAR(拡張現実マーカ)を設けておき、当該マーカを撮像部2で撮像し、推定部3では当該マーカの位置及び姿勢を推定することによって、図５の(2)に示した撮像部2と表示部6との位置関係をリアルタイムで取得するようにしてもよい。なお、当該マーカは、表示平面P上ではなく、表示平面Pに対する既知の位置関係において設けられていても、当該既知の位置関係を追加で利用することにより、同様の処理が可能となる。 In addition, the positional relationship between the imaging unit 2 and the display unit 6 shown in (2) of FIG. 5 may be obtained in the following manner while allowing a change, instead of giving a fixed value in advance. Good. That is, an AR (augmented reality marker) such as a known square marker is provided at a predetermined position on the display plane P, the marker is imaged by the imaging unit 2, and the estimation unit 3 estimates the position and orientation of the marker. Accordingly, the positional relationship between the imaging unit 2 and the display unit 6 shown in (2) of FIG. 5 may be acquired in real time. Even if the marker is provided not in the display plane P but in a known positional relationship with respect to the display plane P, the same processing can be performed by additionally using the known positional relationship.

ここで、図５を参照して、前述の制御部4による現実感を高めるための第一処理（ぼかし処理）及び第二処理（影の生成処理）を説明する。 Here, with reference to FIG. 5, the first process (blurring process) and the second process (shadow generation process) for enhancing the reality by the control unit 4 will be described.

第一処理は、以上説明した図５より明らかなように、視点座標Eと、3次元座標Xにおける立体の各点qと、の間の距離d(E, q)を、周知のぼかし処理におけるぼかし度合いのパラメータとして利用して、当該仮想立体の点qに対応する表示平面P上の実際の点p（qをEから見て平面P上に投影した位置がp）にぼかし処理を施すことにより可能となる。 As is clear from FIG. 5 described above, the first process can be realized by using the distance d(E, q) between the viewpoint coordinates E and each point q of the solid at the three-dimensional coordinates X as the blur-strength parameter of a well-known blurring process, and applying that blurring to the actual point p on the display plane P that corresponds to the point q of the virtual solid (p being the position obtained by projecting q onto the plane P as seen from E).
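One possible way to turn the distance d(E, q) into a blur strength is sketched below. The linear mapping through `gain` and the choice of a Gaussian kernel are assumptions for illustration; the text only states that the distance is used as the blur-strength parameter of a known blurring process.

```python
import numpy as np

def blur_sigma(eye, q, gain=0.1):
    """Map the distance d(E, q) to a Gaussian sigma (assumed linear mapping)."""
    d = float(np.linalg.norm(np.asarray(q, dtype=float) - np.asarray(eye, dtype=float)))
    return gain * d

def gaussian_kernel(sigma, radius=3):
    """Return a normalized 1-D Gaussian kernel to apply around the point p."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

sigma = blur_sigma((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), gain=0.1)
kernel = gaussian_kernel(sigma)
```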

第二処理も、3D-CAD（3次元コンピュータ支援設計）分野等において周知の光源設定処理や、その他の処理により、3次元座標Xにおいて「影」を新たに生成することによって可能となる。なお、一例では、当該生成される影は、表示平面P上に投影されるように構成してよい。 The second processing can also be performed by newly generating a “shadow” in the three-dimensional coordinate X by a well-known light source setting processing in the 3D-CAD (three-dimensional computer-aided design) field and other processing. In one example, the generated shadow may be configured to be projected on the display plane P.

図６は、第二処理の例を説明するための図であり、(1)は影を設定しない状態を、(2)及び(3)は、(1)に対して影を設定した例を示している。(1)では、表示平面P上に、制御部4による加工によって、当該ユーザの視点座標から見た際に立体として見える表示情報D1が示されている。当該立体D1は、表示平面P上に配置された直方体である。図６では、表示平面P上に影を生成する例を説明するが、その他の平面上に生成してもよい。 FIG. 6 is a diagram for explaining an example of the second processing. (1) shows a state in which no shadow is set, and (2) and (3) show examples in which a shadow is set for (1). Show. In (1), display information D1 that appears as a solid when viewed from the viewpoint coordinates of the user is shown on the display plane P by processing by the control unit 4. The solid D1 is a rectangular parallelepiped arranged on the display plane P. Although FIG. 6 illustrates an example in which a shadow is generated on the display plane P, it may be generated on another plane.

一例では、(2)に示すように、表示情報D1が表示平面P1に接する領域を当該平面P1において所定割合だけ拡大したものを、影S1としてもよい。一例では、(3)に示すように、表示平面Pの座標系における所定位置L2に点光源を配置して、影S2を生成してもよい。あるいは、同じく(3)に示すように、表示平面Pの座標系における所定方向D2から、平行光線が射し込むようにして、影S2を生成してもよい。 In one example, as shown in (2), a region where the display information D1 is in contact with the display plane P1 is enlarged by a predetermined ratio on the plane P1 may be used as the shadow S1. In one example, as shown in (3), a point light source may be arranged at a predetermined position L2 in the coordinate system of the display plane P to generate the shadow S2. Alternatively, as shown in (3), the shadow S2 may be generated by allowing parallel rays to enter from a predetermined direction D2 in the coordinate system of the display plane P.

ここで、所定位置L2には、ユーザの視点座標Eを利用してもよいし、当該視点座標Eに対して所定の位置関係にある位置、例えば視点座標Eから表示平面Pの垂直方向に所定高さだけ上昇した位置など、を利用してもよい。所定方向D2には、図７を参照して以下説明する手法で算出される太陽方位（太陽高度h と太陽方位角a）を利用してもよい。 Here, the user's viewpoint coordinates E may be used as the predetermined position L2, or a position having a predetermined positional relationship to the viewpoint coordinates E may be used, for example a position raised from the viewpoint coordinates E by a predetermined height in the direction perpendicular to the display plane P. As the predetermined direction D2, the solar direction (solar altitude h and solar azimuth angle a) calculated by the method described below with reference to FIG. 7 may be used.
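Both shadow variants (point light source at L2, parallel rays from direction D2) can be sketched with a planar projection matrix. This is an illustrative assumption rather than the method of the text, which only refers to well-known light-source processing: a point light is written in homogeneous coordinates with w = 1 and a direction with w = 0, and the function names are hypothetical.

```python
import numpy as np

def shadow_matrix(plane, light, directional=False):
    """Build a 4x4 matrix that projects vertices onto `plane` along light rays.

    plane: coefficients (a, b, c, d) of ax + by + cz + d = 0.
    light: position of a point light, or ray direction when directional=True.
    """
    P = np.asarray(plane, dtype=float)
    # w = 1 for a position (point light), w = 0 for a direction (parallel rays).
    L = np.append(np.asarray(light, dtype=float), 0.0 if directional else 1.0)
    return np.dot(P, L) * np.eye(4) - np.outer(L, P)

def cast_shadow(M, vertex):
    """Project one vertex of the solid to its shadow position on the plane."""
    Y = M @ np.append(np.asarray(vertex, dtype=float), 1.0)
    return Y[:3] / Y[3]

plane = (0.0, 0.0, 1.0, 0.0)  # display plane taken as z = 0
s_point = cast_shadow(shadow_matrix(plane, (0.0, 0.0, 10.0)), (1.0, 1.0, 5.0))
s_parallel = cast_shadow(shadow_matrix(plane, (1.0, 0.0, -1.0), directional=True), (0.0, 0.0, 2.0))
```

Applying `cast_shadow` to every vertex of the solid's outline gives the shadow polygon S2 on the plane.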

緯度L，経度φである地点での太陽高度h と太陽方位角a は球面三角法を適用すると次式(11), (12), (13)の関係がある。 The solar altitude h and the solar azimuth angle a at the point of latitude L and longitude φ have the following equations (11), (12), and (13) when the spherical trigonometry is applied.

ここで、H，d はそれぞれ時角、太陽赤経である。なお、日本の標準時間JST と標準子午線(135 度) を使うと、時角H は以下の式(14)で与えられる。 Here, H and d are the hour angle and the solar right ascension, respectively. Using Japan Standard Time JST and the standard meridian (135 degrees), the hour angle H is given by the following equation (14).

ここで、Eq は平均太陽時による時刻と真太陽時による時刻との差分(均時差) である。均時差Eq は次式(15)で得られる。 Here, Eq is the difference between the time in mean solar time and the time in true solar time (the equation of time). The equation of time Eq is obtained by the following equation (15).

ここで、w = 2 π/ 365(閏年はw = 2 π/ 366)、J は元旦からの通算日数-1 である。よって、太陽高度h 及び太陽方位a は以下の式(16),(17)で与えられる。 Here, w = 2π/365 (w = 2π/366 in a leap year), and J is the day count from New Year's Day minus 1. The solar altitude h and the solar azimuth a are therefore given by the following equations (16) and (17).

ここで、 α は太陽赤緯であり、次式(18)で求められる。 Where α is the solar declination and is obtained by the following equation (18).

なお、太陽方位a は真南を0 度とし、南西方向を正、南東方向を負の角度で表している。 The solar azimuth a is measured with due south as 0 degrees, with angles toward the southwest positive and angles toward the southeast negative.

以上のようにして、太陽方位（太陽高度h と太陽方位角a）が算出される。なお、入力として必要な、緯度L、経度φ、日本の標準時間JST及び元旦からの通算日数-1=Jは、ネットワークから情報を収集する等により、また、水平面と表示平面Pとの角度は、傾きセンサ等により、別途取得すればよい。あるいは、表示部6の表示平面の設置箇所が予め固定され既知である場合は、時間及び通算日数以外はマニュアルで与えてもよい。 The solar direction (solar altitude h and solar azimuth angle a) is calculated as described above. The required inputs, namely the latitude L, the longitude φ, Japan Standard Time JST, and J (the day count from New Year's Day minus 1), may be obtained separately, for example by collecting information from a network, and the angle between the horizontal plane and the display plane P may be obtained separately with a tilt sensor or the like. Alternatively, when the installation location of the display plane of the display unit 6 is fixed and known in advance, the inputs other than the time and the day count may be given manually.
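Since equations (11) to (18) are not reproduced in this text, the sketch below substitutes widely used approximations: Spencer's Fourier series for the solar declination and the equation of time, and the standard spherical-trigonometry relations for altitude and azimuth. These formulas, the function names, and the descriptive variable names (used instead of the symbols L, φ, d, α above) are assumptions and may differ from the original equations.

```python
import math

def day_angle(J, leap=False):
    """w * J, with J = day count from New Year's Day minus 1."""
    return 2.0 * math.pi * J / (366 if leap else 365)

def declination(J, leap=False):
    """Solar declination in radians (Spencer's series, an assumed substitute)."""
    g = day_angle(J, leap)
    return (0.006918 - 0.399912 * math.cos(g) + 0.070257 * math.sin(g)
            - 0.006758 * math.cos(2 * g) + 0.000907 * math.sin(2 * g)
            - 0.002697 * math.cos(3 * g) + 0.001480 * math.sin(3 * g))

def equation_of_time_hours(J, leap=False):
    """Equation of time in hours (Spencer's series, an assumed substitute)."""
    g = day_angle(J, leap)
    minutes = 229.18 * (0.000075 + 0.001868 * math.cos(g) - 0.032077 * math.sin(g)
                        - 0.014615 * math.cos(2 * g) - 0.040849 * math.sin(2 * g))
    return minutes / 60.0

def hour_angle(jst_hours, lon_deg, J, leap=False):
    """Hour angle in radians from JST and the 135-degree standard meridian."""
    solar_time = jst_hours + (lon_deg - 135.0) / 15.0 + equation_of_time_hours(J, leap)
    return math.radians(15.0 * (solar_time - 12.0))

def sun_altitude_azimuth(lat_deg, decl, H):
    """Return (altitude h, azimuth a) in radians; a is 0 at due south, west positive."""
    lat = math.radians(lat_deg)
    sin_h = math.sin(lat) * math.sin(decl) + math.cos(lat) * math.cos(decl) * math.cos(H)
    h = math.asin(max(-1.0, min(1.0, sin_h)))
    a = math.atan2(math.cos(decl) * math.sin(H),
                   math.sin(lat) * math.cos(decl) * math.cos(H) - math.cos(lat) * math.sin(decl))
    return h, a
```

The resulting direction would still have to be rotated by the measured tilt of the display plane P relative to the horizontal plane before it is used as the direction D2.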

ここで、図３に戻り、第二構成としての操作情報算出部35を説明する。操作情報算出部35は、周知の各種の手法により、撮像画像より手、指、指示棒その他の所定の対象によるユーザのジェスチャーを検出し、当該ジェスチャーに基づく操作情報を算出する。例えば、手の動きに関する速度ベクトルとして、操作情報を算出する。当該算出の際は、第一構成にて検出された視点座標を基準としてもよく、例えば視点座標の周りで回転された手を検出することで、角速度としての操作情報を算出してもよい。 Here, returning to FIG. 3, the operation information calculation unit 35 as the second configuration will be described. The operation information calculation unit 35 detects a user's gesture by a predetermined target such as a hand, a finger, an indicator stick, or the like from the captured image by various known methods, and calculates operation information based on the gesture. For example, the operation information is calculated as a velocity vector related to hand movement. In the calculation, the viewpoint coordinates detected in the first configuration may be used as a reference. For example, the operation information as the angular velocity may be calculated by detecting a hand rotated around the viewpoint coordinates.

制御部4は、上記算出された操作情報を受け取り、射影変換を適用する前の立体モデルとしての表示情報を、当該操作情報に応じて変化させる。例えば、検出された速度ベクトルに基づいて、表示平面P上での位置を移動してもよいし、検出された角速度に基づいて、表示平面P上での配置を回転してもよい。3D-CGで構成されていれば、そのパラメータ(大きさなど)を当該操作情報に応じて変化させてもよい。 The control unit 4 receives the calculated operation information and changes display information as a three-dimensional model before applying projective transformation according to the operation information. For example, the position on the display plane P may be moved based on the detected velocity vector, and the arrangement on the display plane P may be rotated based on the detected angular velocity. If configured with 3D-CG, its parameters (size, etc.) may be changed according to the operation information.

当該変化された表示情報に、第一構成における変換係数が適用されることにより、ユーザの立場では、表示部6にて表示されている立体が、自身のジェスチャーに応じて移動したり、回転されたりすることにより、直感的な操作が可能となる。 When the conversion coefficients of the first configuration are applied to the display information changed in this way, the solid displayed on the display unit 6 moves or rotates, from the user's standpoint, in response to the user's own gestures, which enables intuitive operation.

1…情報端末装置、2…撮像部、3…推定部、4…制御部、5…記憶部、6…表示部 Description of symbols: 1 ... information terminal device, 2 ... imaging unit, 3 ... estimation unit, 4 ... control unit, 5 ... storage unit, 6 ... display unit

Claims

An information terminal device having an imaging unit and a display unit, comprising:
a storage unit that stores display information as a predetermined solid defined with reference to a display plane of the display unit;
an estimation unit that detects the user's viewpoint coordinates from a captured image captured by the imaging unit, and estimates the positional relationship between the viewpoint coordinates and the predetermined solid defined with reference to the display plane in the display information; and
a control unit that processes the display information read out from the storage unit according to the estimated positional relationship and then displays it on the display unit, so that the processed display information appears, when viewed from the detected viewpoint coordinates, as the predetermined solid with the display plane as a reference,
wherein the estimation unit further detects a gesture of the user from the captured image captured by the imaging unit and calculates operation information based on the gesture, and
wherein the control unit, when performing control so that the display information appears as the predetermined solid with the display plane as a reference, further performs control so that the solid is changed according to the operation information.

The information terminal device according to claim 1, wherein the estimation unit detects the user's eye region from the captured image as the viewpoint coordinates, and estimates the positional relationship as a relationship in which the predetermined solid defined with reference to the display plane is projected from the viewpoint coordinates toward the display plane.

The information terminal apparatus according to claim 2, wherein the estimation unit obtains the projection relationship as a projective transformation matrix.

The information terminal apparatus according to claim 1, wherein the estimation unit detects viewpoint coordinates based on a feature amount extracted from the captured image.

The imaging unit continuously performs imaging on a time series,
The information terminal device according to claim 4, wherein the estimation unit detects the viewpoint coordinates by extracting the feature amount for some of the captured images in the time series, and, for the other captured images in the time series, detects the viewpoint coordinates by template matching using a small region that contains the viewpoint coordinates detected by extracting the feature amount in the most recent past.

The information terminal device according to claim 1, wherein the control unit, when performing the processing, applies blurring to the display information corresponding to each position of the predetermined solid defined with reference to the display plane in the display information, based on the distance from the detected viewpoint coordinates to that position.

The information terminal device according to claim 1, wherein the control unit, when performing the processing, first adds a shadow to the predetermined solid defined with reference to the display plane in the display information.

The control unit calculates a sun azimuth based on the inclination of the display plane of the imaging unit with respect to a horizontal plane, longitude / latitude and date / time information, and generates the added shadow based on the sun azimuth. The information terminal device according to claim 7.