
JP6341540B2 - Information terminal device, method and program

Info

Publication number: JP6341540B2
Application number: JP2014198392A
Authority: JP
Inventors: 加藤 晴久 (Haruhisa Kato)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2014-09-29
Filing date: 2014-09-29
Publication date: 2018-06-13
Anticipated expiration: 2034-09-29
Also published as: JP2016071496A

Description

The present invention relates to an information terminal device, method, and program that present information according to the relative positional relationship with an imaging target.

A device that presents information according to its relative positional relationship with an imaging target can change the presented information intuitively, which improves user convenience. The following techniques for realizing such a device have been published.

Non-Patent Document 1 discloses a method that assumes the shape of a black-and-white marker is known, calculates position and orientation from the deformation of the marker shape extracted from an image, and distinguishes multiple marker types by the pattern inside the marker.

Patent Document 1 discloses a method that reduces the marker detection load by combining color information with a circular shape, in order to lighten the processing load and detect markers more accurately.

Patent Document 2 discloses a method that calculates the processing load and distributes it across multiple terminals, so that an augmented reality composite image can be displayed at a practical speed even when rendering the virtual object image is costly.

Patent Document 1: JP 2012-216029 A
Patent Document 2: JP 2012-220986 A

Non-Patent Document 1: Kato, M. Billinghurst, Asano, Tachibana, "Augmented Reality System Based on Marker Tracking and Its Calibration" (マーカー追跡に基づく拡張現実感システムとそのキャリブレーション), Transactions of the Virtual Reality Society of Japan, vol.4, no.4, pp.607-617, 1999.

Non-Patent Document 1 and Patent Document 1 require a marker, so the environments in which they can be used are restricted.

Furthermore, when high-definition information or a large amount of information is presented, generating and rendering that information imposes a heavier load than marker detection, so reducing the marker detection load contributes little to reducing the overall processing load.

Patent Document 2 partially solves the above problem by distributing processing across multiple terminals. However, the delay caused by the communication that distribution requires is perceived as misalignment when the terminal is moved, so the technique cannot be used in applications that demand real-time performance.

An object of the present invention is to solve the above problems of the prior art and to provide an information terminal device, method, and program with a low processing load that reflect the relative positional relationship between the imaging target and the imaging unit in the content and can also display high-quality content in real time.

To achieve the above object, the present invention is an information terminal device comprising: an imaging unit that performs imaging to acquire a captured image; an estimation unit that identifies a predetermined imaging target from the captured image and estimates the relative position and orientation between the imaging target and the imaging unit; a storage unit that stores, for each imaging target, related information rendered in advance for each of its positions and orientations; a control unit that reads from the storage unit the related information corresponding to the identified imaging target and the estimated position and orientation and generates presentation information; and a presentation unit that presents the generated presentation information.

The present invention is also a method comprising: an imaging step of performing imaging to acquire a captured image; an estimation step of identifying a predetermined imaging target from the captured image and estimating the relative position and orientation between the imaging target and the imaging unit; a storage step of storing, for each imaging target, related information rendered in advance for each of its positions and orientations; a control unit that reads, from the related information stored in the storage step, the related information corresponding to the identified imaging target and the estimated position and orientation and generates presentation information; and a presentation step of presenting the generated presentation information.

Furthermore, the present invention is a program that causes a computer to function as the information terminal device.

According to the present invention, the storage unit stores in advance, for each imaging target, related information rendered beforehand for each of its positions and orientations, and the related information corresponding to the imaging target identified in the captured image and its position and orientation is read out to generate and present the presentation information. The computationally expensive rendering for each position and orientation can therefore be omitted, the relative positional relationship between the imaging target and the imaging unit can be reflected in the content at a low computational load, and high-quality content can also be displayed.

FIG. 1 is a functional block diagram of an information terminal device according to one embodiment.
FIG. 2 is a diagram showing an example for schematically explaining the processing in the control unit.
FIG. 3 is a diagram for schematically explaining, using the example of FIG. 2, the embodiments that generate presentation information.
FIG. 4 is a diagram showing an example of a captured image and related information when a real object is used for the related information.
FIG. 5 is a diagram, corresponding to the example of FIG. 4, showing an example of presentation information when external objects are mixed in.

FIG. 1 is a functional block diagram of an information terminal device according to one embodiment. The information terminal device 10 includes an imaging unit 1, an estimation unit 2, a storage unit 3, a control unit 4, and a presentation unit 5. A portable terminal such as a smartphone can be used as the information terminal device 10, but any apparatus or equipment that has the imaging unit 1 may be used, for example a desktop, laptop, or other computer.

All or any part of the storage unit 3, the estimation unit 2, and the control unit 4 may instead be implemented as an external configuration not contained in the information terminal device 10, for example with their functions realized on one or more external servers. In that case, the information needed in the processing described below is exchanged between the externally implemented units 2, 3, and 4 and the information terminal device 10 over a network or the like.

The processing performed by each unit in FIG. 1 is as follows.

The imaging unit 1 images the site (an indoor, outdoor, or other scene) where a predetermined imaging target is placed and outputs the captured image to the estimation unit 2 and the control unit 4. For example, a digital camera of the kind that is standard on many recent portable terminals can be used as the imaging unit 1.

The estimation unit 2 identifies the imaging target in the captured image acquired by the imaging unit 1 and, based on that identification result, estimates the relative position and orientation between the imaging target and the imaging unit 1. The estimation unit 2 outputs the identified imaging target and the estimated position and orientation to the control unit 4 as target information.

If no imaging target can be identified (either because none appears in the captured image, or because one appears but noise or similar effects cause the processing described below to judge it undetected), that fact is output to the control unit 4 as the target information.

One or more predetermined targets (for example, the markers of Non-Patent Document 1 cited above) are set in advance as the imaging targets to be identified, and their feature information in an image is extracted in advance and registered in the estimation unit 2. The estimation unit 2 extracts feature information from the captured image and matches it against the registered feature information, thereby identifying the imaging target captured in the image and estimating the relative position and orientation.

As feature information for an image, local features such as the well-known SIFT or SURF features can be used; these are computed from relative luminance gradients in local regions of the image and are invariant to rotation, scaling, projective change (distortion due to projective transformation), or any combination of these. Alternatively, the well-known ORB features, which have the same properties, may be used. SIFT features are disclosed in Non-Patent Document 2, SURF features in Non-Patent Document 3, and ORB features in Non-Patent Document 4, listed below.

Non-Patent Document 2: D.G. Lowe, "Distinctive image features from scale-invariant keypoints", Int. Journal of Computer Vision (IJCV), 60(2), pp.91-110 (2004).
Non-Patent Document 3: H. Bay, T. Tuytelaars, and L.V. Gool, "SURF: Speeded Up Robust Features", Proc. of Int. Conf. of ECCV (2006).
Non-Patent Document 4: Ethan Rublee, Vincent Rabaud, Kurt Konolige, Gary R. Bradski, "ORB: An efficient alternative to SIFT or SURF", ICCV 2011: 2564-2571.

Well-known image recognition techniques can likewise be used to match the features (feature information) when the estimation unit 2 identifies the imaging target. In addition, by finding the plane projective transformation that maps between the image coordinates of the pre-registered feature information of the imaging target and the captured-image coordinates of the matched feature information extracted from the captured image, the estimation unit 2 can calculate the relative position and orientation between the imaging target and the imaging unit 1. Well-known methods from the field of augmented reality display, such as those disclosed in Non-Patent Document 1 cited above, can be used for this calculation.
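As an illustration of what a plane projective transformation does, the following minimal Python sketch maps a registered feature coordinate (u, v) into captured-image coordinates through a 3x3 matrix H. The function name `apply_homography` and the sample matrices are assumptions made for this example; the specification does not prescribe how H is estimated or represented.

```python
def apply_homography(H, point):
    """Map an image point (u, v) through a 3x3 plane projective transformation H.

    H is a nested list [[h11, h12, h13], [h21, h22, h23], [h31, h32, h33]].
    The point is lifted to homogeneous coordinates (u, v, 1), multiplied by H,
    and divided by the third component to return to image coordinates.
    """
    u, v = point
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    return (x / w, y / w)


# Hypothetical example: scale by 2 and translate by (10, 5).
H_affine = [[2, 0, 10], [0, 2, 5], [0, 0, 1]]
print(apply_homography(H_affine, (3, 4)))

# Hypothetical example with a projective term in the last row.
H_proj = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]
print(apply_homography(H_proj, (2, 4)))
```

In practice H would be estimated from four or more matched feature pairs, and the position and orientation would then be derived from H together with the camera parameters; both steps are outside this sketch.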

The control unit 4 receives the captured image from the imaging unit 1 and the target information from the estimation unit 2, reads the related information corresponding to the target information from the storage unit 3, generates presentation information, and outputs it to the presentation unit 5. The control unit 4 is described in detail later.

The storage unit 3 holds in advance the related information corresponding to each imaging target, position, and orientation that the estimation unit 2 may estimate, and outputs related information as required by the processing in the control unit 4. This related information is prepared beforehand by an administrator or the like, for example by manual work.

The presentation unit 5 can be a display that shows images, and it presents the presentation information received from the control unit 4. In one embodiment, the related information is presented as an augmented reality display corresponding to the imaging target captured by the imaging unit 1. The details are given in the description of the control unit 4 below.

As its overall operation, the information terminal device 10 presents presentation information on the presentation unit 5, via the processing of units 2, 3, and 4, each time the imaging unit 1 obtains a captured image. The imaging unit 1 obtains a captured image P(t) at each time t at a predetermined frame rate, and the presentation unit 5 can present the presentation information D(t) for that time t in real time. Likewise, a single captured image P(t1) (a still image) at a time t1 can be obtained on a user instruction, and the one corresponding piece of presentation information D(t1) can be presented.

In other words, because the present invention obtains presentation information with a small computational load, the presentation unit 5 can present it in real time at a predetermined rate; alternatively, only the presentation information for the captured image at a single time specified by a user instruction may be presented. Even in the latter case, the presentation information can be presented promptly after the user instruction is received, without a long wait for processing to complete, which gives the user a comfortable operating environment.

The control unit 4 (and the presentation unit 5, which presents presentation information according to the output of the control unit 4) is described in detail below.

The control unit 4 first switches its processing depending on whether the image captured by the imaging unit 1 contains an imaging target. As described above, this is determined by whether the estimation unit 2 identified an imaging target, and the result is output to the control unit 4 as the target information. The control unit 4 therefore switches its processing according to that result.

Specifically, the processing is switched as follows. If the target information from the estimation unit 2 indicates that no imaging target was identified, the image captured by the imaging unit 1 is output unchanged as the presentation information. Conversely, if an imaging target was identified, the related information corresponding to the imaging target, position, and orientation in the target information is retrieved from the storage unit 3, superimposed on the captured image, and output as the presentation information.
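The switching performed by the control unit 4 can be sketched in Python as follows. This is only an illustration under assumptions: the target information is represented as `None` when no target is identified and otherwise as a hypothetical `(target_id, pose_index)` pair, the storage unit 3 is modeled as a dictionary, and the superimposition is passed in as a function. None of these names come from the specification.

```python
def generate_presentation(captured_image, target_info, storage, superimpose):
    """Return presentation information for one captured image.

    target_info: None if no imaging target was identified; otherwise a
                 (target_id, pose_index) pair (hypothetical representation).
    storage:     dict mapping (target_id, pose_index) to pre-rendered
                 related information (stands in for the storage unit 3).
    superimpose: function (captured_image, related_info) -> presentation info.
    """
    if target_info is None:
        # No target identified: present the captured image unchanged.
        return captured_image
    # Target identified: look up the pre-rendered related information
    # instead of rendering it at run time, then superimpose it.
    related_info = storage[target_info]
    return superimpose(captured_image, related_info)


# Toy usage with strings standing in for images.
storage = {("M", 1): "R1"}
combine = lambda image, related: image + "+" + related
print(generate_presentation("P4", None, storage, combine))
print(generate_presentation("P1", ("M", 1), storage, combine))
```

The only run-time work on the identified-target path is a lookup and a superimposition, which is where the low processing load described above comes from.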

The related information is prepared in advance as predetermined content that realizes an augmented reality display or the like when superimposed on the captured image, and is stored in the storage unit 3. If the related information is prepared so that it also specifies which image is to be superimposed over which range of coordinates (u, v) of the captured image, the present invention can superimpose the related information and generate the presentation information at high speed.

For example, in conventional marker-based superimposition such as that of Non-Patent Document 1 cited above, the image to be superimposed must be computed in image coordinates (u, v) every time, using the correspondence between the marker's three-dimensional coordinates (X, Y, Z) and the image coordinates (u, v) that is identified when the camera position and orientation parameters are obtained. In the present invention, by contrast, this computation is performed in advance and its results are held in the storage unit 3 as related information (related information as a function of position and orientation), so processing can be accelerated.

Moreover, because of the computational resource constraints of real-time display, the prior art assumes that the superimposed image is a two-dimensional image of predetermined content displayed in three-dimensional space (with tilt and so on); it does not consider superimposing a three-dimensional solid object in three-dimensional space. In the present invention, in the same way as above, related information is computed in advance as an image representing how the three-dimensional object appears at each position and orientation and is stored in the storage unit 3, which makes superimposed display of three-dimensional objects possible as well.

The related information to be superimposed may be prepared as a part of the full range of coordinates (u, v) of the captured image, or as the whole of the captured image. When it is prepared as the whole image, the related information can be output to the presentation unit 5 directly as the presentation information, without superimposition. (This output can also be regarded as the entire captured image being "overwritten" by the related information, that is, as a special case of superimposing related information on the captured image.)

FIG. 2 shows, in parts [1] to [4], an example that schematically illustrates the processing of the control unit 4, including the processing described above. Part [1] shows the site F imaged by the imaging unit 1. At site F, a rectangular-parallelepiped display stand S stands on the ground G, and an imaging target M (for example, a square marker) whose feature information has been registered in the estimation unit 2 in advance is placed on the top surface of the stand S.

The images captured by the imaging unit 1 at positions and orientations C1, C2, C3, and C4 at site F in [1] are shown in [2] as captured images P1, P2, P3, and P4, and the presentation information generated from them is shown in [3] as presentation information D1, D2, D3, and D4.

At positions and orientations C1, C2, and C3, the imaging unit 1 images the imaging target M from roughly the front left, the front, and the front right, respectively. Because captured images P1, P2, and P3 contain the imaging target M, related information is read from the storage unit 3 and superimposed on (or otherwise combined with) P1, P2, and P3 to generate presentation information D1, D2, and D3. As shown in [4], the presentation information generated this way processes the captured image so that a predetermined object O (a rectangular-parallelepiped object in the example of FIG. 2) appears to actually exist on the imaging target M at site F, providing an augmented reality display.

At position and orientation C4, on the other hand, the imaging unit 1 cannot capture the imaging target M (or any other imaging target, not shown), so the captured image P4 is used unchanged as the presentation information D4.

FIG. 3 schematically illustrates, using the example of FIG. 2, the embodiments for generating presentation information. Parts [3-1] and [3-2] each show, for one embodiment, how presentation information D1 is generated from captured image P1 of FIG. 2.

[3-1] of FIG. 3 shows an embodiment in which presentation information D1 is obtained by superimposing on captured image P1 the related information R1, which is held in the storage unit 3 in advance and occupies a partial region of the image (D1 = P1 + R1). As described above, related information R1 is prepared in advance, and held in the storage unit 3, as an image in which the predetermined object O appears to actually exist in relation to the imaging target M at the position and orientation C1 (FIG. 2) of captured image P1. Accordingly, as [3-1] shows, R1 is prepared for the image region corresponding to the superimposed object O.

[3-2] of FIG. 3, on the other hand, shows an embodiment in which presentation information D1 is obtained directly from related information R10, which is held in the storage unit 3 in advance and replaces the entire captured image P1 (D1 = R10). As described above, related information R10 is prepared in advance, and held in the storage unit 3, as a whole image containing both the object O and its background, in which the object O appears to actually exist in relation to the imaging target M at the position and orientation C1 (FIG. 2) of captured image P1.
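The two cases D1 = P1 + R1 and D1 = R10 can be illustrated with one small Python function. The representation is an assumption made for this sketch: images are nested lists of pixel values, `None` marks a transparent pixel in the related information, and `offset` gives the (row, column) at which the related information is placed. The specification does not fix any of these details.

```python
def superimpose(captured, related, offset=(0, 0)):
    """Overlay `related` onto a copy of `captured`.

    captured: image as a list of rows of pixel values.
    related:  pre-rendered related information in the same format; a pixel
              value of None is treated as transparent (assumption).
    offset:   (row, column) of the top-left corner of `related` in `captured`.
    """
    out = [row[:] for row in captured]  # leave the captured image untouched
    top, left = offset
    for dy, row in enumerate(related):
        for dx, pixel in enumerate(row):
            if pixel is not None:
                out[top + dy][left + dx] = pixel
    return out


P1 = [[0, 0, 0],
      [0, 0, 0]]

# Partial superimposition, as in [3-1]: R1 covers only part of the image.
R1 = [[9, None]]
print(superimpose(P1, R1, offset=(1, 1)))

# Whole-image replacement, as in [3-2]: R10 covers every pixel.
R10 = [[5, 5, 5],
       [5, 5, 5]]
print(superimpose(P1, R10))
```

When the related information covers the whole image, the result equals the related information itself, which matches the remark that whole-image replacement is a special case of superimposition.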

ここで、図３の[3-1]のように画像の一部を重畳する実施形態は、重畳させようとする関連情報R1が例えばコンピュータグラフィックで描画される対象である場合に適用することができる。この場合、撮像対象が取りうる位置、姿勢に応じて事前に描画しておき記憶部3に記憶させておくことで高品位な関連情報を生成しておく。 Here, the embodiment in which a part of the image is superimposed as in [3-1] in FIG. 3 may be applied when the related information R1 to be superimposed is an object to be drawn by, for example, computer graphics. it can. In this case, high-quality related information is generated by drawing in advance according to the position and orientation that can be taken by the imaging target and storing them in the storage unit 3.

より高い臨場感を実現するには、記憶部3に記憶させるための事前描画時と撮像部1で実際に撮像する時とで光源分布は変化させないことを前提として、光源分布を事前に推定しコンピュータグラフィックスの描画に反映させることが望ましい。この場合、周知のレイトレーシング技術を用いることにより、拡張現実表示で重畳させようとする物体等の表面の光沢や、光源分布によって生ずる当該物体等の陰及び影を含めて関連情報を描画することで、臨場感のある高品位な関連情報を事前に用意しておくことができる。 In order to achieve a higher sense of reality, the light source distribution is estimated in advance on the assumption that the light source distribution does not change between pre-drawing for storage in the storage unit 3 and actual imaging with the imaging unit 1. It is desirable to reflect it in the drawing of computer graphics. In this case, by using a well-known ray tracing technique, the related information including the gloss of the surface of the object etc. to be superimposed in the augmented reality display and the shadow and shadow of the object etc. caused by the light source distribution is drawn. Therefore, high-quality related information with a sense of reality can be prepared in advance.

一方、図３[3-2]のように画像全体を置き換える実施形態は、関連情報に例えば商品等の実在の物体を用いる場合（実在の物体として表示させたい場合）に適用することができる。 On the other hand, the embodiment in which the entire image is replaced as shown in FIG. 3 [3-2] can be applied to the case where an actual object such as a product is used as the related information (when it is desired to be displayed as an actual object).

ただし、上記のように提示情報において撮像画像の全体がいわば「差し替えられている」場合、高い臨場感を実現するには、次のようにすることが好ましい。まず、関連情報は、実際に撮像が行われる現場（あるいは当該現場を模した現場）において、重畳させたい実在物体を撮像対象に対する所定位置姿勢で配置したうえで、撮像部1と同様のカメラパラメータを有するカメラを用いた事前撮影によって用意する。そして、実際に撮像部1が撮像を行う環境においても、撮像対象および周辺物体、光源は（上述のコンピュータグラフィックスの場合と同様に）事前撮影時と同等の環境が維持されるようにしておくことが望ましい。 However, in the case where the entire captured image is “replaced” in the presentation information as described above, in order to achieve a high sense of realism, it is preferable to do the following. First, the related information is obtained by arranging the actual object to be superimposed at a predetermined position and orientation with respect to the imaging target at the site where the imaging is actually performed (or the site imitating the site), and the same camera parameters as those of the imaging unit 1 Prepare by pre-shooting using a camera with Even in the environment where the image capturing unit 1 actually captures an image, the image capturing target, the surrounding objects, and the light source are maintained in an environment equivalent to that during pre-shooting (similar to the case of computer graphics described above). It is desirable.

図４は、関連情報に実在の物体を用いる場合の撮像画像と関連情報との例を示す図である。撮像画像P100はパーソナルコンピュータ等が配置された机上の風景を撮像したものであり、撮像対象としての所定模様が付されたマーカがパーソナルコンピュータ手前に配置されている。当該撮像画像P100に対して生成される関連情報R100では、同風景のマーカ上に商品の瓶が配置されている。当該関連情報R100は、同風景に実際に商品の瓶を配置して事前撮影することによって用意することができる。 FIG. 4 is a diagram illustrating an example of a captured image and related information when a real object is used as the related information. The captured image P100 is an image of a landscape on a desk on which a personal computer or the like is arranged, and a marker with a predetermined pattern as an imaging target is arranged in front of the personal computer. In the related information R100 generated for the captured image P100, a product bottle is arranged on the marker of the same landscape. The related information R100 can be prepared by actually placing a product bottle in the same scene and taking a picture in advance.

以上の図３の[3-1]又は[3-2]で例示したいずれの実施形態（関連情報が撮像画像の一部又は全部として用意される実施形態）においても、撮像対象毎に用意される関連情報は、当該撮像対象が実際に撮像部1で撮像される際に取りうる位置及び姿勢のそれぞれにつき、網羅的に用意して記憶部3に保存しておくことが望ましい。 In any of the embodiments illustrated in [3-1] or [3-2] of FIG. 3 above (embodiments in which related information is prepared as part or all of a captured image), the image is prepared for each imaging target. The related information is preferably prepared in a comprehensive manner and stored in the storage unit 3 for each position and orientation that can be taken when the imaging target is actually imaged by the imaging unit 1.

That is, the related information R_i for a given imaging target M is prepared as a function R_i(C_i) of each of a series of exhaustive positions and orientations C_i = (X_i, Y_i, Z_i, r_1i, r_2i, r_3i) (i = 1, 2, …, n, where n is the number of positions and orientations covered). Here, X_i, Y_i, Z_i are the position in the position and orientation C_i (the three translational degrees of freedom), and r_1i, r_2i, r_3i are the orientation in the position and orientation C_i (the three rotational degrees of freedom). In [1] of FIG. 2, the positions and orientations C1, C2, C3 are drawn as examples of part of the series of positions and orientations C_i to be prepared exhaustively.

Various embodiments and supplementary matters of the present invention are described below as (1) to (8).

(1) As described above, the related information R_i for each imaging target M is prepared discretely, as a function R_i(C_i) of a series of positions and orientations C_i set exhaustively in advance. Therefore, when the control unit 4 searches the storage unit 3 for the related information R(t) for the captured image P(t) at each time t, it suffices to obtain the related information R_{i_min}(C_{i_min}) corresponding to the position and orientation C_{i_min} that is closest to the position and orientation C(t) of the imaging unit 1 relative to the imaging target M at that time t.

Here, each of the series of positions and orientations C_i is written in component form as C_i = (X_i, Y_i, Z_i, r_1i, r_2i, r_3i) as above, and the position and orientation C(t) of the imaging unit 1 relative to the imaging target M at time t is likewise written as C(t) = (X(t), Y(t), Z(t), r_1(t), r_2(t), r_3(t)). The index i_min of the closest position and orientation C_{i_min} can then be obtained by the following Equations [1] and [2].

That is, as in Equation [1], the closest position and orientation C_{i_min} is determined by taking as i_min the index i that minimizes the distance d_[position-orientation distance] between the position and orientation C(t) and the position and orientation C_i. The distance between positions and orientations may be obtained as the Euclidean distance in six-dimensional space as in Equation [2], as a distance computed after giving each component a predetermined weight (including a weight of zero, so that the component is ignored), or as a distance under some other definition.
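The equation images for [1] and [2] are not reproduced in this text. From the definitions given above, they can plausibly be reconstructed as follows; this is a reconstruction under that assumption, not the published typesetting, and the unweighted Euclidean form is only one of the variants the description allows.

```latex
i_{\min} = \operatorname*{arg\,min}_{i}\;
  d_{[\text{position-orientation distance}]}\bigl(C_i - C(t)\bigr)
  \tag{1}

d_{[\text{position-orientation distance}]}\bigl(C_i - C(t)\bigr)
  = \sqrt{\,
      (X_i - X(t))^2 + (Y_i - Y(t))^2 + (Z_i - Z(t))^2
      + (r_{1i} - r_1(t))^2 + (r_{2i} - r_2(t))^2 + (r_{3i} - r_3(t))^2
    \,}
  \tag{2}
```

In the weighted variant, each squared term would be multiplied by its own predetermined non-negative weight before summation.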

(2) The embodiment of (1) above uses the distance d_[position-orientation distance] defined between positions and orientations. As another embodiment of the distance definition, the closest position and orientation C_{i_min} may be determined using a distance d_[image distance] defined between the image coordinates (u, v) of predetermined feature points of the imaging target M in the respective positions and orientations (feature points of a predetermined image feature such as SIFT, extracted by the estimation unit 2).

That is, let (u_i[j], v_i[j]) (j = 1, 2, …) be the image coordinates of the feature points extracted in advance for the series of positions and orientations C_i and registered in the estimation unit 2 (and the storage unit 3), and let (u_[j](t), v_[j](t)) (j = 1, 2, …) be the coordinates of the feature points extracted from the captured image P(t) at time t, corresponding to the position and orientation C(t). The closest position and orientation C_{i_min} may then be determined by taking as i_min the index i that minimizes the distance d_[image distance] on the image, according to the following Equations [3] and [4]. The coordinates (u_i[j], v_i[j]) and (u_[j](t), v_[j](t)) that share the subscript j are coordinates whose feature descriptors have been associated with each other by matching.

Here, the sum over the subscript j in Equation [4] is taken over the feature points whose descriptors were matched. A normalization term depending on the number of matches may be added to Equation [4]. Equation [4] defines the distance as a squared distance, but a distance under another definition may be used. Each term summed over j may be given a weight according to the degree of matching (similarity) between the descriptors, or some other predetermined weight.
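The equation images for [3] and [4] are likewise absent from this text. Based on the surrounding definitions (a squared distance summed over matched feature points), a plausible reconstruction is given below. It is offered as an assumption, and the optional weights and normalization term mentioned in the description are omitted.

```latex
i_{\min} = \operatorname*{arg\,min}_{i}\;
  d_{[\text{image distance}]}\bigl(C_i - C(t)\bigr)
  \tag{3}

d_{[\text{image distance}]}\bigl(C_i - C(t)\bigr)
  = \sum_{j} \Bigl\{
      \bigl(u_{i[j]} - u_{[j]}(t)\bigr)^2 + \bigl(v_{i[j]} - v_{[j]}(t)\bigr)^2
    \Bigr\}
  \tag{4}
```

Here j runs only over the feature points matched between the registered position and orientation C_i and the captured image P(t).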

Because the image coordinates (u_i[j], v_i[j]) (j = 1, 2, …) of the feature points for the series of positions and orientations C_i are registered in the storage unit 3 in advance, the storage unit 3 stores these coordinates in association with the position and orientation C_i and other information.

(3) The embodiments of (1) and (2) above may give rise to the following problem. In both embodiments, the position and orientation C_{i_min} judged to be closest to the position and orientation C(t) obtained from the captured image P(t) at each time t is computed from among the series of discrete positions and orientations C_i registered in advance in the estimation unit 2, using the respectively defined distance d_[position-orientation distance] or d_[image distance].

Consequently, if the image P(t1) at some time t = t1 suffers a sudden disturbance, such as noise or occlusion of part of the imaging target M, the computed position and orientation C_{i_min} is also affected by that disturbance. The resulting sudden change yields presentation information in which the predetermined object or the like is superimposed in a way that abruptly departs from the actual position and orientation C(t1) at time t1. For example, the presentation information is suddenly displayed with a large offset at time t1. Such a display is undesirable because it gives an unnatural impression to a user who is watching the presentation information on the presentation unit 5 continuously in real time.

Therefore, so that the position and orientation C_{i_min} = C_{i_min(t)} determined at each time t (writing the index as i_min = i_min(t) to make the dependence on t explicit) varies smoothly along the time axis, the position and orientation C_{i_min(t)} at time t may be determined after adding to Equation [1] or Equation [3] a cost term that suppresses extreme change from the position and orientation C_{i_min(t-1)} already determined at the most recent past time t-1. Specifically, the position and orientation C_{i_min(t)} at time t may be determined by the following Equation [5] in place of Equation [1] or Equation [3].

Here, α is a constant satisfying 0 < α < 1 that sets how heavily the added cost term d_[distance](C_i - C_{i_min(t-1)}) is taken into account. The smaller α is, the more weight the cost term receives. For d_[distance], either the distance d_[position-orientation distance] defined by Equation [2] (corresponding to Equation [1]) or the distance d_[image distance] defined by Equation [4] (corresponding to Equation [3]) may be used.

When applying Equation [2] or Equation [4] to the term d_[distance](C_i - C_{i_min(t-1)}), the C(t) in the C_i - C(t) part of the respective equation is replaced with C_{i_min(t-1)}. That is, each component corresponding to C(t) in Equation [2] or Equation [4] is replaced with, respectively, the six position and orientation parameters or the image coordinates registered in advance in the estimation unit 2 for the position and orientation C_{i_min(t-1)} that was judged closest at the most recent time t-1.
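The equation image for [5] is not reproduced in this text. One form consistent with the description (a cost term added to Equation [1] or [3], weighted so that a smaller α emphasizes the cost term) is the convex combination below. The exact weighting used in the published equation is an assumption here.

```latex
i_{\min}(t) = \operatorname*{arg\,min}_{i}\;
  \Bigl\{
    \alpha \, d_{[\text{distance}]}\bigl(C_i - C(t)\bigr)
    + (1 - \alpha)\, d_{[\text{distance}]}\bigl(C_i - C_{i_{\min}(t-1)}\bigr)
  \Bigr\},
  \qquad 0 < \alpha < 1
  \tag{5}
```

Under this form, α close to 1 approaches the unsmoothed search of Equation [1] or [3], while α close to 0 keeps the selection near the previously selected position and orientation.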

Equation [5] above takes into account only the change from the position and orientation C_{i_min(t-1)} determined at the most recent past time t-1, but the position and orientation C_{i_min(t)} at time t may similarly be determined by also taking into account the change from the positions and orientations C_{i_min(t-2)}, C_{i_min(t-3)}, and so on at earlier past times t-2, t-3, and so on. In that case as well, a predetermined weight, like α in Equation [5], is given to each past time.
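The temporally smoothed selection can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the default α, and the convex-combination form of the cost (α times the data term plus 1 - α times the smoothness term) are assumptions, since Equation [5] itself is not reproduced in this text. Only the single previous time t-1 is considered.

```python
import math


def pose_distance(a, b, weights=None):
    """Weighted Euclidean distance between two 6-parameter poses (cf. Equation [2])."""
    if weights is None:
        weights = (1.0,) * len(a)
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def select_pose(registered, current, prev_index=None, alpha=0.7):
    """Return the index of the registered pose chosen for the current frame.

    registered : list of poses C_i = (X, Y, Z, r1, r2, r3)
    current    : estimated pose C(t)
    prev_index : index i_min(t-1) chosen at the previous time, or None
    alpha      : assumed weight, 0 < alpha < 1; smaller values favour staying
                 close to the previously chosen pose
    """
    def cost(i):
        data_term = pose_distance(registered[i], current)
        if prev_index is None:
            return data_term  # no history yet: plain nearest-pose search
        smooth_term = pose_distance(registered[i], registered[prev_index])
        return alpha * data_term + (1.0 - alpha) * smooth_term

    return min(range(len(registered)), key=cost)
```

For example, with registered poses at X = 0, 1, and 5 and a disturbed estimate at X = 4, the plain search jumps to the pose at X = 5, whereas `select_pose(..., prev_index=0, alpha=0.3)` stays at the previously selected pose.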

(4) As described above, the related information R_i defined for each imaging target M is prepared in, so to speak, "quantized" form, as a function of the discrete positions and orientations C_i. Therefore, so that the presentation information presented in real time at each time t varies smoothly with the position and orientation C(t) between the imaging unit 1 and the imaging target M, the related information R_i corresponding to the position and orientation judged closest to C(t) need not be used directly; instead, related information R_m obtained by applying morphing, a technique well known in the field of image processing, may be used. Specifically, the control unit 4 can compute the morphed related information R_m by, for example, the following (Step 1) to (Step 5).

(Step 1) First, a predetermined number of the positions and orientations C_i closest to the position and orientation C(t) at time t are retrieved from the series of discrete positions and orientations C_i registered in the estimation unit 2. The distance defined by Equation [2] or [4] above may be used for this search. As an example for the explanation below, assume that the top two are retrieved, yielding positions and orientations C_i and C_j, with corresponding related information R_i and R_j.

(Step 2) Next, the points corresponding to the feature points used for estimation by the estimation unit 2 (and registered in advance in the estimation unit 2) are found in each of the retrieved related information (R_i and R_j in the top-two case), and for each, triangular patches whose vertices are the feature points are generated by Delaunay triangulation.

(Step 3) Next, descriptors corresponding to the feature descriptors used for estimation by the estimation unit 2 (and registered in advance in the estimation unit 2) are computed around the feature points of each of the retrieved related information (R_i and R_j in the top-two case). By associating the descriptors that are judged to match, the correspondence between the triangular patches of the multiple pieces of related information (R_i and R_j in the top-two case) is computed.

(Step 4) Subsequently, the feature points corresponding to the position and orientation to be interpolated by morphing are obtained by internal division, for each triangular patch of the multiple pieces of related information (R_i and R_j in the top-two case). The ratio for the internal division may be set by an inverse function of the distances computed in (Step 1), or may be a predetermined ratio such as an equal ratio. The internal division for each triangular patch is realized as an affine transformation, and the affine transformation coefficients are computed.

(Step 5) Finally, the texture of each polygon is deformed by its affine transformation, and the images of all polygons are assembled into one, which gives the morphed related information R_m.
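The geometric core of (Step 4) can be sketched as follows, assuming that corresponding triangular patches in R_i and R_j have already been obtained by (Step 2) and (Step 3). The function names are hypothetical, the inverse-distance weighting is one of the options the description allows, and the texture warping of (Step 5) is left out.

```python
def blend_ratio(dist_i, dist_j):
    """Weight given to R_i, inversely related to its distance from C(t).

    Equivalent to normalising 1/dist_i and 1/dist_j; falls back to an
    equal ratio when both distances are zero.
    """
    total = dist_i + dist_j
    if total == 0:
        return 0.5
    return dist_j / total


def interpolate_triangle(tri_i, tri_j, w_i):
    """Internally divide corresponding vertices of two triangular patches."""
    return [
        (w_i * xi + (1.0 - w_i) * xj, w_i * yi + (1.0 - w_i) * yj)
        for (xi, yi), (xj, yj) in zip(tri_i, tri_j)
    ]


def affine_from_triangles(src, dst):
    """Affine coefficients (a, b, c, d, e, f) mapping triangle src onto dst,
    with x' = a*x + b*y + c and y' = d*x + e*y + f."""
    (x0, y0), (x1, y1), (x2, y2) = src
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate source triangle")
    coeffs = []
    for k in (0, 1):  # k = 0 solves for x', k = 1 for y'
        p0, p1, p2 = dst[0][k], dst[1][k], dst[2][k]
        a = ((p1 - p0) * (y2 - y0) - (p2 - p0) * (y1 - y0)) / det
        b = ((p2 - p0) * (x1 - x0) - (p1 - p0) * (x2 - x0)) / det
        c = p0 - a * x0 - b * y0
        coeffs.extend((a, b, c))
    return tuple(coeffs)
```

Each source patch would then be warped with its own coefficients and the warped patches from R_i and R_j blended with the same ratio to form R_m.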

(Step 2) and (Step 3) above may be carried out in advance, by associating the feature points and descriptors in each piece of related information R_i with the feature points and descriptors registered in advance in the estimation unit 2, with the control unit 4 holding the result. In this case, the feature points and descriptors are assigned in advance IDs (identifiers) that are common across all related information R_i and that also correspond to the registered information of the estimation unit 2, so (Step 2) and (Step 3) no longer need to be executed for the captured image P(t) at every time t.

(5) As described for the example of [3-2] in FIG. 3, in an embodiment where the related information R_i is configured to occupy the entire area of the captured image P(t) and the presentation information D(t) is generated by replacing or overwriting the captured image, the control unit 4 may perform the following additional processing.

First, the purpose of the additional processing is described. In this embodiment, as explained with reference to FIG. 4 and elsewhere, the related information R_i is prepared by advance photographing or the like covering the whole screen (the entire area of the captured image P(t)). Consequently, if some external object O_ext that was not present at the time of the advance photographing (for example, a person, or the user's own finger accidentally captured by the imaging unit 1) is present at the site actually imaged by the imaging unit 1 and appears in the captured image P(t), the following problem arises. Because the external object O_ext is not included in the related information R_i obtained by the advance photographing or the like, the generated presentation information D(t) is also an image in which the external object O_ext is absent, which gives the user an unnatural impression. (Depending on the application, presentation information D(t) without the external object O_ext may not be a problem.)

The additional processing therefore addresses this potential problem by making the external object O_ext, which was not present during the advance photographing or the like used to prepare the related information R_i, also appear in the image of the presentation information D(t). Specifically, the following (Step 10) and (Step 11) may be performed.

(Step 10) First, the region of the external object O_ext is detected as the region where a difference arises between the captured image P(t) and the information corresponding to the related information R_i without the predetermined object O superimposed as an image (referred to as background information R_i').

Here, the background information R_i' is stored in the storage unit 3 in advance in association with the related information R_i, and is read out by the control unit 4. In other words, the background information R_i' is an advance preparation of the captured image P(t) that would be obtained if no external object O_ext were present at all. Accordingly, when the related information R_i is prepared by live-action photographing, taking the light source distribution into account for higher quality as described above, the background information R_i' can be prepared by also photographing the same environment with the predetermined object O removed.

The region where the difference arises may be detected by any of various well-known methods. For example, the difference in pixel value at each pixel position (u, v) is computed between the captured image P(t) and the corresponding related information R_i, and the region of the external object O_ext is extracted by applying well-known morphological operations or the like to the set of pixels whose difference exceeds a predetermined threshold.

(Step 11) Next, the presentation information D(t) is generated by superimposing, on the related information R_i, the portion of the captured image P(t) that lies in the region of the external object O_ext detected in (Step 10).

In (Step 10), it may be determined that no region of an external object O_ext is detected. In that case, the related information R_i, configured as the whole screen as usual, is used as the presentation information D(t) as it is.

If the region of the external object O_ext detected in (Step 10) overlaps the region of the predetermined object O that is to be superimposed on the captured image P(t) by the replacement with the related information R_i, the overlapping portion may be excluded from the region of the external object O_ext to be superimposed, so that the whole of the predetermined object O is displayed as presentation information on the presentation unit 5.
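(Step 10) and (Step 11) can be sketched as below. This is a simplified illustration under stated assumptions: images are grayscale lists of rows, the threshold value and function names are hypothetical, and the morphological clean-up mentioned above is omitted. The optional `protected` mask stands for the region of the predetermined object O that should not be covered by the external object.

```python
def external_mask(captured, background, threshold=30):
    """Mark pixels where the captured image P(t) differs from the background
    information R_i' by more than an assumed threshold (Step 10, without
    morphological post-processing)."""
    return [
        [abs(p - b) > threshold for p, b in zip(p_row, b_row)]
        for p_row, b_row in zip(captured, background)
    ]


def compose(related, captured, mask, protected=None):
    """Overlay the external-object pixels of P(t) onto R_i (Step 11).

    Pixels inside `protected` (the region of the predetermined object O)
    keep the related-information value even if the mask is set.
    """
    output = []
    for y, row in enumerate(related):
        out_row = []
        for x, value in enumerate(row):
            is_protected = protected is not None and protected[y][x]
            if mask[y][x] and not is_protected:
                out_row.append(captured[y][x])
            else:
                out_row.append(value)
        output.append(out_row)
    return output
```

If the mask contains no set pixels, `compose` returns the related information unchanged, which corresponds to the undetected case described above.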

FIG. 5 is a diagram corresponding to the example of the captured image P100 and the related information R100 in FIG. 4, illustrating a case where an external object is also present. It shows an example of the presentation information D100 generated by superimposing the external object when part of a finger, as the external object, appears in the upper right of the captured image P100 of FIG. 4.

(6) In each of the embodiments described with Equations [1] to [5] and so on, the control unit 4 needs to search the exhaustive positions and orientations C_i stored in the storage unit 3. To reduce the load of this search, each position and orientation C_i may be stored in the storage unit 3 in advance in association with information on the set of other positions and orientations {C_j | d_[distance](C_i - C_j) < TH} whose distance from it is judged small against a predetermined threshold TH.

In this case, when searching for the closest position and orientation C_{i_min(t)} at each time t, the load can be reduced by searching only the subset {C_j | d_[distance](C_{i_min(t-1)} - C_j) < TH} instead of all positions and orientations C_i stored in the storage unit 3. This assumes that no change in position and orientation exceeding the range limited by the threshold TH occurs between adjacent times t-1 and t.
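The neighbour-limited search can be sketched as follows. The function names are hypothetical and the plain Euclidean distance is used for simplicity. Each pose is kept in its own neighbour list (its distance to itself is zero), which is an assumption made here so that the previously selected pose always remains a candidate.

```python
import math


def build_neighbors(poses, th):
    """Precompute, for every registered pose C_i, the indices j with
    d(C_i - C_j) < TH. Done once, before any imaging takes place."""
    return {
        i: [j for j, c_j in enumerate(poses) if math.dist(c_i, c_j) < th]
        for i, c_i in enumerate(poses)
    }


def nearest_index(poses, current, candidates=None):
    """Index of the pose closest to `current`, searched over `candidates`
    only (all poses when no candidate list is given)."""
    if candidates is None:
        candidates = range(len(poses))
    return min(candidates, key=lambda i: math.dist(poses[i], current))
```

At time t, the caller would pass `neighbors[i_min_prev]` as `candidates`, falling back to the full search for the first frame or after tracking is lost.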

(7) The related information may be prepared in advance not only by the methods described above but by any method suited to the application. For example, the description above assumed that advance rendering by computer graphics (CG) is used when the related information occupies part of the captured image, and that advance photographing (live action) in the same environment is used when it occupies the whole. However, the related information may be prepared with the opposite combination, or using only CG or only live action. (When part of the captured image is replaced with live-action material, the related information can be extracted from the photographed image by a well-known region extraction method or the like.) The related information may also be prepared by combining live action and CG. There may be multiple objects to be superimposed as related information for each imaging target. Advance rendering or advance photographing covering the whole screen may also be carried out in an environment that differs greatly from the one the imaging unit 1 actually images, so that a so-called "parallel world" is displayed on the presentation unit 5.

(8) The present invention can also be provided as a program that causes a computer to function as the information terminal device 10. The computer may have a well-known hardware configuration including a CPU (central processing unit), memory, and various interfaces, and the CPU executes instructions corresponding to the functions of the respective units of the information terminal device 10.

Description of symbols: 10 … information terminal device, 1 … imaging unit, 2 … estimation unit, 3 … storage unit, 4 … control unit, 5 … presentation unit

Claims

An information terminal device comprising:
an imaging unit that performs imaging to acquire a captured image;
an estimation unit that identifies a predetermined imaging target from the captured image and estimates a relative position and orientation between the imaging target and the imaging unit;
a storage unit that stores related information drawn in advance for each imaging target in accordance with each position and orientation, and is thereby configured so that, when an imaging target and its position and orientation are designated, the related information corresponding to the designated imaging target and position and orientation can be retrieved and read out;
a control unit that designates the identified imaging target and the estimated position and orientation to the storage unit, thereby retrieves and reads out from the storage unit the related information corresponding to the identified imaging target and the estimated position and orientation, and generates presentation information by superimposing the related information on the captured image; and
a presentation unit that presents the generated presentation information,
wherein the storage unit stores the related information as information that includes an image configured as a live-action image over the entire range of the captured image and that constitutes a predetermined augmented reality display corresponding to the position and orientation for each imaging target,
the storage unit stores the live-action image as an image to which a predetermined superimposed display has been added, and further stores, in association with that live-action image, a background image in which the predetermined superimposed display is omitted, and
when the control unit reads out from the storage unit related information as an image configured as a live-action image over the entire range of the captured image, the control unit determines, as an external region, a region in which a difference arises between the acquired captured image and the background image associated with the read related information, and generates the presentation information by superimposing the external region on the read related information.

An information terminal device comprising:
an imaging unit that performs imaging to acquire a captured image;
an estimation unit that identifies a predetermined imaging target from the captured image and estimates a relative position and orientation between the imaging target and the imaging unit;
a storage unit that stores related information drawn in advance for each imaging target in accordance with each position and orientation, and is thereby configured so that, when an imaging target and its position and orientation are designated, the related information corresponding to the designated imaging target and position and orientation can be retrieved and read out;
a control unit that designates the identified imaging target and the estimated position and orientation to the storage unit, thereby retrieves and reads out from the storage unit the related information corresponding to the identified imaging target and the estimated position and orientation, and generates presentation information by superimposing the related information on the captured image; and
a presentation unit that presents the generated presentation information,
wherein the imaging unit performs imaging continuously at a predetermined sampling rate on the time axis to acquire captured images, and
at each time, when designating to the storage unit the identified imaging target and the estimated position and orientation at the current time in order to retrieve and read out the corresponding related information, the control unit limits the search range to related information corresponding to positions and orientations in the vicinity of the position and orientation of the related information read out at a past time, retrieves and reads out the related information from the storage unit, and generates the presentation information.

An imaging unit that performs imaging to acquire a captured image;
An estimation unit that identifies a predetermined imaging target from the captured image and estimates a relative position and orientation between the imaging target and the imaging unit;
A storage unit that stores related information drawn in advance for each imaging target in accordance with each position and orientation, and is thereby configured so that, when an imaging target and its position and orientation are designated, the related information corresponding to the designated imaging target and its position and orientation can be retrieved and read;
A control unit that, by designating the specified imaging target and the estimated position and orientation to the storage unit, retrieves and reads from the storage unit the related information corresponding to the specified imaging target and the estimated position and orientation, and generates presentation information by superimposing the related information on the captured image;
A presentation unit for presenting the generated presentation information,
The estimation unit specifies the predetermined imaging target from the captured image by matching an image feature amount extracted in advance for the predetermined imaging target with an image feature amount extracted from the captured image, and estimates the relative position and orientation between the imaging target and the imaging unit;
The storage unit stores the related information drawn in advance according to each position and orientation for each imaging target in association with an image feature amount extracted in advance from the image of the imaging target in that position and orientation,
When reading from the storage unit the related information according to the estimated position and orientation, the control unit reads the related information for which the image coordinates of the image feature amounts stored in association in the storage unit are determined to coincide with the image coordinates of the image feature amounts extracted at the estimated position and orientation, so that the search and read are performed using the image coordinates instead of the position and orientation. An information terminal device characterized by the above.
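
The coordinate-based lookup in this claim can be sketched as follows. This is a hedged illustration only: it assumes each stored entry keeps a mapping from a feature identifier to the image coordinates at which that feature appears in the stored pose, and it treats "coincide" as a mean coordinate error below a tolerance. The names `lookup_by_coords`, `coords`, `info` and `tol` are hypothetical and do not come from the patent.

```python
import math


def mean_coord_error(stored, observed):
    """Mean pixel distance over the features present in both mappings."""
    shared = stored.keys() & observed.keys()
    if not shared:
        return math.inf
    return sum(math.dist(stored[k], observed[k]) for k in shared) / len(shared)


def lookup_by_coords(entries, observed, tol=5.0):
    """Return the related info whose stored feature coordinates coincide
    (within `tol` pixels on average) with the observed coordinates,
    or None when no entry is close enough.

    entries: list of {"coords": {feature_id: (x, y)}, "info": ...}
    observed: {feature_id: (x, y)} extracted from the captured image
    """
    best = min(entries,
               key=lambda e: mean_coord_error(e["coords"], observed),
               default=None)
    if best is None or mean_coord_error(best["coords"], observed) > tol:
        return None
    return best["info"]
```

The point of the design is that the feature coordinates are already available from the matching step, so no separate pose comparison is needed to select the stored entry.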

The information terminal device according to any one of claims 1 to 3, wherein the control unit generates the presentation information by superimposing the read related information on the acquired captured image when the estimation unit can specify a predetermined imaging target from the captured image, and uses the acquired captured image as the presentation information when the estimation unit cannot specify a predetermined imaging target from the captured image.

The information terminal device according to any one of claims 1 to 4, wherein the storage unit stores the related information as information that constitutes a predetermined augmented reality display in accordance with the position and orientation for each imaging target.

The information terminal device according to any one of claims 1 to 5, wherein the storage unit stores the related information as information that includes an image configured as a live-action image over the entire range of the captured image and that constitutes a predetermined augmented reality display corresponding to the position and orientation for each imaging target.

The information terminal device according to any one of claims 1 to 6, wherein the storage unit stores the related information as information that includes an image generated by computer graphics in a partial region of the captured image and that constitutes a predetermined augmented reality display according to the position and orientation for each imaging target.

The information terminal device according to any one of claims 1 to 7, wherein the storage unit stores related information drawn in advance according to each of a set of discrete positions and orientations for each imaging target, and
the control unit reads from the storage unit, as the related information according to the specified imaging target, a predetermined number of top-ranked items determined to be close to the estimated position and orientation, and generates the presentation information using related information generated by applying morphing to that predetermined number of items.
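
The top-ranked selection and interpolation in this claim can be sketched as follows. Note the substitution: true morphing also warps geometry between images, whereas this sketch uses an inverse-distance weighted cross-dissolve as a simplified stand-in. Images are assumed to be equally sized grayscale grids (lists of rows), and the names `top_k_nearest`, `blend_related` and `k` are hypothetical.

```python
import math


def top_k_nearest(entries, pose, k=2):
    """Return the k (pose, image) entries closest to the estimated pose."""
    return sorted(entries, key=lambda e: math.dist(e[0], pose))[:k]


def blend_related(entries, pose, k=2, eps=1e-9):
    """Blend the k nearest stored images with inverse-distance weights.

    This is a cross-dissolve used as a stand-in for morphing; it does not
    warp geometry. `eps` avoids division by zero on an exact pose match.
    """
    nearest = top_k_nearest(entries, pose, k)
    weights = [1.0 / (math.dist(p, pose) + eps) for p, _ in nearest]
    total = sum(weights)
    height = len(nearest[0][1])
    width = len(nearest[0][1][0])
    return [
        [
            sum(w * img[y][x] for w, (_, img) in zip(weights, nearest)) / total
            for x in range(width)
        ]
        for y in range(height)
    ]
```

Interpolating between discrete stored poses in this way lets the storage unit hold fewer pre-drawn images while still producing a display that changes smoothly with the estimated pose.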

An imaging stage in which the imaging unit captures an image and obtains a captured image;
An estimation stage for specifying a predetermined imaging target from the captured image and estimating a relative position and orientation between the imaging target and the imaging unit;
A storage stage that stores related information drawn in advance for each imaging target in accordance with each position and orientation, and is thereby configured so that, when an imaging target and its position and orientation are designated, the related information corresponding to the designated imaging target and its position and orientation can be retrieved and read;
A control stage that, by designating the specified imaging target and the estimated position and orientation, retrieves and reads the related information corresponding to the specified imaging target and the estimated position and orientation from the related information stored in the storage stage, and generates presentation information by superimposing the related information on the captured image;
A presentation stage of presenting the generated presentation information,
In the storage stage, the related information is stored as information that includes an image configured as a live-action image over the entire range of the captured image and that constitutes a predetermined augmented reality display corresponding to the position and orientation for each imaging target,
In the storage stage, the live-action image is stored as an image obtained by adding a predetermined superimposed display to the captured image, and a background image, corresponding to the captured image with the predetermined superimposed display omitted, is stored in association with the live-action image,
In the control stage, when related information that is an image configured as a live-action image over the entire range of the captured image is read from the related information stored in the storage stage, the difference between the acquired captured image and the background image associated with the read related information is obtained, the region forming that difference is taken as an external region, and the presentation information is generated by superimposing the external region on the read related information. A computer-implemented method characterized by the above.
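
The difference-region compositing in this claim can be sketched as follows. This is a minimal illustration under stated assumptions: the captured image, the stored background image and the stored live-action related image are assumed to be aligned grayscale grids of equal size, and a fixed per-pixel threshold decides what counts as a difference. The name `composite_external_region` and the `threshold` parameter are hypothetical.

```python
def composite_external_region(captured, background, related, threshold=30):
    """Overlay onto `related` the pixels where `captured` differs from
    `background` by more than `threshold`.

    Pixels that differ are treated as the external region (for example a
    hand entering the frame) and are copied from the captured image; all
    other pixels come from the stored related image.
    """
    result = []
    for cap_row, bg_row, rel_row in zip(captured, background, related):
        result.append([
            cap if abs(cap - bg) > threshold else rel
            for cap, bg, rel in zip(cap_row, bg_row, rel_row)
        ])
    return result
```

Because the stored live-action image replaces the whole frame, anything in the current scene that was absent when the image was stored would otherwise disappear; copying the difference region back keeps such objects visible.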

An imaging stage in which the imaging unit captures an image and obtains a captured image;
An estimation stage for specifying a predetermined imaging target from the captured image and estimating a relative position and orientation between the imaging target and the imaging unit;
A storage stage that stores related information drawn in advance for each imaging target in accordance with each position and orientation, and is thereby configured so that, when an imaging target and its position and orientation are designated, the related information corresponding to the designated imaging target and its position and orientation can be retrieved and read;
A control stage that, by designating the specified imaging target and the estimated position and orientation, retrieves and reads the related information corresponding to the specified imaging target and the estimated position and orientation from the related information stored in the storage stage, and generates presentation information by superimposing the related information on the captured image;
A presentation stage of presenting the generated presentation information,
In the imaging stage, the imaging unit continuously captures images at a predetermined sampling rate on the time axis to obtain captured images,
In the control stage, at each time, the specified imaging target and the position and orientation estimated at the current time are designated against the related information stored in the storage stage, and the corresponding related information is retrieved and read. In doing so, the search range is limited to related information corresponding to positions and orientations in the vicinity of the position and orientation of the related information read at the most recent past time, and the presentation information is generated by searching and reading from the related information stored in the storage stage within that range. A computer-implemented method characterized by the above.

An imaging stage in which the imaging unit captures an image and obtains a captured image;
An estimation stage for specifying a predetermined imaging target from the captured image and estimating a relative position and orientation between the imaging target and the imaging unit;
A storage stage that stores related information drawn in advance for each imaging target in accordance with each position and orientation, and is thereby configured so that, when an imaging target and its position and orientation are designated, the related information corresponding to the designated imaging target and its position and orientation can be retrieved and read;
A control stage that, by designating the specified imaging target and the estimated position and orientation, retrieves and reads the related information corresponding to the specified imaging target and the estimated position and orientation from the related information stored in the storage stage, and generates presentation information by superimposing the related information on the captured image;
A presentation stage of presenting the generated presentation information,
In the estimation stage, the predetermined imaging target is specified from the captured image by matching an image feature amount extracted in advance for the predetermined imaging target with an image feature amount extracted from the captured image, and the relative position and orientation between the imaging target and the imaging unit are estimated;
In the storage stage, the related information drawn in advance according to each position and orientation for each imaging target is stored in association with an image feature amount extracted in advance from the image of the imaging target in that position and orientation,
In the control stage, when the related information according to the estimated position and orientation is read from the related information stored in the storage stage, the related information is read for which the image coordinates of the image feature amounts stored in association in the storage stage are determined to coincide with the image coordinates of the image feature amounts extracted at the estimated position and orientation, so that the search and read are performed using the image coordinates instead of the position and orientation. A computer-implemented method characterized by the above.

A program for causing a computer to function as the information terminal device according to any one of claims 1 to 8.