JP5843288B2

JP5843288B2 - Information presentation system

Info

Publication number: JP5843288B2
Application number: JP2012095868A
Authority: JP
Inventors: 加藤 晴久; 米山 暁夫
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2012-04-19
Filing date: 2012-04-19
Publication date: 2016-01-13
Anticipated expiration: 2032-04-19
Also published as: JP2013222447A

Description

The present invention relates to an information presentation system, and more particularly to one in which the information shown on a display unit can be controlled by changing the relative positional relationship between an imaging unit and an imaging target.

A device that presents information according to its positional relationship relative to an imaging target lets the user change the presented information intuitively, which improves convenience.

Patent Documents 1 and 2 disclose conventional techniques of this kind, using the following methods.

Patent Document 1 proposes a method that acquires operation data from an input device equipped with a gyro sensor, an acceleration sensor, and an imaging means. The orientation of the input device is first calculated from the angular velocity detected by the gyro sensor, and is then corrected using the acceleration data detected by the acceleration sensor and an image of a predetermined imaging target captured by the imaging means.

Patent Document 2 proposes a method that acquires an image from an input device equipped with an imaging means, calculates feature information that is invariant to transformations such as scaling and rotation from the neighborhoods of corners such as intersections of straight lines, and then estimates the orientation of the input device from combinations of that feature information.

Patent Document 1: JP 2010-5332 A
Patent Document 2: JP 2010-517129 A

The orientation calculation device of Patent Document 1 first integrates the angular velocities output successively by the gyro sensor and calculates the current orientation from the resulting change relative to the initial state. To counter the accumulation of gyro sensor error, the orientation is corrected on the assumption that the direction indicated by the acceleration sensor is the direction of gravitational acceleration.

However, depending on the motion, accelerations in directions other than gravity arise while the device is moving, and Patent Document 1 also does not account for error in the orientation calculated from the acceleration sensor, so it may fail to calculate the orientation accurately. In addition, because some acceleration sensors cannot measure gravitational acceleration, the usable sensors are limited. In the embodiment that uses the imaging means for correction, a separate infrared-emitting device must be installed in advance, which restricts where the device of Patent Document 1 can be used.

The device of Patent Document 2 calculates the relative orientation between a planar image and the imaging unit by evaluating combinations of feature quantities extracted from the planar image. Because commonplace planar images can serve as the images to be acquired, nothing has to be installed in advance, which partly solves the problem of Patent Document 1.

However, supporting several different planar images increases the number of evaluations of feature correspondences in proportion to the number of recognition targets, so processing takes time. On a low-powered terminal in particular, it is difficult to maintain a practical speed even when the number of recognition targets is small.

An object of the present invention is to solve the above problems and to provide an information presentation system that recognizes at high speed which recognition target is being imaged even when there are many recognition targets, accurately calculates the position and orientation of the terminal using the imaging unit, and controls the information displayed on the display unit.

To achieve the above object, the present invention is an information presentation system comprising a terminal that includes an imaging unit for imaging an imaging target, and a server that communicates with the terminal, the system presenting additional information related to the imaging target on the terminal. The terminal includes: a positioning unit that acquires positioning information including the position of the terminal; a calculation unit that calculates, from the captured image, predetermined feature quantities invariant to rotation and scaling together with their feature point coordinates; a terminal-side transmission unit that transmits the acquired positioning information and the calculated feature quantities and feature point coordinates to the server; a terminal-side reception unit that receives from the server the position and orientation of the terminal relative to the imaging target, as recognized by the server, and the additional information; an update unit that updates the recognized position and orientation; a control unit that controls the additional information received from the server according to the updated position and orientation; and an output unit that displays the controlled additional information superimposed on the captured image in a predetermined arrangement relative to the imaging target. The server includes: a server-side reception unit that receives the positioning information, feature quantities, and feature point coordinates transmitted from the terminal-side transmission unit; a storage unit that holds, for each of a plurality of predetermined imaging targets and in association with one another, the feature quantities and feature point coordinates obtained when the target is imaged in a predetermined arrangement, positioning information, and additional information; an extraction unit that searches the storage unit and extracts, as candidates for the imaging target being imaged by the imaging unit, those imaging targets whose positioning information lies within a predetermined distance of the acquired positioning information, together with their feature quantities and feature point coordinates; a recognition unit that compares the feature quantities and feature point coordinates of each extracted candidate with the received feature quantities and feature point coordinates, and thereby recognizes which candidate corresponds to the imaging target being imaged by the imaging unit and the position and orientation of that imaging target relative to the terminal; and a server-side transmission unit that transmits to the terminal the recognized position and orientation and the additional information held in the storage unit for the imaging target recognized as corresponding.

According to the present invention, a user holding the terminal need only change the relative position and orientation between the imaging unit and the imaging target: that position and orientation are recognized and updated, and additional information controlled on the basis of them is displayed alongside the imaging target in the captured image. The user can therefore control the information displayed on the output unit through intuitive operation.

In addition, when the display information is controlled, it is the server rather than the terminal that recognizes the imaging target, based on the feature quantities and feature point coordinates obtained by analyzing the input image on the terminal. The number of imaging targets can therefore be increased without being constrained by terminal performance, while recognition accuracy is kept high.

Furthermore, because the extraction unit first performs a search using the positioning information, the candidate imaging targets are narrowed down, which improves both the recognition speed and the recognition accuracy of the recognition unit. Also, because the orientation information is updated on the terminal, processing speed is increased without delay, so the display information can be controlled reliably.
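The narrowing step performed by the extraction unit can be sketched as a simple distance filter. The following Python is only an illustration: the `Target` record, the function names, and the use of the haversine great-circle distance on latitude and longitude are assumptions made for the sketch, since the text only requires that the distance between positioning information fall within a predetermined range.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass
class Target:
    """Hypothetical stored imaging target with its registered location."""
    name: str
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def extract_candidates(targets, lat, lon, radius_m):
    """Keep only the targets registered within radius_m of the terminal position."""
    return [t for t in targets if haversine_m(lat, lon, t.lat, t.lon) <= radius_m]
```

Only the targets returned by `extract_candidates` would then be passed to feature comparison, so the cost of recognition depends on the number of nearby targets rather than on the total number registered.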

FIG. 1 is a diagram outlining the configuration of the information presentation system.
FIG. 2 is a functional block diagram of the terminal and the server of the information presentation system.
FIG. 3 is a diagram showing an example in which display information is controlled by the estimated orientation of the imaging target.
FIG. 4 is a diagram showing an example in which display information is controlled by the estimated position of the imaging target.
FIG. 5 shows, in table form, an example of the information held by the storage unit.
FIG. 6 is a diagram conceptually illustrating the feature information.
FIG. 7 is a diagram conceptually illustrating the use of positioning information when searching for candidate imaging targets.
FIG. 8 is a diagram illustrating the display information at the output unit.
FIG. 9 is a diagram illustrating various modes of timing for the exchange of information between the terminal and the server in connection with the update processing.
FIG. 10 is a functional block diagram of the update unit.

FIG. 1 outlines the configuration of an information presentation system according to an embodiment of the present invention. The information presentation system 3 includes a plurality of terminals 1-1, ..., 1-N and a server 2. Hereinafter, any one of the terminals 1-1, ..., 1-N is referred to as terminal 1. The terminal 1 and the server 2 communicate with each other via a network N such as the Internet.

FIG. 2 is a functional block diagram of the terminal 1 and the server 2 of the information presentation system 3 according to an embodiment. The terminal 1 includes an imaging unit 11, a calculation unit 12, a positioning unit 13, a control unit 14, an update unit 15, an output unit 16, a (terminal-side) transmission unit 17, and a (terminal-side) reception unit 18. The server 2 includes a recognition unit 21, a storage unit 22, an extraction unit 23, a (server-side) transmission unit 24, and a (server-side) reception unit 25.

The terminal 1 can be implemented as, for example, a mobile terminal, but may be another kind of computer. The server 2 likewise need not be a single physical server: the functions of the recognition unit 21 through the reception unit 25 may be distributed, individually or in groups, across multiple servers.

The lines connecting the units in FIG. 2 indicate exchanges between them, but only the main exchanges are drawn; as described where relevant, exchanges may also exist where no line is drawn.

FIGS. 3 and 4 show examples of display information controlled at the output unit 16 according to the present invention: FIG. 3 shows control by orientation and FIG. 4 control by position. Control is actually performed by orientation and position simultaneously, but the two are described separately here for clarity. Strictly speaking, the display information is obtained when the control unit 14 controls the additional information according to the position and orientation, in step with the input image from the imaging unit 11, and the output unit 16 superimposes the result on that input image. This is described later with reference to FIG. 8 and other figures; here it is treated simply as control of the display information.

In the present invention, the terminal 1 images a roughly planar, stationary imaging target such as a signboard, and the captured image is analyzed in cooperation with the server 2 to recognize what the imaging target is. The system also estimates the orientation and position of the imaging target relative to the terminal 1, which change as the user moves the terminal 1 by hand or otherwise, and controls the display information shown on the terminal 1 according to the estimated orientation and position.

Specifically, the position and orientation are estimated by relating two images through a planar projective transformation: the image that would be captured with the recognized imaging target in a predetermined arrangement relative to the terminal 1, and the image actually captured. The translation component of the transformation gives the position relative to the position in the predetermined arrangement, and the rotation component gives the orientation relative to the orientation in the predetermined arrangement. The display information can be controlled either by applying the planar projective transformation directly or by first applying a predetermined relationship, such as multiplication by a constant, to each of the translation and rotation components.
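As an illustration of how rotation and translation components can be separated from a planar projective transformation, the sketch below uses the textbook pinhole-camera relation H = K [r1 r2 t] for a plane at Z = 0. This is an assumption made for the example, not a procedure stated in the text: the focal length `f`, the principal point `(cx, cy)`, and both function names are hypothetical, and the reference coordinates are taken to be metric coordinates on the target plane.

```python
import math


def compose_homography(R, t, f, cx, cy):
    """Build H = K [r1 r2 t] for a target plane at Z = 0 (pinhole model)."""
    cols = [[R[0][0], R[1][0], R[2][0]],   # r1: first column of R
            [R[0][1], R[1][1], R[2][1]],   # r2: second column of R
            list(t)]
    # Multiply each column by K = [[f, 0, cx], [0, f, cy], [0, 0, 1]].
    kcols = [[f * v[0] + cx * v[2], f * v[1] + cy * v[2], v[2]] for v in cols]
    return [[kcols[c][r] for c in range(3)] for r in range(3)]


def decompose_homography(H, f, cx, cy):
    """Recover rotation R and translation t from H, which is known only up to scale."""
    cols = [[H[r][c] for r in range(3)] for c in range(3)]
    # Multiply each column by K^-1.
    b = [[(v[0] - cx * v[2]) / f, (v[1] - cy * v[2]) / f, v[2]] for v in cols]
    # r1 must have unit length, which fixes the unknown scale of H.
    scale = 1.0 / math.sqrt(sum(x * x for x in b[0]))
    if b[2][2] < 0:          # keep the target in front of the camera (t_z > 0)
        scale = -scale
    r1 = [scale * x for x in b[0]]
    r2 = [scale * x for x in b[1]]
    t = [scale * x for x in b[2]]
    # The third rotation axis is the cross product of the first two.
    r3 = [r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0]]
    R = [[r1[i], r2[i], r3[i]] for i in range(3)]
    return R, t
```

With noisy measurements the recovered `r1` and `r2` are generally not exactly orthonormal, so a practical implementation would re-orthogonalize `R`; that step is omitted here.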

In FIG. 3, for example, a signboard serving as the imaging target is imaged by the terminal 1, as shown in (A). When the terminal 1 images the signboard from the front in the predetermined arrangement A10, the display information, here a planar pattern, is assumed to be in the state shown as P10 in (B).

P11, P12, P13, and P14 in (B) show how the display information is controlled when the terminal 1 is rotated from arrangement A10 in the directions A11, A12, A13, and A14 indicated on the respective axes. The display information is controlled in step with how the signboard appears in the captured image.

With rotation A11, the right side of the signboard appears closer, so the display information likewise rotates from state P10 into the orientation shown as P11. Rotation A12 is the reverse of A11, and the display information rotates from state P10 as shown in P12.

With rotation A13, the signboard appears as if viewed from above, so the display information likewise rotates from state P10 into the orientation shown as P13. Rotation A14 is the reverse of A13, and the display information rotates from state P10 as shown in P14.

In FIG. 4, as in FIG. 3, (A) shows a signboard serving as the imaging target being imaged by the terminal 1 from the front in the predetermined arrangement A20, and the planar pattern P20 in (B) is the corresponding display information. The x and y axes indicate a virtual plane located at the terminal 1 and parallel to the signboard in the predetermined arrangement A20.

When the terminal 1 moves within this xy plane in the direction A21, the imaged signboard appears to move the opposite way, toward the lower left, so the display information likewise moves from state P20 to the lower left, as shown in P21. When the terminal 1 moves in the depth direction A22, perpendicular to the xy plane and away from the signboard, the imaged signboard appears smaller, so the display information is likewise reduced from state P20, as shown in P22.

Returning to FIG. 2, the units of the terminal 1 and the server 2 that implement this control are described in turn, starting with the units of the terminal 1.

The imaging unit 11 images the imaging target continuously at a predetermined sampling period and outputs the captured images to the calculation unit 12 and the control unit 14. A digital camera provided as standard on a mobile terminal can be used as the imaging unit 11.

The calculation unit 12 calculates feature information from the image input from the imaging unit 11 and outputs it to the transmission unit 17 and the update unit 15. Local features that are invariant to rotation and scaling and are calculated from relative luminance gradients in local regions of the image, such as the well-known SIFT or SURF features, can be used as the feature information. The well-known FERN features, which have the same properties, may also be used.

The feature information is thus obtained in, for example, the following form:
(d_i, x_i) [i = 1, 2, ..., n]
Here, d_i is a feature quantity, expressed for example as a vector; x_i is the position on the image (feature point coordinates) at which d_i was calculated; and n is the number of pairs of feature point coordinates x_i and feature quantities d_i calculated from the image.
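The (d_i, x_i) pairs map naturally onto a small record type. The sketch below shows one possible representation together with a brute-force nearest-descriptor match; the `Feature` class, the function names, and the fixed squared-distance threshold are assumptions for illustration, and the matching rule is a generic one rather than the comparison method of the recognition unit.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Feature:
    descriptor: Tuple[float, ...]  # d_i: the feature quantity as a vector
    point: Tuple[float, float]     # x_i: feature point coordinates on the image


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def match_features(query: List[Feature], reference: List[Feature], max_sq_dist: float):
    """Pair each query point with the point of its nearest reference descriptor.

    Returns (reference_point, query_point) tuples for matches whose squared
    descriptor distance does not exceed max_sq_dist.
    """
    pairs = []
    for q in query:
        best = min(reference, key=lambda r: sq_dist(q.descriptor, r.descriptor), default=None)
        if best is not None and sq_dist(q.descriptor, best.descriptor) <= max_sq_dist:
            pairs.append((best.point, q.point))
    return pairs
```

The resulting coordinate pairs are the kind of input from which a planar projective transformation between a reference image and the captured image could be estimated.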

The calculation unit 12 calculates feature information in order to reduce the amount of data sent from the terminal 1 to the server 2 and thereby shorten the response time. Accordingly, when the data size of the feature information exceeds that of the input image, the input image may be sent from the transmission unit 17 to the server 2 in place of the feature information, and the server 2 may calculate the feature information in the same way as the calculation unit 12.
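The size comparison described above amounts to a one-line decision. The sketch below is illustrative only: the function names, the tagged-tuple return value, and the size estimate (a 128-dimensional descriptor, as in standard SIFT, plus two coordinates, each stored in 4 bytes) are assumptions not taken from the text.

```python
def estimate_feature_bytes(n, dim=128, bytes_per_value=4):
    """Rough payload size for n features: dim descriptor values plus 2 coordinates each."""
    return n * (dim + 2) * bytes_per_value


def choose_payload(feature_bytes: bytes, image_bytes: bytes):
    """Send the image only when the encoded features are larger than the image."""
    if len(feature_bytes) > len(image_bytes):
        return ("image", image_bytes)
    return ("features", feature_bytes)
```

Under these assumed sizes, 1,000 features occupy about 520 kB, which can exceed the size of a compressed camera frame, so the fallback to sending the image is not merely theoretical.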

In that case, the server 2 is assumed to include a functional block (not shown) equivalent to the calculation unit 12. Even then, the calculation unit 12 on the terminal 1 still calculates feature information when the terminal 1 estimates the position and orientation on its own, as described later.

SIFT features are disclosed in D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision (IJCV), 60(2), pp. 91-110 (2004); SURF features in H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded Up Robust Features," Proc. of ECCV (2006); and FERN features in M. Ozuysal, M. Calonder, V. Lepetit, and P. Fua, "Fast Keypoint Recognition using Random Ferns," IEEE PAMI, 2009.

The positioning unit 13 acquires the coordinates, azimuth, or elevation angle of the terminal 1, or any combination of some or all of these, as positioning information, and outputs it to the transmission unit 17. The coordinates may include latitude, longitude, and altitude, as well as the floor number within a building. The azimuth and elevation angle are acquired so as to specify the direction in which the imaging unit 11 of the terminal 1 faces when imaging.

The positioning unit 13 can use components provided as standard on mobile terminals: GPS (Global Positioning System) for acquiring coordinates, indoor positioning methods such as IMES (Indoor Messaging System) and NFC (Near Field Communication), and an electronic compass or similar for acquiring azimuth and elevation angle. The building and floor number can be obtained from information supplied by special-purpose access points (APs) such as IMES or NFC.

The transmission unit 17 transmits the feature information obtained by the calculation unit 12 and the positioning information obtained by the positioning unit 13 to the server 2. Where necessary, it can also transmit terminal information identifying the terminal 1, which may include the user's usage history of the terminal 1.

In response to the feature information, positioning information, and terminal information sent by the transmission unit 17, the reception unit 18 receives from the server 2 the recognition information for the imaging target (the result of recognizing what the imaging target is), the position and orientation of the recognized imaging target, and the additional information.

The update unit 15 updates the position and orientation of the terminal 1 relative to the imaging target to their current values and outputs the updated information to the control unit 14, so that the display information is controlled as described in the examples of FIGS. 3 and 4.

When new data has been exchanged with the server 2, the update unit 15 also obtains from the recognition information the result of recognizing what the imaging target is. If the result indicates an imaging target different from the one recognized up to that point, subsequent updates of the position and orientation switch to a method suited to the newly recognized imaging target. The update is described in detail later.

The control unit 14 controls the additional information received from the server 2 according to the position and orientation obtained by the update unit 15 and outputs the controlled additional information to the output unit 16. A planar projective transformation can be used for this control, which governs how the additional information is processed for display on the output unit 16. The control is performed so that the output unit 16 superimposes the additional information on the image obtained by the imaging unit 11 at the position of the imaging target, or at a position shifted from it by a predetermined amount, and in the same orientation as the imaging target, or in an orientation with a predetermined change applied. In other words, the additional information is displayed superimposed on the acquired image in a predetermined arrangement relative to the imaging target. As a result of this superimposition, the display information on the output unit 16 appears as if the additional information had been affixed to the planar imaging target from the start.
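Applying a planar projective transformation to the additional information can be sketched by warping the four corners of the overlay rectangle. The function names, the corner ordering, and the `offset` parameter (standing in for the "position shifted by a predetermined amount") are assumptions chosen for this example; `H` is a 3x3 matrix given as nested lists.

```python
def apply_homography(H, point):
    """Map a 2-D point through the 3x3 planar projective transformation H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    return (u / w, v / w)


def warp_overlay_corners(H, width, height, offset=(0.0, 0.0)):
    """Return the image positions of the overlay's corners.

    Corners are listed clockwise from the top-left of a width x height
    rectangle placed at `offset` in the reference arrangement.
    """
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    return [apply_homography(H, (x + offset[0], y + offset[1])) for x, y in corners]
```

For example, a transformation that halves both coordinates shrinks the overlay in the same way the signboard shrinks when the terminal moves away from it, as in direction A22 of FIG. 4.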

The output unit 16 outputs from the terminal 1 the display information in which the controlled additional information is superimposed on the captured image in a predetermined arrangement relative to the imaging target. A display provided as standard on a mobile terminal can be used as the output unit 16.

The units of the server 2, in turn, are as follows.

The reception unit 25 receives the information sent by the transmission unit 17, namely the feature information obtained by the calculation unit 12 of the terminal 1, the positioning information obtained by the positioning unit 13 of the terminal 1, and, where sent, the terminal information including the usage history. The received information is output to the extraction unit 23.

As information used to control the display information on the terminal 1 according to position and orientation, the storage unit 22 holds feature information for recognizing each of the predetermined imaging targets and positioning information representing, for example, the coordinates at which each imaging target is located. The positioning information held by the storage unit 22 has the same format as that acquired by the positioning unit 13.

The storage unit 22 also holds, as information used to tailor the display information on the terminal 1 to the individual user, terminal information of the same kind as that received by the reception unit 18, including usage history, and additional information related to the imaging target and the terminal information.

FIG. 5 shows, in table form, an example of the information held by the storage unit 22. For each of a plurality (M) of predetermined imaging targets, imaging target_1 through imaging target_M, the storage unit 22 holds feature information, positioning information, additional information, and terminal information. The imaging targets are, for example, signboards of various shops and are roughly planar.

例えば撮像対象_1については、その特徴情報として算出部12にて算出するのと同種の特徴量及び特徴点座標のペア(特徴量_1_i, 特徴点座標_1_i)がインデクスi=1,2, ..., n1によりn1個、あらかじめ記憶されている。また、当該記憶されている撮像対象_1の特徴量及び特徴点座標の複数ペアが、撮像対象_1をどのような配置で撮像した際の画像から得られたものであるかを特定する情報として、基準配置_1も特徴情報に含まれ、記憶されている。 For example, for imaging object_1, a feature quantity and feature point coordinate pair (feature quantity_1_i, feature point coordinates_1_i) of the same type as the feature information calculated by the calculation unit 12 is index i = 1,2 , ..., n1, n1 are stored in advance. Further, information for specifying the arrangement of the plurality of pairs of the feature amount and the feature point coordinates of the image pickup target_1 obtained from the image when the image pickup target_1 is picked up. As described above, the reference arrangement_1 is also included in the feature information and stored.

図6は特徴情報を概念的に示す図である。B1が図5の撮像対象_1の例としての看板であり、局所領域R1, R2, R3, ..., Rn1は特徴量及び特徴点座標のペア(特徴量_1_i, 特徴点座標_1_i)[i＝1, 2, 3, ..., n1]がそれぞれ算出された領域の例である。看板B1は所定配置_1として、正面から所定距離離して撮像された状態であり、特に、特徴点座標は当該撮像画像上の座標として与えられる。 FIG. 6 is a diagram conceptually showing the feature information. B1 is a signboard as an example of the imaging object _1 in FIG. 5, and the local regions R1, R2, R3, ..., Rn1 are feature quantity and feature point coordinate pairs (feature quantity_1_i, feature point coordinates_1_i ) [i = 1, 2, 3,..., n1] are examples of calculated areas. The sign B1 is in a state of being imaged at a predetermined distance from the front as the predetermined arrangement_1. In particular, the feature point coordinates are given as coordinates on the captured image.

なお、算出部12にて算出されると共に当該記憶部22にも保持されている特徴量は回転や拡大縮小に対して理論上不変な性質があるので、撮像対象_1が見える配置である限り、基準配置_1等がどのようなものであってもほぼ一定である。基準配置_1等の情報は、特徴点座標の組から位置及び姿勢を求める際に認識部21に利用されることとなる。 Note that the features calculated by the calculation unit 12 and also held in the storage unit 22 are theoretically invariant to rotation and scaling, so as long as the arrangement is one in which imaging target_1 is visible, they are almost constant whatever the reference arrangement_1 and the like may be. Information such as the reference arrangement_1 is used by the recognition unit 21 when obtaining the position and orientation from the set of feature point coordinates.

図5に戻り、撮像対象_1についてはまた、その測位情報すなわち撮像対象_1のフィールド上の存在箇所の情報として測位情報_1が記憶され、付加情報として付加情報_1が記憶され、端末情報として端末情報_1が記憶されている。付加情報_1は、その詳細は図5には示していないが、利用履歴等を含む端末情報に応じたそれぞれの内容を有する情報として記憶されている。 Returning to FIG. 5, for imaging object_1, positioning information_1 is stored as positioning information thereof, that is, information on the location of the imaging object_1 on the field, additional information_1 is stored as additional information, and the terminal Terminal information_1 is stored as information. Although the details are not shown in FIG. 5, the additional information_1 is stored as information having contents according to the terminal information including the usage history and the like.

例えば撮像対象_1がある喫茶店の看板である場合で、端末情報としての端末1の利用履歴などより当該ユーザの当該喫茶店での選択メニュー履歴が得られる場合、特定種類のメニューが履歴内の所定期間に存在することに応じた当該喫茶店の特定種類のクーポン券情報を含んで付加情報_1が構成されていてもよい。結果としてユーザは、各自の履歴に応じた内容の付加情報を出力部16にて提示されることとなる。 For example, when imaging target_1 is the signboard of a coffee shop and the user's history of menu selections at that coffee shop can be obtained from, for example, the usage history of the terminal 1 held as terminal information, additional information_1 may be configured to include coupon information of a specific type for that coffee shop, issued according to whether a specific type of menu item appears in the history within a predetermined period. As a result, each user is presented, at the output unit 16, with additional information whose content corresponds to his or her own history.

また図5において、端末情報_1のような端末情報は、端末を特定する情報の他に利用者の属性、嗜好、操作履歴、利用回数等が含まれ、当該端末情報に該当する場合のみに当該撮像対象を抽出部23での検索対象とするために利用される。具体例については「端末情報における嗜好(趣味)にダンスがある場合」として後述する通りである。 Also in FIG. 5, terminal information such as terminal information_1 includes, in addition to information identifying the terminal, user attributes, preferences, operation history, number of uses, and the like, and is used so that the imaging target becomes a search target of the extraction unit 23 only when the terminal matches that terminal information. A specific example is described later as the case "where dance is among the preferences (hobbies) in the terminal information".

抽出部23は、受信部25から得られた端末1の測位情報に所定範囲で近い測位情報を有する撮像対象を記憶部22から検索し、端末1で実際に撮像している撮像対象に対する候補として抽出する。例えば、測位情報に含まれる位置から得られる、端末1と撮像対象との距離が所定範囲内に収まる撮像対象を候補として抽出する。 The extraction unit 23 searches the storage unit 22 for imaging targets whose positioning information is within a predetermined range of the positioning information of the terminal 1 obtained from the receiving unit 25, and extracts them as candidates for the imaging target that the terminal 1 is actually capturing. For example, it extracts as candidates those imaging targets whose distance from the terminal 1, computed from the positions contained in the positioning information, falls within a predetermined range.

あるいは、当該得られた測位情報に方位に関する情報が存在する場合は、測位情報が所定範囲で近いという条件に加えてさらに、該方位を中心に端末1の撮像部11の画角に収まる範囲内に測位情報を有する撮像対象を、候補として記憶部22から抽出する。 Alternatively, when the obtained positioning information includes information related to a direction, in addition to the condition that the positioning information is close within a predetermined range, it is within a range that fits within the angle of view of the imaging unit 11 of the terminal 1 around the direction. The imaging target having the positioning information is extracted from the storage unit 22 as a candidate.

なおまた、測位情報に所定建築物の階数が含まれる場合、当該所定建築物において階数が一致するものに候補を予め絞り込んだのち、緯度及び経度などから当該建築物の当該階のフロア内にて平面的な距離を求めて所定範囲内に収まる候補を抽出してもよい。 In addition, if the number of floors of a given building is included in the positioning information, after narrowing down candidates to those with the same number of floors in the given building in advance, within the floor of the floor of the building from the latitude and longitude Candidates that fall within a predetermined range may be extracted by obtaining a planar distance.

図7は当該測位情報による検索を概念的に示す図である。(A)では端末1の測位情報で指定される位置Pを中心とする所定半径rの円C内が検索対象であり、当該円C内に測位情報を有する撮像対象すなわち当該円C内に存在する撮像対象が候補として抽出される。(B)では(A)の条件にさらに、端末1の方位を中心として画角θ内に収まるという条件を課すことで、扇形F内が検索対象となる。 FIG. 7 is a diagram conceptually showing a search based on the positioning information. In (A), the search target is a circle C having a predetermined radius r centered on the position P specified by the positioning information of the terminal 1, and the imaging target having the positioning information in the circle C, that is, the circle C is present. The imaging target to be extracted is extracted as a candidate. In (B), in addition to the condition of (A), a condition that the orientation of the terminal 1 is within the angle of view θ is imposed, so that the sector F is searched.

なお、画角θについては端末情報に含めてサーバー２に通知することで、各種の端末1における各種異なる構成の撮像部11についても当該条件検索を可能とすることができる。 The angle of view θ can be included in the terminal information and notified to the server 2 so that the condition search can be performed for the imaging units 11 having various configurations in the various terminals 1.
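The circle search (A) and the sector search (B) of FIG. 7 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes positions have already been converted to local planar coordinates in metres (x east, y north) and that the heading is a compass-style bearing in degrees. The names `bearing_deg` and `extract_candidates` are hypothetical.

```python
import math

def bearing_deg(p, q):
    """Bearing from p to q in degrees, 0 = north (+y), 90 = east (+x)."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360.0

def extract_candidates(targets, p, r, heading_deg=None, fov_deg=None):
    """Return the names of targets inside circle C (centre p, radius r).

    targets: dict mapping name -> (x, y), assumed local planar metres.
    If heading_deg and fov_deg are both given, keep only targets inside
    sector F, i.e. within fov_deg / 2 on either side of the heading.
    """
    kept = []
    for name, pos in targets.items():
        if math.dist(p, pos) > r:
            continue  # outside circle C
        if heading_deg is not None and fov_deg is not None:
            # Signed angular difference folded into [-180, 180).
            diff = (bearing_deg(p, pos) - heading_deg + 180.0) % 360.0 - 180.0
            if abs(diff) > fov_deg / 2.0:
                continue  # outside sector F
        kept.append(name)
    return kept
```

Passing the angle of view per call mirrors the idea of sending θ to the server as part of the terminal information, so that terminals with different cameras can share the same search routine.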

抽出部23はまた、このような記憶部22から候補として抽出する際の測位情報による絞り込みに加えてさらに、受信部25から得られた端末情報によって絞り込みを行ってもよい。 The extraction unit 23 may further perform the narrowing down using the terminal information obtained from the receiving unit 25 in addition to the narrowing down based on the positioning information when extracting the candidate from the storage unit 22 as described above.

例えば、撮像対象の中にダンススクールないしダンス用品店の看板がある場合には、端末情報における嗜好(趣味)にダンスがある場合のみ候補として抽出させるようにすることで、当該嗜好を有するユーザのみを対象として表示情報の制御によるサービス提供などを可能とすることができる。こうして、撮像対象が大量にある場合でも、各ユーザに有益な撮像対象のみを表示情報の制御の利用に供するようにすることができる。 For example, when the imaging targets include the signboard of a dance school or a dance supply store, that signboard can be extracted as a candidate only when dance is among the preferences (hobbies) in the terminal information, which makes it possible, for instance, to provide a service through display information control only to users with that preference. In this way, even when there are a large number of imaging targets, only those useful to each user are used for controlling the display information.

また、上記のように嗜好などの特定項目として端末情報を利用する代わりに、所定のサービスの契約者リストを端末IDのリストに変換したうえで、当該サービスに関連する撮像対象における端末情報として記憶部22で保持しておき、関連ユーザのみに表示情報の提示を可能とするようにしてもよい。 Further, instead of using terminal information as a specific item such as preference as described above, a contractor list of a predetermined service is converted into a list of terminal IDs, and stored as terminal information in an imaging target related to the service The information may be stored in the unit 22 so that display information can be presented only to related users.
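The two narrowing rules described above (a required preference, or a list of permitted terminal IDs derived from a subscriber list) might look like the following sketch. The record layout, including the keys `required_hobby`, `allowed_terminal_ids`, `hobbies` and `terminal_id`, is an assumption made for illustration; the patent does not specify a data format.

```python
def filter_by_terminal_info(candidates, terminal):
    """Keep only candidates whose terminal-information conditions are met.

    candidates: list of dicts with a 'name' and optional conditions:
      'required_hobby'       -> hobby the user must have (hypothetical key)
      'allowed_terminal_ids' -> set of terminal IDs from a subscriber list
    terminal: dict with 'terminal_id' and an optional 'hobbies' collection.
    """
    kept = []
    for cand in candidates:
        hobby = cand.get("required_hobby")
        if hobby is not None and hobby not in terminal.get("hobbies", ()):
            continue  # user lacks the required preference
        allowed = cand.get("allowed_terminal_ids")
        if allowed is not None and terminal["terminal_id"] not in allowed:
            continue  # terminal is not on the subscriber list
        kept.append(cand["name"])
    return kept
```

A candidate with no conditions is always kept, which corresponds to an imaging target that is offered to every user.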

抽出部23はこうして抽出した撮像対象の各候補を、その特徴情報と共に認識部21へと出力する。さらに、認識部21が後述のようにして撮像対象の候補の中から実際に撮像部11が画像上で捉えているのがどれであるかを特定した後、抽出部23は受信部25から得られた端末情報に応じて記憶部22から当該特定された撮像対象における付加情報を選択し、該付加情報を送信部24へ出力する。 The extraction unit 23 outputs each candidate of the imaging target thus extracted to the recognition unit 21 together with the feature information. Furthermore, after the recognizing unit 21 specifies which of the imaging target candidates the imaging unit 11 actually captures on the image as described later, the extracting unit 23 obtains from the receiving unit 25. The additional information on the specified imaging target is selected from the storage unit 22 according to the specified terminal information, and the additional information is output to the transmission unit 24.

なお、当該特定された撮像対象における付加情報の記憶部22からの選択は抽出部23ではなく送信部24が行って、該付加情報を送信するようにしてもよい。 Note that the additional information on the specified imaging target may be selected from the storage unit 22 by the transmission unit 24 instead of the extraction unit 23, and the additional information may be transmitted.

端末情報には、前述のように端末を特定する情報の他に利用者の属性、嗜好、操作履歴、利用回数等が含まれ、付加情報は前述のように当該端末情報などに応じた各種の内容を含むように構成されているので、端末情報等に応じて付加情報の内容を変えつつ、出力部16において表示情報と重畳させるようにすることができる。 The terminal information includes the user's attributes, preferences, operation history, number of times of use, etc. in addition to the information for specifying the terminal as described above, and the additional information includes various information according to the terminal information as described above. Since the content is configured to be included, it is possible to superimpose display information on the output unit 16 while changing the content of the additional information according to the terminal information or the like.

例えば、前述の撮像対象_1の例では、同じ喫茶店の看板であっても異なる端末1を利用しているユーザAとユーザBとでは、当該ユーザの利用履歴や嗜好などに応じて異なるクーポン情報を重畳させるようにすることができる。 For example, in the above-described example of the imaging target_1, coupon information that is different depending on the usage history or preference of the user between the user A and the user B who are using different terminals 1 even if they are signs of the same coffee shop Can be superimposed.

認識部21は、抽出部23から得られた撮像対象の各候補における特徴情報と、受信部25から得られ端末1において実際の撮像対象より算出部12より算出された特徴情報と、を比較し、実際の撮像対象が候補のうちのいずれであるかを認識し、さらに、当該認識された撮像対象の端末1に対する相対的な位置関係(位置及び姿勢)を認識する。 The recognizing unit 21 compares the feature information in each candidate of the imaging target obtained from the extracting unit 23 with the feature information obtained from the actual imaging target in the terminal 1 and calculated by the calculating unit 12 in the terminal 1. Then, it recognizes which of the candidates is the actual imaging target, and further recognizes the relative positional relationship (position and orientation) of the recognized imaging target with respect to the terminal 1.

当該認識処理については後述するが、端末1から送信された測位情報及び端末情報によって抽出部23が予め撮像対象の候補を絞り込んでいるので、特徴情報の探索空間が限定され、認識処理の高速化を図ることができる。撮像対象の認識結果としての認識情報並びに当該認識された撮像対象の位置及び姿勢の情報は、抽出部23及び送信部24へ出力される。なお、付加情報の選択を送信部24が行う場合は、当該出力は送信部24のみへ向けてなされる。 Although the recognition process will be described later, since the extraction unit 23 narrows down the candidates for imaging in advance based on the positioning information and terminal information transmitted from the terminal 1, the search space for the feature information is limited, and the recognition process is accelerated. Can be achieved. Recognition information as a recognition result of the imaging target and information on the position and orientation of the recognized imaging target are output to the extraction unit 23 and the transmission unit 24. Note that when the transmission unit 24 selects additional information, the output is directed to the transmission unit 24 only.

送信部24は、認識部21で得られた認識情報並びに位置及び姿勢の情報と、抽出部23又は送信部24が記憶部22より選択した当該認識情報及び端末情報等に対応する付加情報と、を端末へ送信する。 The transmission unit 24 transmits to the terminal the recognition information and the position and orientation information obtained by the recognition unit 21, together with the additional information that the extraction unit 23 or the transmission unit 24 selected from the storage unit 22 as corresponding to the recognition information, the terminal information, and the like.

図8は、以上のような各部の処理によって得られる、制御部14によって加工された付加情報が重畳された出力部16における表示情報を説明するための図である。ここで(1)に示すように、撮像対象の例B10は看板であって、正面から撮像したものを示している。当該(1)の状態において付加情報を重畳した表示情報の例が(2)であり、当該看板B10と同一平面上の所定領域B11に看板を拡張している。 FIG. 8 is a diagram for explaining display information in the output unit 16 on which the additional information processed by the control unit 14 is obtained, which is obtained by the processing of each unit as described above. Here, as shown in (1), the imaging target example B10 is a signboard, and shows an image taken from the front. An example of display information in which the additional information is superimposed in the state (1) is (2), and the signboard is extended to a predetermined area B11 on the same plane as the signboard B10.

当該所定領域B11上には、前述のクーポン券情報など、端末情報に応じた各内容の付加情報が記載されていてもよい。当該所定領域B11は端末情報毎の内容を有してよいので、端末情報に応じて当該所定領域の配置及び／又は形状などが変更されてもよい。また当該所定領域の例B11は、看板B10の外部に拡張して存在しているが、看板B10の内部にB11が存在してもよいし、内部と外部とにB11が存在してもよい。 On the predetermined area B11, additional information of each content according to the terminal information such as the above coupon information may be described. Since the predetermined area B11 may have contents for each terminal information, the arrangement and / or shape of the predetermined area may be changed according to the terminal information. Further, the example B11 of the predetermined area exists outside the signboard B10, but B11 may exist inside the signboard B10, or B11 may exist inside and outside.

なお、(2)に示すように、撮像対象の領域B10に対して所定配置の領域B11が定められるように、付加情報には撮像対象に対する相対的な位置及び姿勢並びに占有領域の情報が含まれているものとする。ただし、撮像対象と付加情報とは画像上にて基本的には同一平面上に構成させるので、この場合、相対的な姿勢については同姿勢とする。あるいは、表示情報を眺めるユーザの注意を喚起するために、撮像対象に対する付加情報の相対的な姿勢を平面同士が直角をなすなどの、同姿勢以外の所定の姿勢としてもよい。 As shown in (2), the additional information includes information on the relative position and orientation with respect to the imaging target and information on the occupied area so that a predetermined area B11 is defined with respect to the imaging target area B10. It shall be. However, since the imaging target and the additional information are basically configured on the same plane on the image, in this case, the relative posture is the same. Alternatively, in order to call the attention of the user viewing the display information, the relative posture of the additional information with respect to the imaging target may be a predetermined posture other than the same posture, such as planes being perpendicular to each other.

図8にて、撮像対象である看板の端末1に対する位置及び姿勢が変動して、画像における見え方が変わることで、(1)の状態から(3)の状態になった際の、付加情報を重畳した表示情報が(4)である。すなわち、看板が正面から見たB10の状態から斜めから見たB20の状態になったのに応じて、付加情報も正面から見たB11の状態から斜めから見たB21の状態へとなって、表示情報が得られる。 In FIG. 8, (4) is the display information with the additional information superimposed when the position and orientation of the signboard (the imaging target) relative to the terminal 1 change, so that its appearance in the image changes from state (1) to state (3). That is, as the signboard goes from state B10, seen from the front, to state B20, seen obliquely, the additional information likewise goes from state B11, seen from the front, to state B21, seen obliquely, and the display information is obtained accordingly.

この際、B10に対するB20の位置及び姿勢の関係すなわち平面射影変換の関係を、B11に対して適用することで、すなわち、領域B11に対して当該平面射影変換を施すことで、B21が得られる。表示情報を眺めるユーザにとっては、看板B10を閲覧可能な任意の位置及び姿勢において、看板B10がB11の領域まで拡張されて最初から存在しているように見えることとなる。こうして、図3や図4に示すような姿勢及び位置による制御の際も、図3のP10の模様及び図4のP20の模様をB10及びB11の領域で置き換えたようにして表示情報が制御されることとなる。 Here, B21 is obtained by applying to B11 the relationship in position and orientation of B20 with respect to B10, that is, the planar projective transformation relationship; in other words, by applying that planar projective transformation to region B11. To a user viewing the display information, from any position and orientation at which signboard B10 can be viewed, signboard B10 appears to have existed from the start extended to the area of B11. Thus, in control based on orientation and position as shown in FIGS. 3 and 4 as well, the display information is controlled as if the pattern P10 in FIG. 3 and the pattern P20 in FIG. 4 were replaced with the areas B10 and B11.
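Applying the planar projective transformation to region B11 amounts to mapping each corner of the region through a 3x3 matrix in homogeneous coordinates. The sketch below assumes the matrix H (B10 to B20) is already known and that the region is given by its corner points; the function names are illustrative.

```python
def apply_homography(H, pt):
    """Map a 2D point through the 3x3 planar projective transformation H."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]  # homogeneous scale factor
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def warp_region(H, corners):
    """Transform the corner points of a region such as B11 into B21."""
    return [apply_homography(H, c) for c in corners]
```

Because the same H is used for the signboard and for the added region, the two stay coplanar in the rendered image, which is what makes the extension look like part of the original signboard.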

認識部21の処理の詳細は次の通りである。特徴情報の比較処理には、周知の手法であるRANSAC やPROSAC 等を利用することができ、例えばRANSACの場合であれば認識部21は以下の(1)〜(3)のようにして比較及び認識を行うことができる。 Details of the processing of the recognition unit 21 are as follows. For comparison processing of feature information, RANSAC or PROSAC, which are well-known methods, can be used.For example, in the case of RANSAC, the recognition unit 21 performs comparison and comparison as follows (1) to (3). Recognition can be performed.

(1)特徴量の対応づけ (1) Feature correspondence
実際の撮像対象から算出された特徴量と、候補cの撮像対象における特徴量と、の間にて特徴量の対応付けを行う。実際の撮像対象から算出された特徴量をd_i(i=1,2, ..., n)とし、撮像対象の候補cに対して記憶部22で記憶されている特徴量をD_cj(j=1, 2, ..., m)とすると、各特徴量D_cjに対応する特徴量d_i[i]を、特徴量間の距離が最小となるものとして定める。さらに、特徴量d_iのうち対応する特徴量D_cjが定まっていないものが残っている場合、当該特徴量d_iとの距離が最小となる特徴量D_cj[i]に対応付ける。 Features calculated from the actual imaging target are associated with the features of candidate c. Let the features calculated from the actual imaging target be d_i (i = 1, 2, ..., n) and the features stored in the storage unit 22 for candidate c be D_cj (j = 1, 2, ..., m). For each feature D_cj, the corresponding feature d_i[i] is defined as the one at minimum distance between features. Further, if any feature d_i remains for which no corresponding feature D_cj has been determined, it is associated with the feature D_cj[i] at minimum distance from that d_i.

なお当該対応づけにおいては、特徴量間の距離が最小の条件にさらに、特徴量間の距離が所定の閾値以内である条件を課してもよい。また別実施例として、特徴量間の距離が所定の閾値以内であるもの同士を全て対応づけるようにしてもよい。 In the association, a condition that the distance between the feature amounts is within a predetermined threshold may be imposed on the condition that the distance between the feature amounts is minimum. As another embodiment, all the features whose distances between the feature amounts are within a predetermined threshold may be associated with each other.
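The two-pass association of step (1), with the optional distance threshold, can be sketched as below. Feature vectors are assumed to be equal-length tuples of floats compared by Euclidean distance, and both lists are assumed to be non-empty. The patent only speaks of "distance between features", so the metric and the name `match_features` are assumptions.

```python
import math

def match_features(query, stored, max_dist=None):
    """Associate query features d_i with stored features D_cj.

    Pass 1: each stored feature takes its nearest query feature.
    Pass 2: each still-unmatched query feature takes its nearest stored feature.
    If max_dist is given, pairs farther apart than max_dist are dropped.
    Returns a list of (i, j) index pairs.
    """
    pairs, used = [], set()
    for j, D in enumerate(stored):
        i = min(range(len(query)), key=lambda k: math.dist(query[k], D))
        if max_dist is None or math.dist(query[i], D) <= max_dist:
            pairs.append((i, j))
            used.add(i)
    for i, d in enumerate(query):
        if i in used:
            continue
        j = min(range(len(stored)), key=lambda k: math.dist(d, stored[k]))
        if max_dist is None or math.dist(d, stored[j]) <= max_dist:
            pairs.append((i, j))
    return pairs
```

Without a threshold every query feature ends up in some pair, so wrong pairs are expected; removing them is the job of step (2).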

(2)誤対応の部分の除外と正対応の部分による平面射影変換関係の算出 (2) Excluding incorrect correspondences and calculating the planar projective transformation from the correct ones
実際の撮像対象から算出された特徴量のうち、撮像対象における特徴点とは違う部分において誤算出された等の理由から、(1)の対応づけのうち誤対応であったと判定されるものを除外し、正対応の特徴量に対する特徴点によって平面射影変換の関係を求める。 Among the correspondences from (1), those determined to be incorrect, for example because a feature calculated from the actual imaging target was erroneously computed at a location that is not a feature point of the imaging target, are excluded, and the planar projective transformation relationship is obtained from the feature points of the correctly corresponding features.

具体的には、(1)にて対応付けられた特徴量d_iと特徴量D_cjとを用いて、特徴量D_cjから所定数(ただし、平面射影変換を求めるために少なくとも4つ必要)を選び、これに対応する特徴量d_iの全組み合わせにつき、所定数の特徴量D_cjのそれぞれの特徴点座標からなる特徴点座標の組Xから、対応する特徴量d_iにおける特徴点座標の組xへの平面射影変換Hを、次式に示す当該変換の誤差eを最小にするように求める。 Specifically, using the features d_i and D_cj associated in (1), a predetermined number of features D_cj is selected (at least four are needed to obtain a planar projective transformation). For every such combination and the corresponding features d_i, the planar projective transformation H from the set X of feature point coordinates of the selected features D_cj to the set x of feature point coordinates of the corresponding features d_i is obtained so as to minimize the transformation error e given by the following equation.
e＝|HX − x|

当該変換関係を特徴量D_cjから所定数を選ぶ全ての組み合わせにつき求める。変換関係Hを求めたこれら全ての組み合わせの中で、変換関係を求める際に利用しなかった特徴量D_cj'及び対応する特徴量d_i'における特徴点座標X'及びx'を、当該求められた変換関係Hで結びつけた際の次式に示す誤差Eが、所定の閾値に収まるような特徴点座標X'及びx'のペア数が最大のものを求める。 This transformation relationship is obtained for every combination of the predetermined number of features selected from D_cj. Among all the combinations for which H was obtained, the one sought is that which maximizes the number of pairs of feature point coordinates X' and x', belonging to features D_cj' and corresponding features d_i' not used in obtaining the transformation, for which the error E given by the following equation, when the pair is related through the obtained H, falls within a predetermined threshold.
E＝|HX' − x'|

当該ペア数最大の際のHを求めた特徴点座標の組X及びxと、当該X及びxに対して誤差Eが上記の所定の閾値内に収まった特徴点座標X'及びx'のペアと、を当該候補cに対して(1)における特徴量の対応が正しく行われたものに対応する特徴点座標(正対応の特徴点座標)であると判定すると共に、当該正しく対応付けられた特徴量における特徴点座標を用いて再度平面射影変換Hを求め、当該候補cにおける位置及び姿勢の推定結果となす。 A pair of feature point coordinates X and x for which H is obtained when the number of pairs is maximum, and a pair of feature point coordinates X ′ and x ′ in which the error E is within the predetermined threshold with respect to X and x And the feature point coordinates corresponding to those for which the feature amount correspondence in (1) was correctly performed with respect to the candidate c (correct feature point coordinates) and The plane projective transformation H is obtained again using the feature point coordinates in the feature amount, and the position and orientation estimation results for the candidate c are obtained.

なお、当該平面射影変換Hは記憶部22で保持されている特徴点座標から入力画像における特徴点座標への変換であるので、記憶部22で保持されている基準配置の情報をさらに加味することによって、端末1に対する撮像対象の位置及び姿勢の推定結果を求めることができる。 Note that the plane projective transformation H is a conversion from the feature point coordinates held in the storage unit 22 to the feature point coordinates in the input image, and therefore further considers the reference arrangement information held in the storage unit 22. Thus, the estimation result of the position and orientation of the imaging target with respect to the terminal 1 can be obtained.
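Step (2) can be sketched in pure Python as follows. This is an illustrative simplification under stated assumptions rather than the patented procedure: H is parameterised with its last element fixed to 1, four pairs are solved exactly and larger sets by least squares through the normal equations, and the supporting pairs are counted over all pairs (the sampled pairs fit with near-zero error, so the ranking of combinations is unchanged). As in the text, every combination is tried; RANSAC proper would instead draw random samples. All function names are hypothetical.

```python
import math
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination; return None if singular."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        if abs(M[p][c]) < 1e-12:
            return None
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * m for a, m in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_homography(X, x):
    """Fit H (last element fixed to 1) mapping stored points X to observed x."""
    rows, rhs = [], []
    for (U, V), (u, v) in zip(X, x):
        rows.append([U, V, 1, 0, 0, 0, -u * U, -u * V]); rhs.append(u)
        rows.append([0, 0, 0, U, V, 1, -v * U, -v * V]); rhs.append(v)
    if len(rows) == 8:   # exactly four pairs: square system
        h = solve(rows, rhs)
    else:                # more pairs: least squares via normal equations
        AtA = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
        Atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(8)]
        h = solve(AtA, Atb)
    return None if h is None else [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def project(H, p):
    """Apply H to point p; return None if the point maps to infinity."""
    w = H[2][0] * p[0] + H[2][1] * p[1] + H[2][2]
    if abs(w) < 1e-12:
        return None
    return ((H[0][0] * p[0] + H[0][1] * p[1] + H[0][2]) / w,
            (H[1][0] * p[0] + H[1][1] * p[1] + H[1][2]) / w)

def reprojection_error(H, P, q):
    """E = |HP - q| as a Euclidean distance in the image."""
    pr = project(H, P)
    return float("inf") if pr is None else math.dist(pr, q)

def consensus_homography(X, x, tol=1e-3, sample=4):
    """Try every 4-pair combination, keep the largest consistent set, refit H."""
    best = []
    for combo in combinations(range(len(X)), sample):
        H = fit_homography([X[i] for i in combo], [x[i] for i in combo])
        if H is None:
            continue  # degenerate sample, e.g. three collinear points
        inliers = [i for i in range(len(X)) if reprojection_error(H, X[i], x[i]) <= tol]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < sample:
        return None, best
    return fit_homography([X[i] for i in best], [x[i] for i in best]), best
```

Trying every combination grows quickly with the number of pairs, which is one reason narrowing the candidates beforehand matters. The normal-equation refit is adequate for a sketch but loses precision with large pixel coordinates, where normalising the points first would help.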

(3)実際の撮像対象が候補のうちのいずれであるかの決定 (3) Determining which candidate is the actual imaging target
各候補cにつき上記(1)及び(2)を実行し、(2)にて正しく対応づけられた特徴量における特徴点座標で平面射影変換Hを求めた際に、当該Hにて正対応の特徴点座標を変換した誤差が最小となるような候補cを、実際の撮像対象に対応するものであると決定する。また当該候補cに対する(2)における位置及び姿勢の推定結果(すなわち当該再度求められた平面射影変換Hに対応する推定結果)を、実際の撮像対象の位置及び姿勢の推定結果となす。 Steps (1) and (2) above are executed for each candidate c. The candidate c for which the error of transforming the correctly corresponding feature point coordinates by H is smallest, where H is the planar projective transformation obtained in (2) from the feature point coordinates of the correctly associated features, is determined to correspond to the actual imaging target. The position and orientation estimate from (2) for that candidate c (that is, the estimate corresponding to the re-obtained planar projective transformation H) is taken as the estimate of the position and orientation of the actual imaging target.

更新部15による更新処理の詳細を関連する処理と共に説明する。更新部15は、端末1とサーバー2との間での情報のやりとりのタイミングの各種の態様に応じた更新処理を行う。図9は当該タイミングの各種の態様を<A>及び<B>として説明するための図である。 Details of the update process by the update unit 15 are described together with related processes. The update unit 15 performs update processing according to the various modes of timing with which information is exchanged between the terminal 1 and the server 2. FIG. 9 illustrates these timing modes as <A> and <B>.

<A>及び<B>において(1)〜(6)等はサーバー2側での処理を、[1]〜[6]等は端末1側での処理を順次表している。特に、[1]〜[6]はそれぞれ、端末1側にて所定のタイミング間隔で順次繰り返される位置及び姿勢の各回の更新処理及び当該更新に基づく表示情報の制御を含んでいる。なお、図5においては(1)や[1]等の各処理は複数のステップを含み、ある程度の時間幅をもって行われることを想定している。 In <A> and <B>, (1) to (6) and so on denote, in order, processing on the server 2 side, and [1] to [6] and so on denote processing on the terminal 1 side. In particular, each of [1] to [6] includes one round of the position and orientation update processing that is repeated sequentially at predetermined intervals on the terminal 1 side, together with the display information control based on that update. Note that in FIG. 5 each process such as (1) or [1] is assumed to comprise multiple steps and to take a certain amount of time.

<A>の場合、端末1が位置及び姿勢を更新する各回[i](i＝1,2, ...)につき、サーバー2とやりとりを行い、サーバー2が各回の処理(i)によって得た最新の認識情報その他を常に端末1で利用する。 In the case of <A>, each time [i] (i = 1, 2, ...) when the terminal 1 updates the position and orientation, the server 2 communicates with the server 2, and the server 2 obtains by each processing (i). The latest recognition information and other information are always used by the terminal 1.

すなわち、例えば1回目であれば端末1は処理[1]においてa1で示すように当該時点で得た特徴情報及び測位情報を送信部17及び受信部25を介してサーバー2に渡す。なお、端末情報については端末1とサーバー2とがやりとりを開始した時点でサーバー2に渡せばよい。 That is, for example, if it is the first time, the terminal 1 passes the feature information and the positioning information obtained at the time point to the server 2 via the transmission unit 17 and the reception unit 25 as indicated by a1 in the process [1]. The terminal information may be passed to the server 2 when the terminal 1 and the server 2 start exchanges.

サーバー2は対応する処理(1)において当該特徴情報及び測位情報を利用して、前述の一連の処理を行う。すなわち、抽出部23にて記憶部22より撮像対象の候補を抽出し、認識部21にて候補の中から実際の撮像対象がいずれであるかを認識情報として認識すると共にその位置及び姿勢を認識する。 In the corresponding processing (1), the server 2 uses the feature information and the positioning information to perform the series of processes described above. That is, the extraction unit 23 extracts imaging target candidates from the storage unit 22, and the recognition unit 21 recognizes, as recognition information, which of the candidates is the actual imaging target, and also recognizes its position and orientation.

そして、当該認識情報と、位置及び姿勢の情報と、当該認識された撮像対象に対して記憶部22に保持されている付加情報のうち当該端末情報に対応する内容と、をb1に示すように送信部24及び受信部18を介して端末1に返信する。当該返信された情報によって、端末1は当該[1]の時点での表示情報の制御を行う。2回目以降についても全く同様である。 Then, as shown in b1, the recognition information, the position and orientation information, and the content corresponding to the terminal information among the additional information held in the storage unit 22 for the recognized imaging target A reply is made to the terminal 1 via the transmitter 24 and the receiver 18. Based on the returned information, the terminal 1 controls the display information at the time [1]. The same applies to the second and subsequent times.

このように、<A>の場合は端末1における表示情報の制御を、常にサーバー2からの当該時点に対応する最新の情報を利用して行う。互いにやりとりするデータの送受信を含めた端末1及びサーバー2の各処理が十分に高速に行える場合、<A>によって所定精度を確保した表示情報の制御が可能となる。 As described above, in the case of <A>, the display information on the terminal 1 is always controlled using the latest information corresponding to the time point from the server 2. When each process of the terminal 1 and the server 2 including transmission / reception of data exchanged with each other can be performed at a sufficiently high speed, display information with a predetermined accuracy can be controlled by <A>.

一方、一部分の処理の速度が確保できず、撮像部11で撮像される及び／又は出力部16で出力する所定のサンプリングレートに追従して表示情報の制御を行うに際してボトルネックが存在する場合、例えば<B>のようにして所定精度を確保しつつサンプリングレートに追従した速度で表示情報の制御を可能とする。 On the other hand, when the speed of some part of the processing cannot be ensured and there is a bottleneck in controlling the display information at the predetermined sampling rate at which the imaging unit 11 captures images and/or the output unit 16 outputs them, the display information can be controlled at a speed that keeps up with the sampling rate while maintaining a predetermined accuracy, for example as in <B>.

すなわち<B>の場合において、1回目における端末1の処理[1]及びサーバー2の処理(1)並びにデータの送受信a1及びb1については、<A>の場合で説明したのと同様である。一方、ボトルネックに対応すべく2回目における端末1の表示情報の制御処理[2]は、既にb1によって得られた情報と、処理[1]から[2]までの間にサーバー2を利用せず端末1自身が単独で取得した情報c2と、によって行うようにする。情報c2には後述するように、所定のサンプリングレートに追従して取得できるような各種のものを利用する。これによって、処理[2]は場合<A>におけるようなデータ送信a2の後のサーバー2からのデータ受信b2を待たずして可能となる。 That is, in case <B>, the first-round processing [1] of the terminal 1, processing (1) of the server 2, and data transmission and reception a1 and b1 are the same as described for case <A>. To cope with the bottleneck, however, the second-round display information control processing [2] of the terminal 1 is performed using the information already obtained through b1 and information c2 that the terminal 1 acquired on its own, without using the server 2, between processing [1] and [2]. As described later, various kinds of information that can be acquired at the predetermined sampling rate are used as c2. Processing [2] thus becomes possible without waiting, as in case <A>, for data reception b2 from the server 2 after data transmission a2.

同様に、3回目の表示情報の制御処理[3]は処理[1]から[3]までの間に取得した情報c3(c2と同種の情報)とb1とによって行い、4回目の表示情報の制御処理[4]は処理[1]から[4]までの間に取得した情報c4(c2と同種)とb1とによって行う。 Similarly, the display information control process [3] for the third time is performed using the information c3 (same type information as c2) and b1 acquired during the processes [1] to [3], and the display information for the fourth time is displayed. The control process [4] is performed using information c4 (same type as c2) and b1 acquired during the processes [1] to [4].

一方、情報c2〜c4は高速に取得可能な代わりに長期間に渡って精度を確保するのが困難である(又は困難となる可能性がある)ため、所定タイミング毎にサーバー2側の認識処理を利用することで精度を維持する。当該所定タイミングの例として、例えば5回目の表示情報の制御処理は当該<B>の1回目又は<A>の場合と同様に、端末1の処理[5]、サーバー2の処理(5)並びにデータの送受信a5及びb5によって行われる。 On the other hand, although the information c2 to c4 can be acquired at high speed, it is difficult (or may become difficult) to maintain its accuracy over a long period, so accuracy is maintained by using the recognition processing on the server 2 side at predetermined timings. As an example of such a timing, the fifth round of display information control is performed, as in the first round of <B> or as in case <A>, through processing [5] of the terminal 1, processing (5) of the server 2, and data transmission and reception a5 and b5.

こうして、6回目の表示情報の制御処理[6]はb5によって得られた情報と、処理[5]から[6]までの間に端末1が単独で取得した情報c6(c2と同種)と、によって行われる。以降は図示していないが、同様である。 Thus, the display information control process [6] for the sixth time is the information obtained by b5, and the information c6 (same type as c2) obtained by the terminal 1 alone between the processes [5] to [6], Is done by. Although not shown, the same applies.
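The interleaving of server replies and terminal-only updates in this second timing mode can be summarised by a small scheduling sketch. It assumes a fixed refresh period, here every fourth round, to match the example in which rounds 1 and 5 use the server. The parameter `server_every` and the function name are hypothetical; the text only requires "predetermined timings".

```python
def plan_updates(num_rounds, server_every=4):
    """List, per round k, which server reply b and which local info c are used.

    Rounds 1, 1 + server_every, ... wait for a fresh server reply b_k.
    Every other round reuses the latest reply and adds locally acquired c_k.
    Returns tuples (k, server_reply, local_info), local_info None on refresh.
    """
    plan = []
    last_server = None
    for k in range(1, num_rounds + 1):
        if (k - 1) % server_every == 0:
            last_server = k
            plan.append((k, f"b{k}", None))
        else:
            plan.append((k, f"b{last_server}", f"c{k}"))
    return plan
```

For six rounds with the default period this reproduces the sequence in the text: b1 alone, then b1 with c2, c3 and c4, then b5 alone, then b5 with c6.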

情報c2〜c6等は具体的には、サーバー2が位置および姿勢の認識を行い認識結果を端末1に通知した直近の過去時点から現在時点までの間の、端末1の位置及び姿勢の変化を推定した情報である。当該直近の過去にサーバー2が認識して端末1に通知した端末1に対する撮像対象の位置及び姿勢に対して当該推定された変化を加味することで、各時点での撮像対象の位置及び姿勢が得られる。平面射影変化の関係を行列の形で与えると、直近の過去にサーバー2が認識した位置及び姿勢の平面射影変換の行列に、当該推定された変化の分の平面射影変換の行列を掛けることで、各時点の位置及び姿勢に対応する平面射影変換の行列が得られる。 Specifically, the information c2 to c6 and so on is an estimate of the change in the position and orientation of the terminal 1 between the most recent past time at which the server 2 recognized the position and orientation and notified the terminal 1 of the result, and the current time. Adding this estimated change to the position and orientation of the imaging target relative to the terminal 1, as most recently recognized by the server 2 and notified to the terminal 1, gives the position and orientation of the imaging target at each time point. Expressing the planar projective transformation relationships as matrices, multiplying the planar projective transformation matrix of the position and orientation most recently recognized by the server 2 by the planar projective transformation matrix for the estimated change yields the planar projective transformation matrix corresponding to the position and orientation at each time point.
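The matrix product described here can be written out directly. The sketch assumes the column-vector convention, in which the matrix for the estimated change is applied after the server's matrix and therefore appears on the left. The text does not state the multiplication order, so this ordering and the function names are assumptions.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def current_homography(H_server, H_delta):
    """Combine the last server-recognized H with the locally estimated change.

    H_server maps stored reference coordinates to image coordinates at the
    time of the last server reply. H_delta maps those image coordinates to
    image coordinates at the current time (assumed convention).
    """
    return matmul3(H_delta, H_server)
```

For example, if the server reported a shift of 10 pixels in x and the terminal has since estimated a uniform 2x zoom about the origin, the combined matrix scales by 2 and shifts by 20 pixels.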

Alternatively, the information c2 to c6 and so on may be information that directly estimates the position and orientation of the imaging target at the current time. In this case too, the current position and orientation can be estimated by using the position and orientation most recently recognized by the server 2 as the reference.

Even in this case, the server 2 side may execute only the recognition processing in parallel at a predetermined frequency. That is, for example, in process [2] the information a2 may be transmitted as in case <A> and the recognition processing may be performed in process (2), but the information b2 need not be transmitted. In this case, if the recognition processing on the server 2 side finds that the imaging target has changed, the terminal 1 may be notified immediately so that control shifts to the new imaging target.

FIG. 10 is a functional block diagram of the update unit 15. The update unit 15 includes a calculation update unit 51, a sensor update unit 52, a separate calculation update unit 53, an extrapolation update unit 54, and a camera work update unit 55. Each of these units implements one embodiment for estimating the information c2 to c6 and so on. The sensor update unit 52 and the camera work update unit 55 estimate changes in position and orientation, whereas all the others estimate the position and orientation themselves.

The update unit 15 can estimate the position and orientation by using any one of these units. The update unit 15 may be configured so that which unit is used can be changed in accordance with, for example, an instruction from the user.

The calculation update unit 51 estimates the position and orientation of the imaging target from feature information of the same type as that calculated by the calculation unit 12. That is, the exchange with the server 2 identifies the correctly corresponding part of the feature information, namely the part that captures the imaging target. From then on, the calculation update unit 51 tracks only that correctly corresponding part in the image, which allows it to estimate the position and orientation of the imaging target relative to the terminal 1 with less computation than calculating feature information from the entire image.

For example, if a correctly corresponding feature amount was calculated at certain feature point coordinates at one time, then at the next time the feature amount may be calculated by searching a predetermined region in the neighborhood of those feature point coordinates. The neighborhood may be determined based on optical flow or the like.
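The neighborhood search can be illustrated with a short sketch. It assumes that feature points in the new frame are already available as (coordinates, descriptor) pairs, that the neighborhood is a circle of fixed radius, and that descriptors are compared by squared Euclidean distance. None of these choices is specified in the description, and `track_feature` is a hypothetical name.

```python
def track_feature(prev_xy, prev_desc, candidates, radius=20.0):
    """Return the candidate (xy, desc) within `radius` of prev_xy whose
    descriptor is closest to prev_desc, or None if no candidate is near."""
    best, best_d = None, float("inf")
    for xy, desc in candidates:
        dx, dy = xy[0] - prev_xy[0], xy[1] - prev_xy[1]
        if dx * dx + dy * dy > radius * radius:
            continue  # outside the search neighborhood, so never compared
        d = sum((a - b) ** 2 for a, b in zip(desc, prev_desc))
        if d < best_d:
            best, best_d = (xy, desc), d
    return best


# Example: the identical descriptor at (300, 40) is ignored because it lies
# far from the previous position (100, 100).
candidates = [((105, 98), [0.9, 0.1]),
              ((300, 40), [1.0, 0.0]),
              ((110, 110), [0.2, 0.8])]
match = track_feature((100, 100), [1.0, 0.0], candidates)
```

Restricting the comparison to nearby candidates is what reduces the computation relative to matching against every feature in the image.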

The sensor update unit 52 includes an acceleration sensor and a tilt sensor that acquire the acceleration applied to the terminal 1 and the tilt of the terminal 1, respectively. It estimates the translational movement of the terminal 1, that is, the change in position, as the accumulated (integrated) output of the acceleration sensor, and estimates the rotation of the terminal 1, that is, the change in orientation, from the change in the output of the tilt sensor. These sensor outputs can be sampled at a sufficiently high rate, but their errors accumulate over time. Each sensor output is converted into the form of a planar projective transformation before being output to the control unit 14.
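As a rough illustration of the sensor-based estimate, the sketch below integrates acceleration samples along a single axis and takes the difference of tilt readings. The description only says that the acceleration output is accumulated (integrated); the double integration, the assumption that the terminal starts at rest, the fixed sampling interval `dt`, and the name `integrate_motion` are all assumptions of this sketch.

```python
def integrate_motion(accels, tilts, dt):
    """Estimate displacement along one axis and the change in tilt.

    accels: acceleration samples, tilts: tilt samples, dt: sampling interval.
    The terminal is assumed to be at rest at the start of the interval.
    """
    velocity = 0.0
    displacement = 0.0
    for a in accels:
        velocity += a * dt             # acceleration -> velocity
        displacement += velocity * dt  # velocity -> position
    rotation = tilts[-1] - tilts[0]    # change in tilt over the interval
    return displacement, rotation


displacement, rotation = integrate_motion(
    accels=[2.0, 2.0, 0.0, 0.0], tilts=[10.0, 12.0, 15.0], dt=0.5)
```

Because each step adds to the running sums, any bias in the samples is carried forward, which is the error accumulation mentioned above.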

The separate calculation update unit 53 estimates the position and orientation by calculating from the image a different type of feature amount that has a lower computational load than the feature information calculated by the calculation unit 12. For example, a predetermined marker is provided on the imaging target in advance, and the position and orientation are estimated by detecting that marker.

The marker can be configured, for example, so that a predetermined color feature allows the region surrounding the marker to be narrowed down in the image, and so that edge detection, which generally has a low computational load, applied inside that region detects four or more mutually distinguishable feature line segments and/or feature points (four or more mutually distinguishable feature locations). Having four or more feature locations makes it possible to calculate the planar projective transformation corresponding to the position and orientation. If the imaging target is a signboard, such a marker can be provided on its outer frame or inside it. For example, if the signboard is rectangular, the marker can be configured as an outer frame consisting of four feature line segments.
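The reason four feature locations suffice can be seen in a small sketch: each point correspondence gives two linear equations, and a planar projective transformation with its bottom-right element fixed to 1 has eight unknowns. The code below is an illustrative direct linear solve in pure Python, not the method of the patent; the function names and the example coordinates are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(src, dst):
    """Planar projective transformation mapping 4 src points to 4 dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, pt):
    """Apply a homography to a 2D point."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Reference corners of a rectangular marker and where they were detected.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(10, 20), (110, 25), (105, 130), (5, 120)]
H = homography_from_4_points(src, dst)
```

With more than four locations the system becomes overdetermined, and a least-squares solution would typically be used instead.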

The marker information is included as additional data in the data held in the storage unit 22 described with reference to FIG. 5. The terminal 1 obtains the marker information when it exchanges data with the server 2, which enables it to detect the marker on the recognized imaging target. The coordinates of the feature points when the marker is at a predetermined position and orientation are also held in the storage unit 22 as reference arrangement information, so that a planar projective transformation can be obtained between them and the feature points detected by the terminal 1.

The extrapolation update unit 54 applies a predetermined fitting function to the history, over a predetermined past period, of the positions and orientations recognized by the server 2 through the exchanges between the terminal 1 and the server 2, and estimates the position and orientation at the current time as the extrapolated value.

For example, taking as the history the most recently recognized position and orientation and the position and orientation recognized immediately before that, the position and orientation at the current time are estimated by fitting a linear function on the time axis. A fitting function can be provided for each of the position and the orientation. With a linear fitting function for the orientation, the orientation is estimated as moving at a constant angular velocity about a fixed rotation axis.
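The two-point linear fit can be sketched as below. The pose is assumed to be a flat list such as (x, y, rotation angle about a fixed axis), which is a simplification chosen for illustration; the description does not specify the representation, and `extrapolate_linear` is a hypothetical name.

```python
def extrapolate_linear(t_prev, v_prev, t_last, v_last, t_now):
    """Linearly extrapolate each component of a pose vector to time t_now,
    using the two most recent server-recognized poses as the history."""
    scale = (t_now - t_last) / (t_last - t_prev)
    return [b + (b - a) * scale for a, b in zip(v_prev, v_last)]


# Poses are (x, y, angle in degrees) recognized at t = 0.0 and t = 1.0.
pose_now = extrapolate_linear(0.0, [0.0, 0.0, 10.0],
                              1.0, [2.0, 1.0, 14.0],
                              1.5)
```

Here the angle component advances by 4 degrees per unit time, which corresponds to the constant angular velocity mentioned above.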

The camera work update unit 55 applies to the image the camera work estimation processing disclosed in, for example, Japanese Unexamined Patent Application Publication No. 2007-087049, and estimates the position and orientation by extracting predetermined camera work parameters. That is, camera work can be classified into pan, tilt, zoom, roll, and so on, each of which can be extracted as a parameter. Camera work composed of such predetermined parameters can be modeled by an affine transformation, which is a special case of a planar projective transformation, so the planar projective transformation, that is, the position and orientation, can be estimated from the parameters.
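One way to picture the mapping from camera work parameters to a matrix is the sketch below. It assumes that pan and tilt act as horizontal and vertical image shifts, zoom as a uniform scale, and roll as an in-plane rotation. This parameterization is an assumption made for illustration and is not taken from the cited publication; `camerawork_to_affine` is a hypothetical name.

```python
import math


def camerawork_to_affine(pan, tilt, zoom, roll):
    """Build a 3x3 affine matrix (a special case of a homography) from
    assumed camera work parameters: shift (pan, tilt), scale zoom, and
    in-plane rotation roll in radians."""
    c, s = math.cos(roll), math.sin(roll)
    return [[zoom * c, -zoom * s, pan],
            [zoom * s, zoom * c, tilt],
            [0.0, 0.0, 1.0]]


m = camerawork_to_affine(pan=3.0, tilt=-1.0, zoom=2.0, roll=0.0)
```

The bottom row stays (0, 0, 1), which is what distinguishes an affine transformation from a general planar projective transformation.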

The embodiments of the predetermined timing at which the recognition processing of the server 2 is used are as follows. As described with reference to FIG. 9, updating by the update unit 15 as described above is performed continuously within each interval delimited by the predetermined timing.

In one embodiment, the result of the recognition processing of the server 2 is used each time the number of position and orientation updates by the update unit 15 reaches a predetermined number. This embodiment is preferable when the update unit 15 estimates the change in position and orientation with the sensor update unit 52, because the accumulated sensor error can be cleared at every predetermined timing. It is also preferable when the update unit 15 estimates the position and orientation with the extrapolation update unit 54, because the accuracy of the extrapolation can be maintained.

In one embodiment, the update unit 15 estimates the position and orientation with the calculation update unit 51 or the separate calculation update unit 53, and the recognition result of the server 2 is used when the error in obtaining the planar projective transformation used for that estimation exceeds a certain value. The recognition result of the server 2 may instead be used in the round of processing following the one in which the certain value was exceeded.

In one embodiment, the update unit 15 estimates the position and orientation with the calculation update unit 51 (or the separate calculation update unit 53), and the recognition result of the server 2 is used when a predetermined number or more of the feature points (or marker feature locations) used for that estimation can no longer be detected. The recognition result of the server 2 may instead be used in the round of processing following the one in which they became undetectable. While fewer than the predetermined number of feature points (or marker feature locations) are undetectable, the calculation update unit 51 and the separate calculation update unit 53 obtain the planar projective transformation from the detectable feature points (provided there are four or more) and estimate the position and orientation.

In the two embodiments above in particular, when the imaging target that had been captured in the image disappears from the image because of a change in the direction of the imaging unit 11, movement, or the like, the server 2 can be accessed again so that a different imaging target is recognized in the current input image.
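The three timing conditions described in the embodiments above can be summarized in a small decision function. The description presents them as separate embodiments, so combining them with a logical OR is an assumption of this sketch, as are the function name `should_query_server` and all threshold names.

```python
def should_query_server(update_count, max_updates,
                        fit_error, max_error,
                        lost_points, max_lost):
    """Return True when the terminal should use server recognition again.

    update_count: local updates since the last server result
    fit_error:    error of the current planar projective transformation fit
    lost_points:  number of tracked feature points no longer detectable
    """
    return (update_count >= max_updates   # update count reached
            or fit_error > max_error      # transformation error too large
            or lost_points >= max_lost)   # too many feature points lost


# Fourth local update with a small error and one lost point: keep updating.
keep_local = not should_query_server(4, 5, 0.8, 2.0, 1, 3)
# Imaging target left the frame, so most feature points are lost.
requery = should_query_server(2, 5, 0.8, 2.0, 6, 3)
```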

3 ... information presentation system, 1 ... terminal, 2 ... server, 11 ... imaging unit, 12 ... calculation unit, 13 ... positioning unit, 14 ... control unit, 15 ... update unit, 16 ... output unit, 17 ... (terminal-side) transmission unit, 18 ... (terminal-side) receiving unit, 21 ... recognition unit, 22 ... storage unit, 23 ... extraction unit, 24 ... (server-side) transmission unit, 25 ... (server-side) receiving unit, 51 ... calculation update unit, 52 ... sensor update unit, 53 ... separate calculation update unit, 54 ... extrapolation update unit, 55 ... camera work update unit

Claims

An information presentation system that includes a terminal including an imaging unit that images an imaging target and a server that communicates with the terminal, and presents additional information related to the imaging target at the terminal,
The terminal
A positioning unit that acquires positioning information including the position of the terminal;
A calculation unit that calculates, from the captured image, predetermined feature amounts that are invariant to rotation and scaling, together with their feature point coordinates;
A terminal-side transmitter that transmits the obtained positioning information and the calculated feature amount and feature point coordinates to the server;
A terminal-side receiving unit that receives from the server the position and orientation of the terminal recognized by the server with respect to the imaging target, and additional information;
An update unit for updating the recognized position and orientation;
A control unit for controlling the additional information received from the server according to the updated position and orientation;
An output unit that superimposes and displays the controlled additional information in a predetermined arrangement with respect to the imaging target in the captured image;
The server
A server-side receiving unit that receives positioning information and feature quantities and feature point coordinates transmitted from the terminal-side transmitting unit;
A storage unit that holds, for each of a plurality of predetermined imaging targets and in association with one another, the feature amounts and their feature point coordinates obtained when the target is imaged in a predetermined arrangement, positioning information, and additional information;
An extraction unit that searches the storage unit, selects imaging targets whose positioning information is within a predetermined distance of the acquired positioning information as candidates for the imaging target being imaged by the imaging unit, and extracts them together with their feature amounts and feature point coordinates;
A recognition unit that compares the feature amounts and feature point coordinates of each extracted candidate with the received feature amounts and feature point coordinates, and thereby recognizes which candidate corresponds to the imaging target being imaged by the imaging unit and the position and orientation of that imaging target relative to the terminal;
A server-side transmission unit that transmits to the terminal the recognized position and orientation and the additional information held in the storage unit for the imaging target recognized as corresponding. Information presentation system.

The information presentation system, wherein the positioning unit acquires the positioning information including longitude, latitude, altitude, building floor number, azimuth, elevation angle, or a combination of some or all of these.

The terminal-side transmission unit transmits terminal information including information for identifying the terminal, and the server-side receiving unit receives the terminal information,
The storage unit holds the additional information in a configuration having respective content for each piece of terminal information,
The server-side transmission unit transmits to the terminal, from the content of the additional information held in the storage unit for the imaging target recognized as corresponding, the content that corresponds to the received terminal information. The information presentation system according to claim 1.

The storage unit further associates and holds terminal information for each of the predetermined plurality of imaging targets,
The information presentation system according to claim 3, wherein the extraction unit performs the extraction from among imaging targets for which the received terminal information is present in the terminal information held in the storage unit.

5. The information presentation system according to any one of the claims up to claim 4, wherein the positioning unit acquires the positioning information including the direction in which the imaging unit is imaging, and
the extraction unit searches the storage unit and extracts, as candidates for the imaging target being imaged by the imaging unit and together with their feature amounts and feature point coordinates, imaging targets whose positioning information is within a predetermined distance of the acquired positioning information and falls within a predetermined angle of view when the imaging unit images from the position in the positioning information toward the direction in the positioning information.

The recognition unit associates similar feature amounts between those of each extracted candidate and those received, provisionally calculates, using RANSAC or PROSAC, the planar projective transformation between the feature point coordinates of the associated feature amounts and excludes erroneous correspondences, then recalculates the planar projective transformation between the feature point coordinates of the correct correspondences and obtains its transformation error, recognizes the candidate with the smallest transformation error as corresponding to the imaging target being imaged by the imaging unit, and recognizes the position and orientation of that imaging target relative to the terminal based on the planar projective transformation recalculated for that candidate. The information presentation system according to claim 1.

The update unit
A calculation update unit that updates the recognized position and orientation by tracking the correctly corresponding feature amounts and feature point coordinates obtained by the recognition unit and updating the planar projective transformation obtained at the time of recognition;
A sensor update unit that includes an acceleration sensor for acquiring the acceleration applied to the terminal and a tilt sensor for acquiring the tilt of the terminal, and that updates the recognized position and orientation by estimating the change in the position of the terminal from the accumulated acceleration acquired by the acceleration sensor and estimating the change in the orientation of the terminal from the change in the tilt acquired by the tilt sensor;
A separate calculation update unit that updates the recognized position and orientation by detecting, through edge detection, a predetermined marker provided in advance on the imaging target;
An extrapolation update unit that updates the recognized position and orientation by applying a predetermined fitting function to a predetermined history of the recognized positions and orientations and obtaining the position and orientation at the current time by extrapolation;
A camera work update unit that updates the recognized position and orientation by extracting predetermined camera work parameters from the captured image. The information presentation system according to claim 6.

The recognition unit recognizes the imaging target and its position and orientation at every predetermined timing,
The predetermined timing is determined according to whether:
The update unit has performed the update a predetermined number of times;
An error in obtaining the position and orientation for the update in the calculation update unit or the separate calculation update unit exceeds a certain value; or
A predetermined number or more of the correctly corresponding feature points calculated by the calculation update unit, or of the feature point coordinates corresponding to the predetermined marker detected by the separate calculation update unit, can no longer be calculated or detected. The information presentation system according to claim 7.

9. The information presentation system according to claim 1, wherein the calculation unit calculates the feature amount and the feature point coordinate based on a relative luminance gradient in a local region of the image.

10. The information presentation system according to claim 1, wherein, when the data amount of the feature amounts and feature point coordinates calculated by the calculation unit is larger than the data amount of the captured image, the terminal-side transmission unit transmits the positioning information and the captured image to the server instead of the positioning information, the feature amounts, and the feature point coordinates, and the server performs the function of the calculation unit.