JP2018106611A - AR information display device and program

Info

Publication number: JP2018106611A
Application number: JP2016255475A
Authority: JP
Inventors: Reiko Takizuka (瀧塚 令子); Haruhisa Kato (加藤 晴久)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2016-12-28
Filing date: 2016-12-28
Publication date: 2018-07-05
Anticipated expiration: 2036-12-28
Also published as: JP6687510B2

Abstract

PROBLEM TO BE SOLVED: To provide an AR information display device and program that intuitively guide a user to a predetermined target object from among a plurality of objects. SOLUTION: An AR information display device 20, which performs display guiding the user to a predetermined target object from among a plurality of objects, includes: an imaging unit 1 that performs imaging to obtain a captured image; a recognition unit 2 that recognizes each of the plurality of objects in the captured image; a generation unit 3 that, when the target object is not recognized by the recognition unit 2, generates guidance information to the target object based on the objects that are recognized by the recognition unit 2; and a display unit 4 that displays the guidance information. SELECTED DRAWING: Figure 1

Description

The present invention relates to an AR information display device and program that intuitively guide a user to a predetermined target object from among a plurality of objects.

Conventional techniques that use AR (augmented reality) technology to assist users through guidance or navigation include, for example, the following.

The device disclosed in Patent Document 1 (title of invention: Product Information Providing Terminal Device and Product Information Providing System) is a terminal device that uses AR technology to provide, in real time and from captured images, optimal product information on products displayed in a store. To keep power consumption down efficiently, it provides the product information only after determining that the visitor is not moving or that the visitor's head is stationary.

The device disclosed in Patent Document 2 (title of invention: User Interface for Head Mounted Display) is a user interface technology for an HMD (head mounted display). It acquires the head movement of the user wearing the HMD with a sensor. When the user faces forward, it displays a first screen consisting of the field of view; when the user looks down, it displays a second screen consisting of a map (overhead view) that includes the field of view and the user's position, together with persistent data independent of the physical objects in the field of view.

Japanese Patent No. 5656457
Japanese Patent No. 5909034

However, the conventional techniques described above could not always effectively help the user reach the target object. In Patent Document 1, only information on the product that directly corresponds to the instruction information acquired by the terminal device is displayed, so when many products are lined up it is difficult for the user to check each item of instruction information and find the target product. In Patent Document 2, the map displayed in the field of view of the HMD depicts the physical surface on which the user is located, based on sensor information, and the navigation information is provided based on the user's current position, so it is not necessarily intuitive.

In view of the above problems, an object of the present invention is to provide an AR information display device and program that, when a target object is searched for among a plurality of objects, intuitively guide the user to the predetermined target object by changing the form of the guidance information, including the positions of its start point and end point.

To achieve the above object, the present invention is an AR information display device that performs display guiding a user to a predetermined target object from among a plurality of objects, comprising: an imaging unit that performs imaging to obtain a captured image; a recognition unit that recognizes each of the plurality of objects in the captured image; a generation unit that, when the target object is not recognized by the recognition unit, generates guidance information to the target object based on the objects that are recognized by the recognition unit; and a display unit that displays the guidance information. The present invention is also a program that causes a computer to function as the AR information display device.

According to the present invention, the form of the guidance information to the target object is changed based on the objects recognized by the recognition unit, the situation at the time of recognition, and the purpose of the operation, so that more intuitive guidance information can be generated and displayed.

FIG. 1 is a functional block diagram of an AR information display device according to an embodiment.
FIG. 2 is a functional block diagram of a generation unit according to an embodiment.
FIG. 3 is a schematic diagram for explaining the distinction between Case 2 and Case 3 made by the determination unit.
FIG. 4 is a diagram showing schematic examples of guidance information.
FIG. 5 is a flowchart of the operation of the AR information display device according to an embodiment.
FIG. 6 is a diagram for explaining a second embodiment of the arrow used as guidance information.
FIG. 7 is a diagram for explaining the second embodiment of the arrow used as guidance information.
FIG. 8 is a diagram for explaining a third embodiment of the arrow used as guidance information.
FIG. 9 is a diagram showing how the arrow (and the end point that is set) differs between Case 3 and Case 2.
FIG. 10 is a diagram showing an example arrangement and the like for explaining the generation of guidance information.
FIG. 11 is a diagram showing a schematic example of an embodiment in which the guidance information is composed as an overhead view.
FIG. 12 is a diagram showing, as an example corresponding to FIG. 11, the regions of unrecognized objects on the overhead view covered with a mesh.
FIG. 13 is a diagram showing a schematic example of an embodiment in which, when the guidance information is composed as an overhead view, the end point is always set to the midpoint of the target object.

FIG. 1 is a functional block diagram of an AR information display device according to an embodiment. The AR information display device 20 includes an imaging unit 1, a recognition unit 2, a generation unit 3, a display unit 4, and a storage unit 5. The functions of units 1 to 5 are outlined below.

The imaging unit 1 captures, in response to the user's imaging operation, the real world on which the AR display is to be performed, and outputs the captured image to the recognition unit 2. As indicated by the branching dotted line L1, in some embodiments the captured image is output not only to the recognition unit 2 but also to the generation unit 3. The imaging unit 1 captures video (moving images) continuously at a predetermined rate, and the captured image serving as the frame image at each time is output to the recognition unit 2 (and, depending on the embodiment, to the generation unit 3). An ordinary digital camera can be used as the hardware implementing the imaging unit 1.

The recognition unit 2 analyzes the captured image obtained by the imaging unit 1, recognizes in it each of the plurality of objects registered in advance in the storage unit 5, estimates the position and orientation of each recognized object in the real world, and outputs the recognition result (including the estimated position and orientation) to the generation unit 3. Existing techniques in the AR field can be used for this processing.

Specifically, for the recognition processing, local features defined around feature points, such as SIFT features, can be used to determine which object is present at which location in the captured image, that is, to determine the coordinates of each object in the screen coordinate system. The local features of each object are registered in the storage unit 5 in advance; the recognition unit 2 extracts local features from the captured image and matches them against the registered local features of each object, thereby recognizing which objects are present in the captured image. Furthermore, a homography matrix can be obtained that converts the coordinates of predetermined points of a recognized object from the world coordinate system to the screen coordinate system. To make this possible, the storage unit 5 stores in advance, for each object, the positions of four or more points in a predetermined world coordinate system. As is well known in the AR field, obtaining the homography matrix is equivalent to estimating the position and orientation of the recognized object relative to the camera constituting the imaging unit 1.
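The homography step described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the four registered points lie on a plane so that each can be written as a 2D point (X, Y), and that the matching screen points have already been found by feature matching, which is not shown. The function names `solve_linear`, `homography_from_4_points` and `apply_homography` are hypothetical and do not come from the specification.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4_points(world: Sequence[Point],
                             screen: Sequence[Point]) -> Matrix:
    """Return H (with H[2][2] fixed to 1) such that screen ~ H (X, Y, 1)."""
    a: List[List[float]] = []
    b: List[float] = []
    for (X, Y), (u, v) in zip(world, screen):
        # Two linear equations per correspondence (direct linear transform).
        a.append([X, Y, 1.0, 0.0, 0.0, 0.0, -u * X, -u * Y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, X, Y, 1.0, -v * X, -v * Y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H: Matrix, p: Point) -> Point:
    """Map a planar world point to screen coordinates."""
    X, Y = p
    w = H[2][0] * X + H[2][1] * Y + H[2][2]
    return ((H[0][0] * X + H[0][1] * Y + H[0][2]) / w,
            (H[1][0] * X + H[1][1] * Y + H[1][2]) / w)
```

With more than four correspondences, a least-squares or robust estimator would normally replace the exact 8x8 solve used here.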

Based on the recognition result obtained by the recognition unit 2, the generation unit 3 acquires or estimates the screen-coordinate position of a predetermined point of the target object and generates guidance information. If the target object cannot be recognized, the generation unit 3 uses the homography matrix of a recognized object located near the target object. By transforming the world-coordinate position of the predetermined point of each object stored in the storage unit 5 with this homography matrix, the screen-coordinate position of the predetermined point of the target object under the same conditions can be estimated. The generation unit 3 generates guidance information based on the acquired or estimated position of the target object and outputs it to the display unit 4. The display unit 4 displays the guidance information thus obtained to the user as an AR display.
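As a rough illustration of this estimation, the sketch below projects the registered world points of an unrecognized target through the homography obtained from a recognized neighbor. It assumes that both objects lie on a common plane and that world points are given as 2D coordinates on that plane. The names `project`, `estimate_target_center` and `inside_image`, and the numeric values in the usage note, are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]


def project(H: Matrix, p: Point) -> Point:
    """Apply a 3x3 homography to a planar world point."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def estimate_target_center(H_neighbor: Matrix,
                           target_world_points: Sequence[Point]) -> Point:
    """Estimate the target's screen position from a neighbor's homography."""
    pts = [project(H_neighbor, p) for p in target_world_points]
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))


def inside_image(p: Point, width: int, height: int) -> bool:
    """True if the estimated position falls inside the captured image."""
    return 0 <= p[0] < width and 0 <= p[1] < height
```

For example, with `H_neighbor = [[100, 0, 100], [0, 100, 100], [0, 0, 1]]` and target corners `(5, 0), (6, 0), (6, 1), (5, 1)`, the estimated center is `(650.0, 150.0)`, which lies to the right of a 640x480 image.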

In the present invention in particular, when the target object is not recognized by the recognition unit 2, that is, when the target object is not captured by the imaging unit 1 (the target object may also be captured but go unrecognized because of a recognition error in the recognition unit 2 caused by noise or the like, but that case is not considered here), the generation unit 3 generates guidance information based on the objects other than the target object that are recognized by the recognition unit 2, and the display unit 4 displays the guidance information as AR, thereby realizing intuitive guidance for the user. Examples of guidance information include AR display information using icons such as an arrow or a thick frame surrounding the target object. Details are given later.

The display unit 4 can display the guidance information in association with the captured image obtained by the imaging unit 1 (for example, in terms of positional relationship), using any of the various existing display modes used in AR technology.

In one embodiment, as indicated by the dotted line L1 in FIG. 1 (and in FIG. 2, described later), the generation unit 3 receives the captured image obtained by the imaging unit 1, superimposes the guidance information as an AR display on the captured image, and outputs the result to the display unit 4, which displays the captured image with the guidance information superimposed. In this case, a liquid crystal display or other ordinary display can be used as the hardware implementing the display unit 4, and the AR information display device 20 can be implemented as, for example, a tablet or other information terminal device equipped with a camera and a display.

In another embodiment, the AR information display device 20 may be implemented as a head mounted display or the like, with the see-through display used in such a device serving as the hardware implementing the display unit 4. In this case, the field of view of the image captured by the imaging unit 1 is associated in advance, according to existing techniques, with the user's field of view through the see-through display, so that the guidance information can be superimposed as AR information on the real world that the user views through the see-through display and presented to the user. Also in this case, the flow indicated by the dotted line L1 in FIG. 1 (and in FIG. 2, described later), in which the captured image obtained by the imaging unit 1 is output to the generation unit 3, is omitted.

As indicated by the dotted line L3 in FIG. 1 (and in FIG. 2, described later), in some embodiments the generation unit 3 may output to the recognition unit 2 instruction information that reflects its assessment of the situation at the time the guidance information was generated, and have the recognition unit 2 perform recognition processing according to that instruction information. For example, the generation unit 3 may notify the recognition unit 2 that a specific object already recognized at some point in time should thereafter be followed by tracking, for example using template matching, and that recognition using local features should be omitted for it.
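One way to picture such instruction information is as a per-object choice of processing mode. The sketch below is only an assumption about how the instruction might be represented: `Mode` and `plan_recognition` are hypothetical names, and neither the feature matching nor the template-matching tracker is implemented here.

```python
from enum import Enum
from typing import Dict, Iterable


class Mode(Enum):
    FEATURE_MATCHING = "feature_matching"  # full recognition with local features
    TRACKING = "tracking"                  # lighter follow-up, e.g. template matching


def plan_recognition(all_ids: Iterable[int],
                     already_recognized: Iterable[int]) -> Dict[int, Mode]:
    """Build instruction information: track known objects, match the rest."""
    known = set(already_recognized)
    return {i: (Mode.TRACKING if i in known else Mode.FEATURE_MATCHING)
            for i in all_ids}
```

Here objects that were recognized in an earlier frame are switched to tracking, while all others continue to go through local-feature recognition.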

The storage unit 5 stores the information needed for the recognition and position/orientation estimation of each object in the recognition unit 2 and for the generation of guidance information in the generation unit 3, registered in advance by an administrator or the like, and provides it as stored information when the recognition unit 2 and the generation unit 3 perform their processing.

For example, as described above, the recognition unit 2 recognizes each of the plurality of objects (one of which is the target object) in the captured image and estimates its position and orientation. The storage unit 5 stores the features used to recognize each object (local features and the like) and its coordinate position in the predetermined world coordinate system, both of which are needed for this.

In one embodiment, the position and orientation of each object i (i = 0, 1, 2, ..., where i serves as the object identifier) registered in the storage unit 5 can be registered as the positions of four corner points of object i in the world coordinate system, ((x0i, y0i, z0i), (x1i, y1i, z1i), (x2i, y2i, z2i), (x3i, y3i, z3i)), in the same way as when registering square markers and the like, which are well known in the AR field. (As is well known, the four points need not form a square; any four points from which the homography transformation between the four points detected in the image and the four registered points can be computed may be registered. More than four points may also be registered.)

In addition, the priority (ε0, ε1, ε2, ε3, ..., εi, ...) of each object i as a start point candidate (start points are described later), as well as various thresholds for selecting start point candidates, can be registered in the storage unit 5 in advance. Further details of the individual items of stored information are given where appropriate in the detailed description of the processing of the recognition unit 2 and the generation unit 3.
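The registered information described above could be held in a record such as the following sketch. The field names (`object_id`, `world_points`, `start_priority`, `is_target`) and the helper `build_registry` are hypothetical; the specification only states what is stored (identifier i, four or more world-coordinate points, and the priority εi), not how, and the recognition features are omitted here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


@dataclass(frozen=True)
class RegisteredObject:
    object_id: int                      # identifier i
    world_points: Tuple[Point3D, ...]   # four or more points in world coordinates
    start_priority: float               # priority ε_i as a start point candidate
    is_target: bool = False             # whether this object is the target object

    def __post_init__(self) -> None:
        # A homography needs at least four point correspondences.
        if len(self.world_points) < 4:
            raise ValueError("at least four world points are required")


def build_registry(objects: List[RegisteredObject]) -> Dict[int, RegisteredObject]:
    """Index the registered objects by identifier."""
    return {o.object_id: o for o in objects}
```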

FIG. 2 is a functional block diagram showing the individual functions of the generation unit 3 according to an embodiment. The generation unit 3 includes a determination unit 31, a target estimation unit 32, an end point determination unit 33, a start point determination unit 34, a current state holding unit 35, and an information generation unit 36. The functions of units 31 to 36 are outlined below.

First, the determination unit 31 determines which of the following cases the recognition result obtained from the recognition unit 2 corresponds to.
(Case 1) At least one object is recognized in the captured image, and the target object is not among the recognized objects.
(Case 2) At least one object is recognized in the captured image, the target object is among the recognized objects, and the target object is within the display range of the display unit 4.
(Case 3) At least one object is recognized in the captured image, the target object is among the recognized objects, and the target object is outside the display range of the display unit 4.
(Case 4) No object is recognized in the captured image.
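The four cases can be expressed as a small decision function. The sketch below assumes, purely for illustration, that each recognized object is represented by a single screen point and that the display range is an axis-aligned rectangle; the names `Case` and `classify` are hypothetical.

```python
from enum import Enum
from typing import Dict, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


class Case(Enum):
    TARGET_NOT_RECOGNIZED = 1   # Case 1
    TARGET_IN_DISPLAY = 2       # Case 2
    TARGET_OUTSIDE_DISPLAY = 3  # Case 3
    NOTHING_RECOGNIZED = 4      # Case 4


def classify(recognized: Dict[str, Point], target_id: str,
             display_range: Rect) -> Case:
    """Map a recognition result to one of the four cases."""
    if not recognized:
        return Case.NOTHING_RECOGNIZED
    if target_id not in recognized:
        return Case.TARGET_NOT_RECOGNIZED
    x, y = recognized[target_id]
    left, top, right, bottom = display_range
    if left <= x <= right and top <= y <= bottom:
        return Case.TARGET_IN_DISPLAY
    return Case.TARGET_OUTSIDE_DISPLAY
```

When the imaging range and the display range coincide, the last branch is never taken for a target that lies inside the captured image.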

FIG. 3 is a schematic diagram showing an example for explaining the distinction between Case 2 and Case 3. In FIG. 3, the imaging range R1 of the imaging unit 1 is shown as the rectangular range of the captured image, and the display range R4 of the display unit 4 (that is, the range in which the guidance information generated by the generation unit 3 can be displayed) is shown, in the screen coordinate system, as a narrower range contained inside that rectangle. A configuration in which the imaging range R1 is wider than the display range R4 can arise, depending on the implementation, when for example the AR information display device 20 is implemented as a head mounted display and the display unit 4 as its see-through display.

In the example of FIG. 3, one recognized object O2 lies within the display range R4 (and therefore also within the imaging range R1), and one recognized object O3 lies within the imaging range R1 but outside the display range R4. If object O2 is the target object, Case 2 applies; if object O3 is the target object, Case 3 applies.

As is clear from the example of FIG. 3, if, unlike in FIG. 3, the imaging range R1 and the display range R4 coincide, a situation corresponding to Case 3 never occurs. Therefore, when the AR information display device 20 is implemented as, for example, a tablet with a camera and a display that shows the entire camera image, so that the imaging range R1 and the display range R4 coincide, the determination unit 31 may skip the check for Case 3.

The information generation unit 36 generates guidance information according to which of Cases 1 to 4 the determination unit 31 has determined and according to the recognition result of the recognition unit 2, and outputs it to the display unit 4 as the final output of the generation unit 3. Units 32 to 35 (the target estimation unit 32, the end point determination unit 33, the start point determination unit 34, and the current state holding unit 35), or any subset of them, perform the elemental processing needed for the information generation unit 36 to generate the guidance information, according to the determined case and the recognition result, and provide the results to the information generation unit 36. The information generation unit 36 can then generate guidance information suited to the situation from these results.

The processing of units 32 to 35 is outlined below. As described later, various embodiments are possible for how the information generation unit 36 generates guidance information from the processing of units 32 to 35.

When Case 1 applies, the target estimation unit 32 estimates the position of the target object, which is not in the captured image (a position in the screen coordinate system outside the range of the captured image), based on the successfully recognized objects other than the target object that are in the captured image.

The end point determination unit 33 determines the end point of the arrow constituting the guidance information (the position of the arrowhead) based on, among other things, the estimation result of the target estimation unit 32. The start point determination unit 34 determines the start point of the arrow constituting the guidance information (the position of the base of the arrow, opposite the arrowhead).

In embodiments in which the end point determination unit 33 determines the end point and the start point determination unit 34 determines the start point, the information generation unit 36 can generate an arrow whose arrowhead and base are the determined end point and start point, respectively, and use it as guidance information for guiding the user to the target object. (Strictly, this is an "image of an arrow" displayed by the display unit 4 in one of the display modes, but it is referred to below simply as an "arrow".)
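A minimal sketch of the arrow as a data structure follows. The `Arrow` class and the helper `clamp_to_rect` are hypothetical; in particular, clamping an off-screen estimated target position to the display rectangle is only one conceivable way to obtain an on-screen end point and is not stated in this passage, which leaves the end point rules to the later embodiments.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


@dataclass(frozen=True)
class Arrow:
    start: Point  # base of the arrow
    end: Point    # arrowhead

    @property
    def length(self) -> float:
        return math.hypot(self.end[0] - self.start[0],
                          self.end[1] - self.start[1])

    @property
    def angle_deg(self) -> float:
        """Direction in degrees, measured from the +x axis of the screen."""
        return math.degrees(math.atan2(self.end[1] - self.start[1],
                                       self.end[0] - self.start[0]))


def clamp_to_rect(p: Point, rect: Rect) -> Point:
    """Assumed strategy: pull an off-screen point back onto the display edge."""
    left, top, right, bottom = rect
    return (min(max(p[0], left), right), min(max(p[1], top), bottom))
```

For instance, an estimated target at `(650.0, 150.0)` clamped to a `(0, 0, 640, 480)` display gives an end point on the right edge, so an arrow starting at `(200.0, 150.0)` points horizontally to the right.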

The current state holding unit 35 is used in embodiments in which the information generation unit 36 composes the guidance information as an overhead view. It holds, as current state information, which objects in the overhead view the user is currently looking at (that is, which objects are currently captured by the imaging unit 1 and recognized by the recognition unit 2), and updates this information in real time. In these embodiments, the information generation unit 36 refers to the current state information to generate an overhead view that reflects the user's current field of view, displaying the objects the user currently sees (those currently recognized) so that they are distinguished from the objects not currently recognized, and generates guidance information based on it.
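The role of the current state information can be sketched as a set of currently recognized identifiers that is replaced on every frame and consulted when the overhead view is drawn. The names `CurrentState` and `render_overhead`, and the text rendering with square brackets for visible objects, are illustrative assumptions only.

```python
from typing import Iterable, List, Set


class CurrentState:
    """Holds which objects are recognized in the current frame."""

    def __init__(self) -> None:
        self.visible: Set[str] = set()

    def update(self, recognized_ids: Iterable[str]) -> None:
        # Replace, rather than accumulate, so the state follows the live view.
        self.visible = set(recognized_ids)


def render_overhead(rows: List[List[str]], state: CurrentState) -> List[str]:
    """Render a text overhead view; currently visible objects are bracketed."""
    return [" ".join(f"[{o}]" if o in state.visible else f" {o} " for o in row)
            for row in rows]
```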

FIG. 4 shows schematic examples of the guidance information that the AR information display device 20, configured as in FIGS. 1 and 2, generates and displays according to the recognition status of objects in the captured image.

The upper part of FIG. 4 shows, as a schematic example of the real world W in which guidance by AR display is performed, a plurality of objects O11 to O63 to be recognized arranged on the front of a shelf R. The lower part shows, separately from the real world W in the upper part, schematic examples of the guidance information G1 and G2 for the cases where the user U, capturing images in the real world W with the AR information display device 20, is at position P1 and at position P2. That is, the lower part is not a schematic example of the real world W itself but of the guidance information generated for it. The user U in the real world W faces the shelf R while capturing images with the AR information display device 20, and the guidance information G1 and G2 obtained on the device as a result is shown in the lower part of FIG. 4. (Note that the lower part of FIG. 4 does not depict the real world W, and the user U is not capturing images toward the lower part of the figure.)

In FIG. 4, the 18 objects O11 to O63 in the real world W in the upper part are arranged roughly in a grid on the roughly planar front of the shelf R. For convenience in describing their positional relationships, the object in the i-th position horizontally (counting from left to right, 1 ≤ i ≤ 6) and the j-th position vertically (counting from top to bottom, 1 ≤ j ≤ 3) is labeled "Oij".
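The labeling convention can be reproduced with a few lines of code. The function name `grid_labels` is hypothetical; the grid size of 6 columns by 3 rows follows the example of FIG. 4.

```python
from typing import List


def grid_labels(columns: int = 6, rows: int = 3) -> List[List[str]]:
    """Return labels "Oij": i counts columns left to right, j rows top to bottom."""
    return [[f"O{i}{j}" for i in range(1, columns + 1)]
            for j in range(1, rows + 1)]
```

The result is indexed as `labels[j - 1][i - 1]`, so the object at the right end of the middle row is `labels[1][5]`, that is, "O62".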

Each of the objects O11 to O63 can be individually recognized by the recognition unit 2 because its feature information and its position and orientation in a common predetermined world coordinate system are registered in the storage unit 5. One predetermined object among O11 to O63 is the target object to which the user is to be guided, and identification information marking it as the target object, distinguishing it from the other objects, may also be registered in the storage unit 5. In the example of FIG. 4, the object O62, at the right end of the middle row, is assumed to be registered as the target object. Instead of being registered in the storage unit 5 in advance, the target object may be set on the spot by the user, the user's instructor, or the like.

According to the AR information display device 20 of the present invention, the user can be guided intuitively and efficiently to the target object O62 from among the many objects O11 to O63, as in the schematic example of FIG. 4.

In FIG. 4, the user U, initially shooting from position P1, captures the image PC1 shown below that position. The captured image PC1 contains at least one of the objects O11 to O63 but not the target object O62, so it corresponds to Case 1. In this case, the guidance information G1 is generated by generating the arrow A1 and expressing that the target object has not yet been reached. FIG. 4 does not show a specific example of that expression, but it can be conveyed by text or another AR message, or by the way the arrow A1 is rendered (color, shape, and so on). The arrow A1 is generated with its start point near the object O22 and its end point near the object O42. The guidance information G1 thus effectively tells the user U at position P1 that the target object is not currently visible but lies further to the right.

Next, the user U, having moved to position P2 on the right as directed by the guidance information G1, captures the image PC2 shown below that position. The captured image PC2 contains at least one of the objects O11 to O63 and also the target object O62, so it corresponds to Case 2. (For simplicity, the AR information display device 20 is assumed here to be implemented on the aforementioned tablet or the like, so that Case 3 does not arise.) In this case, the guidance information G2 is generated by generating the arrow A2 together with a highlight B2, such as the thick frame shown in the figure, indicating that the target object O62 has been reached. The arrow A2 is generated with its start point near the object O42 and its end point near the target object O62. The guidance information G2 thus effectively tells the user U at position P2 that moving to the right has brought the user to the target object O62, and which object in the image PC2 is the target object O62. In the guidance information G1 described above, the absence of a highlight such as B2 can itself express that the target object O62 has not yet been reached.

As described above, the AR information display device 20 of the present invention provides the following effect: even when the target object has not yet been recognized, the user can be guided intuitively to it by making use of the other, non-target objects that have been recognized.

The example of FIG. 4 assumes the ideal case in which every object in the captured images PC1 and PC2 is recognized. In practice, as long as at least some of them are recognized, the present invention can guide the user intuitively to the target object just as in the example of FIG. 4. Situations in which only some of the objects in the captured image are recognized are presented later in FIGS. 11 to 13.

The following are real-world examples in which the situation of FIG. 4 arises and for which the AR information display device 20 of the present invention is well suited. In product display or picking work, for example, products stay in fixed positions for a certain period, so information indicating the relative positions of the individual objects can be measured in advance and registered in the storage unit 5.

Specifically, suppose that the objects are drawers arranged side by side in a product display or the like, and that the user is to put an item into, or take one out of, a target drawer. If the objects O11 to O63 in FIG. 4 are drawers spread over a wide area as in that figure, then at first (when the user is at position P1) the user has no idea from which direction or position to start looking. It is therefore convenient for the user to be given some information for reaching the target object even when the target drawer itself cannot be recognized directly. In the present invention, the relative positions of all objects (expressible as the relative positions and orientations between objects, derived from the position and orientation of each object registered in the common coordinate system) are registered in the storage unit 5 in advance. When a non-target object is recognized, the position of the target object can then be estimated from the recognized object and the user's line of sight guided accordingly. That is, as in the example of FIG. 4, the user U can be guided from the line of sight at position P1 to the line of sight at position P2.

A similar example is maintenance work in a server room where many servers and the like are arranged in racks. The present invention is also suitable for any other similar situation.

Although the example of FIG. 4 has been described in terms of images PC1 and PC2, a user who receives the corresponding guidance information G1 and G2 through a display unit 4 configured as a see-through display or the like perceives them as scenes PC1 and PC2.

In the following description, unless otherwise noted, the guidance information generated by the generation unit 3 (the information generation unit 36 in FIG. 2) means content superimposed on the captured image when the display unit 4 is an ordinary liquid crystal display or the like, and content superimposed on the scene when the display unit 4 is a see-through display or the like. The description does not distinguish between the two, on the premise that the same content can be superimposed in both cases. In Case 3 described above, however, the guidance information is specific to see-through displays.

FIG. 5 is a flowchart of the operation of the AR information display device 20 according to one embodiment.

First, as to the overall structure of FIG. 5: steps S12, S13, and S21 are the steps in which the determination unit 31 distinguishes Cases 1 to 4 described above. Steps S14 to S18 are executed for Case 1, steps S26 to S28 for Case 2, steps S36 to S38 for Case 3, and steps S47 to S48 for Case 4.

The flow of FIG. 5 is applied to the image captured by the imaging unit 1 at each time, so that each captured image is classified as one of Cases 1 to 4 and guidance information appropriate to that case is generated and displayed. A user of the AR information display device 20 who shoots in real time is thus provided with guidance information that is generated (and updated) in real time. The frame rate of the imaging unit 1, and the corresponding rate at which guidance information is generated and displayed, may be set as desired. Guidance information may be generated and displayed for every captured image of the video shot by the imaging unit 1, or only for frames thinned out from that video at a predetermined ratio.
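The frame thinning mentioned above can be sketched as follows. This is only a minimal illustration: the function name `thin_frames` and the parameter `keep_every` are hypothetical, and the description does not prescribe how the thinning ratio is chosen.

```python
from typing import Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def thin_frames(frames: Iterable[Frame], keep_every: int) -> Iterator[Frame]:
    """Yield one frame out of every `keep_every` frames (hypothetical helper).

    keep_every=1 corresponds to generating guidance for every captured image.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    for index, frame in enumerate(frames):
        if index % keep_every == 0:
            yield frame
```

Each frame yielded here would then go through the flow of FIG. 5, while the skipped frames would keep showing the most recent guidance information (that display policy is likewise an assumption).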

The details of each unit of the AR information display device 20 (in particular, the generation and display of guidance information) are described below along with each step of FIG. 5. Although the flow of FIG. 5 branches into Cases 1 to 4 as explained above, the steps are described in the order Case 1, Case 2, Case 3, Case 4.

In step S11, the imaging unit 1 captures an image at the current time, and the recognition unit 2 applies the recognition process described above to that image: it identifies which of the predetermined objects stored in advance in the storage unit 5 appear in the image and estimates the position and orientation of each. With this recognition result obtained, the process proceeds to step S12.

In step S12, the determination unit 31 determines whether at least one object (among those stored in advance in the storage unit 5) was recognized in the recognition result of step S11. If at least one object was recognized, the process proceeds to step S13; if no object was recognized, it proceeds to step S47.

In step S13, the determination unit 31 determines whether the target object is among the objects recognized in step S11. If the target object is present, the process proceeds to step S21; if not, it proceeds to step S14.

As noted above, one of the predetermined objects stored in the storage unit 5 is set and registered in advance as the target object, so the determination unit 31 can make the determination of step S13 by referring to that registration.
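The branching of steps S12 and S13 can be sketched as follows. The names `Branch` and `classify_frame` are hypothetical, and the sketch deliberately stops at step S21: the criterion that separates Case 2 from Case 3 is not given in this passage, so both are returned as a single branch.

```python
from enum import Enum
from typing import Iterable


class Branch(Enum):
    NO_OBJECT = "case 4"          # S12 negative: proceed to S47
    TARGET_NOT_SEEN = "case 1"    # S13 negative: proceed to S14
    TARGET_SEEN = "case 2 or 3"   # S13 affirmative: proceed to S21 (not modeled)


def classify_frame(recognized_ids: Iterable[str], target_id: str) -> Branch:
    """Classify one captured image from the recognition result of step S11."""
    recognized = set(recognized_ids)
    if not recognized:                 # step S12: nothing recognized
        return Branch.NO_OBJECT
    if target_id not in recognized:    # step S13: target not among recognized objects
        return Branch.TARGET_NOT_SEEN
    return Branch.TARGET_SEEN
```

For the example of FIG. 4, an image like PC1 would fall under `Branch.TARGET_NOT_SEEN` and an image like PC2 under `Branch.TARGET_SEEN`.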

Steps S14 to S18, which apply in Case 1, are described next.

In step S14, the target estimation unit 32 selects, from among the objects successfully recognized in step S11, an estimation source object that serves as a foothold for estimating the position of the target object lying outside the captured image. The process then proceeds to step S15.

The selection of the estimation source object by the target estimation unit 32 in step S14 can be implemented in various ways, as follows.

In the first embodiment, the recognized object whose distance from the target object in the world coordinate system is smallest is selected as the estimation source object; the distance is calculated from the position coordinates of each object in the predetermined world coordinate system stored in the storage unit 5. If, as described above, the spatial coordinates of the four corners of each object are registered as its position and orientation, a predetermined point computed from those four points (for example, the centroid) can be used as the position of the object.

In the second embodiment, the recognized object with the smallest error in the position and orientation estimated during recognition in step S11 (for example, a predetermined type of numerical error arising when the planar projective transformation matrix is computed) is selected as the estimation source object.

In the third embodiment, the recognized object that maximizes an overall score based on both the world-coordinate distance of the first embodiment and the error of the second embodiment is selected as the estimation source object. The overall score may be computed with a predetermined evaluation formula whose value increases as the distance decreases and as the error decreases.
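The three selection strategies above can be sketched as follows. All names here (`Recognized`, `select_by_distance`, and so on) are hypothetical. The score `1 / (1 + w_dist * distance + w_err * error)` is only one possible evaluation formula that grows as distance and error shrink; the description does not specify the formula or its weights.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Point3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Recognized:
    name: str
    corners: Tuple[Point3, Point3, Point3, Point3]  # registered world coordinates
    pose_error: float                               # error of the pose estimated in S11


def centroid(corners: Sequence[Point3]) -> Point3:
    """Representative position of an object: centroid of its four corners."""
    n = len(corners)
    return (sum(c[0] for c in corners) / n,
            sum(c[1] for c in corners) / n,
            sum(c[2] for c in corners) / n)


def distance_to_target(obj: Recognized, target: Point3) -> float:
    return math.dist(centroid(obj.corners), target)


def select_by_distance(objs: Sequence[Recognized], target: Point3) -> Recognized:
    """First embodiment: the recognized object nearest to the target in world coordinates."""
    return min(objs, key=lambda o: distance_to_target(o, target))


def select_by_error(objs: Sequence[Recognized]) -> Recognized:
    """Second embodiment: the recognized object with the smallest pose error."""
    return min(objs, key=lambda o: o.pose_error)


def select_by_score(objs: Sequence[Recognized], target: Point3,
                    w_dist: float = 1.0, w_err: float = 1.0) -> Recognized:
    """Third embodiment: maximize a score that grows as distance and error shrink."""
    def score(o: Recognized) -> float:
        return 1.0 / (1.0 + w_dist * distance_to_target(o, target) + w_err * o.pose_error)
    return max(objs, key=score)
```

The weights `w_dist` and `w_err` let the same formula lean toward either criterion; with a large `w_err`, the result approaches that of the second embodiment.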

In step S15, the target estimation unit 32 estimates the position that the target object occupies in screen coordinates, based on the estimation source object selected in step S14, and the process proceeds to step S16. Because the target object is not present in the captured image, the estimated position normally (that is, unless a large error in the pose calculation due to noise or the like has occurred) lies outside the range that the captured image occupies in screen coordinates.

The screen coordinates of the target object can be estimated in step S15 as follows. The notation is defined first. The estimation source object is identified by the index i[source] and the target object by the index i[target]. The planar projective transformation matrix representing the position and orientation of the estimation source object i[source] in the captured image is assumed to have been obtained as a measured value, H_{(measured) i[source]}, by the recognition process of step S11. For an object i (i being the index of an arbitrary object), q_i denotes its screen coordinates in the captured image and Q_i denotes its world coordinates registered in advance in the storage unit 5.

In the first embodiment, on the assumption that the matrix H_{(measured) i[source]} measured for the estimation source object i[source] also approximately represents the position and orientation of the target object i[target], the screen coordinates q_{i[target]} of the target object can be obtained as follows.
q_{i[target]} = H_{(measured) i[source]} Q_{i[target]}

In the second embodiment, the above approximation is not applied, and the screen coordinates q_{i[target]} of the target object i[target] can be obtained as follows.
q_{i[target]} = T(i[source]→i[target]) H_{(measured) i[source]} Q_{i[target]}
Here, T(i[source]→i[target]) can be obtained as the matrix that converts, in the world coordinates registered in advance in the storage unit 5, the coordinates Q_{i[source]} of the estimation source object i[source] into the coordinates Q_{i[target]} of the target object i[target], as follows.
Q_{i[target]} = T(i[source]→i[target]) Q_{i[source]}
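The first of the two estimates above, applying the measured matrix of the estimation source object directly to the registered coordinates of the target object, can be sketched as follows. The sketch assumes that the world coordinates are 2D coordinates on the plane of the shelf front, used in homogeneous form (X, Y, 1), and that H is a 3x3 matrix; the function names are hypothetical.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]
Point2 = Tuple[float, float]


def project_to_screen(h: Matrix3, world_point: Point2) -> Point2:
    """Apply a 3x3 planar projective transformation to a point on the plane."""
    x, y = world_point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this transformation")
    return (u / w, v / w)  # divide out the homogeneous scale


def is_inside_image(point: Point2, width: int, height: int) -> bool:
    """True if the screen position falls within the captured image."""
    return 0 <= point[0] < width and 0 <= point[1] < height
```

In Case 1 the projected position of the target object would normally fail `is_inside_image`, which is consistent with the remark for step S15 that the estimate lies outside the captured image.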

In step S16, the start point determination unit 34 determines a start point and the end point determination unit 33 determines an end point, as information that the information generation unit 36 needs in order to generate guidance information in the following step S17. The process then proceeds to step S17.

In step S16, the start point determination unit 34 can determine the start point, whose position is defined on the two-dimensional display area of the display unit 4, by any of the following embodiments.

In the first embodiment, let γ be the distance from the camera (the camera hardware constituting the imaging unit 1) to the estimation source object selected in step S14, γ0 the user's line-of-sight distance, and α0 a predetermined threshold for the determination. The start point is then determined by the following cases (1) and (2). The distance γ can be obtained from the position and orientation that the recognition unit 2 computed for the estimation source object. The value of the line-of-sight distance γ0 may be a fixed value stored in the storage unit 5, the average of the distances obtained for the objects that the recognition unit 2 recognized and whose position and orientation it computed, or the distance obtained for the recognized object closest to the center of the captured image.
(1) If |γ−γ0| ≤ α0, the viewpoint (as obtained in the screen coordinate system) is taken as the start point. The viewpoint may be set in advance as a predetermined point (for example, the center) in the display area of the display unit 4, or, when an HMD or the like is used and a gaze sensor is available, it may be the position acquired by that sensor.
(2) If |γ−γ0| > α0, the start point is the midpoint (in the screen coordinate system) of the recognized object closest to the viewpoint in the world coordinate system. The midpoint of an object may be set in advance as a predetermined point (for example, the centroid) within the region the object occupies. As with the line-of-sight distance γ0, the position of the viewpoint in the world coordinate system may be a fixed position stored in the storage unit 5, the average of the positions obtained for the objects that the recognition unit 2 recognized and whose position and orientation it computed, or the world-coordinate position obtained for the recognized object whose screen-coordinate position is closest to the center of the captured image.

In other words, the first embodiment distinguishes cases (1) and (2) according to whether the world-coordinate position of the estimation source object, which is an object close to the target object, is near to or far from the world-coordinate position of the line of sight. When it is near, the line-of-sight position is judged to serve as is as the guide position (the start point of the arrow) for guiding the user; when it is far, guidance is made more reliable by setting a concrete object near the line-of-sight position as the guide position (the start point of the arrow).
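The case distinction (1)/(2) can be sketched as follows. The names `Candidate` and `choose_start_point` are hypothetical; how γ0 and the world-coordinate viewpoint are obtained (fixed value, average, or object nearest the image center) is left to the caller.

```python
import math
from typing import NamedTuple, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


class Candidate(NamedTuple):
    screen_mid: Point2   # midpoint of a recognized object in screen coordinates
    world_pos: Point3    # position of the same object in world coordinates


def choose_start_point(gamma: float, gamma0: float, alpha0: float,
                       viewpoint_screen: Point2, viewpoint_world: Point3,
                       recognized: Sequence[Candidate]) -> Point2:
    """Return the arrow start point according to cases (1) and (2)."""
    if abs(gamma - gamma0) <= alpha0:
        # Case (1): the viewpoint itself serves as the start point.
        return viewpoint_screen
    if not recognized:
        raise ValueError("case (2) needs at least one recognized object")
    # Case (2): midpoint of the recognized object nearest the viewpoint in world coordinates.
    nearest = min(recognized, key=lambda c: math.dist(c.world_pos, viewpoint_world))
    return nearest.screen_mid
```

In Case 1 at least one object is always recognized, so the error branch is only a safeguard for misuse.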

In the second embodiment, no such case distinction is made, and the start point is determined according to one of the following settings (3) to (7).
(3) The setting of case (1) of the first embodiment is always used. That is, the viewpoint is the start point. As in the first embodiment, the viewpoint may be a predetermined point or a point obtained by a gaze sensor.
(4) The setting of case (2) of the first embodiment is always used. That is, the start point is the midpoint of the recognized object closest to the viewpoint in the world coordinate system. The midpoint is the same kind of predetermined point as described for the first embodiment; the same applies below.
(5) The midpoint of the estimation source object selected in step S14 is the start point.

(6) Among the objects recognized in step S12, the midpoint of the object with the highest saliency, that is, the object whose features differ most from its surroundings, is the start point. The saliency may be computed with existing image processing methods, such as comparing differences in average color, color histogram, shape, or orientation. In this embodiment, the recognition unit 2 additionally computes the saliency of the recognized objects in step S12.

(7) A priority εi is registered in the storage unit 5 for each object i as described above, and among the objects recognized in step S12, the midpoint of the object with the largest priority εi is the start point. The priorities may be registered in the storage unit 5 in advance as a table reflecting user attributes (for example, giving a higher priority to objects the user knows well), or they may be generated or revised automatically from an operation history indicating the effect of past guidance and then registered in the storage unit 5.

The saliency of (6) may also be computed with reference to the same priority εi as used in (7). That is, when the saliency of an object due to its difference from its surroundings is computed according to (6), the priority εi registered for each object i in the storage unit 5, as in (7), may be used as a weight in computing the saliency of each object.
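Settings (6) and (7), and their combination, can be sketched as follows. Everything specific here is an assumption made for illustration: saliency is taken to be the average Euclidean distance between an object's mean color and the mean colors of the other recognized objects, and the priority εi is applied as a multiplicative weight. The description only says that existing saliency methods and a priority weighting may be used.

```python
import math
from typing import Optional, Sequence, Tuple

Color = Tuple[float, float, float]


def saliency(index: int, mean_colors: Sequence[Color]) -> float:
    """Average color distance from the other objects (assumed saliency measure)."""
    others = [c for k, c in enumerate(mean_colors) if k != index]
    if not others:
        return 0.0
    return sum(math.dist(mean_colors[index], c) for c in others) / len(others)


def most_salient(mean_colors: Sequence[Color],
                 priorities: Optional[Sequence[float]] = None) -> int:
    """Index of the object with the highest (optionally priority-weighted) saliency."""
    if not mean_colors:
        raise ValueError("no recognized objects")
    weights = list(priorities) if priorities is not None else [1.0] * len(mean_colors)
    if len(weights) != len(mean_colors):
        raise ValueError("one priority per object is required")
    scores = [w * saliency(k, mean_colors) for k, w in enumerate(weights)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Without priorities this corresponds to setting (6); passing priorities weights each object's saliency by its εi, as the preceding paragraph allows.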

The use of priorities reflecting user attributes is suitable not only when the user to be guided is present at the site being imaged by the imaging unit 1, as in the schematic example of FIG. 4, but also when guidance information is provided to a remote user who is away from the site and is responsible for giving instructions to the on-site user doing the shooting. From the remote user's standpoint, the viewpoint of the shooting user is not necessarily important, and a start point that reflects the attributes of the remote user may be preferable.

Also in step S16, the end point determination unit 33 can determine the end point, whose position is defined on the two-dimensional display area of the display unit 4, by any of the following embodiments.

In the first embodiment, a directed half-line is drawn from the start point determined by the start point determination unit 34 toward the screen coordinates of the target object estimated by the target estimation unit 32 in step S15, and the position where this line crosses the boundary (outer frame) of the display area of the display unit 4 is determined as the end point.

In the second embodiment, the end point is the position on that line moved back into the display area by a predetermined amount from the intersection found in the first embodiment.
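The two end point rules above can be sketched as follows, assuming a rectangular display area spanning [0, width] x [0, height] in screen coordinates with the start point inside it. The function name and the `inset` parameter (the pull-back amount in pixels) are hypothetical.

```python
import math
from typing import Tuple

Point2 = Tuple[float, float]


def end_point_on_border(start: Point2, target: Point2,
                        width: float, height: float,
                        inset: float = 0.0) -> Point2:
    """Intersect the half-line start -> target with the display border.

    inset=0 gives the border crossing itself; inset>0 pulls the end point
    back along the line into the display area by that many pixels.
    """
    sx, sy = start
    dx, dy = target[0] - sx, target[1] - sy
    if dx == 0 and dy == 0:
        raise ValueError("start and target coincide; direction is undefined")
    # Parameter t at which the line reaches each border it is heading toward.
    hits = []
    if dx > 0:
        hits.append((width - sx) / dx)
    elif dx < 0:
        hits.append((0.0 - sx) / dx)
    if dy > 0:
        hits.append((height - sy) / dy)
    elif dy < 0:
        hits.append((0.0 - sy) / dy)
    t_hit = min(hits)                       # first border reached
    length = math.hypot(dx, dy)             # pixels travelled per unit of t
    t_end = max(0.0, t_hit - inset / length)
    return (sx + t_end * dx, sy + t_end * dy)
```

Clamping `t_end` at zero keeps the end point from moving behind the start point when `inset` exceeds the distance to the border.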

In the third embodiment, the midpoint of the estimation source object selected in step S14 is the end point. In this case, the start point determination unit 34 uses a setting other than (5) above so that the end point and the start point do not coincide. The third embodiment presupposes that, because many objects are present, the position of the estimation source object roughly indicates the direction toward the target object outside the display area. It is therefore preferable to pair it with a start point setting that places the start point roughly at the center of the display area.

In the fourth embodiment, the end point is the midpoint of the object recognized by the recognition unit 2 that is closest, in screen coordinates, to the end point determined by the first or second embodiment. The fourth embodiment may be applied only when the screen-coordinate distance between the end point determined by the first or second embodiment and the recognized object nearest to it is at or below a predetermined threshold; when the distance exceeds the threshold, the end point determined by the first or second embodiment is used instead.
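The snapping of the end point to the nearest recognized object, with the optional threshold, can be sketched as follows. The function name `snap_end_point` and the parameter `max_distance` are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point2 = Tuple[float, float]


def snap_end_point(end: Point2, object_midpoints: Sequence[Point2],
                   max_distance: Optional[float] = None) -> Point2:
    """Replace `end` with the nearest recognized object's midpoint.

    If `max_distance` is given and the nearest midpoint is farther away
    than that, the original end point is kept.
    """
    if not object_midpoints:
        return end
    nearest = min(object_midpoints, key=lambda m: math.dist(m, end))
    if max_distance is not None and math.dist(nearest, end) > max_distance:
        return end
    return nearest
```

Here `end` would be the result of the first or second end point rule, and `object_midpoints` the screen-coordinate midpoints of the objects recognized in the current image.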

In step S17, the information generation unit 36 generates guidance information as an arrow to be superimposed on the captured image or the scene, and the process proceeds to step S18. The arrow constituting the guidance information is generated by taking the start point and end point determined in step S16 as the start and end of the arrow. The information for the arrow may be stored in advance in the storage unit 5 as, for example, shape model information from which the shape of the arrow is determined automatically once a start point and an end point are given. In step S18, the display unit 4 displays the generated arrow, as guidance information, superimposed on the captured image or the scene.

The following embodiments are also possible for generating and displaying the guidance information in steps S17 and S18.

In the first embodiment, in addition to the arrow, the guidance information may be generated and displayed so as to include information telling the user that the current situation corresponds to Case 1. For example, text information may state that the target object has not yet been reached but that moving in the direction of the arrow will bring the user closer to it, or the arrow may be displayed in a predetermined mode that differs from the one used in Case 2.

FIGS. 6 and 7 are diagrams for explaining the display mode of the arrow in the second embodiment. As shown in FIG. 6, the real world W contains a plurality of objects O11 to O63 similar to those in the example of FIG. 4 but arranged differently: some are placed on the front surface PR1 of a shelf R1, and the rest are placed on the front surface PR2 of a shelf R2, which is farther away than the shelf R1 in the depth direction D3 as seen from the user U photographing at position P3.

The image PC3 captured at position P3 in FIG. 6 is shown in each of [1] to [4] of FIG. 7, and [2] to [4] show examples of the arrow display modes of the second embodiment as arrows A32, A33, and A34. In the image PC3, everything other than the objects O11 to O63, such as the boundaries of the shelves R1 and R2, is omitted from the drawing.

The second embodiment is available when the start point and the end point are each determined as the midpoint of some object. It changes the display mode of the arrow according to whether the absolute difference between the distance from the camera position to the object chosen as the start point and the distance from the camera position to the object chosen as the end point is within a predetermined threshold. As is well known, the distance of an object from the camera position becomes known when the recognition unit 2 estimates the object's position and orientation as a homography matrix, so the threshold test can be made by referring to that distance information.

For the arrow A32 shown in [2] of FIG. 7 and the arrow A33 shown in [3], the distance difference is within the threshold, so a thin-line arrow is displayed as the first mode. That is, the arrow A32 has its start point at the midpoint of the object O32 and its end point at the midpoint of the object O22, both of which are on the front surface PR1 in FIG. 6, so the distance difference is small. Similarly, the arrow A33 has its start point at the midpoint of the object O43 and its end point at the midpoint of the object O52, both of which are on the front surface PR2 in FIG. 6, so the distance difference is small. In contrast, for the arrow A34 shown in [4] of FIG. 7, the distance difference exceeds the threshold, so a thick-line arrow is displayed as the second mode. That is, the arrow A34 has its start point at the midpoint of the object O32, on the front surface PR1 in FIG. 6, and its end point at the midpoint of the object O42, on the front surface PR2 in FIG. 6; because one is on the front surface PR1 and the other is on the front surface PR2, which is separated from it in the depth direction D3, the distance difference is large.

In other words, when the distance from the camera position changes greatly between the start point and the end point of the arrow, the second embodiment displays the arrow differently from the case where the change is small. This makes the user aware of whether there is a change in distance in the depth direction or the like, and thereby enables effective guidance. Even in an embodiment in which at least one of the start point and the end point is not determined as the midpoint of an object, the second embodiment may still be applied by approximating the position of that point relative to the camera using the world coordinates of the recognized object nearest to it in screen coordinates.
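The thin/thick distinction described above reduces to a single threshold test on the two camera distances. The sketch below is illustrative only: the function name `arrow_style`, the string labels, and the assumption that the distances are already available as plain numbers (for example, derived from the estimated pose of each object) are not taken from the source.

```python
def arrow_style(start_distance, end_distance, threshold):
    """Choose the arrow display mode from the camera distances of its two ends.

    start_distance, end_distance : distance from the camera to the objects at
                                   the start and end points (same unit)
    threshold                    : largest depth difference still drawn as 'thin'
    """
    if abs(start_distance - end_distance) <= threshold:
        return "thin"   # first mode: both ends at a similar depth
    return "thick"      # second mode: the arrow crosses a depth change
```

In the example of FIG. 7, arrows A32 and A33 would correspond to the first branch and arrow A34 to the second.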

FIG. 8 is a diagram for explaining the third embodiment. In FIG. 8, objects O11 to O63 similar to those of FIG. 4 are arranged on a shelf R in a real world W similar to that of FIG. 4, and a user standing in front of the shelf photographs it while facing its front, whereby a captured image PC4 (or scene PC4) as shown in [1] of FIG. 8 is obtained. (As in FIG. 4, however, an actual captured image covers only part of the region shown as PC4; that is, PC4 is a virtual panoramic image.) In FIG. 8, the target object is the object O61 at the upper right corner.

In this case, if arrows were displayed as described above without applying the third embodiment, arrows such as those shown in [2] would be provided to the user one after another as guidance information, depending on how the user photographs. That is, the user, who is initially photographing the lower left part of the shelf, is first given the arrow A41 from the object O13 to the object O32; guided by the arrow A41, the user moves the photographed area toward the upper right and is then given the arrow A42 from the object O32 to the target object O61. The user can thus reach the target object O61.

However, when the objects O11 to O63 are arranged roughly in a grid as here, that is, when the objects lie roughly on lattice points, arrows that cut diagonally across the grid as in [2] may in some cases make the guidance to the target object somewhat less intuitive for the user. The confusion is expected to be more pronounced when the spacing between objects is wide or when there are more than a certain number of objects.

Therefore, in the third embodiment, when the arrow from the object O13, which was recognized at some point and whose midpoint was set as the start point, to the target object O61 would be diagonal as in [2], arrows A43, A44, and A45 that travel in straight lines along the lattice points can be given one after another instead, as in [3]. That is, the arrow A43 runs from the object O13 to the object O43 in a roughly horizontal direction, the arrow A44 runs from the object O43 to the object O63 in a roughly horizontal direction, and the arrow A45 runs from the object O63 to the object O61 in a roughly vertical direction, which realizes guidance information that the user can grasp more intuitively than in [2]. When the third embodiment is applied, it is preferable to determine the start point and the end point as the midpoints of objects.

In the third embodiment, information on the positional relationship among the objects i is stored in the storage unit 5 in advance. When it is determined that a straight path from the object whose midpoint is set as the start point at some point in time to the target object would produce a "diagonal" arrow as in [2], the arrow is corrected into a combination of two arrows, one horizontal and one vertical, that is not "diagonal". Displaying one of them as guidance information is repeated until the user reaches the final target object. Here, the end point of the arrow displayed at the n-th repetition should become the start point of the arrow displayed at the (n+1)-th repetition.

第三実施形態では、始点決定部34が前述のいずれかの手法(第一実施形態の手法又は第二実施形態の（２）〜（７）のいずれかの手法)によって始点を決めた後に、水平方向の矢印と垂直方向の矢印を用いて誘導情報を生成することができる。 In the third embodiment, after the start point determination unit 34 determines the start point by any one of the methods described above (the method of the first embodiment or the method of any one of (2) to (7) of the second embodiment), Guidance information can be generated using horizontal and vertical arrows.

FIG. 10 is a diagram showing an example arrangement for explaining the generation of this guidance information. As illustrated, objects O11 to O63 similar to those of FIG. 4 and other figures are arranged, the midpoint of the object O23 is determined as the start point, and the target object is O61. Here, as shown in FIG. 10, positions are described in coordinates (x, y) in which the +x direction points right and the +y direction points up.

Specifically, in the third embodiment, guidance information consisting of horizontal and vertical arrows can be generated, for example, as follows. Let the midpoint of the object at the start point (object O23 in the example of FIG. 10) be at (i0, j0) and the midpoint of the target object (object O61 in the example of FIG. 10) be at (i1, j1), with i1 ≥ i0 and j1 ≥ j0. When the horizontal arrow is displayed first, its start point is (i0, j0); with Sw denoting the width of the display area, its end point is (i1, j0) if i0 + Sw/2 > i1, and (i0 + Sw/2, j0) if i0 + Sw/2 ≤ i1. This is repeated (possibly only once), that is, the line of sight is repeatedly moved by Sw/2 in the x-axis direction, and when (i1, j0) appears in the user's field of view, the displayed arrow is switched from horizontal to vertical. At the moment of switching, the start point of the vertical arrow is (i1, j0); with Sh denoting the height of the display area, its end point is (i1, j1) if j0 + Sh/2 > j1, and (i1, j0 + Sh/2) if j0 + Sh/2 ≤ j1, and the procedure is repeated in the same way as for the horizontal direction. The above description assumes that the start point (i0, j0) is at the center of the display area. (Consequently, the end point of an arrow that moves more than Sw/2 in the x-axis direction or more than Sh/2 in the y-axis direction from the start point would lie outside the display area, so such an arrow cannot be displayed.) If (i0, j0) is offset from the center of the display area, the first arrow is displayed with an additional movement that corrects the offset from the center, and the procedure then continues as above.
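The stepwise procedure above can be sketched as a small generator of arrows in the fixed virtual coordinates. This is a hedged illustration, not the patent's implementation: the names `next_arrow` and `arrow_sequence` are hypothetical, the start point is assumed to sit at the center of the display area, the vertical rule is assumed to mirror the horizontal one (an arrow never overshoots the target), and the sketch generalizes beyond the stated case i1 ≥ i0, j1 ≥ j0 by stepping toward the target in either direction.

```python
import math


def next_arrow(current, target, sw, sh):
    """Return (start, end) of the next arrow, or None once the target is reached.

    Moves horizontally first, then vertically, by at most half the display
    width (sw) or half the display height (sh) per arrow.
    """
    if sw <= 0 or sh <= 0:
        raise ValueError("display width and height must be positive")
    x, y = current
    tx, ty = target
    if x != tx:
        step = sw / 2
        # Within half a display width of the target column: end exactly on it.
        nx = tx if abs(tx - x) <= step else x + math.copysign(step, tx - x)
        return (current, (nx, y))
    if y != ty:
        step = sh / 2
        # Within half a display height of the target row: end exactly on it.
        ny = ty if abs(ty - y) <= step else y + math.copysign(step, ty - y)
        return (current, (x, ny))
    return None


def arrow_sequence(start, target, sw, sh):
    """Chain arrows so that each end point becomes the next start point."""
    arrows = []
    current = start
    while (arrow := next_arrow(current, target, sw, sh)) is not None:
        arrows.append(arrow)
        current = arrow[1]
    return arrows
```

For instance, `arrow_sequence((0, 0), (10, 3), sw=8, sh=4)` yields three horizontal arrows followed by two vertical ones, with every arrow starting where the previous one ended.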

Note that the start and end point positions (x, y) used in this description are positions in fixed, virtual screen coordinates for expressing the arrangement of the objects in the real world (the screen coordinates of a hypothetical captured image, separate from the one acquired by the imaging unit 1, that is large enough to capture all of the objects roughly from the front). These positions (x, y) are conceptually distinct from the screen coordinates of the captured image acquired by the imaging unit 1, whose origin moves within the real world as the user's viewpoint moves. That is, in the screen coordinates of the captured image, even if the start point is at (i0, j0) at some time, once the camera has moved by (+Δx, +Δy) its position shifts in the opposite direction to (i0-Δx, j0-Δy); the above description instead uses fixed virtual screen coordinates in which the start point remains (i0, j0) regardless of the camera position. Such virtual screen coordinates can be calculated from the world-coordinate positions of the objects registered in the storage unit 5.

In the above example, horizontal arrows are first displayed repeatedly until (i1, j0) appears in the field of view, and vertical arrows are then displayed repeatedly until the target object at (i1, j1) is reached. Conversely, vertical arrows may first be displayed repeatedly until (i0, j1) appears in the field of view, followed by horizontal arrows displayed repeatedly until the target object at (i1, j1) is reached. Whether the horizontal or the vertical arrow is displayed first may be fixed in advance, or may be decided by a predetermined rule according to, for example, the positional relationship between the start point (i0, j0) and the end point (i1, j1). For example, when |i0-i1| ≥ |j0-j1|, that is, when the rectangle having the start point (i0, j0) and the end point (i1, j1) as a diagonal is wider than it is tall, the horizontal arrow is displayed first; otherwise (when the rectangle is taller than it is wide) the vertical arrow is displayed first. The opposite rule may also be used. In this embodiment, it is preferable that the positional relationship among the objects registered in the storage unit 5 be defined roughly as a lattice, so that horizontal and vertical movement can be defined.
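The ordering rule based on the shape of the rectangle can be written down directly. The sketch below is only an illustration of the example rule given above; the function name `corner_point` and the idea of returning the intermediate corner (the point where the arrows switch direction) are assumptions made for the example.

```python
def corner_point(start, target):
    """Return the corner where the route switches direction.

    Horizontal arrows come first when the rectangle spanned by start and
    target is at least as wide as it is tall; vertical arrows come first
    otherwise.
    """
    (i0, j0), (i1, j1) = start, target
    if abs(i0 - i1) >= abs(j0 - j1):
        return (i1, j0)  # horizontal first, then vertical
    return (i0, j1)      # vertical first, then horizontal
```

Swapping the two return values gives the opposite rule that the text also allows.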

Alternatively, the determination may be made as follows.
When (1) the distance between the recognized object and the target object is at least a predetermined value α1, or (2) at least α2 objects exist between the recognized object and the target object, the situation is determined to call for the "horizontal" + "vertical" combination described above. The number of objects between the recognized object and the target object in (2) may be obtained by drawing a straight line between the two on the overhead view stored in advance in the storage unit 5 (see the embodiment described later) and counting the objects the line passes through, or the number of objects between any two objects may be registered in the storage unit 5 in advance. The distance in (1) can likewise be calculated from the world-coordinate positions of the objects registered in advance in the storage unit 5. The "recognized object" in (1) and (2) may be the object at which the start point is set, as in the description of FIG. 10 above.
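Conditions (1) and (2) combine into a single predicate. The following is a minimal sketch under stated assumptions: the function name `use_axis_aligned_route` is hypothetical, positions are taken to be 2D world coordinates on the shelf plane, and the count of intervening objects is assumed to be supplied by the caller (for example, looked up from a pre-registered table) rather than computed from the overhead view.

```python
import math


def use_axis_aligned_route(start_pos, target_pos, objects_between, alpha1, alpha2):
    """Decide whether to replace a diagonal arrow with horizontal + vertical arrows.

    start_pos, target_pos : positions of the recognized (start) object and the
                            target object, in the same coordinate system
    objects_between       : number of objects lying between the two
    alpha1                : distance threshold of condition (1)
    alpha2                : object-count threshold of condition (2)
    """
    far_apart = math.dist(start_pos, target_pos) >= alpha1      # condition (1)
    many_between = objects_between >= alpha2                    # condition (2)
    return far_apart or many_between
```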

This concludes the description of steps S14 to S18, which apply to Case 1; the description now turns to Case 2 and the subsequent cases.

ステップS21では、判定部31が、上記のステップS13で存在すると判定された目標対象が、上記のステップS11における認識結果において表示部4による表示領域の内部にあるか否かを判定し、肯定判定であれば、すなわち、目標対象が表示領域の内部に存在すればステップS26へと進み、否定判定であれば、すなわち、目標対象が表示領域の内部に存在しなければステップS36へと進む。 In step S21, the determination unit 31 determines whether or not the target object determined to be present in step S13 is within the display area of the display unit 4 in the recognition result in step S11. If so, that is, if the target object exists inside the display area, the process proceeds to step S26, and if the determination is negative, that is, if the target object does not exist inside the display area, the process proceeds to step S36.

以下、ケース２に該当する場合であるステップS26〜S28の説明を行う。 Hereinafter, Steps S26 to S28 that correspond to Case 2 will be described.

ケース２のステップS26,S27,S28はそれぞれケース１のステップS16,S17,S18と共通であるため、重複する説明は省略する。ただし、ケース２では目標対象が認識され且つ表示領域内に存在しているので、ケース１の場合とは異なる次のような処理を行うことができる。 Since Steps S26, S27, and S28 of Case 2 are the same as Steps S16, S17, and S18 of Case 1, overlapping descriptions are omitted. However, since the target object is recognized in the case 2 and exists in the display area, the following processing different from the case 1 can be performed.

まず、ステップS26において終点決定部33は、既に認識されている目標対象の中点をそのまま、終点として決定すればよい。また、始点決定部34による始点の決定は省略されてもよいし、決定された始点が目標対象の中点となる場合に当該始点を省略するようにしてもよい。 First, in step S26, the end point determination unit 33 may determine the midpoint of the target object that has already been recognized as the end point. In addition, the determination of the start point by the start point determination unit 34 may be omitted, or the start point may be omitted when the determined start point is the midpoint of the target object.

In addition, in the generation of the guidance information by the information generation unit 36 and its display by the display unit 4 in steps S26 and S27, as processing added to that of Case 1, a display may be made indicating that the target object set as the end point is the one being sought. For example, text information stating that the object is the target object may be superimposed on the midpoint of the target object determined as the end point. Alternatively, highlighting may be applied using an icon such as the thick frame B2 superimposed on the target object O62 in the example of FIG. 4 described above.

When the determination of the start point by the start point determination unit 34 is omitted, the arrow display may also be omitted, so that the guidance information consists only of the information indicating the target object as described above.

以下、ケース３に該当する場合であるステップS36〜S38の説明を行う。 Hereinafter, Steps S36 to S38, which are cases corresponding to Case 3, will be described.

ケース３のステップS36,S37,S38はそれぞれケース１のステップS16,S17,S18と共通であるため、重複する説明は省略する。ただし、ケース３では目標対象が認識され且つ表示領域外に存在しているので、ケース１の場合とは異なる次のような処理を行うことができる。 Since Steps S36, S37, and S38 of Case 3 are the same as Steps S16, S17, and S18 of Case 1, overlapping descriptions are omitted. However, since the target object is recognized and exists outside the display area in case 3, the following processing different from the case 1 can be performed.

First, in step S36, when the first embodiment described above is applied, the end point determination unit 33 may draw a line segment from the start point determined by the start point determination unit 34 to the midpoint of the already recognized target object (rather than to the screen coordinates of the target object estimated by the target estimation unit 32 in step S15 of Case 1), and determine as the end point the point where this line segment crosses the boundary (outer frame) of the display area of the display unit 4. Similarly, when the second embodiment described above is applied, the end point may be the position on the line segment moved back from that intersection into the display area by a predetermined amount. In Case 3 there is no estimation source object such as the one obtained in step S14 of Case 1, so the third embodiment described above may be excluded as an embodiment of the end point determination unit 33 in step S36.
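The boundary intersection and the optional pull-back can be computed by clipping the segment against the display rectangle. The sketch below is an assumption-laden illustration: the function name `end_point_on_boundary` is hypothetical, the display area is modeled as the axis-aligned rectangle [0, width] × [0, height], the start point is assumed to lie inside it, and the pull-back is expressed in the same screen units as the coordinates.

```python
import math


def end_point_on_boundary(start, target, width, height, pull_back=0.0):
    """Clip the segment start->target to the display area and return its end.

    pull_back = 0 gives the crossing point on the display boundary;
    pull_back > 0 moves the end point back toward the start by that
    distance along the segment.
    """
    sx, sy = start
    dx, dy = target[0] - sx, target[1] - sy
    # Largest fraction t in [0, 1] for which start + t * (dx, dy) stays inside.
    t = 1.0
    if dx > 0:
        t = min(t, (width - sx) / dx)
    elif dx < 0:
        t = min(t, (0 - sx) / dx)
    if dy > 0:
        t = min(t, (height - sy) / dy)
    elif dy < 0:
        t = min(t, (0 - sy) / dy)
    length = math.hypot(dx, dy)
    if pull_back > 0 and length > 0:
        # Step back along the segment, but never past the start point.
        t = max(0.0, t - pull_back / length)
    return (sx + t * dx, sy + t * dy)
```

If the target already lies inside the display area, the fraction stays at 1 and the function returns the target itself (or a point pulled back from it), which corresponds to Case 2 rather than Case 3.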

図９は、[1]にケース３における矢印の例を、[2]にケース２における矢印の例を、区別して示す図である。[1]では図３と同様の構成において表示領域R4内にある対象O2が始点に設定されるが、目標対象O3が表示領域R4の外部で且つ撮影領域R1内に位置しているために、矢印A5は表示領域R4の境界までしか表示できない形で、誘導情報G5が与えられる。一方、[2]では[1]の場合と異なり撮影領域R1と表示領域R4とが一致する関係にあるため、対象O2を始点として目標対象O3をそのまま終点とした矢印を表示することができる形で、誘導情報G6が与えられる。 FIG. 9 is a diagram showing an example of an arrow in case 3 in [1] and an example of an arrow in case 2 in [2]. In [1], the target O2 in the display area R4 is set as the start point in the same configuration as in FIG. 3, but the target object O3 is located outside the display area R4 and in the imaging area R1, The arrow A5 can be displayed only up to the boundary of the display area R4, and the guidance information G5 is given. On the other hand, unlike [1], the shooting area R1 and the display area R4 are in the same relationship in [2], so that an arrow with the target O2 as the start point and the target object O3 as the end point can be displayed. Thus, guidance information G6 is given.

In addition, in the generation of the guidance information by the information generation unit 36 and its display by the display unit 4 in steps S36 and S37, as processing added to that of Case 1, a display may be made indicating that the target object is already within the field of view (within the captured image). For example, text information stating that the target object is within the field of view may be superimposed at or near the end point.

以下、ケース４に該当する場合であるステップS47,S48の説明を行う。 Hereinafter, Steps S47 and S48 that correspond to Case 4 will be described.

ステップS47では、情報生成部36が誘導情報として、対象が全く認識されてない旨を表すテキスト情報などを生成し、ステップS48では当該情報を表示部が表示する。こうして、対象が全く認識されていない旨を誘導情報として生成し、ユーザに撮影箇所の再検討などを促すようにすることができる。あるいは、ステップS47,S48をスキップして、誘導情報を生成及び表示が行われないようにしてもよい。この場合も、誘導情報が全く表示されないことから、ユーザに撮影箇所の再検討などを促すようにすることができる。 In step S47, the information generation unit 36 generates text information indicating that the object is not recognized at all as the guidance information, and in step S48, the display unit displays the information. In this way, it can be generated as guidance information that the object is not recognized at all, and the user can be urged to review the shooting location. Alternatively, steps S47 and S48 may be skipped so that the guidance information is not generated and displayed. Also in this case, since the guidance information is not displayed at all, it is possible to prompt the user to review the shooting location.

以上、本発明によれば、複数の商品等の対象の中から目標対象を認識する際の難しさおよびセンサ使用時の高コスト問題を除去し、目標対象を認識できない場合であっても、他の認識できた対象の位置や姿勢情報から、表示画面上の目標対象の位置を推定し、目標の商品等に誘導する誘導情報を提供することが可能となる。 As described above, according to the present invention, it is possible to eliminate the difficulty in recognizing a target object from a plurality of objects such as products and the high cost problem when using the sensor, and even if the target object cannot be recognized, It is possible to estimate the position of the target object on the display screen from the position and orientation information of the object that can be recognized, and provide guidance information for guiding to the target product or the like.

Here, by recognizing objects with a camera and image-recognition AR technology rather than with GPS or other special position sensors, the invention not only reduces installation cost but can also provide accurate guidance information that responds to detailed changes in the situation. Navigation that uses a special position sensor is based on the user's position, whereas the present invention provides guidance information based on the position of the user's line of sight and the positions of the objects the user has been able to recognize in the field of view, in accordance with the user's situation and task, and can thereby provide guidance information that appeals more strongly to the user's intuition.

以下、本発明の追加実施形態その他の補足説明を行う。 Hereinafter, additional embodiments of the present invention and other supplementary explanations will be given.

(1) In steps S16 to S18 of Case 1 in FIG. 5 (and the corresponding steps of Cases 2 and 3), the embodiments above determine a start point and an end point (or, depending on the embodiment and the situation, only an end point; the same applies below) and then realize the guidance information by superimposing an arrow or the like directly on the captured image or the scenery. Instead, the start point and the end point may be determined on an overhead view that models the arrangement of the objects as a two-dimensional map, and exactly the same arrows and messages (such as a message that the target object has been reached) may be displayed on the overhead view. That is, the overhead view may be used in place of the captured image or scenery of the embodiments above, and the guidance information may be formed by superimposing arrows and the like on it. With the overhead view, the user can grasp the state of the whole set of objects rather than only the region in view, which may make it easier to grasp the position of the target object.

That is, in the embodiment using the overhead view, the arrow and other items to be superimposed can be determined in exactly the same way as in the embodiments above; the only difference is that the guidance information is generated and displayed with the superimposition made on the overhead view. However, as an embodiment that takes the use of the overhead view into account, the current state holding unit 35 may be applied so that the region the user is currently recognizing can be identified on the overhead view. It is also preferable to determine the start point and the end point as the midpoints of objects, so that the start and end points of the arrow can be set at object positions predefined on the overhead view.

図１１は、俯瞰図によって誘導情報を構成する実施形態の模式例を、図４の例に対応するものとして示す図である。上段側に示すように、対象O11〜O63の配置を2次元配置でモデル化した俯瞰図情報OV0を予め作成して記憶部5に登録しておく。当該モデルは、図４の例であれば対象O11〜O63の配置されている棚Rを正面で見た際の平面配置モデルとして与えておき、記憶部5に登録しておくことができる。そして、下段側に示すように、当該俯瞰図上において矢印等を表示した情報OV1,OV2によって、下段側に示すような誘導情報G10,G20（図４の例に対応する別の実施形態としての誘導情報）を実現することができる。すなわち、図１１の俯瞰図OV1において構成される誘導情報G10は図４の誘導情報G1と同様に、対象O22の中点を始点として対象O42の中点を終点とする矢印を含んでいる。また、図１１の俯瞰図OV2において構成される誘導情報G20は図４の誘導情報G2と同様に、対象O42の中点を始点として対象O62の中点を終点とする矢印を含み、対象O62が目標対象である旨を示すアイコンによる強調表示がなされたものとなっている。 FIG. 11 is a diagram illustrating a schematic example of an embodiment in which guidance information is configured by a bird's-eye view corresponding to the example of FIG. As shown on the upper side, overhead view information OV0 obtained by modeling the arrangement of the objects O11 to O63 in a two-dimensional arrangement is created in advance and registered in the storage unit 5. In the example of FIG. 4, the model can be given as a planar arrangement model when the shelf R on which the objects O11 to O63 are arranged is viewed from the front, and can be registered in the storage unit 5. Then, as shown on the lower side, the guidance information G10, G20 (as another embodiment corresponding to the example of FIG. 4) as shown on the lower side by the information OV1, OV2 displaying an arrow etc. on the overhead view. Guidance information) can be realized. That is, the guidance information G10 configured in the overhead view OV1 of FIG. 11 includes an arrow having the middle point of the target O22 as the start point and the middle point of the target O42 as the end point, like the guidance information G1 in FIG. Further, the guidance information G20 configured in the overhead view OV2 of FIG. 11 includes an arrow having a middle point of the target O42 as a start point and a middle point of the target O62 as the end point, similar to the guidance information G2 of FIG. It is highlighted with an icon indicating that it is a target object.

In the superimposed overhead view, the objects that are currently in view and recognized (objects captured by the imaging unit 1 and successfully recognized by the recognition unit 2) may be displayed so as to be distinguished from the other objects, allowing the user to grasp where the current field of view lies on the overhead view. To enable this display, the current state holding unit 35 holds the objects that the recognition unit 2 has successfully recognized in the current captured image and updates them in real time. As an example of guidance information that the information generation unit 36 generates as an overhead view from this updated result, in the information OV1 and OV2 of FIG. 11 the objects recognized within the field of view at the time of positions P1 and P2, respectively, are displayed in white, distinguished from the unrecognized objects, which are displayed in gray.

In particular, FIG. 11 shows an example of a situation in which not all of the objects in the captured image are recognized. The captured image PC1 contains the 12 objects O11 to O43, but in the information OV1 only half of them (O21, O31, O22, O32, O23, and O33) are displayed in white as successfully recognized. Similarly, the captured image PC2 contains the 12 objects O31 to O63, but in the information OV2 only half of them (O41, O51, O42, O52, O43, and O53) are displayed in white as successfully recognized.
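The white/gray distinction of FIG. 11 can be illustrated with a minimal sketch. It is not part of the disclosed embodiment: the function names `build_overhead_layout` and `render_overhead` are hypothetical, and the sketch assumes that the first digit of each object ID is the column and the second digit is the row of the shelf grid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverheadCell:
    object_id: str
    column: int
    row: int


def build_overhead_layout(columns: int, rows: int) -> list[OverheadCell]:
    """Model the shelf as a grid (assumed ID scheme: O<column><row>)."""
    return [
        OverheadCell(f"O{c}{r}", c, r)
        for r in range(1, rows + 1)
        for c in range(1, columns + 1)
    ]


def render_overhead(layout: list[OverheadCell], recognized: set[str]) -> dict[str, str]:
    """Return a display style per object: white if currently recognized, gray otherwise."""
    return {
        cell.object_id: ("white" if cell.object_id in recognized else "gray")
        for cell in layout
    }


# Objects held by the current state holding unit at position P1 (from FIG. 11).
layout = build_overhead_layout(columns=6, rows=3)
recognized_at_p1 = {"O21", "O31", "O22", "O32", "O23", "O33"}
styles = render_overhead(layout, recognized_at_p1)
```

Calling `render_overhead` again with the updated set of recognized objects corresponds to the real-time update by the current state holding unit 35.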

In the present invention, the guidance information can thus be displayed even when the recognition unit 2 fails to recognize some of the objects. Although FIG. 11 is an example using an overhead view, the same holds when an arrow is superimposed on the captured image or the field of view: the guidance information can be displayed even if the recognition unit 2 recognizes only some of the objects.

Besides displaying each unrecognized object in gray as in the example of FIG. 11, the information generation unit 36 may generate the guidance information so that the entire region of the overhead view containing unrecognized objects is displayed in gray or, similarly, so that the entire region appears covered with a mesh or the like. FIG. 12 shows a schematic example of the mesh-covered display. In FIG. 12, [1] and [2] respectively show schematic examples in which, instead of graying out the unrecognized objects as in the overhead views OV1 and OV2 of FIG. 11, the unrecognized objects are covered with a mesh to express the same thing. Any other display technique may also be used to distinguish, on the overhead view, recognized objects from unrecognized objects (the objects held by the current state holding unit 35 from all other objects).

In the embodiment in which the guidance information is composed on an overhead view, the arrangement of all objects can be displayed on the overhead view, unlike in the captured image or the actual scene. Therefore, as a change from the embodiments that superimpose an arrow or the like on the captured image or the actual scene, the end point determination unit 33 may always set the end point to the midpoint of the target object. In this way, when the information generation unit 36 generates an arrow on the overhead view, the target object can always be the end point of the arrow, regardless of whether the end point lies within the user's field of view (the range of the captured image). In this case, the guidance information is generated so as to express that the end point of the arrow is the target object.
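As a minimal sketch of this variant, the arrow on the overhead view can be computed directly from the predefined grid positions, with the end point fixed to the target object. The names `cell_midpoint` and `overhead_arrow`, the unit cell size, and the top-left origin are assumptions made for illustration only.

```python
def cell_midpoint(column: int, row: int,
                  cell_width: float = 1.0, cell_height: float = 1.0) -> tuple[float, float]:
    """Midpoint of a grid cell in overhead-view coordinates (origin assumed at top-left)."""
    return ((column - 0.5) * cell_width, (row - 0.5) * cell_height)


def overhead_arrow(start_id: str, target_id: str,
                   grid_positions: dict[str, tuple[int, int]]) -> dict:
    """Arrow from the start object's midpoint to the target object's midpoint.

    The end point is always the target object, whether or not the target is
    inside the current captured image.
    """
    return {
        "start": cell_midpoint(*grid_positions[start_id]),
        "end": cell_midpoint(*grid_positions[target_id]),
        "highlight": target_id,  # e.g. drawn with a thick frame
    }


# (column, row) positions registered in advance for the overhead view.
grid_positions = {"O22": (2, 2), "O42": (4, 2), "O62": (6, 2)}
arrow = overhead_arrow("O22", "O62", grid_positions)
```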

FIG. 13 shows a schematic example of the embodiment in which the end point is always determined as the midpoint of the target object. In place of the guidance information G10 and G20 of the embodiment of FIG. 11, this embodiment generates guidance information G15 and G25 as shown in [1] and [2] of FIG. 13, respectively. That is, in the guidance information G15 in [1] of FIG. 13, object O62, which lies outside the user's field of view, is shown with a thick frame to indicate that it is the target object, and is also set as the end point of the arrow.

(2) Embodiments are also possible in which the superimposition of an arrow or the like on the overhead view and the superimposition of an arrow or the like on the captured image or the scene are combined, or switched depending on the situation. For example, when the distance to the recognized object nearest to the camera exceeds a threshold (that is, when the objects are judged to be generally far away), superimposition on the overhead view may be used to help the user grasp the overall arrangement. When this distance is at or below the threshold, the overhead view, direct superimposition, or both may be applied according to the user's selection. For example, as one preset, when the distance is at or below the threshold, direct superimposition (or its combination with the overhead view) may be used to help the user grasp the local arrangement that the user is looking at.

The distance used for this switching decision may be, as described above, the distance to the recognized object nearest to the camera, or another distance may be adopted. For example, it may be the average of the distances between the camera and all recognized objects; in case 1, the distance between the camera and the target object whose position is estimated from the estimation source object (in steps S14 and S15 of FIG. 5); or in cases 2 and 3, the distance between the camera and the target object whose position is obtained directly. Furthermore, although the distances above are distances in world coordinates, distances in screen coordinates may be used instead.
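The switching rule above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the function names, the strategy labels `"min"` and `"mean"`, and the mode labels `"overhead"` and `"direct"` are all hypothetical.

```python
from statistics import mean


def switching_distance(distances: list[float], strategy: str = "min") -> float:
    """Distance used for the switching decision.

    "min":  distance to the recognized object nearest to the camera.
    "mean": average distance between the camera and all recognized objects.
    """
    if not distances:
        raise ValueError("at least one recognized object is required")
    if strategy == "min":
        return min(distances)
    if strategy == "mean":
        return mean(distances)
    raise ValueError(f"unknown strategy: {strategy}")


def choose_display_mode(distances: list[float], threshold: float,
                        strategy: str = "min", near_mode: str = "direct") -> str:
    """Overhead view when objects are generally far; otherwise the preset near mode."""
    d = switching_distance(distances, strategy)
    return "overhead" if d > threshold else near_mode
```

For example, `choose_display_mode([3.0, 4.5], threshold=2.0)` selects the overhead view, while `choose_display_mode([1.0, 4.5], threshold=2.0)` selects direct superimposition. Passing `near_mode="both"` would model the preset that combines the two displays.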

Furthermore, in an embodiment in which the guidance information is generated by combining the overhead view and the captured image (including the field of view in the case of an HMD; the same applies hereinafter), an icon may be superimposed at the corresponding position in each so as to associate an object on the overhead view with the same object on the captured image. In one embodiment, the object on which the icon is superimposed in the captured image can be determined as the recognized object closest to the gaze position in screen coordinates, or closest to the camera position in world coordinates. If an icon is also superimposed at the corresponding position on the overhead view, the user does not have to wonder which object in the overhead view corresponds to the object in the field of view. That is, because the same object is marked with an icon in both the overhead view and the captured image, the user can grasp the correspondence intuitively. The display is not limited to superimposing an icon; any other display mode that makes the object identifiable on both the overhead view and the captured image may be used.
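The gaze-based selection described above can be sketched as follows, assuming screen-coordinate positions of the recognized objects are available as a dictionary. The names `pick_icon_object` and `icon_overlays` and the icon label `"marker"` are hypothetical.

```python
import math


def pick_icon_object(screen_positions: dict[str, tuple[float, float]],
                     gaze_xy: tuple[float, float]) -> str:
    """ID of the recognized object whose screen position is nearest to the gaze position."""
    if not screen_positions:
        raise ValueError("no recognized objects")
    return min(screen_positions,
               key=lambda oid: math.dist(screen_positions[oid], gaze_xy))


def icon_overlays(screen_positions: dict[str, tuple[float, float]],
                  overhead_positions: dict[str, tuple[float, float]],
                  gaze_xy: tuple[float, float], icon: str = "marker") -> dict:
    """Place the same icon on the same object in both the captured image and the overhead view."""
    oid = pick_icon_object(screen_positions, gaze_xy)
    return {
        "object_id": oid,
        "captured_image": (icon, screen_positions[oid]),
        "overhead_view": (icon, overhead_positions[oid]),
    }


screen_positions = {"O22": (100.0, 200.0), "O32": (320.0, 240.0), "O42": (500.0, 260.0)}
overhead_positions = {"O22": (1.5, 1.5), "O32": (2.5, 1.5), "O42": (3.5, 1.5)}
overlays = icon_overlays(screen_positions, overhead_positions, gaze_xy=(310.0, 250.0))
```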

The predetermined object on which the icon or the like is superimposed may also be determined in ways other than choosing the object closest to the gaze or the like as described above. It may be determined as the object used when the start point determination unit 34 determines the start point as the midpoint of an object (for example, the object with the highest saliency). It may also be any one of the objects identified by the recognition unit 2. Furthermore, the icon or the like may be superimposed on two or more objects; in this case, it is preferable to display the icons so that the user can tell which icon on the overhead view and which icon on the captured image belong to which object. That is, the icons or the like are preferably given different display modes or the like so that the ID of each object can be identified. The icons and other information to be used may be registered in the storage unit 5 in advance.

(3) In the above description, the guidance information is composed of an arrow from the start point to the end point. However, the guidance information is not limited to an arrow and may be composed in any display mode that allows the user to recognize the start point (or its vicinity) and the end point (or its vicinity). For example, an icon or the like indicating the start point may be superimposed at the start point, and an icon or the like indicating the end point may be superimposed at the end point. When a display connecting the start point and the end point is used, any mode other than an arrow may also be used. For example, a straight line (or an elongated rectangle) may be drawn from the start point to the end point, with an animation in which a pattern moves along the line from the start point to the end point. In the embodiment in which the start point and/or the end point is set as the midpoint of an object, the start point and/or end point may be indicated by making the region of that object visible, for example by surrounding it with a frame. The guidance information may also be composed by combining this visible marking of the region corresponding to the start point and/or the end point with the arrow or the like described above, which makes the relationship between the start point and the end point recognizable.
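The moving-pattern animation mentioned above can be sketched with simple linear interpolation. The function name `pattern_position` and the normalized time parameter `t` (one unit per traversal of the line) are assumptions made for illustration.

```python
def pattern_position(start: tuple[float, float], end: tuple[float, float],
                     t: float) -> tuple[float, ...]:
    """Position of the moving pattern on the line from start to end.

    The fractional part of t gives the progress along the line, so the
    pattern restarts at the start point each time t passes a whole number.
    """
    phase = t % 1.0
    return tuple(s + (e - s) * phase for s, e in zip(start, end))
```

Evaluating `pattern_position` once per frame with increasing `t` moves the pattern repeatedly from the start point toward the end point, which conveys the direction of guidance without an arrowhead.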

(4) The present invention can also be provided as a program that causes a computer to function as the AR information display device 20. In this case, the AR information display device 20 of the present invention can be configured as a computer of general configuration that includes a CPU, a primary memory that provides a work area for the CPU, a secondary storage device that stores predetermined data, programs, and the like, and the functions of the units in FIGS. 1 and 2 can be realized by the CPU reading and executing predetermined programs corresponding to the respective functions. In addition, any part or all of the units in FIGS. 1 and 2 may be realized by dedicated hardware (such as a dedicated LSI) instead of by a general-purpose CPU executing a program.

(5) The present invention realizes a particularly suitable guidance display when all of the objects registered in advance in the storage unit 5 are arranged roughly on a single plane, as in the example of FIG. 4, or when the objects are divided into groups (a group of objects on shelf R1 and a group of objects on shelf R2), as in the example of FIG. 6, and the objects belonging to each group are arranged roughly on a single plane. However, the present invention is also applicable when there is no such constraint and the objects are arranged arbitrarily in three-dimensional space.

(6) In the above description of the present invention, the positions of the start point and the end point used to generate the arrow constituting the guidance information are obtained in screen coordinates, and the arrow represents a two-dimensional direction in screen coordinates. Instead, a three-dimensionally displayed arrow of the kind used in existing AR and CG (computer graphics) technology may be used. In this case, the embodiment in which the start point and the end point are determined as midpoints of recognized objects is adopted, and the positions of those midpoints in world coordinates are used to generate a three-dimensional arrow that also expresses the depth direction.
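As a minimal sketch of the three-dimensional variant, the direction and length of the arrow can be derived from the world-coordinate midpoints of the start and end objects. The function name `arrow_3d` is hypothetical, and rendering the arrow itself is left to whatever AR or CG framework is in use.

```python
import math


def arrow_3d(start_world: tuple[float, float, float],
             end_world: tuple[float, float, float]) -> tuple[tuple[float, ...], float]:
    """Unit direction vector and length of an arrow between two world-coordinate midpoints."""
    delta = tuple(e - s for s, e in zip(start_world, end_world))
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        raise ValueError("start and end points coincide")
    direction = tuple(d / length for d in delta)
    return direction, length


# The third component carries the depth difference that a 2D screen arrow cannot express.
direction, length = arrow_3d((0.0, 0.0, 0.0), (3.0, 0.0, 4.0))
```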

20 ... AR information display device, 1 ... imaging unit, 2 ... recognition unit, 3 ... generation unit, 4 ... display unit, 5 ... storage unit

Claims

An AR information display device that presents a display guiding a user from among a plurality of objects to a predetermined target object, comprising:
an imaging unit that captures an image to obtain a captured image;
a recognition unit that recognizes each of the plurality of objects from the captured image;
a generation unit that, when the target object is not recognized by the recognition unit, generates guidance information to the target object on the basis of the objects recognized by the recognition unit; and
a display unit that displays the guidance information.

The AR information display device according to claim 1, wherein each of the plurality of objects has a position and orientation registered in a common world coordinate system,
the recognition unit, when recognizing each of the plurality of objects, also estimates, for at least one of the recognized objects, the position and orientation with which that object appears in the captured image, and
the generation unit generates the guidance information on the basis of the registered positions and orientations of one of the objects recognized by the recognition unit and of the target object, and the position and orientation estimated by the recognition unit for that one object.

The AR information display device, wherein the position and orientation of each of the plurality of objects in the common world coordinate system are registered by registering the spatial coordinates of predetermined points on the object.

The AR information display device according to any one of claims 1 to 3, wherein the generation unit generates the guidance information to the target object, on the basis of the objects recognized by the recognition unit, as information indicating guidance toward the recognized position or the estimated position of the target object.

The AR information display device according to any one of claims 1 to 4, wherein the generation unit determines a start point and an end point on the basis of the objects recognized by the recognition unit, and generates the guidance information to the target object as information indicating a path from the start point to the end point.

The AR information display device according to claim 5, wherein the generation unit
determines one of the objects recognized by the recognition unit as an estimation source object, and
determines the start point
as a predetermined user viewpoint on the captured image when the difference between the distance (γ) from the estimation source object to the camera constituting the imaging unit and the user's line-of-sight distance (γ0) is within a threshold, and
as the midpoint of the object recognized by the recognition unit at the shortest distance from the viewpoint in the world coordinate system when the difference is not within the threshold.

The AR information display device according to claim 5, wherein the generation unit determines the start point as a predetermined user viewpoint on the captured image, or as the midpoint of the object recognized by the recognition unit at the shortest distance from the viewpoint in the world coordinate system.

The AR information display device according to claim 5, wherein the generation unit determines the start point as the midpoint of the object having the highest saliency among the objects recognized by the recognition unit.

The AR information display device according to claim 5, wherein the generation unit determines the start point as the midpoint of the object having the highest pre-registered priority among the objects recognized by the recognition unit.

The AR information display device according to claim 5, wherein the generation unit
determines, as an estimation source object, the object closest to the target object in screen coordinates among the objects recognized by the recognition unit, and
estimates the screen coordinates of the target object on the basis of the estimation source object and determines the end point on the basis of the estimated screen coordinates.

The AR information display device according to claim 10, wherein the generation unit determines the end point, on the basis of the estimated screen coordinates, on the line segment from the start point to those screen coordinates.

The AR information display device according to claim 10, wherein the generation unit,
when it determines that a straight line from the start point to the estimated screen coordinates crosses a plurality of objects diagonally,
generates the information indicating the path from the start point to the end point as information displayed in a plurality of stages that show the horizontal direction and the vertical direction in sequence, by determining which of the horizontal direction and the vertical direction is displayed first and the start point and end point of each display.

The AR information display device according to claim 5, wherein the generation unit
changes the display mode of the information indicating the path from the start point to the end point depending on whether the difference between the distance from the start point to the camera constituting the imaging unit and the distance from the end point to that camera is within a predetermined threshold.

The AR information display device according to claim 5, wherein the generation unit
changes the display mode of the information indicating the path from the start point to the end point on the basis of the distance between the start point and the target object, or on the basis of the number of objects existing between the start point and the target object.

The AR information display device according to claim 1, wherein the generation unit generates the guidance information as information associated with the captured image.

The AR information display device according to claim 15, wherein the generation unit generates the guidance information as information to be superimposed on the captured image or on the user's field of view corresponding to the captured image, and/or as information to be superimposed on an overhead view representing a predetermined arrangement of the plurality of objects.

The AR information display device according to claim 1, wherein, when the target object is recognized by the recognition unit, the generation unit generates the guidance information as information expressing that the target object is already present in the captured image.

The AR information display device according to claim 1, wherein the generation unit generates the guidance information as first information to be superimposed on the captured image or on the user's field of view corresponding to the captured image, and second information to be superimposed on an overhead view representing a predetermined arrangement of the plurality of objects, and
the first information is generated so that a predetermined object recognized by the recognition unit is identifiable in the captured image or in the user's field of view corresponding to the captured image, and the second information is generated so that the predetermined object is identifiable on the overhead view.

The AR information display device according to claim 1, wherein the generation unit generates the guidance information by switching to one or both of information to be superimposed on the captured image or on the user's field of view corresponding to the captured image and information to be superimposed on an overhead view representing a predetermined arrangement of the plurality of objects, depending on the distance between the camera constituting the imaging unit and an object recognized by the recognition unit, or the distance between that camera and the target object whose position is estimated on the basis of an object recognized by the recognition unit.

A program causing a computer to function as the AR information display device according to any one of claims 1 to 19.