JP2017182628A

JP2017182628A - Augmented reality user interface application device and control method

Info

Publication number: JP2017182628A
Application number: JP2016071843A
Authority: JP
Inventors: 智紀盛合; Tomoki Moriai; 紘史大塚; Hiroshi Otsuka
Original assignee: NTT Data Corp
Current assignee: NTT Data Group Corp
Priority date: 2016-03-31
Filing date: 2016-03-31
Publication date: 2017-10-05
Anticipated expiration: 2036-03-31
Also published as: JP6649833B2

Abstract

PROBLEM TO BE SOLVED: Video captured by a high-performance camera often contains many objects that are candidates for AR display, and providing AR display for all of them can significantly reduce the visibility of the information the user actually wants, lowering usability. An AR user interface application device and control method are therefore needed that selectively provide AR display of the information in which the user is interested and which the user needs most.
SOLUTION: The present invention detects, from images captured by a user terminal, objects that are candidates for AR display, acquires static metadata and dynamic metadata of each detected object, calculates an interest value for the detected object on the basis of the dynamic metadata, determines the targets to be displayed at the user terminal on the basis of the calculated interest values, and creates AR display data for each object determined to be a display target on the basis of the static metadata.
SELECTED DRAWING: Figure 4

Description

The present invention relates to an augmented reality user interface application apparatus and a control method.

As computers have become smaller and more powerful, the performance of smartphones and wearable devices used as augmented reality (AR) user interface application devices has improved dramatically. In the software that controls AR user interfaces, accuracy has improved not only for conventional techniques that determine the AR display target from AR markers such as two-dimensional barcodes or from position information, but also for markerless AR techniques, which use object recognition to detect display targets in image information (object detection) and then perform AR display. In addition, improvements in camera performance have made it possible to detect an object as an AR display target (object) even when it is photographed from farther away. An object here means something that is subject to AR display, such as a store signboard or an advertisement, and it can be recognized and detected from its characteristic shape without an AR marker. Naturally, detecting an object by means of an AR marker is also included.

However, video shot with a high-performance camera may contain a larger number of objects that are targets for AR display, and performing AR display for all of them can significantly reduce the visibility of the information the user actually wants, leading to lower usability. There is therefore a need, in markerless AR technology, for an AR user interface application apparatus and control method that, even when multiple objects are present in the captured video, do not perform AR display for every display target but instead select the information that the user is interested in and needs most, and perform AR display for it.

In order to achieve this object, the present invention provides a computer device that causes a user terminal to perform AR display for objects that are targets of augmented reality (AR) display, based on a user's interest, wherein the computer device:
stores static metadata, which is information specific to each object;
receives, from the user terminal, an image or video captured at the user terminal and additional information related to the image or video;
acquires the static metadata corresponding to an object contained in the received image or video;
when the static metadata has been acquired, acquires dynamic metadata, which is dynamic information on the object related to the user's interest, from the received image or video and the additional information;
calculates an interest value of the object based on the dynamic metadata;
determines, based on the calculated interest value, display targets for the user terminal from among the objects;
creates, based on the static metadata, display data, which is data for AR display, for each object determined to be a display target; and
transmits the created display data to the user terminal.

Further, in the computer device described in the preceding paragraph, the additional information includes at least one of data relating to the capture of the image or video, data relating to the user terminal, and physical characteristic data of the user;
the dynamic metadata includes at least one of the position coordinates of the object in the image or video, its size, and its continuous display time, and also includes past data for images or video previously captured by the user; and
in calculating the interest value, the computer device executes at least one of:
determining whether the position coordinates included in the dynamic metadata for the user are closer to the center of the image than the position coordinates in the past data of the object;
determining whether the size included in the dynamic metadata for the user is larger than the size in the past data of the object; and
determining whether the continuous display time included in the dynamic metadata for the user is equal to or greater than a predetermined threshold,
and calculates the interest value based on the executed determination regarding at least one of the position coordinates, the size, and the continuous display time.
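The three determinations above can be sketched in Python as follows. This is only an illustration: the `Observation` type, the weights (0.3, 0.3, 0.4), and the 5-second default threshold are assumptions made for the example, since the text states only that the interest value is calculated from at least one of these determinations.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple


@dataclass
class Observation:
    """Dynamic metadata for one object in one frame (hypothetical type)."""
    x: float                # object position in the image, in pixels
    y: float
    size: int               # number of pixels the object occupies
    display_seconds: float  # continuous display time


def interest_value(current: Observation,
                   past: Optional[Observation],
                   image_center: Tuple[float, float],
                   duration_threshold: float = 5.0) -> float:
    """Score interest in [0, 1] from the three determinations.

    The weights are assumptions for this sketch, not values from the source.
    """
    cx, cy = image_center
    score = 0.0
    if past is not None:
        # Determination 1: is the object closer to the image center than before?
        if hypot(current.x - cx, current.y - cy) < hypot(past.x - cx, past.y - cy):
            score += 0.3
        # Determination 2: is the object larger than in the past data?
        if current.size > past.size:
            score += 0.3
    # Determination 3: has the object stayed on screen for the threshold or longer?
    if current.display_seconds >= duration_threshold:
        score += 0.4
    return min(1.0, round(score, 1))
```

An additive score is only one possible design; any monotone combination of the determinations would fit the description equally well.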

Further, in the invention described in the paragraph before the preceding one, in calculating the interest value, the user's line of sight and the history data of that line of sight are treated as observed events, and the interest value is estimated from those observed events using a hidden Markov model whose nodes are the stages of the user's interest in the object.
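A minimal sketch of such an estimate uses the standard forward algorithm of a hidden Markov model. The three interest stages, the two gaze observations (`"away"`, `"on_object"`), and every probability below are placeholders invented for illustration; this passage does not specify them.

```python
from typing import Dict, Sequence

STAGES = ["none", "low", "high"]  # hidden interest stages (assumed)
STAGE_VALUE = {"none": 0.0, "low": 0.5, "high": 1.0}

# All probabilities are made-up placeholders, not values from the source.
INITIAL = {"none": 0.8, "low": 0.15, "high": 0.05}
TRANSITION = {
    "none": {"none": 0.7, "low": 0.25, "high": 0.05},
    "low": {"none": 0.2, "low": 0.5, "high": 0.3},
    "high": {"none": 0.05, "low": 0.25, "high": 0.7},
}
# Probability of each gaze observation given the hidden stage.
EMISSION = {
    "none": {"away": 0.9, "on_object": 0.1},
    "low": {"away": 0.5, "on_object": 0.5},
    "high": {"away": 0.1, "on_object": 0.9},
}


def stage_posterior(gaze_history: Sequence[str]) -> Dict[str, float]:
    """Forward algorithm: P(current stage | gaze observations so far)."""
    belief = dict(INITIAL)
    for i, obs in enumerate(gaze_history):
        if i > 0:
            # Predict: propagate the belief through the transition model.
            belief = {s: sum(belief[p] * TRANSITION[p][s] for p in STAGES)
                      for s in STAGES}
        # Update: weight by the likelihood of the observation, then normalize.
        belief = {s: belief[s] * EMISSION[s][obs] for s in STAGES}
        total = sum(belief.values())
        belief = {s: v / total for s, v in belief.items()}
    return belief


def estimated_interest(gaze_history: Sequence[str]) -> float:
    """Expected interest value under the posterior over stages."""
    belief = stage_posterior(gaze_history)
    return sum(belief[s] * STAGE_VALUE[s] for s in STAGES)
```

Taking the expectation over stages yields a continuous value; picking the most probable stage instead would be an equally plausible reading.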

Also, in the invention described in any of the preceding three paragraphs, in determining the display targets for the user terminal, at least one of the following is executed:
determining, based on the calculated interest values, a predetermined number of objects with the highest interest values as display targets;
determining objects whose calculated interest value is equal to or greater than a predetermined threshold as display targets; and
determining display targets for the user terminal from among the objects based on the calculated interest values and the display amount of the AR display for each object.
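The three selection strategies can be sketched as below. The function names are hypothetical, and the third strategy is an assumed interpretation: the text says only that the interest value and the display amount are both considered, so a greedy selection under a total display budget is used here as one possibility.

```python
from typing import Dict, List


def top_n(interest: Dict[str, float], n: int) -> List[str]:
    """Select the n objects with the highest interest values."""
    ranked = sorted(interest, key=lambda oid: interest[oid], reverse=True)
    return ranked[:n]


def above_threshold(interest: Dict[str, float], threshold: float) -> List[str]:
    """Select every object whose interest value is at least the threshold."""
    return [oid for oid, value in interest.items() if value >= threshold]


def within_display_budget(interest: Dict[str, float],
                          display_amount: Dict[str, int],
                          budget: int) -> List[str]:
    """Greedy selection (assumed): visit objects in descending interest order
    and keep each one whose display amount still fits the remaining budget."""
    chosen: List[str] = []
    remaining = budget
    for oid in sorted(interest, key=lambda o: interest[o], reverse=True):
        if display_amount[oid] <= remaining:
            chosen.append(oid)
            remaining -= display_amount[oid]
    return chosen
```

For example, with interest values `{"A": 0.9, "B": 0.4, "C": 0.7}`, `top_n(..., 2)` and `above_threshold(..., 0.5)` both select A and C.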

Further, in the invention described in any of the preceding four paragraphs, the static metadata includes the name, genre, and detailed information of the object, and creating the display data includes:
determining a detail level for the display data based on the calculated interest value;
determining, based on the detail level, at least one of the name, the genre, and the detailed information as items to display; and
creating the display data including the at least one item so determined from among the name, the genre, and the detailed information.
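As a rough sketch of this step, the code below maps an interest value to a detail level and then picks which static metadata fields go into the display data. The thresholds (0.4 and 0.7), the three levels, and the order in which fields are added (genre, then name, then details) are assumptions chosen for the example.

```python
from typing import Dict


def detail_level(interest: float) -> int:
    """Map an interest value in [0, 1] to a detail level (thresholds assumed)."""
    if interest >= 0.7:
        return 3
    if interest >= 0.4:
        return 2
    return 1


def build_display_data(static: Dict[str, str], interest: float) -> Dict[str, str]:
    """Pick the static metadata fields to show for the given interest value."""
    level = detail_level(interest)
    fields = ["genre"]            # level 1: genre only
    if level >= 2:
        fields.append("name")     # level 2: genre and name
    if level >= 3:
        fields.append("details")  # level 3: genre, name, and details
    return {key: static[key] for key in fields}
```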

The present invention also provides a method executed by a computer device that causes a user terminal to perform AR display for objects that are targets of augmented reality (AR) display, based on a user's interest, the method comprising the steps of:
storing static metadata, which is information specific to each object;
receiving, from the user terminal, an image or video captured at the user terminal and additional information related to the image or video;
acquiring the static metadata corresponding to an object contained in the received image or video;
when the static metadata has been acquired, acquiring dynamic metadata, which is dynamic information on the object related to the user's interest, from the received image or video and the additional information;
calculating an interest value of the object based on the dynamic metadata;
determining, based on the calculated interest value, display targets for the user terminal from among the objects;
creating, based on the static metadata, display data, which is data for AR display, for each object determined to be a display target; and
transmitting the created display data to the user terminal.

The present invention further provides a computer program to be executed by a computer device that causes a user terminal to perform AR display for objects that are targets of augmented reality (AR) display, based on a user's interest, the computer program, when executed by the computer device, causing the computer device to:
store static metadata, which is information specific to each object;
receive, from the user terminal, an image or video captured at the user terminal and additional information related to the image or video;
acquire the static metadata corresponding to an object contained in the received image or video;
when the static metadata has been acquired, acquire dynamic metadata, which is dynamic information on the object related to the user's interest, from the received image or video and the additional information;
calculate an interest value of the object based on the dynamic metadata;
determine, based on the calculated interest value, display targets for the user terminal from among the objects;
create, based on the static metadata, display data, which is data for AR display, for each object determined to be a display target; and
transmit the created display data to the user terminal.

As described above, the present invention makes it possible to provide an AR user interface application apparatus and control method that, in AR technology (both techniques that use AR markers and markerless AR techniques), recognize objects and/or AR markers in a captured image and, from among the objects thus detected, select those that the user is interested in and needs most, and perform AR display for them.

FIG. 1 is a diagram showing a conventional markerless AR display image.
FIG. 2 is a diagram showing a markerless AR display image when the present invention is used.
FIG. 3 is a diagram showing a system configuration according to an embodiment of the present invention.
FIG. 4 is a flowchart showing AR display processing according to an embodiment of the present invention.
FIG. 5 is a flowchart showing interest value calculation processing according to an embodiment of the present invention.
FIG. 6 is a diagram showing data stored in the static metadata storage unit according to an embodiment of the present invention.
FIG. 7 is a diagram showing data stored in the dynamic metadata storage unit according to an embodiment of the present invention.
FIG. 8 is a diagram showing data stored in the display data storage unit according to an embodiment of the present invention.
FIG. 9 is a diagram showing an interest estimation model according to an embodiment of the present invention.

First, a conventional AR display image is compared with an AR display image obtained using the present invention. FIG. 1 is a diagram showing a conventional markerless AR display image. Because information is displayed without restriction for every detected object, pieces of information overlap one another and the user takes a long time to find the information actually wanted, which lowers usability.

In contrast, FIG. 2 is a diagram showing a markerless AR display image when the present invention is used. The display transitions over time from the upper-left image to the right image and then to the lower-left image; items in which the user has shown interest are displayed in more detail, while items in which the user has shown no interest are displayed more concisely (or hidden). More specifically, the upper-left image is the initial display, and the system determines that the user is gazing at “Food and drink” and is therefore interested in it. The display then transitions to the right image, in which detailed information related to “Food and drink” is shown. At this point, information related to items unrelated to “Food and drink” (that is, items in which the user has shown no interest), such as “Bank”, is hidden. Next, for the right image, the system determines that the user is very interested in “Chinese cuisine” and somewhat interested in “Japanese cuisine”. The display then transitions to the lower-left image, in which very detailed information related to “Chinese cuisine” is shown and the information related to “Japanese cuisine” also switches to a more detailed display (though not as detailed as that for “Chinese cuisine”).

Next, an overview of a system according to an embodiment of the present invention will be described. FIG. 3 is a diagram showing a system configuration according to an embodiment of the present invention. In FIG. 3, an AR display control server 300 installed in a data center or the like is configured to communicate, via a network 301 (for example, the Internet), with one or more smartphones 302a to 302n (hereinafter collectively “smartphones 302”) and one or more wearable devices 303a to 303n (hereinafter collectively “wearable devices 303”). The smartphones 302 and wearable devices 303 may be further collectively referred to as “user terminals”. Although the AR display control server 300 in FIG. 3 is shown as a single server computer for convenience, it can also be configured as a distributed system of multiple server computers. Furthermore, the functions of the AR display control server 300 can be installed on a user terminal so that the present invention is carried out on the user terminal itself.

The AR display control server 300 is a server managed and operated by a company that provides an AR user interface. The AR display control server 300 holds information about the objects that are targets of AR display and, in response to a request from a user terminal, calculates the user's interest value for each object (an index, for example a number from 0 to 1.0 in increments of 0.1, where a higher number indicates an object in which the user shows more interest; hereinafter “interest value”). Based on the calculated interest values, the AR display control server 300 also determines and creates the information to be displayed on the user terminal and provides it to the user terminal.
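The example range of the interest value (0 to 1.0 in steps of 0.1) can be enforced with a small helper such as the one below. The function name and the choice to clamp out-of-range inputs are assumptions made for illustration.

```python
def quantize_interest(raw: float) -> float:
    """Clamp a raw score to [0, 1] and round it to the nearest 0.1 step."""
    clamped = max(0.0, min(1.0, raw))
    return round(clamped * 10) / 10
```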

Smartphones 302 and wearable devices 303 (user terminals) are both terminals used by users who are general consumers. Using the camera function of the user terminal, the user photographs, for example, a street scene that includes an object (such as a store signboard) about which the user wants detailed information (that is, in which the user is interested). The user terminal transmits the captured image or video to the AR display control server 300. In addition to the image or video, the user terminal may transmit to the AR display control server 300, as additional information, capture data related to the image or video such as the capture position and capture time, terminal data of the user terminal, and measured physical characteristic data of the user using the user terminal. The user terminal then receives from the AR display control server 300 the detailed information for each object as edited by the server, and performs AR display. Because the detailed information for each object must change dynamically with the user's interest, the user terminal periodically (for example, once every 10 seconds) transmits the captured image or video and the additional information to the AR display control server 300 and updates the AR display each time it receives the objects' detailed information.

Next, the configuration of the AR display control server 300 will be described in detail. FIG. 3 assumes a single computer system and shows only the necessary functional configuration. In the AR display control server 300, a RAM 311, an input device 312, an output device 313, a communication control device 314, and a storage device 316, which is a non-volatile storage medium (such as a ROM or HDD), are connected to a CPU 310 via a system bus 315. The storage device 316 has a program storage area, which stores the software programs that provide the functions of this system, and a data storage area, which stores the data handled by those software programs. Each means in the program storage area described below is in practice an independent software program, or a routine or component of one; it is loaded from the storage device 316 by the CPU 310, expanded into the work area of the RAM 311, and executed sequentially while accessing databases and the like as appropriate, thereby providing its function.

The software programs stored in the program storage area of the storage device 316, listing only those relevant to the present invention, comprise data transmission/reception means 320, object detection means 321, metadata management means 322, interest value calculation means 323, and display data management means 324. These means are executed by the CPU 310.

The data transmission/reception means 320 exchanges data with other computers such as the smartphones 302 and wearable devices 303.

The object detection means 321 detects objects that are targets of AR display in the captured video (image) information.

The metadata management means 322 acquires static metadata for a detected object from the static metadata storage unit 330. Static metadata is metadata that is uniquely determined for an object regardless of the user's behavior or characteristics, the surrounding environment, the timing of capture, and so on (for example, the name of a store or a description of the store). The metadata management means 322 also acquires and calculates dynamic metadata and stores it in the dynamic metadata storage unit 331. Dynamic metadata is metadata that relates to an object but changes with the user's behavior and characteristics (including physical characteristics such as line of sight and heart rate), the surrounding environment (including position information), the timing of capture, and so on (for example, the position of the object in the image, the length of time the object has remained displayed in the image, or the user's interest value for the object). In the present invention, metadata for an object falls into these two types: static metadata and dynamic metadata.

The interest value calculation means 323 calculates the interest value of a target object based on the dynamic metadata. It also estimates the interest value for the target object based on the past data of each object.

The display data management means 324 determines, from the calculated interest values, the objects to be displayed on the user terminal, creates display data for each object so determined, and stores the display data in the display data storage unit 332.

The data storage area of the storage device 316, listing only the parts relevant to the present invention, comprises a static metadata storage unit 330, a dynamic metadata storage unit 331, and a display data storage unit 332. Each is a fixed storage area reserved within the storage device 316.

The static metadata storage unit 330 stores data that is specific to each object and does not depend on the user's behavior or characteristics, the surrounding environment, the timing of capture, and so on. FIG. 6 is a diagram showing data stored in the static metadata storage unit 330 according to an embodiment of the present invention. This data is assumed to be held in advance by the AR display control server 300 for each object. The static metadata in FIG. 6 stores an “object ID” that uniquely identifies the object, “genre 1” and “genre 2” indicating the genre of the object, a “name” indicating the name of the object, and “detailed information” giving details of the object. The “object ID” is, for example, a sequential number. Of “genre 1” and “genre 2”, “genre 2” is the more specific. Although FIG. 6 shows two genres of different levels of detail, “genre 1” and “genre 2”, other embodiments can define finer-grained genres such as “genre 3”, “genre 4”, and so on. In either case, the intent is that as the user shows more interest, progressively more detailed information is displayed, in the order “genre 1”, “genre 2”, “genre 3”, ..., “detailed information”.
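One way to picture a static metadata record is the sketch below. The field names follow the items listed for FIG. 6, but the `StaticMetadata` type, the in-memory `STATIC_STORE`, and the sample rows are hypothetical; the actual contents of FIG. 6 are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class StaticMetadata:
    object_id: int          # sequential number identifying the object
    genres: Tuple[str, ...]  # ordered coarse to fine: genre 1, genre 2, ...
    name: str
    details: str


# Hypothetical sample rows standing in for the pre-registered data.
STATIC_STORE: Dict[int, StaticMetadata] = {
    1: StaticMetadata(1, ("Food and drink", "Chinese cuisine"),
                      "Example Chinese Restaurant", "Lunch from 11:00"),
    2: StaticMetadata(2, ("Finance", "Bank"),
                      "Example Bank", "ATM open 24 hours"),
}


def lookup_static(object_id: int) -> Optional[StaticMetadata]:
    """Return the record for an object, or None if it is not an AR target."""
    return STATIC_STORE.get(object_id)
```

Storing the genres as an ordered tuple keeps the scheme open to “genre 3”, “genre 4”, and so on without changing the record layout.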

The dynamic metadata storage unit 331 stores dynamic data for each object, which changes with the user's behavior and characteristics, the surrounding environment, the timing of capture, and so on. FIG. 7 is a diagram showing data stored in the dynamic metadata storage unit 331 according to an embodiment of the present invention. This data is assumed to be created by the AR display control server 300 each time an object is detected. The dynamic metadata in FIG. 7 stores a “user ID” that uniquely identifies the user, an “object ID” that uniquely identifies the object, a “capture date and time” indicating when the object was photographed, “in-view coordinates” indicating the display position of the object on the user terminal, a “continuous display time” indicating how long the object has remained displayed on the user terminal, a “size” indicating the size of the object on the user terminal, and an “interest value” indicating the degree of the user's interest in the object. The “user ID” may be an identifier per user terminal (for example, a MAC address) or an identifier per user of the AR service (for example, an AR service login ID). The “object ID” is linked to the “object ID” in the static metadata (FIG. 6); however, because this data is dynamic, a record is unique only by the combination of three data items: “user ID”, “object ID”, and “capture date and time”. The data can therefore also be used as past data, for example to check whether a given user was interested in a given object in the past. The “in-view coordinates” are, for example, xy coordinates giving the position of the object with the upper-left corner of the image as the origin (0, 0). Alternatively, a predetermined range around the center of the image may be defined as the user's field of view (hereinafter “in view”) and the coordinates may be positions within that range, on the reasoning that the edges of the image are captured but are not actually being looked at by the user. The “continuous display time” is the elapsed time during which the object has remained displayed in the image or in view. The “size” is, for example, the total number of pixels the object occupies in the image. The “interest value” is, for example, a number from 0 to 1.0 in increments of 0.1; a higher number indicates that the user is more interested in the target object.
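The record layout and its three-part key can be sketched as follows. The `DynamicMetadata` and `DynamicStore` types are hypothetical, and the `in_view` helper assumes that the “predetermined range” is a centered rectangle covering a fixed fraction of each image dimension; the text does not say how that range is defined.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass
class DynamicMetadata:
    user_id: str
    object_id: int
    shot_at: datetime       # capture date and time
    x: float                # in-view coordinates, origin at upper left
    y: float
    display_seconds: float  # continuous display time
    size: int               # pixels occupied by the object
    interest: float         # 0 to 1.0 in steps of 0.1


Key = Tuple[str, int, datetime]


class DynamicStore:
    """In-memory stand-in keyed by (user ID, object ID, capture time)."""

    def __init__(self) -> None:
        self._rows: Dict[Key, DynamicMetadata] = {}

    def put(self, row: DynamicMetadata) -> None:
        self._rows[(row.user_id, row.object_id, row.shot_at)] = row

    def history(self, user_id: str, object_id: int) -> List[DynamicMetadata]:
        """Past data for one user and object, oldest first."""
        rows = [r for (u, o, _), r in self._rows.items()
                if u == user_id and o == object_id]
        return sorted(rows, key=lambda r: r.shot_at)


def in_view(x: float, y: float, width: int, height: int,
            fraction: float = 0.6) -> bool:
    """True if (x, y) lies in the centered region spanning `fraction` of
    each image dimension (the shape of the region is an assumption)."""
    margin_x = width * (1 - fraction) / 2
    margin_y = height * (1 - fraction) / 2
    return (margin_x <= x <= width - margin_x
            and margin_y <= y <= height - margin_y)
```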

The display data storage unit 332 stores data for controlling AR display on the user terminal. FIG. 8 is a diagram showing data stored in the display data storage unit 332 according to an embodiment of the present invention. The display data in FIG. 8 stores an “object ID” that uniquely identifies the object, a “display name” indicating the object name to display on the user terminal, and “display information” indicating the detailed object information to display on the user terminal. The display data in FIG. 8 corresponds to the AR display in the lower-left image of the display example in FIG. 2. In addition to these data items, the display data can also include items that specify the style of the displayed text and balloons.
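A display data record, and one possible way to serialize it for the user terminal, might look like the sketch below. The `DisplayData` type, the `style` dictionary, and the use of JSON as the transfer format are all assumptions; the text does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class DisplayData:
    object_id: int
    display_name: str
    display_info: str
    # Optional style hints for text and balloons (keys are hypothetical).
    style: Dict[str, str] = field(default_factory=dict)


def to_payload(items: List[DisplayData]) -> str:
    """Serialize display data for transmission (JSON is an assumed format)."""
    return json.dumps([asdict(item) for item in items], ensure_ascii=False)
```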

Next, the AR display process according to an embodiment of the present invention is described step by step with reference to the flowchart of FIG. 4 and the data of FIGS. 6 to 8. FIG. 4 is a flowchart showing the AR display process according to an embodiment of the present invention. The process assumes that the user has captured, for example, a downtown area such as that shown in FIGS. 1 and 2 with the camera mounted on the user terminal, and that the captured video (image) has been transmitted to the AR display control server 300. Because the AR display according to the present invention assumes real-time processing, the captured video and the additional information are transmitted from the user terminal to the AR display control server 300 periodically (for example, once every 10 seconds).

When the captured video is transmitted from the user terminal, the data transmission/reception means 320 receives it (step 401). Next, the object detection means 321 detects objects in the received video (image) (step 402). Detection recognizes physical items in the image by a general technique and treats them as objects. In most cases a single image naturally yields multiple objects, but the flowchart is drawn so that step 402 proceeds to the next step each time one object is detected.

When an object is detected, the metadata management means 322 searches the static metadata storage unit 330 and acquires the static metadata (FIG. 6) for the detected object (step 403). If no static metadata can be acquired (that is, the object is not an AR display target), the process follows the No route of step 404 and moves on to the next object. If a next object exists, the process follows the Yes route of step 405, and acquisition of static and dynamic metadata is repeated until no objects remain (steps 403 to 408). If static metadata cannot be acquired for any of the detected objects, the process follows the No route of step 404, the No route of step 405, and the No route of step 406, and then ends. This means that not a single AR display target object was present.

On the other hand, if the static metadata is acquired in step 403, the process follows the Yes route of step 404, and the metadata management means 322 acquires the dynamic metadata (FIG. 7) for the detected object and stores it in the dynamic metadata storage unit 331 (step 407).

Here, the method of acquiring dynamic metadata is described using FIG. 7 as an example. As described above, the “user ID” is an identifier per user terminal (for example, a MAC address) or an identifier per user of the AR service (an AR service login ID) and is, for example, transmitted from the user terminal to the AR display control server 300 together with the captured video. The “object ID” is identical to the “object ID” of the static metadata acquired in step 403. The “shooting date and time” is, for example, the creation date and time of the captured video (image); consequently, all objects acquired from the same image have the same shooting date and time. As described above, the “in-view coordinates” are, for example, the position coordinates of the object when the upper left of the captured image is taken as (0, 0), but they may instead be position coordinates within a predetermined range (the field of view) around the center of the captured image. The position coordinates of an object may be its midpoint or a feature point found during object detection. The “continuous display time” is the elapsed time during which the object has remained displayed in the image or field of view, and it depends on the shooting cycle of the user terminal. For example, when the shooting cycle is once every 10 seconds, an object that appears in both the previous and the current captured image has a “continuous display time” of 10 seconds (and 10 more seconds are added for each earlier consecutive capture in which it also appeared). Even if the object left the image or field of view once during those 10 seconds, the “continuous display time” is still 10 seconds. The “continuous display time” can therefore be calculated by searching this data with the “object ID” (and “user ID”) of the target object as the search key and, if a record for the previous “shooting date and time” exists, adding the shooting cycle (for example, +10 seconds) to the “continuous display time” of that record. The “size” can be calculated from the number of pixels the object occupies in the image. Calculation of the “interest value” is described in step 409 and the interest value calculation process of FIG. 5. In another embodiment, physical characteristic data of the user, such as the user's line of sight and heart rate, can also be acquired as dynamic data by using the wearable device 303 or the smartphone 302 with a measurement application installed.
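The “continuous display time” calculation described above might be sketched as follows. The function name, the dictionary standing in for the dynamic metadata storage unit 331, and the choice of 0 seconds for a first sighting are assumptions made for illustration.

```python
def continuous_display_time(history, user_id, object_id, prev_shot_at, period_s=10):
    """Return the continuous display time (seconds) for the current capture.

    `history` is a hypothetical mapping from
    (user_id, object_id, shooting date and time) to the continuous display
    time stored for that earlier capture.
    """
    prev = history.get((user_id, object_id, prev_shot_at))
    if prev is None:
        # No record for the previous capture: the object has only just appeared.
        return 0
    # The object was also present last time, so extend by one shooting cycle.
    return prev + period_s
```

With a 10-second cycle, an object seen in two consecutive captures gets 10 seconds, and one seen in three gets 20 seconds, whether or not it briefly left the frame between captures.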

After the dynamic metadata is acquired in step 407, if a next object exists, the process follows the Yes route of step 408, and acquisition of static and dynamic metadata is repeated until no objects remain (steps 403 to 408).

If no next object exists, the interest value calculation means 323 calculates the interest value of each object (step 409). Only objects for which static metadata exists (AR display target objects) are subject to interest value calculation. The interest value calculation process of step 409 is described later with reference to FIG. 5.

After the interest value of each object is calculated, the display data management means 324 determines, based on the calculated interest values, which objects are to be displayed on the user terminal (step 410). For example, a predetermined number (for example, four) of objects with the highest calculated interest values can be selected as display targets. Alternatively, objects whose interest value is at or above a threshold (for example, 0.5 or more) can be selected. The determination of display targets can also be combined with the creation of display data in the next step, step 411. For example, the display amount of the AR display (such as the number of displayed characters or the display area of the balloons in pixels) can be summed in descending order of interest value, and the objects added while the total remains within a predetermined amount can be determined as the display targets.
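The three selection strategies of step 410 could be sketched as below. The function names and dictionary-based inputs are hypothetical, and stopping at the first object that would exceed the budget is one possible reading of the display-amount variant, not a detail given in the specification.

```python
def top_n(interest, n=4):
    """Pick the n objects with the highest interest values."""
    return sorted(interest, key=interest.get, reverse=True)[:n]


def above_threshold(interest, threshold=0.5):
    """Pick every object whose interest value is at or above the threshold."""
    return [obj for obj, value in interest.items() if value >= threshold]


def within_budget(interest, display_amount, budget):
    """Add objects in descending interest order while the summed display
    amount (characters, balloon pixels, ...) stays within the budget."""
    chosen, total = [], 0
    for obj in sorted(interest, key=interest.get, reverse=True):
        if total + display_amount[obj] > budget:
            break  # assumption: stop at the first object that does not fit
        chosen.append(obj)
        total += display_amount[obj]
    return chosen
```

The budget variant ties the number of annotations to how much screen space they consume, which is closer to the stated goal of preserving visibility than a fixed count.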

When the display target objects have been determined, the display data management means 324 creates display data (FIG. 8) for those objects and stores it in the display data storage unit 332 (step 411). The display data is created based on the static metadata (FIG. 6) acquired in step 403 and the interest value calculated in step 409. This allows the detail level of the display data to be varied according to the interest value, so that objects in which the user is more interested are displayed in more detail. For example, the detail level is divided into several stages, each with a corresponding threshold, and the detail level of the display data is determined by which thresholds the interest value has reached.

More specifically, when the interest value is 0.9 or more, the detail level is set to “highest”, and “genre 2”, “name”, and “detailed information” in the static metadata (FIG. 6) are used as the display data of the target object (for example, “Sichuan cuisine XX: Chinese cuisine, mapo tofu is the specialty / lunch time 11:30-13:30”).

When the interest value is 0.6 or more and less than 0.9, the detail level is set to “high”, and “genre 2” and “name” in the static metadata (FIG. 6) are used as the display data of the target object (for example, “Sichuan cuisine XX: Chinese cuisine”).

When the interest value is 0.3 or more and less than 0.6, the detail level is set to “medium”, and “genre 2” in the static metadata (FIG. 6) is used as the display data of the target object (for example, “Chinese cuisine”). At detail level “medium” or lower, the display data of one object may coincide with that of another, so the display can also be controlled so that display data with exactly the same content is shown only once.

When the interest value is less than 0.3, the detail level is set to “low”, and “genre 1” in the static metadata (FIG. 6) is used as the display data of the target object (for example, “food and drink”). Here too, when the display data coincides with that of another object, the display can be controlled so that it is shown only once. In this way, objects in which the user has shown interest are displayed in more detail, and objects in which the user has shown no interest are displayed more concisely.
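The mapping from interest value to detail level and displayed fields described above can be sketched as follows. The thresholds are the example values from the text; the dictionary keys `genre1`, `genre2`, `name`, and `details` are hypothetical stand-ins for the static metadata items of FIG. 6.

```python
def detail_level(interest):
    """Map an interest value to one of the four example detail levels."""
    if interest >= 0.9:
        return "highest"
    if interest >= 0.6:
        return "high"
    if interest >= 0.3:
        return "medium"
    return "low"


# Static metadata items shown at each detail level (hypothetical key names).
FIELDS_BY_LEVEL = {
    "highest": ("genre2", "name", "details"),
    "high": ("genre2", "name"),
    "medium": ("genre2",),
    "low": ("genre1",),
}


def display_fields(static_meta, interest):
    """Return only the static metadata items to display for this object."""
    return {key: static_meta[key] for key in FIELDS_BY_LEVEL[detail_level(interest)]}
```

Suppressing duplicate display data at the “medium” and “low” levels, as the text suggests, could be layered on top by comparing the returned dictionaries across objects.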

When the display data has been created, the data transmission/reception means 320 transmits it to the user terminal (step 412). The user terminal then performs AR display based on the display data. At the next shooting cycle, the user terminal transmits a newly captured video (image) to the AR display control server 300, and the process is repeated from step 401. After step 412, this process ends.

Next, the interest value calculation process (step 409) in FIG. 4 is described step by step with reference to the flowchart of FIG. 5. FIG. 5 is a flowchart showing the interest value calculation process according to an embodiment of the present invention. The process of FIG. 5 is drawn so that various determinations are made based on past data and, whenever a determination is satisfied, the object is judged to be one in which the user is interested and a predetermined value is added to the interest value of the object (whose initial value is, for example, 0.1). Instead of this additive scheme, however, other schemes are possible: for example, a single-setting scheme in which the determinations are processed in priority order and a predetermined value is set as the interest value as soon as one of them is satisfied, or a weighted scheme in which the predetermined value is weighted per determination before being added or set. The process of FIG. 5 is merely one embodiment, and the order of the determinations and the types of determination performed can be changed.

The process of FIG. 5 is now described. First, the interest value calculation means 323 determines whether the detected object was also detected the previous time (step 501). Specifically, it searches the dynamic metadata storage unit 331 using the “user ID” of the target user terminal, the “object ID” of the target object, and the previous “shooting date and time” as search keys, and determines whether matching dynamic metadata (FIG. 7) exists. If no such dynamic metadata exists, the user has shown interest in the object for the first time, and it can be judged that the user is not more interested in it than in the other detected objects. In this case, the process follows the No route of step 501 and ends (the interest value of the target object remains at the initial value of 0.1). The search key for the dynamic metadata can include not only the previous “shooting date and time” but also the one before it and earlier ones (an OR search over multiple “shooting date and time” values). This is because setting the interest value to the initial value may be inappropriate in some cases, for example when the target object was out of view in the previous capture but had been displayed continuously up to the capture before that.

On the other hand, if matching dynamic metadata exists, the process follows the Yes route of step 501, and the metadata management means 322 acquires the current and previous values of the “in-view coordinates” in the dynamic metadata (FIG. 7) of the detected object (step 502).

Next, the interest value calculation means 323 compares the current and previous values of the “in-view coordinates” and determines whether the detected object has moved closer to the center than it was the previous time (step 503). If so, it can be judged that the user has become more interested in the target object. In this case, the process follows the Yes route of step 503, and the interest value calculation means 323 adds a predetermined value (for example, 0.1) to the interest value of the target object (step 504).

On the other hand, if the detected object is not determined to have moved closer to the center, the process follows the No route of step 503 and moves to the next step without changing the interest value. In another embodiment, however, a predetermined value can be subtracted from the interest value when the detected object is determined to have moved closer to the edge than the previous time.

In yet another embodiment, the predetermined value to be added or subtracted can be varied according to how far the detected object has moved toward the center or the edge. For example, even when the difference between the current and previous “in-view coordinates” is the same, the user's interest can be judged to differ between an object that has moved slightly inward from the edge and an object that has moved from near the center to the center (the latter case can be judged to indicate greater interest in the target object than the former).

After step 503 or step 504, the metadata management means 322 acquires the continuous display time of the detected object (step 505). This can be obtained by adding the shooting cycle (for example, +10 seconds for a cycle of once every 10 seconds) to the “continuous display time” recorded in the dynamic metadata (FIG. 7) for the previous “shooting date and time”.

Next, the interest value calculation means 323 determines whether the acquired continuous display time is at or above a predetermined threshold, that is, whether the detected object has been displayed for a long time (step 506). If so, it can be judged that the user continues to be interested in the target object. In this case, the process follows the Yes route of step 506, and the interest value calculation means 323 adds a predetermined value to the interest value of the target object (step 507).

On the other hand, if the detected object is not determined to have been displayed for a long time, the process follows the No route of step 506 and moves to the next step without changing the interest value. In another embodiment, a predetermined value can be subtracted from the interest value in this case. In yet another embodiment, the threshold compared with the continuous display time can be subdivided (for example, a threshold for long display, a threshold for medium display, and so on), and the predetermined value added to or subtracted from the interest value in step 507 can be varied according to which threshold has been exceeded.

After step 506 or step 507, the metadata management means 322 acquires the current and previous values of “size” in the dynamic metadata (FIG. 7) (step 508).

Next, the interest value calculation means 323 compares the current and previous values of “size” and determines whether the detected object has become larger than it was the previous time (step 509). If so, it can be judged that the user has become more interested in the target object. In this case, the process follows the Yes route of step 509, and the interest value calculation means 323 adds a predetermined value to the interest value of the target object (step 510). After step 510, this process ends.

On the other hand, if the detected object is not determined to have become larger than the previous time, the process follows the No route of step 509 and ends without changing the interest value. In another embodiment, however, a predetermined value can be subtracted from the interest value when the detected object is determined to have become smaller than the previous time.

In yet another embodiment, the predetermined value to be added (or subtracted) can be varied according to how much larger (or smaller) the detected object has become. For example, the user's interest can be judged to differ between an object that has grown suddenly and one that has grown only slightly (sudden growth can be judged to indicate greater interest in the target object than slight growth).
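The additive scheme of FIG. 5 (steps 501 to 510) might look like the following sketch. The `Observation` class, the 30-second long-display threshold, the use of Euclidean distance to the image center, and the cap at 1.0 are assumptions chosen for illustration; the subtractive and weighted variants are not shown.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple


@dataclass
class Observation:
    view_xy: Tuple[float, float]  # in-view coordinates of the object
    continuous_display_s: int     # continuous display time in seconds
    size_px: int                  # pixels occupied by the object


def interest_value(cur: Observation, prev: Optional[Observation],
                   center: Tuple[float, float],
                   long_display_s: int = 30,  # assumed threshold
                   step: float = 0.1, initial: float = 0.1) -> float:
    value = initial
    if prev is None:                    # step 501, No route: first detection
        return value

    def dist(o: Observation) -> float:
        return hypot(o.view_xy[0] - center[0], o.view_xy[1] - center[1])

    if dist(cur) < dist(prev):          # step 503: moved closer to the center
        value += step                   # step 504
    if cur.continuous_display_s >= long_display_s:  # step 506: shown for long
        value += step                   # step 507
    if cur.size_px > prev.size_px:      # step 509: grew larger in the image
        value += step                   # step 510
    # Keep the result on the 0.1 grid and within the 0 to 1.0 range.
    return round(min(value, 1.0), 1)
```

Under these assumptions, an object that moved toward the center, has been shown for at least 30 seconds, and grew in the image ends up at 0.4, which falls in the “medium” detail range of the earlier example thresholds.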

In another embodiment, the interest value can be estimated (calculated) using an estimation model in the interest value calculation process (step 409) in FIG. 4. FIG. 9 shows an interest estimation model according to an embodiment of the present invention. The estimation model divides the psychological stage of the user's interest in each object detected in step 402 into the four stages of AIDA, which are, from the lowest level of interest, “Attention” (attention stage), “Interest” (appeal stage), “Desire” (desire stage), and “Action” (action stage). It is a hidden Markov model in which each object at each stage is a node. Transitions between nodes are driven by observation events, which are, for example, physical characteristics of the user such as line of sight and heart rate. Which object the user is interested in can be judged from, for example, which object in the captured image the user's line of sight is directed at and which object was accompanied by a change in heart rate. For convenience, the estimation model of FIG. 9 assumes two detected objects (an object related to “Chinese cuisine” and an object related to “Japanese cuisine”), but in practice the interest value of each object is expected to be estimated with a model containing more objects. The estimation model of FIG. 9 shows how the user's interest in the “Chinese cuisine” object and the “Japanese cuisine” object that appear in the captured image (and in which the user is therefore considered to be interested) may transition, based on the history data of the user's line of sight.

In FIG. 9, for example, the object related to “Chinese cuisine” at the Attention stage (attention stage; for example, an interest stage with an interest value of 0.1 to 0.2) is estimated, from the history data of the user's line of sight, to deepen in interest and transition to the Interest stage (appeal stage; for example, an interest stage with an interest value of 0.3 to 0.5) with a probability of 0.4 (= 40%). In this case, the interest value of the “Chinese cuisine” object is estimated to be, for example, 0.3. FIG. 9 also indicates that, for the “Chinese cuisine” object at the Attention stage, interest is estimated to shift to the “Japanese cuisine” object with a probability of 0.3, to remain unchanged with a probability of 0.2, and to shift to an “other” object with a probability of 0.1. Here, an “other” object is any object other than the detected “Chinese cuisine” and “Japanese cuisine” objects (that is, an object that does not appear in the image or field of view).

Each transition probability in the estimation model of FIG. 9 is learned from the history data of observation events (physical characteristics of the user such as line of sight or heart rate) using a general estimation algorithm such as the Baum-Welch algorithm. The line-of-sight history data used for learning can be restricted. For example, history data for the same object as the detected object may be excluded from learning if it was captured at an entirely different location, or only history data within a predetermined period may be used (that is, history data whose “shooting date and time” falls within the period, on the view that, even for the same user, very old data does not affect current interest). Although the estimation model of FIG. 9 is drawn so that the interest stage changes by only one stage at a time, transitions of two or more stages can also be estimated.

Then, using the learned transition probabilities, the history data of the current observation events, and a general estimation algorithm such as the Viterbi algorithm, the single most likely node representing where the user's interest currently lies can be determined. The interest value of the determined node (object) can then be set to, for example, the interest value defined for its interest stage (for example, 0.1 for the Attention stage, 0.3 for the Interest stage, 0.6 for the Desire stage, and 0.9 for the Action stage).
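The decoding step can be illustrated with a plain Viterbi implementation. This sketch is deliberately simplified: it tracks the four AIDA stages for a single object rather than one node per object and stage as in FIG. 9, it uses a binary gaze observation (“on” or “off”), and every probability below is a made-up placeholder rather than a learned value. Only the stage-to-interest mapping uses the example values from the text.

```python
STAGES = ["Attention", "Interest", "Desire", "Action"]
STAGE_INTEREST = {"Attention": 0.1, "Interest": 0.3, "Desire": 0.6, "Action": 0.9}

# Placeholder model parameters (assumptions, not taken from FIG. 9).
START = {"Attention": 1.0}
TRANS = {
    "Attention": {"Attention": 0.6, "Interest": 0.4},
    "Interest": {"Interest": 0.6, "Desire": 0.4},
    "Desire": {"Desire": 0.6, "Action": 0.4},
    "Action": {"Action": 1.0},
}
P_GAZE_ON = {"Attention": 0.2, "Interest": 0.5, "Desire": 0.7, "Action": 0.9}


def emit(stage, obs):
    """Probability of observing gaze 'on' or 'off' in the given stage."""
    p = P_GAZE_ON[stage]
    return p if obs == "on" else 1.0 - p


def viterbi(observations):
    """Return the most likely stage sequence for a non-empty observation list."""
    prob = {s: START.get(s, 0.0) * emit(s, observations[0]) for s in STAGES}
    back = []  # back[t][s] = best predecessor of stage s at time t + 1
    for obs in observations[1:]:
        step, nxt = {}, {}
        for s in STAGES:
            best = max(STAGES, key=lambda p: prob[p] * TRANS[p].get(s, 0.0))
            step[s] = best
            nxt[s] = prob[best] * TRANS[best].get(s, 0.0) * emit(s, obs)
        back.append(step)
        prob = nxt
    state = max(STAGES, key=prob.get)
    path = [state]
    for step in reversed(back):  # follow the back-pointers to the start
        state = step[state]
        path.append(state)
    return path[::-1]


def estimated_interest(observations):
    """Interest value of the most likely current stage."""
    return STAGE_INTEREST[viterbi(observations)[-1]]
```

For long observation histories, log-probabilities would avoid numerical underflow; plain products are kept here for readability.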

As described above, the present invention can provide an AR user interface application apparatus and control method that, in AR technology (both technology using AR markers and markerless AR technology), recognizes physical items and/or AR markers in a captured image and, from among the objects thus detected, selects the objects in which the user is interested and which the user needs more, and performs AR display for them.

300 AR display control server
301 Network
302 Smartphone
303 Wearable device
310 CPU
311 RAM
312 Input device
313 Output device
314 Communication control device
315 System bus
316 Storage device
320 Data transmission/reception means
321 Object detection means
322 Metadata management means
323 Interest value calculation means
324 Display data management means
330 Static metadata storage unit
331 Dynamic metadata storage unit
332 Display data storage unit

Claims

A computer apparatus that causes a user terminal to perform AR display of objects that are targets of augmented reality (AR) display, based on a user's interest, wherein the computer apparatus:
stores static metadata that is information unique to each object;
receives, from the user terminal, an image or video captured at the user terminal and additional information related to the image or video;
acquires the static metadata corresponding to an object contained in the received image or video;
when the static metadata is acquired, acquires, from the received image or video and the additional information, dynamic metadata that is dynamic information related to the user's interest in the object;
calculates an interest value of the object based on the dynamic metadata;
determines, based on the calculated interest value, display targets for the user terminal from among the objects;
creates, based on the static metadata, display data that is data for AR display for each object determined to be a display target; and
transmits the created display data to the user terminal.

The computer apparatus according to claim 1, wherein:
the additional information includes at least one of data related to the capture of the image or video, data related to the user terminal, and physical characteristic data of the user;
the dynamic metadata includes at least one of the position coordinates, the size, and the continuous display time of the object in the image or video, and further includes history data for images or videos captured by the user in the past; and
in calculating the interest value, the computer apparatus executes at least one of:
determining whether the position coordinates included in the dynamic metadata relating to the user are closer to the center of the image than the position coordinates of the object in the history data;
determining whether the size included in the dynamic metadata relating to the user is larger than the size of the object in the history data; and
determining whether the continuous display time included in the dynamic metadata relating to the user exceeds a predetermined threshold,
and calculates the interest value based on the result of each determination executed with respect to the position coordinates, the size, and the continuous display time.
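
As a non-limiting illustration of the three determinations recited in this claim, the following Python sketch compares the current dynamic metadata with the history data and counts how many determinations hold. The class name `Observation`, its field names, the 2.0 second threshold, and the rule of summing the results into a score from 0 to 3 are all assumptions made for illustration; the claim itself does not specify how the determinations are combined.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Observation:
    """Dynamic metadata for one object in one image (field names are assumed)."""
    x: float             # object center, in pixels
    y: float
    size: float          # apparent area, in square pixels
    display_time: float  # continuous display time, in seconds


def interest_value(current, past, image_center, time_threshold=2.0):
    """Return the number of determinations that hold (0 to 3).

    Summing the three results is an assumed scoring rule, not the claimed one.
    """
    cx, cy = image_center
    # Determination 1: is the object now closer to the image center than before?
    closer = hypot(current.x - cx, current.y - cy) < hypot(past.x - cx, past.y - cy)
    # Determination 2: does the object now appear larger than before?
    larger = current.size > past.size
    # Determination 3: has the object been displayed longer than the threshold?
    lingering = current.display_time > time_threshold
    return int(closer) + int(larger) + int(lingering)


past = Observation(x=100, y=100, size=400, display_time=0.5)
now = Observation(x=300, y=230, size=900, display_time=3.0)
score = interest_value(now, past, image_center=(320, 240))
```

In this example the object has moved toward the center, grown, and stayed on screen beyond the threshold, so all three determinations hold.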

The computer apparatus according to claim 1, wherein, in calculating the interest value, the user's line of sight and history data of the line of sight are treated as observation events, and the interest value is estimated from the observation events using a hidden Markov model whose nodes are the stages of the user's interest in the object.
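
One possible way to realize such an estimate is the standard forward algorithm for hidden Markov models, sketched below in Python. The three interest stages, the two observation symbols ("on" and "off" for whether the line of sight rests on the object), and every probability in the tables are hypothetical values chosen for illustration; none of them comes from this publication.

```python
STATES = ("unaware", "noticed", "interested")  # hypothetical interest stages

# All probabilities below are illustrative assumptions.
START = {"unaware": 0.8, "noticed": 0.15, "interested": 0.05}
TRANS = {
    "unaware":    {"unaware": 0.7,  "noticed": 0.25, "interested": 0.05},
    "noticed":    {"unaware": 0.2,  "noticed": 0.5,  "interested": 0.3},
    "interested": {"unaware": 0.05, "noticed": 0.25, "interested": 0.7},
}
EMIT = {  # P(line-of-sight observation | interest stage)
    "unaware":    {"on": 0.1, "off": 0.9},
    "noticed":    {"on": 0.5, "off": 0.5},
    "interested": {"on": 0.9, "off": 0.1},
}


def estimate_interest(gaze_history):
    """Forward algorithm: P(stage is 'interested' | observed gaze history)."""
    if not gaze_history:
        return START["interested"]
    # Initialise with the prior weighted by the first observation.
    belief = {s: START[s] * EMIT[s][gaze_history[0]] for s in STATES}
    for obs in gaze_history[1:]:
        # Propagate through the transition model, then weight by the emission.
        belief = {
            s: EMIT[s][obs] * sum(belief[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
        # Renormalise each step to avoid numerical underflow on long histories.
        total = sum(belief.values())
        belief = {s: v / total for s, v in belief.items()}
    return belief["interested"] / sum(belief.values())


steady = estimate_interest(["on"] * 5)
absent = estimate_interest(["off"] * 5)
```

With these assumed tables, a history in which the line of sight stays on the object yields a higher estimate than one in which it never does.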

The computer apparatus according to any one of claims 1 to 3, wherein, in determining the display target on the user terminal, at least one of the following is executed:
determining, based on the calculated interest value, a predetermined number of objects having the highest interest values as display targets;
determining each object whose calculated interest value is equal to or greater than a predetermined threshold as a display target; and
determining the display target on the user terminal from among the objects based on the calculated interest value and a display amount of the AR display for each object.
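
The three selection strategies recited in this claim can be sketched in Python as follows. The function names, the sample interest values, and the greedy treatment of the display amount as a budget are assumptions made for illustration; the claim does not state how the interest value and the display amount are combined.

```python
def top_n(interest, n):
    """Select the n objects with the highest interest values."""
    return sorted(interest, key=interest.get, reverse=True)[:n]


def above_threshold(interest, threshold):
    """Select every object whose interest value is at or above the threshold."""
    return [obj for obj, value in interest.items() if value >= threshold]


def within_budget(interest, display_amount, budget):
    """Greedy selection (assumed): take objects in descending order of interest
    while the total display amount still fits within the budget."""
    chosen, used = [], 0
    for obj in sorted(interest, key=interest.get, reverse=True):
        if used + display_amount[obj] <= budget:
            chosen.append(obj)
            used += display_amount[obj]
    return chosen


# Hypothetical objects, interest values, and display amounts.
interest = {"cafe": 0.9, "bookstore": 0.4, "bank": 0.7, "poster": 0.1}
display_amount = {"cafe": 3, "bookstore": 1, "bank": 2, "poster": 1}

by_rank = top_n(interest, 2)
by_threshold = above_threshold(interest, 0.4)
by_budget = within_budget(interest, display_amount, budget=4)
```

The budget variant skips an object that does not fit and continues with lower-ranked objects, which is one design choice among several.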

The computer apparatus according to any one of claims 1 to 4, wherein the static metadata includes a name, a genre, and detailed information of the object, and, in creating the display data:
a detail level of the display data is determined based on the calculated interest value;
at least one of the name, the genre, and the detailed information is determined as content to be displayed, based on the detail level; and
display data including the determined at least one of the name, the genre, and the detailed information is created.
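
A minimal Python sketch of the detail-level selection in this claim is given below. The threshold values, the three-level scale, the cumulative mapping from level to fields, and the sample metadata are assumptions for illustration only.

```python
FIELDS = ("name", "genre", "detailed_information")  # assumed static metadata keys


def detail_level(interest_value, thresholds=(0.3, 0.7)):
    """Map an interest value to a detail level from 1 to 3 (assumed thresholds)."""
    low, high = thresholds
    if interest_value >= high:
        return 3
    if interest_value >= low:
        return 2
    return 1


def build_display_data(static_metadata, interest_value):
    """Include more static metadata fields as the detail level rises."""
    shown = FIELDS[:detail_level(interest_value)]
    return {field: static_metadata[field] for field in shown}


# Hypothetical static metadata for one object.
meta = {
    "name": "Cafe Aoi",
    "genre": "Cafe",
    "detailed_information": "Open 8:00 to 20:00",
}
brief = build_display_data(meta, 0.1)
full = build_display_data(meta, 0.9)
```

Under these assumptions a low interest value shows only the name, while a high one shows all three fields.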

A method executed by a computer apparatus that causes a user terminal to perform augmented reality (AR) display of objects that are targets of AR display, based on a user's interest, the method comprising:
storing static metadata, which is information unique to each object;
receiving, from the user terminal, an image or video captured at the user terminal and additional information related to the image or video;
acquiring the static metadata corresponding to an object contained in the received image or video;
upon acquiring the static metadata, acquiring, from the received image or video and the additional information, dynamic metadata, which is dynamic information related to the user's interest in the object;
calculating an interest value of the object based on the dynamic metadata;
determining, from among the objects, a display target on the user terminal based on the calculated interest value;
creating, based on the static metadata, display data, which is data for AR display, for each object determined to be a display target; and
transmitting the created display data to the user terminal.
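
The order of the method steps can be illustrated with the following Python sketch, which assumes that object detection has already produced a mapping from object identifiers to dynamic metadata. The function `handle_request`, its parameters, the dictionary layouts, and the use of a top-N rule for the display targets are hypothetical and serve only to show how the steps connect.

```python
def handle_request(frame_objects, static_store, calc_interest, max_targets=2):
    """Run the claimed steps for one received image.

    frame_objects: {object_id: dynamic_metadata} for objects detected upstream.
    static_store:  {object_id: static_metadata} stored in advance.
    calc_interest: callable mapping dynamic metadata to an interest value.
    """
    # Acquire static metadata; objects without an entry are not AR targets.
    known = {oid: dyn for oid, dyn in frame_objects.items() if oid in static_store}
    # Calculate an interest value for each object from its dynamic metadata.
    interest = {oid: calc_interest(dyn) for oid, dyn in known.items()}
    # Determine the display targets (assumed rule: highest interest first).
    targets = sorted(interest, key=interest.get, reverse=True)[:max_targets]
    # Create display data from the static metadata of each display target.
    return [{"object_id": oid, "label": static_store[oid]["name"]} for oid in targets]


# Hypothetical stored metadata and detections for one frame.
static_store = {
    "cafe": {"name": "Cafe Aoi"},
    "bank": {"name": "Midori Bank"},
    "books": {"name": "Hon Books"},
}
frame = {
    "cafe": {"display_time": 3.0},
    "bank": {"display_time": 0.5},
    "books": {"display_time": 1.5},
    "tree": {"display_time": 9.0},  # no static metadata, so never displayed
}
display_data = handle_request(frame, static_store, lambda dyn: dyn["display_time"])
```

The returned list stands in for the display data that would be transmitted to the user terminal.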

A computer program that causes a computer apparatus to make a user terminal perform augmented reality (AR) display of objects that are targets of AR display, based on a user's interest, the computer program, when executed by the computer apparatus, causing the computer apparatus to:
store static metadata, which is information unique to each object;
receive, from the user terminal, an image or video captured at the user terminal and additional information related to the image or video;
acquire the static metadata corresponding to an object contained in the received image or video;
upon acquiring the static metadata, acquire, from the received image or video and the additional information, dynamic metadata, which is dynamic information related to the user's interest in the object;
calculate an interest value of the object based on the dynamic metadata;
determine, from among the objects, a display target on the user terminal based on the calculated interest value;
create, based on the static metadata, display data, which is data for AR display, for each object determined to be a display target; and
transmit the created display data to the user terminal.