JP2023085961A

JP2023085961A - Information processing device, information processing method, and information processing program

Info

Publication number: JP2023085961A
Application number: JP2021200296A
Authority: JP
Inventors: 紗記子西; Sakiko Nishi
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2021-12-09
Filing date: 2021-12-09
Publication date: 2023-06-21
Anticipated expiration: 2041-12-09
Also published as: JP7459038B2

Abstract

PROBLEM TO BE SOLVED: To further improve the quality of service provision using multi-view images. SOLUTION: An information processing device includes: a specifying unit that specifies a target of annotation tagging from among the shooting targets included in a multi-view image; an estimating unit that estimates the three-dimensional position of the tagging target within the multi-view image; and a tagging unit that attaches a tag to the tagging target in accordance with the three-dimensional position of the tagging target. The specifying unit also identifies and classifies shooting targets by image recognition for each viewpoint image of the multi-view image. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing device, an information processing method, and an information processing program.

Conventionally, techniques have been provided for generating a multi-view image (free-viewpoint image), that is, a plurality of images of a subject (shooting target) captured from different viewpoints (see, for example, Patent Document 1).

Patent Document 1: JP 2016-119513 A

However, with the conventional technology described above, the provision of services using multi-view images is not always sufficient. For example, although the conventional technology lets users make use of multi-view images, there is room for improvement in the quality of the services provided with them.

The present application has been made in view of the above, and aims to further improve the quality of service provision using multi-view images.

An information processing device according to the present application includes: a specifying unit that specifies a target of annotation tagging from among the shooting targets included in a multi-view image; an estimating unit that estimates the three-dimensional position of the tagging target within the multi-view image; and a tagging unit that attaches a tag to the tagging target in accordance with the three-dimensional position of the tagging target.

According to one aspect of the embodiment, the quality of service provision using multi-view images can be further improved.

FIG. 1 is an explanatory diagram showing an outline of the information processing method according to the embodiment.
FIG. 2 is an explanatory diagram showing an outline of the display of a shooting guide.
FIG. 3 is an explanatory diagram showing an outline of the replacement of the face image of a photographed model.
FIG. 4 is a diagram showing a configuration example of the information processing system according to the embodiment.
FIG. 5 is a diagram showing a configuration example of the terminal device according to the embodiment.
FIG. 6 is a diagram showing a configuration example of the information providing device according to the embodiment.
FIG. 7 is a diagram showing an example of the user information database.
FIG. 8 is a diagram showing an example of the history information database.
FIG. 9 is a diagram showing an example of the image information database.
FIG. 10 is a flowchart showing the processing procedure according to the embodiment.
FIG. 11 is a diagram showing an example of the hardware configuration.

Modes for implementing the information processing device, information processing method, and information processing program according to the present application (hereinafter referred to as "embodiments") are described in detail below with reference to the drawings. The information processing device, information processing method, and information processing program according to the present application are not limited by these embodiments. In the following embodiments, identical parts are denoted by the same reference numerals, and overlapping descriptions are omitted.

[1. Outline of information processing method]
First, an outline of the information processing method performed by the information processing device according to the embodiment is described with reference to FIG. 1. FIG. 1 is an explanatory diagram showing an outline of the information processing method according to the embodiment. FIG. 1 takes as its example a case in which a service is provided using multi-view images.

As shown in FIG. 1, the information processing system 1 includes a terminal device 10 and an information providing device 100. The terminal device 10 and the information providing device 100 are communicably connected to each other, by wire or wirelessly, via a network N (see FIG. 4). In this embodiment, the terminal device 10 cooperates with the information providing device 100.

The terminal device 10 is a smart device such as a smartphone or tablet used by a user U, and is a portable terminal device that can communicate with an arbitrary server device via a wireless communication network such as 4G (Generation) or LTE (Long Term Evolution). The terminal device 10 has a screen, such as a liquid crystal display, with a touch panel function, and accepts various operations on display data such as content, for example tap, slide, and scroll operations performed by the user U with a finger, a stylus, or the like. An operation performed on the area of the screen where content is displayed may be treated as an operation on that content. The terminal device 10 is not limited to a smart device and may be an information processing device such as a desktop PC (Personal Computer) or a notebook PC.

The information providing device 100 is an information processing device that cooperates with the terminal device 10 of each user U and provides it with API (Application Programming Interface) services for various applications (hereinafter, apps) and with various data. It is realized by a server device, a cloud system, or the like.

The information providing device 100 may also be an information processing device that provides some kind of web service online to the terminal device 10 of each user U. For example, the information providing device 100 may provide, as web services, Internet connection, search services, SNS (Social Networking Service), electronic commerce (EC), posting sites for fashion coordination (photos (still images) and videos of people wearing fashion items), electronic payment, online games, online banking, online trading, accommodation and ticket reservation, video and music distribution, news, maps, route search, route guidance, railway line information, service status information, weather forecasts, and so on. In practice, the information providing device 100 may cooperate with various servers that provide such web services and act as an intermediary for them, or may itself handle the processing of the web services.

The information providing device 100 can acquire user information about the user U. For example, the information providing device 100 acquires information about attributes of the user U, such as gender, age group, and area of residence. The information providing device 100 then stores and manages this attribute information together with identification information indicating the user U (a user ID or the like).

The information providing device 100 also acquires various kinds of history information (log data) indicating the actions of the user U, either from the terminal device 10 of the user U or from various servers on the basis of the user ID or the like. For example, the information providing device 100 acquires from the terminal device 10 a location history, which is a history of the user U's locations and the corresponding dates and times. It acquires a search history, which is a history of the search queries entered by the user U, from a search server (search engine), an e-commerce server, or a posting server. It acquires a browsing history, which is a history of the content and products (fashion items) viewed by the user U, from a content server, an e-commerce server, or a posting server. It acquires a purchase history (payment history), which is a history of the content and products (fashion items) purchased or paid for by the user U, from an e-commerce server or a payment processing server. It may also acquire a listing history, which is a history of the items the user U has listed on a marketplace, and a sales history from an e-commerce server or a payment processing server. Finally, it acquires from a posting server or an SNS server a posting history, which is a history of the fashion coordination (fashion items) posted by the user U, and a support history, which is a history of the fashion coordination (fashion items) that viewers have supported (liked).

In this embodiment, the information providing device 100 determines, by image analysis and AI (Artificial Intelligence), the combination of a fashion item (clothes, accessories, bags, shoes, hats, etc.) and its additional information on the basis of fashion information about the user's fashion preferences, and displays it at an appropriate position. Here, the information providing device 100 identifies the fashion items included in an image by image analysis and attaches annotation tags (hereinafter, tags) corresponding to those fashion items. Attaching a tag means displaying the tag at a suitable position near the tagging target. For example, the information providing device 100 attaches tags corresponding to fashion items in an image that a user has taken of themselves wearing fashion items, as a photo or video to post to a posting site or SNS, or in an image of another user that was taken and posted in the same way.

For example, the information providing device 100 attaches a tag that displays detailed information about the fashion item. The information providing device 100 may also attach a tag that leads to the product page (sales page, purchase page, advertisement page, etc.) of the fashion item, or to the product page of another fashion item related to it (a product often purchased together with it, a product of the same brand, etc.).

The information providing device 100 may also attach a distinctive tag matched to the fashion item. For example, the information providing device 100 attaches a tag that is highly compatible with the fashion item in terms of the color wheel. Alternatively, it attaches a tag that is estimated to raise the support rate (like rate) among viewers on a posting site or SNS, or a tag that is estimated to raise the rate at which viewers purchase the fashion item. This can encourage viewers to purchase the fashion item.
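The text does not define how color-wheel compatibility is measured. As a minimal sketch, assuming that "compatible" means "close to complementary on the hue circle" (an assumption made purely for illustration), a tag color could be chosen as follows. The function names are hypothetical; only the standard-library `colorsys` module is used.

```python
import colorsys


def hue_degrees(rgb):
    """Return the hue of an (r, g, b) color with 0-255 channels, in degrees [0, 360)."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, _lightness, _saturation = colorsys.rgb_to_hls(r, g, b)
    return hue * 360.0


def hue_compatibility(item_rgb, tag_rgb):
    """Score in [0, 1]; 1.0 means the two hues are 180 degrees apart.

    ASSUMPTION: complementary hues are treated as the most compatible.
    The source does not specify the measure.
    """
    diff = abs(hue_degrees(item_rgb) - hue_degrees(tag_rgb)) % 360.0
    diff = min(diff, 360.0 - diff)  # shortest angular distance, 0..180
    return diff / 180.0


def pick_tag_color(item_rgb, candidate_rgbs):
    """Return the candidate tag color with the highest compatibility score."""
    return max(candidate_rgbs, key=lambda c: hue_compatibility(item_rgb, c))
```

For a pure red item, `pick_tag_color((255, 0, 0), [(255, 0, 0), (0, 255, 255), (255, 255, 0)])` selects cyan, the complementary hue. Other color-harmony rules (analogous, triadic) could be substituted by changing the scoring function.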

Fashion information about a user's fashion preferences can be obtained from each user U's attribute information, history information, and the like. The fashion information includes information about fashion items worn by the user U and fashion items viewed by the user U (fashion item information). In this embodiment, the fashion information includes behavior information (searches, selections, views, purchases, posts) on e-commerce sites such as fashion mail-order sites (for example, "ZOZOTOWN" (registered trademark)) and on fashion coordination posting sites (for example, "WEAR" (registered trademark)). The fashion information also includes information about fashion items that the user U has searched for, viewed, purchased, owned, posted, and so on (fashion item information).

A user's fashion preferences are determined by logic that calculates or estimates them from various kinds of history information (log data) held by e-commerce sites such as fashion mail-order sites and by fashion coordination posting sites whose accounts are linked to this system (or which are part of this system). For example, the information providing device 100 calculates or estimates a "combination of clothes and tag" from the clothes the user has purchased on a fashion mail-order site (defined as worn clothes that match the user information) and the clothes the user has viewed there, and attaches the tag best suited to the clothes. The information providing device 100 likewise calculates or estimates a "combination of clothes and tag" from the clothes the user has viewed on a fashion coordination posting site, and attaches the tag best suited to the clothes.

In addition to the fashion information, the information providing device 100 may attach tags on the basis of information such as the user's daily behavior, habits, and mood. That is, even for the same fashion item, the information providing device 100 may change the content and form of the tag for each viewing user.

In this embodiment, the information providing device 100 tags fashion items (clothes, accessories, bags, shoes, hats, etc.) in a service for taking and posting fashion photos (images), such as a fashion coordination posting site (for example, "WEAR" (registered trademark)). For example, tagging sets up a relationship with a product page on an e-commerce site such as a fashion mail-order site (for example, "ZOZOTOWN" (registered trademark)). The tag is displayed superimposed on the photo. Clicking the tag takes the viewer to the product page where that product (fashion item) is sold on the e-commerce site.

The information providing device 100 also displays multi-view images on a fashion coordination posting site (for example, "WEAR" (registered trademark)). For example, when a photo (image) is displayed on such a site by clicking or scrolling, the information providing device 100 causes it to switch to an image from a different viewpoint automatically, over time, or in response to the user's operation.

[1-1. Annotation tag]
In this embodiment, when attaching an annotation tag to a shooting target in a multi-view image, the information providing device 100 detects the same point of interest (the tagging target) in the image of each viewpoint of the multi-view image, and sets and displays the tag near that point.

As shown in FIG. 1, the information providing device 100 acquires a multi-view image from the terminal device 10 of the user U who is the poster, via the network N (see FIG. 4) (step S1). For example, the information providing device 100 acquires, from the terminal device 10 of the user U who is the poster, the individual images taken from various viewpoints that are used to create the multi-view image. In FIG. 1, within the image, M denotes the person being photographed (the photographed model), B denotes a bag that is a shooting target, and T denotes a tag.

Next, the information providing device 100 applies image recognition or machine learning to all of the acquired images to recognize, identify, and classify the shooting targets (step S2). For example, the information providing device 100 identifies the shooting targets included in each image and classifies them by category. There may be more than one shooting target.
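As a minimal sketch of the classification in step S2, the following groups recognized shooting targets by category across all viewpoint images. The input format (a mapping from a view ID to `(label, category)` pairs) is a hypothetical stand-in for the output of whatever image-recognition model is used; the source does not specify one.

```python
from collections import defaultdict


def classify_by_category(detections_per_view):
    """Group detected shooting targets by category across viewpoint images.

    detections_per_view: {view_id: [(label, category), ...]}
        (hypothetical format standing in for image-recognition output)
    Returns {category: {label: [view_id, ...]}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for view_id, detections in detections_per_view.items():
        for label, category in detections:
            grouped[category][label].append(view_id)
    # Convert nested defaultdicts to plain dicts for a predictable result.
    return {category: dict(labels) for category, labels in grouped.items()}
```

For example, if bag B is detected in views 0 and 1 and a coat only in view 0, the result records under the category of each target which viewpoint images contain it.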

Next, the information providing device 100 estimates the positions of the shooting targets within the multi-view image (their positions in the image) (step S3). In this embodiment, for each image making up the multi-view image (that is, for the image of each viewpoint), the information providing device 100 estimates (or identifies) the three-dimensional position of each shooting target included in that image. The three-dimensional position of a shooting target may be an absolute position, such as coordinates within the image, or a position relative to a reference point or to another shooting target. The three-dimensional position is merely one example.
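The distinction between an absolute position and a position relative to a reference point can be illustrated with a small sketch. The `Position3D` type and the per-view coordinate values below are hypothetical; the source does not specify a coordinate system or an estimation method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position3D:
    x: float
    y: float
    z: float

    def relative_to(self, reference: "Position3D") -> "Position3D":
        """Return this position expressed relative to a reference point."""
        return Position3D(self.x - reference.x,
                          self.y - reference.y,
                          self.z - reference.z)


# Hypothetical absolute estimates, one per viewpoint image, for bag B and
# for a reference point on the photographed model M.
bag_by_view = {0: Position3D(1.0, 2.0, 3.0), 1: Position3D(0.5, 2.0, 3.5)}
model_by_view = {0: Position3D(1.0, 1.0, 1.0), 1: Position3D(0.5, 1.0, 1.5)}

# The relative position of the bag with respect to the model, per view.
relative = {view: bag_by_view[view].relative_to(model_by_view[view])
            for view in bag_by_view}
```

In this made-up data the absolute coordinates differ between the two views, while the position relative to the model is the same in both, which is one reason a relative representation can be convenient.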

Next, the information providing device 100 accepts, from the user U who is a poster or a viewer, a selection of the tagging target (annotation target) from among the shooting targets whose three-dimensional positions in the multi-view image have been estimated (the annotation target candidates) (step S4). For example, the information providing device 100 accepts, from the terminal device 10 of the user U who is a poster or a viewer, via the network N (see FIG. 4), a designation of the product (fashion item) to be tagged and of the product page to be associated with that product. At this time, the information providing device 100 may search the product pages of an e-commerce site such as a fashion mail-order site (for example, "ZOZOTOWN" (registered trademark)) for images similar to the product by image recognition or machine learning, and automatically identify the product page of the product on the basis of the search results.

If the information providing device 100 has not received a selection of the tagging target (annotation target) from the user U who is a poster or a viewer, it may treat all shooting targets in the multi-view image as tagging targets. The information providing device 100 may also treat all shooting targets in the multi-view image as tagging targets unconditionally, regardless of any selection made by the user U. Alternatively, the information providing device 100 may determine the tagging targets from among the shooting targets (annotation target candidates) in accordance with a preset configuration.
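The alternatives above can be combined into one decision function, sketched below. The source lists the options without ranking them, so the order of precedence used here (unconditional tagging, then the user's selection, then a preset, then "tag everything" as the fallback) is an assumption, as are the function and parameter names.

```python
def decide_tagging_targets(candidates, user_selection=None, tag_all=False, preset=None):
    """Return the shooting targets that should be tagged.

    candidates:     annotation target candidates, in display order
    user_selection: targets chosen by the poster or viewer, if any
    tag_all:        tag every candidate regardless of any selection
    preset:         targets named by a preset configuration, if any
    ASSUMPTION: the precedence below is illustrative; the source does not
    define how the alternatives interact.
    """
    if tag_all:
        return list(candidates)
    if user_selection:
        return [c for c in candidates if c in user_selection]
    if preset is not None:
        return [c for c in candidates if c in preset]
    # No selection was received: treat every candidate as a tagging target.
    return list(candidates)
```

With candidates `["bag", "shoes", "hat"]`, a user selection of `{"bag"}` yields `["bag"]`, and calling the function with no selection and no preset yields all three.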

The information providing device 100 may also use a learning model to estimate the tagging targets (annotation targets) from among the shooting targets whose three-dimensional positions in the multi-view image have been estimated. For example, the information providing device 100 may build the learning model by learning combinations of shooting targets that were selected as tagging targets in the past and the tags that were attached to them. Then, when a shooting target is input to the learning model, the information providing device 100 may infer and output a suitable tag if that shooting target is a tagging target.
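The source does not say what kind of learning model is used. As a deliberately simple stand-in, the sketch below "learns" past (shooting target, tag) combinations as a frequency table and suggests the most frequent tag for a given target category. The class and method names are hypothetical, and a real system would presumably use a trained machine-learning model instead.

```python
from collections import Counter, defaultdict


class TagSuggester:
    """Frequency-table stand-in for the learning model (an assumption)."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def learn(self, target_category, tag):
        """Record that `tag` was attached to a target of this category."""
        self._counts[target_category][tag] += 1

    def suggest(self, target_category):
        """Return the most frequently attached tag, or None if the category
        has never been selected as a tagging target."""
        counts = self._counts.get(target_category)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

Returning `None` for an unseen category mirrors the condition in the text that a tag is output only when the shooting target is a tagging target.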

Next, the information providing device 100 displays, on the terminal device 10 of the user U who is a viewer, via the network N (see FIG. 4), the annotation tag attached to the selected tagging target (annotation target) among the shooting targets in the multi-view image (step S5).

Next, the information providing device 100 changes, via the network N (see FIG. 4), the position of the annotation tag displayed on the terminal device 10 of the user U who is a viewer, in accordance with changes in the viewpoint of the multi-view image (step S6). For example, when the viewpoint of the multi-view image is changed, the information providing device 100 automatically follows the tagging target (annotation target) in the multi-view image, attaches the annotation tag, and displays it at a suitable position. At this time, the information providing device 100 may attach the annotation tag to the tagging target and display it at a suitable position for each image from a different viewpoint (that is, each time the viewpoint changes).

At this time, the information providing device 100 places (displays) the annotation tag so that it does not overlap other targets or other tags. The information providing device 100 also keeps placing (displaying) the tag at a position that preserves the positional relationship between the annotation tag and the tagging target (annotation target).
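One way to picture a tag following its target while keeping the same positional relationship is to project the target's three-dimensional position for the current viewpoint and place the tag at a fixed screen offset from it. The sketch below assumes a camera that orbits the subject around the vertical axis and an orthographic projection. Both are simplifying assumptions for illustration, not something the source specifies.

```python
import math


def project(point, yaw_degrees):
    """Orthographically project a 3D point (x, y, z) for a camera that has
    orbited the subject by yaw_degrees around the vertical (y) axis.

    Returns (screen_x, screen_y, depth). ASSUMPTION: orbiting camera and
    orthographic projection are illustrative choices.
    """
    x, y, z = point
    angle = math.radians(yaw_degrees)
    screen_x = x * math.cos(angle) - z * math.sin(angle)
    depth = x * math.sin(angle) + z * math.cos(angle)
    return screen_x, y, depth


def tag_position(target_point, yaw_degrees, offset=(0.1, 0.1)):
    """Place the tag at a fixed screen offset from the projected target, so
    the tag-to-target relationship is preserved as the viewpoint changes."""
    screen_x, screen_y, _depth = project(target_point, yaw_degrees)
    return screen_x + offset[0], screen_y + offset[1]
```

As the viewpoint moves from 0 to 90 degrees, a target at `(1.0, 0.0, 0.0)` moves on screen from x = 1.0 to roughly x = 0.0, and the tag moves with it while staying at the same offset.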

As described above, in this embodiment, the information providing device 100 acquires a multi-view image from a user who is a poster or a viewer (a multi-view image posted by the poster, a multi-view image designated by the viewer, or the like), specifies a target of annotation tagging from among the shooting targets included in the multi-view image, and estimates the three-dimensional position of the tagging target within the multi-view image. The information providing device 100 then attaches a tag to the tagging target in accordance with the three-dimensional position of the tagging target. That is, the information providing device 100 displays the tag together with the tagging target on the screen of the terminal device 10 and thereby presents it to the user.

The information providing device 100 also identifies and classifies shooting targets by image recognition or machine learning for each viewpoint image of the multi-view image. The information providing device 100 accepts a selection of the tagging target from a user who is a poster or a viewer. Then, in accordance with that selection, the information providing device 100 specifies the target of annotation tagging from among the shooting targets included in the multi-view image.

For example, the information providing device 100 accepts, from a user who is a poster or a viewer, a selection of the tagging target and a designation of the web page to be associated with the tagging target. Alternatively, the information providing device 100 searches a plurality of web pages on the network for images similar to the image of the tagging target by image recognition or machine learning, and automatically identifies a web page containing a similar image as the web page to be associated with the tagging target.
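The automatic identification of a web page by similar-image search might be sketched as a nearest-neighbor lookup over image feature vectors. The feature vectors, the cosine-similarity measure, the threshold value, and the example URLs are all assumptions made for illustration; the source only states that image recognition or machine learning is used.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_product_page(target_features, pages, threshold=0.8):
    """Return the URL whose image features best match the tagging target.

    pages: {url: feature_vector} (hypothetical precomputed features)
    Returns None when no page reaches the similarity threshold.
    """
    best_url, best_score = None, threshold
    for url, features in pages.items():
        score = cosine_similarity(target_features, features)
        if score >= best_score:
            best_url, best_score = url, score
    return best_url
```

A threshold is used so that a tagging target with no sufficiently similar product image is left unassociated rather than linked to an unrelated page.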

When attaching a tag to a tagging target, the information providing device 100 places the tag so that it does not overlap other targets or other tags. The information providing device 100 attaches the tag to the tagging target if the tagging target is not hidden by another target. Even when the tagging target is hidden by another target, if the tag of the tagging target has a higher display priority than the tag of the other target, the information providing device 100 attaches the tag to the tagging target and does not attach a tag to the other target.
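A simple way to keep a tag from overlapping other targets and tags is to test a few candidate positions around the tagging target against the rectangles that are already occupied. The following sketch assumes axis-aligned bounding boxes in screen coordinates and a fixed search order (right, left, above, below). Both are illustrative choices, not taken from the source.

```python
def overlaps(a, b):
    """True if two rectangles (left, top, right, bottom) overlap.
    Rectangles that merely touch along an edge do not count as overlapping."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def place_tag(target_box, tag_size, occupied):
    """Return the first free rectangle next to target_box, or None.

    target_box: (left, top, right, bottom) of the tagging target
    tag_size:   (width, height) of the tag
    occupied:   rectangles of other targets and already placed tags
    """
    left, top, right, bottom = target_box
    width, height = tag_size
    candidates = [
        (right, top, right + width, top + height),       # right of the target
        (left - width, top, left, top + height),         # left of the target
        (left, top - height, left + width, top),         # above the target
        (left, bottom, left + width, bottom + height),   # below the target
    ]
    for rect in candidates:
        if not any(overlaps(rect, other) for other in occupied):
            return rect
    return None
```

If another object already occupies the space to the right of the target, the tag is placed to the left instead. When every candidate is blocked the function returns `None`, leaving the caller to decide whether to hide the tag.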

Each of the processes described above may be performed by the terminal device 10, through the functions of an app or the like, instead of by the information providing device 100. That is, the processing may be completed entirely on the terminal device 10.

[1-2. Changing the display position of tags]
In this embodiment, the information providing device 100 changes the display position (display mode) of the content indicating a shooting target (the annotation tag) in accordance with the positional relationship of the shooting targets captured in the multi-view image. There may be more than one piece of content (annotation tag) indicating shooting targets.

The information providing device 100 changes the display position of a tag in accordance with the positional relationship between the annotation tag and the tagging target (annotation target).

When an annotation target is hidden behind the photographed model, the user, or another shooting target, the information providing device 100 hides the tag attached to that annotation target.

For example, the information providing device 100 does not display the annotation tag when the imaged range/displayed range of the annotation target is equal to or less than a predetermined value, or when the annotation target is hidden behind the user or another object.

Alternatively, the information providing device 100 may change the front-to-back order of the tags of the annotation target and another object. When there are a plurality of tags and nothing has been specified (no setting has been made), the information providing device 100 normally displays the tag of the object placed closest to the front of the screen. In this embodiment, when an object with a higher priority than the frontmost object is hidden behind it, the information providing device 100 does not display the tag of the frontmost object and instead displays the tag of the higher-priority object.

At this time, the information providing device 100 may set a priority for each object and determine which tag to display based on the relative magnitude of the priorities. The information providing device 100 may also treat an object that is presumed to be of greater interest to the user than the frontmost object as an object with a higher priority than the frontmost object.
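The visibility rule and the priority rule can be combined in a short sketch. It assumes that each object carries a visible-area ratio for the current viewpoint, an integer priority, and the name of the object in front of it; the `SceneObject` fields, the function name `tags_to_display`, and the 0.3 threshold are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SceneObject:
    name: str
    visible_ratio: float               # fraction visible from this viewpoint (0..1)
    priority: int = 0                  # larger value means higher display priority
    occluded_by: Optional[str] = None  # name of the object in front, if any


def tags_to_display(objects: List[SceneObject],
                    min_visible_ratio: float = 0.3) -> Set[str]:
    """Return the names of the objects whose tags should be shown."""
    by_name = {o.name: o for o in objects}
    # Default rule: show a tag only when enough of the object is visible.
    shown = {o.name for o in objects if o.visible_ratio > min_visible_ratio}
    # Priority rule: a hidden object that outranks its occluder takes over the tag.
    for o in objects:
        occluder = by_name.get(o.occluded_by) if o.occluded_by else None
        if o.name not in shown and occluder and o.priority > occluder.priority:
            shown.discard(occluder.name)
            shown.add(o.name)
    return shown
```

With equal priorities, only the frontmost visible object keeps its tag; raising the priority of a hidden bag above that of the model in front of it moves the tag to the bag.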

As described above, in this embodiment, the information providing device 100 identifies the target of annotation tagging from among the shooting targets included in the multi-view image and attaches a tag in accordance with the position of the tagging target. Thereafter, when the on-screen position of the tagging target changes as the viewpoint of the multi-view image changes, the information providing device 100 changes the display position of the tag to follow the changed position of the tagging target.

Further, the information providing device 100 places the tag at a position where the positional relationship between the tagging target and the tag is maintained even when the viewpoint of the multi-view image is changed.
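One way to keep the tag attached to its target across viewpoints is to store the tag as a fixed 3D offset from the target's estimated 3D position and project both for each viewpoint. The sketch below assumes a pinhole camera orbiting the vertical axis; the camera model, the focal length, the camera distance, and the function names are illustrative assumptions rather than the method described in the specification.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project(point: Vec3, yaw_deg: float, focal: float = 500.0,
            camera_distance: float = 3.0,
            center: Tuple[float, float] = (320.0, 240.0)) -> Tuple[float, float]:
    """Project a 3D point to screen coordinates for a given orbit angle."""
    x, y, z = point
    a = math.radians(yaw_deg)
    # Orbiting the camera is modelled as rotating the scene about the y axis.
    xr = x * math.cos(a) - z * math.sin(a)
    zr = x * math.sin(a) + z * math.cos(a)
    depth = camera_distance + zr  # camera sits at z = -camera_distance
    return (center[0] + focal * xr / depth, center[1] - focal * y / depth)


def tag_screen_position(target: Vec3, offset: Vec3,
                        yaw_deg: float) -> Tuple[float, float]:
    """Project the tag anchor, defined as the target position plus a 3D offset."""
    anchor = (target[0] + offset[0], target[1] + offset[1], target[2] + offset[2])
    return project(anchor, yaw_deg)
```

Because the offset is defined in scene coordinates, a tag placed above the target stays above it on screen at every orbit angle in this model.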

[1-3. Display of the shooting guide]
In this embodiment, when a user photographs himself or herself (in-camera shooting) with the terminal device 10 (imaging device) having a camera (such as an in-camera) in order to post a multi-view image, the information providing device 100 displays a predetermined shooting guide on the screen (photo shooting screen) of the user's terminal device 10 used for shooting (or of a display device mounted on or connected to it), and changes the shooting guide each time an image is captured (each time the viewpoint moves). FIG. 2 is an explanatory diagram showing an overview of the display of the shooting guide. In the images in FIG. 2, the person being photographed (photographed model) is denoted M and the bag being photographed is denoted B.

As shown in FIG. 2, the information providing device 100 confirms that the terminal device 10 having a camera has started shooting a multi-view image for the user to post (step S11). For example, the information providing device 100 receives, from the user's terminal device 10 via the network N (see FIG. 4), a signal or data indicating the start of multi-view image shooting.

Next, the information providing device 100 cooperates with the user's terminal device 10 via the network N (see FIG. 4) (step S12). For example, the information providing device 100 may control the camera application of the terminal device 10 via an API. That is, the subsequent processing may be performed by the information providing device 100 in cooperation with the terminal device 10.

Next, the terminal device 10 displays the shooting guide on its screen when the user photographs himself or herself (in-camera shooting) (step S13).

Next, when the user's pose, the bag the user is carrying, or the like deviates from the shooting guide, the terminal device 10 notifies the user of the part that deviates (step S14). The notification may be given by screen display or by voice guidance.

Next, when the user's pose, the bag the user is carrying, and the like match the shooting guide, the terminal device 10 automatically captures the image (step S15). Even if they do not match the shooting guide completely, the terminal device 10 may determine that they match and capture automatically as long as they match at or above a predetermined ratio. At this time, the information providing device 100 may instruct the terminal device 10 in advance to capture automatically when the user's pose, the bag the user is carrying, and the like match the shooting guide.
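The "match at or above a predetermined ratio" check in step S15 can be sketched as follows. The sketch assumes that the user's pose and the shooting guide are both available as named 2D keypoints in normalized image coordinates; the keypoint representation, the per-joint tolerance, and the 0.8 ratio are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

Keypoints = Dict[str, Tuple[float, float]]  # joint name -> normalized (x, y)


def match_ratio(user: Keypoints, guide: Keypoints, tolerance: float = 0.05) -> float:
    """Fraction of guide joints that the user matches within `tolerance`."""
    if not guide:
        return 0.0
    matched = sum(
        1 for name, guide_xy in guide.items()
        if name in user and math.dist(user[name], guide_xy) <= tolerance
    )
    return matched / len(guide)


def should_auto_capture(user: Keypoints, guide: Keypoints,
                        min_ratio: float = 0.8) -> bool:
    """Trigger the shutter when enough joints line up with the guide."""
    return match_ratio(user, guide) >= min_ratio
```

Lowering `min_ratio` makes the automatic capture more forgiving, which corresponds to treating a partial match as a match.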

Next, the terminal device 10 identifies the shooting targets (annotation target candidates) included in the captured image and notifies the user of them (step S16). Details are described later.

Next, the terminal device 10 changes the shooting guide in response to the viewpoint change made to generate the multi-view image (step S17). That is, the terminal device 10 displays the shooting guide corresponding to the new viewpoint.

Next, after images from all viewpoints required to generate the multi-view image have been captured, the terminal device 10 posts the multi-view image (step S18). For example, the terminal device 10 generates the multi-view image from the captured images and automatically posts it to the information providing device 100 via the network N (see FIG. 4). Alternatively, the terminal device 10 may post the captured images as they are to the information providing device 100 via the network N (see FIG. 4). In this case, the information providing device 100 may generate the multi-view image from the posted images after acquiring them.

In this embodiment, the user's terminal device 10 used to shoot the multi-view image displays on its screen, as the shooting guide, a guide for facial expression, a guide for posture (pose), a guide for how to hold the product, a guide for how to wear it, or the like, and changes the shooting guide each time an image is captured from a different position or angle.

For example, the terminal device 10 may display on its screen, as the shooting guide, a silhouette or outline of the facial expression, posture (pose), way of holding the product, or way of wearing it that the user should adopt, and may change the silhouette or outline each time an image is captured from a different position or angle. In this case, the user changes his or her own facial expression, posture (pose), way of holding the product, or way of wearing it to match the silhouette or outline displayed on the screen.

The terminal device 10 may also provide voice guidance in step with the display and changes of the shooting guide. In addition, on the shooting screen, the terminal device 10 may attach a tag showing a comment about the required change to the place or body part where the user needs to change the facial expression, posture (pose), way of holding the product, way of wearing it, or the like.

(In the case of a pose guide)
Here, a pose guide is described as an example of the shooting guide. The user's terminal device 10 having a camera displays a pose guide on its screen when a multi-view image is shot. The terminal device 10 changes the shooting guide each time an image is captured (each time the viewpoint moves). That is, the pose guide changes from one shot to the next.

Since the camera of the terminal device 10 is fixed to the device, the user changes the position and angle of the camera by moving or rotating the terminal device 10. The shooting guide covers poses, facial expressions, ways of holding an item, and the like. A pose includes the orientation of the face, the orientation of the body, and so on. The pose guide also changes gradually as the camera position and angle change. That is, the terminal device 10 changes the pose guide in stages in accordance with the position and angle of the camera.

When the terminal device 10 determines that the user's pose matches the pose guide on the screen, it automatically captures the image. At this time, the terminal device 10 determines whether the user's pose is appropriate. When the terminal device 10 determines that the pose is not appropriate, it gives specific advice and guides the user toward an appropriate pose.

For example, the terminal device 10 displays pose guides, or gives voice guidance, such as tucking in the chin, opening or closing the legs, tilting the body, arching the back, moving a certain number of steps or centimeters in a specific direction (forward, backward, left, or right), or how to hold a bag, pouch, or the like.

When shooting with the in-camera, the terminal device 10 displays the pose guide superimposed on the user on the screen. The pose guide may be, for example, a silhouette or outline, or a semi-transparent overlay. It may also be a voice prompt such as "Please do this." The terminal device 10 may identify the difference between the user's current pose and the pose guide and display or announce the part that differs, for example "Raise your right hand higher" or "Raise your leg a little more." When the terminal device 10 determines that the user's pose matches the pose guide, it may capture the image automatically.
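Identifying the part that differs from the pose guide can be sketched as a per-joint comparison. The sketch assumes normalized 2D keypoints with the y axis pointing down and ignores the left/right mirroring that an in-camera preview may introduce; the function name `pose_hints`, the message wording, and the tolerance are hypothetical.

```python
from typing import Dict, List, Tuple

Keypoints = Dict[str, Tuple[float, float]]  # joint name -> (x, y), y grows downward


def pose_hints(user: Keypoints, guide: Keypoints,
               tolerance: float = 0.05) -> List[str]:
    """List the joints that deviate from the guide and the direction to move them."""
    hints: List[str] = []
    for joint, (gx, gy) in guide.items():
        if joint not in user:
            hints.append(f"{joint}: not visible")
            continue
        ux, uy = user[joint]
        dx, dy = gx - ux, gy - uy
        if abs(dy) > tolerance:
            # A smaller y is higher on the screen.
            hints.append(f"{joint}: move {'up' if dy < 0 else 'down'}")
        if abs(dx) > tolerance:
            hints.append(f"{joint}: move {'left' if dx < 0 else 'right'}")
    return hints
```

An empty list means that every guide joint is within tolerance, which is the condition under which the automatic capture could fire; each returned string could be shown on screen or passed to a text-to-speech function.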

The pose guide may also be a guide for facial expression, for example "Smile after you have made one full turn." That is, the facial expression need not be the same in every shot. The terminal device 10 may also identify the pose in the first image and display a pose guide corresponding to the identified pose.

As described above, in this embodiment, the terminal device 10 displays a predetermined shooting guide on the screen when shooting a multi-view image. For example, the terminal device 10 displays on the screen, as the shooting guide, a guide concerning at least one of the pose, facial expression, and way of holding an item of the person being photographed (photographed model) among the subjects. The terminal device 10 also displays a silhouette, an outline, or a semi-transparent rendering of the subject on the screen as the shooting guide.

When photographing a subject, the terminal device 10 displays the shooting guide superimposed on the subject. The terminal device 10 also identifies the pose of the person being photographed in the first captured image, and selects and displays a shooting guide corresponding to the identified pose.

Each time the shooting viewpoint moves, the terminal device 10 changes the shooting guide displayed on the screen according to the viewpoint. The terminal device 10 may also change the shooting guide in stages as the shooting viewpoint moves.

The terminal device 10 also issues an instruction, within the terminal device 10, to capture automatically when the subject matches the shooting guide. When the subject deviates from the shooting guide, the terminal device 10 notifies the photographer (user U) of the part where the subject deviates. The subject may be the photographer himself or herself; that is, the photographer and the person being photographed may be the same person.

Note that each of the above processes may be performed by the information providing device 100 via an API instead of by the terminal device 10.

[1-4. Recognition of annotation target candidates]
In this embodiment, when capturing the images that make up a multi-view image, the terminal device 10 identifies the shooting targets (annotation target candidates) included in each captured image and notifies the user of the identified shooting targets.

When a multi-view image is registered, the terminal device 10 recognizes the shooting targets included in the image by image recognition or machine learning each time an image is captured, and notifies the user of information about the shooting targets by screen display or voice. There may be a plurality of shooting targets. Identifying and announcing the shooting targets at capture time makes later tagging easier and removes the need to tag in advance. The terminal device 10 may also notify the user of the tagging target (annotation target) beforehand and, at each capture, notify the user whether that tagging target has been captured.

The terminal device 10 may also notify the user, by screen display or voice, with messages such as "The dress has been captured" or "N shots of the back have been captured." Each time an image is captured, the terminal device 10 may also tell the user how many more images are to be taken, for example "N shots remaining." That is, at each capture, the terminal device 10 may notify the user of the number of shots taken of each shooting target and of the number of images still needed to reach the number required for the multi-view image. The terminal device 10 may also notify the user, by screen display or voice, of the camera positions and angles for which shooting has (or has not) been completed. Further, if any viewpoint (image) required for generating the multi-view image has been missed, the terminal device 10 may notify the user by screen display or voice. The terminal device 10 may give these notifications by attaching and displaying a tag showing the notification content.

The terminal device 10 may also notify the user of a shooting target that has become invisible during shooting (hidden behind the user or another object) because the position or angle of the camera has changed. The terminal device 10 may give this notification by attaching and displaying a tag indicating that the shooting target is hidden behind another object.
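The bookkeeping behind these notifications can be sketched with plain collections. The sketch assumes that each shot is recorded as a viewpoint angle plus the labels recognized in it, and that the required viewpoints are known in advance; the function names and the record layout are assumptions, and the recognition step itself is outside the sketch.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def capture_progress(required_views: List[int],
                     shots: Iterable[Tuple[int, List[str]]]) -> Dict[str, object]:
    """Summarize shots per object, remaining shot count, and missing viewpoints."""
    shots = list(shots)
    taken = {view for view, _ in shots}
    # Count each object at most once per shot.
    per_object = Counter(label for _, labels in shots for label in set(labels))
    missing = [v for v in required_views if v not in taken]
    return {
        "shots_per_object": dict(per_object),
        "remaining": len(missing),
        "missing_views": missing,
    }


def newly_hidden(previous_labels: Set[str], current_labels: Set[str]) -> Set[str]:
    """Objects recognized in the previous shot but not in the current one."""
    return previous_labels - current_labels
```

The returned summary could feed messages such as "N shots remaining", while `newly_hidden` gives the objects for which a "hidden behind another object" tag might be shown.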

[1-5. Replacing the face image of the photographed model]
In this embodiment, the information providing device 100 replaces the face of the person being photographed (photographed model) in the multi-view image with the face of the user U who is the viewer. That is, the information providing device 100 replaces the face of the photographed model in the multi-view image with another person's face. FIG. 3 is an explanatory diagram showing an overview of the replacement of the face image of the photographed model. In the images in FIG. 3, M1 denotes the photographed model before the face image is replaced, B denotes the bag photographed together with the model, and M+U denotes the photographed model whose face image has been replaced with the face image of the user U.

As shown in FIG. 3, the information providing device 100 acquires multi-view face images of the face of the user U from the terminal device 10 of the user U via the network N (see FIG. 4) (step S21). For example, the information providing device 100 acquires, from the terminal device 10 of the user U, multi-view face images in which the face of the user U, who views the posted multi-view image, is captured from a plurality of viewpoints.

Next, the information providing device 100 identifies the face of the photographed model in the multi-view image to be viewed by the user U (step S22). For example, the information providing device 100 identifies the face of the photographed model for each viewpoint of the multi-view image. In this embodiment, the multi-view image to be viewed contains the face of a photographed model who is a different person from the user U. In practice, however, it may be a multi-view image that contains the face of the user U himself or herself.

Next, the information providing device 100 identifies the shooting viewpoints from the multi-view image to be viewed (step S23). For example, the information providing device 100 identifies the position and angle of the face of the photographed model for each viewpoint of the multi-view image.

Next, the information providing device 100 changes the face of the photographed model in the multi-view image to be viewed to the face of the user U, who is the viewer, according to the shooting viewpoint (step S24). For example, the information providing device 100 generates images in which the face of the photographed model is replaced with the user's face according to the position and angle of the model's face at each viewpoint of the multi-view image. The information providing device 100 performs the replacement so that it looks as natural as possible (with as little visual incongruity as possible). When a plurality of multi-view images are viewed at the same time, the information providing device 100 converts the faces of the respective photographed models to the face of the user U all at once, matching the shooting viewpoint of each multi-view image.
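Matching the user's face to each viewpoint can be sketched as a nearest-angle lookup. The sketch assumes that the face angle is reduced to a single yaw value in degrees and that the user's multi-view face images are keyed by yaw; the function names and file names are hypothetical, and the actual blending of the face into the image is not shown.

```python
from typing import Dict, List


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def pick_user_faces(model_face_yaws: List[float],
                    user_faces: Dict[float, str]) -> List[str]:
    """For each model face yaw, pick the user face image with the closest yaw."""
    return [
        user_faces[min(user_faces, key=lambda yaw: angular_distance(yaw, target))]
        for target in model_face_yaws
    ]
```

The circular distance keeps a face at 350 degrees paired with the frontal image at 0 degrees rather than with a side view.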

Next, when changing the face of the photographed model in the multi-view image to the face of the user U, the information providing device 100 adjusts the height of the photographed model as necessary (step S25). That is, the information providing device 100 may change not only the face but also the height of the photographed model to suit the user. For example, the information providing device 100 may change the height of the photographed model to the height of the user. The information providing device 100 may also adjust the height of the photographed model based on the model's face and the user's face, or to fit the size of the background or of a shooting target photographed together with the model.
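The height adjustment can be illustrated as a uniform rescaling of the model's bounding box. The sketch assumes that both heights are known in centimeters and that the feet should stay on the same ground line; the function names and the bounding-box representation are assumptions, and a real implementation would also need to resample the image region.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), y is the top edge


def height_scale(model_height_cm: float, user_height_cm: float) -> float:
    """Scale factor that maps the model's height to the user's height."""
    return user_height_cm / model_height_cm


def scale_about_feet(box: Box, scale: float) -> Box:
    """Scale a body box uniformly, keeping its bottom center fixed."""
    x, y, w, h = box
    new_w, new_h = w * scale, h * scale
    new_x = x + (w - new_w) / 2.0  # keep the box horizontally centered
    new_y = y + (h - new_h)        # keep the bottom edge (the feet) in place
    return (new_x, new_y, new_w, new_h)
```

Anchoring the scaling at the feet keeps the person standing on the same spot of the background while the head moves up or down.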

When the face of the photographed model in the multi-view image has been changed to the user's face, or when the photographed model is the user himself or herself, the information providing device 100 may change the user's hairstyle, hair color (including its shade), facial expression, and the like in the image through image processing and editing. For example, long hair may be changed to short hair, black hair may be changed to brown, and the corners of the eyes or mouth may be raised or lowered. Alternatively, in response to a user instruction or the like, the information providing device 100 may convert the face image currently displayed in the multi-view image (such as the user's face image after conversion) into a face image with a different hairstyle, hair color (including its shade), facial expression, or the like.

Next, the information providing device 100 causes the converted multi-view image to be displayed on the terminal device 10 of the user U via the network N (see FIG. 4) (step S26).

In the above description, the information providing device 100 converts the face of the photographed model in the multi-view image to the face of the user U who is the viewer, but in practice the conversion is not limited to the face of the user U. The information providing device 100 may convert the face to that of any person. For example, it may use the face of a friend of the user U who would receive, as a gift, a product corresponding to a tagging target (annotation target) among the shooting targets included in the multi-view image.

Likewise, although in the above description the information providing device 100 converts the face of the photographed model to another person's face, the conversion is not limited to another person's face. The information providing device 100 may convert the face to a different face of the same person. For example, when the photographed model in a multi-view image is the user U who is viewing it (that is, the photographed model and the viewer are the same person), the information providing device 100 may convert the face of the user U in that multi-view image to a different face of the user U.

The information providing device 100 may use a technique such as deepfake to convert the face of the photographed model in the multi-view image to be viewed. The information providing device 100 may also use a known technique for generating multi-view images from a single-view face image to generate multi-view images of the replacement face, and use them to convert the face in the image.

[2. Configuration example of the information processing system]
Next, the configuration of the information processing system 1 including the information providing device 100 according to the embodiment is described with reference to FIG. 4. FIG. 4 is a diagram showing a configuration example of the information processing system 1 according to the embodiment. As shown in FIG. 4, the information processing system 1 includes a terminal device 10 and an information providing device 100. These devices are communicably connected by wire or wirelessly via a network N. The network N is, for example, a LAN (Local Area Network) or a WAN (Wide Area Network) such as the Internet.

The number of each device included in the information processing system 1 shown in FIG. 4 is not limited to that illustrated. For example, FIG. 4 shows only one terminal device 10 to simplify the illustration, but this is merely an example and not a limitation; there may be two or more.

The terminal device 10 is an information processing device used by the user U. For example, the terminal device 10 is a smart device such as a smartphone or tablet terminal, a feature phone, a PC (Personal Computer), a PDA (Personal Digital Assistant), a game machine or AV device with a communication function, a car navigation system, a wearable device such as a smartwatch or head-mounted display, smart glasses, or the like.

The terminal device 10 can connect to the network N and communicate with the information providing device 100 via a wireless communication network such as LTE (Long Term Evolution), 4G (4th Generation), or 5G (5th Generation: fifth-generation mobile communication system), or via short-range wireless communication such as Bluetooth (registered trademark) or wireless LAN (Local Area Network).

情報提供装置１００は、例えばＰＣやサーバ装置、あるいはメインフレーム又はワークステーション等である。なお、情報提供装置１００は、クラウドコンピューティングにより実現されてもよい。 The information providing device 100 is, for example, a PC, a server device, a mainframe, a workstation, or the like. Note that the information providing apparatus 100 may be realized by cloud computing.

〔３．端末装置の構成例〕
次に、図５を用いて、端末装置１０の構成について説明する。図５は、端末装置１０の構成例を示す図である。図５に示すように、端末装置１０は、通信部１１と、表示部１２と、入力部１３と、測位部１４と、撮像部１５と、センサ部２０と、制御部３０（コントローラ）と、記憶部４０とを備える。 [3. Configuration example of terminal device]
Next, the configuration of the terminal device 10 will be described with reference to FIG. 5. FIG. 5 is a diagram showing a configuration example of the terminal device 10. As shown in FIG. 5, the terminal device 10 includes a communication unit 11, a display unit 12, an input unit 13, a positioning unit 14, an imaging unit 15, a sensor unit 20, a control unit 30 (controller), and a storage unit 40.

（通信部１１）
通信部１１は、ネットワークＮ（図４参照）と有線又は無線で接続され、ネットワークＮを介して、情報提供装置１００との間で情報の送受信を行う。例えば、通信部１１は、ＮＩＣ（Network Interface Card）やアンテナ等によって実現される。 (Communication unit 11)
The communication unit 11 is connected to the network N (see FIG. 4) by wire or wirelessly, and transmits and receives information to and from the information providing device 100 via the network N. For example, the communication unit 11 is implemented by a NIC (Network Interface Card), an antenna, or the like.

（表示部１２）
表示部１２は、位置情報等の各種情報を表示する表示デバイスである。例えば、表示部１２は、液晶ディスプレイ（ＬＣＤ：Liquid Crystal Display）や有機ＥＬディスプレイ（Organic Electro-Luminescent Display）である。また、表示部１２は、タッチパネル式のディスプレイであるが、これに限定されるものではない。 (Display unit 12)
The display unit 12 is a display device that displays various information such as position information. For example, the display unit 12 is a liquid crystal display (LCD) or an organic EL display (Organic Electro-Luminescent Display). Also, the display unit 12 is a touch panel display, but is not limited to this.

（入力部１３）
入力部１３は、利用者Ｕから各種操作を受け付ける入力デバイスである。例えば、入力部１３は、文字や数字等を入力するためのボタン等を有する。なお、入力部１３は、入出力ポート（I/O port）やＵＳＢ（Universal Serial Bus）ポート等であってもよい。また、表示部１２がタッチパネル式のディスプレイである場合、表示部１２の一部が入力部１３として機能する。また、入力部１３は、利用者Ｕから音声入力を受け付けるマイク等であってもよい。マイクはワイヤレスであってもよい。 (Input unit 13)
The input unit 13 is an input device that receives various operations from the user U. For example, the input unit 13 has buttons and the like for inputting characters, numbers, and the like. The input unit 13 may be an input/output port (I/O port), a USB (Universal Serial Bus) port, or the like. When the display unit 12 is a touch panel display, a part of the display unit 12 functions as the input unit 13. The input unit 13 may also be a microphone or the like that receives voice input from the user U. The microphone may be wireless.

（測位部１４）
測位部１４は、ＧＰＳ（Global Positioning System）の衛星から送出される信号（電波）を受信し、受信した信号に基づいて、自装置である端末装置１０の現在位置を示す位置情報（例えば、緯度及び経度）を取得する。すなわち、測位部１４は、端末装置１０の位置を測位する。なお、ＧＰＳは、ＧＮＳＳ（Global Navigation Satellite System）の一例に過ぎない。 (Positioning unit 14)
The positioning unit 14 receives signals (radio waves) transmitted from GPS (Global Positioning System) satellites and, based on the received signals, acquires position information (for example, latitude and longitude) indicating the current position of the terminal device 10 itself. That is, the positioning unit 14 measures the position of the terminal device 10. GPS is merely one example of a GNSS (Global Navigation Satellite System).

また、測位部１４は、ＧＰＳ以外にも、種々の手法により位置を測位することができる。例えば、測位部１４は、位置補正等のための補助的な測位手段として、下記のように、端末装置１０の様々な通信機能を利用して位置を測位してもよい。 Also, the positioning unit 14 can measure the position by various methods other than GPS. For example, the positioning unit 14 may measure the position using various communication functions of the terminal device 10 as described below as auxiliary positioning means for position correction and the like.

（撮像部１５）
撮像部１５は、被写体を撮影する画像センサ（カメラ）である。例えば、撮像部１５は、ＣＭＯＳイメージセンサやＣＣＤイメージセンサ等である。なお、撮像部１５は、内蔵カメラに限らず、端末装置１０と通信可能なワイヤレスカメラや、Ｗｅｂカメラ等の外付けカメラであってもよい。 (Imaging unit 15)
The imaging unit 15 is an image sensor (camera) that photographs a subject. For example, the imaging unit 15 is a CMOS image sensor, a CCD image sensor, or the like. Note that the imaging unit 15 is not limited to the built-in camera, and may be a wireless camera that can communicate with the terminal device 10 or an external camera such as a web camera.

（Ｗｉ－Ｆｉ測位）
例えば、測位部１４は、端末装置１０のＷｉ－Ｆｉ（登録商標）通信機能や、各通信会社が備える通信網を利用して、端末装置１０の位置を測位する。具体的には、測位部１４は、Ｗｉ－Ｆｉ通信等を行い、付近の基地局やアクセスポイントとの距離を測位することにより、端末装置１０の位置を測位する。 (Wi-Fi positioning)
For example, the positioning unit 14 measures the position of the terminal device 10 using the Wi-Fi (registered trademark) communication function of the terminal device 10 or the communication network provided by each communication company. Specifically, the positioning unit 14 performs Wi-Fi communication or the like and measures the position of the terminal device 10 by measuring the distance to a nearby base station or access point.

（ビーコン測位）
また、測位部１４は、端末装置１０のＢｌｕｅｔｏｏｔｈ（登録商標）機能を利用して位置を測位してもよい。例えば、測位部１４は、Ｂｌｕｅｔｏｏｔｈ（登録商標）機能によって接続されるビーコン（beacon）発信機と接続することにより、端末装置１０の位置を測位する。 (beacon positioning)
The positioning unit 14 may also measure the position using the Bluetooth (registered trademark) function of the terminal device 10. For example, the positioning unit 14 measures the position of the terminal device 10 by connecting to a beacon transmitter via the Bluetooth (registered trademark) function.

（地磁気測位）
また、測位部１４は、予め測定された構造物の地磁気のパターンと、端末装置１０が備える地磁気センサとに基づいて、端末装置１０の位置を測位する。 (geomagnetic positioning)
Further, the positioning unit 14 measures the position of the terminal device 10 based on a geomagnetic pattern of a structure measured in advance and the geomagnetic sensor provided in the terminal device 10.

（ＲＦＩＤ測位）
また、例えば、端末装置１０が駅改札や店舗等で使用される非接触型ＩＣカードと同等のＲＦＩＤ（Radio Frequency Identification）タグの機能を備えている場合、もしくはＲＦＩＤタグを読み取る機能を備えている場合、端末装置１０によって決済等が行われた情報とともに、使用された位置が記録される。測位部１４は、かかる情報を取得することで、端末装置１０の位置を測位してもよい。また、位置は、端末装置１０が備える光学式センサや、赤外線センサ等によって測位されてもよい。 (RFID positioning)
Further, for example, when the terminal device 10 has an RFID (Radio Frequency Identification) tag function equivalent to that of a contactless IC card used at station ticket gates, stores, and the like, or has a function of reading RFID tags, the location of use is recorded together with information on the payment or other transaction made with the terminal device 10. The positioning unit 14 may measure the position of the terminal device 10 by acquiring such information. The position may also be measured by an optical sensor, an infrared sensor, or the like provided in the terminal device 10.

測位部１４は、必要に応じて、上述した測位手段の一つ又は組合せを用いて、端末装置１０の位置を測位してもよい。 The positioning unit 14 may measure the position of the terminal device 10 using one or a combination of the positioning means described above, if necessary.
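
As an illustration of combining several positioning means, the following is a minimal sketch that fuses position estimates by inverse-variance weighting. The description does not specify how estimates are combined, so the weighting scheme, the `PositionEstimate` type, and the `accuracy_m` field are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class PositionEstimate:
    """One position fix from a single positioning means (hypothetical type)."""
    latitude: float
    longitude: float
    accuracy_m: float  # assumed 1-sigma horizontal error in metres
    source: str        # e.g. "gps", "wifi", "beacon"


def fuse_positions(estimates: Iterable[PositionEstimate]) -> Tuple[float, float]:
    """Combine fixes by inverse-variance weighting (an assumed scheme).

    More accurate fixes (smaller accuracy_m) receive larger weights.
    Averaging latitude and longitude directly is only reasonable when
    all fixes lie within a small area.
    """
    weighted_lat = 0.0
    weighted_lon = 0.0
    total_weight = 0.0
    for est in estimates:
        if est.accuracy_m <= 0:
            raise ValueError("accuracy_m must be positive")
        weight = 1.0 / (est.accuracy_m ** 2)
        weighted_lat += weight * est.latitude
        weighted_lon += weight * est.longitude
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("at least one estimate is required")
    return weighted_lat / total_weight, weighted_lon / total_weight
```

With this weighting, a coarse GPS fix indoors contributes little when a more accurate beacon fix is also available, which matches the use of the other means as auxiliary positioning for position correction.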

（センサ部２０）
センサ部２０は、端末装置１０に搭載又は接続される各種のセンサを含む。なお、接続は、有線接続、無線接続を問わない。例えば、センサ類は、ウェアラブルデバイスやワイヤレスデバイス等、端末装置１０以外の検知装置であってもよい。図５に示す例では、センサ部２０は、加速度センサ２１と、ジャイロセンサ２２と、気圧センサ２３と、気温センサ２４と、音センサ２５と、光センサ２６と、磁気センサ２７とを備える。 (Sensor unit 20)
The sensor unit 20 includes various sensors mounted on or connected to the terminal device 10 . The connection may be wired connection or wireless connection. For example, the sensors may be detection devices other than the terminal device 10, such as wearable devices and wireless devices. In the example shown in FIG. 5, the sensor unit 20 includes an acceleration sensor 21, a gyro sensor 22, an atmospheric pressure sensor 23, an air temperature sensor 24, a sound sensor 25, an optical sensor 26, and a magnetic sensor 27.

なお、上記した各センサ２１～２７は、あくまでも例示であって限定されるものではない。すなわち、センサ部２０は、各センサ２１～２７のうちの一部を備える構成であってもよいし、各センサ２１～２７に加えてあるいは代えて、湿度センサ等その他のセンサを備えてもよい。また、撮像部１５も、画像センサの一種である。 The sensors 21 to 27 described above are only examples and are not limiting. That is, the sensor unit 20 may be configured to include only some of the sensors 21 to 27, or may include other sensors such as a humidity sensor in addition to or instead of the sensors 21 to 27. The imaging unit 15 is also a kind of image sensor.

加速度センサ２１は、例えば、３軸加速度センサであり、端末装置１０の移動方向、速度、及び、加速度等の端末装置１０の物理的な動きを検知する。ジャイロセンサ２２は、端末装置１０の角速度等に基づいて３軸方向の傾き等の端末装置１０の物理的な動きを検知する。気圧センサ２３は、例えば端末装置１０の周囲の気圧を検知する。 The acceleration sensor 21 is, for example, a three-axis acceleration sensor, and detects physical movements of the terminal device 10 such as movement direction, speed, and acceleration of the terminal device 10 . The gyro sensor 22 detects physical movements of the terminal device 10 such as inclination in three axial directions based on the angular velocity of the terminal device 10 and the like. The atmospheric pressure sensor 23 detects the atmospheric pressure around the terminal device 10, for example.

端末装置１０は、上記した加速度センサ２１やジャイロセンサ２２、気圧センサ２３等を備えることから、これらの各センサ２１～２３等を利用した歩行者自律航法（ＰＤＲ：Pedestrian Dead-Reckoning）等の技術を用いて端末装置１０の位置を測位することが可能になる。これにより、ＧＰＳ等の測位システムでは取得することが困難な屋内での位置情報を取得することが可能になる。 Since the terminal device 10 includes the above-described acceleration sensor 21, gyro sensor 22, atmospheric pressure sensor 23, and the like, the position of the terminal device 10 can be measured using techniques such as pedestrian dead reckoning (PDR) that make use of these sensors 21 to 23 and the like. This makes it possible to acquire indoor position information, which is difficult to acquire with a positioning system such as GPS.

例えば、加速度センサ２１を利用した歩数計により、歩数や歩くスピード、歩いた距離を算出することができる。また、ジャイロセンサ２２を利用して、利用者Ｕの進行方向や視線の方向、体の傾きを知ることができる。また、気圧センサ２３で検知した気圧から、利用者Ｕの端末装置１０が存在する高度やフロアの階数を知ることもできる。 For example, a pedometer using the acceleration sensor 21 can calculate the number of steps, walking speed, and distance walked. Further, by using the gyro sensor 22, it is possible to know the traveling direction, the direction of the line of sight, and the inclination of the body of the user U. Also, from the atmospheric pressure detected by the atmospheric pressure sensor 23, the altitude at which the terminal device 10 of the user U is present and the number of floors can be known.
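
The calculations mentioned above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the altitude conversion uses the commonly cited international barometric formula, and the fixed stride length (`stride_m`), floor height (`floor_height_m`), and function names are assumptions made for the example.

```python
def altitude_from_pressure(pressure_hpa: float,
                           sea_level_hpa: float = 1013.25) -> float:
    """Approximate altitude in metres from air pressure (barometric formula)."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


def floor_from_pressure(pressure_hpa: float,
                        ground_pressure_hpa: float,
                        floor_height_m: float = 3.0) -> int:
    """Estimate the floor relative to where ground_pressure_hpa was measured.

    Returns 0 for the reference floor. floor_height_m is an assumed,
    building-dependent constant.
    """
    relative_m = (altitude_from_pressure(pressure_hpa)
                  - altitude_from_pressure(ground_pressure_hpa))
    return round(relative_m / floor_height_m)


def walked_distance(step_count: int, stride_m: float = 0.7) -> float:
    """Distance walked in metres, assuming a fixed stride length."""
    return step_count * stride_m
```

Because absolute pressure drifts with the weather, the sketch works with the difference from a reference reading taken on a known floor rather than with the absolute altitude.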

気温センサ２４は、例えば端末装置１０の周囲の気温を検知する。音センサ２５は、例えば端末装置１０の周囲の音を検知する。光センサ２６は、端末装置１０の周囲の照度を検知する。磁気センサ２７は、例えば端末装置１０の周囲の地磁気を検知する。撮像部１５は、端末装置１０の周囲の画像を撮像する。 The temperature sensor 24 detects the temperature around the terminal device 10, for example. The sound sensor 25 detects sounds around the terminal device 10, for example. The optical sensor 26 detects the illuminance around the terminal device 10 . The magnetic sensor 27 detects, for example, geomagnetism around the terminal device 10 . The imaging unit 15 captures an image around the terminal device 10 .

上記した気圧センサ２３、気温センサ２４、音センサ２５、光センサ２６及び撮像部１５は、それぞれ気圧、気温、音、照度を検知したり、周囲の画像を撮像したりすることで、端末装置１０の周囲の環境や状況等を検知することができる。また、端末装置１０の周囲の環境や状況等から、端末装置１０の位置情報の精度を向上させることが可能になる。 The atmospheric pressure sensor 23, the temperature sensor 24, the sound sensor 25, the optical sensor 26, and the imaging unit 15 described above can detect the environment, situation, and the like around the terminal device 10 by detecting the atmospheric pressure, the temperature, the sound, and the illuminance, respectively, and by capturing images of the surroundings. In addition, the accuracy of the position information of the terminal device 10 can be improved based on the environment, situation, and the like around the terminal device 10.

（制御部３０）
制御部３０は、例えば、ＣＰＵ（Central Processing Unit）、ＲＯＭ（Read Only Memory）、ＲＡＭ、入出力ポート等を有するマイクロコンピュータや各種の回路を含む。また、制御部３０は、例えば、ＡＳＩＣ（Application Specific Integrated Circuit）やＦＰＧＡ（Field Programmable Gate Array）等の集積回路等のハードウェアで構成されてもよい。制御部３０は、送信部３１と、受信部３２と、処理部３３と、ガイド表示部３４と、ガイド変更部３５と、撮影判定部３６と、認識部３７と、通知部３８とを備える。なお、実際には、処理部３３が、ガイド表示部３４と、ガイド変更部３５と、撮影判定部３６と、認識部３７と、通知部３８とを備えていてもよい。 (control unit 30)
The control unit 30 includes, for example, a microcomputer having a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM, an input/output port, and the like, as well as various circuits. The control unit 30 may also be configured by hardware such as an integrated circuit, for example an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The control unit 30 includes a transmission unit 31, a reception unit 32, a processing unit 33, a guide display unit 34, a guide change unit 35, a shooting determination unit 36, a recognition unit 37, and a notification unit 38. Note that, in practice, the processing unit 33 may include the guide display unit 34, the guide change unit 35, the shooting determination unit 36, the recognition unit 37, and the notification unit 38.

（送信部３１）
送信部３１は、例えば入力部１３を用いて利用者Ｕにより入力された各種情報や、端末装置１０に搭載又は接続された各センサ２１～２７によって検知された各種情報、測位部１４によって測位された端末装置１０の位置情報等を、通信部１１を介して情報提供装置１００へ送信することができる。 (Sending unit 31)
The transmission unit 31 can transmit, to the information providing device 100 via the communication unit 11, for example, various information input by the user U using the input unit 13, various information detected by the sensors 21 to 27 mounted on or connected to the terminal device 10, and the position information of the terminal device 10 measured by the positioning unit 14.

（受信部３２）
受信部３２は、通信部１１を介して、情報提供装置１００から提供される各種情報や、情報提供装置１００からの各種情報の要求を受信することができる。 (Receiver 32)
The receiving unit 32 can receive various information provided by the information providing apparatus 100 and requests for various information from the information providing apparatus 100 via the communication unit 11 .

（処理部３３）
処理部３３は、表示部１２等を含め、端末装置１０全体を制御する。例えば、処理部３３は、送信部３１によって送信される各種情報や、受信部３２によって受信された情報提供装置１００からの各種情報を表示部１２へ出力して表示させることができる。 (Processing unit 33)
The processing unit 33 controls the entire terminal device 10 including the display unit 12 and the like. For example, the processing unit 33 can output various types of information transmitted by the transmitting unit 31 and various types of information received by the receiving unit 32 from the information providing apparatus 100 to the display unit 12 for display.

（ガイド表示部３４）
ガイド表示部３４は、多視点画像を撮影する際に、所定の撮影ガイドを画面に表示する。例えば、ガイド表示部３４は、撮影ガイドとして被写体のうち被撮影者（撮影モデル）のポーズ、表情、持ち方のうち少なくとも１つに関するガイドを画面に表示する。また、ガイド表示部３４は、撮影ガイドとして被写体のシルエット、輪郭、又は半透明の表示を画面に表示する。 (Guide display section 34)
The guide display unit 34 displays a predetermined shooting guide on the screen when shooting a multi-viewpoint image. For example, the guide display unit 34 displays on the screen a guide regarding at least one of the pose, facial expression, and manner of holding of the person to be photographed (photographing model) as a photographing guide. Further, the guide display unit 34 displays the silhouette, outline, or semi-transparent display of the subject on the screen as a shooting guide.

また、ガイド表示部３４は、被写体を撮影する際に、被写体に撮影ガイドを重畳表示する。また、ガイド表示部３４は、最初に撮影された画像で被写体のうち被撮影者のポーズを特定し、特定されたポーズに応じた撮影ガイドを選択して表示する。 Further, the guide display unit 34 superimposes and displays a shooting guide on the subject when shooting the subject. In addition, the guide display unit 34 identifies the pose of the subject among the subjects in the first captured image, and selects and displays a photographing guide corresponding to the identified pose.

（ガイド変更部３５）
ガイド変更部３５は、撮影時の視点を移動する度に、画面に表示された撮影ガイドを視点に応じて変更する。また、ガイド変更部３５は、撮影時の視点を移動するにつれて、撮影ガイドを段階的に変更してもよい。 (Guide changing unit 35)
A guide changing unit 35 changes the shooting guide displayed on the screen according to the viewpoint every time the viewpoint during shooting is moved. Further, the guide changing unit 35 may change the shooting guide step by step as the viewpoint during shooting is moved.

（撮影判定部３６）
撮影判定部３６は、被写体が前記撮影ガイドに一致した場合、撮像部１５を用いて、自動的に撮影する。また、撮影判定部３６は、多視点画像の撮影が完了した場合、送信部３１を用いて、多視点画像を情報提供装置１００に投稿する。このとき、撮影判定部３６は、撮影された画像から多視点画像を生成してもよい。 (Photographing determination unit 36)
The shooting determination unit 36 automatically shoots the subject using the imaging unit 15 when the subject matches the shooting guide. Further, when the shooting of the multi-viewpoint image is completed, the shooting determination unit 36 uses the transmission unit 31 to post the multi-viewpoint image to the information providing apparatus 100 . At this time, the photographing determination unit 36 may generate a multi-viewpoint image from the photographed images.
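
One way to picture the match between the subject and the shooting guide is an overlap score between two silhouettes. The sketch below is only an illustration: the description does not state how a match is determined, so the use of intersection over union (IoU), the 0.8 threshold, and the function names are assumptions.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (x, y) coordinate of a pixel inside a silhouette


def silhouette_iou(subject: Set[Pixel], guide: Set[Pixel]) -> float:
    """Intersection over union of the subject and guide silhouettes."""
    union = subject | guide
    if not union:
        return 0.0
    return len(subject & guide) / len(union)


def should_capture(subject: Set[Pixel], guide: Set[Pixel],
                   threshold: float = 0.8) -> bool:
    """Return True when the subject overlaps the guide closely enough.

    The threshold is an assumed tuning parameter, not a value from the
    description.
    """
    return silhouette_iou(subject, guide) >= threshold
```

In such a scheme, the pixels in the symmetric difference of the two silhouettes (`subject ^ guide`) could also be used to indicate where the subject deviates from the guide.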

（認識部３７）
認識部３７は、多視点画像を構成する画像の撮影時に、画像に含まれる撮影対象を認識する。例えば、認識部３７は、画像認識又は機械学習で、画像に含まれる撮影対象を認識する。また、認識部３７は、撮影の度に、画像に含まれる撮影対象を認識する。また、認識部３７は、撮影の度に、画像に含まれる複数の撮影対象の各々を認識する。また、認識部３７は、撮影の度に、他の撮影対象に隠れて見えなくなった撮影対象を認識する。 (Recognition unit 37)
The recognition unit 37 recognizes a shooting target included in an image when shooting an image forming a multi-view image. For example, the recognition unit 37 recognizes the shooting target included in the image by image recognition or machine learning. Further, the recognition unit 37 recognizes the photographing target included in the image each time photographing is performed. Further, the recognition unit 37 recognizes each of the plurality of shooting targets included in the image each time shooting is performed. In addition, the recognition unit 37 recognizes an object to be photographed that is hidden behind other objects to be photographed and is no longer visible each time photographing is performed.

（通知部３８）
通知部３８は、被写体が撮影ガイドとずれている場合、被写体が撮影ガイドとずれている箇所を撮影者に通知する。なお、被写体は、撮影者自身であってもよい。すなわち、撮影者と被撮影者は同一人物であってもよい。 (Notification unit 38)
When the subject is out of alignment with the shooting guide, the notification unit 38 notifies the photographer of the location where the subject is out of alignment with the shooting guide. Note that the subject may be the photographer himself/herself. That is, the photographer and the person to be photographed may be the same person.

また、通知部３８は、認識部３７により認識された撮影対象を利用者に通知する。例えば、通知部３８は、撮影の度に、認識された撮影対象を利用者に通知する。また、通知部３８は、撮影の度に、認識された複数の撮影対象の各々を利用者に通知する。また、通知部３８は、撮影の度に、認識された撮影対象の画像の撮影枚数を利用者に通知する。また、通知部３８は、撮影の度に、多視点画像を構成する画像の必要数までの残り枚数を利用者に通知する。また、通知部３８は、撮影の度に、他の撮影対象に隠れて見えなくなった撮影対象を利用者に通知する。 The notification unit 38 also notifies the user of the shooting target recognized by the recognition unit 37. For example, each time an image is shot, the notification unit 38 notifies the user of the recognized shooting target. When a plurality of shooting targets are recognized, the notification unit 38 notifies the user of each of them each time an image is shot. Each time an image is shot, the notification unit 38 also notifies the user of the number of images shot so far of the recognized shooting target, of the number of images still remaining before the number required to form the multi-viewpoint image is reached, and of any shooting target that has become hidden behind another shooting target and is no longer visible.

このとき、通知部３８は、特定された撮影対象を利用者に音声で通知してもよい。また、通知部３８は、特定された撮影対象を利用者に画面表示で通知してもよい。 At this time, the notification unit 38 may notify the user of the specified imaging target by voice. In addition, the notification unit 38 may notify the user of the specified imaging target by displaying it on the screen.
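
The per-shot notifications can be sketched as a small summary computed after every shot. This is a simplified illustration under stated assumptions: the function name `summarize_progress` and the returned keys are hypothetical, and a target is treated as "hidden" simply when it was recognized in an earlier shot but not in the latest one, which is cruder than whatever the recognition unit 37 actually does.

```python
from collections import Counter
from typing import Dict, List


def summarize_progress(shots: List[List[str]],
                       required_shots: int) -> Dict[str, object]:
    """Summarize what could be notified to the user after the latest shot.

    shots[i] lists the names of the shooting targets recognized in the
    i-th shot. required_shots is the number of images needed to form the
    multi-viewpoint image.
    """
    counts: Counter = Counter()
    for targets in shots:
        counts.update(set(targets))  # count each target once per shot
    latest = set(shots[-1]) if shots else set()
    return {
        "recognized": sorted(latest),
        "shots_per_target": dict(counts),
        "remaining": max(required_shots - len(shots), 0),
        "hidden": sorted(set(counts) - latest),
    }
```

The resulting summary could then be rendered as an on-screen message or passed to speech output, corresponding to the two notification modes described above.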

（記憶部４０）
記憶部４０は、例えば、ＲＡＭ（Random Access Memory）、フラッシュメモリ（Flash Memory）等の半導体メモリ素子、又は、ＨＤＤ（Hard Disk Drive）、ＳＳＤ（Solid State Drive）、光ディスク等の記憶装置によって実現される。かかる記憶部４０には、各種プログラムや各種データ等が記憶される。 (storage unit 40)
The storage unit 40 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), or an optical disk. Various programs, various data, and the like are stored in the storage unit 40.

〔４．情報提供装置の構成例〕
次に、図６を用いて、実施形態に係る情報提供装置１００の構成について説明する。図６は、実施形態に係る情報提供装置１００の構成例を示す図である。図６に示すように、情報提供装置１００は、通信部１１０と、記憶部１２０と、制御部１３０とを有する。 [4. Configuration example of information providing device]
Next, the configuration of the information providing device 100 according to the embodiment will be described with reference to FIG. 6. FIG. 6 is a diagram showing a configuration example of the information providing device 100 according to the embodiment. As shown in FIG. 6, the information providing device 100 has a communication unit 110, a storage unit 120, and a control unit 130.

（通信部１１０）
通信部１１０は、例えば、ＮＩＣ（Network Interface Card）等によって実現される。また、通信部１１０は、ネットワークＮ（図４参照）と有線又は無線で接続される。 (Communication unit 110)
The communication unit 110 is realized by, for example, a NIC (Network Interface Card) or the like. Also, the communication unit 110 is connected to the network N (see FIG. 4) by wire or wirelessly.

（記憶部１２０）
記憶部１２０は、例えば、ＲＡＭ（Random Access Memory）、フラッシュメモリ（Flash Memory）等の半導体メモリ素子、又は、ＨＤＤ、ＳＳＤ、光ディスク等の記憶装置によって実現される。図６に示すように、記憶部１２０は、利用者情報データベース１２１と、履歴情報データベース１２２と、画像情報データベース１２３とを有する。 (storage unit 120)
The storage unit 120 is implemented by, for example, a semiconductor memory device such as a RAM (Random Access Memory) or flash memory, or a storage device such as an HDD, SSD, or optical disk. As shown in FIG. 6, storage unit 120 has user information database 121 , history information database 122 , and image information database 123 .

（利用者情報データベース１２１）
利用者情報データベース１２１は、利用者Ｕに関する利用者情報を記憶する。例えば、利用者情報データベース１２１は、利用者Ｕの属性等の種々の情報を記憶する。図７は、利用者情報データベース１２１の一例を示す図である。図７に示した例では、利用者情報データベース１２１は、「利用者ＩＤ（Identifier）」、「年齢」、「性別」、「自宅」、「勤務地」、「興味」といった項目を有する。 (User information database 121)
The user information database 121 stores user information about the user U. For example, the user information database 121 stores various information such as attributes of the user U. FIG. 7 is a diagram showing an example of the user information database 121. In the example shown in FIG. 7, the user information database 121 has items such as "user ID (Identifier)", "age", "sex", "home", "place of work", and "interest".

「利用者ＩＤ」は、利用者Ｕを識別するための識別情報を示す。なお、「利用者ＩＤ」は、利用者Ｕの連絡先（電話番号、メールアドレス等）であってもよいし、利用者Ｕの端末装置１０を識別するための識別情報であってもよい。 “User ID” indicates identification information for identifying the user U. The “user ID” may be the user U's contact information (telephone number, e-mail address, etc.), or may be identification information for identifying the user U's terminal device 10 .

また、「年齢」は、利用者ＩＤにより識別される利用者Ｕの年齢を示す。なお、「年齢」は、利用者Ｕの具体的な年齢（例えば３５歳など）を示す情報であってもよいし、利用者Ｕの年代（例えば３０代など）を示す情報であってもよい。あるいは、「年齢」は、利用者Ｕの生年月日を示す情報であってもよいし、利用者Ｕの世代（例えば８０年代生まれなど）を示す情報であってもよい。また、「性別」は、利用者ＩＤにより識別される利用者Ｕの性別を示す。 "Age" indicates the age of the user U identified by the user ID. Note that the "age" may be information indicating the specific age of the user U (for example, 35 years old) or information indicating the age group of the user U (for example, 30s). Alternatively, the "age" may be information indicating the date of birth of the user U, or information indicating the generation of the user U (for example, born in the 1980s). "Gender" indicates the gender of the user U identified by the user ID.

また、「自宅」は、利用者ＩＤにより識別される利用者Ｕの自宅の位置情報を示す。なお、図７に示す例では、「自宅」は、「ＬＣ１１」といった抽象的な符号を図示するが、緯度経度情報等であってもよい。また、例えば、「自宅」は、地域名や住所であってもよい。 "Home" indicates location information of the home of the user U identified by the user ID. In the example shown in FIG. 7, "home" is represented by an abstract code such as "LC11", but may be latitude/longitude information or the like. Also, for example, "home" may be an area name or an address.

また、「勤務地」は、利用者ＩＤにより識別される利用者Ｕの勤務地（学生の場合は学校）の位置情報を示す。なお、図７に示す例では、「勤務地」は、「ＬＣ１２」といった抽象的な符号を図示するが、緯度経度情報等であってもよい。また、例えば、「勤務地」は、地域名や住所であってもよい。 "Place of work" indicates location information of the place of work (school in the case of a student) of the user U identified by the user ID. In the example shown in FIG. 7, the "place of work" is illustrated as an abstract code such as "LC12", but may be latitude/longitude information or the like. Also, for example, the "place of work" may be an area name or an address.

また、「興味」は、利用者ＩＤにより識別される利用者Ｕの興味を示す。すなわち、「興味」は、利用者ＩＤにより識別される利用者Ｕが関心の高い対象を示す。例えば、「興味」は、利用者Ｕが検索エンジンに入力して検索した検索クエリ（キーワード）等であってもよい。なお、図７に示す例では、「興味」は、各利用者Ｕに１つずつ図示するが、複数であってもよい。 "Interest" indicates the interest of the user U identified by the user ID. That is, "interest" indicates an object in which the user U identified by the user ID is highly interested. For example, the "interest" may be a search query (keyword) that the user U has entered into a search engine and searched for. In the example shown in FIG. 7, one "interest" is shown for each user U, but there may be more than one.

例えば、図７に示す例において、利用者ＩＤ「Ｕ１」により識別される利用者Ｕの年齢は、「２０代」であり、性別は、「男性」であることを示す。また、例えば、利用者ＩＤ「Ｕ１」により識別される利用者Ｕは、自宅が「ＬＣ１１」であることを示す。また、例えば、利用者ＩＤ「Ｕ１」により識別される利用者Ｕは、勤務地が「ＬＣ１２」であることを示す。また、例えば、利用者ＩＤ「Ｕ１」により識別される利用者Ｕは、「スポーツ」に興味があることを示す。 For example, in the example shown in FIG. 7, the age of the user U identified by the user ID "U1" is "twenties" and the gender is "male". Also, for example, the user U identified by the user ID "U1" indicates that the home is "LC11". Also, for example, the user U identified by the user ID "U1" indicates that the place of work is "LC12". Also, for example, the user U identified by the user ID "U1" indicates that he is interested in "sports".

ここで、図７に示す例では、「Ｕ１」、「ＬＣ１１」及び「ＬＣ１２」といった抽象的な値を用いて図示するが、「Ｕ１」、「ＬＣ１１」及び「ＬＣ１２」には、具体的な文字列や数値等の情報が記憶されるものとする。以下、他の情報に関する図においても、抽象的な値を図示する場合がある。 Here, in the example shown in FIG. 7, abstract values such as “U1”, “LC11” and “LC12” are used, but “U1”, “LC11” and “LC12” are concrete values. It is assumed that information such as character strings and numerical values is stored. Hereinafter, abstract values may also be illustrated in diagrams relating to other information.

なお、利用者情報データベース１２１は、上記に限らず、目的に応じて種々の情報を記憶してもよい。例えば、利用者情報データベース１２１は、利用者Ｕの端末装置１０に関する各種情報を記憶してもよい。また、利用者情報データベース１２１は、利用者Ｕのデモグラフィック（人口統計学的属性）、サイコグラフィック（心理学的属性）、ジオグラフィック（地理学的属性）、ベヘイビオラル（行動学的属性）等の属性に関する情報を記憶してもよい。例えば、利用者情報データベース１２１は、氏名、家族構成、出身地（地元）、職業、職位、収入、資格、居住形態（戸建、マンション等）、車の有無、通学・通勤時間、通学・通勤経路、定期券区間（駅、路線等）、利用頻度の高い駅（自宅・勤務地の最寄駅以外）、習い事（場所、時間帯等）、趣味、興味、ライフスタイル等の情報を記憶してもよい。 The user information database 121 is not limited to the above, and may store various types of information depending on the purpose. For example, the user information database 121 may store various information about the terminal device 10 of the user U. The user information database 121 may also store information about attributes of the user U such as demographic attributes, psychographic (psychological) attributes, geographic attributes, and behavioral attributes. For example, the user information database 121 may store information such as name, family structure, hometown, occupation, position, income, qualifications, residence type (detached house, condominium, etc.), car ownership, commuting time to school or work, commuting route to school or work, commuter pass section (stations, lines, etc.), frequently used stations (other than the stations nearest to home or the place of work), lessons (place, time of day, etc.), hobbies, interests, and lifestyle.
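
For illustration, one row of the user information database 121 shown in FIG. 7 could be modeled as follows. This is a minimal in-memory sketch: the class names, field names, and the dictionary-based store are assumptions, and only the example values ("U1", twenties, male, "LC11", "LC12", sports) come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UserInfo:
    """One record of the user information database (hypothetical layout)."""
    user_id: str
    age: str        # may be a specific age, an age group, or a birth date
    gender: str
    home: str       # abstract code, latitude/longitude, area name, or address
    workplace: str
    interests: List[str] = field(default_factory=list)  # one or more


class UserInfoDatabase:
    """In-memory store keyed by user ID (a stand-in for database 121)."""

    def __init__(self) -> None:
        self._records: Dict[str, UserInfo] = {}

    def put(self, record: UserInfo) -> None:
        self._records[record.user_id] = record

    def get(self, user_id: str) -> Optional[UserInfo]:
        return self._records.get(user_id)


# Example row corresponding to user "U1" in FIG. 7.
db = UserInfoDatabase()
db.put(UserInfo("U1", "20s", "male", "LC11", "LC12", ["sports"]))
```

Keeping `interests` as a list reflects the note that a user may have more than one interest even though FIG. 7 shows one per user.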

（履歴情報データベース１２２）
履歴情報データベース１２２は、利用者Ｕの行動を示す履歴情報（ログデータ）に関する各種情報を記憶する。図８は、履歴情報データベース１２２の一例を示す図である。図８に示した例では、履歴情報データベース１２２は、「利用者ＩＤ」、「位置履歴」、「検索履歴」、「閲覧履歴」、「購入履歴」、「投稿履歴」といった項目を有する。 (History information database 122)
The history information database 122 stores various types of information related to history information (log data) indicating the actions of the user U. FIG. 8 is a diagram showing an example of the history information database 122. In the example shown in FIG. 8, the history information database 122 has items such as "user ID", "location history", "search history", "browsing history", "purchase history", and "posting history".

「利用者ＩＤ」は、利用者Ｕを識別するための識別情報を示す。また、「位置履歴」は、利用者Ｕの位置や移動の履歴である位置履歴を示す。また、「検索履歴」は、利用者Ｕが入力した検索クエリの履歴である検索履歴を示す。また、「閲覧履歴」は、利用者Ｕが閲覧したコンテンツの履歴である閲覧履歴を示す。また、「購入履歴」は、利用者Ｕによる購入の履歴である購入履歴を示す。また、「投稿履歴」は、利用者Ｕによる投稿の履歴である投稿履歴を示す。なお、「投稿履歴」は、利用者Ｕの所有物に関する質問を含んでいてもよい。 "User ID" indicates identification information for identifying the user U. "Location history" indicates a history of the positions and movements of the user U. "Search history" indicates a history of the search queries input by the user U. "Browsing history" indicates a history of the content browsed by the user U. "Purchase history" indicates a history of purchases made by the user U. "Posting history" indicates a history of posts made by the user U. The "posting history" may include questions about the belongings of the user U.

例えば、図８に示す例において、利用者ＩＤ「Ｕ１」により識別される利用者Ｕは、「位置履歴＃１」の通りに移動し、「検索履歴＃１」の通りに検索し、「閲覧履歴＃１」の通りにコンテンツを閲覧し、「購入履歴＃１」の通りに所定の店舗等で所定の商品等を購入し、「投稿履歴」の通りに投稿したことを示す。 For example, the example shown in FIG. 8 indicates that the user U identified by the user ID "U1" moved as recorded in "location history #1", searched as recorded in "search history #1", browsed content as recorded in "browsing history #1", purchased a given product or the like at a given store or the like as recorded in "purchase history #1", and posted as recorded in "posting history".

ここで、図８に示す例では、「Ｕ１」、「位置履歴＃１」、「検索履歴＃１」、「閲覧履歴＃１」、「購入履歴＃１」及び「投稿履歴＃１」といった抽象的な値を用いて図示するが、「Ｕ１」、「位置履歴＃１」、「検索履歴＃１」、「閲覧履歴＃１」、「購入履歴＃１」及び「投稿履歴＃１」には、具体的な文字列や数値等の情報が記憶されるものとする。 Here, the example shown in FIG. 8 is illustrated using abstract values such as "U1", "location history #1", "search history #1", "browsing history #1", "purchase history #1", and "posting history #1", but it is assumed that specific information such as character strings and numerical values is stored in "U1", "location history #1", "search history #1", "browsing history #1", "purchase history #1", and "posting history #1".

なお、履歴情報データベース１２２は、上記に限らず、目的に応じて種々の情報を記憶してもよい。例えば、履歴情報データベース１２２は、利用者Ｕの所定のサービスの利用履歴等を記憶してもよい。また、履歴情報データベース１２２は、利用者Ｕの実店舗の来店履歴又は施設の訪問履歴等を記憶してもよい。また、履歴情報データベース１２２は、利用者Ｕの端末装置１０を用いた決済（電子決済）での決済履歴等を記憶してもよい。 Note that the history information database 122 may store various types of information, not limited to the above, depending on the purpose. For example, the history information database 122 may store the user U's usage history of a predetermined service. In addition, the history information database 122 may store the user U's store visit history, facility visit history, and the like. In addition, the history information database 122 may store a history of payment (electronic payment) using the terminal device 10 of the user U, and the like.

（画像情報データベース１２３）
画像情報データベース１２３は、多視点画像に関する各種情報を記憶する。図９は、画像情報データベース１２３の一例を示す図である。図９に示した例では、画像情報データベース１２３は、「多視点画像」、「画像」、「視点」、「撮影対象」、「位置」、「アノテーション対象」、「タグ」、「顔の位置」といった項目を有する。 (Image information database 123)
The image information database 123 stores various information regarding multi-viewpoint images. FIG. 9 is a diagram showing an example of the image information database 123. In the example shown in FIG. 9, the image information database 123 has items such as "multi-viewpoint image", "image", "viewpoint", "shooting target", "position", "annotation target", "tag", and "face position".

「多視点画像」は、多視点画像を識別するための識別情報を示す。なお、実際には、多視点画像のデータの格納場所や所在位置等であってもよい。また、「画像」は、多視点画像を構成する画像を識別するための識別情報を示す。なお、実際には、多視点画像を構成する画像のデータの格納場所や所在位置等であってもよい。 “Multi-view image” indicates identification information for identifying a multi-view image. It should be noted that, in practice, it may be the storage location or the location of the data of the multi-view image. Also, "image" indicates identification information for identifying an image forming a multi-view image. It should be noted that, in practice, it may be the storage location or location of the data of the images forming the multi-viewpoint image.

また、「視点」は、多視点画像を構成する画像を撮影した時の視点を示す。すなわち、視点は、多視点画像を構成する画像に組まれる撮影対象の位置（ポジション）や角度（アングル）を示す。 Also, "viewpoint" indicates a viewpoint when an image forming a multi-viewpoint image is captured. In other words, the viewpoint indicates the position and angle of the object to be photographed that is included in the images forming the multi-viewpoint image.

また、「撮影対象」は、多視点画像を構成する画像に含まれる撮影対象を示す。すなわち、被写体として撮影された撮影対象を示す。例えば、撮影対象の分類（カテゴリ）や具体的な商品名、商品コード等を示す。また、撮影対象は、被撮影者（人物）であってもよい。また、撮影対象は、複数であってもよい。すなわち、１つの画像に複数の撮影対象が含まれていてもよい。例えば、被撮影者と、その被撮影者が身につけている２つのファッションアイテムを、それぞれ撮影対象としてもよい。 In addition, "shooting target" indicates a shooting target included in the images forming the multi-viewpoint image. That is, it indicates an object to be photographed as a subject. For example, it indicates the classification (category) of the object to be photographed, the specific product name, the product code, and the like. Also, the object to be photographed may be a person (person) to be photographed. Also, the number of subjects to be photographed may be plural. That is, one image may include a plurality of shooting targets. For example, a person to be photographed and two fashion items worn by the person to be photographed may be photographed.

"Position" indicates the position of the shooting target in the multi-view image (the position within the image). In this embodiment, it indicates the three-dimensional position of the shooting target in each image constituting the multi-view image. The three-dimensional position may be an absolute position such as coordinates in the image, or a relative position from a reference point or another shooting target. The three-dimensional position is merely an example.

"Annotation target" indicates which of the shooting targets (the annotation target candidates) is a tagging target (annotation target). The tagging target may be selected by the user, or may be determined automatically by a preset rule, machine learning, or the like.

"Tag" indicates the tag given to the tagging target (annotation target). For example, it may be identification information for identifying a tag registered in advance, or the content of the tag itself. It may also be information about the product page of the product (fashion item) that is the tagging target. In that case, images similar to the product may be searched for, by image recognition or machine learning, among the product pages of an electronic commerce site such as a fashion mail-order site (for example, "ZOZOTOWN" (registered trademark)), and the product page of the product may be identified automatically based on the search results.

"Face position" indicates the position of the photographed person's face in the multi-view image (the position of the face within the image). In this embodiment, it indicates the three-dimensional position of the photographed person's face in each image constituting the multi-view image. The face position may be an absolute position such as coordinates in the image, or a relative position from a reference point or another shooting target. It may also be the outline of the face or the positions of the parts of the face (eyebrows, eyes, ears, nose, mouth, chin, and so on).

For example, the example in FIG. 9 shows that image "A1", which constitutes multi-view image "A", was shot from "viewpoint #A1"; that the shooting target "bag" is at "position #A1" in the image and has been selected as an "annotation target" (tagging target); that a link to the website "site #W1" about that bag has been given as the tag; and that the photographed person's face in the image is at "face position #A1".

The example in FIG. 9 uses abstract values such as "A", "A1", "viewpoint #A1", "position #A1", "site #W1", and "face position #A1" for illustration, but in practice concrete information such as character strings and numerical values is stored in these fields.

The image information database 123 is not limited to the above and may store various other information depending on the purpose. For example, it may store identification information for identifying the poster or viewer of a multi-view image; detailed information about a subject (shooting target, photographed person, etc.); a list of candidate tags; information about the shooting location and the shooting date and time; and information about the shooting device (such as the user's terminal device) used for shooting and the shooting environment.
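As an illustration only, one row of the image information database 123 could be modeled as the record below. The field names, the types, and the concrete values are assumptions made for this sketch; the description only names the items and says that concrete strings or numerical values are stored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]  # assumed representation of a 3D position


@dataclass
class ImageRecord:
    """One hypothetical row of the image information database 123."""
    multi_view_image_id: str        # "multi-view image" item, e.g. "A"
    image_id: str                   # "image" item, e.g. "A1"
    viewpoint: str                  # "viewpoint" item
    shooting_target: str            # category, product name, or product code
    position: Vec3                  # 3D position of the shooting target
    is_annotation_target: bool      # whether the target was chosen for tagging
    tag: Optional[str]              # e.g. a link to a product page
    face_position: Optional[Vec3]   # 3D position of the photographed person's face


# Concrete stand-ins for the abstract values used in FIG. 9 (all invented).
record = ImageRecord(
    multi_view_image_id="A",
    image_id="A1",
    viewpoint="viewpoint#A1",
    shooting_target="bag",
    position=(0.1, 0.4, 1.2),
    is_annotation_target=True,
    tag="https://example.com/site-w1",
    face_position=(0.0, 1.5, 1.0),
)
```

Because one image may contain several shooting targets, a real schema would likely hold one such row per (image, shooting target) pair.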

(Control unit 130)
Returning to FIG. 6, the description continues. The control unit 130 is a controller. It is implemented by, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field Programmable Gate Array) executing various programs (corresponding to an example of the information processing program) stored in a storage device inside the information providing device 100, using a storage area such as a RAM as a work area. In the example shown in FIG. 6, the control unit 130 includes an acquisition unit 131, a specifying unit 132, an estimation unit 133, a tagging unit 134, a tag changing unit 135, an image conversion unit 136, and a providing unit 137.

(Acquisition unit 131)
The acquisition unit 131 acquires a search query input by the user U. For example, when the user U inputs a search query into a search engine or the like and performs a keyword search, the acquisition unit 131 acquires the search query via the communication unit 110. That is, the acquisition unit 131 acquires, via the communication unit 110, a keyword that the user U has input into a search engine or into the search box of a site or application.

The acquisition unit 131 also acquires user information about the user U via the communication unit 110. For example, the acquisition unit 131 acquires, from the terminal device 10 of the user U, identification information indicating the user U (a user ID or the like), location information of the user U, attribute information of the user U, and the like. The acquisition unit 131 may also acquire the identification information, attribute information, and the like when the user U registers as a user. The acquisition unit 131 then registers the user information in the user information database 121 of the storage unit 120.

The acquisition unit 131 also acquires, via the communication unit 110, various types of history information (log data) indicating the behavior of the user U. For example, the acquisition unit 131 acquires this history information from the terminal device 10 of the user U, or from various servers based on the user ID or the like. The acquisition unit 131 then registers the history information in the history information database 122 of the storage unit 120.

The acquisition unit 131 also acquires a multi-view image, via the communication unit 110, from the user U who is a poster or a viewer. For example, the acquisition unit 131 acquires a multi-view image shot by the poster from the terminal device 10 of the user U who is the poster. The acquisition unit 131 also acquires a multi-view image, shot by another poster, that the user U as a viewer has designated.

The acquisition unit 131 also acquires, via the communication unit 110, a multi-view image of the face of a person other than the photographed person. For example, the acquisition unit 131 acquires a multi-view face image obtained by shooting the other person's face from a plurality of viewpoints. The other person's face may be the face of the user U who is the viewer. In this embodiment, the acquisition unit 131 acquires a multi-view image of the viewer's face via the communication unit 110, for example a multi-view face image obtained by shooting the face of the user U from a plurality of viewpoints. The acquisition unit 131 may acquire the multi-view image of the viewer's face in advance or at the time of viewing. It is sufficient that this multi-view image includes at least the viewer's face.

The acquisition unit 131 also functions as a reception unit that receives, via the communication unit 110, the poster's selection of a tagging target. For example, the acquisition unit 131 (reception unit) receives from the poster the selection of a tagging target and the designation of a web page to be associated with that tagging target.

(Specifying unit 132)
The specifying unit 132 specifies the shooting targets included in the multi-view image. The specifying unit 132 then specifies, from among those shooting targets, the target of annotation tagging. In doing so, the specifying unit 132 specifies and classifies the shooting targets by image recognition or machine learning for each viewpoint image of the multi-view image.

For example, the specifying unit 132 specifies the target of annotation tagging from among the shooting targets included in the multi-view image according to the poster's selection of a tagging target. Alternatively, the specifying unit 132 specifies the target of annotation tagging from among those shooting targets by image recognition or machine learning.

The specifying unit 132 also searches a plurality of web pages on the network, by image recognition or machine learning, for images similar to the image of the tagging target, and automatically specifies a web page containing a similar image as the web page to be associated with the tagging target.
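The similar-image search described above could be sketched as a nearest-neighbor lookup over image feature vectors. The description does not say how similarity is computed; cosine similarity, the feature vectors, the `threshold` value, and the function names below are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_page_for_target(
    target_vec: List[float],
    page_vecs: Dict[str, List[float]],
    threshold: float = 0.8,  # hypothetical minimum similarity to accept a match
) -> Optional[str]:
    """Return the page whose image is most similar to the tagging target.

    `page_vecs` maps a page identifier to the feature vector of its image.
    Returns None when no page reaches the threshold.
    """
    best_page, best_score = None, threshold
    for page, vec in page_vecs.items():
        score = cosine_similarity(target_vec, vec)
        if score >= best_score:
            best_page, best_score = page, score
    return best_page


pages = {"site#W1": [0.9, 0.1], "site#W2": [0.0, 1.0]}
match = find_page_for_target([1.0, 0.0], pages)
```

In this sketch the feature vectors are given; in a real system they would come from whatever image recognition model is used, which the description leaves open.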

The specifying unit 132 also specifies the face of the photographed person in the multi-view image that the user U is to view. For example, the specifying unit 132 specifies the face of the photographed person in that multi-view image and the viewpoint from which the multi-view image was shot. In doing so, the specifying unit 132 may specify, for each of a plurality of multi-view images viewed at the same time, the face of the photographed person and the viewpoint at the time of shooting.

(Estimation unit 133)
The estimation unit 133 estimates the position of a shooting target in the multi-view image. That is, the estimation unit 133 estimates the position of the tagging target selected from the shooting targets in the multi-view image. In this embodiment, the estimation unit 133 estimates the three-dimensional position of the tagging target in the multi-view image. The three-dimensional position may be an absolute position such as coordinates in the image, or a relative position from a reference point or another shooting target. In practice, the specifying unit 132 may function as the estimation unit 133. In that case, the specifying unit 132 specifies the three-dimensional position of the tagging target in the multi-view image.
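The description does not state how the three-dimensional position is estimated. One common way to recover a 3D point from two viewpoints is midpoint triangulation of two viewing rays. The sketch below is therefore a swapped-in illustrative technique, not the method of the embodiment. It assumes the camera centers and the ray directions toward the tagging target are already known for two of the viewpoint images.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(c1: Vec3, d1: Vec3, c2: Vec3, d2: Vec3) -> Vec3:
    """Estimate a 3D point from two rays c1 + s*d1 and c2 + t*d2.

    Finds the closest points on the two rays and returns their midpoint.
    Raises ValueError when the rays are (nearly) parallel.
    """
    w0 = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w0), _dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are parallel")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Two cameras looking at the same point (1, 2, 3) from different positions.
estimated = triangulate_midpoint((0, 0, 0), (1, 2, 3), (4, 0, 0), (-3, 2, 3))
```

With noisy detections the two rays do not intersect exactly, which is why the midpoint of the closest points is used rather than an exact intersection.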

(Tagging unit 134)
The tagging unit 134 gives a tag in accordance with the position of the tagging target. For example, the tagging unit 134 gives the tag to the tagging target in accordance with the three-dimensional position of the tagging target. As a result, the given tag is displayed on the screen. When giving a tag to the tagging target, the tagging unit 134 places the tag so that it does not overlap other targets or other tags.
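The non-overlap condition could be checked with axis-aligned rectangles on the screen, as in the minimal sketch below. The candidate offsets, the 10-pixel margin, and the function names are assumptions; the description only requires that the tag not overlap other targets or other tags.

```python
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height) in screen pixels


def overlaps(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles intersect with non-zero area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def place_tag(
    anchor: Tuple[float, float],
    size: Tuple[float, float],
    occupied: List[Rect],
    margin: float = 10.0,  # hypothetical gap between the target and its tag
) -> Optional[Rect]:
    """Try a few positions around the target and return the first free one.

    `occupied` holds the rectangles of other targets and already placed tags.
    Returns None when every candidate position collides.
    """
    x, y = anchor
    w, h = size
    candidates = [
        (x + margin, y - h - margin),      # upper right of the target
        (x - w - margin, y - h - margin),  # upper left
        (x + margin, y + margin),          # lower right
        (x - w - margin, y + margin),      # lower left
    ]
    for cx, cy in candidates:
        rect = (cx, cy, w, h)
        if not any(overlaps(rect, other) for other in occupied):
            return rect
    return None


# The upper-right slot is blocked by another tag, so the upper-left one is used.
placed = place_tag((100.0, 100.0), (50.0, 20.0), [(105.0, 65.0, 60.0, 30.0)])
```

Each placed tag would then be appended to `occupied` before the next tag is placed.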

When giving a tag, the tagging unit 134 gives the tag to the tagging target if the tagging target is not hidden by another target. Even when the tagging target is hidden by another target, if the tag of the tagging target has a higher display priority than the tag of the other target, the tagging unit 134 may give the tag to the tagging target without giving a tag to the other target.
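The occlusion and priority rule can be sketched as a small selection function. The `Target` type, the integer `priority` field (larger means higher display priority), and the `hidden_by` field are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    name: str
    priority: int                    # assumed: larger value = higher display priority
    hidden_by: Optional[str] = None  # name of the occluding target, if any


def select_tagged(targets: List[Target]) -> List[str]:
    """Return the names of the targets whose tags are shown in this view."""
    by_name = {t.name: t for t in targets}

    def outranks_occluder(t: Target) -> bool:
        occluder = by_name.get(t.hidden_by) if t.hidden_by else None
        return occluder is not None and t.priority > occluder.priority

    # An occluder loses its tag when a higher-priority target sits behind it.
    suppressed = {t.hidden_by for t in targets if outranks_occluder(t)}

    shown = []
    for t in targets:
        if t.name in suppressed:
            continue
        if t.hidden_by is None or outranks_occluder(t):
            shown.append(t.name)
    return shown


view = [
    Target("coat", priority=1),
    Target("bag", priority=2, hidden_by="coat"),    # hidden but higher priority
    Target("hat", priority=1),
    Target("scarf", priority=0, hidden_by="coat"),  # hidden and lower priority
]
shown = select_tagged(view)
```

Here the bag keeps its tag and the coat loses its own, while the scarf stays untagged because it is hidden and does not outrank the coat.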

(Tag changing unit 135)
When the position of the tagging target on the screen changes because the viewpoint of the multi-view image has changed, the tag changing unit 135 changes the display position of the tag to follow the new position of the tagging target. The tag changing unit 135 also places the tag at a position where the positional relationship between the tagging target and the tag is maintained even when the viewpoint of the multi-view image changes. In practice, the tagging unit 134 may function as the tag changing unit 135. In that case, each time the viewpoint of the multi-view image changes, the tagging unit 134 gives the tag to the tagging target in accordance with the changed position of the tagging target.
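One way to keep the tag attached to its target across viewpoints is to store the tag as a fixed 3D offset from the target and reproject it for each viewpoint. The sketch below assumes a camera orbiting the subject about the vertical axis and a simple pinhole projection. The `distance`, `focal`, and `center` values are invented parameters, not values from the description.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project(
    point: Vec3,
    yaw_deg: float,
    distance: float = 3.0,  # assumed camera distance from the subject
    focal: float = 500.0,   # assumed focal length in pixels
    center: Tuple[float, float] = (320.0, 240.0),  # assumed image center
) -> Tuple[float, float]:
    """Project a 3D point to screen coordinates for an orbiting camera."""
    x, y, z = point
    a = math.radians(yaw_deg)
    # Rotating the scene about the vertical axis is equivalent to orbiting the camera.
    xr = x * math.cos(a) - z * math.sin(a)
    zr = x * math.sin(a) + z * math.cos(a)
    depth = zr + distance
    return (center[0] + focal * xr / depth, center[1] - focal * y / depth)


def tag_screen_position(target: Vec3, offset: Vec3, yaw_deg: float) -> Tuple[float, float]:
    """Screen position of a tag kept at a fixed 3D offset from its target."""
    anchor = (target[0] + offset[0], target[1] + offset[1], target[2] + offset[2])
    return project(anchor, yaw_deg)


bag = (1.0, 0.0, 0.0)
front = tag_screen_position(bag, (0.0, 0.0, 0.0), yaw_deg=0.0)
side = tag_screen_position(bag, (0.0, 0.0, 0.0), yaw_deg=90.0)
```

Because the offset lives in 3D rather than in screen space, the tag moves with the target when the viewpoint changes, which preserves their positional relationship.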

(Image conversion unit 136)
The image conversion unit 136 converts the face of the photographed person in the multi-view image into the face of another person, for each image constituting the multi-view image. For example, for each of those images, the image conversion unit 136 converts the photographed person's face into the other person's face in accordance with the viewpoint from which the multi-view image was shot. In doing so, the image conversion unit 136 may generate a new multi-view image in which the photographed person's face is replaced with the other person's face in accordance with the viewpoint at the time of shooting.

In this embodiment, the image conversion unit 136 converts the face of the photographed person in the multi-view image into the face of the user U, who is the viewer, for each image constituting the multi-view image. For example, for each of those images, the image conversion unit 136 converts the photographed person's face into the face of the user U in accordance with the viewpoint from which the multi-view image was shot. In doing so, the image conversion unit 136 may generate a new multi-view image in which the photographed person's face is replaced with the face of the user U in accordance with the viewpoint at the time of shooting.

When converting the face of the photographed person into the face of another person in accordance with the viewpoint at the time of shooting, the image conversion unit 136 also adjusts the height of the photographed person in the multi-view image to match the height of the other person. In doing so, the image conversion unit 136 may adjust the height of the photographed person based on the photographed person's face and the other person's face.

In this embodiment, when converting the face of the photographed person into the face of the user U, who is the viewer, in accordance with the viewpoint at the time of shooting, the image conversion unit 136 adjusts the height of the photographed person in the multi-view image to match the height of the user U. In doing so, the image conversion unit 136 may adjust the height of the photographed person based on the photographed person's face and the face of the user U.

In addition to converting the face of the photographed person into the face of another person in accordance with the viewpoint at the time of shooting, the image conversion unit 136 changes the facial expression of the other person in the converted image. Likewise, the image conversion unit 136 changes the hairstyle of the other person in the converted image, and changes the hair color (including shade) of the other person in the converted image.

In this embodiment, when converting the face of the photographed person into the face of the user U, who is the viewer, in accordance with the viewpoint at the time of shooting, the image conversion unit 136 changes the facial expression of the user U after conversion. Likewise, the image conversion unit 136 changes the hairstyle of the user U after conversion, and changes the hair color of the user U after conversion.

For example, the image conversion unit 136 may change the hairstyle, hair color (including shade), facial expression, and the like of the user U in the image by image processing and editing. Alternatively, in response to an instruction from the user U or the like, the image conversion unit 136 may convert the face image currently displayed in the multi-view image (such as the face image of the user U after conversion) into a face image with a different hairstyle, hair color (including shade), facial expression, or the like.

The image conversion unit 136 also converts, in a single batch, the face of each photographed person in a plurality of multi-view images viewed at the same time into the face of another person, in accordance with the viewpoint at the time of shooting of each multi-view image. In this embodiment, the image conversion unit 136 converts the faces of the photographed persons in such multi-view images, in a single batch, into the face of the user U, who is the viewer.

(Providing unit 137)
The providing unit 137 provides the user U with the multi-view image in which the face has been converted into that of another person. For example, the providing unit 137 provides the user U with the newly generated multi-view image. The providing unit 137 also provides the user U with each of a plurality of multi-view images in which the face has been converted into that of another person.

In this embodiment, the providing unit 137 provides the user U, who is the viewer, with the multi-view image in which the face has been converted into the face of the user U. For example, the providing unit 137 provides the user U with the newly generated multi-view image. The providing unit 137 also provides the user U with each of a plurality of multi-view images in which the face has been converted into the face of the user U.

[5. Processing procedure]
Next, the processing procedure performed by the terminal device 10 and the information providing device 100 according to the embodiment will be described with reference to FIG. 10. FIG. 10 is a flowchart showing the processing procedure according to the embodiment. The processing procedure described below is executed repeatedly by the control unit 30 of the terminal device 10 and the control unit 130 of the information providing device 100. The terminal device 10 and the information providing device 100 operate in cooperation with each other.

As shown in FIG. 10, the guide display unit 34 of the terminal device 10 displays a predetermined shooting guide on the screen when a multi-view image is shot (step S101).

Next, when the subject matches the shooting guide, the shooting determination unit 36 of the terminal device 10 automatically shoots an image using the imaging unit 15 (step S102). If the subject deviates from the shooting guide, the notification unit 38 of the terminal device 10 notifies the photographer of where the subject deviates from the guide. The notification unit 38 may notify the user of the specified shooting target by voice or by screen display.
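How "matches the shooting guide" is decided is not specified. As a minimal sketch, the check below compares the subject's bounding box with the guide region using intersection over union (IoU) and, on a mismatch, reports the direction of the deviation. The box format, the `threshold` and `tolerance` values, and the hint wording are all assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), y grows downward


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def check_against_guide(
    subject: Box,
    guide: Box,
    threshold: float = 0.8,   # hypothetical IoU needed to trigger automatic shooting
    tolerance: float = 10.0,  # hypothetical center offset (pixels) that is ignored
) -> Tuple[bool, List[str]]:
    """Return (shoot, hints): shoot automatically, or tell the photographer what is off."""
    if iou(subject, guide) >= threshold:
        return True, []
    hints = []
    dx = (subject[0] + subject[2]) / 2 - (guide[0] + guide[2]) / 2
    dy = (subject[1] + subject[3]) / 2 - (guide[1] + guide[3]) / 2
    if dx > tolerance:
        hints.append("subject is to the right of the guide")
    elif dx < -tolerance:
        hints.append("subject is to the left of the guide")
    if dy > tolerance:
        hints.append("subject is below the guide")
    elif dy < -tolerance:
        hints.append("subject is above the guide")
    if not hints:
        hints.append("subject size differs from the guide")
    return False, hints


guide_box = (100.0, 100.0, 300.0, 500.0)
aligned = check_against_guide(guide_box, guide_box)
shifted = check_against_guide((200.0, 100.0, 400.0, 500.0), guide_box)
```

The returned hints could be passed to the notification unit for voice or on-screen output.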

Next, the recognition unit 37 of the terminal device 10 recognizes the shooting targets included in the shot image, and the notification unit 38 of the terminal device 10 notifies the user of the shooting targets recognized by the recognition unit 37 (step S103). For example, each time an image is shot, the notification unit 38 notifies the user of the number of images shot of the recognized shooting target, the number of images remaining until the number required to constitute the multi-view image is reached, any shooting target that has become hidden behind another shooting target, and the like.

Next, each time the viewpoint for shooting is moved, the guide changing unit 35 of the terminal device 10 changes the shooting guide displayed on the screen according to the viewpoint (step S104). The guide changing unit 35 may change the shooting guide in stages as the viewpoint moves.

Next, when shooting of the multi-view image is complete, the shooting determination unit 36 of the terminal device 10 posts the multi-view image to the information providing device 100 using the transmission unit 31 (step S105). The shooting determination unit 36 may generate the multi-view image from the shot images. The shooting determination unit 36 may also ask the photographer, before posting, to confirm that the image may be posted.

Next, the estimation unit 133 of the information providing device 100 estimates the positions of the shooting targets in the multi-view image (step S106). For example, the acquisition unit 131 of the information providing device 100 acquires the multi-view image from the terminal device 10 via the communication unit 110. The specifying unit 132 of the information providing device 100 specifies the shooting targets in the multi-view image. The estimation unit 133 of the information providing device 100 then estimates the three-dimensional position of each shooting target in the multi-view image.

Next, the specifying unit 132 of the information providing device 100 specifies the target of annotation tagging from among the shooting targets included in the multi-view image (step S107). For example, the specifying unit 132 specifies the tagging target according to the poster's selection of a tagging target. Alternatively, the specifying unit 132 specifies the tagging target from among the shooting targets by image recognition or machine learning.

Next, the tagging unit 134 of the information providing device 100 gives a tag to the tagging target (step S108). At this time, the estimation unit 133 of the information providing device 100 estimates the position of the tagging target in the multi-view image, and the tagging unit 134 gives the tag in accordance with that position. The position at which the tag is given may instead be designated in advance by the poster who posted the multi-view image.

When giving a tag to the tagging target, the tagging unit 134 places the tag so that it does not overlap other targets or other tags. The tagging unit 134 gives the tag to the tagging target if the tagging target is not hidden by another target. Even when the tagging target is hidden by another target, if the tag of the tagging target has a higher display priority than the tag of the other target, the tagging unit 134 may give the tag to the tagging target without giving a tag to the other target.

Next, the tag changing unit 135 of the information providing device 100 changes the display position of the tag given to the tagging target in accordance with the image of each viewpoint of the multi-view image (step S109).
For example, when the position of the tagging target on the screen changes because the viewpoint of the multi-view image has changed, the tag changing unit 135 changes the display position of the tag to follow the new position of the tagging target.

Next, the image conversion unit 136 of the information providing device 100 converts the face of the photographed person in the multi-view image into the face of the viewer, in accordance with the viewpoint from which the multi-view image was shot (step S110). For this purpose, the acquisition unit 131 of the information providing device 100 acquires a multi-view image of the viewer's face via the communication unit 110. The acquisition unit 131 may acquire the multi-view image of the viewer's face in advance or at the time of viewing. The providing unit 137 of the information providing device 100 then provides the viewer, via the communication unit 110, with the multi-view image in which the face has been converted into the viewer's face.

[6. Modifications]
The terminal device 10 and the information providing device 100 described above may be implemented in various forms other than the above embodiment. Modifications of the embodiment are therefore described below.

In the above embodiment, part or all of the processing executed by the information providing device 100 may actually be executed by the terminal device 10. For example, the processing may be completed stand-alone, that is, by the terminal device 10 alone. In this case, the terminal device 10 is assumed to have the functions of the information providing device 100 of the above embodiment. Also, because the terminal device 10 cooperates with the information providing device 100 in the above embodiment, from the viewpoint of the user U the processing of the information providing device 100 also appears to be executed by the terminal device 10. In other words, from another perspective, the terminal device 10 can be said to include the information providing device 100.

Although the above embodiment uses fashion items as examples of shooting targets (candidates for annotation) and tagging targets (annotation targets), these are not limited to fashion items. A shooting target or tagging target (annotation target) may be any article photographed together with the user when the multi-view image is captured. Examples include a wearable device worn by the user or a terminal device held in the user's hand at the time of shooting, home appliances (household electrical appliances) placed around the user at the time of shooting, and items captured in the background such as room interiors, books on a bookshelf, dishes and tableware in a kitchen or on a table, and works of art.

Also, in the above embodiment, only a specific article, rather than the user, may be photographed when the multi-view image is captured. For example, the user need not appear in the multi-view image when an annotation tag is attached to a tagging target (annotation target) in the multi-view image.

[7. Effects]
As described above, the information processing device according to the present application (the terminal device 10 and the information providing device 100) includes: the specifying unit 132, which specifies a target of annotation tagging from among the shooting targets included in a multi-view image; the estimation unit 133, which estimates the three-dimensional position of the tagging target within the multi-view image; and the tagging unit 134, which attaches a tag to the tagging target in accordance with the three-dimensional position of the tagging target.

The specifying unit 132 specifies and classifies the shooting targets by image recognition for each viewpoint image of the multi-view image.

The information processing device according to the present application further includes the acquisition unit 131, which acquires the multi-view image from the user U, and a reception unit (the acquisition unit 131), which receives a selection of the tagging target from the user U. The specifying unit 132 specifies the target of annotation tagging from among the shooting targets included in the multi-view image in accordance with the selection of the tagging target by the user U.

The reception unit (the acquisition unit 131) receives, from the user U, the selection of the tagging target and the designation of a web page to be associated with the tagging target.

The specifying unit 132 uses image recognition to search a plurality of web pages on the network for images similar to the image of the tagging target, and automatically specifies a web page containing a similar image as the web page to be associated with the tagging target.
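The automatic web page association can be illustrated with a small similarity search. The sketch below assumes that every image has already been reduced to a numeric feature vector by some image recognition model (the embodiment does not name one), compares vectors by cosine similarity, and accepts the best match only above a threshold. The function names, the threshold of 0.9, the feature vectors, and the `shop.example` URLs are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_page(target_vec, page_images, threshold=0.9):
    """Return the URL whose image is most similar to the tagging target.

    `page_images` is a list of (url, feature_vector) pairs. Returns None
    when no image reaches `threshold`, so nothing is linked by mistake.
    """
    best_url, best_score = None, -1.0
    for url, vec in page_images:
        score = cosine(target_vec, vec)
        if score > best_score:
            best_url, best_score = url, score
    return best_url if best_score >= threshold else None


pages = [
    ("https://shop.example/coat", [0.9, 0.1, 0.2]),
    ("https://shop.example/mug", [0.0, 1.0, 0.0]),
]
print(find_page([1.0, 0.0, 0.2], pages))  # https://shop.example/coat
print(find_page([0.0, 0.0, 1.0], pages))  # None
```

Returning `None` below the threshold leaves room for the manual path, in which the user U designates the web page.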

When attaching a tag to a tagging target, the tagging unit 134 places the tag so that it does not overlap other targets or other tags.
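A simple way to keep a tag from overlapping other targets and other tags is to try a few candidate positions around the target and keep the first one whose rectangle is free. The sketch below is one such greedy approach under stated assumptions, not the method of the embodiment: rectangles are `(x, y, width, height)` tuples in screen pixels, and `overlaps`, `place_tag`, and the candidate offsets are hypothetical.

```python
def overlaps(a, b):
    """True if two axis-aligned rectangles (x, y, w, h) intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def place_tag(anchor, size, occupied,
              offsets=((10, -30), (10, 10), (-110, -30), (-110, 10))):
    """Return the first candidate tag rectangle that overlaps nothing.

    `anchor` is the on-screen point of the tagging target, `size` is the
    tag's (width, height), and `occupied` lists the rectangles of other
    targets and of tags already placed. Returns None if every candidate
    position is blocked.
    """
    width, height = size
    for dx, dy in offsets:
        box = (anchor[0] + dx, anchor[1] + dy, width, height)
        if not any(overlaps(box, other) for other in occupied):
            return box
    return None


occupied = [(105, 65, 120, 30)]  # e.g. a tag that is already on screen
print(place_tag((100, 100), (100, 20), occupied))  # (110, 110, 100, 20)
```

Here the first candidate, above and to the right of the target, collides with the existing tag, so the tag is placed below and to the right instead.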

When attaching a tag to a tagging target, the tagging unit 134 attaches the tag if the tagging target is not hidden by another target.

Even when the tagging target is hidden by another target, if the tag of the tagging target has a higher display priority than the tag of the other target, the tagging unit 134 attaches the tag to the tagging target and does not attach a tag to the other target.

Through any one or a combination of the processes described above, the information processing device according to the present application can further improve the quality of service provision using multi-view images.

[8. Hardware configuration]
The terminal device 10 and the information providing device 100 according to the embodiment described above are implemented by, for example, a computer 1000 configured as shown in FIG. 11. The information providing device 100 is used as the example below. FIG. 11 is a diagram illustrating an example of the hardware configuration. The computer 1000 is connected to an output device 1010 and an input device 1020, and has a configuration in which an arithmetic device 1030, a primary storage device 1040, a secondary storage device 1050, an output I/F (Interface) 1060, an input I/F 1070, and a network I/F 1080 are connected by a bus 1090.

The arithmetic device 1030 operates based on programs stored in the primary storage device 1040 or the secondary storage device 1050, programs read from the input device 1020, and the like, and executes various types of processing. The arithmetic device 1030 is implemented by, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field Programmable Gate Array).

The primary storage device 1040 is a memory device, such as a RAM (Random Access Memory), that temporarily stores data used by the arithmetic device 1030 for various calculations. The secondary storage device 1050 is a storage device in which data used by the arithmetic device 1030 for various calculations and various databases are registered, and is implemented by a ROM (Read Only Memory), an HDD (Hard Disk Drive), an SSD (Solid State Drive), a flash memory, or the like. The secondary storage device 1050 may be internal storage or external storage. The secondary storage device 1050 may also be a removable storage medium such as a USB (Universal Serial Bus) memory or an SD (Secure Digital) memory card, or may be cloud storage (online storage), a NAS (Network Attached Storage), a file server, or the like.

The output I/F 1060 is an interface for transmitting information to be output to the output device 1010, which outputs various types of information and may be a display, a projector, a printer, or the like. The output I/F 1060 is implemented by, for example, a connector conforming to a standard such as USB (Universal Serial Bus), DVI (Digital Visual Interface), or HDMI (registered trademark) (High Definition Multimedia Interface). The input I/F 1070 is an interface for receiving information from various input devices 1020 such as a mouse, a keyboard, a keypad, buttons, and a scanner, and is implemented by, for example, USB.

The output I/F 1060 and the input I/F 1070 may be connected wirelessly to the output device 1010 and the input device 1020, respectively. That is, the output device 1010 and the input device 1020 may be wireless devices.

The output device 1010 and the input device 1020 may also be integrated, as in a touch panel. In this case, the output I/F 1060 and the input I/F 1070 may likewise be integrated as an input/output I/F.

Note that the input device 1020 may be a device that reads information from, for example, an optical recording medium such as a CD (Compact Disc), a DVD (Digital Versatile Disc), or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

The network I/F 1080 receives data from other devices via the network N and sends the data to the arithmetic device 1030, and transmits data generated by the arithmetic device 1030 to other devices via the network N.

The arithmetic device 1030 controls the output device 1010 and the input device 1020 via the output I/F 1060 and the input I/F 1070. For example, the arithmetic device 1030 loads a program from the input device 1020 or the secondary storage device 1050 onto the primary storage device 1040 and executes the loaded program.

For example, when the computer 1000 functions as the information providing device 100, the arithmetic device 1030 of the computer 1000 implements the functions of the control unit 130 by executing a program loaded onto the primary storage device 1040. The arithmetic device 1030 may also load a program acquired from another device via the network I/F 1080 onto the primary storage device 1040 and execute the loaded program. In addition, the arithmetic device 1030 may cooperate with another device via the network I/F 1080 and call and use functions, data, and the like of a program from another program on that device.

[9. Others]
Although embodiments of the present application have been described above, the present invention is not limited by the contents of these embodiments. The components described above include those that a person skilled in the art could easily conceive of, those that are substantially the same, and those within the so-called range of equivalents. The components described above can be combined as appropriate, and various omissions, substitutions, or modifications of components can be made without departing from the gist of the embodiments described above.

Of the processes described in the above embodiment, all or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by known methods. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and in the drawings can be changed arbitrarily unless otherwise specified. For example, the various information shown in each drawing is not limited to the illustrated information.

Each component of each illustrated device is a functional concept and does not necessarily need to be physically configured as illustrated. In other words, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

For example, the information providing device 100 described above may be implemented by a plurality of server computers, and some functions may be implemented by calling an external platform or the like through an API (Application Programming Interface), network computing, or the like. The configuration can thus be changed flexibly.

The embodiment and modifications described above can be combined as appropriate to the extent that the processing contents do not contradict each other.

The term "unit" (section, module, unit) used above can be read as "means", "circuit", or the like. For example, the acquisition unit can be read as acquisition means or an acquisition circuit.

1 Information processing system
10 Terminal device
34 Guide display unit
35 Guide changing unit
36 Shooting determination unit
37 Recognition unit
38 Notification unit
100 Information providing device
110 Communication unit
120 Storage unit
121 User information database
122 History information database
123 Image information database
130 Control unit
131 Acquisition unit
132 Specifying unit
133 Estimation unit
134 Tagging unit
135 Tag changing unit
136 Image conversion unit
137 Providing unit

Claims

1. An information processing device comprising:
a specifying unit that specifies a target of annotation tagging from among shooting targets included in a multi-view image;
an estimation unit that estimates a three-dimensional position of the tagging target within the multi-view image; and
a tagging unit that attaches a tag to the tagging target in accordance with the three-dimensional position of the tagging target.

2. The information processing device according to claim 1, wherein the specifying unit specifies and classifies the shooting targets by image recognition for each viewpoint image of the multi-view image.

3. The information processing device according to claim 1 or 2, further comprising:
an acquisition unit that acquires the multi-view image from a user; and
a reception unit that receives a selection of the tagging target from the user,
wherein the specifying unit specifies the target of annotation tagging from among the shooting targets included in the multi-view image in accordance with the user's selection of the tagging target.

4. The information processing device according to claim 3, wherein the reception unit receives, from the user, the selection of the tagging target and a designation of a web page to be associated with the tagging target.

5. The information processing device according to any one of claims 1 to 4, wherein the specifying unit uses image recognition to search a plurality of web pages on a network for images similar to the image of the tagging target, and automatically specifies a web page containing a similar image as a web page to be associated with the tagging target.

6. The information processing device according to any one of claims 1 to 5, wherein, when attaching the tag to the tagging target, the tagging unit places the tag so that it does not overlap other targets or other tags.

7. The information processing device according to any one of claims 1 to 6, wherein, when attaching the tag to the tagging target, the tagging unit attaches the tag if the tagging target is not hidden by another target.

8. The information processing device according to any one of claims 1 to 7, wherein, even when the tagging target is hidden by another target, if the tag of the tagging target has a higher display priority than the tag of the other target, the tagging unit attaches the tag to the tagging target and does not attach a tag to the other target.

9. An information processing method executed by an information processing device, the method comprising:
a specifying step of specifying a target of annotation tagging from among shooting targets included in a multi-view image;
an estimation step of estimating a three-dimensional position of the tagging target within the multi-view image; and
a tagging step of attaching a tag to the tagging target in accordance with the three-dimensional position of the tagging target.

10. An information processing program that causes a computer to execute:
a specifying procedure of specifying a target of annotation tagging from among shooting targets included in a multi-view image;
an estimation procedure of estimating a three-dimensional position of the tagging target within the multi-view image; and
a tagging procedure of attaching a tag to the tagging target in accordance with the three-dimensional position of the tagging target.