JP5946315B2

JP5946315B2 - Image search system

Info

Publication number: JP5946315B2
Application number: JP2012095036A
Authority: JP
Inventors: 説男木村
Original assignee: 説男木村
Priority date: 2012-04-18
Filing date: 2012-04-18
Publication date: 2016-07-06
Anticipated expiration: 2032-04-18
Also published as: JP2013222406A

Description

The present invention relates to an image search system, an image search apparatus, and a computer program that retrieve from a database the original image corresponding to an image shown by the camera function of a portable terminal, and provide information related to that original image.

Camera functions have become standard on portable terminals. When a user holds the terminal up toward a subject, the area captured by the camera lens is shown in a predetermined area of the terminal's display. If information related to the displayed image could be accessed in real time, this could be applied in many fields, such as presentations and education.
From this point of view, one system that extracts from a database images similar to an image taken with a mobile phone is the "information processing apparatus and portable terminal" disclosed in Patent Document 1.

Patent Document 1: JP 2010-205121 A

In Patent Document 1, the user designates a region of interest in the image, and the system attempts to improve search accuracy by extracting and collating feature points with emphasis on that region. This approach has the following drawbacks. Designating a region within a single image requires a person to partition the image by eye, which makes the approach unsuitable for obtaining search results instantly from a captured image. Region designation may be acceptable when there is time to search at leisure, but it is awkward in hurried situations such as a product search at a customer's premises. Designating a region can also be difficult in itself, for example when photographing a pamphlet whose entire page is covered with printed text.
Further, Patent Document 1 does not state whether "image" includes moving images, and even if it does, it does not mention what processing would be performed. If the function were extended to include moving images in the search target, the need for a person to designate a region would become a bottleneck.
In view of these points, an object of the present invention is to provide a moving image search service that handles moving images as well as still images, without reducing search accuracy even when the entire captured image is used as the search key.

To achieve the above object, the present invention provides an image search system comprising:
input means for accepting user input;
output means for displaying a still image or a moving image (hereinafter, "captured image") obtained by imaging means;
feature point extraction means for extracting feature points of the displayed captured image;
an original image feature point database storing feature points of a group of images collected in advance;
feature point collation means for collating the extracted feature points of the captured image with feature points retrieved from the original image feature point database, and for retrieving information that specifies an image satisfying a condition (hereinafter, "hit image"); and
image-related information acquisition means for acquiring information related to the hit image on the basis of the information specifying the hit image,
wherein, when a moving image is the target of feature point extraction, the feature point extraction means takes a plurality of still images, extracts the feature points of each still image, and collects the feature points of the series of still images as the feature points of the moving image.
Here, the "imaging means" is assumed to be the camera function built into a portable terminal. The purpose of an invention with this configuration is to find, among the registered original images, an image similar to the image shown on the portable terminal, and various applications are conceivable. One example, described in the second embodiment, is a sales representative giving a presentation at a customer's premises using this system.
"Information specifying the hit image" means an image ID, a hash character string, or the like.
In the present invention, when the search target is a moving image, a plurality of still images are taken and the feature points of each still image are collected as the feature points of the moving image. The invention can therefore handle moving images as well as still images. For example, it is possible to photograph a television screen and determine from its feature points which program of which television station is being shown.
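The handling of moving images described above can be illustrated with a minimal sketch. It assumes that some per-frame extractor already exists; the names `moving_image_features`, `extract`, and `toy_extract` are hypothetical and do not appear in the patent, and how the still frames are sampled is left open, as it is in the text.

```python
from typing import Callable, Iterable, List, Sequence

Descriptor = Sequence[int]  # the vector values of one feature point


def moving_image_features(
    frames: Iterable[object],
    extract: Callable[[object], List[Descriptor]],
) -> List[Descriptor]:
    """Collect the feature points of every sampled still frame into one list."""
    collected: List[Descriptor] = []
    for frame in frames:
        # Each still frame is processed exactly like a single still image.
        collected.extend(extract(frame))
    return collected


# Toy stand-in extractor (assumption): each "frame" is already a list of descriptors.
def toy_extract(frame):
    return list(frame)


frames = [[(1, 2)], [(3, 4), (5, 6)]]
features = moving_image_features(frames, toy_extract)
```

The combined list is then used as the search key in the same way as the feature points of a single still image.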

To achieve the above object, the image search system of the present invention can also take the following form, in which a server and a portable terminal are connected via a communication network.
The portable terminal comprises:
input means for accepting user input;
output means for displaying a captured image obtained by imaging means;
feature point extraction means for extracting feature points of the displayed captured image;
image search request transmission means for transmitting the extracted feature points of the captured image to the server and requesting identification of the hit image;
image search result reception means for receiving the search result; and
image-related information acquisition means for acquiring information stored in association with the hit image,
wherein, when a moving image is the target of feature point extraction, the feature point extraction means takes a plurality of still images, extracts the feature points of each still image, and collects the feature points of the series of still images as the feature points of the moving image.
The server comprises:
an original image feature point database storing feature points of original images; and
feature point collation means for collating the feature points received from the image search device with feature points retrieved from the original image feature point database, and for retrieving the hit image.
Because the server handles the computationally heavy feature point collation, a large volume of image data can be processed and search accuracy improves.

To achieve the above object, the present invention preferably includes feature point creation means for extracting feature points of original images and registering them in the original image feature point database.
Because the feature point sets are created automatically in advance, feature point collation can be performed in real time while the portable terminal is being operated.

To achieve the above object, the feature point collation means preferably extracts, as candidate images, original images that have many feature points close in Euclidean distance to the feature points of the captured image, performs a positional-relationship preservation check on each extracted candidate image, and regards a candidate image judged to preserve the positional relationship as the hit image.
Combining the positional-relationship preservation check in this way improves the accuracy of the extraction result, because comparing only the distances between the feature points of the captured image and those of the original image may leave noise in the result.
The "positional-relationship preservation check" works as follows. From the set of feature points of the captured image judged to be close in distance, and from the corresponding set of feature points on the candidate image side, one corresponding feature point is taken from each. Using the vector from each set's centroid to the feature point taken as a reference, the check examines whether each of the other feature points lies to the left or to the right. A point is scored whenever the corresponding feature points from the two sets lie on the same side, and the positional relationship is judged to be preserved if the score is at or above a predetermined value.

Because the steps from capturing an image to finding the hit image are processed in a short time, related information becomes accessible as soon as an image is captured with the portable terminal.
The present invention is therefore effective as a sales support tool that a sales representative uses at a customer's premises to search for products and obtain related information. One application is a sales representative giving a presentation at a customer's premises using this system. Conventional sales materials, in which images and their related information are mixed together, become thick catalogs that are not easy to carry and are prone to confusion and omissions. The information is therefore divided between the representative's own portable terminal and a simple pamphlet. Using the two in a mutually complementary way improves sales efficiency.
In the present invention, the database stores the feature points of each image rather than the image itself. As a result, copyright problems can be avoided, and sales activities that involve privacy can also be supported.

A diagram showing the system configuration and functional blocks of the first embodiment.
A flowchart showing the processing outline of the first embodiment.
A diagram showing the data structure stored in the original image feature point database of the first embodiment.
A flowchart explaining the feature point collation process of the first embodiment.
A diagram showing the system configuration and functional blocks of the second embodiment.
A flowchart showing the processing outline of the second embodiment.
A diagram for explaining the moving image search process, an application example of the second embodiment.
A diagram showing an example of the data held in the original image related information storage means referred to in the moving image search process, an application example of the second embodiment.

<< First Embodiment >>
A system according to one embodiment of the present invention (hereinafter, "the present system") is described below with reference to the drawings.

As shown in FIG. 1, the present system comprises a portable terminal 1 used by a user, a server 2 that provides an image search service, and, where appropriate, an external Web server 3, which are connected to one another via a communication network such as the Internet N.

The portable terminal 1 is a portable information processing device such as a smartphone.
The portable terminal 1 has input means 4, output means 5, imaging means 6, storage means 7, processing means 8, and communication interface means (not shown).

The input means 4 includes a touch panel laid over the screen of the output means 5. Instructions to start and stop the feature point extraction program, access to the Web server 3, and similar operations are performed through the input means 4.
The output means 5 must include a display screen and may also include a speaker.
The imaging means 6 consists of a camera lens and an image sensor; this image capture function is essential for a portable terminal used in the present system.

The storage means 7 stores the computer programs that implement the various processes of the processing means 8, along with the parameters needed to execute those programs and intermediate results of processing. A portable terminal used in the present system must have enough memory to run the program that extracts feature points from captured images.

The processing means 8 has feature point extraction means 9, image search request transmission means 10, image search result reception means 11, and image-related information acquisition means 12.
The feature point extraction means 9 extracts feature points from the captured image that the imaging means 6 shows on the screen of the output means 5.
The image search request transmission means 10 transmits the extracted feature points of the captured image to the server 2. These feature points serve as the search key for the image search. The search key is the feature points, not the captured image itself, because transmitting the image directly could raise copyright problems and, if people or private homes appear in it, could infringe privacy.
The image search result reception means 11 receives the search result. When the server 2 finds feature points judged to correspond to the search key, it sends information related to the image those feature points belong to, that is, information considered relevant to the image the user captured. For example, if the user photographs the entrance of a store, the terminal might receive the URL of that store's website.
The image-related information acquisition means 12 accesses information related to the captured image on the basis of the information sent from the server 2. For example, if the server 2 sends a URL, the terminal accesses the corresponding website using that URL.
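As an illustration of the point that only feature points, and never pixel data, leave the terminal, the request and response might be sketched as below. The patent does not specify a wire format; JSON, the field names `feature_points` and `url`, and both function names are assumptions made for this sketch only.

```python
import json
from typing import List


def build_search_request(descriptors: List[List[int]]) -> str:
    """Serialize only the feature point vectors; no image data is included."""
    # "feature_points" is a hypothetical field name, not taken from the patent.
    return json.dumps({"feature_points": descriptors})


def parse_search_response(body: str) -> str:
    """Return the related-information URL carried in the server's reply."""
    # "url" is likewise a hypothetical field name.
    return json.loads(body)["url"]


request_body = build_search_request([[1, 2], [3, 4]])
```

The terminal would then pass the URL returned by `parse_search_response` to whatever component opens the corresponding website.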

The division of the processing means 8 into means 9 to 12 is for convenience of explanation; the means are not sharply separated from one another. They are realized by installing a predetermined program on the portable terminal 1. In other words, the system is assumed to be provided to users as application software (an app) for portable terminals, for example as an APK file.

The server 2 is an information processing apparatus that has storage means 13, processing means 14, and input/output means and communication interface means (not shown).

The storage means 13 comprises original image information storage means 15, which stores the information to be searched, memory (not shown) for intermediate results of processing, storage for computer programs, and the like.
The original image information storage means 15 contains an original image feature point database (hereinafter, "feature point DB") 16, a feature point index database (hereinafter, "index DB") 17, and an original image related information database (hereinafter, "related information DB") 18. These databases are described later.

The processing means 14 of the server 2 has feature point creation means 19, search key reception means 20, feature point collation means 21, and hit image information transmission means 22.
The feature point creation means 19 creates the feature points of the original images and an index for searching during collation, and registers them in the feature point DB 16 and the index DB 17.
The search key reception means 20 receives from the portable terminal 1 the feature point information that serves as the image search key.
The feature point collation means 21 collates the received search key with the feature points of the original images registered in advance in the feature point DB 16, and takes as the hit image the original image that has the largest number of nearest feature points.
The hit image information transmission means 22 retrieves information about the hit image, for example a URL, from the related information DB 18 and transmits it to the portable terminal 1.

Next, the operation of the present system is described with reference to FIG. 2.
On the server 2 side, the feature points of the original images and the index for searching during collation are created (step S1). This process runs independently of step S2 and the subsequent steps. It is completed before the system goes into operation and is updated as needed afterward.
Feature points are extracted using, for example, the well-known ORB (Oriented FAST and Rotated BRIEF) algorithm.
(For details, see http://www.willowgarage.com/papers/orb-efficient-alternative-sift-or-surf and similar sources.)
Feature points are taken from thousands to tens of thousands of original images and stored in the feature point DB 16 and the index DB 17. Because the feature points are extracted in advance, the feature point collation can be performed quickly.

The feature point DB 16 stores feature point information for each original image. As shown in FIG. 3, the data for each original image consists of the ID of the image to which the feature points belong (an int, or a hash character string of the original image), the number of feature points obtained from the original image (which varies from image to image), and the feature point vector values (32 int values).
It also holds the number of feature points and the feature point vector values obtained from a reduced image, and the same for an enlarged image.
Note that the database does not hold the data of the original image itself. The feature point vector has 16 directions, each with one pair of X and Y coordinate values, so each feature point has 32 vector values in total. Taking the vector in as many as 16 directions and using int as the value type improves both search speed and accuracy.
This embodiment uses the ORB algorithm, which has the drawback of being sensitive to changes in size. To compensate for this and maintain accuracy, the feature points of reduced and enlarged images are stored in the database together with those of the original image. They are needed because, even for the same subject, the number of feature points and the feature points extracted differ with the resolution.
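The record layout described for the feature point DB 16 could be modeled roughly as follows. This is only a sketch of the fields named in the text; the class names `FeatureSet` and `FeaturePointRecord` and the constant `DESCRIPTOR_LENGTH` are hypothetical, and the actual storage format is not given in the patent.

```python
from dataclasses import dataclass, field
from typing import List, Union

DESCRIPTOR_LENGTH = 32  # 16 directions x (X, Y), as stated in the description


@dataclass
class FeatureSet:
    """Feature points extracted from one resolution of an image."""
    vectors: List[List[int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every feature point must carry exactly 32 int values.
        for vector in self.vectors:
            if len(vector) != DESCRIPTOR_LENGTH:
                raise ValueError("each feature point vector needs 32 int values")

    @property
    def count(self) -> int:
        """Number of feature points; this varies from image to image."""
        return len(self.vectors)


@dataclass
class FeaturePointRecord:
    """One entry of the feature point DB; the image pixels are not stored."""
    image_id: Union[int, str]  # int ID or hash string of the original image
    original: FeatureSet       # feature points from the original image
    reduced: FeatureSet        # feature points from the reduced image
    enlarged: FeatureSet       # feature points from the enlarged image


record = FeaturePointRecord(
    image_id="abc123",
    original=FeatureSet([[0] * 32, [1] * 32]),
    reduced=FeatureSet([[2] * 32]),
    enlarged=FeatureSet(),
)
```

Keeping the three resolutions as separate sets in one record mirrors the text's reason for storing them: the same subject yields different feature points at different sizes.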

The index DB 17 is created using the FLANN (Fast Library for Approximate Nearest Neighbors) algorithm. FLANN is a fast approximate method for k-nearest-neighbor search over high-dimensional features. An index tree is built on this basis, and collation proceeds along the tree. In practice OpenCV (Open Source Computer Vision Library) is used; because it is publicly known and its use is a matter of calling library functions, the details are omitted.

The related information DB 18 stores information related to each image, associated with the ID (or hash character string) of the image to which the feature points belong.

Next, the portable terminal 1 extracts feature points from the image shown on its screen when the camera lens is held up to the subject (step S2). A single image can yield several hundred feature points. The extraction algorithm is exactly the same as in step S1.
The extracted feature points are transmitted to the server 2 as the image search key (step S3).

The server 2 collates the received feature points with the feature points of the registered original images (step S4).
The feature point collation process is described with reference to FIG. 4.
Let J be the number of original images. In step S101, the loop variable j is initialized to 1. Next, the feature point data of an original image is retrieved from the feature point DB 16 (step S102). The number of feature points that are nearest in Euclidean distance to the feature points of the search key is converted into a score and stored in the storage means 13 as an intermediate result (step S103). The stored information records which feature point of the search key corresponds to which feature point of which original image, and is referred to in the positional-relationship preservation check. In step S104, it is determined whether j = J. If not, 1 is added to the loop variable j (step S105), and the process returns to step S102 and repeats.
If j = J, the K original images with the highest scores are extracted from the intermediate results as candidate images (step S106). Noise removal is not considered up to this point. All data is treated as signal (that is, genuine data), so a generous number of candidate images is adopted.
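One possible reading of steps S101 to S106 is a voting scheme: each feature point of the search key gives one point to the original image that owns its nearest feature point, and the K images with the most points become candidates. The sketch below follows that reading, which is an interpretation rather than the patent's stated procedure. It also uses a plain linear scan where the embodiment uses the FLANN index, and the function name `top_candidates` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[int]


def top_candidates(
    query: List[Vector],
    database: Dict[str, List[Vector]],
    k: int,
) -> Tuple[List[str], Dict[str, List[Tuple[int, int]]]]:
    """Return the K best-scoring image IDs and the matched point index pairs."""
    scores: Counter = Counter()
    pairs: Dict[str, List[Tuple[int, int]]] = {}
    for qi, q in enumerate(query):
        best = None  # (distance, image ID, index of the feature point)
        for image_id, vectors in database.items():
            for vi, v in enumerate(vectors):
                d = math.dist(q, v)  # Euclidean distance
                if best is None or d < best[0]:
                    best = (d, image_id, vi)
        if best is not None:
            _, image_id, vi = best
            scores[image_id] += 1  # one point per nearest match
            # Remember which query point matched which image point, for the
            # later positional-relationship preservation check.
            pairs.setdefault(image_id, []).append((qi, vi))
    ranked = [image_id for image_id, _ in scores.most_common(k)]
    return ranked, pairs


db = {"a": [(0, 0), (10, 10)], "b": [(100, 100)]}
ranked, pairs = top_candidates([(1, 1), (9, 9), (99, 99)], db, k=1)
```

The returned `pairs` plays the role of the intermediate result stored in step S103, recording which search key feature point corresponds to which feature point of which original image.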

The processing that follows is aimed at removing noise. There are various methods of noise removal; in the present invention it is done by determining whether the positional relationship is preserved.
This can be regarded as a more global check of the correspondences. Noise is not expected to follow any global regularity, so images that were adopted only because of noise are eliminated at this stage.
First, in step S107, the loop variable k is initialized to 1. This k is counted up in later steps until it reaches K, the number of candidate images set in the earlier stage. The counter variable i is also initialized to 0 in step S107.
Next, in step S108, the extracted original images are checked for preservation of the positional relationship, in descending order of score. The feature points of the image captured by the portable terminal 1 are put into one-to-one correspondence with the feature points in a given candidate image. Of the feature points so matched, the set on the captured image side is called subset A and the set on the candidate image side is called subset B. Subset A is a set of search key feature points stored as an intermediate result in step S103, and subset B is a set of feature points of one original image.
For subset A and subset B, the centroids Ga and Gb are calculated from the positions of the feature points.
Next, two points are taken from subset A and called point 1A and point 2A. The points in subset B that correspond to them are called point 1B and point 2B. The check examines whether the side (left or right) on which point 2A lies relative to the vector Ga -> point 1A is the same as the side on which point 2B lies relative to the vector Gb -> point 1B. This is examined for every combination of two points, and the proportion of agreeing combinations among all combinations is calculated. If this proportion is at or above a preset threshold, the positional relationship is judged to be preserved; in other words, the candidate image has also passed the noise check.
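The left/right test in step S108 can be sketched with the sign of a 2D cross product. The patent does not say how the side is computed, so the cross product is an assumption, as are the function names and the default threshold of 0.8. The two input lists are assumed to be index-aligned, so that `subset_a[i]` corresponds to `subset_b[i]`.

```python
from itertools import combinations
from typing import List, Tuple

Point = Tuple[float, float]


def centroid(points: List[Point]) -> Point:
    """Centroid (Ga or Gb) of a set of feature point positions."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def side(g: Point, p1: Point, p2: Point) -> int:
    """Sign of the cross product (p1 - g) x (p2 - g): +1 or -1 for the two
    sides of the vector g -> p1, and 0 if p2 lies on the line through it."""
    cross = (p1[0] - g[0]) * (p2[1] - g[1]) - (p1[1] - g[1]) * (p2[0] - g[0])
    return (cross > 0) - (cross < 0)


def agreement_ratio(subset_a: List[Point], subset_b: List[Point]) -> float:
    """Fraction of point pairs whose left/right relation matches in A and B."""
    ga, gb = centroid(subset_a), centroid(subset_b)
    total = agree = 0
    for i, j in combinations(range(len(subset_a)), 2):
        total += 1
        if side(ga, subset_a[i], subset_a[j]) == side(gb, subset_b[i], subset_b[j]):
            agree += 1
    return agree / total if total else 0.0


def position_preserved(
    subset_a: List[Point], subset_b: List[Point], threshold: float = 0.8
) -> bool:
    """True if the agreement ratio reaches the (assumed) threshold."""
    return agreement_ratio(subset_a, subset_b) >= threshold


a = [(0, 0), (4, 0), (4, 3), (0, 5)]
scaled = [(10, 10), (18, 10), (18, 16), (10, 20)]  # a scaled by 2, shifted by 10
mirrored = [(0, 0), (-4, 0), (-4, 3), (0, 5)]      # a mirrored in the Y axis
```

In this toy data the scaled and shifted copy keeps every left/right relation, while the mirrored copy reverses all of them, which is the kind of global inconsistency the check is meant to reject.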

If the positional relationship is preserved (Yes in step S108), the counter variable i is incremented and the original image is saved as an intermediate result (step S109). Next, it is determined whether k = K (step S110). If not, 1 is added to the loop variable k (step S111), and the process returns to step S108 and repeats.
If k = K (Yes in step S110), it is determined whether any original image has passed the positional-relationship preservation check (step S112). If so (Yes in step S112), the one with the highest agreement ratio in step S108 among the i original images is taken as the hit image (step S113).
If no original image has passed (No in step S112), the system may treat the search as having no hit image, or it may present hit images to the user as follows (step S114): up to a predetermined upper limit K2 of the K candidates are regarded as hit images (K > K2 >= 1). The rule for choosing the K2 images is decided in advance, for example whether to give priority to the score obtained in step S103 or to the agreement ratio obtained in step S108.
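The selection logic of steps S107 to S114 can be condensed into a short sketch. The function name `select_hit_images` and its parameters are hypothetical, and the fallback here ranks by the step S103 score order, which is only one of the two rules the text mentions.

```python
from typing import Callable, List


def select_hit_images(
    candidates: List[str],
    match_rate: Callable[[str], float],
    threshold: float,
    k2: int = 1,
) -> List[str]:
    """candidates: the K candidate IDs, ordered best score first (step S106)."""
    # Step S108: agreement ratio of the positional-relationship check per candidate.
    rates = {image_id: match_rate(image_id) for image_id in candidates}
    passed = [image_id for image_id in candidates if rates[image_id] >= threshold]
    if passed:
        # Step S113: the passing candidate with the highest agreement ratio wins.
        return [max(passed, key=rates.get)]
    # Step S114 (optional fallback): present up to K2 candidates anyway,
    # here chosen by the original score order.
    return candidates[:k2]


example_rates = {"a": 0.5, "b": 0.9, "c": 0.85}
best = select_hit_images(["a", "b", "c"], example_rates.get, threshold=0.8)
fallback = select_hit_images(["a", "b", "c"], example_rates.get, threshold=0.95, k2=2)
```

Returning an empty list instead of the fallback would correspond to the other option in the text, treating the search as having no hit image.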

サーバ２は、当該当たり画像に関連する情報を関連情報ＤＢ１８から取り出して携帯端末１に送信する（ステップＳ５）。
携帯端末１は、サーバ２から受信した情報にもとづいて、撮影画像に関連する情報の提供を受ける（ステップＳ６）。例えば、提供を受けた情報がＵＲＬであれば、このＵＲＬに基づいてＷｅｂサーバ３にアクセスしてＷｅｂページを取得して画面に表示させる。
このように本システムは、携帯端末のカメラで撮影した画像について、関連する情報をその場で取得し表示させることができるので、ビジネス、教育、娯楽などさまざまな場面で活用することが期待される。 The server 2 extracts information related to the hit image from the related information DB 18 and transmits the information to the portable terminal 1 (step S5).
The portable terminal 1 receives provision of information related to the photographed image based on the information received from the server 2 (step S6). For example, if the provided information is a URL, the Web server 3 is accessed based on the URL, a Web page is acquired and displayed on the screen.
In this way, this system can acquire and display, on the spot, information related to an image taken with the camera of a portable terminal, so it is expected to be used in various situations such as business, education, and entertainment.

《第２の実施形態》
以下、本発明の第２の実施の形態のシステムについて説明する。
このシステムは第１の実施の形態と比べ、携帯端末において画像の撮影、特徴点の抽出、登録済原画像の特徴点との照合、当たり画像のＩＤ取り出しまでを行い、サーバには当たり画像に関する情報の問い合わせのみを行う点で第１の実施形態と相違する。
以下、図面を参照しながら主に第１の実施形態との相違点を説明する。図中、第１の実施形態と機能が同じものには同一の符号を付する。 << Second Embodiment >>
The system according to the second embodiment of the present invention will be described below.
This system differs from the first embodiment in that the portable terminal itself captures the image, extracts the feature points, collates them with the feature points of the registered original images, and retrieves the ID of the hit image, and queries the server only for information about the hit image.
The differences from the first embodiment will be mainly described below with reference to the drawings. In the figure, components having the same functions as those of the first embodiment are denoted by the same reference numerals.

《１．本システムの構成》
本システムは、図５に示すように、ユーザが使用する携帯端末１０１と画像検索を利用した情報提供サービスを行うサーバ１０２と、適宜外部のＷｅｂサーバ３とから構成され、それぞれがインターネットＮなどの通信ネットワークを介して接続している。 << 1. Configuration of this system >>
As shown in FIG. 5, the system includes a portable terminal 101 used by a user, a server 102 that provides an information providing service using image search, and, as appropriate, an external Web server 3, each connected via a communication network such as the Internet N.

携帯端末１０１は、スマートフォンのような可搬型の情報処理装置である。
携帯端末１０１は、入力手段４と、出力手段５と、撮像手段６と、記憶手段１０３と、処理手段１０４と、図示しない通信インターフェース手段を有する。 The portable terminal 101 is a portable information processing device such as a smartphone.
The portable terminal 101 includes an input unit 4, an output unit 5, an imaging unit 6, a storage unit 103, a processing unit 104, and a communication interface unit (not shown).

記憶手段１０３は、検索対象となる情報を格納する原画像情報記憶手段１０５と、各種処理の中間結果などを格納するメモリ（図示せず）やコンピュータプログラムの格納手段などから構成される。本システムで使用される携帯端末は、撮影画像の特徴点を抽出するプログラムや特徴点同士を照合するプログラムを実行するために必要なメモリを備えていることが必須である。
原画像情報記憶手段１０５には、原画像特徴点データベース（以下、「特徴点ＤＢ」）１０６が含まれる。特徴点ＤＢ１０６については後で説明する。 The storage unit 103 includes an original image information storage unit 105 that stores information to be searched, a memory (not shown) that stores intermediate results of various processes, a storage unit for computer programs, and the like. The portable terminal used in this system must have a memory necessary for executing a program for extracting feature points of a captured image and a program for matching feature points.
The original image information storage unit 105 includes an original image feature point database (hereinafter, “feature point DB”) 106. The feature point DB 106 will be described later.

処理手段１０４は、特徴点抽出手段１０７、特徴点照合手段１０８、当たり画像情報取得手段１０９、画像関連情報取得手段１１０を有する。
特徴点抽出手段１０７は、撮像手段６によって画面５上に映し出された撮影画像から、特徴点を抽出する。
特徴点照合手段１０８は、撮影画像から抽出した特徴点を、予め特徴点ＤＢ１０６に登録された原画像の特徴点と照合し、最も近い特徴点を最も多く持つ原画像を当たり画像とする。
当たり画像情報取得手段１０９は、当たり画像に関する情報をサーバ１０２に要求し、受信する。
画像関連情報取得手段１１０は、サーバ１０２から送信された情報に基づいて、自分が撮影した画像に関連する情報にアクセスする手段である。例えば、サーバ１０２からＵＲＬが送信されたならば、そのＵＲＬに基づいて該当するＷｅｂサーバ３にアクセスし、取得したＷｅｂページを画面５に表示させる。 The processing unit 104 includes a feature point extracting unit 107, a feature point collating unit 108, a hit image information acquiring unit 109, and an image related information acquiring unit 110.
The feature point extraction unit 107 extracts feature points from the photographed image displayed on the screen 5 by the imaging unit 6.
The feature point collating means 108 collates the feature points extracted from the photographed image with the feature points of the original images registered in advance in the feature point DB 106, and takes as the hit image the original image that has the largest number of nearest feature points.
The hit image information acquisition unit 109 requests information related to the hit image from the server 102 and receives it.
The image related information acquisition unit 110 is a unit that accesses information related to an image captured by the user based on information transmitted from the server 102. For example, if a URL is transmitted from the server 102, the corresponding Web server 3 is accessed based on the URL, and the acquired Web page is displayed on the screen 5.

処理手段１０４に含まれる各手段１０７〜１１０の分類は、説明の便宜のためであり、各手段が截然と分かれているわけではない。これらの手段はＡＰＫファイルの形式で提供されるプログラムを携帯端末１０１が実装することにより実現される。つまり、このシステムは携帯端末向けの応用ソフト（アプリ）としてユーザに提供されることを想定している。ただし、第１の実施形態と異なり、原画像の特徴点データもアプリとともに提供される。 The classification of the means 107 to 110 included in the processing means 104 is for convenience of explanation, and the means are not sharply separated. These means are realized by installing, on the portable terminal 101, a program provided in the form of an APK file. That is, this system is assumed to be provided to users as application software (an app) for portable terminals. However, unlike the first embodiment, the feature point data of the original images is also provided together with the app.

サーバ１０２は、記憶手段１１１と、処理手段１１２と、図示しない入出力手段や通信インターフェース手段を有する情報処理装置である。 The server 102 is an information processing apparatus having a storage unit 111, a processing unit 112, and an input / output unit and a communication interface unit (not shown).

記憶手段１１１には、原画像に関連する情報を格納する原画像関連情報記憶手段１１３を有する。
処理手段１１２には、携帯端末１０１から要求された当たり画像に関する情報を送信する当たり画像情報送信手段１１４を有する。 The storage unit 111 includes an original image related information storage unit 113 that stores information related to the original image.
The processing unit 112 includes a hit image information transmitting unit 114 that transmits information related to the hit image requested from the mobile terminal 101.

次に、本システムの動作について、図６に従い説明する。
アプリとともに、原画像の特徴点を格納した特徴点ＤＢ１０６を記憶手段１０３に格納しておく（ステップＳ２０１）。特徴点ＤＢ１０６の格納は、ステップＳ２０２以降の処理とは独立に行われ、アプリの実装時に行われるとともに、適宜更新も可能である。このように、予め特徴点集合を準備しているので、リアルタイムに特徴点の照合が行える。 Next, the operation of this system will be described with reference to FIG.
The feature point DB 106 storing the feature points of the original images is stored in the storage unit 103 together with the application (step S201). Storing the feature point DB 106 is independent of the processing from step S202 onward; it is done when the application is installed, and the DB can be updated as needed. Because the feature point sets are prepared in advance in this way, feature points can be collated in real time.

特徴点抽出のアルゴリズムは、第１の実施形態と同様である。
ただし原画像の個数は、携帯端末１０１のメモリ容量およびＣＰＵの能力を考慮すると、50〜60個くらいが適当である。
この実施形態では、第１の実施形態と異なりインデックスＤＢは予め作成しない。原画像の個数も５０個程度と少ないことから、撮影画像の特徴点と照合する際、原画像の特徴点データを読み込んだ時にインデックスを作成してメモリ上に持つようにする。 The algorithm for feature point extraction is the same as in the first embodiment.
However, considering the memory capacity and CPU capability of the portable terminal 101, about 50 to 60 original images is appropriate.
In this embodiment, unlike the first embodiment, the index DB is not created in advance. Since the number of original images is as small as about 50, when collating with feature points of a captured image, an index is created and stored in the memory when feature point data of the original image is read.

特徴点ＤＢ１０６には、原画像の個数分だけ特徴点情報が格納されている。データ構造は第１の実施形態と同様であって図３に示すように、原画像１個につき、特徴点が属する画像ID(intまたは原画像のhash文字列)、原画像から取得した特徴点の数、特徴点ベクトルの値(32個の整数値)を有する。さらに、縮小画像から取得した特徴点の数と特徴点ベクトルの値、および拡大画像から取得した特徴点の数と特徴点ベクトルの値も有する。 The feature point DB 106 stores feature point information for the number of original images. The data structure is the same as in the first embodiment, and as shown in FIG. 3, for each original image, the image ID to which the feature point belongs (int or the hash character string of the original image), and the feature points acquired from the original image And feature point vector values (32 integer values). Furthermore, it also has the number of feature points and feature point vector values acquired from the reduced image, and the number of feature points and feature point vector values acquired from the enlarged image.
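The record layout described above can be pictured with a small data structure. This is only a sketch: the class and field names are hypothetical, and the only details taken from the text are the image ID (int or hash string), the per-variant feature point count, and the 32 integer values per feature point vector.

```python
from dataclasses import dataclass, field
from typing import List, Union

DESCRIPTOR_LENGTH = 32  # each feature point vector holds 32 integer values


@dataclass
class FeatureSet:
    """Feature point vectors taken from one variant of an original image."""
    descriptors: List[List[int]] = field(default_factory=list)

    @property
    def count(self) -> int:
        """Number of feature points stored for this variant."""
        return len(self.descriptors)

    def add(self, descriptor: List[int]) -> None:
        """Append one feature point vector, rejecting a wrong length."""
        if len(descriptor) != DESCRIPTOR_LENGTH:
            raise ValueError("a feature point vector must have 32 values")
        self.descriptors.append(list(descriptor))


@dataclass
class OriginalImageRecord:
    """One entry of the feature point DB: an image ID plus the feature
    points of the original, reduced, and enlarged images."""
    image_id: Union[int, str]
    original: FeatureSet = field(default_factory=FeatureSet)
    reduced: FeatureSet = field(default_factory=FeatureSet)
    enlarged: FeatureSet = field(default_factory=FeatureSet)
```

Keeping the three variants in separate sets mirrors the description, in which the counts and vectors for the reduced and enlarged images are held alongside those of the original image.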

カメラレンズをかざして画面上に映し出した撮影画像から、特徴点を抽出する（ステップＳ２０２）。抽出のアルゴリズムはステップＳ２０１とまったく同じであり、特徴点の個数は数百個くらいである。
撮影画像の特徴点と登録された原画像の特徴点と照合する（ステップＳ２０３）。
各特徴点に対して、最も近い特徴点を多数持つ原画像データを候補画像として抽出した後、各候補画像について位置関係保存の判定を行い、当たり画像を決定する（ステップＳ２０４）のは、第１の実施形態と同様である。 Feature points are extracted from the captured image projected on the screen with the camera lens held up (step S202). The extraction algorithm is exactly the same as in step S201, and the number of feature points is about several hundred.
The feature point of the captured image is collated with the registered feature point of the original image (step S203).
As in the first embodiment, the original images having many nearest feature points for the individual feature points are extracted as candidate images, the positional relationship preservation check is then performed on each candidate image, and the hit image is determined (step S204).
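The candidate extraction in steps S203 and S204 can be sketched as a nearest-neighbor vote. The function names and the in-memory list used as an index are assumptions, and the Hamming distance is assumed as the distance measure because it is the usual choice for binary descriptors such as ORB; the text itself does not name the measure.

```python
from collections import Counter


def hamming(d1, d2):
    """Number of differing bits between two descriptors given as
    equal-length sequences of non-negative integers."""
    return sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))


def build_index(feature_db):
    """Flatten {image_id: [descriptor, ...]} into a list of
    (descriptor, image_id) pairs held in memory."""
    return [
        (tuple(d), image_id)
        for image_id, descriptors in feature_db.items()
        for d in descriptors
    ]


def rank_candidates(query_descriptors, index, top_k=3):
    """Each feature point of the captured image votes for the original
    image that owns its nearest stored feature point. Returns up to
    top_k (image_id, votes) pairs, most votes first."""
    votes = Counter()
    if not index:
        return []
    for q in query_descriptors:
        _, image_id = min(index, key=lambda entry: hamming(q, entry[0]))
        votes[image_id] += 1
    return votes.most_common(top_k)
```

A linear scan like this is only plausible because the number of original images is small (about 50); the returned candidates would then go through the positional relationship check before a hit image is decided.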

サーバ１０２側に当たり画像のIDを送信する（ステップＳ２０５）。
サーバ１０２は受信したIDに応じた当たり画像の情報を原画像関連情報記憶手段１１３から取り出して送信する（ステップＳ２０６）。
サーバ１０２から受信した当たり画像の情報に応じて、コンテンツを表示する（ステップＳ２０７）。 The ID of the hit image is transmitted to the server 102 side (step S205).
The server 102 retrieves the information of the hit image corresponding to the received ID from the original image related information storage unit 113 and transmits it (step S206).
The content is displayed according to the information of the hit image received from the server 102 (step S207).

この第２の実施形態の活用例として、営業担当の社員に本実施形態の携帯端末を持たせる場合について説明する。
普通、営業担当者は商品情報を載せたパンフレットを客先に持参するが、このパンフレットは客によって変わる。そうでなければ分厚いカタログになってしまう。また時間と共に変化する（値段、在庫の有無、情報用サイトのＵＲＬなど）。その度に新しいパンフレットを作成するわけにはいかない。そこで本発明を活用することになる。
先ずその客に関連のありそうな商品（５０個程度）を撮影して特徴点ＤＢ１０６を作成し、自分の携帯端末１０１に格納する。もちろん作成済みのものをそのまま利用してもよいし、
作成済みのものをコンパクト化してもよい。あるいは臨機応変に客先を訪問する直前に作成してもよい。このように本願発明のシステムを携帯端末に実装すると、極めてハンディーな営業支援ツールとなるのである。なお原画像を撮影する際は後の画像照合を考慮して、通常の画像・縮小画像・拡大画像の３種類を撮る。更に印刷してその客向けの当座のパンフレットを作成してもよい。商品のパンフレットを原画像にするのは、それが最も変化しにくいからである（パンフレットの更新は、新モデルが追加された時ぐらいで頻繁にはないと考えられる）。関連情報（例：ＷｅｂサイトのＵＲＬ）の変化などはパンフレットと無関係に自分の携帯端末だけで吸収しておけばよい。
営業担当者はそのパンフレットと携帯端末を客先に持参する。客先でパンフレットの説明をする際、そのパンフレットの上で携帯端末１０１のカメラレンズをかざすようにして撮影する。携帯端末１０１の処理手段１０４は特徴点の抽出および格納済の特徴点との照合を行い、撮影されたパンフレットはどれかを認識する。認識されたパンフレットに関連するさらに詳しい情報を画面に表示する。
このようにすれば、営業担当者は携帯端末１０１とせいぜい５０商品程度のパンフレットしか持参しなくても、客先で適切かつ詳細な説明をしたり、客の質問に答えたりすることが可能となる。 As an application example of the second embodiment, a case where an employee in charge of sales has the portable terminal of this embodiment will be described.
Normally, sales representatives bring pamphlets with product information to customers, but these pamphlets vary from customer to customer. Otherwise it will be a thick catalog. It also changes over time (price, availability, URL of information site, etc.). You can't create a new brochure each time. Therefore, the present invention is utilized.
First, the products likely to be relevant to that customer (about 50) are photographed, the feature point DB 106 is created, and it is stored in the representative's own portable terminal 101. Of course, an already created DB may be used as is, or an already created one may be made more compact. Alternatively, it may be created ad hoc just before visiting the customer. When the system of the present invention is installed on a portable terminal in this way, it becomes an extremely handy sales support tool. When photographing the original images, three versions (a normal image, a reduced image, and an enlarged image) are taken with later image matching in mind. The images may also be printed to make an interim pamphlet for that customer. The product pamphlet is used as the original image because it is the least likely to change (pamphlet updates are thought to be infrequent, occurring only about when a new model is added). Changes in related information (e.g., a Web site URL) can be absorbed on the representative's own portable terminal alone, independently of the pamphlet.
The sales representative brings the pamphlet and the portable terminal to the customer. When explaining the pamphlet at the customer's site, the representative photographs it by holding the camera lens of the portable terminal 101 over the pamphlet. The processing means 104 of the portable terminal 101 extracts feature points, collates them with the stored feature points, and recognizes which pamphlet was photographed. More detailed information related to the recognized pamphlet is then displayed on the screen.
In this way, even if the sales representative brings only the portable terminal 101 and pamphlets for about 50 products at most, it becomes possible to give appropriate and detailed explanations at the customer's site and to answer the customer's questions.

この第２の実施形態の他の活用例として、動画像の検索がある。
ここでは、図７の左側に示すような順で出現する静止画像Ｆ１、Ｆ２、Ｆ３及びＦ４を含む動画像を例に説明する。動画像の場合は、特徴点ＤＢ１０６には各静止画像の特徴点データを格納する。特徴点ＤＢ１０６にはこれら４画像の特徴点データが格納されるとともに、各画像には一意の画像ＩＤ（あるいはhash文字列）が対応づけられている。なお、特徴点ＤＢ１０６には静止画像か動画像かを区別する情報は設定されていない。携帯端末１０１の撮像手段６をこの動画像にかざすと静止画像Ｆ１〜Ｆ４のそれぞれと同一でなくとも類似した静止画像Ｕ１〜Ｕ４を取得でき、これら静止画像から抽出した特徴点を特徴点ＤＢ１０６中の特徴点データと照合する。その結果、当たり画像としてＦ１→Ｆ２→Ｆ３→Ｆ４の順で抽出される。携帯端末１０１は、サーバ１０２にこれらの当たり画像の画像ＩＤを送信する。
サーバ１０２は、図８にデータ格納例を示すような原画像関連情報記憶手段１１３を参照してＦ１〜Ｆ４に対応する動画像の関連情報を抽出し、携帯端末１０１に送信する。
図８に示すように携帯端末１０１の画面には動画像（Ｄ１）が表示されていたが、サーバ１０２から関連情報を受信すると、画面表示を関連情報（Ｄ２）に変更する。携帯端末１０１は画面表示されたＵＲＬをもとにＷｅｂサーバ３にアクセスなどして当該動画像と関連ある情報を収集することができる。
なお、図８で画像ＩＤ欄に１個のＩＤが登録されているのは原画像が静止画像の場合である。携帯端末１０１における特徴点抽出と特徴点照合の各処理では撮影画像が静止画像か動画像かは区別せず、サーバ１０２側で受信した画像ＩＤによって静止画像か動画像かが区別できる。 Another utilization example of the second embodiment is a search for moving images.
Here, a moving image containing still images F1, F2, F3, and F4 that appear in the order shown on the left side of FIG. 7 is used as an example. For a moving image, the feature point DB 106 stores the feature point data of each still image. The feature point DB 106 thus holds the feature point data of these four images, and each image is associated with a unique image ID (or hash character string). No information distinguishing still images from moving images is set in the feature point DB 106. When the imaging means 6 of the portable terminal 101 is held over this moving image, still images U1 to U4 that are similar, if not identical, to the still images F1 to F4 are obtained, and the feature points extracted from them are collated with the feature point data in the feature point DB 106. As a result, the hit images are extracted in the order F1 → F2 → F3 → F4. The portable terminal 101 transmits the image IDs of these hit images to the server 102.
The server 102 refers to the original image related information storage unit 113 as shown in the data storage example in FIG. 8, extracts the related information of the moving images corresponding to F1 to F4, and transmits it to the mobile terminal 101.
As shown in FIG. 8, the moving image (D1) is displayed on the screen of the mobile terminal 101. When the related information is received from the server 102, the screen display is changed to the related information (D2). The mobile terminal 101 can collect information related to the moving image by accessing the Web server 3 based on the URL displayed on the screen.
In FIG. 8, a single ID is registered in the image ID column when the original image is a still image. The feature point extraction and feature point collation processes in the portable terminal 101 do not distinguish whether the captured image is a still image or a moving image; the server 102 can tell which it is from the image IDs it receives.
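The server-side lookup for still and moving images might look like the following sketch. The table layout, the function names, the example URLs, the exact-sequence matching, and the collapsing of consecutive duplicate IDs are all assumptions made for illustration; the text only states that the image ID column holds one ID for a still image and that the server can tell the two cases apart from the received IDs.

```python
from itertools import groupby


def collapse_repeats(image_ids):
    """Drop consecutive duplicates, e.g. F1, F1, F2 -> F1, F2
    (assumed handling for a frame that is matched more than once)."""
    return tuple(key for key, _ in groupby(image_ids))


def find_related_info(related_info_table, received_ids):
    """related_info_table: list of (image_ids, info) rows, where image_ids
    holds one ID for a still image or an ordered series for a moving image.
    Returns (kind, info) for the matching row, or None if nothing matches."""
    received = collapse_repeats(received_ids)
    for image_ids, info in related_info_table:
        if tuple(image_ids) == received:
            kind = "still" if len(image_ids) == 1 else "moving"
            return kind, info
    return None


# Hypothetical table contents for illustration only.
EXAMPLE_TABLE = [
    (("P1",), "https://example.com/product-1"),
    (("F1", "F2", "F3", "F4"), "https://example.com/movie-1"),
]
```

With this table, the ID series F1, F2, F3, F4 resolves to the moving-image row and the single ID P1 resolves to the still-image row, without the portable terminal having to know which kind of image it captured.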

上記の第１〜第２の実施形態の処理フローやデータベースの構造は例示にすぎず、これらに限るものではない。たとえば、特徴点抽出のアルゴリズムとしてＯＲＢを用いていたが、これに限るものではない。
また、第２の実施形態では、当たり画像に関連する情報をサーバに問い合わせているが、携帯端末に原画像に対応付けて関連情報も格納しておき、当たり画像が見つかれば、この関連情報を参照して画面表示してもよい。
さらに、本願発明の動画検索機能を、例えば現在放送中のテレビ番組を特定するために用いることも可能である。サーバ側で各テレビ局の番組の画像から逐次特徴点を抽出しキューに格納する。一方、あるテレビ番組を見ているユーザは携帯端末のカメラ機能で撮影して特徴点を抽出し、サーバに送信する。サーバは受信した特徴点をキューに格納済の特徴点データと照合し、どのテレビ局が現在放送している番組であるかを特定する。このテレビ番組を特定する機能は、視聴率の推定をはじめ種々の分野での利用が考えうる。 The processing flows and database structures of the first and second embodiments described above are merely examples, and are not limited thereto. For example, the ORB is used as the feature point extraction algorithm, but the present invention is not limited to this.
In the second embodiment, the server is queried for information related to the hit image. Alternatively, the related information may be stored in the portable terminal in association with the original images, so that when a hit image is found, this related information is referenced and displayed on the screen.
Furthermore, the moving image search function of the present invention can be used, for example, to identify a television program currently being broadcast. On the server side, feature points are extracted successively from the images of each television station's programs and stored in a queue. Meanwhile, a user watching a certain television program photographs it with the camera function of the portable terminal, extracts feature points, and transmits them to the server. The server collates the received feature points with the feature point data stored in the queue and identifies which television station's currently broadcast program it is. This function of identifying a television program could be used in various fields, including audience rating estimation.

携帯端末のカメラ機能で撮影した画像に関連する情報を、リアルタイムで画面表示などができるので、特に客先での営業支援ツールとして有効である。また、携帯端末に情報誌のような役割も持たせることにより携帯端末の可能性をさらに拡げることができる。
Since information related to images taken by the camera function of the mobile terminal can be displayed on the screen in real time, it is particularly effective as a sales support tool at customers. Moreover, the possibility of a portable terminal can be further expanded by giving the portable terminal a role like an information magazine.

１：携帯端末、２：サーバ２、３：Ｗｅｂサーバ、
４：入力手段、５：出力手段６：撮像手段、７：記憶手段、８：処理手段、
９：特徴点抽出手段、１０：画像検索要求送信手段、１１：画像検索結果受信手段、
１２：画像関連情報取得手段、１３：記憶手段、１４：処理手段、
１５：原画像情報記憶手段、
１６：原画像特徴点データベース（特徴点ＤＢ）、
１７：特徴点インデックスデータベース（インデックスＤＢ）、
１８：原画像関連情報データベース（関連情報ＤＢ）、
１９：特徴点作成手段、２０：検索キー受信手段、２１：特徴点照合手段、
２２：当たり画像情報送信手段
１０１：携帯端末、１０２：サーバ、１０３：記憶手段、１０４：処理手段、
１０５：原画像情報記憶手段、１０６：原画像特徴点データベース（特徴点ＤＢ）、
１０７：特徴点抽出手段、１０８：特徴点照合手段、１０９：当たり画像情報取得手段、１１０：画像関連情報取得手段、１１１：記憶手段、１１２：処理手段、
１１３：原画像関連情報記憶手段、１１４：当たり画像情報送信手段、
Ｎ：インターネット 1: mobile terminal, 2: server 2, 3: Web server,
4: input means, 5: output means, 6: imaging means, 7: storage means, 8: processing means,
9: Feature point extracting means, 10: Image search request transmitting means, 11: Image search result receiving means,
12: Image related information acquisition means, 13: Storage means, 14: Processing means,
15: Original image information storage means,
16: Original image feature point database (feature point DB),
17: Feature point index database (index DB),
18: Original image related information database (related information DB),
19: feature point creating means, 20: search key receiving means, 21: feature point collating means,
22: Hit image information transmission means
101: Mobile terminal, 102: Server, 103: Storage means, 104: Processing means,
105: Original image information storage means, 106: Original image feature point database (feature point DB),
107: feature point extraction means, 108: feature point collation means, 109: hit image information acquisition means, 110: image related information acquisition means, 111: storage means, 112: processing means,
113: Original image related information storage means, 114: Hit image information transmission means,
N: Internet

Claims

Input means for accepting user input;
Output means for displaying a still image or a moving image (hereinafter referred to as “captured image”) obtained by the imaging means;
Feature point extraction means for extracting feature points of the displayed captured image;
An original image feature point database in which feature points of pre-collected image groups are stored;
A feature point collating unit that collates the feature points of the extracted captured image with the feature points extracted from the original image feature point database, and extracts information that specifies an image that satisfies a condition (hereinafter referred to as “hit image”);
Image-related information acquisition means for acquiring information related to the hit image based on the information specifying the hit image,
An image search system characterized in that, when a moving image is the target of feature point extraction, the feature point extraction means captures a plurality of still images, extracts the feature points of each still image, and treats the collected feature points of the series of still images as the feature points of the moving image.

The server and mobile device are connected via a communication network,
The portable terminal is
Input means for accepting user input;
Output means for displaying a photographed image obtained by the imaging means;
Feature point extraction means for extracting feature points of the displayed captured image;
Image search request transmission means for transmitting the feature points of the extracted captured image to the server and requesting the identification of the hit image;
An image search result receiving unit for receiving a search result, and an image related information acquiring unit for acquiring related information based on information for identifying a hit image,
The feature point extraction means captures a plurality of still images, extracts the feature points of each still image, and treats the collected feature points of the series of still images as the feature points of the image,
The server
An original image feature point database storing feature points of the original image;
A feature point matching unit that compares feature points received from the portable terminal with feature points extracted from the original image feature point database, and extracts information for identifying a hit image;
An image search system comprising: