JP2012160047A

JP2012160047A - Corresponding reference image retrieval device and method thereof, content superimposing apparatus, system, method, and computer program

Info

Publication number: JP2012160047A
Application number: JP2011019575A
Authority: JP
Inventors: Yuichi Yoshida; 悠一吉田; Mitsuru Abe; 満安倍
Original assignee: Denso IT Laboratory Inc
Current assignee: Denso IT Laboratory Inc
Priority date: 2011-02-01
Filing date: 2011-02-01
Publication date: 2012-08-23
Anticipated expiration: 2031-02-01
Also published as: JP5563494B2

Abstract

PROBLEM TO BE SOLVED: To provide a corresponding reference image retrieval device capable of effectively retrieving a reference image corresponding to an input image in an apparatus having limited resources. SOLUTION: A corresponding reference image retrieval device 10 for retrieving a reference image corresponding to an input image includes: a feature amount detection unit 12 which extracts a feature point from the input image and detects a feature amount of the feature point; a binary conversion unit 13 which converts the feature amount detected by the feature amount detection unit 12 into a binary code; a feature point database 14 which stores the feature amounts of the feature points of a plurality of reference images in binary code format; and a matching unit 15 which detects the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-code feature amount of the input image converted by the binary conversion unit 13 with the binary-code feature amounts of the plurality of reference images stored in the feature point database 14.

Description

The present invention relates to a corresponding reference image retrieval device and method for retrieving a reference image corresponding to an input image, and to a content superimposing apparatus, system, and method that use them to superimpose corresponding content on the input image. In particular, it relates to a corresponding reference image retrieval device and method that retrieve a reference image corresponding to an input image using feature points of the image, and to a content superimposing apparatus, system, and method that use them to superimpose corresponding content on the input image.

In recent years, with the spread of camera-equipped mobile terminals, AR (Augmented Reality) technology has been proposed in which a target object is photographed with a camera to obtain an input image, and content corresponding to the target object (for example, an explanation of the target object) is displayed superimposed on the input image.

Conventional techniques for realizing AR include those that do not perform image processing to identify the content corresponding to a target object and those that do. An example of the former uses a GPS receiver together with an electronic compass. In this technique, the position of the target object is stored in a database in advance, the position of the camera-equipped mobile terminal is detected by the GPS receiver, and the orientation of the terminal is detected by the electronic compass. Based on that position and orientation, the target object within the camera's angle of view and its position are estimated, thereby identifying the content corresponding to the target object and the location at which the content is to be superimposed.

The conventional technique that does not perform image processing to identify the content corresponding to the target object can be realized relatively easily if the camera-equipped mobile terminal is equipped with a GPS receiver and an electronic compass. However, because the detection accuracy of the position and orientation depends on the accuracy of the GPS receiver and the electronic compass, it is difficult to superimpose content accurately on the target object.

On the other hand, conventional techniques that perform image processing identify the content corresponding to the target object by recognizing the target object shown in the input image. As such techniques, one that uses an index and one that uses corresponding points have been proposed.

The technique using an index uses, for example, a two-dimensional code as the index. Content and its corresponding two-dimensional code are stored in a database in advance, and a two-dimensional code is attached to the target object or its vicinity. By photographing the target object including the two-dimensional code with a camera-equipped mobile terminal, the content corresponding to the photographed two-dimensional code is retrieved from the database.

According to this conventional technique, the orientation of the camera-equipped mobile terminal relative to the plane bearing the two-dimensional code can be estimated at high speed and with high precision. In addition, because a large number of code patterns can easily be created by adopting a two-dimensional code, a wide variety of objects can be recognized. However, because a two-dimensional code must be attached to the target object or its vicinity, superimposing content on, for example, an outdoor landmark or a large signboard is impractical. Moreover, attaching a two-dimensional code to the target object or its vicinity has a considerable effect on the design of the target object.

The technique using corresponding points obtains corresponding points between the input image and reference images in a database, thereby retrieving the reference image corresponding to the input image, and superimposes the content associated with that reference image on the input image. Using corresponding points makes it possible to realize markerless AR, which needs no index such as a two-dimensional code, so the range of applications widens and the design of the target object is not affected.

In the technique of retrieving a corresponding reference image using corresponding points, a plurality of feature points are extracted from the input image, the feature amount (local feature amount) of each feature point is compared with the feature amounts of the feature points of the reference images to find corresponding feature points (corresponding points), and a reference image containing many corresponding feature points is taken as the reference image corresponding to the input image. Known image feature point representation methods for this purpose include SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features), a faster variant of SIFT (see Non-Patent Document 1). These methods have the advantage of high recognition capability and can recognize a wide variety of objects. In addition, because a list of corresponding points between the input image and the reference image is obtained, the calculations needed to superimpose content on the input image can be carried out easily.

David G. Lowe, "Object recognition from local scale-invariant features," International Conference on Computer Vision, Corfu, Greece (September 1999), pp. 1150-1157

However, the above conventional technique, which retrieves the reference image corresponding to an input image by comparing the feature amounts of feature points, has the following problems. First, the feature amounts are large. Because a feature amount is a vector of several hundred dimensions expressed in single-precision real numbers, when hundreds to thousands of feature points are extracted from one image, the data amount of the feature amounts reaches tens of kilobytes to several megabytes. It is therefore difficult to hold the database in main memory on a device with limited hardware resources, such as a mobile terminal.
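The size estimate above can be checked with simple arithmetic. The following is a minimal sketch, assuming 4 bytes per single-precision value and a 128-dimensional descriptor (the usual SIFT length); the point counts and the helper name `descriptor_bytes` are illustrative choices, not values taken from this document.

```python
# Rough storage estimate for single-precision descriptors (illustrative numbers only).
BYTES_PER_FLOAT32 = 4


def descriptor_bytes(num_points: int, dims: int) -> int:
    """Bytes needed to hold `num_points` descriptors of `dims` float32 values."""
    return num_points * dims * BYTES_PER_FLOAT32


# 128 dimensions is assumed here because it is the usual SIFT descriptor length.
small = descriptor_bytes(num_points=100, dims=128)
large = descriptor_bytes(num_points=2000, dims=128)
print(small, large)
```

Under these assumptions a single image costs roughly 50 KB at 100 feature points and roughly 1 MB at 2000 feature points, before any reference database is considered.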

In addition, the above conventional technique requires calculating the L2 norm (Euclidean distance) between feature amounts expressed as vectors. This distance calculation between high-dimensional vectors imposes an extremely high computational load, so it is difficult to realize on a device with limited computing resources, such as a mobile terminal.

The present invention has been made in view of the above problems, and its object is to provide a corresponding reference image retrieval device and method capable of effectively retrieving a reference image corresponding to an input image even on a device with limited resources, and a content superimposing apparatus, system, and method that use them to superimpose corresponding content on the input image.

In order to solve the above conventional problems, the corresponding reference image retrieval device of the present invention is a device that retrieves a reference image corresponding to an input image, and comprises: a feature amount detection unit that extracts feature points from the input image and detects the feature amounts of the feature points; a binary conversion unit that converts the feature amounts detected by the feature amount detection unit into binary codes; a feature point database that stores the feature amounts of the feature points of each of a plurality of reference images in binary code format; and a matching unit that detects the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-code feature amounts of the input image converted by the binary conversion unit with the binary-code feature amounts of the plurality of reference images stored in the feature point database.

With this configuration, the feature amounts of the plurality of reference images are stored as binary data, and binary-code feature amounts are also what is compared when retrieving the corresponding reference image. The reference image corresponding to the input image can therefore be retrieved effectively even on a device with limited resources.
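To illustrate why comparing binary codes is cheaper than comparing real-valued vectors, the following minimal sketch contrasts the two. It assumes the binary codes are compared by Hamming distance (XOR followed by a bit count); the text at this point only says the binary-code feature amounts are compared, so that choice and the function names are assumptions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: one subtract, multiply, and add per dimension, plus a square root."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of differing bits between two codes held as Python integers."""
    return bin(code_a ^ code_b).count("1")


print(l2_distance([0.0, 3.0], [4.0, 0.0]))        # floating-point path
print(hamming_distance(0b10110010, 0b10011010))   # integer-only path
```

The Hamming path touches only integer XOR and bit counting, operations that are inexpensive on resource-limited processors.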

In the above corresponding reference image retrieval device, the binary conversion unit may use a transformation matrix to convert the feature amounts detected by the feature amount detection unit into binary codes.

With this configuration, the calculation cost of the conversion in the binary conversion unit can be reduced.
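One common way to turn a real-valued vector into a binary code with a matrix is to project the vector and keep one sign bit per row. The sketch below assumes that scheme (thresholding each projection at zero); the text here states only that a transformation matrix is used, so the thresholding rule, the matrix values, and the name `to_binary_code` are illustrative assumptions.

```python
def to_binary_code(feature, matrix):
    """Project `feature` with `matrix` and keep one sign bit per matrix row."""
    bits = []
    for row in matrix:
        projection = sum(w * x for w, x in zip(row, feature))
        bits.append(1 if projection > 0 else 0)  # assumed rule: threshold at zero
    return bits


feature = [0.5, -1.0, 2.0]            # toy 3-dimensional feature amount
matrix = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, -1],
    [-1, 0, 1],
]                                     # 4 rows, so the code has 4 bits
code = to_binary_code(feature, matrix)
print(code)
```

Each row of the matrix yields one bit, so the number of rows sets the code length.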

In the above corresponding reference image retrieval device, the transformation matrix may be a sparse matrix.

With this configuration, the calculation cost of the conversion in the binary conversion unit can be reduced further.
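The saving from a sparse matrix comes from skipping zero entries entirely. As a minimal sketch, assuming each row is stored as a list of (column index, weight) pairs and that bits are obtained by thresholding at zero (both are assumptions made for illustration):

```python
def sparse_to_binary_code(feature, sparse_rows):
    """Like a dense projection, but each row lists only its non-zero entries."""
    bits = []
    for row in sparse_rows:
        projection = sum(weight * feature[index] for index, weight in row)
        bits.append(1 if projection > 0 else 0)
    return bits


feature = [0.5, -1.0, 2.0, 0.25]
sparse_rows = [
    [(0, 1), (2, -1)],   # 0.5 - 2.0 = -1.5  -> bit 0
    [(1, -1)],           # 1.0               -> bit 1
    [(2, 1), (3, -1)],   # 2.0 - 0.25 = 1.75 -> bit 1
]
code = sparse_to_binary_code(feature, sparse_rows)
print(code)
```

The work per bit is proportional to the number of non-zero entries in the row rather than to the full dimensionality of the feature amount.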

In the above corresponding reference image retrieval device, the binary conversion unit may be able to change the size of the binary code by changing the size of the transformation matrix.

With this configuration, adjustments become possible such as reducing the calculation cost in the matching unit by making the binary code smaller, or improving the retrieval accuracy in the matching unit by making the binary code larger.
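The link between matrix size and code size can be sketched as follows, again assuming one sign bit per matrix row. Using only the first rows of a larger matrix is one hypothetical way to shrink the code; the document does not say how the smaller matrix is obtained.

```python
def to_binary_code(feature, matrix):
    """One bit per row: 1 if the row's projection of `feature` is positive."""
    return [1 if sum(w * x for w, x in zip(row, feature)) > 0 else 0 for row in matrix]


full_matrix = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, -1],
    [-1, 0, 1],
]
feature = [0.5, -1.0, 2.0]

long_code = to_binary_code(feature, full_matrix)        # 4 rows -> 4 bits
short_code = to_binary_code(feature, full_matrix[:2])   # 2 rows -> 2 bits (assumed truncation)
print(len(long_code), len(short_code))
```

A shorter code means fewer bits to store and compare per feature point, at the cost of discarding some of the information that distinguishes one feature point from another.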

In the above corresponding reference image retrieval device, the matching unit may detect a plurality of reference images when there are a plurality of reference images corresponding to the input image.

With this configuration, when a plurality of target objects appear in the input image, a plurality of reference images are detected.
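Detecting several reference images at once can be pictured as vote counting: each matched feature point votes for the reference image it came from, and every image with enough votes is reported. The threshold `min_votes`, the image identifiers, and the function name are hypothetical; the document does not specify the decision rule at this point.

```python
from collections import Counter


def matching_references(match_image_ids, min_votes):
    """Return every reference image that received at least `min_votes` matched points."""
    votes = Counter(match_image_ids)
    return sorted(image_id for image_id, count in votes.items() if count >= min_votes)


# Each entry is the reference image to which one input feature point was matched.
matches = ["poster_a", "poster_b", "poster_a", "sign_c", "poster_b", "poster_a"]
result = matching_references(matches, min_votes=2)
print(result)
```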

The above corresponding reference image retrieval device may further include an environment measurement unit that measures the execution environment of the corresponding reference image retrieval device, and the binary conversion unit may change the size of the binary code by changing the size of the transformation matrix according to the measurement result of the environment measurement unit.

With this configuration, the binary conversion unit can convert the feature amounts detected by the feature amount detection unit into binary codes of a size suited to the execution environment (for example, the capacity of the storage means, the capacity of the calculation means, and the calculation processing capability).
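As a minimal sketch of environment-dependent sizing, the function below picks the largest code length whose reference database would still fit in the measured free memory. The candidate lengths, the memory-only criterion, and the name `choose_code_bits` are all assumptions for illustration; the document lists storage capacity and processing capability as examples but gives no selection rule.

```python
def choose_code_bits(free_memory_bytes, num_reference_points, candidates=(256, 128, 64, 32)):
    """Pick the longest candidate code length whose database fits in free memory."""
    for bits in candidates:                       # candidates are tried longest first
        if num_reference_points * bits // 8 <= free_memory_bytes:
            return bits
    return candidates[-1]                         # fall back to the shortest code


print(choose_code_bits(4_000_000, 100_000))
print(choose_code_bits(1_000_000, 100_000))
```

With 100,000 reference feature points, 4 MB of free memory admits 256-bit codes (3.2 MB), while 1 MB admits only 64-bit codes (0.8 MB).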

The content superimposing apparatus of the present invention is a content superimposing apparatus that includes the above corresponding reference image retrieval device and superimposes corresponding content on the input image, and comprises: a content database that stores content and the correspondence between the reference images and the content; a content extraction unit that extracts, from the content database, the content corresponding to the reference image detected by the matching unit; and a superimposing unit that superimposes the content extracted by the content extraction unit on the input image.

With this configuration, corresponding content can be superimposed on the input image effectively even on a device with limited resources.

In the above content superimposing apparatus, the feature amount detection unit may extract feature points that include information on their positions in the input image; the feature point database may store, together with the feature amounts of the feature points of each of the plurality of reference images, information on the position of each feature point; and the content database may further store a superimposition position for the content. The content superimposing apparatus may further include a content conversion unit that, based on the relationship between the positions of the feature points extracted by the feature amount detection unit and the positions of the feature points stored in the feature point database, converts the superimposition position stored in the content database for the content extracted by the content extraction unit. The superimposing unit may superimpose the content extracted by the content extraction unit at the superimposition position in the input image as converted by the content conversion unit.

With this configuration, even if the position of the object on which the content is to be superimposed differs between the input image and the reference image, the content can be superimposed at an appropriate position in the input image.
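One way to picture the position conversion is to map the stored superimposition position through a planar transform that relates reference-image coordinates to input-image coordinates. The sketch below assumes that relationship has already been estimated from the matched feature point positions and is given as a 3x3 homography `h`; the use of a homography, its values, and the function name are assumptions, since the text here says only that the conversion is based on the relationship between the feature point positions.

```python
def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Hypothetical transform: the object appears twice as large and shifted by (10, 20).
h = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, 0.0, 1.0],
]
overlay_in_reference = (5.0, 5.0)      # superimposition position stored with the content
overlay_in_input = apply_homography(h, overlay_in_reference)
print(overlay_in_input)
```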

The content superimposing system of the present invention is a content superimposing system comprising a content superimposing apparatus and an external search server capable of communicating with the content superimposing apparatus. The content superimposing apparatus comprises: a feature amount detection unit that extracts feature points from an input image and detects the feature amounts of the feature points; a binary conversion unit that converts the feature amounts detected by the feature amount detection unit into binary codes; and a content-superimposing-apparatus-side communication unit that transmits the binary-code feature amounts of the input image converted by the binary conversion unit to the external search server. The external search server comprises: an external-search-server-side communication unit that receives the binary-code feature amounts of the input image transmitted from the content-superimposing-apparatus-side communication unit; an external-search-server-side feature point database that stores the feature amounts of the feature points of each of a plurality of reference images in binary code format; and an external-search-server-side matching unit that detects the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-code feature amounts of the input image received by the external-search-server-side communication unit with the binary-code feature amounts of the plurality of reference images stored in the external-search-server-side feature point database.

With this configuration, the content superimposing apparatus transmits binary codes to the external search server as the feature amounts of the input image, so the amount of transmitted data can be reduced compared with transmitting data-heavy feature amounts such as single-precision real numbers as they are. The calculation cost and the required database capacity can also be reduced in the external search server.
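The reduction in transmitted data can be made concrete by packing the bits of a code into bytes before sending. This is a minimal sketch; the most-significant-bit-first layout, the zero padding, and the name `pack_bits` are assumptions, and the document does not define a wire format.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first, zero-padded."""
    padded = bits + [0] * (-len(bits) % 8)
    out = bytearray()
    for i in range(0, len(padded), 8):
        byte = 0
        for bit in padded[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


payload = pack_bits([1, 0, 0, 1, 1, 0, 1, 0, 1, 1])   # 10 bits -> 2 bytes
print(payload.hex(), len(payload))
```

As an illustrative comparison, a 128-bit code packs into 16 bytes, whereas 128 single-precision values occupy 512 bytes.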

In the above content superimposing system, the feature amount detection unit may extract feature points that include information on their positions in the input image, and the feature point database may store, together with the feature amounts of the feature points of each of the plurality of reference images, information on the position of each feature point. The content superimposing system may further comprise: a content database that stores the correspondence between the reference images stored in the feature point database and the content, together with the superimposition position of the content; a content conversion unit that extracts, from the content database, the content corresponding to the reference image detected by the external-search-server-side matching unit and, based on the relationship between the positions of the feature points extracted by the feature amount detection unit and the positions of the feature points stored in the feature point database, converts the superimposition position stored in the content database for the extracted content; and a superimposing unit that superimposes the content extracted by the content conversion unit at the superimposition position in the input image as converted by the content conversion unit.

With this configuration, corresponding content can be superimposed on the input image, and even if the position of the object on which the content is to be superimposed differs between the input image and the reference image, the content can be superimposed at an appropriate position in the input image.

In the above content superimposing system, the content superimposing apparatus may comprise: a content-superimposing-apparatus-side feature point database that stores the feature amounts of the feature points of each of a plurality of reference images in binary code format; and a content-superimposing-apparatus-side matching unit that detects the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-code feature amounts of the input image converted by the binary conversion unit with the binary-code feature amounts stored in the content-superimposing-apparatus-side feature point database. The external-search-server-side communication unit may transmit to the content superimposing apparatus, from among the feature amounts stored in the external-search-server-side feature point database, the feature amounts of the reference image detected by the external-search-server-side matching unit and of reference images related to it. The content-superimposing-apparatus-side communication unit may receive the binary-code feature amounts transmitted from the external-search-server-side communication unit, and the content-superimposing-apparatus-side feature point database may use the binary-code feature amounts received by the content-superimposing-apparatus-side communication unit as the feature amounts of the feature points of each of the plurality of reference images.

With this configuration, the content superimposing apparatus can download only the necessary data from the external search server without storing all the feature amounts of a large number of reference images, so the database capacity required of the content superimposing apparatus can be reduced.

The corresponding reference image retrieval method of the present invention is a method of retrieving a reference image corresponding to an input image in a corresponding reference image retrieval device provided with a feature point database that stores the feature amounts of the feature points of each of a plurality of reference images in binary code format. The method includes: a feature point extraction step of extracting feature points from the input image; a feature amount detection step of detecting the feature amounts of the feature points extracted in the feature point extraction step; a binary code conversion step of converting the feature amounts detected in the feature amount detection step into binary codes; and a matching step of detecting the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-code feature amounts of the input image converted in the binary code conversion step with the binary-code feature amounts of each of the plurality of reference images stored in the feature point database.
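The matching step at the end of this method can be sketched as follows, taking the binary codes of the input image as already computed by the earlier steps. The sketch assumes codes are compared by Hamming distance, that an input code counts as matched when its nearest reference code is within `max_distance` bits, and that the image with the most matched codes wins; these rules, the toy database, and all names are illustrative assumptions rather than the method as claimed.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")


def search_reference(input_codes, database, max_distance):
    """Return the id of the reference image with the most matched codes, or None."""
    best_id, best_votes = None, 0
    for image_id, ref_codes in database.items():
        votes = sum(
            1
            for code in input_codes
            if min(hamming(code, ref) for ref in ref_codes) <= max_distance
        )
        if votes > best_votes:
            best_id, best_votes = image_id, votes
    return best_id


# Toy feature point database: image id -> binary codes of its feature points.
database = {
    "poster_a": [0b11110000, 0b00001111],
    "sign_b": [0b10101010],
}
input_codes = [0b11110001, 0b00001111, 0b10101000]
found = search_reference(input_codes, database, max_distance=1)
print(found)
```

This exhaustive scan is only meant to show the data flow; a practical implementation would likely index the codes to avoid comparing every pair.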

With this configuration as well, the feature amounts of the plurality of reference images are stored as binary data, and binary-code feature amounts are also what is compared when retrieving the corresponding reference image. The reference image corresponding to the input image can therefore be retrieved effectively even on a device with limited resources.

The content superimposing method of the present invention is a method of superimposing corresponding content on an input image in a content superimposing apparatus provided with a corresponding reference image retrieval device, which has a feature point database that stores the feature amounts of the feature points of each of a plurality of reference images in binary code format, and with a content database that stores the correspondence between the reference images stored in the feature point database and content. The method includes: a corresponding reference image retrieval step of detecting the reference image corresponding to the input image by the above corresponding reference image retrieval method; a content extraction step of extracting, from the content database, the content corresponding to the reference image detected in the corresponding reference image retrieval step; and a superimposing step of superimposing the content extracted by the content extraction unit on the input image.

With this configuration as well, corresponding content can be superimposed on the input image effectively even on a device with limited resources.

The computer program of the present invention is a computer program for causing a computer to execute the above corresponding reference image retrieval method.

A computer program according to another aspect of the present invention is a computer program for causing a computer to execute the above content superimposing method.

According to the present invention, the feature amounts of the plurality of reference images are stored as binary data, and binary-code feature amounts are also what is compared when retrieving the corresponding reference image. The reference image corresponding to the input image can therefore be retrieved effectively even on a device with limited resources.

Block diagram showing the configuration of the content superimposing apparatus in an embodiment of the present invention
Diagram showing an example of an input image in the embodiment of the present invention
Diagram showing feature points extracted from the input image in the embodiment of the present invention
Diagram showing feature amounts detected from the input image in the embodiment of the present invention
Diagram showing feature amounts converted into binary codes in the embodiment of the present invention
Diagram showing data stored in the feature point database in the embodiment of the present invention
Diagram showing corresponding point pairs in the embodiment of the present invention
Diagram showing data stored in the content database in the embodiment of the present invention
Block diagram showing the configuration of the content superimposing apparatus in Modification 2 of the embodiment of the present invention
Block diagram showing the configuration of the content superimposing apparatus in Modification 3 of the embodiment of the present invention
Block diagram showing the configuration of the content superimposing apparatus in Modification 4 of the embodiment of the present invention
Diagram showing the structure of data stored in the feature point database of the external search server in Modification 4 of the embodiment of the present invention

Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the content superimposing apparatus of the present embodiment. The content superimposing apparatus 110 includes a corresponding reference image retrieval device 10 for retrieving a reference image corresponding to an input image. As the configuration for superimposing related content on the input image using the reference image retrieved by the corresponding reference image retrieval device 10, it includes a correspondence calculation unit 21, a content conversion unit 22, a content database 23, and a superimposing unit 24.

The corresponding reference image retrieval device 10 includes an image acquisition unit 11, a feature amount detection unit 12, a binary conversion unit 13, a feature point database 14, and a matching unit 15. The image acquisition unit 11 generates an image by shooting with a camera serving as an imaging device and acquires it as the input image. Alternatively, the image acquisition unit 11 may receive an externally generated image via communication or a recording medium. FIG. 2 shows an example of an input image; the processing in each unit is described below using this input image. The input image acquired by the image acquisition unit 11 is output to the feature amount detection unit 12.

The feature amount detection unit 12 extracts feature points from the input image and detects the feature amount of each extracted feature point. FIG. 3 shows the feature points extracted from the input image. As FIG. 3 shows, multiple feature points are generally detected in an input image.

FIG. 4 shows the feature amounts detected by the feature amount detection unit 12. The present embodiment uses local feature amounts, specifically SIFT feature amounts. Other local feature amounts, such as SURF feature amounts, may be used instead. As FIG. 4 shows, the feature amount detection unit 12 obtains each local feature amount as a vector of single-precision real numbers. It outputs the position of each feature point and the local feature amount detected for that point to the binary conversion unit 13.

The binary conversion unit 13 converts the feature amounts of all feature points extracted from the input image into binary codes. FIG. 5 shows feature amounts converted into binary codes. Letting the feature amount detected by the feature amount detection unit 12 be a 128-dimensional vector v ∈ R¹²⁸, the binary conversion unit 13 converts it into a binary code by equation (1) below.

In equation (1), d is the size (that is, the number of bits) of the binary code after conversion, and the sgn function is given by equation (2) below.

Each vector w_k is obtained by random sampling, according to a uniform distribution, from the points on the hypersphere of radius 1 in 128 dimensions. The vectors w_k (k = 1, ..., d) can be written together as a matrix of 128 rows and d columns. This matrix composed of the vectors w_k is denoted "w" and is called the transformation matrix.

Before performing the above conversion, the binary conversion unit 13 samples feature amounts from a large number of images in advance, subtracts their mean or median m from v ∈ R¹²⁸, and then normalizes v so that its L2 norm is 1. In the present embodiment, m is computed from the large number of feature amounts generated when data is stored in the feature point database 14. The bit length of the binary code in the present embodiment is 128 bits, that is, d = 128.
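The conversion described above can be sketched as follows. Because the images of equations (1) and (2) are not reproduced in this text, the sketch assumes the common random-projection form h_k(v) = sgn(w_k · v), with sgn mapped to bit 1 for a non-negative value and bit 0 otherwise. The function names, the all-zero placeholder for m, and the use of a normalized Gaussian draw to sample uniformly on the unit hypersphere are illustrative choices, not taken from the source.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a direction uniformly on the unit hypersphere (normalized Gaussian draw)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def binarize(v, m, w):
    """Subtract m, scale to unit L2 norm, then take one sign bit per vector w_k."""
    centered = [x - mi for x, mi in zip(v, m)]
    norm = math.sqrt(sum(x * x for x in centered)) or 1.0  # guard against a zero vector
    unit = [x / norm for x in centered]
    code = 0
    for w_k in w:  # w is held here as a list of d vectors w_k
        dot = sum(a * b for a, b in zip(w_k, unit))
        code = (code << 1) | (1 if dot >= 0 else 0)  # assumed sgn convention
    return code


rng = random.Random(0)
DIM, D_BITS = 128, 128  # descriptor dimension and code length d of the embodiment
w = [random_unit_vector(DIM, rng) for _ in range(D_BITS)]
m = [0.0] * DIM  # placeholder; in practice estimated from many sampled feature amounts
v = [rng.random() for _ in range(DIM)]
code = binarize(v, m, w)
```

In this sketch the bit produced by w_1 ends up as the most significant bit of the integer, and the whole 128-bit code fits in 16 bytes.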

The feature point database 14 stores the feature amounts of the feature points of a plurality of reference images in binary code form. These reference images are the images to be recognized in AR. FIG. 6 shows the data stored in the feature point database 14. As FIG. 6 shows, the database stores one record per feature point, consisting of the image identification number of the reference image to which the feature point belongs, the feature amount of the feature point expressed as a binary code, and the position of the feature point within the image.
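One possible in-memory layout for such a record is sketched below. The class name, field names, and sample values are hypothetical; only the three kinds of field (image identification number, binary code, position) come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureRecord:
    image_id: int  # identification number of the reference image
    code: int      # 128-bit binary code, held as a Python int
    x: float       # position of the feature point within the reference image
    y: float

    def code_bytes(self) -> bytes:
        """Serialize the 128-bit code into 16 bytes for storage."""
        return self.code.to_bytes(16, "big")


# Illustrative records: two feature points of image 1 and one of image 2.
records = [
    FeatureRecord(image_id=1, code=0b1011, x=12.0, y=34.5),
    FeatureRecord(image_id=1, code=0b0110, x=80.0, y=15.0),
    FeatureRecord(image_id=2, code=0b1110, x=5.5, y=60.0),
]
```

For comparison, a 128-dimensional single-precision vector occupies 128 × 4 = 512 bytes, whereas a 128-bit code occupies 16 bytes, a 32-fold reduction per feature point.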

The records stored in the feature point database 14 are prepared using the image acquisition unit 11, the feature amount detection unit 12, and the binary conversion unit 13 described above. Specifically, the image acquisition unit 11 acquires a reference image by shooting or by reading data from a network or a recording medium, and the feature amount detection unit 12 extracts feature points from the reference image and detects their feature amounts. The feature amount detection unit 12 assigns an image identification number to the reference image and outputs that number, together with the position of each feature point in the reference image and the detected feature amounts, to the binary conversion unit 13. The binary conversion unit 13 converts the feature amounts into binary codes.

As FIG. 6 shows, multiple feature points are extracted from one reference image. FIG. 6 shows records for only a few feature points per reference image, but in practice hundreds to thousands of feature point records may be stored per reference image. Likewise, FIG. 6 shows only two reference images, but the feature point database 14 may store feature point records for thousands of reference images or more. Finally, for reasons of space FIG. 6 shows only the first 11 digits of each binary code; as stated above, in the present embodiment the binary conversion unit 13 generates 128-bit binary codes, and the feature point database 14 likewise stores 128-bit binary codes.

The matching unit 15 compares the binary code representing the feature amount of each feature point extracted from the input image with each of the binary codes representing the feature amounts of the reference image feature points stored in the feature point database 14, and searches for the closest binary code. In the present embodiment, the matching unit 15 uses the Hamming distance to measure how close two binary codes are.

In the limit, the Hamming distance between binary codes agrees with the cosine distance in the original input vector space. That is, when two arbitrary vectors v₁ and v₂ are converted by h_k(v), the probability that a given bit takes different values is proportional to the angle between v₁ and v₂, and equation (3) below holds.

The probability on the left side of equation (3) can be regarded as the Hamming distance between the binary codes (normalized by the code length d). Therefore, with a sufficiently long bit string, the Hamming distance between binary codes agrees with the cosine distance between the vectors in the original space. The matching unit 15 may accordingly use equation (3) to evaluate how close binary codes are. Depending on the purpose, such as speeding up the search, the matching unit 15 may also use an existing technique such as LSH (Locality Sensitive Hashing) to search for the closest binary code.
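The relationship between Hamming distance and angle can be illustrated with a short sketch. The image of equation (3) is not reproduced in this text; the sketch assumes the standard result for sign random projections, in which the probability that one bit differs equals θ/π, where θ is the angle between v₁ and v₂. The function names are illustrative, and codes are held as Python integers.

```python
import math


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def estimated_angle(a: int, b: int, d: int) -> float:
    """Estimate the angle (radians) between the original vectors from d-bit codes,
    assuming P(bit differs) = angle / pi."""
    return math.pi * hamming(a, b) / d


def nearest(query: int, codes: list) -> int:
    """Index of the stored code closest to the query (exhaustive linear search)."""
    return min(range(len(codes)), key=lambda i: hamming(query, codes[i]))
```

The XOR followed by a bit count is the entire distance computation, which is why comparing binary codes is much cheaper than comparing 128-dimensional real vectors.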

For the binary code of each feature point extracted from the input image, the matching unit 15 compares it with the binary codes of all records stored in the feature point database 14 and casts a vote for the feature point having the closest binary code. After voting for all feature points extracted from the input image, the matching unit 15 determines that the reference image with the most votes is the reference image corresponding to the input image (hereinafter, the "corresponding reference image").

The matching unit 15 may set a lower limit on the number of votes required for a reference image to be determined to be the corresponding reference image. In that case, if the reference image with the most votes received fewer votes than this lower limit, it is determined that no corresponding reference image exists in the feature point database 14. Alternatively, the matching unit 15 may set a vote threshold and treat every reference image that received at least that many votes as a corresponding reference image.
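The voting procedure with a lower limit on votes can be sketched as follows. This is a minimal illustration using an exhaustive linear search; the function names, the record layout as (image_id, code) tuples, and the `min_votes` parameter name are assumptions made for the example.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def find_corresponding_image(query_codes, records, min_votes=1):
    """Vote once per input feature point for the image owning the closest stored code.

    records is a list of (image_id, code) tuples.
    Returns (image_id or None, vote counts); None means no image reached min_votes.
    """
    votes = Counter()
    for q in query_codes:
        image_id, _ = min(records, key=lambda r: hamming(q, r[1]))
        votes[image_id] += 1
    if not votes:
        return None, votes
    best_id, best_votes = votes.most_common(1)[0]
    return (best_id if best_votes >= min_votes else None), votes


# Illustrative 4-bit codes: two feature points of image 1, one of image 2.
db = [(1, 0b0000), (1, 0b0011), (2, 0b1111)]
result, votes = find_corresponding_image([0b0001, 0b0010, 0b1110], db, min_votes=2)
```

Here image 1 receives two votes and image 2 one, so image 1 is returned; raising `min_votes` to 3 would yield no corresponding reference image.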

The matching unit 15 outputs to the correspondence calculation unit 21 the image identification number of the corresponding reference image, the positions of the feature points in the corresponding reference image that received votes (the "corresponding points of the corresponding reference image"), and the positions of the input image feature points that voted for the corresponding reference image (the "corresponding points of the input image"). Each corresponding point of the corresponding reference image is output paired with the corresponding point of the input image that voted for it. FIG. 7 shows the corresponding point pairs.

Based on the plurality of corresponding point pairs input from the matching unit 15, the correspondence calculation unit 21 calculates a homography matrix that maps an arbitrary point (coordinates) on the corresponding reference image to a point (coordinates) on the input image. Specifically, the correspondence calculation unit 21 uses the corresponding point pairs input from the matching unit 15 to estimate, by the RANSAC (Random Sample Consensus) method, a homography matrix A satisfying equation (4) below.

With this homography matrix A, any point on the corresponding reference image can be mapped to a point on the input image. The correspondence calculation unit 21 outputs the image identification number of the corresponding reference image and the homography matrix A to the content conversion unit 22.
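How a homography matrix maps a point can be sketched as follows. The image of equation (4) is not reproduced in this text, so the sketch assumes the usual homogeneous-coordinate relation, in which a point (x, y) on the reference image is written as (x, y, 1), multiplied by the 3 × 3 matrix A, and divided by the third component. Estimating A with RANSAC is not shown; the matrix used here is an illustrative value.

```python
def apply_homography(A, point):
    """Map a reference-image point to input-image coordinates with a 3x3 matrix A."""
    x, y = point
    xh = A[0][0] * x + A[0][1] * y + A[0][2]
    yh = A[1][0] * x + A[1][1] * y + A[1][2]
    wh = A[2][0] * x + A[2][1] * y + A[2][2]
    if wh == 0:
        raise ValueError("point maps to infinity under this homography")
    return (xh / wh, yh / wh)


# Illustrative matrix: scale by 2 and translate by (10, 20).
A = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, 0.0, 1.0],
]
mapped = apply_homography(A, (5.0, 5.0))
```

The same mapping is what allows a superimposition position defined on the reference image to be carried over to the input image.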

The content database 23 stores the content to be superimposed on input images. FIG. 8 shows the data stored in the content database 23. As FIG. 8 shows, the content database 23 stores one record per content item, consisting of the image identification number of the reference image to which the content corresponds, the content data, and the shape, size, and superimposition location of the content.

The content data may be any kind of data to be superimposed on the input image, including text data, image data, and video data. In the example of FIG. 8, the content data prepared for the reference image with image identification number 1 are a description of the object shown in the reference image, "This photo is ..." (text data), a reference URL, "http://www.abcdefg.com" (text data), and the image data of a reference picture.

The content conversion unit 22 extracts from the content database 23 the content corresponding to the image identification number of the corresponding reference image input from the correspondence calculation unit 21. In doing so, the content conversion unit 22 functions as the content extraction unit of the present invention. Using the homography matrix A input from the correspondence calculation unit 21, the content conversion unit 22 converts the superimposition position of the extracted content and outputs the converted position, together with the content data, to the superimposing unit 24.

The superimposing unit 24 acquires the input image from the image acquisition unit 11, superimposes the content data obtained from the content conversion unit 22 on it, and outputs the result. The content data is superimposed at the converted superimposition position in the input image, as output by the content conversion unit 22.

As described above, the corresponding reference image retrieval device 10 of the present embodiment matches the input image against the reference images (retrieves the corresponding reference image) using feature amounts in binary code form, which reduces the computational load of matching. The database holding the feature amounts of the reference image feature points also stores them as binary codes, so the required database capacity is small. The corresponding reference image retrieval device 10 can therefore be implemented on a device with limited resources, as can the content superimposing apparatus 110 that includes it.

The present invention is not limited to the above embodiment, and various modifications are possible. Modifications are described below.

(Modification 1)
The binary conversion unit 13 of the above embodiment converts feature amounts into binary codes using equation (1); the vectors w_k used there can be made sparse. Even if the vectors w_k are not sampled from a uniform distribution but are instead sampled as in equation (5) below, making w a sparse matrix, equation (1) still holds approximately.

Such a conversion is called a very sparse random projection. Here w is a sparse matrix whose non-zero elements are all −1 or +1, so computing with the matrix w requires no multiplications and only a very small number of additions and subtractions. The computational cost in the binary conversion unit 13 can therefore be greatly reduced.
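A multiplication-free projection of this kind can be sketched as follows. The image of equation (5) is not reproduced in this text; the sketch assumes the sampling commonly used for very sparse random projections, in which each element is +1 with probability 1/(2s), −1 with probability 1/(2s), and 0 otherwise. The parameter s, its value √128, and all function names are assumptions for illustration. The usual scale factor √s is dropped because only the sign of the projection is used.

```python
import math
import random


def sparse_sign_vector(dim, s, rng):
    """Sample one sparse vector w_k, stored as (indices of +1, indices of -1)."""
    plus, minus = [], []
    for i in range(dim):
        r = rng.random()
        if r < 1.0 / (2 * s):
            plus.append(i)
        elif r < 1.0 / s:
            minus.append(i)
    return plus, minus


def project_bit(v, sparse_w):
    """Sign bit of w_k . v, computed with additions and subtractions only."""
    plus, minus = sparse_w
    total = sum(v[i] for i in plus) - sum(v[i] for i in minus)
    return 1 if total >= 0 else 0


def binarize_sparse(v, w):
    """Concatenate one bit per sparse vector into an integer code."""
    code = 0
    for w_k in w:
        code = (code << 1) | project_bit(v, w_k)
    return code


rng = random.Random(0)
DIM, D_BITS = 128, 128
w = [sparse_sign_vector(DIM, math.sqrt(DIM), rng) for _ in range(D_BITS)]
v = [rng.uniform(-1.0, 1.0) for _ in range(DIM)]
code = binarize_sparse(v, w)
```

With s = √128, each vector has about 11 non-zero elements on average, so each bit costs roughly 11 additions or subtractions instead of 128 multiply-add operations.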

(Modification 2)
FIG. 9 is a block diagram showing the configuration of the content superimposing apparatus of Modification 2. Compared with the content superimposing apparatus 110 of the above embodiment, the content superimposing apparatus 120 of Modification 2 has an environment measurement unit 16 added to its corresponding reference image retrieval device 20. In addition, the binary conversion unit 13 of the corresponding reference image retrieval device 20 can change the size of the binary code it generates.

The corresponding reference image retrieval device and the content superimposing apparatus of the present invention are implemented on various devices such as mobile phone terminals and notebook computers. The device resources, namely the CPU processing speed and the capacity of the database (main storage), therefore differ from one execution environment to another. A smaller binary code size (number of bits) generated by the binary conversion unit 13 lowers the computational cost and the required database capacity, but it also lowers the accuracy of matching (retrieval of the corresponding reference image).

The environment measurement unit 16 therefore measures the CPU processing speed and the database capacity of the device on which the corresponding reference image retrieval device 20 or the content superimposing apparatus 120 is implemented, and determines from the measurement results the size of the binary code to be generated by the binary conversion unit 13. The binary conversion unit 13 then converts the single-precision real-valued feature amounts into binary codes of the size determined by the environment measurement unit 16.

The binary conversion unit 13 adjusts the size of the generated binary code by adjusting the size of the matrix w that was used to generate the binary codes stored in the feature point database 14 as the feature amounts of the reference images.

For example, suppose the single-precision real-valued feature vector detected by the feature amount detection unit 12 is D-dimensional and the binary codes stored in the feature point database 14 are 128 bits long. To obtain the feature amounts of the reference image feature points to be stored in the feature point database 14, the binary conversion unit 13 converts the single-precision real-valued feature amounts into binary codes using a matrix w of 128 rows and D columns.

If, in this case, the binary code size determined by the environment measurement unit 16 is 64 bits, the binary conversion unit 13 cuts out rows 1 through 64 of the 128-row, D-column matrix w to produce a matrix w' of 64 rows and D columns, and uses w' to convert the feature amounts of the input image feature points into binary codes. The matching unit 15 then performs matching by comparing each 64-bit binary code of an input image feature point with the upper 64 bits of each 128-bit binary code stored in the feature point database 14 to decide which feature point to vote for.
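Comparing a short code with the upper bits of a longer stored code can be sketched as follows. The sketch assumes that the bit produced by row 1 of w is the most significant bit of the code, so that rows 1 through 64 correspond to the upper 64 bits; the function names are illustrative.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def truncate_code(code: int, full_bits: int, short_bits: int) -> int:
    """Keep only the upper short_bits of a full_bits-long code."""
    return code >> (full_bits - short_bits)


def distance_to_stored(query: int, stored: int, full_bits: int = 128, short_bits: int = 64) -> int:
    """Hamming distance between a short query code and the upper bits of a stored code."""
    return hamming(query, truncate_code(stored, full_bits, short_bits))


# Small illustration with 8-bit stored codes and 4-bit query codes.
stored = 0b10110011
query = 0b1001
dist = distance_to_stored(query, stored, full_bits=8, short_bits=4)
```

Because the short code is a prefix of the long one, the stored 128-bit codes need not be regenerated when the device chooses a smaller code size.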

According to the corresponding reference image retrieval device 20 and the content superimposing apparatus 120 of Modification 2, the environment measurement unit 16 determines the size of the binary code generated by the binary conversion unit 13 according to the execution environment, such as the CPU processing speed and database capacity of the device on which they are implemented, so a binary code suited to the execution environment can be generated.

(Modification 3)
The corresponding reference image retrieval device or the content superimposing apparatus may be implemented on a device that can always communicate with an external network, such as a mobile phone terminal. In that case, the retrieval of the corresponding reference image can also be performed by an external device.

FIG. 10 is a block diagram showing the configuration of the content superimposing system of Modification 3. The content superimposing system 101 consists of a content superimposing apparatus 130 and an external search server 230, which are provided with communication units 31 and 41, respectively, and communicate with each other.

The content superimposing apparatus 130 has the same configuration as the content superimposing apparatus 110 of the above embodiment, except that it has the communication unit 31 and that the size of the binary code generated by the binary conversion unit 13 can be changed. In addition to the communication unit 41, the external search server 230 includes a matching unit 42, a feature point database 43, and a content database 44.

As in the above embodiment, the content superimposing apparatus 130 can retrieve a corresponding reference image and superimpose content stored in its content database 23 on the input image. It can also transmit the binary codes representing the feature amounts of the feature points extracted from the input image to the external search server 230 via the communication unit 31.

The external search server 230 receives the binary-code feature amounts at the communication unit 41. Based on these binary codes, the matching unit 42 retrieves the corresponding reference image by voting for feature points stored in the feature point database 43. The communication unit 41 transmits the image identification number of the corresponding reference image and the corresponding point pair information to the content superimposing apparatus 130. The communication unit 41 also extracts from the content database 44 the content record corresponding to the corresponding reference image (including the content data and the superimposition position of the content) and transmits it to the content superimposing apparatus 130.

The communication unit 31 of the content superimposing apparatus 130 receives the corresponding point pair information and the content record corresponding to the corresponding reference image. The correspondence calculation unit 21 calculates a homography matrix from the corresponding point pairs received by the communication unit 31. The content conversion unit 22 uses that homography matrix to convert the superimposition position of the content received by the communication unit 31. As in the above embodiment, the superimposing unit 24 superimposes the content output by the content conversion unit 22, with its superimposition position converted, on the input image obtained from the image acquisition unit 11.

In Modification 3, when the corresponding reference image is retrieved by the external search server 230 as described above, the binary conversion unit 13 changes the size of the binary code according to the computing capability of the external search server 230. When the external search server 230 is more capable than the content superimposing apparatus 130, the binary conversion unit 13 increases the size of the binary code (uses more bits).

According to Modification 3, the corresponding reference image need not be retrieved only in an environment with limited computing resources, such as a mobile phone terminal; the retrieval can instead be performed by an external search server with a better computing environment. Moreover, because the binary conversion unit 13 can adjust the binary code to an appropriate size, small-scale real-time matching can be performed within the content superimposing apparatus 130 (a mobile phone terminal) as in the above embodiment, while large-scale matching can be delegated to the external search server, which has more processing power than the mobile phone terminal and can retrieve the corresponding reference image using larger binary codes.

If the content superimposing apparatus 130 does not superimpose content on its own, it need not include the feature point database 14, the matching unit 15, or the content database 23. The external search server 230 may also include the correspondence calculation unit 21; the correspondence calculation unit 21 and the content conversion unit 22; or the correspondence calculation unit 21, the content conversion unit 22, and the superimposing unit 24. Conversely, the external search server 230 may omit the content database 44 and transmit only the result of matching by the matching unit 42 to the content superimposing apparatus 130.

(Modification 4)
A content superimposing apparatus that has a communication unit and communicates with an external search server also makes the following content superimposing system possible. FIG. 11 is a block diagram showing the configuration of the content superimposing system of Modification 4. The content superimposing system 102 consists of a content superimposing apparatus 140 and an external search server 240. It is well suited to cases where the content is superimposed on an image of a book page, and it is described below using that case as an example.

For example, if there are 5000 books whose reference images are to be stored and each book averages 300 pages, the feature point database must hold feature point records for 1.5 million pages, with multiple feature points per page. Storing all of this data on a small or portable device such as a mobile phone terminal is not possible because of the limited capacity of its storage device. Even if such a large amount of data could be stored on a mobile phone terminal, there would be too many search targets and the computational cost of matching would be enormous. In addition, with so many search targets, retrieval accuracy deteriorates unless the binary codes are made longer.
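The scale of this database can be estimated with simple arithmetic. The figure of 1000 feature points per page is an assumption for illustration (the description says only that an image may have hundreds to thousands of feature points), and the estimate counts the 128-bit codes alone, ignoring book numbers, page numbers, and positions.

```python
BOOKS = 5000
PAGES_PER_BOOK = 300
POINTS_PER_PAGE = 1000      # assumed; not stated in the description
CODE_BYTES = 128 // 8       # one 128-bit binary code

pages = BOOKS * PAGES_PER_BOOK                 # total reference images
records = pages * POINTS_PER_PAGE              # total feature point records
code_storage_gb = records * CODE_BYTES / 1e9   # storage for the codes alone, in GB

print(pages, records, code_storage_gb)
```

Under these assumptions the codes alone come to about 24 GB for 1.5 billion records, and an exhaustive search would compare every input feature point against all of them.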

Therefore, in the content superimposing system 102, the records of all feature points for all pages of all books are stored in the feature point database 43 of the external search server 240, where a large-capacity database can be realized relatively easily and physical constraints are comparatively few.

FIG. 12 is a diagram showing the structure of the data stored in the feature point database 43. As shown in FIG. 12, the feature point database 43 stores, for each feature point, a record consisting of a book number, a page number, binary codes (128-bit and 64-bit), and the position (coordinates) of the feature point. As in the embodiment described above, records for a plurality of feature points are stored for one reference image (the image of one page).
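The record layout described for FIG. 12 can be sketched as a small data type. The class name, the field names, and the choice of holding each binary code as a Python integer are hypothetical and chosen only for illustration; the source specifies the fields, not their encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeaturePointRecord:
    """One feature point of one page (all names are illustrative)."""
    book_no: int    # book number
    page_no: int    # page number within the book
    code128: int    # 128-bit binary code, held as an integer
    code64: int     # 64-bit binary code, held as an integer
    x: float        # feature point position (coordinates)
    y: float

    def __post_init__(self):
        # Reject codes that do not fit in the stated number of bits.
        if not 0 <= self.code128 < (1 << 128):
            raise ValueError("code128 must fit in 128 bits")
        if not 0 <= self.code64 < (1 << 64):
            raise ValueError("code64 must fit in 64 bits")


record = FeaturePointRecord(
    book_no=12, page_no=34, code128=(1 << 127) | 1, code64=0xFFFF, x=120.5, y=48.0
)
```

A page is then represented by all records that share the same book number and page number.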

In the content superimposing device 140, when the image acquisition unit 11 acquires an input image, the feature amount detection unit 12 detects the feature amounts of its feature points, and the binary conversion unit 13 converts those feature amounts into binary codes. The communication unit 31 then transmits the binary codes to the external search server 240.

The external search server 240 receives the binary codes from the content superimposing device 140 at the communication unit 41. The matching unit 42 performs voting using the binary codes received by the communication unit 41 (there are as many codes as feature points extracted from the input image) and detects the reference image that received the most votes as the corresponding reference image. The corresponding reference image is the image of one page of one book among the 5,000 books.
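The voting step can be illustrated as follows. This is one plausible reading rather than the procedure defined in the specification: each received code votes for the page that holds its nearest stored code in Hamming distance, provided that distance does not exceed a threshold. The function names, the `max_distance` parameter, and the linear scan over all records are assumptions of this sketch.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes held as integers."""
    return bin(a ^ b).count("1")


def vote(query_codes, records, max_distance=10):
    """Return ((book_no, page_no), votes) for the most-voted page, or None.

    records is a list of (book_no, page_no, code) tuples.
    """
    votes = Counter()
    for q in query_codes:
        # Linear scan for the nearest stored code (no index, for clarity).
        best = min(records, key=lambda r: hamming(q, r[2]), default=None)
        if best is not None and hamming(q, best[2]) <= max_distance:
            votes[(best[0], best[1])] += 1
    return votes.most_common(1)[0] if votes else None


records = [(1, 10, 0b11110000), (1, 11, 0b00001111), (2, 5, 0b10101010)]
queries = [0b11110001, 0b11110000, 0b00001110]
print(vote(queries, records, max_distance=2))  # ((1, 10), 2)
```

In the example, two of the three query codes lie within two bits of a code stored for page 10 of book 1, so that page is detected as the corresponding reference image.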

Among the records stored in the feature point database 43, the communication unit 41 transmits to the content superimposing device 140 the records bearing the book number to which the corresponding reference image detected by the matching unit 42 belongs (that is, the records of the corresponding reference image and of the other reference images related to it). For the records of that book number, the communication unit 41 transmits at least the page number, the binary code (either the 128-bit or the 64-bit code), and the feature point position. The data downloaded in this way from the external search server 240 to the content superimposing device 140 is called feature point database update data.
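Assembling the feature point database update data amounts to filtering the server records by book number and keeping one of the two code lengths. The sketch below assumes records held as dictionaries with the hypothetical keys `book_no`, `page_no`, `code128`, `code64`, and `pos`; none of these names appear in the source.

```python
def build_update_data(records, book_no, code_bits=64):
    """Select the records of one book and keep only the fields the device needs."""
    if code_bits not in (64, 128):
        raise ValueError("code_bits must be 64 or 128")
    code_key = "code128" if code_bits == 128 else "code64"
    return [
        {"page_no": r["page_no"], "code": r[code_key], "pos": r["pos"]}
        for r in records
        if r["book_no"] == book_no
    ]


server_records = [
    {"book_no": 1, "page_no": 10, "code128": 0xAAAA, "code64": 0xAA, "pos": (5.0, 7.0)},
    {"book_no": 1, "page_no": 11, "code128": 0xBBBB, "code64": 0xBB, "pos": (9.0, 2.0)},
    {"book_no": 2, "page_no": 3, "code128": 0xCCCC, "code64": 0xCC, "pos": (1.0, 1.0)},
]
update = build_update_data(server_records, book_no=1, code_bits=64)
print(len(update))  # 2
```

Only the two records of book 1 are returned, each carrying the 64-bit code; the book number itself is omitted because every record in the update belongs to the same book.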

Whether the 128-bit or the 64-bit binary code is transmitted depends on the destination, that is, on the computing power and database capacity of the content superimposing device 140. The external search server 240 may make this choice by recognizing the computing power and database capacity of the destination content superimposing device 140, or the content superimposing device 140 may specify the binary code size.

As described above, in order for the external search server 240 to identify the book, the content superimposing device 140 first transmits the feature amounts of the feature points of the input image to the external search server 240. The size of these feature amounts (binary data) need not match the size of the binary codes that the external search server 240 transmits to the content superimposing device 140; in particular, the former may be larger and the latter smaller.

On receiving the feature point database update data, which is a subset of the records in the feature point database 43 of the external search server 240, the communication unit 31 of the content superimposing device 140 stores it in the feature point database 14 or, if some data is already stored in the feature point database 14, updates the feature point database 14 with it. For subsequent input images, matching (the search for the corresponding reference image) is performed using the feature point database update data stored in the feature point database 14. At this time, the binary conversion unit 13 converts the single-precision real-valued feature amounts detected by the feature amount detection unit 12 into binary codes of the same size as the binary codes in the feature point database update data.

According to the content superimposing system 102 of Modification 4, the content superimposing device 140 can download only the data it needs from the external search server 240, without having to store in its own database the feature amounts for the feature points of a large number of reference images, for example 5,000 books × 300 pages.

Furthermore, when the binary-coded feature amounts of the feature points of the input image are transmitted to the external search server 240 to identify the required feature point database update data (that is, to identify the book), the binary conversion unit 13 can generate large binary codes and transmit them to the external search server 240 so as to ensure the search accuracy of the matching unit 42 of the external search server 240. After the feature point database update data has been downloaded from the external search server 240, the binary conversion unit 13 can convert the feature amounts detected from the input image into binary codes whose size matches that of the binary codes in the feature point database update data.
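The use of two code sizes can be illustrated with a minimal sketch in which each bit of the code is the sign of one row of a transformation matrix applied to the feature vector, so that the number of rows sets the code length. The sign rule, the random matrices with entries in {-1, 0, +1}, the 128-dimensional feature vector, and all names are assumptions made for this example and are not taken from this passage.

```python
import random


def make_sparse_matrix(bits, dim, seed):
    """Random matrix with entries in {-1, 0, +1}; about half the entries are zero."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 0, 0, 1)) for _ in range(dim)] for _ in range(bits)]


def to_binary_code(feature, matrix):
    """One bit per matrix row: 1 if the row's dot product with the feature is positive."""
    code = 0
    for row in matrix:
        dot = sum(w * x for w, x in zip(row, feature))
        code = (code << 1) | (1 if dot > 0 else 0)
    return code


rng = random.Random(42)
feature = [rng.uniform(-1.0, 1.0) for _ in range(128)]  # stand-in for a real descriptor

w_server = make_sparse_matrix(128, 128, seed=1)  # larger code for the server query
w_local = make_sparse_matrix(64, 128, seed=2)    # smaller code for on-device matching

query_code = to_binary_code(feature, w_server)   # fits in 128 bits
local_code = to_binary_code(feature, w_local)    # fits in 64 bits
```

The same feature vector thus yields a 128-bit code for identifying the book on the server and a 64-bit code for matching against the downloaded data, the only difference being the matrix that is applied.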

In Modification 4, the binary codes in the feature point database update data that the content superimposing device 140 downloads from the external search server 240 can also be constructed so as to improve only the ability to identify the book in question. That is, unlike the external search server 240 of Modification 4, which identifies a target from among arbitrary images, when only the page images of a single book are to be identified, the number of identification targets is small. The transformation matrix w used for binary conversion can therefore be generated for each book by machine learning, so that binary codes are generated in a way that allows efficient matching.

In this case, the transformation matrix w can be downloaded from the external search server 240 to the content superimposing device 140 together with the feature point database update data for the book. After downloading the feature point database update data for the book, the binary conversion unit 13 of the content superimposing device 140 uses the transformation matrix w downloaded with it to convert the feature amounts of the feature points of the input image into binary codes. Compared with the case where page images of arbitrary books are the matching targets, this configuration can be expected to reduce the capacity of the feature point database 14 and the computational cost of the corresponding reference image search device 10 or the content superimposing device 140.

When the book targeted for AR changes, the feature point database update data downloaded from the external search server 240 into the feature point database 14 of the content superimposing device 140 can no longer be used, and the matching unit 15 can no longer detect a corresponding reference image. Therefore, when the matching unit 15 fails to detect a corresponding reference image, the binary codes of the feature amounts of the feature points of the input image are transmitted again to the external search server 240 via the communication unit 31, the external search server 240 performs matching to identify the new book, and the content superimposing device 140 downloads new feature point database update data and updates the feature point database 14.
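The fallback behavior when the target book changes can be sketched as a small control flow. The function and parameter names are hypothetical, and the local matcher and the server query are passed in as callables so that the sketch stays independent of any particular matching method. For brevity the same codes are used for local and server matching, although the description allows the two sizes to differ.

```python
def search_with_fallback(query_codes, local_db, match_local, query_server):
    """Try on-device matching first; on failure, ask the server and refresh local_db.

    Returns (matched page or None, whether the server was contacted).
    """
    result = match_local(query_codes, local_db)
    if result is not None:
        return result, False

    # Local matching failed: the book has probably changed.
    update = query_server(query_codes)  # server identifies the book, returns its records
    if update is None:
        return None, True

    # Replace the local database contents with the newly downloaded update data.
    local_db.clear()
    local_db.extend(update)
    return match_local(query_codes, local_db), True
```

A caller would invoke this once per input image; the server is contacted only when the locally stored update data no longer yields a match.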

In the embodiment and modifications described above, the feature point database assigns an image identification number to each single image, and the matching unit outputs to the correspondence calculation unit the corresponding points within the single image detected as the corresponding reference image. However, the present invention is not limited to this. In the present invention, the target to which content is attached may be treated as the reference image, and an image identification number may be assigned in the feature point database to each such target. That is, the reference image need not be one whole image; a set of feature points contained in one image may serve as the reference image.

For example, given the image shown in FIG. 2, the set of feature points related to the mountain-shaped object in the image (see FIG. 3) may be treated as one reference image, and the set of feature points related to the cloud-shaped object (see FIG. 3) as another reference image. In this case, the feature point database assigns different image identification numbers to the feature points related to the mountain-shaped object and to those related to the cloud-shaped object. Among the feature points related to the object that received many votes and was detected as the corresponding reference image (the feature points bearing the same image identification number), the matching unit outputs the points that correspond to feature points of the input image to the correspondence calculation unit as the corresponding points of the corresponding reference image. The content database likewise stores content for each object.

As described above, the present invention has the effect that a reference image corresponding to an input image can be searched for effectively even on a device with limited resources, and it is useful as, for example, a corresponding reference image search device that searches for a reference image corresponding to an input image using feature points of the image.

DESCRIPTION OF SYMBOLS
10, 20 Corresponding reference image search device
11 Image acquisition unit
12 Feature amount detection unit
13 Binary conversion unit
14 Feature point database
15 Matching unit
16 Environment measurement unit
21 Correspondence calculation unit
22 Content conversion unit
23 Content database
24 Superimposing unit
31 Communication unit
41 Communication unit
42 Matching unit
43 Feature point database
44 Content database
101, 102 Content superimposing system
110, 120, 130, 140 Content superimposing device
230, 240 External search server

Claims

A corresponding reference image search device for searching for a reference image corresponding to an input image, comprising:
a feature amount detection unit that extracts feature points from the input image and detects feature amounts of the feature points;
a binary conversion unit that converts the feature amounts detected by the feature amount detection unit into binary codes;
a feature point database that stores, in binary code format, the feature amounts of the feature points of each of a plurality of reference images; and
a matching unit that detects the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-coded feature amounts of the input image converted by the binary conversion unit with the binary-coded feature amounts of the plurality of reference images stored in the feature point database.

The corresponding reference image search device according to claim 1, wherein the binary conversion unit converts the feature amounts detected by the feature amount detection unit into binary codes using a transformation matrix.

The corresponding reference image search device according to claim 2, wherein the transformation matrix is a sparse matrix.

The corresponding reference image search device according to claim 2, wherein the binary conversion unit can change the size of the binary code by changing the size of the transformation matrix.

The corresponding reference image search device according to claim 1, wherein, when there are a plurality of reference images corresponding to the input image, the matching unit detects the plurality of reference images.

The corresponding reference image search device according to claim 2, further comprising an environment measurement unit that measures an execution environment of the corresponding reference image search device,
wherein the binary conversion unit changes the size of the binary code by changing the size of the transformation matrix according to a measurement result of the environment measurement unit.

A content superimposing device that comprises the corresponding reference image search device according to any one of claims 1 to 6 and superimposes corresponding content on the input image, the content superimposing device comprising:
a content database that stores content and a correspondence between the reference images and the content;
a content extraction unit that extracts, from the content database, the content corresponding to the reference image detected by the matching unit; and
a superimposing unit that superimposes the content extracted by the content extraction unit on the input image.

The content superimposing device according to claim 7, wherein
the feature amount detection unit extracts feature points together with information on their positions in the input image,
the feature point database stores information on the position of each feature point together with the feature amount of each feature point of the plurality of reference images,
the content database further stores a superimposition position of the content,
the content superimposing device further comprises a content conversion unit that converts the superimposition position, stored in the content database, of the content extracted by the content extraction unit, based on the relationship between the positions of the feature points extracted by the feature amount detection unit and the positions of the feature points stored in the feature point database, and
the superimposing unit superimposes the content extracted by the content extraction unit at the superimposition position in the input image converted by the content conversion unit.

A content superimposing system comprising a content superimposing device and an external search server capable of communicating with the content superimposing device, wherein
the content superimposing device comprises:
a feature amount detection unit that extracts feature points from an input image and detects feature amounts of the feature points;
a binary conversion unit that converts the feature amounts detected by the feature amount detection unit into binary codes; and
a content superimposing device side communication unit that transmits the binary-coded feature amounts of the input image converted by the binary conversion unit to the external search server, and
the external search server comprises:
an external search server side communication unit that receives the binary-coded feature amounts of the input image transmitted from the content superimposing device side communication unit;
an external search server side feature point database that stores, in binary code format, the feature amounts of the feature points of each of a plurality of reference images; and
an external search server side matching unit that detects the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-coded feature amounts of the input image received by the external search server side communication unit with the binary-coded feature amounts of the plurality of reference images stored in the external search server side feature point database.

The content superimposing system according to claim 9, wherein
the feature amount detection unit extracts feature points together with information on their positions in the input image,
the feature point database stores information on the position of each feature point together with the feature amount of each feature point of the plurality of reference images, and
the content superimposing system further comprises:
a content database that stores a correspondence between the reference images stored in the feature point database and content, and a superimposition position of the content;
a content conversion unit that extracts, from the content database, the content corresponding to the reference image detected by the external search server side matching unit, and converts the superimposition position, stored in the content database, of the extracted content based on the relationship between the positions of the feature points extracted by the feature amount detection unit and the positions of the feature points stored in the feature point database; and
a superimposing unit that superimposes the content extracted by the content conversion unit at the superimposition position in the input image converted by the content conversion unit.

The content superimposing system according to claim 9, wherein
the content superimposing device further comprises:
a content superimposing device side feature point database that stores, in binary code format, the feature amounts of the feature points of each of a plurality of reference images; and
a content superimposing device side matching unit that detects the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-coded feature amounts of the input image converted by the binary conversion unit with the binary-coded feature amounts stored in the content superimposing device side feature point database,
the external search server side communication unit transmits to the content superimposing device, among the feature amounts stored in the external search server side feature point database, the feature amounts of the reference image detected by the external search server side matching unit and the feature amounts of the reference images related to it,
the content superimposing device side communication unit receives the binary-coded feature amounts transmitted from the external search server side communication unit, and
the content superimposing device side feature point database stores the binary-coded feature amounts received by the content superimposing device side communication unit as the feature amounts of the feature points of each of the plurality of reference images.

A corresponding reference image search method for searching for a reference image corresponding to an input image in a corresponding reference image search device having a feature point database that stores, in binary code format, the feature amounts of the feature points of each of a plurality of reference images, the method comprising:
a feature point extraction step of extracting feature points from the input image;
a feature amount detection step of detecting feature amounts of the feature points extracted in the feature point extraction step;
a binary code conversion step of converting the feature amounts detected in the feature amount detection step into binary codes; and
a matching step of detecting the reference image corresponding to the input image from among the plurality of reference images by comparing the binary-coded feature amounts of the input image converted in the binary code conversion step with the binary-coded feature amounts of each of the plurality of reference images stored in the feature point database.

A content superimposing method for superimposing corresponding content on an input image in a content superimposing device comprising a corresponding reference image search device having a feature point database that stores, in binary code format, the feature amounts of the feature points of each of a plurality of reference images, and a content database that stores a correspondence between the reference images stored in the feature point database and content, the method comprising:
a corresponding reference image search step of detecting the reference image corresponding to the input image by the corresponding reference image search method according to claim 12;
a content extraction step of extracting, from the content database, the content corresponding to the reference image detected in the corresponding reference image search step; and
a superimposing step of superimposing the content extracted in the content extraction step on the input image.

A computer program for causing a computer to execute the corresponding reference image search method according to claim 12.

A computer program for causing a computer to execute the content superimposing method according to claim 13.