JP5500404B1 - Image processing apparatus and program thereof

Info

Publication number: JP5500404B1
Application number: JP2013112323A
Authority: JP
Inventors: 建一林
Original assignee: Qoncept Inc
Current assignee: Qoncept Inc
Priority date: 2013-05-28
Filing date: 2013-05-28
Publication date: 2014-05-21
Anticipated expiration: 2033-05-28
Also published as: JP2015026093A

Abstract

Problem: To obtain a local feature vector having scale invariance and rotation invariance at a lower computational cost.
Solution: The four feature points nearest to a first feature point are taken as second feature points, and pair feature points consisting of the first feature point and each of the second feature points are selected (S1). The distance L between the feature points of each pair is calculated, and first and second sampling circles, centered on the first and second feature points and each having a radius proportional to the distance L, are determined (S3). The average luminances I(Pi), I(Qi), i = 0 to f, of the rectangular pixel regions centered on each of 16 pixels on each circumference are sampled in order, using the direction of the straight line L as the reference, and the differences between these and the average luminances of the rectangular pixel regions centered on the first and second feature points are arranged in this order as the components of the local feature vector (S4).
Selected drawing: FIG. 2

Description

The present invention relates to an image processing apparatus having a function for obtaining a scale-invariant and rotation-invariant feature vector of a local region centered on a natural feature point in a luminance image, and to a program therefor.

With improvements in the performance of camera-equipped smartphones and in image processing techniques such as the FAST (Features from Accelerated Segment Test) corner detection method, markerless AR (Augmented Reality) has become feasible on smartphones.

The FAST corner detection method can detect a large number of natural feature points in a single image at high speed. By matching these feature points against feature points in reference data obtained in advance, the camera parameters that project three-dimensional coordinates onto the two-dimensional coordinates of the camera image can be estimated. Based on these parameters, an AR image can be generated in which a 3D model is projected onto the camera image (a CG image is superimposed on the camera image). To perform this matching, a local feature vector centered on each feature point must be described.

Patent Document 1 below discloses a local feature vector calculation method that can calculate a local feature vector having scale invariance and rotation invariance independently of contrast.

Non-Patent Document 1 below reports that, in the tests conducted there, the BRIEF (Binary Robust Independent Elementary Features) method of Non-Patent Document 2 below was the fastest.

JP 2012-38290 A

Feature descriptor comparison report: http://computer-vision-talks.com/2011/08/feature-descriptor-comparison-report/
BRIEF: http://cvlab.epfl.ch/~lepetit/papers/calonder_pami11.pdf
Random Forest: http://link.springer.com/article/10.1023%2FA%3A1010933404324?LI=true

However, obtaining a local feature vector having scale invariance and rotation invariance is relatively expensive computationally. In particular, achieving scale invariance requires image processing at each of a plurality of image scales, which increases the computational cost.

More specifically, Patent Document 1 requires a concentric circle detection unit that detects pixel data on the circumferences of p circles centered on a feature point; a weighted difference value calculation unit that calculates, for each item of pixel data, a weighted difference value obtained by multiplying the difference between the gradient angle of the pixel value and the dominant gradient by the square root of the radius of the circle; a frequency distribution creation unit that creates a frequency distribution of the weighted difference values having q classes; and a descriptor vector calculation unit that calculates a p × q-dimensional descriptor vector from the q-dimensional vectors whose components are the frequencies for each circle. Although this reduces the computational cost compared with SIFT (Scale Invariant Feature Transform) and with the faster SURF, the cost remains relatively high.

In BRIEF of Non-Patent Document 2, each component of the local feature vector is the binarized luminance difference between two pixels within a circle centered on the feature point. The number of dimensions of the local feature vector therefore equals its bit length, which saves memory, but the speed is still insufficient for the following reason. For example, when generating a 128-dimensional local feature vector, BRIEF can represent the vector in 128 bits, but it requires 128 random pixel samplings within the circle in an image of, for example, 640 × 480 pixels; this increases the number of cache operations and makes the processing heavy. BRIEF also lacks rotation invariance. Furthermore, the binarization makes it difficult to distinguish between local feature vectors that are close to each other, so the accuracy and stability of matching between a local feature vector obtained from a camera image and a reference local feature vector, that is, the discriminability of feature points, is reduced.

Furthermore, when information about an image of, for example, characters or symbols is searched for in a database, conventional methods yield local feature vectors with relatively low discriminability between feature points, so the image recognition rate is low.

In view of these problems, an object of the present invention is to provide an image processing apparatus, and a program therefor, configured to obtain a local feature vector having scale invariance and rotation invariance at a lower computational cost.

Another object of the present invention is to provide an image processing apparatus, and a program therefor, configured to further improve the discriminability of local feature vectors generated from an image.

In a first aspect of the present invention, an image processing apparatus comprises a processor and a storage device in which data and a program are stored, the data including a grayscale image, and the program including a feature vector generation program that causes the processor to generate a plurality of local feature quantities contained in the data, wherein
the feature vector generation program causes the processor to:
(a) detect, in the grayscale image, the coordinates of feature points that are corner points;
(b) for each detected feature point taken as a first feature point, select pair feature points each consisting of the first feature point and one of a predetermined number of second feature points taken in order of proximity to the first feature point;
(c) for each pair feature point, obtain the distance L between the first feature point and the second feature point; and
(d) obtain a normalized local feature vector whose components are:
the differences between the luminance of a pixel region containing the first feature point and each of the average first luminances I(Pi) of pixel regions Pi, i = 0 to n-1, each containing one of n pixels (n ≥ 4) spaced at equal pixel intervals on the circumference of a circle centered on the first feature point with a first radius proportional to the distance L, the luminances being sampled in a predetermined order with the direction of the line of the distance L as the reference; and
the differences between the luminance of a pixel region containing the second feature point and each of the average second luminances I(Qj) of pixel regions Qj, j = 0 to m-1, each containing one of m pixels (m ≥ 4) spaced at equal pixel intervals on the circumference of a circle centered on the second feature point with a second radius proportional to the distance L, the luminances being sampled in a predetermined order with the direction of the line of the distance L as the reference,
wherein the square root of the number of pixels in each pixel region is substantially proportional to the distance L.

Here, the grayscale image is, for example, a grayscale image or a single-color component image of a color image, and may be the grayscale image of each of one or more of the R, G, and B channels of an RGB image. Corner points are detected, for example, by the FAST corner detection method or by a corner detection method using the Harris operator. The term average luminance includes cumulative (summed) luminance. The local feature vector need only have its components arranged in a predetermined order with respect to the direction of the line of the distance L; for example, the former luminance-difference components and the latter luminance-difference components may be arranged alternately.

In a second aspect of the image processing apparatus according to the present invention, in the first aspect, m and n are each 8, 16, or 32.

According to the configuration of the first aspect, pair feature points are selected and local feature vectors are obtained as described above, so that a local feature vector having scale invariance and rotation invariance can be obtained at a lower computational cost than before.

In addition, because the local feature vector is generated based on the relationship between feature points in the image, its discriminability improves even for frame images of characters, symbols, and the like, and as a result the discriminability of the frame image itself can be improved.

According to the configuration of the second aspect, m and n are both powers of 2, so the local feature vector can be obtained faster, as described in Example 2.

Other objects, characteristic configurations, and effects of the present invention will become apparent from the following description read in conjunction with the claims and the drawings.

FIG. 1 is a schematic block diagram showing the hardware configuration of an image processing apparatus 10 according to Example 1 of the present invention.
FIG. 2 is a flowchart showing the procedure for generating local feature vectors within one frame image.
FIG. 3(A) shows a character image with straight lines superimposed on it that connect the feature points of each pair feature point, and FIG. 3(B) is an explanatory view of pair feature points in which part of (A) is enlarged.
FIG. 4 is an explanatory view of step S3 of FIG. 2, showing the local feature vector generation processing with part of FIG. 3(A) enlarged.
FIG. 5 is an explanatory view of step S4 of FIG. 2.
FIGS. 6(A) to 6(D) are local region images each showing, as dots, a feature point pair sharing the same first feature point, and FIGS. 6(E) to 6(H) are bar graphs of the components of the local feature vectors for the feature point pairs of (A) to (D), respectively.
FIG. 7 is a schematic functional block diagram of an image processing apparatus according to Example 2.
FIG. 8 is a schematic flowchart of the main routine executed by the main processing unit 40 in FIG. 7.
FIG. 9 is a schematic flowchart showing the class ID estimation processing performed by the matching processing unit 46 in FIG. 7 for one local feature vector V in the local feature vector storage unit M3.
FIG. 10(A) is an explanatory view of local feature vectors for the same pair feature point labeled with a class ID and a frame image ID, and FIG. 10(B) is an explanatory view showing a random forest classifier, consisting of one tree for each subset randomly extracted from the full set of local feature vectors in the reference data, together with its inputs and outputs.
FIG. 11 visualizes an intermediate result of applying the processing of FIG. 7 to a captured image of printed matter containing a photograph of a swan and the character string "Swan"; it shows the image, the feature point pairs extracted from it, and the straight lines connecting the feature points of each pair.
FIG. 12 visualizes an intermediate result of the same processing: feature points on a reference image obtained by reducing and rotating the image are matched by the matching unit with feature points on the recognition-target image of FIG. 11, and the matched feature points are connected by straight lines.
FIG. 13 visualizes an intermediate result of the same processing: feature points on a reference image obtained by reducing, rotating, and projectively transforming the image are matched by the matching unit with feature points on the recognition-target image of FIG. 11, and the matched feature points are connected by straight lines.
FIG. 14 visualizes an intermediate result of the same processing: feature points on a reference image obtained by rotating the image and reducing it further than in FIG. 12 are matched by the matching unit with feature points on the recognition-target image of FIG. 11, and the matched feature points are connected by straight lines.

FIG. 1 is a schematic block diagram showing the hardware configuration of an image processing apparatus 10 according to Example 1 of the present invention, showing only the components required for Example 1. The image processing apparatus 10 is, for example, a camera-equipped mobile terminal device such as a smartphone or PDA, a notebook computer, or a desktop computer.

In the main body 20 of the image processing apparatus 10, a processor 21 is coupled via a bus 22 to a storage device 23, an input interface 24, a camera interface 25, and a display interface 26. The processor 21 has an internal cache memory. An input device 30 is coupled to the input interface 24, a camera 31 to the camera interface 25, a display device 32 serving as an output device to the display interface 26, and an antenna 33 to a communication unit 27 serving as another output device.

The input device 30 is an interactive input device consisting of a touch panel, a pointing device, a keyboard, or a combination of these. The communication unit 27 has an interface for connecting to an external monitor or the Internet via radio waves.

The storage device 23 stores a program and data. The program causes the processor 21 to accept a user instruction, or the selection or entry of a setting value, from the input device 30 via the input interface 24, and in response to launch an application; to capture a subject, for example the cover of a library book or a signboard, with the camera 31 and store the frame image (still image) in the storage device 23; to generate a plurality of local feature vectors from this frame image; to identify the frame image based on these vectors and reference data in the storage device 23; and to read information about the frame image, for example information on related books held by the library or detailed information about the signboard, from the storage device 23 and display it on the display device 32 via the display interface 26. Alternatively, a product in a store or in a mail-order catalog is captured with the camera 31, and information about the product is displayed on the display device 32 in the same way.

The characteristic feature of Example 1 is the processing, shown in FIG. 2, that generates local feature vectors within one frame image. In the following, the codes in parentheses are the step identifiers in the figure.

(S0) Feature points are detected by the FAST corner detection method while the pixel of interest is raster-scanned over one frame image.

In the FAST corner detection method, with th a positive threshold, the luminance values of the 16 pixels on the circumference of a circle of, for example, radius 3 pixels centered on the pixel of interest are classified into three states: dark if smaller than (luminance of the pixel of interest) - th, bright if larger than (luminance of the pixel of interest) + th, and similar if in between. The pixel of interest is judged to be a corner feature point when, for example, 9 or more contiguous pixels are all judged bright or all judged dark.
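The segment test described above can be sketched in C as follows. This is a minimal illustration, not the implementation used in the patent: the function name `fast_is_corner`, the offset tables, the row-major 8-bit image layout, and the requirement that the caller stay 3 pixels away from the border are all assumptions made for the sketch.

```c
#include <stdint.h>

/* Offsets of the 16 pixels on a radius-3 circle, listed in circular order. */
static const int FAST_DX[16] = {0, 1, 2, 3, 3, 3, 2, 1, 0, -1, -2, -3, -3, -3, -2, -1};
static const int FAST_DY[16] = {-3, -3, -2, -1, 0, 1, 2, 3, 3, 3, 2, 1, 0, -1, -2, -3};

/* Returns 1 if pixel (x, y) passes the segment test: at least min_run contiguous
   circle pixels are all brighter than center + th or all darker than center - th.
   The caller must keep (x, y) at least 3 pixels away from the image border. */
int fast_is_corner(const uint8_t *img, int stride, int x, int y, int th, int min_run)
{
    int center = img[y * stride + x];
    int state[16];
    for (int k = 0; k < 16; k++) {
        int v = img[(y + FAST_DY[k]) * stride + (x + FAST_DX[k])];
        /* Three-way classification: 1 = bright, -1 = dark, 0 = similar. */
        state[k] = (v > center + th) ? 1 : (v < center - th) ? -1 : 0;
    }
    /* Walk the circle twice so that runs wrapping past index 15 are counted. */
    int run = 0, prev = 0;
    for (int k = 0; k < 32; k++) {
        int s = state[k & 0x0f];
        if (s != 0 && s == prev) run++;
        else run = (s != 0) ? 1 : 0;
        prev = s;
        if (run >= min_run) return 1;
    }
    return 0;
}
```

With the example values in the text, a caller would pass `min_run = 9` and scan every pixel that is at least 3 pixels inside the frame.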

(S1) The processing up to step S5 is then performed for each feature point detected in step S0 (the feature point of interest).

(S2) For the feature point of interest (the first feature point), the predetermined number n of feature points nearest to it, taken in order of increasing distance, are set as second feature points, and n pair feature points, each pairing the first feature point with one of the second feature points, are selected. Here n ≥ 1, and n is the same for every first feature point.
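The selection in step S2 can be sketched as a brute-force nearest-neighbor search. The text does not say how the nearest points are found, so the linear scan with insertion into a small sorted list, the `Point` type, and the function names below are assumptions of this sketch; a spatial index could be substituted for large point sets.

```c
#include <stddef.h>

typedef struct { int x, y; } Point;

/* Squared Euclidean distance; sufficient for ordering by distance. */
static long dist2(Point a, Point b)
{
    long dx = (long)a.x - b.x, dy = (long)a.y - b.y;
    return dx * dx + dy * dy;
}

/* Writes the indices of the n feature points nearest to pts[first] into out[]
   (nearest first) and returns how many were written (fewer than n if the frame
   contains fewer than n other feature points). out[] must hold n entries. */
size_t nearest_feature_points(const Point *pts, size_t count, size_t first,
                              size_t n, size_t *out)
{
    size_t found = 0;
    for (size_t j = 0; j < count; j++) {
        if (j == first) continue;
        long d2 = dist2(pts[first], pts[j]);
        /* Find the insertion position that keeps out[] sorted by distance. */
        size_t pos = found;
        while (pos > 0 && dist2(pts[first], pts[out[pos - 1]]) > d2) pos--;
        if (pos >= n) continue;   /* farther than the current n nearest */
        size_t last = (found < n) ? found : n - 1;
        for (size_t k = last; k > pos; k--) out[k] = out[k - 1];
        out[pos] = j;
        if (found < n) found++;
    }
    return found;
}
```

Each returned index, paired with `first`, corresponds to one pair feature point; with n = 4 this yields the four pairs per feature point drawn in FIG. 3(A).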

FIG. 3(A) shows, for n = 4, each first feature point connected by a straight line to each of its second feature points (that is, each pair connected). FIG. 4 is an enlarged explanatory view of part of FIG. 3.

Steps S3 and S4 are performed for each pair feature point obtained in step S2.

(S3) The distance L between the feature points of the pair is calculated. For example, as shown in FIG. 4, the distance L between a first feature point 350 and a second feature point 351 is calculated, and a first sampling circle 352 centered on the first feature point 350 and a second sampling circle 353 centered on the second feature point 351 are determined, each with a radius proportional to the distance L.

The proportionality constant in FIG. 4 is 1 and is common to all feature points. The proportionality constants for the radii of the first sampling circle 352 and the second sampling circle 353 may differ from each other.

FIG. 5 shows a first feature point 36 and a second feature point 37 different from those in FIG. 4, together with a first sampling circle C1 and a second sampling circle C2 whose proportionality constant differs from the above.

(S4) Among the pixels on the first sampling circle C1, N pixels (N ≥ 4) at equal pixel intervals, for example 16 pixels, are taken, and the average luminances I(Pi), i = 0 to f, of the rectangular pixel regions P0 to P9 and Pa to Pf centered on each of these pixels are sampled in a predetermined order relative to the direction vector from the first feature point 36 toward the second feature point 37 (or the direction of the straight line L), for example counterclockwise starting from this direction vector. In FIG. 5, the average luminances I(P1), I(P2), ..., I(Pf), I(P0) are sampled in this order, and the differences between each of these and the average luminance I1 of the rectangular pixel region (shown hatched) centered on the first feature point 36 are arranged in this order. Similarly, for the second sampling circle C2, sampling is performed in a predetermined order relative to the direction vector from the second feature point 37 toward the first feature point 36 (or the direction of the straight line L), for example counterclockwise starting from this direction vector. In FIG. 5, the average luminances I(Q9), I(Qa), ..., I(Qf), I(Q0), ..., I(Q8) are sampled in this order, and the differences between each of these and the average luminance I2 of the rectangular pixel region (shown hatched) centered on the second feature point 37 are arranged in this order. The result, after normalization, is obtained as the local feature vector for feature point 36 of the pair. That is, the local feature vector V is obtained as

V = α (I(P1) - I1, I(P2) - I1, ..., I(Pf) - I1, I(P0) - I1, I(Q9) - I2, I(Qa) - I2, ..., I(Qf) - I2, I(Q0) - I2, ..., I(Q8) - I2)

Here α is a coefficient that normalizes the norm of the feature vector V to, for example, 127, the maximum value of a signed 8-bit integer (the square of the norm is then 16129). The sign of each component may be the reverse of the above, or may be reversed only for the second sampling circle C2.
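The assembly and normalization of V in step S4 can be sketched as follows. The sketch assumes that the 16 region luminances of each circle have already been sampled and ordered from each circle's start position, and that they are 8-bit averages (0 to 255). The integer square root and the truncating division are choices of this sketch that keep it free of floating point; as a result the norm comes out at most 127 rather than exactly 127. The names `isqrt_u32` and `build_local_feature_vector` are hypothetical.

```c
#include <stdint.h>

/* Integer square root: the largest r with r * r <= v. */
static uint32_t isqrt_u32(uint32_t v)
{
    uint32_t r = 0;
    while ((uint64_t)(r + 1) * (r + 1) <= v) r++;
    return r;
}

/* p[16], q[16]: region luminances (0..255) of the first and second sampling
   circles, already ordered from each circle's start position.
   i1, i2: luminances of the regions around the first and second feature points.
   v[32] receives the components scaled so that the vector norm is at most 127. */
void build_local_feature_vector(const int p[16], const int q[16],
                                int i1, int i2, int8_t v[32])
{
    int d[32];
    uint32_t sum_sq = 0;
    for (int k = 0; k < 16; k++) {
        d[k] = p[k] - i1;        /* first sampling circle */
        d[16 + k] = q[k] - i2;   /* second sampling circle */
    }
    for (int k = 0; k < 32; k++) sum_sq += (uint32_t)(d[k] * d[k]);
    int norm = (int)isqrt_u32(sum_sq);
    /* alpha = 127 / norm; a flat patch (norm 0) yields the zero vector. */
    for (int k = 0; k < 32; k++)
        v[k] = (int8_t)(norm ? d[k] * 127 / norm : 0);
}
```

Because every component is divided by the norm, doubling all luminance differences (a contrast change) leaves the resulting vector unchanged, which is the illumination robustness the text attributes to normalization.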

Each of the rectangular pixel regions is a square whose side length is approximately proportional to the distance L. Here, approximately proportional means that quantization error is included.
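A region average with a side length that scales with L might look like the sketch below. The proportionality constant `c`, the rounding rule, the minimum side of 1 pixel, and the choice to skip pixels that fall outside the image are assumptions of this sketch; the text only states that the side is approximately proportional to L.

```c
#include <stdint.h>

/* Mean luminance of the square region centered on (cx, cy) whose side is about
   c * L pixels (rounded to the nearest integer, at least 1). Pixels outside the
   image are skipped. The image is row-major with `width` bytes per row. */
int region_mean_luminance(const uint8_t *img, int width, int height,
                          int cx, int cy, double L, double c)
{
    int side = (int)(c * L + 0.5);   /* rounding introduces the quantization error */
    if (side < 1) side = 1;
    int x0 = cx - side / 2, y0 = cy - side / 2;
    long sum = 0;
    int count = 0;
    for (int y = y0; y < y0 + side; y++) {
        if (y < 0 || y >= height) continue;
        for (int x = x0; x < x0 + side; x++) {
            if (x < 0 || x >= width) continue;
            sum += img[y * width + x];
            count++;
        }
    }
    return count ? (int)(sum / count) : 0;
}
```

Scaling the region with L is what keeps the sampled patch covering the same part of the scene when the image is enlarged or reduced.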

Because steps S1 to S5 are performed for every feature point in one frame, the local feature vector with the roles of the second feature point 37 and the first feature point interchanged is also calculated.

The local feature vector obtained in this way is unchanged when the camera 31 is rotated about its optical axis with the optical axis and position held fixed, and is also unchanged when the camera 31 is moved along the optical axis. That is, the local feature vector has scale invariance and rotation invariance.

FIGS. 6(A) to 6(D) are local region images each showing, as dots, a different feature point pair sharing the same first feature point, and FIGS. 6(E) to 6(H) are bar graphs of the components of the local feature vectors for the feature point pairs of (A) to (D), respectively.

Let V[k] be the k-th component of the local feature vector V before normalization, R[i] the luminance I(Pi), I1 the luminance of the first feature point 36, and R[o] the luminance array element at the calculation start position (o = 1 in the case of FIG. 5), with hexadecimal numbers written with a 0x prefix. In C, the components of V for the first sampling circle C1 can then be computed with the following simple loop.

for (i = 0; i < 16; i++) { V[i] = R[(i + o) & 0x0f] - I1; }
Here, & is the bitwise AND operator. In general, writing the remainder (modulo) operator as %, when n is a power of 2, i = (j + o) % n can be computed as i = (j + o) & (n - 1); for n = 16 the mask is 0x0f. Thus, as in the loop above, the wrapped index can be computed quickly with the AND operator, without an extra conditional jump instruction that tests whether the index has reached n.
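The same loop can be written for any power-of-two ring size, which covers the 8, 16, and 32 cases of the second aspect. This is a sketch under assumptions: the function name `ring_differences` and its parameters are hypothetical, and the caller is responsible for passing a power-of-two `n`.

```c
/* Copies the n ring samples r[] into v[], starting at offset o and wrapping
   around, while subtracting the center luminance i1.
   n must be a power of two so that (n - 1) is a valid wrap-around mask. */
void ring_differences(const int *r, int n, int o, int i1, int *v)
{
    int mask = n - 1;   /* 0x07 for n = 8, 0x0f for n = 16, 0x1f for n = 32 */
    for (int i = 0; i < n; i++)
        v[i] = r[(i + o) & mask] - i1;   /* same index as (i + o) % n */
}
```

Calling `ring_differences(R, 16, 1, I1, V)` reproduces the FIG. 5 ordering, in which sampling starts at P1 and ends at P0.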

The components of the vector V for the second sampling circle C2 are computed in the same way.
Because each component of the local feature vector V is a difference of luminance values, V is not easily affected by changes in illumination. Because the norm of V is normalized, it is even less affected by such changes. Furthermore, because each component of V uses the average luminance of a pixel region (a cumulative sum suffices, since V is normalized afterwards), the signal-to-noise ratio of V can be made relatively high.

The normalized local feature vector V described above is used in Example 2 below.

FIG. 7 is a schematic functional block diagram of an image processing apparatus according to Example 2, which uses the method of Example 1. The hardware configuration of this apparatus is the same as that shown in FIG. 1 for Example 1.

In FIG. 7, the rounded-rectangle block Mi and the buffer areas M0 to M5 are parts of the data area in the storage device 23 of FIG. 1.

The main processing unit 40 is the main routine that processes the frame image and its luminance image. FIG. 8 is a schematic flowchart of the processing by the main processing unit 40. In steps S4i, S41, S43, and S45 to S48, the blocks 4i, 41, 43, and 45 to 48 in FIG. 7 are called as subroutines, respectively.

In FIG. 7, the image input unit 4i, the buffer area Mi, the grayscale conversion unit 41, the buffer area M0, the feature point detection unit 43, the two-dimensional coordinate storage unit M1, and the local feature vector generation unit 45 are also used in the first embodiment. Specifically, the image input unit 4i acquires, from the camera 31 via the operating system, the color frame image G0 (for example, 640 × 480 pixels) captured when the shutter is released, and stores it in the buffer area Mi. The grayscale conversion unit 41 converts the frame image G0 in the buffer area Mi into an 8-bit, 1-channel luminance image (frame image) G1 and stores it in the buffer area M0 as it is converted. The feature point detection unit 43 performs the same processing as step S0 in FIG. 2 to obtain the two-dimensional coordinates of each feature point and stores them in the two-dimensional coordinate storage unit M1. The local feature vector generation unit 45 performs steps S1 to S5 in FIG. 2 for each feature point in the two-dimensional coordinate storage unit M1 to generate local feature vectors, and adds them to the local feature vector storage unit M3.

The reference data storage unit M4 stores, in advance, the reference data used in the search. The contents of the reference data storage unit M4 are generated as follows, using the reference data creation unit 42, the affine transformation unit 44, the local region image storage unit M2, and the above-described configuration for generating local feature vectors.

Specifically, the reference data creation unit 42 cuts out, from the frame image G1, a local region image containing the first sampling circle C1 and the second sampling circle C2 of each pair feature point as shown in FIG. 5, and adds it to the local region image storage unit M2 as part of the local region image group G2. It then obtains the local feature vector (reference local feature vector) of each pair feature point in the local region image group G2 through steps S1 to S5 of FIG. 2 in the local feature vector generation unit 45, and adds it to the reference data storage unit M4.

The reference data creation unit 42 also uses affine transformation to automatically generate, from each image in the local region image group G2, a plurality of local region images corresponding to changes in the depth and orientation of the camera. It adds these to the local region image group G2, obtains a local feature vector for each of them in the same manner as above, and adds the vectors to the reference data storage unit M4.

Specifically, via the affine transformation unit 44, each local region image in the local region image group G2 is affine-transformed with each of a plurality of matrices that correspond to changing the optical axis direction without changing the depth. The resulting new local region images are added to the local region image group G2, and a local feature vector is obtained for each transformed image in the same way via the local feature vector generation unit 45 and added to the reference data storage unit M4. The reference data creation unit 42 further affine-transforms each image of the local region image group G2 with each of a plurality of matrices that correspond to increasing only the depth, that is, it generates reduced local region images. For example, it generates local region image groups G3, G4, and G5 whose width and height are each multiplied by 1/√2, then by 1/√2 again, and then by 1/√2 once more. For each of these local region images, a local feature vector is obtained in the same way via the local feature vector generation unit 45 and added to the reference data storage unit M4.
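The successive reductions by 1/√2 can be written as uniform-scale affine matrices. The sketch below is an illustration under assumptions: the text gives only the scale factors, so the 2×3 row-major matrix layout, scaling about the origin, and the names `Affine`, `shrink_matrix`, and `apply_affine` are all hypothetical. Levels 1, 2, and 3 would correspond to the groups G3, G4, and G5.

```c
/* 2x3 affine matrix in row-major order: [a b tx; c d ty]. */
typedef struct { double m[6]; } Affine;

#define INV_SQRT2 0.70710678118654752440   /* 1 / sqrt(2) */

/* Uniform scale by (1/sqrt 2)^level about the origin; level 0 is identity. */
static Affine shrink_matrix(int level)
{
    double s = 1.0;
    for (int k = 0; k < level; k++) {
        s *= INV_SQRT2;
    }
    Affine a = {{ s, 0.0, 0.0,
                  0.0, s, 0.0 }};
    return a;
}

/* Map the point (x, y) through the affine matrix. */
static void apply_affine(const Affine *a, double x, double y,
                         double *xo, double *yo)
{
    *xo = a->m[0] * x + a->m[1] * y + a->m[2];
    *yo = a->m[3] * x + a->m[4] * y + a->m[5];
}
```

Two successive steps halve the width and height, so a point at (64, 48) in a G2 image maps to approximately (32, 24) at level 2.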

In the reference data storage unit M4, the reference data creation unit 42 associates the same class ID (CID) with every local feature vector that relates to the same pair feature point, regardless of whether an affine transformation was applied. That is, as shown for example in FIG. 10(A), the local feature vectors of one pair feature point seen from a plurality of different camera viewpoints, for example V0101, V0102, V0103, and so on, are grouped in the reference data storage unit M4 under the same class ID, for example CID01.

In the reference data storage unit M4, each CID is further associated with the frame image IDs (FIDs) to which it belongs. For example, CID01 is associated with FID01 and FID12, which means that CID01 is contained in each of the frame images FID01 and FID12.

The reference data storage unit M4 also contains the above-mentioned information associated with each FID, for example information on a related book, detailed information on a signboard, or product information.
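One possible in-memory layout for the three kinds of reference data described above (vectors grouped by CID, the FIDs for each CID, and the information for each FID) is sketched below. The struct and function names, the descriptor length of 32 (16 samples on each of two circles), and the bound `MAX_FIDS` are assumptions made for illustration. The text does not specify a storage format.

```c
#include <stddef.h>

#define DESC_LEN 32   /* assumed: 16 samples on each of two circles */
#define MAX_FIDS 4    /* assumed upper bound, for this sketch only */

/* One reference local feature vector and the class it belongs to. */
typedef struct { int cid; float v[DESC_LEN]; } RefVector;

/* Frame image IDs in which one class (pair feature point) appears. */
typedef struct { int cid; int n_fids; int fids[MAX_FIDS]; } ClassFrames;

/* Information output to the user for one frame image. */
typedef struct { int fid; const char *info; } FrameInfo;

/* Count the reference vectors (camera viewpoints) stored for a class ID. */
static size_t count_views(const RefVector *refs, size_t n, int cid)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (refs[i].cid == cid) count++;
    }
    return count;
}

/* Linear lookup of the information for a frame image ID; NULL if absent. */
static const char *frame_info(const FrameInfo *tab, size_t n, int fid)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].fid == fid) return tab[i].info;
    }
    return NULL;
}
```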

The matching unit 46 includes decision trees as a discriminator (classifier) for local feature vectors. The classifier takes a local feature vector V as input and outputs a class ID. In this embodiment, a random forest (Random Forest) made up of a plurality of trees is used as the decision trees. The reasons are that it operates quickly at run time, that it is a multi-class classifier, that its classification accuracy is relatively high, and that the trade-off between classification accuracy and memory usage can be adjusted almost entirely through the number of trees (few trees give low accuracy and small memory usage; many trees give high accuracy and large memory usage).

The components of the local feature vector are not binarized for two reasons: using decision trees makes matching fast regardless of binarization, and binarization would reduce the discriminative power of the local feature vector.

The matching unit 46 trains the random forest classifier in advance. Specifically, from the full set of local feature vectors in the reference data storage unit M4, a plurality of subsets (each with the same number of elements) are chosen at random, without regard to whether class IDs coincide. As shown in FIG. 10(B), the split function f(V) that divides a subset at a branch node of a tree and the threshold t that defines the split boundary are first determined at random, and the parameters of f(V) and the threshold t are then updated by learning so that the information gain is maximized. In addition, each leaf node of each tree is associated with a probability Pr for each class ID (a class ID not associated with a leaf node has probability 0 there).

For each local feature vector V, the matching unit 46 traverses each tree of the random forest, obtains the per-class-ID probabilities at the leaf node reached, and takes as the output of the random forest classifier the class ID whose probability summed over all trees is largest.

Specifically, the matching unit 46 performs steps S10 to S15 shown in FIG. 9 on each local feature vector V in the local feature vector storage unit M3 to estimate the class ID of that vector.

(S10) An empty histogram whose horizontal axis is the class ID and whose vertical axis is the frequency (more precisely, the cumulative sum of probability values) is created in the frame image ID histogram storage unit M5. Steps S11 to S14 are then performed for each tree of the random forest.

(S12) The tree is traversed from top to bottom for the local feature vector V. At each node, the child node to branch to is decided from the corresponding component of V, the node's threshold t, and the split function f(V). From the probability distribution over class IDs obtained at the leaf node, a number of class IDs, for example three, are selected in descending order of probability.

(S13) The probability values of these three class IDs are added to the histogram of step S10.

(S15) The mode of the histogram is taken as the estimated class ID of the feature point of this local feature vector V (see FIG. 10(B)).

(S16) The matching unit 46 obtains from the reference data storage unit M4 the frame image IDs (FIDs) that correspond to the estimated class ID (CID), for example FID01 and FID12 for ID01, the class ID on the left side of FIG. 10(A), and increments by 1 each counter in the frame image ID histogram storage unit M5 that is identified by one of those frame image IDs.
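Steps S10 to S15 above can be sketched as follows for a single local feature vector. This is an illustration under assumptions, not the embodiment's implementation: the node layout, the four-class limit, and the split rule (component `comp` of V compared against the threshold t, going left when smaller) are hypothetical stand-ins for the split function f(V), whose form the text does not give.

```c
#define N_CLASSES 4   /* assumed small class count for the sketch */
#define TOP_K     3   /* "for example three" class IDs per leaf */

typedef struct {
    int   is_leaf;
    int   comp;              /* branch node: component of V that is tested */
    float t;                 /* branch node: threshold */
    int   left, right;       /* branch node: child indices in the array */
    float prob[N_CLASSES];   /* leaf node: probability per class ID */
} Node;

/* S12 (first half): walk one tree from the root to a leaf. */
static const Node *descend(const Node *tree, const float *v)
{
    const Node *n = &tree[0];
    while (!n->is_leaf) {
        n = &tree[v[n->comp] < n->t ? n->left : n->right];
    }
    return n;
}

/* S12 (second half) and S13: add the TOP_K largest leaf probabilities. */
static void add_top_k(const Node *leaf, float *hist)
{
    int used[N_CLASSES] = {0};
    for (int k = 0; k < TOP_K; k++) {
        int best = -1;
        for (int c = 0; c < N_CLASSES; c++) {
            if (!used[c] && (best < 0 || leaf->prob[c] > leaf->prob[best])) {
                best = c;
            }
        }
        if (best < 0) break;          /* fewer than TOP_K classes */
        used[best] = 1;
        hist[best] += leaf->prob[best];
    }
}

/* S10 to S15: return the class ID with the largest accumulated value. */
static int classify(const Node *const *forest, int n_trees, const float *v)
{
    float hist[N_CLASSES] = {0};                   /* S10: empty histogram */
    for (int t = 0; t < n_trees; t++) {            /* S11 to S14: each tree */
        add_top_k(descend(forest[t], v), hist);
    }
    int best = 0;                                  /* S15: mode */
    for (int c = 1; c < N_CLASSES; c++) {
        if (hist[c] > hist[best]) best = c;
    }
    return best;
}
```

Ties are resolved in favour of the lower class ID here. That is a choice of this sketch, as the text does not say how ties are handled.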

After the matching unit 46 has performed the processing of FIG. 9 on every local feature vector V in the local feature vector storage unit M3, the frame image ID estimation unit 47 takes the frame image ID with the largest counter value in the frame image ID histogram storage unit M5 as the estimated ID of the frame image in the buffer area Mi.
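The counter update of step S16 and the final choice by the frame image ID estimation unit 47 amount to a vote over frame image IDs. A minimal sketch follows, assuming class IDs and frame image IDs are small non-negative integers used directly as array indices. The names `ClassToFrames`, `vote_frames`, and `best_frame` are hypothetical.

```c
#define MAX_FIDS 4   /* assumed upper bound on frames per class */

/* Frame image IDs associated with one class ID (index = class ID). */
typedef struct { int n_fids; int fids[MAX_FIDS]; } ClassToFrames;

/* S16: increment the counter of every frame that contains class cid. */
static void vote_frames(const ClassToFrames *table, int cid, int *counter)
{
    for (int k = 0; k < table[cid].n_fids; k++) {
        counter[table[cid].fids[k]]++;
    }
}

/* Unit 47: frame image ID with the largest counter (lowest ID on ties). */
static int best_frame(const int *counter, int n_frames)
{
    int best = 0;
    for (int f = 1; f < n_frames; f++) {
        if (counter[f] > counter[best]) best = f;
    }
    return best;
}
```

For instance, if class 0 maps to frames 1 and 12 and class 1 maps to frame 12 only, then estimated class IDs 0, 1, 0 give frame 12 three votes and frame 1 two votes, so frame 12 is selected.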

The frame image ID information output unit 48 retrieves the information corresponding to this frame image ID from the reference data storage unit M4 and outputs it to the display device 32.

Next, test results of the processing by the matching unit 46 are described.

FIGS. 11 to 14 each include a visualization of an intermediate result of applying the processing of FIG. 7 to an input image of a printed sheet containing a photograph of a swan and the character string "Swan". FIG. 11 is a visualized image showing the input image, the feature point pairs extracted from it, and the straight lines connecting the feature points of each pair. FIGS. 12 to 14 each show feature points on a reference image matched by the matching unit 46 with feature points on the input image of FIG. 11, with matched feature points connected by straight lines. In FIG. 12 the reference image corresponds to the input image reduced and rotated; in FIG. 13 it corresponds to the input image reduced, rotated, and projectively transformed; and in FIG. 14 it corresponds to the input image rotated and reduced more than in FIG. 12. Here, a reference image is an image from which the reference data described above is obtained.

The reference images of FIGS. 12 and 13 each have 137 feature point pairs. Of these, 111 pairs (81%) were matched successfully in FIG. 12 and 93 pairs (68%) in FIG. 13. The reference image of FIG. 14 has 36 feature point pairs, of which 29 (80%) were matched successfully.

Preferred embodiments of the present invention have been described above. The present invention also encompasses various other modifications, including those that use other configurations to realize the functions of the components described above and other configurations that a person skilled in the art would conceive from these configurations or functions.

For example, the classifier used in the matching unit 46 need only be fast and achieve at least a certain level of matching accuracy. It is not limited to a random forest classifier and may be one that uses an ensemble learning algorithm such as bagging or boosting, or one that uses a single decision tree.

The reference data in the reference data storage unit M4 may also be generated automatically by the reference data creation unit 42, as described above, after the application is started.

Furthermore, the present invention can also be applied to an augmented reality (AR) display device and the like.

DESCRIPTION OF SYMBOLS
10 Image processing apparatus
20 Main body
21 Processor
22 Bus
23 Storage device
24 Input interface
25 Camera interface
26 Display interface
27 Communication unit
30 Input device
31 Camera
32 Display device
33 Antenna
4i Image input unit
40 Main processing unit
41 Grayscale conversion unit
42 Reference data creation unit
43 Feature point detection unit
44 Affine transformation unit
45 Local feature vector generation unit
46 Matching unit
47 Frame image ID estimation unit
48 Frame image ID information output unit
340, 350, 36 First feature point
341 to 344, 351, 37 Second feature point
352, C1 First sampling circle
353, C2 Second sampling circle
P0 to Pf, Q0 to Qf Regions
Mi, M0 Buffer areas
M1 Two-dimensional coordinate storage unit
M2 Local region image storage unit
M3 Local feature vector storage unit
M4 Reference data storage unit
M5 Frame image ID histogram storage unit

Claims

An image processing apparatus comprising a processor and a storage device in which data and a program are stored, wherein the data includes a grayscale image and the program includes a feature vector generation program that causes the processor to generate a plurality of local feature amounts to be included in the data, wherein
the feature vector generation program causes the processor to:
(a) detect the coordinates of feature points that are corner points in the grayscale image;
(b) select, as pair feature points, pairs each consisting of a first feature point, which is each detected feature point, and one of a predetermined number of second feature points taken in order of proximity to the first feature point;
(c) obtain, for each pair feature point, the distance L between the first feature point and the second feature point; and
(d) obtain a normalized local feature vector whose components are:
the differences between the luminance of a pixel region containing the first feature point and each of the average first luminances I(Pi) of pixel regions Pi, i = 0 to n−1, which respectively contain n pixels (n ≥ 4) at equal pixel intervals among the pixels on the circumference of a first radius that is centered on the first feature point and proportional to the distance L, the I(Pi) being sampled in a predetermined order with the direction of the line of the distance L as reference; and
the differences between the luminance of a pixel region containing the second feature point and each of the average second luminances I(Qj) of pixel regions Qj, j = 0 to m−1, which respectively contain m pixels (m ≥ 4) at equal pixel intervals among the pixels on the circumference of a second radius that is centered on the second feature point and proportional to the distance L, the I(Qj) being sampled in a predetermined order with the direction of the line of the distance L as reference,
wherein the square root of the number of pixels in each pixel region is substantially proportional to the distance L.

The image processing apparatus according to claim 1, wherein each of m and n is 8, 16 or 32.

The image processing apparatus according to claim 1, further comprising a camera, wherein the grayscale image is an image obtained by converting a frame image captured by the camera to grayscale.

The image processing apparatus according to claim 1, wherein the data further includes reference data that contains, for each reference grayscale image, local feature vectors generated by the feature vector generation program and associated with class IDs as reference local feature vectors, together with information on the reference grayscale image, and the program further includes an image search program, wherein
the image search program causes the processor to:
(e) for each local feature vector obtained in step (d) for a search grayscale image, determine the class ID in the reference data that corresponds to the local feature vector by matching the local feature vector against the reference local feature vectors in the reference data, and increment the counter of the reference grayscale image to which the class ID belongs; and
(f) output the information in the reference data on the reference grayscale image having the largest counter value as information on the search grayscale image.

The image processing apparatus according to claim 4, wherein, in step (e), the image search program causes the processor to determine the class ID by means of a classifier that takes a local feature vector as input and outputs a class ID.

The image processing apparatus according to claim 1, wherein the image processing apparatus is an augmented reality display device.

The program that constitutes the image processing apparatus according to any one of claims 1 to 6.