JP2019207657A

JP2019207657A - Information processing device and program

Info

Publication number: JP2019207657A
Application number: JP2018103952A
Authority: JP
Inventors: Haruhisa Kato (加藤 晴久)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2018-05-30
Filing date: 2018-05-30
Publication date: 2019-12-05
Anticipated expiration: 2038-05-30
Also published as: JP6904922B2

Abstract

To provide an information processing device that can obtain feature information on a target imaged in a captured image as information from which the effect of a light source acting as a disturbance on the target is excluded or reduced. SOLUTION: The information processing device comprises: a first feature calculation unit 2 for calculating first feature information from an image; a first attitude estimation unit 3 for estimating the first position attitude of a target imaged in the image with respect to the camera that obtained the image, by collating the first feature information with feature information for reference; a processing unit 4 for estimating, using the first position attitude, a difference between an extracted image of the target in the area of the image occupied by the target and a reference image relating to the target, as being caused by a difference in light source environment, and obtaining a processed image in which the light source environment of the image is corrected to the light source environment of the reference image on the basis of the estimation result; and a second feature calculation unit 5 for calculating, from the processed image, second feature information on the target as it would be if the image were under the light source environment of the reference image. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing apparatus and a program capable of obtaining feature information of a target imaged in a captured image in a form in which the influence of a light source acting as a disturbance on the target is excluded or reduced.

The techniques of Non-Patent Document 1 and Patent Document 1 make it possible to obtain the relative position of a target (its position with respect to the camera capturing the image) by recognizing or detecting the target captured in an image.

In Non-Patent Document 1, feature points are detected in an image, local feature amounts are calculated from the neighborhood of each feature point, and these are collated with local feature amounts accumulated in advance. This establishes correspondences between feature points, from which the relative positional relationship with the target is recognized. In Patent Document 1, the target is illuminated by a plurality of light sources that are arranged around the camera and that each blink selectively and independently, matching is performed with templates corresponding to the blinking combinations, and the positional relationship is obtained as the matching result.

Patent Document 1: JP-A-2015-173344

Non-Patent Document 1: D. G. Lowe, "Object recognition from local scale-invariant features," Proc. of IEEE International Conference on Computer Vision (ICCV), pp. 1150-1157, 1999.

However, the conventional techniques described above cannot robustly acquire relative position information of the imaged target when light is present as a disturbance, that is, light that is uncontrollable and whose intensity, spectral distribution, and position and direction of irradiation on the target are unknown.

Specifically, Non-Patent Document 1 is robust to an overall increase or decrease in light quantity because it uses feature amounts based on luminance differences, but when non-uniform light is applied, the luminance orientations and histograms may change, which lowers the recognition rate. For example, when light strikes part of the imaging target and produces a gradation, different feature amounts arise at the same location and the recognition rate drops.

In Patent Document 1, the plurality of light sources used are assumed to be under control rather than to act as disturbances. For example, the light sources are arranged around a circle on a plane facing the target, and selective blinking control is performed such that only one light source is lit at a time while the others remain off. Consequently, when a further light source that is not under this control exists as a disturbance, the same problem as in Non-Patent Document 1 arises in Patent Document 1.

Patent Document 1 also requires a plurality of controlled light sources and their selective blinking, which makes the apparatus large. Furthermore, because template matching relies on the shading that the controlled light sources produce on a three-dimensional object, the technique cannot be applied to planar objects and the like on which no shading arises. It also takes effort to prepare a large number of templates corresponding to the multiple blinking states of the multiple light sources and to capture an image for each blinking state.

These problems arise because the conventional techniques cannot obtain feature information of a target imaged in a captured image in a form in which the influence of a light source acting as a disturbance on the target is excluded or reduced.

In view of the above problems of the conventional techniques, an object of the present invention is to provide an information processing apparatus and a program capable of obtaining feature information of a target imaged in a captured image in a form in which the influence of a light source acting as a disturbance on the target is excluded or reduced.

To achieve the above object, the present invention is an information processing apparatus comprising: a first feature calculation unit that calculates first feature information from a captured image obtained by imaging; a first posture estimation unit that estimates, by collating the first feature information with given reference feature information, a first position and orientation of a target imaged in the captured image with respect to the camera that obtained the captured image; a processing unit that obtains a processed image of the captured image on the basis of processing in which the difference between an extracted image of the target, taken from the region that the target occupies in the captured image, and a given reference image of the target is estimated in a common coordinate system, by a coordinate transformation corresponding to the first position and orientation, as being caused by a difference in light source environment, and in which, on the basis of the estimation result, a first light source environment in the captured image is corrected in the common coordinate system to a second light source environment in the reference image; and a second feature calculation unit that calculates, from the processed image, second feature information of the target as it would be if the captured image were under the second light source environment. The present invention is also a program that causes a computer to function as the information processing apparatus.

According to the present invention, feature information of a target imaged in a captured image can be obtained in a form in which the influence of a light source acting as a disturbance on the target is excluded or reduced.

The drawings show, in order: a functional block diagram of an information processing apparatus according to one embodiment; the main part of the flow of information processing by the processing unit and the subsequent processing units according to one embodiment, together with schematic illustrations of each piece of information; a one-dimensional schematic example of the processing performed by the processing unit in the neighborhood of a single feature point; and the main part of the flow of information processing according to an embodiment different from that of FIG. 2, together with schematic illustrations of each piece of information.

FIG. 1 is a functional block diagram of an information processing apparatus 10 according to one embodiment. As illustrated, the information processing apparatus 10 includes an imaging unit 1, a first feature calculation unit 2, a first posture estimation unit 3, a processing unit 4 that includes a disturbance estimation unit 41 and a disturbance correction unit 42, a second feature calculation unit 5, a second posture estimation unit 6, and a storage unit 7. The functions of the units, connected as shown in the figure, are outlined below.

The imaging unit 1 obtains a captured image by imaging a field (such as an indoor or outdoor space) in which a target to be recognized exists, and outputs the captured image to the first feature calculation unit 2 and the processing unit 4. The first feature calculation unit 2 calculates first feature information from the captured image obtained by the imaging unit 1 and outputs it to the first posture estimation unit 3 and the second feature calculation unit 5. The first posture estimation unit 3 collates the first feature information obtained by the first feature calculation unit 2 with the reference feature information stored in the storage unit 7 (stored as reference information for each recognition target candidate). It thereby obtains a recognition result for the target imaged in the captured image, estimates first position and orientation information of the recognized target from the coordinate correspondences between the collated feature information, and outputs the estimated first position and orientation information to the processing unit 4.

The processing unit 4 uses the first position and orientation information obtained by the first posture estimation unit 3, the reference image obtained by referring to the storage unit 7 (the reference image corresponding to the target recognized from the captured image by the first posture estimation unit 3), and the captured image obtained by the imaging unit 1 to obtain a processed image, that is, the captured image processed so that the influence of the light disturbance is excluded or reduced, and outputs the processed image to the second feature calculation unit 5. The details are described later; in outline, the disturbance estimation unit 41 estimates the light disturbance in the captured image, and the disturbance correction unit 42 corrects the captured image so as to exclude or reduce the estimated disturbance, which yields the processed image.

The second feature calculation unit 5 calculates second feature information from the processed image obtained by the processing unit 4 and outputs it to the second posture estimation unit 6. Here, the second feature calculation unit 5 may refer to the first feature information obtained from the first feature calculation unit 2 and skip recalculation of any feature information that was already calculated appropriately in the first feature information without being affected by the light disturbance. The second posture estimation unit 6 estimates and outputs second position and orientation information from the coordinate correspondences between the second feature information obtained by the second feature calculation unit 5 and, among the reference feature information stored in the storage unit 7, the feature information corresponding to the recognition result obtained by the first posture estimation unit 3.

The second feature information obtained by the second feature calculation unit 5 thus corresponds to the first feature information obtained by the first feature calculation unit 2 with the influence of the light disturbance excluded or reduced. Likewise, the second position and orientation information obtained by the second posture estimation unit 6 corresponds to the first position and orientation information obtained by the first posture estimation unit 3 with the influence of the light disturbance excluded or reduced.

The storage unit 7 stores in advance the feature information and the reference image of each of a plurality of types of recognition targets. It provides the feature information to the first posture estimation unit 3 and the second posture estimation unit 6 for reference, and provides the reference image to the processing unit 4 for reference.
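The data flow among the units outlined above can be sketched as follows. This is only an illustration of the wiring, not the patent's implementation: the function `run_pipeline`, the dictionary keys, and the callables passed in are all hypothetical names, and the sketch assumes for simplicity that the second feature calculation uses the same routine as the first.

```python
from typing import Any, Callable, Dict


def run_pipeline(captured: Any,
                 units: Dict[str, Callable],
                 storage: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical wiring of units 2 to 6; `storage` plays the role of storage unit 7."""
    # Unit 2: first feature information from the captured image.
    f1 = units["features"](captured)
    # Unit 3: recognition result and first position/orientation.
    target, pose1 = units["estimate_pose"](f1, storage["features"])
    # Unit 4: processed image, using the reference image of the recognized target.
    processed = units["process"](captured, pose1, storage["images"][target])
    # Unit 5: second feature information from the processed image
    # (assumption: same feature routine as unit 2).
    f2 = units["features"](processed)
    # Unit 6: second position/orientation, restricted to the recognized target.
    _, pose2 = units["estimate_pose"](f2, {target: storage["features"][target]})
    return {"target": target, "pose1": pose1, "pose2": pose2, "processed": processed}
```

Passing the units as callables keeps the sketch independent of any particular feature detector or pose estimator.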

The processing of each unit of the information processing apparatus 10 outlined above is described in detail below.

<Imaging unit 1>
The imaging unit 1 images a target and obtains a captured image. Imaging may be performed by a user operation or the like so that the captured image contains a target known in advance, that is, a target that is to be recognized by the downstream first posture estimation unit 3 and whose position and orientation are to be estimated (a target of the same kind as those whose information is stored in the storage unit 7). Specifically, the target may be, for example, a marker bearing a pattern with known features, a printed material, or a three-dimensional object. A digital camera provided as standard equipment in a mobile terminal can be used as the hardware that implements the imaging unit 1.

Although the present invention can exclude or reduce the influence of a light disturbance on the target in the captured image, the imaging unit 1 should preferably capture the image so that no blown-out highlights or crushed shadows occur on the target, or, if they do occur, so that the affected region is as small as possible.

<First feature calculation unit 2>
The first feature calculation unit 2 first detects feature points of the target in the captured image obtained by the imaging unit 1. Characteristic points of the target, such as corners, can be used as the feature points. Existing methods for detecting characteristic points, such as SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features), can be used for detection. The first feature calculation unit 2 then calculates a feature amount from the captured image around each detected feature point coordinate. Existing methods for calculating characteristic amounts, such as SIFT and SURF, can be used for this calculation. Any other existing method may also be used for feature point detection and feature amount calculation; for example, feature points may be detected by FAST, and feature amounts may be calculated by ORB (Oriented FAST and Rotated BRIEF) as a code of the magnitude relations between the pixel values at predetermined pairs of points.
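The idea of encoding the magnitude relation between pixel values at predetermined pairs of points can be illustrated with a minimal sketch. This is not the actual ORB algorithm (which also compensates for orientation and uses its own sampling pattern); the names `PAIRS`, `binary_descriptor`, and `hamming` are hypothetical, and the four-pair pattern is chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Offset = Tuple[int, int]

# Hypothetical sampling pattern: pairs of (du, dv) offsets around a feature point.
PAIRS: List[Tuple[Offset, Offset]] = [
    ((-1, 0), (1, 0)),
    ((0, -1), (0, 1)),
    ((-1, -1), (1, 1)),
    ((-1, 1), (1, -1)),
]


def binary_descriptor(image: Sequence[Sequence[int]], u: int, v: int) -> List[int]:
    """Return one bit per pair: 1 if the first sampled pixel is darker than the second."""
    bits = []
    for (du1, dv1), (du2, dv2) in PAIRS:
        p1 = image[v + dv1][u + du1]  # the image is indexed [row][column] = [v][u]
        p2 = image[v + dv2][u + du2]
        bits.append(1 if p1 < p2 else 0)
    return bits


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Distance between two binary descriptors: the number of differing bits."""
    return sum(x != y for x, y in zip(a, b))
```

Because each bit depends only on which of two pixels is darker, adding the same offset to every pixel leaves the descriptor unchanged, whereas a non-uniform change such as a local highlight can flip bits.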

The first feature calculation unit 2 outputs, as the first feature information, the information on the plurality of feature points obtained as described above (specified as coordinates (u,v) on the captured image) together with the feature amount of each feature point (specified as a vector value or the like). For the variable notation used below, each feature point in the first feature information is identified by an index i (i = 1, 2, …, N, where N is the total number of feature points obtained), the coordinates of feature point i are written (u,v)_[i], and its feature amount is written f_[i]. The first feature information F1 obtained from the captured image can then be expressed as the following set of pairs of a feature point and its feature amount.
F1 = {((u,v)_[i], f_[i]) | i = 1, 2, …, N}

<First posture estimation unit 3>
The first posture estimation unit 3 first obtains a recognition result for the target imaged in the captured image: on the basis of the first feature information F1 obtained by the first feature calculation unit 2, it identifies which of the M predetermined targets o stored in the storage unit 7 (o is an index like i, with o = 1, 2, …, M) the imaged target corresponds to. For this identification, the matching degree mat(F1, F[o]) between the first feature information F1 and the feature information F[o] stored in the storage unit 7 for each predetermined target o is obtained, and the predetermined target o that maximizes the matching degree is taken as the recognition result o_[recognition]. If the maximum matching degree is at or below a predetermined threshold, it may be judged that no predetermined target is imaged in the captured image, and the subsequent processing may be omitted.
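The selection of the recognition result as the target that maximizes the matching degree, with rejection when the maximum is at or below a threshold, can be sketched as below. The function name `recognize` and its parameters are hypothetical; `mat` is any matching-degree function supplied by the caller.

```python
from typing import Callable, Dict, Optional, TypeVar

F = TypeVar("F")  # type of a feature-information object such as F1 or F[o]


def recognize(f1: F,
              references: Dict[str, F],
              mat: Callable[[F, F], float],
              threshold: float) -> Optional[str]:
    """Return the target o maximizing mat(F1, F[o]), or None if no target qualifies."""
    best_target: Optional[str] = None
    best_score = float("-inf")
    for target, f_ref in references.items():
        score = mat(f1, f_ref)
        if score > best_score:
            best_target, best_score = target, score
    # A maximum at or below the threshold means no predetermined target is imaged.
    if best_target is None or best_score <= threshold:
        return None
    return best_target
```

Returning `None` lets the caller skip the subsequent processing when nothing is recognized.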

Here, the storage unit 7 stores the feature information F[o] of each target o in the same format as the first feature information F1 (a set of pairs of feature point coordinates (u,v)_[k,o] and feature amounts f_[k,o]), as shown below, so that the first posture estimation unit 3 can refer to the stored F[o] and calculate the matching degree mat(F1, F[o]) by any existing method. Below, k is the index of a feature point and N[o] is the number of feature points constituting F[o].
F[o] = {((u,v)_[k,o], f_[k,o]) | k = 1, 2, …, N[o]}

For example, for each feature point k in the stored feature information F[o], it is determined which feature point i of the first feature information F1 has a feature amount f_[i] that matches f_[k,o], namely the one for which the distance between the feature amounts is smallest, provided that this minimum is at or below a predetermined threshold. This yields the matches between feature points k and i. The matching degree mat(F1, F[o]) may then be taken as the number of feature points k in F[o] that were matched, or as that number normalized by the number of feature points N[o]. The matching degree may also take into account the geometric consistency of the feature point coordinates in addition to such comparisons of distances between feature amounts, and robust estimation by RANSAC (random sample consensus) may be used.
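A minimal sketch of the nearest-neighbor matching and the resulting matching degree follows. It assumes real-valued feature amounts compared by Euclidean distance; the names `Feature`, `match_features`, `matching_degree`, and `max_dist` are hypothetical, and the geometric-consistency check and RANSAC mentioned above are omitted.

```python
from math import dist
from typing import List, Sequence, Tuple

# One element of F1 or F[o]: ((u, v), feature amount f)
Feature = Tuple[Tuple[float, float], Sequence[float]]


def match_features(f1: List[Feature], f_ref: List[Feature],
                   max_dist: float) -> List[Tuple[int, int]]:
    """For each reference feature k, find the nearest F1 feature i by descriptor
    distance and keep the pair (k, i) if that distance is at most max_dist."""
    pairs = []
    for k, (_, f_k) in enumerate(f_ref):
        best_i, best_d = -1, float("inf")
        for i, (_, f_i) in enumerate(f1):
            d = dist(f_k, f_i)
            if d < best_d:
                best_i, best_d = i, d
        if best_i >= 0 and best_d <= max_dist:
            pairs.append((k, best_i))
    return pairs


def matching_degree(f1: List[Feature], f_ref: List[Feature],
                    max_dist: float, normalize: bool = True) -> float:
    """mat(F1, F[o]): number of matched reference features, optionally divided by N[o]."""
    count = len(match_features(f1, f_ref, max_dist))
    if not normalize:
        return float(count)
    return count / len(f_ref) if f_ref else 0.0
```

The index pairs returned by `match_features` are also the correspondences from which a coordinate transformation between the two feature sets can later be estimated.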

The first posture estimation unit 3 next calculates a planar projective transformation matrix H' as the relationship that transforms between the coordinates of those feature points k of the recognition result o_[recognition] and feature points i of the first feature information F1 that were determined to match when the matching degree was calculated, and outputs the matrix H' as the first position and orientation estimation result. That is, when feature point k of the recognition result o_[recognition] and feature point i of the first feature information F1 have been determined to match, the matrix H' (a 3×3 matrix) is obtained so as to relate the homogeneous coordinate representations of the corresponding coordinates, x[k] = (u,v,1)_[k,o_[recognition]]^T and x[i] = (u,v,1)_[i]^T (where the superscript T denotes transposition, so that x[k] and x[i] are three-dimensional column vectors), as shown below. The matrix H' may be calculated by any existing method so as to minimize the sum of the errors when the series of mutually corresponding coordinates is transformed.
x[k] = H' x[i]
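One possible way to compute such a matrix from matched coordinates is sketched below. It fixes the bottom-right element of H' to 1 and solves the resulting linear least-squares problem through its normal equations, which minimizes an algebraic error rather than a geometric reprojection error; this is an illustrative choice, not the method prescribed by the text, and the names `solve`, `estimate_homography`, and `apply_homography` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Least-squares H' (bottom-right element fixed to 1) with dst ~ H' src.
    src plays the role of x[i], dst the role of x[k]; at least 4 pairs are needed."""
    rows, rhs = [], []
    for (x, y), (xk, yk) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -xk * x, -xk * y])
        rhs.append(xk)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -yk * x, -yk * y])
        rhs.append(yk)
    # Normal equations (A^T A) h = A^T b of the overdetermined system A h = b.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(8)]
    h = solve(ata, atb) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(hm: List[List[float]], pt: Point) -> Point:
    """Map (u, v) through hm in homogeneous coordinates and dehomogenize."""
    x, y = pt
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)
```

In practice the correspondences contain outliers, so an estimate like this would typically be wrapped in a robust scheme such as the RANSAC mentioned in the text.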

Here, as is well known from the mathematics used in fields such as three-dimensional computer graphics, a planar projective transformation matrix can be decomposed into a product of a translation component and a rotation component. Since the translation component obtained by decomposing H' expresses the position of the target and the rotation component expresses its orientation, the matrix H' expresses the first position and orientation information of the target in the captured image with respect to the position and orientation of the camera constituting the imaging unit 1. More precisely, the product H'H[o_[recognition]], which composes the first position and orientation given by H' with the planar projective transformation matrix H[o_[recognition]] representing a predetermined position and orientation (for example, a frontal position and orientation at a predetermined distance from the camera), represents the position and orientation of the target o_[recognition] in the captured image with respect to the camera of the imaging unit 1. (That is, the matrix H' represents the change from the predetermined position and orientation represented by H[o_[recognition]].) The predetermined position and orientation represented by H[o_[recognition]] is the position and orientation, with respect to a camera (one similar to that of the imaging unit 1), in which the target o_[recognition] is captured in a predetermined image R[o_[recognition]]. This is the image used to calculate in advance, by the same method as the first feature calculation unit 2, the feature information F[o_[recognition]] stored in the storage unit 7 for the recognition result o_[recognition]. The predetermined image R[o_[recognition]] used to calculate F[o_[recognition]] is also the reference image referred to by the processing unit 4 described later, and such an image is stored in advance in the storage unit 7 for every target o, including the recognition result o_[recognition].

<Processing unit 4>
Through the processes of the disturbance estimation unit 41 and the disturbance correction unit 42, the processing unit 4 uses the captured image Q obtained by the imaging unit 1, the first position and orientation information H' obtained by the first posture estimation unit 3, and the reference image R[o_[recognition]] stored in the storage unit 7 for the recognition result o_[recognition] to obtain a processed image in which the influence of the light disturbance in the captured image Q is excluded or reduced. Because the target o_[recognition] has already been determined as the recognition result by the upstream first posture estimation unit 3, it is omitted below where appropriate to simplify the notation, and the reference image is written simply as R.

FIG. 2 shows the main part of the flow of information processing by the processing unit 4 and the subsequent processing units (the second feature calculation unit 5 and the second posture estimation unit 6) according to one embodiment, together with schematic illustrations of each piece of information. In FIG. 2, the illustration in each broken-line frame corresponds to the information named in that frame. The processing of the disturbance estimation unit 41 and the disturbance correction unit 42 according to one embodiment is described below with reference to FIG. 2 as appropriate.

<Disturbance estimation unit 41>
The disturbance estimation unit 41 performs the information processing indicated by (4), (5), and (6) in FIG. 2; the upstream items (1), (2), and (3) on which it depends are explained first. (1) shows the captured image Q obtained by the imaging unit 1. In the illustration, the target is imaged at an oblique position and orientation, so that its original rectangular shape (the shape it has when imaged from the front) is changed into a distorted quadrilateral, and the pixel values (luminance values) near one vertex are locally increased by a light disturbance, which is depicted by the whitish region near that vertex of the quadrilateral. The original rectangular shape as imaged from the front, unaffected by the light disturbance, is shown as the reference image R in (5). (2) shows the first feature information F1 obtained by the first feature calculation unit 2; in the illustration, the extracted feature points are drawn schematically as "×" marks. (3) shows the first position and orientation H' obtained by the first posture estimation unit 3; the illustration depicts H' as the position and orientation under which the rectangle imaged from the front shown in (5) (reference image R) is captured so that it appears as the distorted quadrilateral (captured image Q). On the premise of the schematic examples (1), (2), and (3) of FIG. 2, the processes (4), (5), and (6) of the disturbance estimation unit 41 are described below.

First, in process (4), the captured image Q is coordinate-transformed by multiplying it by the matrix H' of the first position and orientation, giving the extracted image H'(Q). Specifically, the whole captured image Q is first transformed by H' into an image H'(Q)_[whole], and the portion of H'(Q)_[whole] that covers the same region as the reference image R is then extracted as H'(Q). The extracted image H'(Q) and the reference image R therefore have the same shape and size and, as explained below, are defined in the same coordinate system, so pixels at the same position in the two images can be put into correspondence.
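As an illustration of process (4), the following is a minimal sketch under stated assumptions, not the implementation of the specification. It assumes that images are lists of rows, that H' maps captured-image coordinates to reference-image coordinates, that the region A of the reference image is the rectangle [0,width) x [0,height), and that nearest-neighbor sampling is acceptable. The function names `invert_3x3`, `apply_h`, and `extract_image` are hypothetical.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix given as a list of three rows (adjugate / determinant)."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def apply_h(m, x, y):
    """Apply a planar projective transformation to the point (x, y)."""
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return ((m[0][0] * x + m[0][1] * y + m[0][2]) / w,
            (m[1][0] * x + m[1][1] * y + m[1][2]) / w)


def extract_image(q, h_prime, width, height):
    """Return H'(Q) restricted to the assumed region A = [0,width) x [0,height).

    Each output pixel (u, v) is filled by mapping it back into Q with H'^-1
    and taking the nearest pixel; None marks positions that fall outside Q.
    """
    h_inv = invert_3x3(h_prime)
    out = []
    for v in range(height):
        row = []
        for u in range(width):
            x, y = apply_h(h_inv, u, v)
            xi, yi = round(x), round(y)
            inside = 0 <= yi < len(q) and 0 <= xi < len(q[0])
            row.append(q[yi][xi] if inside else None)
        out.append(row)
    return out
```

Sampling through H'^-1 for each output pixel also corresponds to the modification in which only the part of Q that lands inside the reference region is transformed.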

To make process (4) possible, the storage unit 7 described above stores, for each target o, the feature information F[o], in which the coordinates (u,v)_[k,o] of each feature point k are defined in a coordinate system (u,v)_[o]; information on the region A[o] that the reference image R (=R[o]) occupies in that coordinate system is stored in association with F[o]. By reading the region A[o] from the storage unit 7, the disturbance estimation unit 41 can obtain the extracted image H'(Q) from the whole transformed image H'(Q)_[whole] in process (4). As noted in the description of the first posture estimation unit 3, the feature information F[o] is extracted from the reference image R (=R[o]), so the coordinate system (u,v)_[o] in which the feature point coordinates (u,v)_[k,o] are defined is the coordinate system of the reference image itself. The region A[o] can therefore be stored in advance in the storage unit 7 as a predetermined range of the reference image in that coordinate system.

As a modification of process (4), a partial region of the captured image Q may first be obtained as the region H'^-1(R), that is, the region of the reference image R multiplied by the inverse matrix H'^-1 of H', and the extracted image H'(Q) may then be obtained by multiplying only that partial region by H'.

Next, in process (5), the reference image R (=R[o_[recognition]]) corresponding to the recognition result o_[recognition] is read from the storage unit 7.

Finally, in process (6), the reference image R acquired in process (5) is subtracted from the extracted image H'(Q) obtained in process (4) by "disturbance estimation subtraction", giving the disturbance estimation image L'=H'(Q)〜R. (In this specification, disturbance estimation subtraction as a binary operator is written with the wave dash "〜".) The details are given later; in short, it is a subtraction accompanied by a correction process (finding and multiplying by the correction coefficient α described later) and a complementing process (the interpolation described later), and its result is the disturbance estimation image L'. As already stated for process (4), H'(Q) and R have the same shape and size in the same coordinate system (u,v). The disturbance estimation image L' is therefore obtained by applying disturbance estimation subtraction, at each pixel position (u,v), to the pixel value H'(Q)(u,v) of the extracted image and the pixel value R(u,v) of the reference image, which gives the pixel value L'(u,v) at that position.

Because of the correction and complementing processes described later, disturbance estimation subtraction ends up referring not only to each position (u,v) of interest but also to pixel values around it when producing the result. Accordingly, the disturbance estimation image L' also occupies the same shape and size as the extracted image H'(Q) and the reference image R, and is defined in the same coordinate system as the reference image R.

The disturbance estimation image L', whose values are thus defined over a partial region, is obtained by subtracting pixel values (disturbance estimation subtraction), so it may contain values outside the normal pixel value range of an image. For example, when the extracted image H'(Q) and the reference image R are both 8-bit images with pixel values from 0 to 255, L' may contain negative pixel values.

In the illustration of FIG. 2, (4) shows, as a schematic example, the extracted image that results from transforming the distorted quadrilateral of (1) by the matrix H', which is intended to map it to the front-view rectangle shown as the reference image in (5). Light disturbance such as glare or blur causes feature amounts to be computed with values that differ from their true values, so the first feature information contains some error; fewer feature points are matched against the stored reference feature information, and/or the distribution of matched feature points becomes biased. The matrix H' is consequently computed only as an approximation that contains error. The extracted image in (4) is therefore drawn with less distortion than (1) but not as the exact rectangle of (5). (6) shows a schematic example of the disturbance estimation image obtained from this extracted image.

The details of disturbance estimation subtraction are described below.

Disturbance estimation subtraction is computed on the basis of the following model. Suppose that the matrix H', which carries an error due to light disturbance, had instead been obtained as its true value H, so that the extracted image H(Q) under the true value H is in the ideal state of having no positional shift relative to the reference image R. The change in pixel values due to light is then modeled, as in equation eq-0 below, as arising from differences in the ambient light source and the near light source between the reference image R and the extracted image H(Q).
R(u,v)=αH(Q)(u,v)-L(u,v) …(eq-0)
Here, the coefficient α expresses the difference in ambient light, which does not depend on the position (u,v), and L(u,v) is the difference in near light, which does depend on the position (u,v) (that is, the distribution of luminance change caused by the near light falling on the extracted image H(Q)=H(Q)(u,v)).

In practice, however, the true value H is unknown and only the matrix H' with its error is available, so a positional shift between the extracted image H'(Q) and the reference image R must be expected. The true near-light difference L(u,v) is likewise unknown, so an approximation L'(u,v) (the disturbance estimation image), expressed in the coordinates transformed by H', is sought using the same model:
R(u,v)=αH'(Q)(u,v)-L'(u,v)
Rearranging gives equation eq-1.
L'(u,v)=αH'(Q)(u,v)-R(u,v) …(eq-1)

Conceptually, then, disturbance estimation subtraction is expressed by equation eq-1: once the coefficient α expressing the difference in ambient light (the ambient light correction coefficient α) has been obtained, it is used to obtain the disturbance estimation image L' expressing the influence of near light (the result H'(Q)〜R of the binary operation introduced above). How α is obtained is described later.

Suppose equation eq-1 were evaluated as it stands, that is, the right-hand side evaluated at the position (u,v) alone were adopted as the value L'(u,v) at the same position. The positional shift caused by H' would then appear directly in L'(u,v), giving a poor approximation that deviates greatly from the L(u,v) by which the ideal model eq-0 was meant to represent near light. (The resulting L'(u,v) would be dominated not by the luminance change distribution due to near light but by the misalignment of image patterns between H'(Q)(u,v) and R(u,v). A schematic example of such an inappropriate disturbance estimation image is shown as "αH'(Q)-R" in [2] of FIG. 3, described later.)

The actual computation therefore does not evaluate eq-1 directly. Instead, as expressed by the two equations labeled eq-2 below, the left-hand side L'(u,v) at each position (u,v) is obtained as a value interpolated from the distribution f(u,v) of the right-hand side (the pixel difference map f(u,v)) over a predetermined neighborhood region NB(u,v) around that position. The interpolation of the value at a position (u,v) lying inside NB(u,v) from the values of f over NB(u,v) may use a predetermined fitting model, for example plane fitting.
L'(u,v)="value interpolated from the distribution f(u,v) over the region NB(u,v)"
f(u,v)=αH'(Q)(u,v)-R(u,v) …(eq-2)
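The interpolation of eq-2 with plane fitting might be sketched as follows. This is an illustrative sketch only: it assumes that the difference map f is a list of rows indexed as `f_map[v][u]`, that a Boolean mask `usable` marks the pixels allowed into NB(u,v), and that NB(u,v) is a square window of half-width `radius`. None of these choices, nor the names `det3`, `fit_plane`, and `interpolate_l_prime`, come from the specification.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(samples):
    """Least-squares plane f = a*u + b*v + c through (u, v, f) samples."""
    suu = suv = su = svv = sv = n = suf = svf = sf = 0.0
    for u, v, f in samples:
        suu += u * u; suv += u * v; su += u
        svv += v * v; sv += v; n += 1
        suf += u * f; svf += v * f; sf += f
    # Normal equations m * [a, b, c]^T = rhs, solved by Cramer's rule.
    m = [[suu, suv, su], [suv, svv, sv], [su, sv, n]]
    rhs = [suf, svf, sf]
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("neighborhood is degenerate for a plane fit")
    coeffs = []
    for k in range(3):
        mk = [row[:] for row in m]
        for r in range(3):
            mk[r][k] = rhs[r]
        coeffs.append(det3(mk) / d)
    return tuple(coeffs)


def interpolate_l_prime(f_map, usable, u, v, radius):
    """Estimate L'(u, v) from the usable values of f in a square window."""
    samples = []
    for vv in range(max(0, v - radius), min(len(f_map), v + radius + 1)):
        for uu in range(max(0, u - radius), min(len(f_map[0]), u + radius + 1)):
            if usable[vv][uu]:
                samples.append((uu, vv, f_map[vv][uu]))
    a, b, c = fit_plane(samples)
    return a * u + b * v + c
```

With this layout, a pixel whose own value of f is corrupted by misalignment is simply masked out, and its L' value comes from the plane fitted to its usable neighbors.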

The region NB(u,v) used for the interpolation lies within a predetermined neighborhood of the position (u,v) and must be a region in which the near-light difference is properly reflected at corresponding positions, free of the pattern misalignment between H'(Q)(u,v) and R(u,v) described above. It may be defined as a region that does not contain feature points or their vicinity in both images (or at least in the extracted image), and/or in which the distribution f(u,v) determined from both images and the coefficient α is judged to be flat. Because L'(u,v) is obtained as a fitted value by interpolation over the neighborhood NB(u,v) chosen for each position (u,v), the map L'(u,v) defined over the whole range of the target (the same range as the reference image R) can represent near-light disturbance, such as a gradation, even when it occurs non-uniformly over that range.

When feature point vicinities are excluded, it suffices to exclude at least those in the extracted image H'(Q). Those in the reference image R are expected to overlap those in H'(Q), so they may or may not be excluded. Excluding feature point vicinities has a further benefit: disturbances such as defocus blur can occur near feature points, and excluding them allows NB(u,v) to be defined properly without confusing such disturbances with the disturbance due to light (near light).

When the reference image R(u,v) is stored in advance in the storage unit 7, information on the regions that are not feature points or their vicinity may be stored in association with it and used when determining the region NB(u,v).

Whether the distribution f(u,v) is flat may be judged by, for example, the fitting error of the same fitting model used for the interpolation (the least-squares error in the case of plane fitting).

When the position (u,v) of interest itself belongs to a region judged to reflect the near-light difference properly, that is, a region that is not a feature point or its vicinity and in which f(u,v) is judged to be flat, L'(u,v) may be obtained directly from equation eq-1 without the interpolation of equation eq-2.

The coefficient α expressing the difference in ambient light, used in disturbance estimation subtraction, may be obtained by referring to both the extracted image H'(Q) and the reference image R. Over a region FR that is not in the vicinity of a feature point in both images (or at least in the extracted image) and that is judged to be flat in both images, α is taken as the value of the coefficient β that minimizes the sum of absolute differences |βH'(Q)(u,v)-R(u,v)| over the positions (u,v) in FR. The criterion is not limited to this form: α may also be obtained as the value of β that minimizes the sum of squared differences, or the sum of any other distance d(βH'(Q)(u,v), R(u,v)) between βH'(Q)(u,v) and R(u,v).
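Two of the criteria named above admit simple solutions, sketched below under assumptions. The pixel values of H'(Q) and R over the region FR are assumed to have been collected into two equal-length lists, and the function names are hypothetical. For the sum of squared differences, setting the derivative with respect to β to zero gives β = Σ H'(Q)·R / Σ H'(Q)². For the sum of absolute differences, Σ|βh-r| = Σ h|β-r/h| for h>0, so a minimizer is a weighted median of the ratios r/h with weights h. The choice of a weighted median is a standard technique introduced here for illustration; the specification does not state how the minimization is carried out.

```python
def alpha_least_squares(h_vals, r_vals):
    """beta minimizing sum((beta*h - r)^2) over the flat region FR."""
    num = sum(h * r for h, r in zip(h_vals, r_vals))
    den = sum(h * h for h in h_vals)
    if den == 0:
        raise ValueError("extracted image is zero everywhere in FR")
    return num / den


def alpha_sad(h_vals, r_vals):
    """beta minimizing sum(|beta*h - r|): weighted median of r/h, weights h.

    Pixels with h == 0 contribute a constant |r| and are ignored.
    """
    pairs = sorted((r / h, h) for h, r in zip(h_vals, r_vals) if h > 0)
    if not pairs:
        raise ValueError("no positive extracted-image values in FR")
    half = sum(w for _, w in pairs) / 2.0
    acc = 0.0
    for ratio, w in pairs:
        acc += w
        if acc >= half:
            return ratio
```

The absolute-difference criterion is less sensitive to a few pixels in FR that are still affected by strong near light, whereas the squared criterion is pulled toward them.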

The processing of the disturbance estimation unit 41 has been described in detail above; a schematic example follows. FIG. 3 shows a schematic example of the processing by the processing unit 4 (disturbance estimation unit 41 and disturbance correction unit 42) in the vicinity of one feature point. With reference to FIG. 3, the disturbance estimation subtraction in process (6) of the disturbance estimation unit 41 is explained in particular. In panels [1] to [3] of FIG. 3, the pixel values of the two-dimensional (u,v) image information handled by the processing unit 4 are shown, for brevity and without loss of generality, as profiles cut along a one-dimensional line segment (here, along the u axis). Each profile shows, in one dimension, a local image region consisting of one feature point, such as a corner formed by intersecting edges, and its vicinity, and illustrates how the processing unit 4 processes each image in that local region.

In [1] of FIG. 3, the extracted image H'(Q) and the reference image R are drawn with solid lines. A feature point such as a corner is found at position u₀ in the reference image R, whereas in the extracted image H'(Q) the same feature point is found at a different position u₁, shifted because the matrix H' deviates from the true value H under the influence of light disturbance. In the reference image R the pixel values are nearly constant on either side of the feature point position u₀, whereas in the extracted image H'(Q) they vary linearly on either side of the corresponding position u₁ because of near light. The two images also differ in overall brightness because of the difference in ambient light; as the dotted line in [1] shows, this difference is corrected in the image αH'(Q) obtained by multiplying H'(Q) by the coefficient α.

[2] of FIG. 3 shows a schematic example of the result of disturbance estimation subtraction in this local region. The solid line labeled "αH'(Q)-R" is the result of subtracting the reference image R as it is from the corrected image αH'(Q) of [1] (that is, of applying equation eq-1 directly). Because of the shift between the feature point positions u₀ and u₁, differences in image pattern produce a pulse near u₀ and u₁. Elsewhere the result captures the near-light distribution more or less correctly as a linear change, but in the pulse region it fails to capture it.

Accordingly, disturbance estimation subtraction applies the interpolation of equation eq-2 rather than equation eq-1 directly. The result, drawn as the broken line in [2] of FIG. 3, is a disturbance estimation image L' that has no pulse and correctly captures the near-light distribution as a linear change. [3] of FIG. 3 is the schematic example, corresponding to [1] and [2], for the disturbance correction unit 42, which performs the stage following the disturbance estimation unit 41; it is explained after the specific processing of the disturbance correction unit 42 has been described.

<Disturbance correction unit 42>
The disturbance correction unit 42 obtains the processed image by performing the information processing shown as (7) and (8) in FIG. 2. First, as shown as process (7), it coordinate-transforms the disturbance estimation image L' obtained by the disturbance estimation unit 41 by multiplying it by the inverse H'^-1 of the matrix H' representing the first position and orientation obtained by the first posture estimation unit 3, giving the disturbance correction image H'^-1(L'). As already explained, multiplication by the matrix H' represents a coordinate transformation; the disturbance correction image H'^-1(L') obtained by the inverse transformation is the disturbance estimation image L', which was defined in the coordinate system of the reference image R, returned to the coordinate system of the captured image Q. In other words, H'^-1(L') estimates, in the coordinate system of the captured image Q, the pixel value increment due to the presence of near light. In general (except in special cases), H'^-1(L') is defined over part of the captured image Q rather than all of it.

Next, as shown as process (8), the disturbance correction unit 42 uses the disturbance correction image H'^-1(L') obtained in process (7), which expresses the influence of near light, and the coefficient α obtained by the disturbance estimation unit 41, which expresses the influence of ambient light, to obtain the processed image as the captured image Q with the influence of near light and ambient light removed or reduced. Specifically, following the model already described for equation eq-0 and the subsequent equations, the processed image is given by the following expression.
αQ(u,v)-H'^-1(L')(u,v)
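Processes (7) and (8) can be sketched together as follows, purely as an illustration under assumptions. Images are assumed to be lists of rows, H' is assumed to map captured-image coordinates to reference-image coordinates, L' is assumed to be defined on a rectangle starting at the origin of the reference coordinate system, and nearest-neighbor sampling is assumed. The names `apply_h` and `processed_image` are hypothetical. Sampling L' at the point H'(x,y) for each pixel (x,y) of Q is one way of evaluating H'^-1(L') in the coordinate system of Q.

```python
def apply_h(m, x, y):
    """Apply a planar projective transformation (3x3, list of rows) to (x, y)."""
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return ((m[0][0] * x + m[0][1] * y + m[0][2]) / w,
            (m[1][0] * x + m[1][1] * y + m[1][2]) / w)


def processed_image(q, l_prime, h_prime, alpha):
    """Compute alpha*Q - H'^-1(L') in the coordinate system of Q.

    Pixels of Q that do not map into the region where L' is defined are
    returned as None, since the processed image is defined only on the part
    of Q occupied by the target.
    """
    out = []
    for y in range(len(q)):
        row = []
        for x in range(len(q[0])):
            u, v = apply_h(h_prime, x, y)
            ui, vi = round(u), round(v)
            if 0 <= vi < len(l_prime) and 0 <= ui < len(l_prime[0]):
                row.append(alpha * q[y][x] - l_prime[vi][ui])
            else:
                row.append(None)
        out.append(row)
    return out
```

The result may contain values outside the usual pixel range, so a practical implementation would presumably clip or rescale before feature extraction; the specification text here does not say which.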

That is, the captured image Q is multiplied by the coefficient α to correct its ambient light to the state of the reference image R, giving "αQ", and the disturbance correction image H'^-1(L') is then subtracted to remove or reduce the influence of near light as well, giving the processed image αQ-H'^-1(L').

Like the disturbance correction image H'^-1(L') obtained in process (7), the processed image αQ-H'^-1(L') obtained in process (8) is defined in the coordinate system of the captured image Q and, in general (except in the special case where the target occupies the whole captured image Q), over part of Q rather than all of it. That part is the region of the captured image Q in which the first posture estimation unit 3 detected the target (o_[recognition]).

[3] of FIG. 3 shows the counterpart of the processed image αQ-H'^-1(L') for the examples of [1] and [2]. The processed image αQ-H'^-1(L') is defined in the coordinate system of the captured image Q; what [3] shows is the image αH'(Q)-L' obtained by multiplying the processed image by the matrix H' to transform it into the coordinate system of the reference image R. Comparing it with the reference image R in [1] shows that the two profiles have similar shapes: in αH'(Q)-L' the influence of ambient light and near light has been removed or reduced, and only the positional shift due to H' remains. The corresponding processed image αQ-H'^-1(L'), related to it by the coordinate transformation, is likewise obtained with the influence of near light and ambient light removed or reduced.

An embodiment different from that of FIG. 2, described later with reference to FIG. 4, uses the image αH'(Q)-L' shown in [3] of FIG. 3 as the "processed image". As already explained for [3] of FIG. 3, the main difference between the two embodiments is whether the processed image is obtained as αQ-H'^-1(L') in the coordinate system of the captured image Q (FIG. 2) or as αH'(Q)-L' in the coordinate system of the reference image R (FIG. 4). The way the influence of near light and ambient light is modeled, and the computation that removes it, are substantially the same.

<Second feature calculation unit 5>
The second feature calculation unit 5 performs feature point detection and feature amount calculation on the processed image obtained by the processing unit 4 as described above, and obtains the resulting set of feature point and feature amount pairs as the second feature information F2 of the processed image. Feature point detection and feature amount calculation may use the same processing as in the first feature calculation unit 2.

In one embodiment, the second feature calculation unit 5 computes the second feature information over the whole processed image, as described above. In another embodiment, the amount of computation may be reduced as follows.

The second feature information F2 obtained by the second feature calculation unit 5 corresponds to F1_[match] with the influence of light disturbance removed or reduced, where F1_[match] is the part of the first feature information F1 obtained by the first feature calculation unit 2 that matched the feature information F[o_[recognition]] stored in the storage unit 7 for the target (o_[recognition]) recognized by the first posture estimation unit 3. Some feature point and feature amount pairs in the matched first feature information F1_[match], however, are unaffected by light disturbance and are (substantially) the same as pairs in the second feature information F2. In that case, when the second feature calculation unit 5 computes F2, those pairs need not be computed again and may be taken from the already obtained F1_[match].

Specifically, the neighborhood region of each feature point in the matched first feature information F1_[match] (the local region used to compute its feature amount) is compared between the captured image and the processed image. When the two neighborhood regions are judged not to differ, the feature amount for that feature point is not recomputed: the neighborhood region is excluded from the regions referred to when computing F2, and the feature amount already present in F1_[match] is reused. Whether the neighborhood regions differ may be judged by thresholding a measure such as SAD (sum of absolute differences) or SSD (sum of squared differences).
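The reuse decision described above could look like the following sketch. It assumes, for illustration only, that both images are lists of rows in the coordinate system of the captured image, that the neighborhood region is a square window of half-width `radius` lying entirely inside both images, and that the threshold is chosen by the implementer. The names `patch`, `sad`, `ssd`, and `can_reuse_feature` are hypothetical.

```python
def patch(img, cx, cy, radius):
    """Flatten the square window of half-width radius centered on (cx, cy)."""
    return [img[y][x]
            for y in range(cy - radius, cy + radius + 1)
            for x in range(cx - radius, cx + radius + 1)]


def sad(a, b):
    """Sum of absolute differences between two equal-length pixel lists."""
    return sum(abs(p - q) for p, q in zip(a, b))


def ssd(a, b):
    """Sum of squared differences between two equal-length pixel lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def can_reuse_feature(captured, processed, keypoint, radius, threshold, measure=sad):
    """True when the neighborhood is judged unchanged, so the feature amount
    already computed for this feature point in F1_[match] can be reused."""
    cx, cy = keypoint
    diff = measure(patch(captured, cx, cy, radius),
                   patch(processed, cx, cy, radius))
    return diff <= threshold
```

When `can_reuse_feature` returns False, only the feature amount would be recomputed at the already detected feature point, in line with the handling of differing neighborhood regions.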

On the other hand, when the neighborhood regions are determined to differ, the corresponding feature point has already been obtained in the matched first feature information F1_[match]. When obtaining the second feature information F2, detecting the feature point again in that neighborhood region may therefore be omitted, and only the calculation of the feature amount for that region is additionally performed. The pair consisting of the feature point already detected in the first feature information and the feature amount recalculated in the neighborhood region determined to differ is then adopted as part of the second feature information F2.

Furthermore, regions of the processed image that do not belong to the neighborhood region of any feature point in the matched first feature information F1_[match] may yield new feature point and feature amount pairs, so new feature point detection and feature amount calculation are attempted there. The second feature information F2 is then output as the combination of the pairs taken over from the matched first feature information F1_[match] with calculation omitted and the pairs obtained by the new feature point detection and feature amount calculation.
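The three cases described for building F2 (reuse, recalculate the feature amount only, detect anew) can be sketched as below. This is an illustrative outline under stated assumptions: the detector, the descriptor, and the difference test are passed in as callables, and the names `build_second_features`, `is_unchanged`, `describe`, and `detect_new` are hypothetical rather than taken from the text.

```python
def build_second_features(matched_f1, is_unchanged, describe, detect_new):
    """Assemble F2 as a list of (feature_point, feature_amount) pairs.

    matched_f1   : list of (point, amount) pairs, i.e. F1_[match]
    is_unchanged : point -> bool, True when the neighborhood of the point
                   does not differ between captured and processed image
    describe     : point -> feature amount computed on the processed image
    detect_new   : list of known points -> new points detected on the
                   processed image outside the known neighborhoods
    """
    f2 = []
    for point, amount in matched_f1:
        if is_unchanged(point):
            # Case 1: reuse the pair as is, no recalculation.
            f2.append((point, amount))
        else:
            # Case 2: keep the detected point, recalculate only the amount.
            f2.append((point, describe(point)))

    # Case 3: regions outside all known neighborhoods may yield new pairs.
    known_points = [point for point, _ in matched_f1]
    for point in detect_new(known_points):
        f2.append((point, describe(point)))
    return f2
```

Passing the callables in keeps the sketch independent of any particular feature detector or descriptor.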

<Second posture estimation unit 6>
The second posture estimation unit 6 matches feature points, based on their corresponding feature amounts, between the second feature information F2 obtained by the second feature calculation unit 5 and the feature information F[o_[recognition]] stored in the storage unit 7 for the target o_[recognition] already recognized by the first posture estimation unit 3. It obtains a planar projective transformation matrix H as the coordinate transformation relating the matched feature points, and outputs the matrix H as the second position and orientation estimation result. The matrix H is calculated in the same way as described for the first posture estimation unit 3.

Because the second feature information F2 used to calculate the matrix H is a more accurate version of the first feature information F1, with the influence of light disturbance excluded or reduced, the second position and orientation estimation result expressed by the matrix H is likewise a more accurate version of the first position and orientation estimation result expressed by the matrix H'. In FIG. 2, (9) and (10) show the second feature information F2 and the second position and orientation information H (matrix H), respectively.

<Storage unit 7>
As already described for the first posture estimation unit 3, the processing unit 4, and others, the storage unit 7 stores in advance the feature information F[o] of each target o, the corresponding reference image R[o], and the like, and provides this information to each unit that needs it. The reference image R[o] is preferably prepared by imaging the target o in a light source environment free from the influence of near light (such as shine).

As described above, according to the present invention, the relative positional relationship of a target can be recognized by imaging the target with the imaging unit 1. In particular, because the feature amounts are calculated after the light disturbance is removed, highly accurate recognition is possible. Supplementary explanations of modifications and the like of the present invention follow.

(1) FIG. 4 illustrates an embodiment different from the one described with FIG. 2, mainly with regard to the processing in the processing unit 4. The information indicated by (1) to (6) in FIG. 4 is the same as (1) to (6) in FIG. 2, and in this other embodiment the processing of the imaging unit 1, the first feature calculation unit 2, and the first posture estimation unit 3, which precede the processing unit 4, is also the same as in the embodiment of FIG. 2. As already mentioned in [3] of FIG. 3, the embodiment of FIG. 4 differs in the coordinate system in which the processed image is obtained. The embodiment of FIG. 4 has the advantage that the calculation shown in (7) of FIG. 2 becomes unnecessary.

In the embodiment of FIG. 4, the disturbance correction unit 42 does not obtain the disturbance correction image H'^-1(L) shown in (7) of FIG. 2. Instead, it adopts the disturbance estimation image L' obtained by the disturbance estimation unit 41, shown in (6) common to FIG. 4 and FIG. 2, as the disturbance correction image in the coordinate system of the reference image R. As shown in (8A) of FIG. 4, the disturbance correction unit 42 then obtains the processed image αH'(Q)-L', which is the extracted image H'(Q) in the coordinate system of the reference image R with the influence of light disturbance excluded or reduced.

Furthermore, for the second feature calculation unit 5 and the second posture estimation unit 6, which follow the processing unit 4, the processed image αH'(Q)-L' may be coordinate-transformed to obtain the same image αQ-H'^-1(L') as in the embodiment of FIG. 2 and then processed as in that embodiment, or the processing shown in FIG. 4 may be used. The processes (9A) and (10A) shown in FIG. 4 are as follows.

First, as process (9A), the second feature calculation unit 5 calculates second feature information F2A from the processed image αH'(Q)-L', which lies in the coordinate system of the reference image R. Here, the first feature information F1 may be transformed into the coordinate system of the reference image R by the coordinate transformation given by the matrix H' (the first position and orientation) and compared with the second feature information F2A, so that the same calculation reduction as described for the embodiment of FIG. 2 (omitting the detection of already detected feature points and the calculation of already calculated feature amounts) can be applied. Next, the second posture estimation unit 6 obtains a planar projective transformation matrix H_2A that relates corresponding feature points between the second feature information F2A and the feature information F[o_[recognition]] stored in the storage unit 7 for the target o_[recognition] already recognized by the first posture estimation unit 3. By obtaining the matrix H_2A such that the product H_2A H' represents the transformation from the coordinate system of the captured image Q to the coordinate system of the reference image R (a transformation in which the error contained in H' is reduced), the second posture estimation unit 6 can obtain the matrix H, the final second position and orientation estimation result, as the product H = H_2A H'.
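The product H = H_2A H' can be illustrated with plain 3x3 matrices. The sketch below only demonstrates the composition rule: applying H' and then H_2A to a point gives the same result as applying the product once. The helper names `matmul3` and `apply_homography` and the numeric matrices in the usage note are made-up illustrations, not values from the embodiment.

```python
def matmul3(a, b):
    """Product a @ b of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply_homography(h, x, y):
    """Map the point (x, y) through the planar projective transformation h."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)  # divide by the homogeneous coordinate


def compose_second_pose(h_2a, h_prime):
    """Second position and orientation H as the product H_2A H'.

    h_prime maps captured-image coordinates to reference-image coordinates
    (first estimate); h_2a is the residual correction estimated in the
    reference-image coordinate system.
    """
    return matmul3(h_2a, h_prime)
```

For example, with `h_prime = [[2, 0, 10], [0, 2, 20], [0, 0, 1]]` and a small corrective shift `h_2a = [[1, 0, 1], [0, 1, -1], [0, 0, 1]]`, `compose_second_pose(h_2a, h_prime)` maps a captured-image point to the same place as applying `h_prime` followed by `h_2a`. The order of the factors matters, since matrix products do not commute in general.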

(2) As a modification from the same viewpoint, the calculation of the disturbance estimation image by disturbance estimation subtraction using the extracted image and the reference image, shown in (4), (5), and (6) common to FIG. 2 and FIG. 4, may be performed in the coordinate system of the captured image Q instead of the coordinate system of the reference image R. That is, the extracted image Q_[extraction] is obtained by cutting out of the captured image Q the region occupied by the reference image R under the coordinate transformation H'^-1(R), and the reference image R is subjected to the same transformation to obtain the coordinate-transformed reference image H'^-1(R). The disturbance estimation image L' = Q_[extraction]〜H'^-1(R) may then be obtained by the disturbance estimation subtraction "Q_[extraction]〜H'^-1(R)" of the coordinate-transformed reference image H'^-1(R) from the extracted image Q_[extraction]. In exactly the same way, the disturbance estimation image L' may be obtained in any common coordinate system, not limited to that of the captured image or the reference image, and the corresponding processed image obtained from it.

(3) The processing unit 4 may obtain the processed image independently for each channel of the predetermined color space in which the captured image and the reference image are represented. In this case, the image L' and the coefficient α also capture information on the color composition of the light source. For example, if the captured image is in RGB space, processed images for the R channel, the G channel, and the B channel are obtained independently, and the RGB processed image obtained by combining them may be passed to the second feature calculation unit 5 and the subsequent processing.
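Per-channel processing can be sketched as below, using the processed-image form α·(extracted image) − L' that appears in this description. The sketch assumes each channel is a 2-D list of numbers and that α and L' have already been estimated per channel; the function names are hypothetical, and clamping to a valid pixel range is deliberately left out.

```python
def correct_channel(extracted, disturbance, alpha):
    """Processed channel: alpha * extracted - disturbance, pixel by pixel."""
    return [[alpha * q - l for q, l in zip(row_q, row_l)]
            for row_q, row_l in zip(extracted, disturbance)]


def correct_rgb(extracted_rgb, disturbance_rgb, alpha_rgb):
    """Correct R, G, and B independently, then merge into RGB pixels.

    Each argument is a 3-element sequence ordered (R, G, B): three channel
    images for the first two, three scalar coefficients for the last.
    """
    channels = [correct_channel(extracted_rgb[c], disturbance_rgb[c], alpha_rgb[c])
                for c in range(3)]
    height = len(channels[0])
    width = len(channels[0][0])
    return [[tuple(channel[y][x] for channel in channels) for x in range(width)]
            for y in range(height)]
```

Because the three coefficients and the three disturbance images are independent, a colored light source shows up as different values of α and L' across the channels.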

(4) The information processing apparatus 10 may further perform, based on the estimated second position and orientation of the target, processing that presents information according to the recognition result of the target in the captured image, for example an augmented reality display for the target in the captured image. In this case, the superimposed display content and the like used for the augmented reality display may also be stored in the storage unit 7 in advance.

(5) The information processing apparatus 10 may obtain the second position and orientation H(t) of the target in the captured image Q(t) (the frame Q(t) of a video at each time t) in real time at a predetermined frame rate. In this case, besides an embodiment in which each unit of the information processing apparatus 10 performs the processing described above at every time t, an embodiment that simplifies the processing of the first posture estimation unit 3 as follows is also possible. For the initial time t=1, each unit of the information processing apparatus 10 performs the processing described above, so the first posture estimation unit 3 also obtains the first position and orientation H'(1) at time t=1. For subsequent times t≧2, the first posture estimation unit 3 does not calculate the first position and orientation H'(t) at time t. Instead, the value of the second position and orientation H(t-1) already obtained by the second posture estimation unit 6 for the immediately preceding time t-1 may be adopted as the value of H'(t). As already explained, the first position and orientation need only be an approximation of the second position and orientation. Assuming that the position and orientation of the target does not change drastically between adjacent times t-1 and t, the second position and orientation H(t-1) at time t-1 approximates the second position and orientation H(t) at the current time t and can therefore be adopted as the value of H'(t). Similarly, the second position and orientation H(t-n) at a predetermined past time t-n (n≧1), not necessarily the immediately preceding one, may be adopted as the first position and orientation H'(t) at the current time t.
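The frame-by-frame simplification can be sketched as a loop that runs the full first estimation only on the first frame and afterwards feeds the previous second position and orientation back in as the first one. The function and parameter names (`track_poses`, `estimate_first_pose`, `refine_pose`) are hypothetical, and the sketch covers only the n=1 case.

```python
def track_poses(frames, estimate_first_pose, refine_pose):
    """Return the second position and orientation H(t) for every frame.

    estimate_first_pose : frame -> H'(t), the full feature-matching based
                          estimate (used only for the initial frame)
    refine_pose         : (frame, H'(t)) -> H(t), standing in for the
                          processing, second feature calculation and second
                          posture estimation steps
    """
    poses = []
    previous_second_pose = None
    for frame in frames:
        if previous_second_pose is None:
            # t = 1: no past result exists, so estimate H'(1) in full.
            first_pose = estimate_first_pose(frame)
        else:
            # t >= 2: adopt H(t-1) as H'(t) and skip the first estimation.
            first_pose = previous_second_pose
        second_pose = refine_pose(frame, first_pose)
        poses.append(second_pose)
        previous_second_pose = second_pose
    return poses
```

This relies on the stated assumption that the target does not move drastically between adjacent frames. A practical system would presumably also fall back to the full estimation when tracking is lost, which the text does not discuss.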

(6) The first posture estimation unit 3 has been described as first recognizing the target, by identifying the target in the captured image from among the M predetermined targets o stored in the storage unit 7, and then estimating the first position and orientation of the target in the captured image. When the target is fixed in advance to a single type and the storage unit 7 stores feature information for that one predetermined target only, the process of recognizing the target from among multiple candidates may be omitted. In that case, only the first position and orientation is estimated, after confirming by the threshold determination on the matching degree described earlier that the target is detected in the captured image.

(7) The second posture estimation unit 6 may be omitted from the information processing apparatus 10. In this case, the second feature information obtained from the second feature calculation unit 5 is the output of the information processing apparatus 10, which then functions as an apparatus that acquires the second feature information of the target in the captured image with the influence of light disturbance excluded or reduced. The acquired second feature information can also be used for purposes other than position and orientation estimation (for example, collecting feature information when building a database of target feature information).

(8) As another embodiment for obtaining the coefficient α expressing the difference in ambient light in the processing unit 4, the coefficient α may be obtained from the extracted image H'(Q) alone, without excluding the neighborhoods of feature points and the like as described earlier. For example, assuming that the reference image R satisfies the gray world assumption, that is, that the average of its pixel values is gray (this is given as known information about the reference image R), the deviation of the average pixel value of the extracted image H'(Q) from that gray may be obtained as the coefficient α. Similarly, using the max white assumption, that is, assuming that the maximum value in the reference image R is white, the deviation of the maximum pixel value of the extracted image H'(Q) from white may be obtained as the coefficient α.
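The two heuristics can be sketched as follows. The text does not state how the "deviation" is measured; the sketch assumes a multiplicative gain (target value divided by observed value), which is consistent with α multiplying the extracted image in the processed image αH'(Q)-L'. The default values 128 for gray and 255 for white assume 8-bit pixels, and the function names are hypothetical.

```python
def alpha_gray_world(extracted, gray=128.0):
    """Gain that brings the mean of the extracted image to the assumed gray.

    Assumption: the deviation from gray is expressed as the ratio
    gray / mean; the source text does not fix this form.
    """
    pixels = [p for row in extracted for p in row]
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        raise ValueError("extracted image is entirely black")
    return gray / mean


def alpha_max_white(extracted, white=255.0):
    """Gain that brings the maximum of the extracted image to white.

    Assumption: the deviation from white is expressed as white / maximum.
    """
    maximum = max(p for row in extracted for p in row)
    if maximum == 0:
        raise ValueError("extracted image is entirely black")
    return white / maximum
```

Either function works on one channel at a time, so it can be combined with per-channel processing when the light source is colored.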

(9) The information processing apparatus 10 can be realized as a computer of general configuration. That is, the information processing apparatus 10 can be configured by a general computer comprising a CPU (central processing unit), a main storage device that provides a work area for the CPU, an auxiliary storage device such as a hard disk or SSD, an input interface such as a keyboard, mouse, or touch panel that receives input from the user, a communication interface for connecting to a network and communicating, a display, a camera, and a bus connecting these. The processing of each unit of the information processing apparatus 10 can be realized by the CPU reading and executing a program for that processing, but any part of the processing may instead be realized in a separate dedicated circuit or the like.

DESCRIPTION OF SYMBOLS: 10 ... information processing apparatus, 1 ... imaging unit, 2 ... first feature calculation unit, 3 ... first posture estimation unit, 4 ... processing unit, 5 ... second feature calculation unit, 6 ... second posture estimation unit, 7 ... storage unit

Claims

An information processing apparatus comprising:
a first feature calculation unit that calculates first feature information from a captured image obtained by imaging;
a first posture estimation unit that estimates a first position and orientation of a target captured in the captured image, relative to the camera that obtained the captured image, by collating the first feature information with given reference feature information;
a processing unit that estimates, in a common coordinate system reached by a coordinate transformation corresponding to the first position and orientation, the difference between an extracted image of the target in the area occupied by the target in the captured image and a given reference image of the target as being caused by a difference in light source environment, and that obtains a processed image of the captured image by a process that corrects, based on the estimation result in the common coordinate system, a first light source environment in the captured image to a second light source environment in the reference image; and
a second feature calculation unit that calculates, from the processed image, second feature information of the target as it would be if the captured image were in the second light source environment.

The information processing apparatus according to claim 1, wherein the processing unit estimates the difference caused by the difference in light source environment as including a difference in ambient light that is uniform over the entire area of the extracted image and the reference image.

The information processing apparatus according to claim 2, wherein the processing unit estimates the difference in ambient light in a region where pixel values are determined to be flat in both the extracted image and the reference image.

The information processing apparatus according to claim 1, wherein the processing unit estimates the difference caused by the difference in light source environment as including an influence distribution of near light that is present in the extracted image but absent from the reference image.

The information processing apparatus according to claim 2, wherein the processing unit estimates the difference caused by the difference in light source environment as including an influence distribution of near light that is present in the extracted image but absent from the reference image, and, after correcting the extracted image by estimating and removing the difference in ambient light, estimates the influence distribution of the near light at each position of the corrected extracted image by interpolation from the peripheral region of that position in a pixel difference value map obtained by subtracting the reference image from the corrected extracted image.

The information processing apparatus according to claim 5, wherein the processing unit determines the peripheral region used for the interpolation as a flat region in the pixel difference value map.

The information processing apparatus according to claim 1, wherein the first feature information and the second feature information are each calculated as a set of pairs of feature points and feature amounts, and
the second feature calculation unit evaluates the difference between the processed image and the captured image in the neighborhood region of each feature point in the first feature information and, when it determines that there is no difference, excludes that neighborhood region from the region referred to for calculating the second feature information while adopting the pair of feature point and feature amount in the first feature information corresponding to that neighborhood region as part of the second feature information.

The information processing apparatus according to any one of claims 1 to 7, wherein the first feature information and the second feature information are each calculated as a set of pairs of feature points and feature amounts, and
the second feature calculation unit evaluates the difference between the processed image and the captured image in the neighborhood region of each feature point in the first feature information and, when it determines that there is a difference, adopts as part of the second feature information the pair of the feature point in the first feature information corresponding to that neighborhood region and the feature amount calculated from that neighborhood region in the processed image.

The information processing apparatus according to claim 1, further comprising a second posture estimation unit that estimates a second position and orientation of the target captured in the captured image, relative to the camera that obtained the captured image, by collating the second feature information with given reference feature information.

The information processing apparatus according to claim 9, wherein the information processing apparatus processes in real time captured images that are the frames of a video at each time, and
the first posture estimation unit omits estimating the first position and orientation of the target captured in the captured image, relative to the camera that obtained the captured image, by collating the first feature information with the given reference feature information, and instead adopts, as the first position and orientation at the current time, the second position and orientation already estimated by the second posture estimation unit at a past time.

A program causing a computer to function as the information processing apparatus according to any one of claims 1 to 10.