JP2012164188A - Image processing apparatus, image processing method and program

Info

Publication number: JP2012164188A
Application number: JP2011024870A
Authority: JP
Inventors: Takehiro Hamada; 健宏濱田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2011-02-08
Filing date: 2011-02-08
Publication date: 2012-08-30

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus, an image processing method and a program which are capable of estimating a position/attitude of a specific area in an image with high accuracy. SOLUTION: The image processing apparatus includes: a feature point extraction part 11 for extracting a feature point of an input image I1; a matching part 12 for determining correspondence of the feature point with a reference image R; a homography calculation part 13 for calculating a projective relationship between the input image I1 and the reference image R based on the correspondence; and an image conversion part 14 for converting at least a part of the input image I1 based on the projective relationship. The homography calculation part 13 calculates a homography matrix H1 for the input image I1 based on the correspondence of the feature point with the reference image R, calculates a homography matrix H2, based on the correspondence of the feature point with the reference image R, for the input image I1 converted by the image conversion part 14 on the basis of the homography matrix H1, and recalculates the projective relationship between the input image I1 and the reference image R based on the homography matrices H1 and H2.

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

In recent years, demand for augmented reality (AR) has been growing. Augmented reality is provided, for example, by compositing an additional image such as a computer-generated image onto an input image, using a specific region (or a specific object) in the input image as a reference. In this case, the quality of the augmented reality depends heavily on how accurately the position and orientation of the specific region in the input image are estimated.

In such image processing, a reference image corresponding to a marker is used to designate the specific region (marker) in the input image. Feature points of the input image are extracted, the correspondence of feature points with the reference image is determined, and the projective relationship between the input image and the reference image is calculated based on that correspondence, whereby the position and orientation of the marker in the input image are estimated. The additional image is then converted based on the projective relationship, and the converted additional image is composited onto the input image using the marker as a reference.

However, the estimated position and orientation of the marker contain errors. In particular, when the input image is a moving image, the estimate contains an error in every frame even if the camera and the marker are stationary, so the additional image appears to tremble slightly in the composite image and the quality of the augmented reality is degraded.

One cause of the estimation error is that the orientation of the image differs between the marker in the input image and the reference image. Feature points of an image are extracted using various filters, such as linear and nonlinear filters, and the filter output changes with the orientation of the image. For example, between an image and a rotated copy of that image, the coordinate values of feature points may differ, or the same feature point (corresponding point) may not be found in both images.

If many corresponding points are found, the coordinate errors can be averaged out, which suppresses their influence on the estimation error. However, if the orientation of the image differs between the marker and the reference image, many corresponding points cannot be found, the influence on the estimation error cannot be suppressed, and a good estimate cannot be obtained.

Accordingly, the present invention aims to provide an image processing apparatus, an image processing method, and a program capable of estimating the position and orientation of a specific region in an image with high accuracy.

According to one aspect of the present invention, there is provided an image processing apparatus including: a feature point extraction unit that extracts feature points of an input image; a correspondence determination unit that determines the correspondence of feature points with a reference image; a projective relationship calculation unit that calculates the projective relationship between the input image and the reference image based on the correspondence; and an image conversion unit that converts at least a part of the input image based on the projective relationship. The projective relationship calculation unit calculates a first projective relationship for the input image based on the correspondence of feature points with the reference image, calculates a second projective relationship, based on the correspondence of feature points with the reference image, for the input image converted by the image conversion unit based on the first projective relationship, and recalculates the projective relationship between the input image and the reference image based on the first and second projective relationships.

The image conversion unit may convert at least a part of the input image, based on at least the first projective relationship, so that the orientation of the specific region in the input image approaches the orientation of the reference image.

The projective relationship calculation unit may further calculate a third projective relationship, based on the correspondence of feature points with the reference image, for the input image converted by the image conversion unit based on the first and second projective relationships, and may recalculate the projective relationship between the input image and the reference image based on the first to third projective relationships.

The feature point extraction unit may extract feature points using different methods when the first projective relationship is calculated and when the second projective relationship is calculated.

The correspondence determination unit may determine the correspondence using different methods when the first projective relationship is calculated and when the second projective relationship is calculated.

When the first projective relationship is calculated, a method that is more robust to changes in the orientation of the specific region in the input image may be used than when the second projective relationship is calculated.

The image conversion unit may convert the image of the specific region of the input image that corresponds to the reference image.

The image processing apparatus may further include an image composition unit that composites an image onto the input image. In that case, the image conversion unit may further convert an additional image based on at least the first and second projective relationships, and the image composition unit may composite the converted additional image onto the specific region of the input image that corresponds to the reference image.

According to another aspect of the present invention, there is provided an image processing method including: calculating a first projective relationship for an input image based on the correspondence of feature points with a reference image; converting at least a part of the input image based on the first projective relationship; calculating a second projective relationship for the converted input image based on the correspondence of feature points with the reference image; and recalculating the projective relationship between the input image and the reference image based on the first and second projective relationships.

According to another aspect of the present invention, there is provided a program for causing a computer to execute the above image processing method. The program may be provided on a computer-readable recording medium or via communication means or the like.

According to the present invention, it is possible to provide an image processing apparatus, an image processing method, and a program capable of estimating the position and orientation of a specific region in an image with high accuracy.

FIG. 1 is a diagram showing the concept of an image processing method according to an embodiment of the present invention. FIG. 2 is a block diagram showing the configuration of an image processing apparatus according to the embodiment of the present invention. FIG. 3 is a flowchart showing the operation of the image processing apparatus. FIG. 4A is a diagram (1/2) showing an example of the image processing method. FIG. 4B is a diagram (2/2) showing an example of the image processing method. FIG. 5 is a flowchart showing a general image processing method for compositing an image onto a specific region in an image.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are given the same reference numerals, and redundant description is omitted.

[1. General image processing method]
First, a general image processing method for compositing an image onto a specific region in an image is described with reference to FIG. 5. FIG. 5 is a flowchart showing a general image processing method for compositing an image onto a specific region in an image.

The image processing method is performed by an image processing apparatus such as a personal computer, a game machine, a PDA, or a mobile phone. In this method, data representing the feature points P of the reference image R is stored in a memory (not shown) or the like. The reference image R is a planar image for designating the specific region (marker M) in the input image I1; it may be, for example, an arbitrary texture or a two-dimensional code. A feature point P is a pixel point representing a characteristic part of an image, such as a luminance edge component.

As shown in FIG. 5, a moving image or a still image is input as the input image I1 from a camera (not shown) or the like (step S51), and the feature points P of the input image I1 are extracted (step S52). The correspondence of feature points P between the input image I1 and the reference image R is then determined, that is, the feature points are matched (step S53). The correspondence is determined by finding the same feature points P in both the input image I1 and the reference image R.

Based on the matching of the feature points P, the projective relationship (homography) between the input image I1 and the reference image R is calculated (step S54). A projective relationship between images is the relationship that holds between images of a given plane captured from different viewpoints, and is also called the homography between the images. The homography between images is represented by a 3×3 homography matrix. The homography matrix H1, which represents the projective relationship between the input image I1 and the reference image R, describes the relationship between the input image I1, in which the marker M is captured in an arbitrary orientation, and the reference image R (an image in which the marker M is captured in a fixed orientation).
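To make the 3×3 matrix representation concrete, the following minimal sketch maps a single point through a homography using homogeneous coordinates. The function name `apply_homography` and the example matrix are illustrative assumptions and do not appear in the patent.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography H (a list of three rows)."""
    x, y = point
    # Lift the point to homogeneous coordinates (x, y, 1) and multiply by H.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the third component to return to pixel coordinates.
    return (u / w, v / w)


# Hypothetical example matrix: a 90-degree rotation followed by a shift in x.
H_example = [[0.0, -1.0, 10.0],
             [1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0]]

mapped = apply_homography(H_example, (1.0, 0.0))
```

A homography is defined only up to scale, so multiplying every entry of `H_example` by the same nonzero constant leaves `mapped` unchanged.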

If the homography matrix H1 can be calculated appropriately ("Yes" in step S55), the position and orientation of the marker M in the input image I1 are estimated. The additional image is then converted based on the homography matrix H1 (step S56), the converted additional image is composited onto the input image I1 using the marker M as a reference (step S57), and the result is output as a composite image (step S58). The additional image data is stored in a memory or the like as data such as a computer-generated image. If, on the other hand, the homography matrix H1 cannot be calculated appropriately ("No" in step S55), the position and orientation of the marker M are not properly estimated, and the input image I1 itself is output (step S59).

One cause of the estimation error is that the orientation of the image differs between the marker in the input image and the reference image. The feature points P of an image are extracted using various filters, such as linear and nonlinear filters, and the filter output changes with the orientation of the image. For example, between an image and a rotated copy of that image, the coordinate values of the feature points P may differ, or the same feature point P (corresponding point) may not be found in both images.

In theory, the homography between two images can be calculated from the coordinate values of four pairs of corresponding feature points P (corresponding points). However, if the coordinate values contain errors, the homography also contains errors, so in practice the homography cannot be calculated accurately from the coordinate values of only four corresponding points.

For this reason, the coordinate errors are assumed to follow a normal distribution, and the homography is calculated by the least-squares method using a larger number of corresponding points. The more corresponding points there are, the more accurate the calculated homography, and the more feature points P are extracted, the more corresponding points there are. The accuracy of the homography calculation therefore depends on the accuracy with which the feature points P are extracted.
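One common way to realize such a least-squares estimate is sketched below. It is an assumption-laden illustration rather than the solver used in the patent, which does not specify one: the bottom-right matrix entry is fixed to 1 (which excludes homographies where that entry is zero), the resulting linear system is solved through its normal equations, and the names `solve_linear` and `fit_homography` are hypothetical.

```python
def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [list(row) + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(M[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def fit_homography(src_pts, dst_pts):
    """Least-squares homography mapping src_pts onto dst_pts (h33 fixed to 1)."""
    if len(src_pts) < 4 or len(src_pts) != len(dst_pts):
        raise ValueError("need at least four matched point pairs")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # Each correspondence contributes two linear equations in h11..h32.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    # Normal equations (A^T A) h = A^T b minimize the squared residual.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    Atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(8)]
    h = solve_linear(AtA, Atb)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]
```

With exactly four pairs the system is determined and any coordinate error passes straight into the result. With more pairs, the residuals are averaged, which is the effect the paragraph above describes. Production code would typically normalize the coordinates first and reject outlier matches, neither of which is shown here.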

As an extreme example, if the marker M in the input image I1 is identical to the reference image R, the orientations of the two images coincide, so every feature point P that corresponds between the images is extracted and the number of corresponding points is at its maximum. Conversely, the more the orientation of the marker M in the input image I1 differs from that of the reference image R, the more likely it is that feature points P are not extracted, the fewer corresponding points there are, and the lower the accuracy of the homography calculation.

[2. Concept of the image processing method]
Next, the concept of the image processing method according to the embodiment of the present invention is described with reference to FIG. 1. FIG. 1 is a diagram showing the concept of the image processing method according to the embodiment of the present invention.

As shown in FIG. 1, in the image processing method according to the embodiment of the present invention, a first projective relationship (homography matrix H1) is first calculated for the input image I1 based on the correspondence of feature points P with the reference image R. This gives a rough estimate of the position and orientation of the specific region (marker M) in the input image I1. At least a part of the input image I1 is then converted based on the homography matrix H1 so that the orientation of the marker M approaches the orientation of the reference image R. In the example shown in FIG. 1, the marker M is rotated substantially, by about 135° clockwise, in the input image I1, but only slightly, by about 15° clockwise, in the converted image I2.

Next, a second projective relationship (homography matrix H2) is calculated for the converted input image I1 (converted image I2) based on the correspondence of feature points P with the reference image R. As described above, the orientation of the marker M is closer to that of the reference image R in the converted image I2 than in the input image I1. The position and orientation of the marker M are therefore estimated more precisely in the converted image I2 than they are in the input image I1. The homography matrix H2 also acts to cancel the error in the homography matrix H1. Based on the homography matrix H obtained by combining the homography matrices H1 and H2, the position and orientation of the marker M in the input image I1 can be estimated with high accuracy.

In this way, by calculating the homography with the reference image R for both the input image I1 and the converted input image I1 (converted image I2), while converting the input image I1 so that the orientation of the marker M approaches that of the reference image R, the position and orientation of the marker M in the input image I1 can be estimated with high accuracy.
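The cancelling effect of the second pass can be sketched numerically. The example assumes one particular convention that the text does not state explicitly: H1 maps reference-image coordinates to input-image coordinates, the converted image I2 is obtained by applying the inverse of H1 to the input image, and H2 maps reference-image coordinates to coordinates in I2. Under that assumption the combined matrix is the product H1·H2. All function names and matrix values are hypothetical.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def inverse3(M):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def compose(H1, H2):
    """Combined homography H = H1 . H2, scaled so that H[2][2] equals 1."""
    H = matmul3(H1, H2)
    return [[v / H[2][2] for v in row] for row in H]


# Hypothetical ground truth (reference -> input) and a rough first estimate.
H_true = [[0.9, -0.4, 30.0], [0.4, 0.9, 12.0], [0.0005, 0.0002, 1.0]]
H1 = [[0.88, -0.42, 31.0], [0.41, 0.91, 11.0], [0.0004, 0.0003, 1.0]]

# After warping by the inverse of H1, an ideal second pass would measure
# exactly the residual left over by the first estimate.
H2_ideal = matmul3(inverse3(H1), H_true)
H = compose(H1, H2_ideal)
```

Because `H2_ideal` is close to the identity, the marker in the converted image is close to the reference orientation, which is the situation in which the text expects feature matching to work best. In practice H2 is estimated from matches and is itself noisy, so `H` only approximates `H_true`.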

[3. Configuration of the image processing apparatus 10]
Next, the configuration of the image processing apparatus 10 according to the embodiment of the present invention is described with reference to FIG. 2. FIG. 2 is a block diagram showing the configuration of the image processing apparatus 10 according to the embodiment of the present invention.

The image processing apparatus 10 according to the embodiment of the present invention is a personal computer, a game machine, a PDA, a mobile phone, or the like. As shown in FIG. 2, the image processing apparatus 10 includes a feature point extraction unit 11, a matching unit 12 (correspondence determination unit), a homography calculation unit 13 (projective relationship calculation unit), an image conversion unit 14, an image composition unit 15, and a storage unit 16. The image processing apparatus 10 may be integrated with an image input device such as a camera, or with an image display device such as a display.

The storage unit 16 is a storage device such as a memory. It stores data such as the reference image R, the additional image, and the feature points P of the reference image R, and also stores intermediate results of the image processing, such as calculated homographies. The storage unit 16 may be divided into several parts, for example one that stores image data and one that stores feature point data.

Reference image data is the data of the image that designates the marker M in the input image I1, and additional image data is the data of the image that is composited onto the input image I1 using the marker M as a reference. Feature point data is the data of pixel points representing characteristic parts of an image, such as luminance edge components. The feature point data includes at least the coordinate values of each feature point P in the image and its feature quantities (luminance value, luminance gradient, and the like).

A moving image or a still image is input to the feature point extraction unit 11 as the input image I1. The feature point extraction unit 11 generates luminance value data for the input image I1, extracts the feature points P of the input image I1, and supplies feature point data representing the extraction result to the matching unit 12. The feature point extraction unit 11 is also supplied with the converted image I2 produced by the image conversion unit 14 and extracts the feature points P of the converted image I2 in the same way as for the input image I1.

The feature points P are extracted using any method and parameters, for example the Harris corner detector, SIFT (Scale-Invariant Feature Transform), or Random Ferns. SIFT describes a feature point P using the gradient directions of the pixels around it and is robust to rotation of the marker M. Random Ferns learns in advance from transformed versions of the reference image R and is robust to changes in the orientation of the marker M.

The matching unit 12 is supplied with the feature point data of the input image I1 and the feature point data of the reference image R. The matching unit 12 matches the feature points P between the input image I1 and the reference image R and supplies matching data representing the result to the homography calculation unit 13. The matching unit 12 is also supplied with the feature point data of the converted image I2 and matches the feature points P between the converted image I2 and the reference image R in the same way as for the input image I1.

The feature points P are matched using any method and parameters, for example with the sum of absolute differences (SAD) or the norm of the difference between feature point vectors as the evaluation value that measures the similarity between feature points P. The matching data includes at least the coordinate values in the input image I1 and the coordinate values in the reference image R of each feature point P for which a correspondence between the two images has been confirmed. Depending on the position and orientation of the marker in the input image I1, the same feature point P may not be found in both images, so a one-to-one correspondence cannot always be confirmed. The same applies between the converted image I2 and the reference image R.
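A minimal sketch of SAD-based matching follows. It assumes each feature is a pair of a coordinate and a fixed-length list of luminance-like values, and it accepts the nearest reference feature only if its SAD is below a threshold. The data layout, the threshold parameter `max_sad`, and the function names are assumptions made for illustration.

```python
def sad(desc_a, desc_b):
    """Sum of absolute differences between two equal-length descriptors."""
    return sum(abs(x - y) for x, y in zip(desc_a, desc_b))


def match_features(input_feats, ref_feats, max_sad):
    """Pair each input feature with its most similar reference feature.

    Each feature is a (coordinate, descriptor) tuple. The result is a list of
    (input_coordinate, reference_coordinate) pairs, which is the minimum that
    the matching data is said to contain.
    """
    matches = []
    for coord_in, desc_in in input_feats:
        best = min(ref_feats, key=lambda f: sad(desc_in, f[1]), default=None)
        # Drop features with no sufficiently similar counterpart, since a
        # correspondence cannot always be confirmed.
        if best is not None and sad(desc_in, best[1]) <= max_sad:
            matches.append((coord_in, best[0]))
    return matches


# Hypothetical toy data: two input features, two reference features.
input_feats = [((5, 5), [10, 20, 30]), ((9, 1), [200, 200, 200])]
ref_feats = [((0, 0), [11, 19, 30]), ((3, 3), [90, 90, 90])]
pairs = match_features(input_feats, ref_feats, max_sad=10)
```

Here only the first input feature finds a counterpart, so `pairs` holds a single correspondence. This simple nearest-neighbour rule can map several input features to the same reference feature; enforcing a strict one-to-one pairing would need an additional check that is not shown.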

The homography calculation unit 13 is supplied with the matching data. Based on the matching data, the homography calculation unit 13 calculates the homography between the input image I1 and the reference image R (homography matrix H1) and supplies it to the image conversion unit 14 and the storage unit 16. In the same way as for the input image I1, the homography calculation unit 13 also calculates the homography matrix H2 between the converted image I2 and the reference image R, and calculates the homography matrix H (= H1·H2) that combines the homography matrices H1 and H2.

The homography between the input image I1 and the reference image R means the projective relationship that holds between the input image I1, in which the marker M is captured in an arbitrary orientation, and the reference image R (an image of the marker M captured in a fixed orientation), and it is represented by a 3×3 homography matrix. The same applies to the converted image I2. The homography matrix is calculated from the coordinate values of four or more pairs of feature points P (corresponding points) contained in the matching data.

The image conversion unit 14 is supplied with the input image data and the homography matrix. Based on the homography matrix H1, the image conversion unit 14 linearly transforms the input image I1 (in particular the image of the marker M region) so that the orientation of the marker M approaches that of the reference image R, and temporarily stores the transformed input image I1 in the storage unit 16 as the converted image I2. With H denoting the homography matrix, Xsrc the three-dimensional (homogeneous) coordinate vector of each pixel of the input image I1, and Xdst the result of the transformation, the image conversion is expressed as Xdst = H·Xsrc.

The image to be converted may be only the image of the marker M region, or the whole input image I1 including the marker M region. When only the image of the marker M region is converted, the coordinates of the marker M region are calculated from the coordinates of the corners of the reference image R and the homography matrix H, and the image of the marker M region is cut out of the input image I1 based on these coordinates. The image conversion unit 14 also linearly transforms the additional image based on the homography matrix and supplies the transformed additional image to the image composition unit 15.
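The corner-based computation of the marker region can be sketched as follows. The sketch assumes that the homography maps reference-image coordinates to input-image coordinates and that the reference image has its origin at the top-left corner; the function names and the axis-aligned bounding box used for cropping are illustrative choices, not details given in the text.

```python
def project(H, point):
    """Map a 2-D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def marker_region(H, ref_width, ref_height):
    """Project the reference-image corners and return (quad, bounding_box).

    quad is the marker outline in the input image; bounding_box is
    (x_min, y_min, x_max, y_max), suitable for cutting out the region.
    """
    corners = [(0, 0), (ref_width, 0), (ref_width, ref_height), (0, ref_height)]
    quad = [project(H, c) for c in corners]
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return quad, (min(xs), min(ys), max(xs), max(ys))


# Hypothetical homography: scale by 2 and shift by (10, 20).
H_demo = [[2.0, 0.0, 10.0], [0.0, 2.0, 20.0], [0.0, 0.0, 1.0]]
quad, bbox = marker_region(H_demo, ref_width=4, ref_height=3)
```

Cropping to the bounding box before conversion limits the work to the pixels that can contain the marker, which is one plausible reason for converting only the marker region.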

The image composition unit 15 is supplied with the input image data, the converted additional image data, and the homography matrix H. Based on the homography matrix H, the image composition unit 15 composites the converted additional image onto the input image I1, using the marker M in the input image I1 as a reference, to generate a composite image, and outputs it to an image display device or the like.

Each component of the image processing apparatus 10 is implemented as hardware such as circuit logic and/or as software such as a program. A component implemented as software is realized, for example, by executing a program on a CPU (not shown).

The image conversion unit 14 may be configured integrally with the image composition unit 15. In order to process the input image I1 and the converted image I2 separately, at least one of the feature point extraction unit 11, the matching unit 12, the homography calculation unit 13, and the image conversion unit 14 may be provided individually. The input image I1 may also be input to the feature point extraction unit 11 through a buffer (not shown).

[4. Operation of the image processing apparatus 10]
Next, the operation of the image processing apparatus 10 according to the embodiment of the present invention will be described with reference to FIG. 3 and FIGS. 4A and 4B. FIG. 3 is a flowchart showing the operation of the image processing apparatus 10, and FIGS. 4A and 4B are diagrams showing an example of the image processing method.

As shown in FIG. 3, the feature point extraction unit 11 receives the data of a moving image frame as the data of the input image I1 (step S31) and extracts the feature points P of the input image I1 (step S32). The data of the input image I1 is temporarily stored in the storage unit 16. The matching unit 12 is supplied with the feature point data of the input image I1 and the feature point data of the reference image R, and matches the feature points P between the input image I1 and the reference image R (step S33). Instead of being read from the storage unit 16, the feature point data of the reference image R may be extracted from the reference image R by the feature point extraction unit 11 and supplied to the matching unit 12.
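
The specification does not say how the matching in step S33 is performed. As one possible illustration, the sketch below uses brute-force nearest-neighbor matching of feature descriptors with a ratio test, which is a commonly used technique and not necessarily the method of this embodiment. The descriptor format, the function names, and the threshold `ratio` are all assumptions.

```python
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def squared_distance(a: Descriptor, b: Descriptor) -> float:
    """Squared Euclidean distance between two descriptors of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def match_features(
    desc_input: Sequence[Descriptor],
    desc_ref: Sequence[Descriptor],
    ratio: float = 0.8,  # assumed threshold, not taken from the specification
) -> List[Tuple[int, int]]:
    """Return (input index, reference index) pairs that pass the ratio test."""
    matches = []
    for i, d in enumerate(desc_input):
        # Rank all reference descriptors by distance to this input descriptor.
        ranked = sorted((squared_distance(d, r), j) for j, r in enumerate(desc_ref))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        # Keep the match only if the best candidate is clearly closer than
        # the second best (distances are squared, so the ratio is squared too).
        if len(ranked) == 1 or best_dist < (ratio ** 2) * ranked[1][0]:
            matches.append((i, best_j))
    return matches
```

Ambiguous feature points are dropped rather than matched, so the number of pairs returned may fall below the four required for the homography calculation in the next step.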

The homography calculation unit 13 is supplied with the matching data and calculates a homography matrix H1 indicating the projection relationship between the input image I1 and the reference image R (step S34). The homography matrix H1 is temporarily stored in the storage unit 16. If, for example, four or more pairs of feature points P (corresponding points) cannot be found ("No" in step S35), the homography matrix H1 cannot be calculated, and the input image I1 is output without image composition (step S45).
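
The requirement of four or more corresponding points follows from the eight degrees of freedom of a homography, since each pair contributes two equations. The specification does not name an estimation algorithm; the sketch below swaps in a direct linear transformation with the bottom-right element fixed to 1, solved by least squares through the normal equations. All function names are hypothetical, and a practical system would typically add outlier rejection.

```python
from typing import List, Optional, Sequence, Tuple

Matrix3 = List[List[float]]
Point = Tuple[float, float]


def project(H: Matrix3, x: float, y: float) -> Point:
    """Apply H to (x, y, 1) and normalize by the third component."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def solve_linear(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def estimate_homography(src: Sequence[Point], dst: Sequence[Point]) -> Optional[Matrix3]:
    """Estimate H with dst ~ H . src, or return None if it cannot be computed."""
    if len(src) != len(dst) or len(src) < 4:
        return None  # corresponds to "No" in step S35
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    # Normal equations (A^T A) h = A^T b give the least-squares solution.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(8)]
    try:
        h = solve_linear(ata, atb)
    except ValueError:
        return None  # degenerate configuration, e.g. collinear points
    return [[h[0], h[1], h[2]], [h[3], h[4], h[5]], [h[6], h[7], 1.0]]
```

A `None` result models the branch in which the input image I1 is output without composition. Fixing the last element to 1 is a simplification that fails for the rare homographies whose true bottom-right element is zero.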

FIG. 4A shows an example of matching the feature points P. In this example, the specific region (marker M) in the input image I1 is designated by a reference image R showing a map image of the area around Japan. The input image I1 captures an obliquely hung map image (marker M) from diagonally in front. The reference image R is a map image corresponding to the marker M in an upright state, and nine feature points P1 to P9, such as edge components of luminance values, are extracted from it in advance. In FIG. 4A, for convenience, the feature points P are shown on the map image itself rather than on a luminance image of the map image.

Between the reference image R and the input image I1, five feature points P1 to P5 of the nine feature points P1 to P9 are matched, as indicated by the line segments connecting identical feature points P (corresponding points). This is because the posture of the image differs greatly between the marker M in the input image I1 and the reference image R, so the remaining four feature points P6 to P9 cannot be properly extracted from the input image I1.

The image conversion unit 14 is supplied with the data of the input image I1 and the homography matrix H1, and converts the input image I1 based on the homography matrix H1 (step S36). The converted input image I1 (converted image I2) is obtained by converting the input image I1 so that the posture of the marker M is closer to the posture of the reference image R than it is in the input image I1. The data of the converted image I2 is temporarily stored in the storage unit 16.

The feature point extraction unit 11 is further supplied with the data of the converted image I2 and extracts the feature points P of the converted image I2 (step S37). The matching unit 12 is supplied with the feature point data of the converted image I2 and the feature point data of the reference image R, and matches the feature points P between the converted image I2 and the reference image R (step S38).

The homography calculation unit 13 is supplied with the matching data and calculates a homography matrix H2 indicating the projection relationship between the converted image I2 and the reference image R (step S39). The homography matrix H2 may be temporarily stored in the storage unit 16. If the homography matrix H2 cannot be calculated ("No" in step S40), the input image I1 is output without image composition (step S45). The homography calculation unit 13 combines the homography matrices H1 and H2 to calculate the homography matrix H (= H1 · H2) (step S41).
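
The composition H = H1 · H2 can be checked numerically with a short sketch. It assumes that each matrix maps reference image coordinates toward input image coordinates (H2 from the reference image R to the converted image I2, H1 from I2 to the input image I1), which is one reading consistent with the product order given above. The matrix values are hypothetical.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def matmul3(A: Matrix3, B: Matrix3) -> Matrix3:
    """Return the 3x3 matrix product A . B."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def project(H: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Apply H to (x, y, 1) and normalize by the third component."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Hypothetical values: H1 is a coarse first estimate, H2 a small correction.
H1: Matrix3 = [[0.8, -0.3, 40.0], [0.2, 0.9, 25.0], [0.0005, 0.0002, 1.0]]
H2: Matrix3 = [[1.02, 0.01, -1.5], [-0.01, 0.98, 2.0], [0.00001, 0.0, 1.0]]

# Combined homography as in step S41.
H = matmul3(H1, H2)

# Mapping a point through H2 and then H1 matches mapping it through H once.
two_steps = project(H1, *project(H2, 120.0, 80.0))
one_step = project(H, 120.0, 80.0)
```

Under the opposite direction convention the product order would be reversed, so the direction of each matrix should be fixed before combining them.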

FIG. 4B shows an example of matching the feature points P related to FIG. 4A. In this example, the converted image I2 is obtained by cutting out an image including the region of the marker M from the input image I1 based on the homography matrix H1, and converting the cut-out image so that the posture of the marker M is closer to the posture of the reference image R than it is in the input image I1. When the region of the marker M occupies only a small part of the input image I1, this reduces the processing required for the image conversion. In FIG. 4B as well, for convenience, the feature points P are shown on the map image itself rather than on a luminance image of the map image.

Between the reference image R and the converted image I2, all nine feature points P1 to P9 are matched, as indicated by the line segments connecting identical feature points P (corresponding points). This is because the posture of the image is similar between the marker M in the converted image I2 and the reference image R, so the feature points P1 to P9 can be properly extracted from the converted image I2.

The image conversion unit 14 is supplied with the data of the additional image and the homography matrix H, and linearly transforms the additional image based on the homography matrix H (step S42). The additional image is converted so that its posture matches the posture of the marker M. The image composition unit 15 is supplied with the data of the converted additional image, the data of the input image I1, and the homography matrix H, composites the converted additional image onto the input image I1 based on the homography matrix H, and outputs the result as a composite image (step S43).
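
Steps S42 and S43 can be sketched together as an inverse-mapping overlay. The sketch assumes that the additional image shares the coordinate frame of the reference image, that H maps those coordinates to input image coordinates, and that nearest-neighbor sampling is acceptable; none of these details is given in the specification. Images are plain row-major lists, and the names `invert3` and `composite` are hypothetical.

```python
import math
from typing import List

Matrix3 = List[List[float]]
Image = List[List[int]]


def invert3(M: Matrix3) -> Matrix3:
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is not invertible")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def composite(input_img: Image, overlay: Image, H: Matrix3) -> Image:
    """Paste the overlay onto a copy of input_img at the pose described by H."""
    H_inv = invert3(H)
    out = [row[:] for row in input_img]  # the input image itself is not modified
    ov_h, ov_w = len(overlay), len(overlay[0])
    for y in range(len(out)):
        for x in range(len(out[y])):
            # Map each output pixel back into overlay coordinates.
            u = H_inv[0][0] * x + H_inv[0][1] * y + H_inv[0][2]
            v = H_inv[1][0] * x + H_inv[1][1] * y + H_inv[1][2]
            w = H_inv[2][0] * x + H_inv[2][1] * y + H_inv[2][2]
            if abs(w) < 1e-12:
                continue
            ox = math.floor(u / w + 0.5)  # nearest-neighbor sampling
            oy = math.floor(v / w + 0.5)
            if 0 <= ox < ov_w and 0 <= oy < ov_h:
                out[y][x] = overlay[oy][ox]
    return out
```

Iterating over output pixels and sampling the overlay through the inverse matrix avoids the holes that forward mapping of overlay pixels would leave.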

The homography matrices H1 and H2 may be calculated using the same method (and parameters), or may be calculated using different methods (and/or parameters). Here, different methods means that the method of extracting feature amounts (including the type of feature amount data of the reference image) and/or the method of matching the feature points P differs.

Because the position and posture of the image are likely to differ greatly between the marker M in the input image I1 and the reference image R, the homography matrix H1 is preferably calculated using a method that is robust to changes in the posture of the marker M. On the other hand, because the position and posture of the image are unlikely to differ greatly between the marker M in the converted image I2 and the reference image R, the homography matrix H2 is, in most cases, preferably calculated using a method with high estimation accuracy. In general, robustness to changes in the posture of the marker M and estimation accuracy are in a trade-off relationship. Calculating the homography matrix H1 with a robust method and the homography matrix H2 with a high-accuracy method therefore increases the likelihood that the position and posture of the marker M can be estimated with high accuracy.

The position and posture of the marker M may be estimated based on the results of three or more stages of homography calculation. For example, when the estimation is based on the homography matrices H1, H2, and H3, the image conversion unit 14 further converts the converted image I2 into a secondary converted image based on the homography matrix H2 (of course, the input image I1 may instead be converted into the secondary converted image based on the homography matrices H1 and H2).

In the image processing apparatus 10, extraction of the feature points P, matching of the feature points P with the reference image R, and calculation of the homography are performed for the secondary converted image, and a homography matrix H3 indicating the projection relationship between the secondary converted image and the reference image R is calculated. Then, a homography matrix H' (= H1 · H2 · H3) indicating the projection relationship between the input image I1 and the reference image R is calculated, and the additional image is composited onto the input image I1 based on the homography matrix H'.

In this case as well, the homography matrices H1, H2, and H3 may be calculated using the same method or using mutually different methods, as described above. Here, the posture of the marker M in the converted input image I1 is expected to come closer to the posture of the reference image R as the stage of homography calculation advances. Accordingly, using at later stages a method that is less robust to changes in the posture of the marker M but more accurate increases the likelihood that the position and posture of the marker M can be estimated with high accuracy.
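
The multi-stage estimation described above can be sketched as a loop over per-stage estimators. The sketch is an illustration under assumptions: each estimator is a hypothetical callable that receives the homography accumulated so far (with which the input image would be converted) and returns the homography for its stage, or `None` when the calculation fails. The interface and the names `cascade` and `StageEstimator` are not part of the specification.

```python
from typing import Callable, List, Optional, Sequence

Matrix3 = List[List[float]]
# Hypothetical stage interface: accumulated homography in, stage homography out.
StageEstimator = Callable[[Matrix3], Optional[Matrix3]]

IDENTITY: Matrix3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul3(A: Matrix3, B: Matrix3) -> Matrix3:
    """Return the 3x3 matrix product A . B."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def cascade(estimators: Sequence[StageEstimator]) -> Optional[Matrix3]:
    """Combine stage homographies as H1 . H2 . ... . Hn, or return None on failure."""
    H_total = IDENTITY
    for estimate in estimators:
        # Each stage would work on the input image converted with H_total.
        H_stage = estimate(H_total)
        if H_stage is None:
            # A failed stage means no composition; the caller outputs the
            # input image unchanged, as in step S45.
            return None
        H_total = matmul3(H_total, H_stage)
    return H_total
```

Ordering the estimators from most robust to most accurate reflects the trade-off described in the text, since each later stage sees an image whose marker posture is already closer to that of the reference image.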

[5. Summary]
As described above, according to the present invention, the homographies between the reference image R and each of the input image I1 and the converted input image I1 (converted image I2, etc.) are calculated while the input image I1 is converted so that the posture of the marker M approaches the posture of the reference image R. The position and posture of the marker M in the input image I1 can thereby be estimated with high accuracy.

The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field to which the present invention pertains could conceive of various changes or modifications within the scope of the technical idea described in the claims, and it is understood that these naturally also belong to the technical scope of the present invention.

For example, the above embodiment describes the case where the image processing apparatus 10 receives the input image I1 from an image input device such as a camera and outputs an output image to an image display device such as a display. However, the input image I1 may be input from an image reproducing device such as a recorder or a player, and the output image may be output to an image recording device such as a recorder or a player.

DESCRIPTION OF SYMBOLS
10 Image processing apparatus
11 Feature point extraction unit
12 Matching unit (correspondence determination unit)
13 Homography calculation unit (projection relationship calculation unit)
14 Image conversion unit
15 Image composition unit
16 Storage unit
M Marker (specific region)
R Reference image
I1 Input image
I2 Converted image
H, H1, H2 Homography matrix (projection relationship)
P Feature point

Claims

An image processing apparatus comprising:
a feature point extraction unit that extracts feature points of an input image;
a correspondence determination unit that determines a correspondence relationship of the feature points with a reference image;
a projection relationship calculation unit that calculates a projection relationship between the input image and the reference image based on the correspondence relationship; and
an image conversion unit that converts at least a part of the input image based on the projection relationship,
wherein the projection relationship calculation unit calculates, for the input image, a first projection relationship based on the correspondence relationship of the feature points with the reference image, calculates, for the input image converted by the image conversion unit based on the first projection relationship, a second projection relationship based on the correspondence relationship of the feature points with the reference image, and recalculates the projection relationship between the input image and the reference image based on the first and second projection relationships.

The image processing apparatus according to claim 1, wherein the image conversion unit converts at least a part of the input image based on at least the first projection relationship so that the posture of a specific region in the input image approaches the posture of the reference image.

The image processing apparatus according to claim 1, wherein the projection relationship calculation unit further calculates, for the input image converted by the image conversion unit based on the first and second projection relationships, a third projection relationship based on the correspondence relationship of the feature points with the reference image, and recalculates the projection relationship between the input image and the reference image based on the first to third projection relationships.

The image processing apparatus according to claim 1, wherein the feature point extraction unit extracts the feature points using different methods when the first projection relationship is calculated and when the second projection relationship is calculated.

The image processing apparatus according to claim 1, wherein the correspondence determination unit determines the correspondence relationship using different methods when the first projection relationship is calculated and when the second projection relationship is calculated.

The image processing apparatus according to claim 4, wherein, when the first projection relationship is calculated, a method that is more robust to a change in posture of a specific region in the input image is used than when the second projection relationship is calculated.

The image processing apparatus according to claim 1, wherein the image conversion unit converts an image in a specific area corresponding to the reference image in the input image.

The image processing apparatus according to claim 1, further comprising an image composition unit that composites an additional image with the input image,
wherein the image conversion unit further converts the additional image based on at least the first and second projection relationships, and
the image composition unit composites the converted additional image with a specific region, in the input image, corresponding to the reference image.

An image processing method comprising:
calculating, for an input image, a first projection relationship based on a correspondence relationship of feature points with a reference image;
converting at least a part of the input image based on the first projection relationship;
calculating, for the converted input image, a second projection relationship based on the correspondence relationship of the feature points with the reference image; and
recalculating the projection relationship between the input image and the reference image based on the first and second projection relationships.

A program for causing a computer to execute an image processing method comprising:
calculating, for an input image, a first projection relationship based on a correspondence relationship of feature points with a reference image;
converting at least a part of the input image based on the first projection relationship;
calculating, for the converted input image, a second projection relationship based on the correspondence relationship of the feature points with the reference image; and
recalculating the projection relationship between the input image and the reference image based on the first and second projection relationships.