JP4287391B2

JP4287391B2 - Image matching device

Info

Publication number: JP4287391B2
Application number: JP2005039624A
Authority: JP
Inventors: 直三島; 伊藤　　剛; 雅裕馬場
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2005-02-16
Filing date: 2005-02-16
Publication date: 2009-07-01
Anticipated expiration: 2025-02-16
Also published as: JP2006227828A

Description

The present invention relates to an image matching method and apparatus that detect corresponding points between two images and perform image matching.

Image matching, which finds the correspondence from one image to another, is a fundamental problem in many technical fields, including motion detection, stereo matching, image morphing, image recognition, and video coding.

According to Non-Patent Document 1, image matching techniques fall broadly into four classes: optical flow methods, block-based methods, gradient methods, and Bayesian methods. Optical flow methods derive an optical flow equation stating that "the change in luminance is constant" and solve for the flow using that equation as a constraint. Block-based methods obtain motion by template matching on each block. Gradient methods perform matching in the direction in which the luminance gradient of the image decreases. Bayesian methods seek the probabilistically most plausible matching.

Patent Document 1 discloses an image matching method using multi-resolution filters as a technique that does not belong to the above classes. This technique generates multiple multi-resolution image pyramids with multiple multi-resolution filters and performs matching on the pyramids in order from the top level, which makes it a highly robust technique able to match motions ranging from large to small.
Patent Document 1: Japanese Patent No. 2927350
Non-Patent Document 1: A. Murat Tekalp, "Digital Video Processing", Prentice Hall, 1995

However, these conventional image matching techniques have the following problems. Optical flow methods are sensitive to noise, and it is inherently difficult for them to handle fast motion. Block-based methods perform matching block by block, so they are highly reliable for motions such as translation of an object in the image, but they inherently have difficulty with motions in which the object deforms or rotates. Gradient methods perform matching in the direction in which the luminance gradient of the image decreases, which makes it difficult to search for object motion stably. Bayesian methods have difficulty finding the global optimum.

The technique disclosed in Patent Document 1, on the other hand, requires multiple multi-resolution filters in principle. In addition, because it performs a static optimization with static constraints locally for each pixel, the energy of the whole screen is not necessarily optimal.

In general, conventional methods find matching harder as the correlation between frames decreases. Matching is therefore difficult for images that deform greatly or that contain fast-moving objects. The technique of Patent Document 1 is more robust, but because it is essentially a static local optimization, there is no guarantee that the resulting mapping is optimal over the whole screen. When the target image and the reference image are matched using equations of motion, the force due to the gradient of the image correlation potential energy must be obtained for every grid point, which makes the computation heavy.

An object of the present invention is to provide an image matching method and apparatus that reduce the amount of computation for the equations of motion in image matching that obtains the correspondence between a target image and a reference image.

A first aspect of the present invention provides an image matching method for obtaining the correspondence between a target image and a reference image, the method comprising: a step of setting a plurality of target grid points on the target image and setting, on the reference image, reference grid points in one-to-one correspondence with the target grid points; a step of dividing the target image into a plurality of clusters each containing at least one target grid point and determining a representative target grid point for each cluster; a step of calculating the image correlation force that a representative reference grid point, which is the reference grid point corresponding to the representative target grid point on the reference image, receives from the gradient of the image correlation potential energy determined by the pixel information of the representative reference grid point, the pixel information of the representative target grid point, and the positional relationship between the representative reference grid point and the representative target grid point; a step of setting the image correlation force acting on the reference grid point corresponding to each target grid point belonging to each cluster to the image correlation force calculated for the representative reference grid point corresponding to the representative target grid point of that cluster; a model generation step of constructing, for each reference grid point on the reference image, an equation of motion for that reference grid point using the image correlation force acting on it, the elastic force it receives from the elastic energy between it and the other reference grid points adjacent to it, and the frictional force acting on it; and a numerical analysis step of obtaining the equilibrium state of each reference grid point by numerically solving the equations of motion.

A second aspect of the present invention provides an image matching device for obtaining the correspondence between a target image and a reference image, the device comprising: setting means for setting a plurality of target grid points on the target image and setting, on the reference image, reference grid points in one-to-one correspondence with the target grid points; determination means for dividing the target image into a plurality of clusters each containing at least one target grid point and determining a representative target grid point for each cluster; calculation means for calculating the image correlation force that a representative reference grid point, which is the reference grid point (y) corresponding to the representative target grid point on the reference image, receives from the gradient of the image correlation potential energy determined by the pixel information of the representative reference grid point, the pixel information of the representative target grid point, and the positional relationship between the representative reference grid point and the representative target grid point; setting means for setting the image correlation force acting on the reference grid point corresponding to each target grid point belonging to each cluster to the image correlation force calculated for the representative reference grid point corresponding to the representative target grid point of that cluster; model generation means for constructing, for each reference grid point on the reference image, an equation of motion for that reference grid point using the image correlation force acting on it, the elastic force it receives from the elastic energy between it and the other reference grid points adjacent to it, and the frictional force acting on it; and numerical analysis means for obtaining the equilibrium state of each reference grid point by numerically solving the equations of motion.

Within each cluster, a representative target grid point is determined at which the force F_u due to the image correlation potential energy is calculated, and the force F_u at that representative target grid point is assigned as the force at the other points in the cluster. The computationally heavy calculation of F_u therefore need not be performed for every grid point, and the computation time is shortened.

[First Embodiment]
The first embodiment will be described with reference to the block diagram of FIG. 1.
As shown in FIG. 1, an image matching unit 11, implemented for example by a processor including memory, has a model generation module 12 and a numerical analysis module 13. When two images, a target image and a reference image, are input to the image matching unit 11, it matches the same parts of the two images in the frame memory, for example the same part of a 3D image. That is, when the two images arrive, a dynamic model is generated using concepts from mechanics. The model generation module 12 outputs this dynamic model in the form of ordinary differential equations, and the numerical analysis module 13 solves them iteratively with a general numerical method. The result of the final iteration is the final state of the matching and is output as a mapping. This embodiment also provides a representative target grid point determination module 14 and an image correlation potential energy calculation module 15, implemented for example by a processor. As described later, the image is divided into a plurality of clusters, the force due to the image correlation potential energy is calculated with at least one pixel of each cluster as the representative target grid point, and the calculated value is applied to the equation of motion of every pixel in the same cluster.

In this embodiment, a target image and a reference image are input and the mapping between them is obtained. The embodiment overcomes the difficulties of the prior art by treating the whole screen as a dynamic system and exploiting the property that the whole screen converges to the natural optimum state of the system. The basic image matching steps are described first. Consider the following model of the images to be handled.

Continuous image model (equation (1))

This is a real-valued continuous image model. Because digital images are the target here, the following model, obtained by sampling the above model, is used.

Sampled image model

(Note that this formulation implicitly assumes that the image value of the same object does not change over time.) Note also that, because the motion vector d is real-valued, the right-hand side uses the notation of the continuous image model (equation (1)). Since image matching between two images, the target image and the reference image, is considered here, the following equivalent problem is considered.

Image matching problem

Since x = Vn, x corresponds uniquely to a point n on the lattice space. Since the mapping g is a unique mapping, y also corresponds uniquely to x. Hence y corresponds uniquely to n. FIG. 2 illustrates this. In other words, the space handled here is a deformed lattice space that corresponds one-to-one to the points n of the lattice space.

Since y = g(x) = g(Vn) as above, this is redefined as equation (8) below to make the one-to-one correspondence with n explicit.

Image matching problem (equation (9))

To solve equation (9), dynamics are introduced for the points y_n. That is, the image matching problem is reduced to the problem of solving a dynamic system in the points y_n. FIG. 3 illustrates the idea. Each point y_n moves toward a state that satisfies equation (9) while taking its relationship with the surrounding points into account, and converges to an equilibrium state. The image matching problem is regarded as solved when that equilibrium state is reached.

A new time axis τ ∈ R is introduced for the point y, and the function y_n(τ) is defined. The initial value is assumed to coincide with the square lattice x, as in equation (10) below.

Because a new time axis has been introduced, the derivative with respect to time can be defined as in equation (12) below.

Dynamics are usually described by an ordinary differential equation such as equation (13) below.

Here F ∈ R² is the sum of the forces. This is also called the equation of motion.

Next, consider the forces acting on y_n(τ). First, consider the force due to the potential energy, which is the force that drives the dynamics. This is the force that moves y_n(τ) toward a state satisfying equation (9). Rearranging the equation gives equation (14).

However, finding a point that strictly satisfies equation (14) is usually difficult because of noise components and the like contained in the image. Therefore an energy function such as equation (15) below is considered, and the point at which this energy function E_u is minimal is sought.

By the principle of the steepest descent method, a local minimum can be reached by descending around y_n(τ) in the direction of steepest descent of the energy function E_u. The gradient in this steepest descent direction is therefore defined as the force on y_n(τ). Because the energy function E_u can also be regarded as an image correlation, this force is called the force F_u due to the image correlation potential energy.

There are various ways to calculate the gradient in the steepest descent direction; the following one is adopted here. As shown in FIG. 4(b), the gradient in the steepest descent direction is obtained directly by local optimization. The image model S_c is continuous, but in practice only the sampled image model S_p is available, so the local optimization is also based on the sampled image model. The sampling point closest to y_n(τ) is to serve as the local space center y_c, so it is obtained as in equation (16) below.

Performing the local optimization to obtain the vector in that direction, normalizing it, and multiplying it by the magnitude of the gradient gives the following equation.

Force due to image correlation potential energy (squared error energy): equation (19)
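
Equations (16) to (19) are not reproduced in this text, so the following Python sketch only illustrates the procedure described above under stated assumptions. The function name, the search radius, and the use of the energy drop between the local center and the best neighbor as the "magnitude of the gradient" are all assumptions made for illustration, not the patent's formula.

```python
import math

def image_correlation_force(target, reference, x, y, radius=1):
    """Illustrative force F_u on the reference grid point at position y.

    target, reference: 2D lists of luminance values indexed [row][col].
    x: (col, row) integer target grid point; y: (col, row) real position.
    """
    height, width = len(reference), len(reference[0])
    # Local space center y_c: the sampling point nearest to y, clamped to the image.
    cx = min(max(int(math.floor(y[0] + 0.5)), 0), width - 1)
    cy = min(max(int(math.floor(y[1] + 0.5)), 0), height - 1)
    s_x = target[x[1]][x[0]]

    def energy(px, py):
        # Squared-error energy between the target value and a reference sample.
        return (reference[py][px] - s_x) ** 2

    e_center = energy(cx, cy)
    best, e_best = (cx, cy), e_center
    # Local optimization: exhaustive search of the neighborhood of y_c.
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            px, py = cx + dx, cy + dy
            if 0 <= px < width and 0 <= py < height and energy(px, py) < e_best:
                best, e_best = (px, py), energy(px, py)
    # Direction from the current position toward the best sampling point.
    ux, uy = best[0] - y[0], best[1] - y[1]
    norm = math.hypot(ux, uy)
    if norm == 0.0:
        return (0.0, 0.0)
    magnitude = e_center - e_best  # assumed stand-in for the gradient magnitude
    return (magnitude * ux / norm, magnitude * uy / norm)
```

When the best match is already at the current position, the sketch returns a zero force, which is what lets a grid point come to rest once friction has removed its velocity.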

For ease of implementation and similar reasons, the force due to image correlation potential energy (absolute difference error energy) of equation (21), in which the energy function is defined as equation (20) below, can be used instead.

Next, consider the force that describes the relationship with the surrounding points. Assume that the images to be matched are two-dimensional projections of a three-dimensional space. If an object in the three-dimensional space is a rigid body, the surface of the rigid body is observed as the object in the two-dimensional image. If the object is observed in both the target image and the reference image, the topology of the object observed in each image is very likely to be preserved. As shown in FIG. 5, the positional relationship of the points x on the target image object should also hold for the points y_n(τ) on the reference image object. This property can be simulated by connecting the points y_n(τ) with springs. The relationship with the surroundings is described by the spring force F_k. First, the lattice point space N_n around the point of interest is defined as follows. For the four surrounding points, it is given by equation (22) below.

With spring constant (elastic constant) k, the restoring force of the springs acting on the point y_n(τ) is the spring force (elastic force) given by equation (23). The elastic constant balances the image correlation energy against the elastic energy. A large elastic constant makes the lattice hard to deform, so the result is stable but fits the image less well. A small elastic constant makes the lattice easy to deform, so the fit to the image improves but the result becomes too flexible. For now, this parameter is set empirically. The behavior is not very sensitive to its value, so a fixed constant value is basically used.

Spring force (equation (23))

Specifically, connecting the four surrounding points gives equation (24) below.

Four-point connection spring model (equation (24))
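
Equations (22) to (24) are not reproduced in this text. As a minimal sketch, assuming the spring force is the sum over the four neighbors of k times the displacement toward each neighbor (zero rest length), the four-point model could look like the following. The dictionary layout, the function name, and the skipping of missing neighbors at the image border are assumptions made for illustration.

```python
def spring_force(points, n, k):
    """Illustrative 4-neighbor spring force F_k on grid point n.

    points: dict mapping lattice index (i, j) to current position (x, y).
    Neighbors missing from `points` (image border) are skipped; that
    boundary handling is an assumption of this sketch.
    """
    i, j = n
    px, py = points[n]
    fx = fy = 0.0
    for di, dj in ((0, -1), (-1, 0), (1, 0), (0, 1)):
        neighbor = points.get((i + di, j + dj))
        if neighbor is not None:
            fx += k * (neighbor[0] - px)
            fy += k * (neighbor[1] - py)
    return (fx, fy)
```

On an undeformed square lattice the four contributions cancel at interior points, so the spring force only appears once the image correlation force has started to deform the lattice.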

Finally, consider a force that dissipates the stored energy. If the only forces acting on y_n(τ) are F_u and F_k, energy is conserved and the system settles into a steady state of oscillation. A force that dissipates the stored energy is therefore introduced. A frictional force can be used for this. When the velocity can be approximated as constant, the frictional force can be written as equation (25) below.

Frictional force (equation (25))

Collecting the forces above, the equation of motion becomes equation (26) below.

Equation of motion (equation (26))
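
Equations (25) and (26) are not reproduced in this text. Assuming that the frictional force is proportional to velocity, with a coefficient written here as \mu (a symbol introduced for illustration), and that the mass of each grid point is normalized to one, the forces described above would combine as follows:

```latex
F_\mu\bigl(\dot{y}_n(\tau)\bigr) = -\mu\,\dot{y}_n(\tau), \qquad
\ddot{y}_n(\tau) = F_u\bigl(y_n(\tau)\bigr) + F_k\bigl(y_n(\tau)\bigr) + F_\mu\bigl(\dot{y}_n(\tau)\bigr)
```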

Because the force F_u due to the image correlation potential energy cannot be expressed analytically, the ordinary differential equation (26) cannot be solved analytically. It is therefore difficult to take the limit of the system as τ → ∞. Instead, a time T large enough for the system to converge is chosen, and the converged state of the system is estimated by computing the interval t = (0, T) numerically.

Once its initial values are fixed, an ordinary differential equation has a solution that numerical analysis determines uniquely. This is generally called the initial value problem for ordinary differential equations. Many numerical methods exist for this problem; well-known ones include the Euler method, the Runge-Kutta method, the Bulirsch-Stoer method, predictor-corrector methods, and implicit Runge-Kutta methods. The Runge-Kutta method is the best known and most frequently used. However, because equation (26) has a dimension on the order of the image size, complicated numerical methods are hard to apply. The Euler method, which is the simplest to implement, is therefore applied here.

Because the Euler method is a numerical method for first-order ordinary differential equations, equation (26) is first converted into a first-order ordinary differential equation by applying the change of variables in equation (27). This yields the transformed equation of motion (28).

Change of variables (equation (27))

Transformed equation of motion (equation (28))

The Euler scheme for the ordinary differential equation (29) is given by equation (30).

This advances the solution from t⁽ⁿ⁾ to t⁽ⁿ⁺¹⁾ ≡ t⁽ⁿ⁾ + h. Here x⁽ⁿ⁾ denotes the value at step n, and h is the step width. Applying the Euler scheme to equation (28) gives the update formula of equation (31) below.
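
Equations (29) and (30) are not reproduced in this text. In its standard form, the explicit Euler scheme for a first-order system with right-hand side F (a generic symbol here, not the force sum) is:

```latex
\frac{dx}{dt} = F(x, t), \qquad
x^{(n+1)} = x^{(n)} + h\,F\bigl(x^{(n)}, t^{(n)}\bigr), \qquad
t^{(n+1)} = t^{(n)} + h
```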

Update formula by the Euler method (equation (31))
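
Equation (31) is not reproduced in this text. The following Python sketch shows what one explicit Euler update of a single grid point could look like after the second-order equation of motion is split into position and velocity. It assumes unit mass and a velocity-proportional friction with coefficient `mu`; the names `euler_step` and `relax` and the toy restoring force are hypothetical and only serve the illustration.

```python
def euler_step(position, velocity, force, mu, h):
    """Advance one grid point by one explicit Euler step.

    position, velocity: (x, y) tuples at step n.
    force: (x, y) sum of F_u and F_k evaluated at step n.
    mu: assumed friction coefficient; h: step width.
    """
    new_position = (position[0] + h * velocity[0],
                    position[1] + h * velocity[1])
    new_velocity = (velocity[0] + h * (force[0] - mu * velocity[0]),
                    velocity[1] + h * (force[1] - mu * velocity[1]))
    return new_position, new_velocity


def relax(start, k=1.0, mu=1.0, h=0.1, steps=500):
    """Toy run: a single point pulled toward the origin by a force -k * position."""
    position, velocity = start, (0.0, 0.0)
    for _ in range(steps):
        force = (-k * position[0], -k * position[1])
        position, velocity = euler_step(position, velocity, force, mu, h)
    return position, velocity
```

In the toy run the friction term removes energy at every step, so the point settles at the origin instead of oscillating, which mirrors the role the frictional force plays in the full system.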

To eliminate this wasted computation, the whole screen is divided into clusters (segments) of arbitrary shape as shown in FIG. 6. Within each cluster a representative target grid point (●) is determined at which the force F_u due to the image correlation potential energy is calculated, and the force F_u at each representative target grid point is assigned as the force at the other points in its cluster. That is, when pixel y₅ is first chosen as the representative target grid point as in equation (32) below, the force F_u at this representative target grid point (y₅) is obtained, and the F_u of the representative target grid point (y₅) is substituted for the other pixels y₁ to y₄ and y₆ to yₙ.

If the representative target grid point is shifted at every step, then over one period this approximates having calculated the force F_u at every point. Square blocks, for example 2×2 or 4×4 square blocks, can be used as the clusters of arbitrary shape. The clusters may also be rectangular rather than square.
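
The assignment of one force per cluster can be sketched in Python as follows. This is a minimal illustration for square block clusters, assuming that grid points inside a block are indexed in raster order; the function name and the callback `force_at`, which stands in for the expensive F_u calculation, are hypothetical.

```python
def assign_cluster_forces(height, width, block, rep_index, force_at):
    """Give every grid point the force computed at its cluster's representative.

    height, width: size of the grid; block: side length of the square clusters.
    rep_index: index of the representative inside each cluster (raster order).
    force_at: callable (row, col) -> force, evaluated once per cluster.
    """
    forces = [[None] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            members = [(r, c)
                       for r in range(top, min(top + block, height))
                       for c in range(left, min(left + block, width))]
            rep = members[rep_index % len(members)]
            force = force_at(*rep)  # the only expensive call for this cluster
            for r, c in members:
                forces[r][c] = force
    return forces
```

With 2×2 blocks the expensive call runs for one grid point in four, which is where the reduction in computation comes from.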

The representative target grid point is determined in the representative target grid point determination step. With 2×2 square blocks as clusters, the grid points in each cluster are indexed as in equation (33) below.

One conceivable way to determine the representative target grid point is to change it periodically in step with time, as shown in FIG. 7. Specifically, the representative target grid point is determined by equation (34) below.

Representative target grid point determination method 1 (equation (34))

Here mod(.) computes the remainder.

Alternatively, it may be determined with uniform random numbers by equation (35) below.

Representative target grid point determination method 2 (equation (35))

Here uni_rand(., .) is an operator that generates a uniform random integer within the range given by its arguments.
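
Equations (34) and (35) are not reproduced in this text. A minimal Python sketch of the two determination methods, assuming a 2×2 cluster whose grid points carry the indices 0 to 3 and assuming that method 1 simply takes the step count modulo the cluster size, could be:

```python
import random

def representative_index_periodic(step, cluster_size=4):
    """Method 1 (assumed form): cycle through the cluster indices with the step count."""
    return step % cluster_size

def representative_index_random(cluster_size=4, rng=random):
    """Method 2 (assumed form): draw a uniform random integer index."""
    return rng.randint(0, cluster_size - 1)
```

The function names and the zero-based indexing are hypothetical. Passing a seeded `random.Random` instance as `rng` makes the random variant reproducible.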

Summarizing the image matching described above as an algorithm, with representative target grid point determination method 2, gives the flow shown in FIG. 8.

Image matching algorithm:

This embodiment overcomes the difficulties of the prior art by treating the whole screen as a dynamic system and exploiting the property that the whole screen converges to the natural optimum state of the system.

[Second Embodiment] (using region segmentation)
This embodiment will be described with reference to the block diagram of FIG. 9. As in the first embodiment, this embodiment overcomes the difficulties of the prior art by treating the whole screen as a dynamic system and exploiting the property that the whole screen converges to the natural optimum state of the system. Whereas the first embodiment described the clusters with blocks, this embodiment uses a region segmentation technique to obtain clusters that fit the picture better.

The basic configuration is the same as in the first embodiment, with a region segmentation module 16 added. That is, a region segmentation step is added to the image matching steps. The first embodiment used blocks such as 2×2 square blocks as the clusters of arbitrary shape; this embodiment builds clusters that fit the picture from the segments produced by region segmentation.

Block-type clusters have one problem. FIG. 10 shows an example of a block-type cluster lying on the boundary between an object and the background. Suppose that in this image the object moves to the left and the background moves to the right. Potential energy then arises toward the left at grid points on the object and toward the right at grid points on the background. Suppose also that the representative target grid point is determined by the method of equation (34). The force due to the potential energy in the cluster then alternates periodically between rightward and leftward, as in FIG. 10. Integrated over time, this is equivalent to a slight leftward force acting on the cluster, which is not appropriate behavior for a boundary.

This problem arises because some clusters straddle a motion boundary. With block-type clusters laid out uniformly over the screen, such boundary-straddling clusters are very likely to occur.

In ordinary video, such as captured footage, motion boundaries in many cases coincide with image edges. Being an image edge region can thus be regarded as a necessary condition for being a motion boundary. Segmenting regions along the image edges, as shown in FIG. 11, is therefore a requirement for the region segmentation means.

One region dividing means that satisfies this condition is the Fast Watersheds method [L. Vincent and P. Soille, “Watersheds in Digital Spaces: An Efficient Algorithm Based on Immersion Simulations”, IEEE Trans. on Pattern Analysis and Machine Intell., Vol. 13, No. 6, 1991]. In this method, a luminance gradient image (edge-extracted image) is obtained from the luminance image to be processed by means of a difference operator, and each closed region surrounded by ridges of the luminance gradient is split off as one region. Because the division is based on the luminance gradient image, that is, on the edges, cluster boundaries tend to coincide closely with edges. The Fast Watersheds method is also designed with computation speed in mind, and its computational cost is small compared with that of the image matching step as a whole.
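The idea of dividing regions along edges of a luminance gradient image can be illustrated with a much simpler stand-in. The sketch below is not the Vincent and Soille immersion algorithm: it merely thresholds a forward-difference gradient and labels the 4-connected low-gradient regions. The function names, the `edge_threshold` parameter, and the use of -1 for ridge pixels are assumptions made for illustration only.

```python
from collections import deque


def gradient_magnitude(img):
    """Forward-difference luminance gradient |dx| + |dy| (0 past the border)."""
    h, w = len(img), len(img[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0
            grad[y][x] = abs(dx) + abs(dy)
    return grad


def label_regions(img, edge_threshold):
    """Label 4-connected regions of low-gradient pixels.

    Pixels whose gradient is >= edge_threshold are treated as ridges and
    keep the label -1. Returns (labels, number_of_regions).
    """
    grad = gradient_magnitude(img)
    h, w = len(img), len(img[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1 or grad[sy][sx] >= edge_threshold:
                continue
            # Breadth-first flood fill of one closed region.
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and grad[ny][nx] < edge_threshold):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label
```

Each labeled region would then play the role of one cluster. A real watershed also assigns the ridge pixels to a neighboring region, which this sketch leaves out.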

Next, the representative target grid point determination step will be described. Because the clusters produced by region division vary in both size and shape, a method that changes the index periodically, as in the first embodiment (Equation (34)), is difficult to apply. Therefore, a determination method that changes the index randomly, as in Equation (35), is used.

The grid points in a cluster are indexed in order from the upper left point toward the lower right, as shown in FIG. 12. Denoting a cluster by c_i, the representative target grid point determination method can be written as the following Equation (36).

Representative target grid point determination method 3

Here, N(.) is an operator that returns the number of grid points in a cluster.
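The random selection described above can be sketched as follows. This is a minimal illustration, assuming that grid points are `(x, y)` tuples, that the index runs in row-major order from the upper left, and that the index is drawn uniformly from 0 to N(c_i) - 1. The function names are hypothetical, and the sketch does not claim to reproduce the exact form of Equation (36).

```python
import random


def raster_index(cluster_points):
    """Order the grid points of a cluster from upper left to lower right."""
    # Points are (x, y) tuples: sort by row (y) first, then by column (x).
    return sorted(cluster_points, key=lambda p: (p[1], p[0]))


def pick_representative(cluster_points, rng):
    """Pick one representative target grid point at a random index."""
    ordered = raster_index(cluster_points)
    k = rng.randrange(len(ordered))  # uniform over 0 .. N(c_i) - 1
    return ordered[k]
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when comparing successful and failed matchings.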

Experiments on a test pattern yielded both successful and failed results, as shown in FIG. 13. The representative target grid point is generated from a random number, and the matching succeeds or fails depending on which random numbers are drawn. This is thought to be because the potential is obtained properly at some candidate grid points and not at others. The result is good when a grid point at which the potential can be calculated well is chosen as the representative target grid point, and the matching fails otherwise. This can be expected to improve if the number of representative target grid points in a cluster is increased.

Clusters can be large or small. The number of representative target grid points is therefore determined as a proportion of the grid points in the cluster. This also makes it possible to estimate the computational cost for the entire screen regardless of the number and size of the clusters. Letting α be the proportion of grid points in a cluster that are used as representative target grid points, the number of representative target grid points can be written as the following Equation (37).

Number of representative target grid points

For example, the force F_u due to the image correlation potential energy is calculated at each representative target grid point, and these forces are averaged to give the representative force F_u of the cluster c_i, in accordance with the following Equation (38).

Force F_u due to the representative image correlation potential energy

FIG. 14 shows the results of four experiments performed with α = 0.2. It can be seen that the mapping is calculated stably.
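The proportional selection and averaging described above can be sketched as follows. Several details are assumptions rather than something taken from Equations (37) and (38): the count is rounded up and never falls below one, the representatives are drawn without replacement, and `force_at` is a hypothetical callable returning the force `(fx, fy)` due to the image correlation potential energy at a grid point.

```python
import math
import random


def num_representatives(n_points, alpha):
    """Number of representative target grid points for a cluster of n_points.

    Assumption: round alpha * N(c_i) up and use at least one point.
    """
    return max(1, math.ceil(alpha * n_points))


def cluster_force(cluster_points, alpha, force_at, rng):
    """Average the forces at randomly chosen representative grid points."""
    k = num_representatives(len(cluster_points), alpha)
    reps = rng.sample(cluster_points, k)  # k distinct grid points
    forces = [force_at(p) for p in reps]
    fx = sum(f[0] for f in forces) / k
    fy = sum(f[1] for f in forces) / k
    return (fx, fy)
```

With α = 0.2, a cluster of 10 grid points yields 2 representatives under this rounding rule, so the total number of force evaluations scales with the number of grid points on the screen rather than with the number of clusters.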

Summarizing the image matching described above as an algorithm, representative target grid point determination method 2 follows the flow shown in FIG. 15.

Image matching algorithm:

[Third Embodiment] (Recalculation when grid points are far apart)
This embodiment will be described with reference to the block diagram of FIG. 16. As in the first embodiment, this embodiment resolves the difficulties of the prior art by treating the entire screen as a dynamical system and exploiting the property that the entire screen converges to the natural optimal state of that system. In the first embodiment, the representative target grid point in each block was fixed to a single point. In this embodiment, representative target grid points are selected adaptively according to the state between grid points. That is, the representative target grid point determination module 114 has a function of determining additional representative target grid points.

As shown in FIG. 17, the grid points in a cluster may end up far apart from one another, as in the lower right pattern. If the calculation is nevertheless performed with only one representative target grid point per cluster, the result becomes distorted. In this embodiment, therefore, a grid point in a cluster that lies far from the others, as in this pattern, is itself also treated as a representative target grid point.

The representative target grid point determination step will now be described. First, the representative target grid point determination module 14 determines an initial representative target grid point by the method of Equation (34) or Equation (35). Let this representative target grid point be x_d = Vn_d. Its mapped point is y_nd = g(x_d). Let a target point to which the potential force would be assigned be x = Vn, with mapped point y_n = g(x). The discontinuous points that are better made additional representative target grid points, rather than being assigned the potential, are the points connected by dotted lines in FIG. 17 (that is, the force is recalculated for them instead of being assigned). Such points are defined, for example, by the following Equation (39).

Determination of additional representative target grid points

Under this definition, such a point is one for which the ratio of distances between grid points exceeds the constant α.
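The distance-ratio test can be sketched as below. The direction of the ratio is an assumption: the sketch divides the distance between the mapped points y_n and y_nd on the reference image by the distance between x and x_d on the target image, so that grid points whose images have drifted apart are flagged. The function name and the handling of a zero target distance are also hypothetical, and the sketch does not claim to reproduce the exact form of Equation (39).

```python
import math


def needs_own_force(x, x_d, y_n, y_nd, alpha):
    """Return True if grid point x should become an additional representative.

    x, x_d  : target grid point and representative target grid point
    y_n, y_nd: their mapped points g(x) and g(x_d) on the reference image
    alpha   : threshold on the distance ratio (assumed mapped / target)
    """
    d_target = math.dist(x, x_d)
    if d_target == 0.0:
        return False  # x is the representative itself
    d_mapped = math.dist(y_n, y_nd)
    return d_mapped / d_target > alpha
```

A grid point for which this test returns True would have its own force recalculated from the image correlation potential energy instead of inheriting the force of the cluster representative.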

Summarizing the image matching described above as an algorithm, representative target grid point determination method 2 follows the flow shown in FIG. 18.

Image matching algorithm:

In the above embodiment, when the ratio between the distance from the representative target grid point to a target grid point and the distance from the representative reference grid point, which is the image of the representative target grid point on the reference image, to the corresponding reference grid point is larger than a certain value, the force calculated is not the force received from the gradient of the image correlation potential energy at the representative target grid point, but the force received from the gradient of the image correlation potential energy determined by the positional relationship between that target grid point and its reference grid point.

FIG. 1 is a block diagram of an image matching apparatus according to a first embodiment of the present invention.
FIG. 2 is a diagram showing how the spaces in an image are connected.
FIG. 3 is a diagram showing the dynamics for the image matching problem y_n.
FIG. 4 is a diagram illustrating the force due to the image correlation potential energy.
FIG. 5 is a diagram showing a state in which the topology of an object on the target image and the corresponding object on the reference image is preserved.
FIG. 6 is a diagram showing an image divided into clusters.
FIG. 7 is a diagram explaining a method of selecting representative target grid points.
FIG. 8 is a flowchart of the image matching algorithm according to the first embodiment.
FIG. 9 is a block diagram of an image matching apparatus according to a second embodiment of the present invention.
FIG. 10 is a diagram for explaining the problem with block-type clusters.
FIG. 11 is a diagram for explaining the intra-cluster potential representative model.
FIG. 12 is a diagram showing the indexes within a cluster.
FIG. 13 is a diagram showing successful and failed examples of image matching.
FIG. 14 is a diagram showing image matching results when α is set to 0.2.
FIG. 15 is a flowchart of the image matching algorithm according to the second embodiment.
FIG. 16 is a block diagram of an image matching apparatus according to a third embodiment of the present invention.
FIG. 17 is a diagram conceptually showing the third embodiment.
FIG. 18 is a flowchart of the image matching algorithm according to the third embodiment.

Explanation of symbols

11 ... Image matching unit, 12 ... Model generation module, 13 ... Numerical analysis module, 14 ... Representative target grid point determination module, 15 ... Image correlation potential energy calculation module, 16 ... Region division module, 114 ... Representative target grid point determination module

Claims

An image matching apparatus for obtaining a correspondence between a target image and a reference image, comprising:
setting means for setting a plurality of target grid points in the target image and setting, in the reference image, reference grid points corresponding one-to-one to the target grid points;
determining means for dividing the target image into a plurality of clusters each including at least one target grid point, and determining a representative target grid point of each cluster;
calculation means for calculating the image correlation force that a representative reference grid point, which is the reference grid point on the reference image corresponding to the representative target grid point, receives from the gradient of an image correlation potential energy determined by pixel information of the representative reference grid point, pixel information of the representative target grid point, and the positional relationship between the representative reference grid point and the representative target grid point;
setting means for setting the image correlation force acting on the reference grid point corresponding to each target grid point belonging to a cluster to the image correlation force calculated for the representative reference grid point corresponding to the representative target grid point of that cluster;
model generation means for constructing, for each reference grid point on the reference image, an equation of motion for the reference grid point using the image correlation force acting on the reference grid point, the elastic force that the reference grid point receives from the elastic energy between it and other adjacent reference grid points, and the frictional force acting on the reference grid point; and
numerical analysis means for obtaining an equilibrium state of the reference grid points by numerically analyzing the equation of motion.

The image matching apparatus according to claim 1, wherein the determining means sets the clusters as uniform blocks over the entire screen.

The image matching apparatus according to claim 1, wherein the determining means performs region division based on image information of the target image and sets the result of the region division as the clusters.

The image matching apparatus according to claim 1, wherein the determining means performs region division based on luminance information of the target image.

The image matching apparatus according to claim 1, wherein the determining means performs region division based on color similarity in the target image.

The image matching apparatus according to claim 1, wherein, when the ratio between the distance from the representative target grid point to a target grid point and the distance from the representative reference grid point, onto which the representative target grid point is mapped on the reference image, to the corresponding reference grid point is larger than a certain value, the calculation means calculates, instead of the force received from the gradient of the image correlation potential energy at the representative target grid point, the force received from the gradient of the image correlation potential energy determined by the positional relationship between the target grid point and the reference grid point.

The image matching apparatus according to claim 1, wherein the determining means determines a plurality of representative target grid points according to the size of the cluster, and the calculation means calculates the force received from the gradient of the image correlation potential energy for each representative target grid point and uses the average of the plurality of calculation results as the representative force of the cluster.