JP2008084076A - Image processor, method, and program

Info

Publication number: JP2008084076A
Application number: JP2006264359A
Authority: JP
Inventors: Sunao Mishima; 直三島; Takeshi Ito; 伊藤　　剛
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2006-09-28
Filing date: 2006-09-28
Publication date: 2008-04-10

Abstract

PROBLEM TO BE SOLVED: To perform image processing with high precision by assigning optimal labels over the entire image.

SOLUTION: An image processor is provided with an area division part 112 for dividing a target image into a plurality of areas, and a label assignment part 111. From among a plurality of parametric models, each of which associates points on the target image with points on a reference image and includes a plurality of model parameters and a model identification label identifying the model, the label assignment part 111 selects the parametric model that minimizes the value of an evaluation function indicating both conformity with the parametric model and a degree of smoothness representing whether or not the model identification label changes between adjacent pixels. It executes label assignment processing, which assigns the model identification label of the selected parametric model to the pixels in an area, on the plurality of areas in descending order of pixel count.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to an image processing apparatus, method, and program that introduce a plurality of parametric models for an input image and perform image processing by assigning a parametric model to each region of the input image.

As a technique for performing image processing by applying a predetermined parametric model to each region of an image, the technique of Non-Patent Document 1, for example, is known. To address the problem of estimating image motion, Non-Patent Document 1 discloses a technique that estimates local motion between two images by a method such as optical flow, extracts a plurality of model parameters from the set of estimated local motions by affine transformation and the least squares method, divides the image into a plurality of regions by a method such as k-means or watershed, and performs labeling that selects the optimal parameters for each divided region. With such a technique, an appropriate motion model is selected for each region of the image.

P. E. Eren, Y. Altunbasak and M. Tekalp, "Region-based affine motion segmentation using color information", IEEE ICASSP1997, Vol. 4, pp. 3005-3008, 1997

In such a conventional technique, labeling is performed for each region by maximum likelihood estimation using only the goodness of fit of the model parameters, and the labeling strongly affects the accuracy of the motion estimation result. Consequently, accuracy is high in large regions, which contain many samples, but in small regions the number of samples is low, so optimal labeling cannot be performed and the accuracy of motion estimation decreases.

The present invention has been made in view of the above, and an object thereof is to provide an image processing apparatus, method, and program capable of performing image processing that applies parametric models with high accuracy by assigning optimal labels over the entire image.

In order to solve the above-described problem and achieve the object, an image processing apparatus according to the present invention includes: a region dividing unit that divides a first image into a plurality of regions each including a plurality of pixels; a label allocating unit that selects, from among a plurality of parametric models, each being a model for associating points on the first image with points on a second image and including a plurality of model parameters and a model identification label for identifying the model, the parametric model that minimizes the value of an evaluation function indicating the degree of fit with the parametric model and a smoothness representing whether or not the model identification label changes between adjacent pixels, and that executes label allocation processing, which assigns the model identification label of the selected parametric model to the pixels in a region, on the plurality of regions in descending order of pixel count; and an image processing unit that obtains the correspondence between the first image and the second image based on the parametric model corresponding to the model identification label assigned to each pixel.

The present invention also provides an image processing apparatus including: a region dividing unit that divides a target image, which includes a moving region (a region that moves relative to a motionless background image), into a plurality of regions each including a plurality of pixels; a label allocating unit that executes label allocation processing, which assigns to the pixels in a region a region type label identifying the region as a background region or a moving region, on the plurality of regions in descending order of pixel count; and a moving region detecting unit that detects the moving region from the target image based on the region type label assigned to each pixel.

The present invention also provides an image processing method and a program corresponding to the above image processing apparatus.

According to the present invention, image processing can be performed with high accuracy by assigning optimal labels over the entire image.

Exemplary embodiments of an image processing apparatus, an image processing method, and a program according to the present invention are described in detail below with reference to the accompanying drawings.

(Embodiment 1)
FIG. 1 is a block diagram of the functional configuration of the motion estimation apparatus according to the first embodiment. The motion estimation apparatus according to the first embodiment estimates the motion of each point between a target image and a reference image. As illustrated in FIG. 1, it mainly includes a local motion estimation unit 101, a model generation unit 107, a region dividing unit 112, a label allocating unit 111, an image processing unit 113, a frame memory 106, and a memory 103.

The local motion estimation unit 101 is a processing unit that estimates the motion between the target image and the reference image by block matching and obtains a motion vector for each point.

The model generation unit 107 generates, from the result of local motion estimation, parametric models for associating points on the target image with points on the reference image. Each parametric model includes a plurality of model parameters that define it and a model identification label that identifies it, and is labeled with that model identification label. In the present embodiment, a parametric model uses its model parameters to give, for each point of the first image within the region carrying its model identification label, the motion vector to the corresponding point on the second image. The model generation unit 107 includes a model definition unit 108, a clustering processing unit 109, and a parameter estimation unit 110.

The model definition unit 108 is a processing unit that defines and generates the parametric models and stores them in the memory 103.

The clustering processing unit 109 is a processing unit that classifies (clusters) the motion vectors of the points on the target image obtained by the local motion estimation unit 101 and provides initial values for label allocation.

The parameter estimation unit 110 is a processing unit that, for each model identification label, uses the least squares method to estimate the model parameters that minimize an error energy. The error energy is computed, over the points sharing that model identification label, from the error between the motion vector determined by the parametric model and the motion vector obtained for each point by the local motion estimation unit 101.

The region dividing unit 112 is a processing unit that divides (clusters) the frame of the target image into regions based on pixel values. The region dividing unit 112 further assigns to each divided region a region label that identifies the region (cluster).

The label allocating unit 111 is a processing unit that selects, from among the plurality of parametric models, the parametric model that minimizes the value of an evaluation function indicating the degree of fit with the parametric model and a smoothness representing whether or not the model identification label changes between adjacent pixels, and assigns the model identification label of the selected parametric model to the pixels in a region (cluster) produced by the region dividing unit 112.

Specifically, for the pixels in a divided region (cluster), the label allocating unit 111 selects the parametric model that minimizes the evaluation function, which is built from a likelihood energy U₁ and a regularization energy U₂ based on the pixels of the target image, the pixels of the reference image, and the motion vector of each pixel, and assigns its label. The label allocating unit 111 performs this label allocation by selecting regions (clusters) in descending order of pixel count (cluster size) and computing the value of the evaluation function for each.

The image processing unit 113 is a processing unit that performs motion estimation by obtaining the correspondence between the target image and the reference image based on the parametric model corresponding to the model identification label assigned to each pixel, and outputs the result as the motion estimation result.

The details of each of these units are described later.

The frame memory 106 is a storage medium that stores the input target image and reference image. The memory 103 is a storage medium that stores the calculation results and parameters obtained by each unit, the parametric models generated by the model definition unit 108, and the like.

Next, each unit is described in detail. In the present embodiment, the two images to be processed are called the target image and the reference image, and the motion from the target image to the reference image is obtained.

Let g₁(n) and g₂(n) be the pixel values at point n∈Λ² of the target image and the reference image, respectively.

The local motion estimation unit 101 estimates the motion between the two images from the target image and the reference image by block matching, as follows. Although the present embodiment adopts block matching, the method is not limited to this; motion estimation may instead use, for example, optical flow estimation, a gradient method, or a Bayesian method.

In block matching, the target image is divided into preset rectangular blocks as given by equation (1).

Letting the motion search region be W∈R², the block matching algorithm under the sum of squared differences (SSD) criterion is given by equation (2).

Here d(i) is the motion vector, that is, the motion of block i. Block matching assumes that all pixels within a block share the same motion vector, so the motion vector of each pixel can be expressed by equation (3).

The local motion estimation unit 101 obtains the motion vectors using equations (2) and (3).
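The block matching step can be illustrated with a minimal sketch. Equations (1) to (3) are not reproduced in this text, so the details here are assumptions: images are plain lists of rows, blocks are square with side `size`, the search window is `±search` pixels, and the function names `ssd` and `block_matching` are hypothetical.

```python
def ssd(target, reference, top, left, dy, dx, size):
    """Sum of squared differences between a target block and the
    reference block displaced by (dy, dx)."""
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            diff = reference[y + dy][x + dx] - target[y][x]
            total += diff * diff
    return total


def block_matching(target, reference, size, search):
    """Return a per-pixel motion field of (dy, dx) vectors from target to reference."""
    height, width = len(target), len(target[0])
    field = [[(0, 0)] * width for _ in range(height)]
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            best, best_cost = (0, 0), None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates whose displaced block leaves the reference image.
                    if not (0 <= top + dy and top + dy + size <= height
                            and 0 <= left + dx and left + dx + size <= width):
                        continue
                    cost = ssd(target, reference, top, left, dy, dx, size)
                    if best_cost is None or cost < best_cost:
                        best, best_cost = (dy, dx), cost
            # Every pixel in the block inherits the block's motion vector.
            for y in range(top, top + size):
                for x in range(left, left + size):
                    field[y][x] = best
    return field
```

Calling `block_matching(target, reference, size=4, search=1)` on two equally sized grayscale images returns one vector per pixel, constant within each block.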

Next, the model definition unit 108 is described in detail. The model definition unit 108 defines the parametric models and stores them in the memory 103. One parametric model is prepared for each motion to be estimated. Each parametric model is uniquely identified by a model identification label α∈L⊂Z, with as many labels as there are motions to be estimated. Let z(n)∈L be the label of point n; the set of regions to which the model identification label α is assigned, that is, each motion region Nα, is given by equation (4).

The parametric model for each motion region to which a model identification label α is assigned is then defined by equation (5).

The model parameters aα can take various forms, such as parameters defining a plane-to-plane geometric transformation from the target image to the reference image. For example, when an affine model is used, the model parameters aα are given by equations (6) and (7); when a quadratic model is used, they are given by equations (8) and (9).

Alternatively, a rotation transformation model, a Euclidean transformation model, a similarity transformation model, a projective transformation model, a Lie transformation model, or the like may be used.

The model definition unit 108 generates the parametric model defined by equations (5), (6), and (7), or the parametric model defined by equations (5), (8), and (9), and stores it in the memory 103.
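As an illustration of a labeled affine parametric model, the sketch below assumes the common six-parameter form (x, y) → (a1·x + a2·y + a3, a4·x + a5·y + a6). Equations (5) to (7) are not reproduced in this text, so this parameter ordering, the function names, and the `models` dictionary are assumptions rather than the patent's own definitions.

```python
def affine_map(point, params):
    """Map a target-image point to the reference image with six affine parameters."""
    x, y = point
    a1, a2, a3, a4, a5, a6 = params
    return (a1 * x + a2 * y + a3, a4 * x + a5 * y + a6)


def model_motion(point, params):
    """Motion vector implied by the model: mapped position minus original position."""
    mx, my = affine_map(point, params)
    return (mx - point[0], my - point[1])


# One model per motion to be estimated, keyed by its model identification label.
models = {
    0: (1, 0, 0, 0, 1, 0),   # identity: static background
    1: (1, 0, 2, 0, 1, -1),  # pure translation by (+2, -1)
}
```

With these definitions, `model_motion((3, 4), models[1])` gives the translation `(2, -1)` and the identity model gives `(0, 0)` at every point.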

Next, the parameter estimation unit 110 is described in detail. The motion vector determined by a parametric model is given by equation (10); combining this with equation (5) gives equation (11).

Meanwhile, if the error between the motion vector of each point given by the parametric model and the motion vector of each point obtained by the local motion estimation unit 101 is defined by equation (12), the error energy is expressed by equation (13). Substituting the motion vector definitions of equations (10) and (11), the error energy takes the form of equation (14) and finally that of equation (15).

The error within the region of model identification label α is given by equation (16), and the model parameters are estimated so that this error is minimized, as shown in equation (17).

In the present embodiment, the error between the motion vectors given by the model parameters and the obtained motion vectors is assumed to follow a Gaussian distribution, and the optimal model parameters are found numerically by the least squares method. The normal equations of the least squares method are given by equations (18), (19), and (20), so the model parameters are given by equation (21).

Equation (21) is a system of simultaneous linear equations, so the parameter estimation unit 110 solves equation (21) for the model parameters by singular value decomposition. Other methods, such as LU decomposition, may also be used.
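The least squares step can be sketched as follows for the affine case. This is an illustration under assumptions, not the patent's equations (18) to (21): it assumes the six-parameter form x' = a1·x + a2·y + a3, y' = a4·x + a5·y + a6, and it solves the normal equations by Gaussian elimination with partial pivoting (in the spirit of the LU alternative mentioned above) instead of singular value decomposition. The names `solve_linear` and `fit_affine` are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_affine(points, motions):
    """Least-squares affine parameters (a1..a6) for the points of one label."""
    rows = [(x, y, 1.0) for x, y in points]
    # The normal matrix A^T A is shared by the x and y equations.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in (0, 1):
        # Fit the displaced coordinate: original position plus measured motion.
        targets = [p[axis] + d[axis] for p, d in zip(points, motions)]
        atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
        params.extend(solve_linear(ata, atb))
    return tuple(params)
```

`fit_affine` needs at least three non-collinear points per label; otherwise the normal matrix is singular and the sketch raises `ValueError`.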

Next, the clustering processing unit 109 is described in detail. When local motion estimation has just finished, no label allocation has yet been performed, so the model parameters cannot be estimated. Therefore, before model parameter estimation is performed for the first time, the clustering processing unit 109 clusters the motion vectors d(n) obtained by the local motion estimation unit 101 and provides the result as the initial label allocation. In this motion clustering, the image is first divided by clustering it into a plurality of clusters with the k-means method based on pixel values such as luminance; a plurality of sets of model parameters is obtained from that clustering result; and, with those model parameters as initial values, the labels of the model parameters are obtained by numerical analysis using the k-means method.

Next, the region dividing unit 112 is described in detail. The region dividing unit 112 divides the frame of the target image into regions based on pixel value information, using a method such as k-means or watershed.

That is, the region dividing unit 112 divides (clusters) the target image g₁ into k regions according to pixel values. It then assigns to each region (cluster) a region label w(n)∈K⊂Z that uniquely identifies it. Each region (cluster) is then expressed by equation (22).
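A minimal sketch of pixel-value clustering with k-means is shown below. It assumes a grayscale image stored as a list of rows and a deterministic, evenly spaced initialization of the cluster centres; the function name `kmeans_intensity` is hypothetical. Clusters produced this way need not be spatially connected, so a practical implementation would likely add a connected-component or watershed step.

```python
def kmeans_intensity(image, k, iterations=20):
    """Cluster pixels by intensity and return a region label w(n) per pixel."""
    pixels = [v for row in image for v in row]
    lo, hi = min(pixels), max(pixels)
    # Deterministic initialisation: centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def nearest(value):
        return min(range(k), key=lambda c: (value - centres[c]) ** 2)

    for _ in range(iterations):
        sums, counts = [0.0] * k, [0] * k
        for v in pixels:
            idx = nearest(v)
            sums[idx] += v
            counts[idx] += 1
        # Empty clusters keep their previous centre.
        centres = [sums[c] / counts[c] if counts[c] else centres[c] for c in range(k)]
    return [[nearest(v) for v in row] for row in image]
```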

Next, the label allocating unit 111 is described in detail. The label allocating unit 111 determines the parametric model to assign to each region clustered by the region dividing unit 112.

The probability density function of the model identification label z(n), given the target image g₁(n), the reference image g₂(n), and the motion vector d(n), is given by equation (23).

By Bayes' theorem, equation (23) is equivalent to equation (24).

Assuming that z is independent of g₁ and d, equation (24) is equivalent to equation (25).

In equation (25), p(g₂|z, g₁, d) is called the likelihood function and represents how well the model identification label z fits the images. p(z) is called the prior density and represents prior knowledge of whether the labeling z has a plausible shape. Assuming that the pixel value differences under the parametric model follow a Gaussian distribution, the likelihood function can be defined as in equations (26) and (27).

The prior density of the labels can be defined as in equations (28) and (29).

Here N(n) denotes the neighborhood of point n, and the prior is a density function that takes low energy where the model identification labels are the same. Under this formulation, the most plausible model identification label z is the solution of the maximum a posteriori (MAP) estimation problem given by equation (30).

Taking the logarithm of both sides of equation (30) and multiplying by −1 reduces the problem to the energy minimization problem of equation (31).

In equation (31), the temperature parameter T and the noise standard deviation σ are combined into a single hyperparameter λ. U₁ is the likelihood energy, which indicates the degree of fit with the parametric model, and U₂ is the regularization energy, which indicates a smoothness representing whether or not the model identification label changes between adjacent pixels. Equation (31) is thus an evaluation function composed of the likelihood energy U₁ and the regularization energy U₂.
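The step from MAP estimation to energy minimization can be sketched as follows. Equations (26) to (31) are not reproduced in this text, so the normalization is an assumption: the likelihood is taken as proportional to exp(−U₁/(2σ²)) and the prior as proportional to exp(−U₂/T), which gives one consistent choice of λ.

```latex
\begin{aligned}
\hat{z} &= \arg\max_{z}\; p(g_2 \mid z, g_1, d)\, p(z) \\
        &= \arg\min_{z}\; \bigl[-\log p(g_2 \mid z, g_1, d) - \log p(z)\bigr] \\
        &= \arg\min_{z}\; \Bigl[\tfrac{1}{2\sigma^{2}}\, U_1(z) + \tfrac{1}{T}\, U_2(z)\Bigr] \\
        &= \arg\min_{z}\; \bigl[\lambda\, U_1(z) + U_2(z)\bigr],
           \qquad \lambda = \frac{T}{2\sigma^{2}}
\end{aligned}
```

The last line multiplies the objective by T > 0, which leaves the minimizer unchanged.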

The technique of Non-Patent Document 1 solves equation (31) using the likelihood energy U₁ alone; because it lacks the regularization energy term U₂, the accuracy of its label allocation is unstable.

In the present embodiment, the label allocating unit 111 solves equation (31) with the regularization energy term U₂ taken into account, so labels can be allocated with stable accuracy.

Next, the way the label allocating unit 111 solves equation (31) is described. The label allocating unit 111 solves equation (32) to find, for each region (cluster) K_K, the model identification label α that minimizes λU₁ + U₂.

However, because the regularization energy U₂ includes the energy of adjacent labels, it cannot be computed from equation (32) as it stands. Focusing on the cliques (pairs of a pixel and an adjacent pixel) on the boundary of a region (cluster) and those in its interior, the regularization energy U₂ can be rewritten as in equation (33).

This separates the cliques into cliques C₁, formed by points within the same region, and cliques C₂, formed by points in different regions. Whichever model identification label is selected for a region, the clique potential of C₁ does not change. In other words, only the clique potential of C₂ affects the energy.

FIG. 2 is a schematic diagram showing a square cluster with side r and its 4-neighbor cliques. The number of C₁ cliques in the cluster of FIG. 2 is counted as follows.

A vertex has 2 cliques; there are 4 vertices.
A point on a side has 3 cliques; there are 4(r−2) such points.
An interior point has 4 cliques; there are (r−2)² such points.
Taking the number of points of each kind into account yields equation (34).

Similarly, the number of C₂ cliques in the region (cluster) of FIG. 2 is counted as follows.

A vertex has 2 cliques; there are 4 vertices.
A point on a side has 1 clique; there are 4(r−2) such points.
An interior point has 0 cliques; there are (r−2)² such points.
Equation (35) follows.

FIG. 3 is a graph showing the numbers of cliques C₁, formed by points within the same region, and cliques C₂, formed by points in different regions, as fractions of the total number of cliques in the region. As FIG. 3 shows, as the side r grows, the influence of C₂ falls and the share of C₁ rises.
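The clique counts for an r×r square region and the resulting fractions can be checked with a short brute-force sketch. It counts, for every pixel, its 4-neighbor pairs that stay inside the region (C₁) and those that cross the region boundary (C₂), and compares the totals with the per-vertex, per-side, and per-interior tallies described in the text. The function name `count_cliques` is hypothetical, and the tallies assume r ≥ 2.

```python
def count_cliques(r):
    """Count per-pixel 4-neighbour pairs inside (c1) and across (c2) an r-by-r region."""
    c1 = c2 = 0
    for y in range(r):
        for x in range(r):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < r and 0 <= nx < r:
                    c1 += 1
                else:
                    c2 += 1
    return c1, c2


for r in (2, 4, 8, 16, 32):
    c1, c2 = count_cliques(r)
    # Tallies from the text: vertices, side points, interior points.
    assert c1 == 2 * 4 + 3 * 4 * (r - 2) + 4 * (r - 2) ** 2
    assert c2 == 2 * 4 + 1 * 4 * (r - 2)
    print(r, c1 / (c1 + c2), c2 / (c1 + c2))
```

Under this counting, the C₂ total simplifies to 4r out of 4r² cliques, so the boundary share is 1/r and shrinks steadily as the region grows.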

This shows that as the side r grows, the influence of C₂ on the regularization energy U₂ falls and C₁ becomes dominant; in other words, the regularization energy U₂ no longer changes much however the label of the region is chosen. Therefore, when the side r is large, the influence of the regularization energy U₂ on the total energy becomes relatively small and that of the likelihood energy U₁ becomes large, so the solution of the energy minimization problem approaches the result of maximum likelihood estimation on U₁.

Moreover, when the side r is large, the law of large numbers makes label selection by the likelihood energy U₁ more accurate. It can therefore be said that maximum likelihood estimation on the likelihood energy U₁ is sufficient when the side r is large.

Although FIG. 3 uses a square region as the example, the same holds for non-square regions, because the share taken by the perimeter likewise falls rapidly as the region grows.

However, when the regions form a salt-and-pepper pattern and are very numerous, the above is unlikely to hold; this can be avoided by the way the regions are constructed.

FIG. 4 is a schematic diagram showing that the regularization energy U₂ is not considered in regions where the side r is large and is considered in regions where the side r is small. As FIG. 4 shows, the regularization energy U₂ is taken into account in the boundary regions, which are regions with a small side r.

Specifically, the label allocating unit 111 initializes the model identification labels z(n) to −1 as in equation (36) and selects regions (clusters) K in descending order of size, that is, in descending order of the number of pixels in the region. It then computes the likelihood energy by equation (38), the regularization energy by equation (39), and the label value by equation (37), and updates the model identification labels z(n) with the computed label value as in equation (40). Model identification labels are allocated by repeating this computation and update for all regions.

Here, by equation (38), equation (37) acts as pure maximum likelihood estimation when the region is large, and the prior density gradually gains influence as the region becomes smaller.

When several regions have the same size (that is, the same number of pixels), they are selected in ascending order of the number of pixels on the region boundary.
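The region-by-region allocation can be sketched as below. Equations (36) to (40) are not reproduced in this text, so several details are assumptions: the likelihood energy U₁ is supplied by a caller-provided function, the regularization energy U₂ is a Potts-style count of boundary pairs whose outside pixel already carries a different label, pixels still at −1 contribute nothing, and the tie-break uses the number of distinct boundary pixels. All names (`assign_labels`, `boundary_pairs`, `likelihood`) are hypothetical.

```python
def assign_labels(regions, labels, likelihood, boundary_pairs, lam):
    """Greedy label allocation over regions, largest first.

    regions:        dict region_id -> list of pixel coordinates
    labels:         iterable of candidate model identification labels
    likelihood:     function (pixels, label) -> likelihood energy U1
    boundary_pairs: dict region_id -> list of (inside_pixel, outside_pixel)
    lam:            hyperparameter weighting U1 against U2
    """
    # All pixels start unlabelled (-1).
    z = {p: -1 for pixels in regions.values() for p in pixels}

    def sort_key(k):
        boundary_pixels = {inside for inside, _ in boundary_pairs[k]}
        # Larger regions first; among equal sizes, fewer boundary pixels first.
        return (-len(regions[k]), len(boundary_pixels))

    for k in sorted(regions, key=sort_key):
        def energy(label):
            u1 = likelihood(regions[k], label)
            # Assumed Potts-style penalty: one unit per already-labelled
            # neighbour across the boundary that disagrees with this label.
            u2 = sum(1 for _, q in boundary_pairs[k]
                     if z.get(q, -1) not in (-1, label))
            return lam * u1 + u2

        best = min(labels, key=energy)
        for p in regions[k]:
            z[p] = best
    return z
```

Because large regions are labelled first, while their neighbours are still unlabelled, their choice is driven by U₁ alone; small regions processed later see labelled neighbours, so U₂ can override a weak likelihood preference.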

次に、このように構成された実施の形態１にかかる動き推定装置による動き推定処理について説明する。図５は、実施の形態１にかかる動き推定処理の手順を示すフローチャートである。 Next, a motion estimation process performed by the motion estimation apparatus according to the first embodiment configured as described above will be described. FIG. 5 is a flowchart of a motion estimation process procedure according to the first embodiment.

First, the count L is initialized to 1 (step S11), and the local motion estimation process is performed using equations (2) and (3) (step S12).

When the local motion estimation process is complete, the model definition unit 108 defines the parametric models (step S13). The clustering processing unit 109 then performs the motion clustering process (step S14), clustering the motion vectors of the points on the target image to obtain initial values for the label assignment.

Next, the parameter estimation unit 110 performs the model parameter estimation process (step S15) and estimates the model parameters for the region of each label. The region division unit 112 then performs the region division process (step S16), clustering the frame of the target image into regions based on pixel values and assigning a region label to each region.

Next, the label assignment unit 111 performs the label assignment process (step S17): it selects the parametric model to be assigned to each region (cluster) produced by the region division unit 112 and assigns the model identification label of the selected parametric model to the pixels in that region. The count L is then incremented by 1 (step S18), and it is checked whether the count L has exceeded a predetermined number (step S19).

If the count L has not exceeded the predetermined number (step S19: No), the processing of steps S15 to S18 is repeated.

If, on the other hand, the count L has exceeded the predetermined number in step S19 (step S19: Yes), an optimal parametric model has been selected for each region, and the image processing unit 113 performs motion estimation with the optimal parametric models and outputs the motion estimation result (step S20).
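
The control flow of steps S15 to S19 can be sketched as a simple counting loop. This is a hedged illustration: the function name, the callables passed in, and their signatures are assumptions made for this example, and the one-time steps S12 to S14 are represented only by the `initial_labels` argument.

```python
def run_iterations(initial_labels, estimate_parameters, segment, assign_labels, max_count):
    """Repeat steps S15 to S18 until the count L exceeds max_count (step S19)."""
    count = 1                                    # step S11: L = 1
    labels = initial_labels                      # result of the clustering in step S14
    while True:
        params = estimate_parameters(labels)     # step S15
        regions = segment()                      # step S16
        labels = assign_labels(regions, params)  # step S17
        count += 1                               # step S18
        if count > max_count:                    # step S19
            return labels, params


# Stub callables that only record which step ran.
calls = []
labels, params = run_iterations(
    initial_labels="initial",
    estimate_parameters=lambda labels: calls.append("S15") or "params",
    segment=lambda: calls.append("S16") or ["region"],
    assign_labels=lambda regions, params: calls.append("S17") or "labels",
    max_count=3,
)
print(calls.count("S15"))  # 3
```

Because L starts at 1 and is incremented after each pass, the loop body runs exactly `max_count` times.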

Next, the motion clustering process of step S14 will be described. FIG. 6 is a flowchart showing the procedure of the motion clustering process.

First, the clustering processing unit 109 clusters the target image into a plurality of clusters by the k-means method based on pixel values such as luminance, thereby dividing the image and obtaining model identification labels (step S21).

Based on the model identification labels from that clustering result, initial model parameters are estimated, yielding a plurality of initial model parameters (step S22). Next, using these initial model parameters as initial values, the model identification labels α of the model parameters are obtained by numerical analysis with the k-means method (step S23). A plurality of model identification labels α are obtained, corresponding to the plurality of initial model parameters. As shown in FIG. 5, this motion clustering process is executed only once, immediately after the local dynamic matching process is completed.

Next, the model parameter estimation process of step S15 will be described. FIG. 7 is a flowchart showing the procedure of the model parameter estimation process.

First, for a model identification label α obtained by the clustering processing unit 109, the parameter estimation unit 110 calculates A by equation (19) and b by equation (20) over all lattice points n (step S31). It then solves equation (21) numerically by singular value decomposition (or by the LU method) (step S32). The processing of steps S31 and S32 is repeated for every model identification label α (step S33).
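
The per-label solve of steps S32 and S33 can be sketched as follows, assuming that equation (21) is a square linear system A x = b in the model parameters; that form is an assumption, since the equation itself is not reproduced in this text. The sketch uses Gaussian elimination with partial pivoting (the elimination that underlies the LU method) rather than singular value decomposition, and all function names are illustrative.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(M[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (M[row][n] - tail) / M[row][row]
    return x


def estimate_all(systems):
    """Step S33: solve one system per model identification label."""
    return {label: solve_linear_system(A, b) for label, (A, b) in systems.items()}


systems = {0: ([[2, 1], [1, 3]], [3, 5]), 1: ([[1, 0], [0, 4]], [2, 8])}
params = estimate_all(systems)
```

A singular value decomposition would be the more robust choice when A is ill-conditioned; elimination is used here only to keep the sketch dependency-free.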

Next, the label assignment process of step S17 will be described. FIG. 8 is a flowchart showing the procedure of the label assignment process.

First, the label assignment unit 111 initializes the model identification label z(n) to -1 as in equation (36) (step S41). It then selects a region K in descending order of size, that is, in descending order of the number of pixels in the region (step S42). If a plurality of regions have the same size (the same number of pixels), they are selected in ascending order of the number of pixels on the region boundary.

The likelihood energy is then calculated by equation (38) (step S43), and the regularization energy is calculated by equation (39) (step S44). Maximum likelihood estimation is performed by equation (37) to calculate the label value (step S45), and the model identification label z(n) is updated with the calculated label value by equation (40) (step S46).

Next, it is determined whether the label assignment process has been completed for all regions (step S47). If it has not yet been completed for all regions (step S47: No), the next largest region (that is, the region with the next largest number of pixels) is selected (step S48), and the processing of steps S43 to S46 is repeated. The labels are assigned in this way.
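
The loop of FIG. 8 can be sketched as a greedy pass over the regions. This is an illustration under stated assumptions: equation (37) is taken to select the label that minimizes the sum of the likelihood energy and the regularization energy, and the two energy functions below are toy stand-ins for equations (38) and (39), which are not reproduced in this text. All names are hypothetical.

```python
def assign_labels(regions, labels, likelihood_energy, regularization_energy):
    """Greedy region-wise labeling.

    regions: list of (region_id, pixel_count, boundary_pixels) tuples.
    """
    z = {rid: -1 for rid, _, _ in regions}                 # step S41
    order = sorted(regions, key=lambda r: (-r[1], r[2]))   # steps S42 and S48
    for rid, _, _ in order:
        # Steps S43 to S46: evaluate both energies and keep the cheapest label.
        z[rid] = min(
            labels,
            key=lambda a: likelihood_energy(rid, a) + regularization_energy(rid, a, z),
        )
    return z


# Toy energies: "small" weakly prefers label 1 on its own data.
likelihood = {("big", 0): 1.0, ("big", 1): 5.0, ("small", 0): 2.0, ("small", 1): 1.5}
neighbors = {"big": ["small"], "small": ["big"]}


def likelihood_energy(rid, label):
    return likelihood[(rid, label)]


def regularization_energy(rid, label, z):
    # One unit of penalty per already-labeled neighbor that disagrees.
    return sum(1.0 for j in neighbors[rid] if z[j] != -1 and z[j] != label)


regions = [("small", 4, 4), ("big", 100, 10)]
result = assign_labels(regions, [0, 1], likelihood_energy, regularization_energy)
print(result)  # {'small': 0, 'big': 0}
```

The large region is labeled first from its data alone; the small region then follows its labeled neighbor because the disagreement penalty outweighs its weak preference for label 1.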

FIG. 9 shows an example of a target image, FIG. 10 shows an example of a reference image, and FIG. 11 shows the result of region division. FIG. 12 shows, as a comparative example, the result of label assignment by a conventional method using the target image of FIG. 9 and the reference image of FIG. 10. FIG. 13 shows the result of label assignment by Embodiment 1 using the same target image and reference image.

A comparison of FIG. 12 and FIG. 13 shows that Embodiment 1 assigns labels more accurately than the comparative example.

As described above, the motion estimation apparatus 100 according to Embodiment 1 performs maximum likelihood estimation by equations (36) to (39), so that plain maximum likelihood estimation is used when a region is large and the prior density gradually comes into play as the region becomes smaller. Optimal model identification labels can therefore be assigned over the entire image, and motion estimation using the optimal parametric models can be performed with high accuracy.

(Embodiment 2)
In Embodiment 1, parametric models are defined in the course of the motion estimation process, and the model parameters are estimated to generate the parametric models used for motion estimation. In Embodiment 2, the image processing apparatus of the present invention is applied to a moving region detection apparatus that performs background subtraction.

The background subtraction method detects a moving region by taking the difference between a background image captured in advance and a target image that contains the moving region.

FIG. 14 is a block diagram showing the functional configuration of the moving region detection apparatus 1400 according to Embodiment 2. The moving region detection apparatus according to this embodiment uses a background image, which is an image without motion captured in advance, to detect moving regions, which are regions with motion, in an input target image. As shown in FIG. 14, it mainly comprises a region division unit 1412, a label assignment unit 1411, a moving region detection unit 1401, a frame memory 106, and a memory 103.

As in Embodiment 1, the region division unit 1412 of this embodiment divides the target image into a plurality of regions.

The label assignment unit 1411 is a processing unit that assigns region type labels to the regions (clusters) produced by the region division unit 1412. A region type label identifies whether a region is a background region or a moving region. In this embodiment, the region type label α is α = 0 for a background region and α = 1 for a moving region. The values of the region type label are not limited to these.

More specifically, the label assignment unit 1411 assigns the background region type label to a divided region when the difference value d(n) between the pixel value at point n of the target image and the pixel value at point n of the background image is less than or equal to a predetermined threshold T. In doing so, it increases the threshold T when every region surrounding the region has been assigned the background region label, and decreases the threshold T when every surrounding region has been assigned the moving region label. The label assignment unit 1411 performs this label assignment, in which the region type label is assigned to the pixels in a region, on the plurality of regions in descending order of pixel count.

The label assignment process performed by the label assignment unit 1411 is described in detail below. Let g₁(n) be the pixel value at point n of the target image and g₂(n) be the pixel value at point n of the background image. The difference value d(n) in a background region is then expressed by equation (41). If the noise follows the Gaussian distribution N(0, σ²), the difference value d(n) in a background region also follows a Gaussian distribution.

In contrast, the difference value in a moving region does not follow this Gaussian distribution. If point n satisfies equation (42) with the threshold T = 1.96σ (σ is a constant), it can be regarded as belonging to a background region with 95% probability. If point n does not satisfy equation (42), it is regarded as belonging to a moving region.

The average difference value within a region K_k is defined by equation (43).

Here, N(k) is the number of pixels in the region. When the region K_k belongs to the background, the central limit theorem implies that the average difference value follows the Gaussian distribution of equation (44). In other words, the accuracy improves by the factor given in equation (45). A region with a larger area therefore yields correspondingly higher labeling accuracy.
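
Equations (41) to (45) are not reproduced in this text. As a reading aid, one set of forms consistent with the description above is sketched below; these are assumptions inferred from the surrounding prose (a pixel-wise difference, a threshold test, a regional mean, and the central-limit scaling of its variance), not the equations as published.

```latex
\begin{align}
  d(n) &= g_1(n) - g_2(n)
    && \text{assumed form of (41)} \\
  \lvert d(n) \rvert &\le T
    && \text{assumed form of (42)} \\
  \bar{d}(k) &= \frac{1}{N(k)} \sum_{n \in K_k} d(n)
    && \text{assumed form of (43)} \\
  \bar{d}(k) &\sim \mathcal{N}\!\left(0,\ \frac{\sigma^2}{N(k)}\right)
    && \text{assumed form of (44)} \\
  \frac{\sigma}{\sigma / \sqrt{N(k)}} &= \sqrt{N(k)}
    && \text{assumed form of (45)}
\end{align}
```

Under these assumptions, a region of 100 pixels would have a mean difference whose standard deviation is one tenth of the per-pixel value, which is why large regions can be labeled more reliably than small ones.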

The accuracy does not improve as much, however, for a region with a small area. This embodiment therefore performs a label assignment process that introduces a spatial constraint, as in Embodiment 1, although the method of label assignment differs from that of Embodiment 1.

When a region is entirely surrounded by background regions, that is, regions assigned the region type label α = 0, the label assignment is expected to be more stable if this region is also regarded as a background region. Likewise, when a region is entirely surrounded by moving regions, that is, regions assigned the region type label α = 1, the label assignment is expected to be more stable if this region is also regarded as a moving region. This idea is formulated below.

When the regions surrounding the region under determination have been determined to be background regions, satisfying equation (46) for an arbitrary constant η > 0 makes the region more likely to be determined to be a background region as well. That is, equation (46) increases the threshold T, so the difference value d(n) of the pixels in the region under determination is more likely to be judged as background by equation (42).

Similarly, when the regions surrounding the region under determination have been determined to be moving regions, satisfying equation (47) for an arbitrary constant β > 0 makes the region more likely to be determined to be a moving region as well. That is, equation (47) decreases the threshold T, so the difference value d(n) of the pixels in the region under determination is more likely to be judged as moving by equation (42).

Here, η and β should be determined by the state of the regions surrounding the region under determination. Combining equations (46) and (47) and defining the threshold T for the region K_k therefore gives equation (48).

Here, γ(k) is defined so that γ(k) > 0 when most of the regions surrounding the region K_k are background regions and γ(k) < 0 when most of them are moving regions. For example, if the range of γ(k) is -σ ≤ γ(k) ≤ σ, γ(k) can be designed as in equation (49).

When the region type labels of all regions adjacent to the region under determination are background (α = 0), N(α = 0) = B(k) and N(α = 1) = 0 hold, so γ(k) = σ.

When the region type labels of all regions adjacent to the region under determination are moving (α = 1), N(α = 1) = B(k) and N(α = 0) = 0 hold, so γ(k) = -σ. When the adjacent regions contain equal numbers of background regions and moving regions, N(α = 0) = N(α = 1) holds, so γ(k) = 0.
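
One form of γ(k) that reproduces the three cases just described is σ·(N(α=0) − N(α=1))/B(k). Equation (49) itself is not reproduced in this text, so this form is an assumption chosen to match those cases, and the function and parameter names below are illustrative.

```python
def gamma(n_background, n_moving, n_boundary, sigma):
    """Threshold offset for a region, in the range [-sigma, sigma].

    n_background: adjacent regions labeled background, N(alpha=0)
    n_moving:     adjacent regions labeled moving, N(alpha=1)
    n_boundary:   number of boundary regions, B(k)
    """
    if n_boundary <= 0:
        raise ValueError("a region needs at least one boundary region")
    return sigma * (n_background - n_moving) / n_boundary


print(gamma(5, 0, 5, 2.0))  # 2.0   (all neighbors are background)
print(gamma(0, 5, 5, 2.0))  # -2.0  (all neighbors are moving)
print(gamma(3, 3, 6, 2.0))  # 0.0   (evenly split)
```

Neighbors that have not been labeled yet contribute to neither count, so with this form an unlabeled neighborhood leaves the threshold unchanged.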

Accordingly, writing z(n) for the region type label assigned to each region, the label assignment unit 1411 according to Embodiment 2 initializes z(n) by equation (50) and performs the following processing on the regions in descending order of pixel count. It calculates the number of boundary regions B(k), the number N(α = 0) of regions adjacent to the region under determination whose region type label is background, and the number N(α = 1) of adjacent regions whose region type label is moving. It then calculates γ(k) by equation (51), evaluates equation (52), and obtains the label value by equation (53). Finally, it updates the region type label z(n) assigned to the region with the obtained label value by equation (54).

The moving region detection unit 1401 is a processing unit that uses the result of the label assignment unit 1411, that is, the region type label z(n) assigned to each region, to detect the moving regions, namely the regions of the target image whose region type label is z(n) = 1.

Next, the moving region detection process according to Embodiment 2 configured as described above will be described. FIG. 15 is a flowchart showing the procedure of the moving region detection process of Embodiment 2.

First, the count L is initialized to 1 (step S51). The region division unit 1412 then performs the region division process (step S52), dividing (clustering) the frame of the target image into regions based on pixel values.

Next, the label assignment unit 1411 performs the label assignment process, which assigns region type labels to the divided regions (step S53). The count L is then incremented by 1 (step S54), and it is checked whether the count L has exceeded a predetermined number (step S55).

If the count L has not exceeded the predetermined number (step S55: No), the processing of steps S52 to S54 is repeated.

If, on the other hand, the count L has exceeded the predetermined number in step S55 (step S55: Yes), region type labels have been assigned to the regions, and the moving region detection unit 1401 detects the regions whose region type label is 1 as moving regions (step S56) and outputs the detection result.

Next, the label assignment process of step S53 will be described. FIG. 16 is a flowchart showing the procedure of the label assignment process according to Embodiment 2.

First, the label assignment unit 1411 initializes the region type label z(n) to be assigned to the regions to -1 as in equation (50) (step S61). It then selects a region K in descending order of size, that is, in descending order of the number of pixels in the region (step S62). If a plurality of regions have the same number of pixels, they are selected in ascending order of the number of pixels on the region boundary.

It then calculates the number of boundary regions B(k), the number N(α = 0) of regions adjacent to the region under determination whose region type label is background, and the number N(α = 1) of adjacent regions whose region type label is moving (step S63). Next, γ(k) is calculated by equation (51) (step S64), and the average difference value is calculated by equation (52) (step S65). The label value is then obtained by equation (53) (step S66), and the region type label z(n) is updated with the obtained label value by equation (54) (step S67).

Next, it is determined whether the label assignment process has been completed for all regions (step S68). If it has not yet been completed for all regions (step S68: No), the region with the next largest number of pixels is selected (step S69), and the processing of steps S63 to S67 is repeated. The labels are assigned in this way.
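
The loop of FIG. 16 can be sketched as follows. Several points are assumptions made for this illustration because equations (48) and (51) to (53) are not reproduced in this text: the threshold is taken as T(k) = 1.96σ + γ(k), γ(k) as σ·(N(α=0) − N(α=1))/B(k), B(k) as the number of adjacent regions, and a region as background when the magnitude of its average difference value does not exceed T(k). The tie-break on boundary pixels is omitted for brevity, and all names are hypothetical.

```python
BACKGROUND, MOVING, UNLABELED = 0, 1, -1


def label_regions(sizes, mean_diff, neighbors, sigma):
    """Assign background/moving labels region by region, largest first.

    sizes:     {region_id: pixel_count}
    mean_diff: {region_id: average difference value of the region}
    neighbors: {region_id: [adjacent region ids]}
    """
    z = {k: UNLABELED for k in sizes}                              # step S61
    for k in sorted(sizes, key=lambda r: -sizes[r]):               # steps S62 and S69
        adjacent = neighbors.get(k, [])
        b = len(adjacent)
        n_bg = sum(1 for j in adjacent if z[j] == BACKGROUND)      # step S63
        n_mv = sum(1 for j in adjacent if z[j] == MOVING)
        g = sigma * (n_bg - n_mv) / b if b else 0.0                # step S64
        threshold = 1.96 * sigma + g
        # Steps S65 to S67: compare the regional mean against the threshold.
        z[k] = BACKGROUND if abs(mean_diff[k]) <= threshold else MOVING
    return z


sizes = {"wall": 1000, "person": 300, "speck": 5}
mean_diff = {"wall": 0.1, "person": 8.0, "speck": 2.5}
neighbors = {"wall": ["speck", "person"], "person": ["wall"], "speck": ["wall"]}
result = label_regions(sizes, mean_diff, neighbors, sigma=1.0)
print(result)  # {'wall': 0, 'person': 1, 'speck': 0}
```

In this toy input the small "speck" region has a mean difference of 2.5, which exceeds 1.96σ on its own, but its only neighbor is already labeled background, so the raised threshold of 2.96 keeps it in the background.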

As described above, the moving region detection apparatus according to Embodiment 2 uses equations (51) to (53) to assign to the pixels in each region a region type label identifying the region as background or moving, processing the plurality of regions in descending order of pixel count. Background and moving labels can therefore be assigned stably to each region of the image regardless of noise and the like, so that optimal labels are assigned over the entire image and image processing can be performed with high accuracy.

(Embodiment 3)
In Embodiment 3, the image processing apparatus of the present invention is applied to a stereo matching apparatus.

FIG. 17 is a functional configuration diagram of the stereo matching apparatus 1700 according to Embodiment 3. The stereo matching apparatus 1700 according to this embodiment receives a first parallax image for the left eye and a second parallax image for the right eye and estimates the depth of the space from corresponding points between the two captured parallax images.

As shown in FIG. 17, the stereo matching apparatus 1700 according to this embodiment mainly comprises a region division unit 112, a label assignment unit 111, a corresponding point estimation unit 1701, a frame memory 106, and a memory 103.

In this embodiment, a plurality of parametric models are stored in the memory 103 in advance. This embodiment differs from Embodiment 1 in that the parametric model of equation (55) is stored in advance and the label assignment unit 111 assigns labels using that parametric model. The other processing is the same as in Embodiment 1.

Here, n₁ is the coordinate of a point on the first parallax image and n₂ is the coordinate of a point on the second parallax image. In addition, α is both the model parameter and the model identification label that identifies the parametric model. That is, in this embodiment the model parameter and the model identification label are the same α.

To estimate the depth of the space in stereo matching, the corresponding points (motion) between the parallax images are estimated. In this case, the relationship between corresponding points can often be approximated by horizontal translation alone. For example, when it is known in advance that the horizontal translation lies in the range 0 to 16, the translation amount can be set in the range {0, ..., 16}. In this embodiment, α, which is both the model parameter and the model identification label, is set as the translation amount in the range {0, ..., 16}.
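
A horizontal-translation model of this kind can be sketched as follows. Equation (55) is not reproduced in this text, so the sketch assumes that a point (x, y) on the first image corresponds to (x + α, y) on the second; the shift direction, the sum-of-absolute-differences cost, and all names are assumptions. Only the data term is shown; the regularization term of the full label assignment is omitted.

```python
def shift_cost(left, right, pixels, alpha):
    """Sum of absolute differences when each pixel (x, y) maps to (x + alpha, y)."""
    width = len(right[0])
    cost = 0
    for x, y in pixels:
        if x + alpha >= width:
            return float("inf")  # the shifted point leaves the image
        cost += abs(left[y][x] - right[y][x + alpha])
    return cost


def best_shift(left, right, pixels, candidates=range(17)):
    """Pick the translation in {0, ..., 16} with the lowest cost for one region."""
    return min(candidates, key=lambda a: shift_cost(left, right, pixels, a))


# One-row images: the pattern 10, 20, 30 appears two pixels further right.
left = [[10, 20, 30, 0, 0, 0]]
right = [[0, 0, 10, 20, 30, 0]]
region = [(0, 0), (1, 0), (2, 0)]
print(best_shift(left, right, region))  # 2
```

Because α serves as both the model parameter and the label, choosing the label for a region directly yields its translation, with no separate parameter estimation step.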

The corresponding point estimation unit 1701 is a processing unit that estimates the corresponding points (motion) between the parallax images based on the result of the label assignment process.

Next, the stereo matching process according to this embodiment configured as described above will be described. FIG. 18 is a flowchart showing the procedure of the stereo matching process according to Embodiment 3.

First, the count L is initialized to 1 (step S71). The region division unit 112 then performs the region division process (step S72), dividing (clustering) the frame of the first parallax image into regions based on pixel values and assigning a region label to each region.

Next, the label assignment unit 111 performs the label assignment process (step S73): it selects the parametric model to be assigned to each region (cluster) produced by the region division unit 112 and assigns the model identification label of the selected parametric model to the pixels in that region. The count L is then incremented by 1 (step S74), and it is checked whether the count L has exceeded a predetermined number (step S75).

If the count L has not exceeded the predetermined number (step S75: No), the processing of steps S72 to S74 is repeated.

If, on the other hand, the count L has exceeded the predetermined number in step S75 (step S75: Yes), an optimal parametric model has been selected for each region. Corresponding points are then detected using the optimal parametric models (step S76), and the corresponding point estimation unit 1701 estimates the depth of the space.

The label assignment process of step S73 is performed in the same manner as in Embodiment 1, using the parametric model of equation (55).

As described above, the stereo matching apparatus 1700 according to Embodiment 3 uses the parametric model of equation (55) and performs maximum likelihood estimation by equations (36) to (39), so that plain maximum likelihood estimation is used when a cluster is large and the prior density gradually comes into play as the cluster becomes smaller. Optimal labels can therefore be assigned over the entire image, corresponding points and hence depth can be estimated with high accuracy, and stereo matching using the parametric model can be performed with high accuracy.

Each apparatus of Embodiments 1 to 3 comprises a control device such as a CPU, storage devices such as a ROM (Read Only Memory) and a RAM, external storage devices such as an HDD and a CD drive, a display device such as a display, and input devices such as a keyboard and a mouse, and has a hardware configuration that uses an ordinary computer.

The programs executed by the apparatuses of these embodiments (the motion estimation program, the moving region detection program, and the stereo matching program) are provided as files in an installable or executable format, recorded on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, or a DVD (Digital Versatile Disk).

The programs executed by the apparatuses of these embodiments (the motion estimation program, the moving region detection program, and the stereo matching program) may also be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. They may likewise be provided or distributed via a network such as the Internet.

Each program executed by the devices of the present embodiments (the motion estimation program, the moving region detection program, and the stereo matching program) may also be provided by being incorporated in advance in a ROM or the like.

Each program executed by the devices of the present embodiments (the motion estimation program, the moving region detection program, and the stereo matching program) has a module configuration that includes the units described above (the local motion estimation unit, model definition unit, clustering processing unit, parameter estimation unit, region dividing unit, label assignment unit, moving region detection unit, and corresponding point estimation unit). In terms of actual hardware, a CPU (processor) reads each program from the storage medium and executes it, whereby these units are loaded onto the main storage device, so that the local motion estimation unit, model definition unit, clustering processing unit, parameter estimation unit, region dividing unit, label assignment unit, moving region detection unit, and corresponding point estimation unit are generated on the main storage device.

The present invention is not limited to the above embodiments as they are, and at the implementation stage the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiments. For example, some constituent elements may be deleted from the full set of constituent elements shown in an embodiment. Furthermore, constituent elements of different embodiments may be combined as appropriate.

Block diagram showing the functional configuration of the motion estimation device according to the first embodiment.
Schematic diagram showing a square cluster of side r and 4-neighbor cliques.
Graph showing the numbers of interior and boundary cliques of a cluster and their ratios to the total number of cliques in the cluster.
Schematic diagram showing the relationship between clusters and regularization energy.
Flowchart showing the procedure of the motion estimation processing according to the first embodiment.
Flowchart showing the procedure of the motion clustering processing.
Flowchart showing the procedure of the model parameter estimation processing.
Flowchart showing the procedure of the label assignment processing.
Explanatory diagram showing an example of a target image.
Explanatory diagram showing an example of a reference image.
Explanatory diagram showing the result of region division.
Explanatory diagram showing, as a comparative example, the result of label assignment processing by a conventional method.
Explanatory diagram showing the result of the label assignment processing of the first embodiment.
Functional configuration diagram of the moving region detection device 1400 according to the second embodiment.
Flowchart showing the procedure of the moving region detection processing of the second embodiment.
Flowchart showing the procedure of the label assignment processing according to the second embodiment.
Functional configuration diagram of the stereo matching device according to the third embodiment.
Flowchart showing the procedure of the stereo matching processing according to the third embodiment.

Explanation of symbols

100 Motion estimation device
101 Local motion estimation unit
103 Memory
106 Frame memory
107 Model generation unit
108 Model definition unit
109 Clustering processing unit
110 Parameter estimation unit
111, 1411 Label assignment unit
112, 1412 Region dividing unit
113 Image processing unit
1400 Moving region detection device
1401 Moving region detection unit
1700 Stereo matching device
1701 Corresponding point estimation unit

Claims

An image processing apparatus comprising:
a region dividing unit that divides a first image into a plurality of regions each including a plurality of pixels;
a label assignment unit that selects, from among a plurality of parametric models, each of which is a model for associating a point on the first image with a point on a second image and includes a plurality of model parameters and a model identification label for identifying the model, the parametric model that minimizes the value of an evaluation function indicating the degree of fit to the parametric model and the smoothness representing whether the model identification label changes between adjacent pixels, and that executes label assignment processing for assigning the model identification label of the selected parametric model to the pixels in a region, taking the plurality of regions in descending order of the number of pixels; and
an image processing unit that obtains a correspondence between the first image and the second image based on the parametric model corresponding to the model identification label assigned to each pixel.
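The region-by-region label assignment recited above can be sketched as follows. This is a simplified illustration under stated assumptions, not the evaluation function of the specification: the data term is supplied by a hypothetical callback `data_cost`, the smoothness term simply counts already-labeled 4-neighbors outside the region whose label differs from the candidate, and the weight `smooth_weight` and the function name `assign_labels` are introduced here for illustration only.

```python
def assign_labels(regions, labels, data_cost, smooth_weight=1.0):
    """Assign one label per region, visiting regions from largest to smallest.

    regions   : list of lists of (x, y) pixel coordinates
    labels    : candidate model identification labels
    data_cost : function (pixel, label) -> misfit of that model at the pixel
    Returns a dict mapping each pixel to its chosen label.
    """
    assignment = {}
    for region in sorted(regions, key=len, reverse=True):
        inside = set(region)
        best_label, best_energy = None, float("inf")
        for label in labels:
            fit = sum(data_cost(p, label) for p in region)
            # Smoothness: count neighboring pixels, already labeled and
            # outside this region, whose label differs from the candidate.
            changes = 0
            for (x, y) in region:
                for q in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if q not in inside and q in assignment and assignment[q] != label:
                        changes += 1
            energy = fit + smooth_weight * changes
            if energy < best_energy:
                best_label, best_energy = label, energy
        for p in region:
            assignment[p] = best_label
    return assignment
```

Because large regions are labeled first, a small region with weak evidence of its own tends to inherit the label of its already-labeled neighbors whenever the smoothness penalty outweighs its data term.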

The image processing apparatus according to claim 1, further comprising a model generation unit that generates the parametric model based on characteristics of the first image.

The image processing apparatus according to claim 2, wherein the model generation unit comprises:
a model definition unit that defines the parametric model;
a clustering processing unit that classifies the corresponding points, on the second image, of the points in the first image, and assigns an initial value of the label assignment to each of the classified corresponding points; and
a parameter estimation unit that estimates, for each model identification label and by the least squares method, the model parameters that represent each point on the first image and its corresponding point on the second image with minimum error.
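The per-label least squares estimation can be sketched as below. The claim does not fix the form of the model, so this sketch assumes a six-parameter affine map (x, y) -> (a*x + b*y + c, d*x + e*y + f) and solves the normal equations directly. The function names `solve3`, `fit_affine`, and `fit_per_label` are hypothetical.

```python
def solve3(m, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(m, rhs)]  # augmented copy, inputs untouched
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("points are degenerate (for example, collinear)")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= f * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def fit_affine(points, targets):
    """Least-squares affine parameters [a, b, c, d, e, f] (assumed model form)."""
    ata = [[0.0] * 3 for _ in range(3)]
    atu = [0.0] * 3
    atv = [0.0] * 3
    for (x, y), (u, v) in zip(points, targets):
        row = (x, y, 1.0)
        for i in range(3):
            atu[i] += row[i] * u
            atv[i] += row[i] * v
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve3(ata, atu) + solve3(ata, atv)


def fit_per_label(points, targets, labels):
    """Group point pairs by label and fit one affine model per label."""
    groups = {}
    for p, t, label in zip(points, targets, labels):
        ps, ts = groups.setdefault(label, ([], []))
        ps.append(p)
        ts.append(t)
    return {label: fit_affine(ps, ts) for label, (ps, ts) in groups.items()}
```

Each label needs at least three non-collinear points for the normal equations to be solvable; the sketch raises `ValueError` otherwise.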

The image processing apparatus according to claim 3, wherein the model definition unit defines the parametric model by defining a plane-to-plane geometric transformation from the first image to the second image.

The image processing apparatus according to claim 1, further comprising a local motion estimation unit that estimates motion between the first image and the second image and obtains, from the corresponding point of each point in the first image, a motion vector from that point to the point on the second image, wherein
the parametric model is a model for obtaining, within the region of the model identification label, the motion vector from each point in the first image to the point on the second image,
the clustering processing unit classifies the motion vectors obtained by the local motion estimation unit and assigns an initial value of the label assignment to each classified motion vector,
the parameter estimation unit estimates the model parameters for each model identification label based on the motion vector determined by the parametric model and the motion vector obtained by the local motion estimation unit, and
the image processing unit estimates the motion of the first image with respect to the second image based on the parametric model corresponding to the model identification label assigned to each pixel.

The image processing apparatus according to claim 1, wherein, when a plurality of the regions have the same number of pixels, the label assignment unit executes the label assignment processing on those regions in ascending order of the number of pixels at the boundary of the region.
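The ordering rule above, largest region first with ties broken by fewer boundary pixels, can be sketched with a two-part sort key. The definition of a boundary pixel used here (a pixel with at least one 4-neighbor outside the region) is an assumption, and the function name `processing_order` is hypothetical.

```python
def processing_order(regions):
    """Return region indices: most pixels first; ties go to fewer boundary pixels.

    regions: list of sets of (x, y) pixel coordinates.
    """
    def boundary_size(region):
        # Assumed definition: a pixel is on the boundary if any of its
        # 4-neighbors lies outside the region.
        return sum(
            1 for (x, y) in region
            if any(q not in region
                   for q in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)))
        )

    return sorted(range(len(regions)),
                  key=lambda i: (-len(regions[i]), boundary_size(regions[i])))
```

For example, a 3x3 square and a 1x9 strip both contain nine pixels, but the square has eight boundary pixels and the strip nine, so the square is processed first.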

The image processing apparatus according to claim 1, further comprising a storage unit that stores the parametric model,
wherein the model parameters are set in advance according to the problem.

The image processing apparatus according to claim 1, wherein the parametric model is a model for obtaining a point on a second parallax image, serving as the second image, for a point on a first parallax image, serving as the first image, and includes, as the model parameter and the model identification label, the horizontal displacement between a point on the first parallax image and the point on the second parallax image corresponding to that point.

An image processing apparatus comprising:
a region dividing unit that divides a target image, which includes a moving region that moves relative to a background image that is an image without motion, into a plurality of regions each including a plurality of pixels;
a label assignment unit that executes label assignment processing for assigning a region type label, which identifies a region as a background region or a moving region, to the pixels in a region, taking the plurality of regions in descending order of the number of pixels; and
a moving region detection unit that detects the moving region from the target image based on the region type label assigned to each pixel.

The image processing apparatus according to claim 9, wherein the label assignment unit assigns the region type label of the background region to a region when a difference value between the pixels of the target image and the pixels of the background image is equal to or less than a predetermined threshold, increases the threshold when the region type label indicating the background region has been assigned to all regions, and increases the threshold when the region type label indicating the moving region has been assigned to all regions around the region.
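The basic thresholding rule for region type labels can be sketched as follows. The sketch covers only the fixed-threshold comparison and leaves out the threshold adjustment recited above. Using the mean absolute grey-level difference over a region as the difference value is an assumption, and the function name `label_regions` is hypothetical.

```python
def label_regions(target, background, regions, threshold):
    """Label each region 'background' or 'moving', largest region first.

    target, background : 2-D lists of grey levels with identical shape
    regions            : list of non-empty lists of (x, y) pixel coordinates
    threshold          : a region whose mean absolute difference is at or
                         below this value is treated as background
    Returns a dict mapping each pixel to its region type label.
    """
    labels = {}
    for region in sorted(regions, key=len, reverse=True):
        # Assumed difference value: mean absolute difference over the region.
        diff = sum(abs(target[y][x] - background[y][x]) for (x, y) in region)
        kind = "background" if diff / len(region) <= threshold else "moving"
        for p in region:
            labels[p] = kind
    return labels
```

The moving region is then the set of pixels whose label is "moving".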

An image processing method comprising:
a step of dividing a first image into a plurality of regions each including a plurality of pixels;
a step of selecting, from among a plurality of parametric models, each of which is a model for associating a point on the first image with a point on a second image and includes a plurality of model parameters and a model identification label for identifying the model, the parametric model that minimizes the value of an evaluation function indicating the degree of fit to the parametric model and the smoothness representing whether the model identification label changes between adjacent pixels, and executing label assignment processing for assigning the model identification label of the selected parametric model to the pixels in a region, taking the plurality of regions in descending order of the number of pixels; and
a step of obtaining a correspondence between the first image and the second image based on the parametric model corresponding to the model identification label assigned to each pixel.

An image processing method comprising:
a step of dividing a target image, which includes a moving region that moves relative to a background image that is an image without motion, into a plurality of regions each including a plurality of pixels;
a step of executing label assignment processing for assigning a region type label, which identifies a region as a background region or a moving region, to the pixels in a region, taking the plurality of regions in descending order of the number of pixels; and
a step of detecting the moving region from the target image based on the region type label assigned to each pixel.

An image processing program for causing a computer to execute:
a procedure of dividing a first image into a plurality of regions each including a plurality of pixels;
a procedure of selecting, from among a plurality of parametric models, each of which is a model for associating a point on the first image with a point on a second image and includes a plurality of model parameters and a model identification label for identifying the model, the parametric model that minimizes the value of an evaluation function indicating the degree of fit to the parametric model and the smoothness representing whether the model identification label changes between adjacent pixels, and executing label assignment processing for assigning the model identification label of the selected parametric model to the pixels in a region, taking the plurality of regions in descending order of the number of pixels; and
a procedure of obtaining a correspondence between the first image and the second image based on the parametric model corresponding to the model identification label assigned to each pixel.

An image processing program for causing a computer to execute:
a procedure of dividing a target image, which includes a moving region that moves relative to a background image that is an image without motion, into a plurality of regions each including a plurality of pixels;
a procedure of executing label assignment processing for assigning a region type label, which identifies a region as a background region or a moving region, to the pixels in a region, taking the plurality of regions in descending order of the number of pixels; and
a procedure of detecting the moving region from the target image based on the region type label assigned to each pixel.