JP2000172842A - Unknown target and method for estimating unknown target from observation record of training data

Info

Publication number: JP2000172842A
Application number: JP11337740A
Authority: JP
Inventors: William T Freeman; ウィリアム・ティー・フリーマン; Egon C Pasztor; エゴン・シー・パスツール
Original assignee: Mitsubishi Electric Information Technology Corp; Mitsubishi Electric Research Laboratories Inc
Current assignee: Mitsubishi Electric Information Technology Corp; Mitsubishi Electric Research Laboratories Inc
Priority date: 1998-11-30
Filing date: 1999-11-29
Publication date: 2000-06-23

Abstract

PROBLEM TO BE SOLVED: To solve a general class of low-level vision problems by passing local evidence to adjacent nodes during an inference stage and determining the maximum a posteriori probability of a scene estimate. SOLUTION: Training data 2 are observation records of known targets. A discrete 11 or continuous 12 representation for modeling the training data 2 is selected. The statistical relationships in the training data 2 are learned using the discrete 11 or continuous 12 representation. These relationships are expressed as vectors and matrices, or as a mixture of Gaussian distributions. After the learning stage, inference is carried out on an unknown target. A probability function P_d 21 or P_c 22 is used to infer the likely target 31 from the observation record 32 of the unknown target. This inference is carried out by propagating beliefs locally through a Markov network.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates generally to computer vision, and more particularly to estimating the characteristics of a scene represented by an image. That is, it relates to a method for estimating a target using statistical properties of observation records of known targets.

[0002]

BACKGROUND OF THE INVENTION

One of the common problems in computer vision is how to determine the characteristics of a scene from an image that represents it. Some specific problems are as follows. For motion estimation, the input is typically a temporally ordered sequence of images, for example a video. The problem is how to estimate the velocities of the various things in the video: people, cars, balls, and the moving background. Other problems deal with recovering real-world three-dimensional (3D) structure from 2D images, for example, how to recover the shape of an object from a line drawing, a photograph, or a stereo pair of photographs. Yet another problem is how to recover high-resolution scene details from a low-resolution image.

[0003] Humans make these kinds of estimates all the time, often half-unconsciously. There are also many applications for machines that could do the same. These problems have been studied by many researchers for many years using different approaches, with varying degrees of success. The problem with most known approaches is that they lack machine learning methods that can exploit the power of current processors within a general framework.

[0004]

PROBLEMS TO BE SOLVED BY THE INVENTION

In the prior art, methods have been developed for interpreting images of a blocks world. Other prior art work, using hand-labeled scenes, has analyzed local features of aerial images based on vector codes and has developed rules for propagating scene interpretations. However, these solutions are for specific one-step classifications, and therefore cannot be used to solve a general class of low-level vision problems. Methods for propagating probabilities have been used, but these methods have not been put into a general framework for solving vision problems.

[0005] Alternatively, optical flow can be estimated from images by using a quad-tree to propagate motion information across scales. In that case, a brightness constancy assumption is used, and beliefs about the optical flow velocity are expressed as Gaussian probability distributions.

[0006]

SUMMARY OF THE INVENTION

The present invention analyzes statistical properties of a labeled visual world in order to estimate a visual scene from corresponding image data. The image data may be a single frame or multiple frames. The scene characteristics to be estimated may be projected object velocities, surface shapes, reflectance patterns, or colors. The invention uses statistical properties gathered from labeled training data to form a "best guess" estimate, that is, an optimal interpretation, of the underlying scene.

[0007] The invention operates in a learning stage and an inference stage. During the learning stage, statistical properties of the training data are modeled as probability density functions, for example, mixtures of Gaussian distributions. A Markov network is built. During the inference stage, beliefs and density functions derived from a particular image are propagated around the network to estimate the particular scene corresponding to that image.

[0008] Accordingly, during the learning stage, training data for typical images and scenes are generated synthetically. A parametric vocabulary is generated for both images and scenes. The probability of the image parameters conditioned on the scene parameters (the likelihood function) is modeled, as is the probability of scene parameters conditioned on neighboring scene parameters. These relationships are modeled with a Markov network, in which local evidence is propagated to neighboring nodes during the inference stage to determine the maximum a posteriori probability of the scene estimate.

[0009] How humans interpret scenes is largely unknown, and it is certainly not something that can be stated explicitly in mathematical terms. We describe a visual system that interprets a visual scene by determining the probability of each possible scene interpretation for every local image, and by determining the probability of any two local scenes neighboring each other. The first probability allows the visual system to make scene estimates from local image data, and the second probability allows these local estimates to be propagated. One embodiment uses a Bayesian method constrained by Markov assumptions.

[0010] The method according to the invention can be applied to various low-level vision problems, for example, estimating high-resolution scene details from a low-resolution version of an image, or estimating the shape of an object from a line drawing. In these applications, spatially local statistical information, even without domain knowledge, is sufficient to arrive at a reasonable global scene interpretation.

[0011] In particular, the invention provides a method for estimating a scene from an image. A plurality of scenes are generated, and an image is rendered for each scene. Together these form the training data. The scenes and corresponding images are partitioned into patches. Each patch is quantified as a vector, and the vectors are modeled as probability densities, for example, mixtures of Gaussian distributions. The statistical relationships between patches are modeled as a Markov network. Local probability information is iteratively propagated to neighboring nodes of the network, and the resulting probability density at each node, the "belief", is read out to estimate the scene.

[0012] In one application of the invention, high-resolution details can be estimated from a blurred, that is, low-resolution, image. The low-resolution image is the input "image" data, and the "scene" data are the image intensities of the high-resolution details. The invention can also be used to estimate scene motion from a sequence of images. In this application, the image data are the image intensities from two successive images of the sequence, and the scene data are successive velocity maps indicating the projected velocities of the visible objects at each pixel position. Another application of the invention is the unified treatment of shading and reflectance.

[0013] The invention can also be applied to other estimation problems in which training data and target data can be modeled with probability density functions, for example, speech recognition, seismological studies, and medical diagnostic signals such as EEG and EKG. Furthermore, the probability representations may be discrete or continuous in either the learning stage or the inference stage.

[0014]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiment 1. (Introduction) We describe a method that uses statistical properties of a labeled visual world to estimate the characteristics of a scene from either a single image or multiple images. The scene characteristics to be estimated may include the projected velocities of objects in the scene, the surface shapes of objects, reflectance patterns, or colors. This general method can be applied to many low-level vision problems. The method can also be used to model statistical properties of other complex digital signals, for example, human speech or seismograph signals.

[0015] As shown in FIG. 1, the general method 1 begins with training data 2. The training data are observation records of known targets. The training data 2 may be randomly generated. In step 10, a discrete 11 or continuous 12 representation for modeling the training data is selected. In step 20, statistical relationships in the training data are learned using either the discrete or the continuous representation. These relationships can be expressed as either discrete or continuous probability functions (P_d 21 or P_c 22), for example, vectors and matrices, or mixtures of Gaussian distributions.

[0016] After the learning stage, inferences can be made about unknown targets. In step 30, either P_d 21 or P_c 22 is used to infer the likely target 31 from the observation record 32 of an unknown target. The inference is performed by propagating beliefs locally in a Markov network. In the Markov network, the nodes represent the observation records, and the beliefs are held as discrete or continuous statistical representations. Step 30 may be repeated for other targets that resemble the training data.

[0017] (Generating random scenes and images for the training data) As shown in more detail in FIG. 2, during the training stage the general method 100 generates, in step 110, random scenes x_i (known targets) and corresponding images y_i (observation records) as training data 111. The random scenes and rendered images can be generated synthetically using computer graphics. The synthetic images exhibit some of the characteristics of the unknown images that the system will process.

[0018] (Partitioning scenes into patches) In step 120, the scenes and corresponding images are partitioned into local patches 121. The partitioning may be a patchwork of squares covering the scene and the image. The patches may have multiple sizes, and they may overlap redundantly. For example, the patches may be formed at multiple levels of a Gaussian pyramid. The pyramid may have, for example, five levels of resolution, from fine to coarse. In addition, the patches may represent image information as seen through differently oriented filters.

[0019] All patches that share a given set of criteria, such as resolution and orientation, but differ in spatial position are said to belong to the same class, and are assumed to be drawn from the same statistical distribution. The patches are small enough to allow modeling, yet large enough to convey meaningful information about the overall scene.
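The square patchwork of step 120 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_patches`, the `size` and `stride` parameters, and the 8×8 test grid are all assumptions introduced here. A stride smaller than the patch size produces the overlapping (redundant) patches mentioned above.

```python
def split_into_patches(grid, size, stride):
    """Cut a 2D grid (list of rows) into square patches.

    Returns a list of (row, col, patch) tuples, where (row, col) is the
    top-left corner and patch is a list of `size` rows of `size` values.
    `size` and `stride` are illustrative parameters, not from the source.
    """
    rows, cols = len(grid), len(grid[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patch = [row[c:c + size] for row in grid[r:r + size]]
            patches.append((r, c, patch))
    return patches


# Made-up 8x8 "image" whose pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(8)]

tiles = split_into_patches(image, size=4, stride=4)        # non-overlapping
overlapping = split_into_patches(image, size=4, stride=2)  # redundant overlap
```

The same function would be applied to a scene and its rendered image so that patches at the same position can be paired; repeating it on each level of an image pyramid would give patches at several resolutions.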

[0020] (Quantifying patches as vectors) In step 130, principal component analysis (PCA) is used to determine a representation for each patch. Each patch is expressed as a linear combination of basis functions. The patches 121 are represented as low-dimensional vectors 131. For example, each scene patch may be represented as a five-dimensional vector, and each image patch as a seven-dimensional vector. In other words, each patch of the random training data, scene and image, is represented as a point in, for example, a five-dimensional or seven-dimensional space.

[0021] (Modeling the probability density of the training data) In step 140, the probability densities of all the training data in these low-dimensional spaces are modeled with mixtures of Gaussian distributions. The training data are used to estimate the local patch probabilities in a very general form:

[0022] P(scene), P(image | scene), and P(neighboring scene | scene)

[0023] More specifically, the following three probability densities 141 are modeled:

[0024] (1) the prior probability P(x) of each scene element x, with a different prior probability for each class of scene element;

[0025] (2) the conditional probability of the associated image element y given the scene element x, that is, P(y | x); and

[0026] (3) the conditional probability of a scene element x_1 given a neighboring scene element x_2, that is, P(x_1 | x_2).

[0027] Neighboring elements may be adjacent in spatial position, but they may also be adjacent in any one of the class attributes, such as scale or orientation.

[0028] It may be useful to modify the training data so that they have a probability distribution that is easier to fit with a mixture of Gaussians. For real images, many distributions of interest have a very sharp spike at the origin. Such a peak is difficult to fit and to manipulate with a mixture of Gaussians. The prior probabilities of the scene data can be found from a statistical analysis of the labeled visual data. A second pass can then be made through the training data, randomly deleting each training sample with a probability inversely proportional to the prior probability of the scene data. This gives a biased set of data with a probability distribution that is easier to model.

[0029] (Building the Markov network) In step 150, the last step of the learning stage, the patches and their associated probability densities are organized into a Markov network 200 that represents the statistical relationships between scenes and images. In the Markov network, each node represents a low-dimensional vector: nodes x_i represent scenes and nodes y_i represent images. The edges connecting the nodes represent the statistical dependencies between them.

[0030] When a Gaussian pyramid is used, a node at a given resolution level can be connected to spatially neighboring nodes at the same level and to nodes at the same spatial position at adjacent resolution levels. Connections can also be made to scene elements that differ in some other dimension, such as the orientation of an oriented filter.

[0031] These connections help to remove spatial artifacts while the scene is estimated. The connected Markov network 200 allows each scene node to update its belief based on the accumulated local evidence gathered from other nodes during the inference stage described below. The belief is the combined probability density that will form the final best estimate.

[0032] (Iteratively propagating beliefs and reading out the best estimate) During the inference stage, an unknown target scene 171 is estimated from an unknown observation record, that is, an image 172. Based on the rules described below, step 160 iteratively propagates the Bayesian "belief" at each node to neighboring nodes via messages 161. Bayesian, or regularization, approaches have been used before in low-level vision problems. In contrast to the prior art, however, the present method uses labeled training data (scenes and corresponding images) and strong Markov assumptions.

[0033] In step 170, the best estimate 171 of the corresponding scene at each node, given the observed image information, is read out. This can be done by examining the probability distribution of the belief at each node and taking either the mean or the maximum of that mixture of Gaussians. This tells us which scene value is the best estimate of the true underlying target scene at that position, given the observed image data.
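The readout in step 170 can be illustrated for a one-dimensional belief. This sketch assumes the belief is stored as parallel lists of weights, means, and variances, which is a representation chosen here for illustration; the function names and the grid search used to locate the maximum are likewise assumptions, not the patent's procedure.

```python
import math


def mixture_pdf(x, weights, means, variances):
    """Density of a 1-D Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )


def mixture_mean(weights, means):
    """Mean of the mixture: the weight-averaged component means."""
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, means)) / total


def mixture_mode(weights, means, variances, lo, hi, steps=1501):
    """Approximate maximum of the mixture by evaluating it on a grid."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda x: mixture_pdf(x, weights, means, variances))


# Made-up belief: a sharp component near 3 plus a broad one around 0.
weights, means, variances = [0.7, 0.3], [3.0, 0.0], [0.04, 4.0]
mean_estimate = mixture_mean(weights, means)
mode_estimate = mixture_mode(weights, means, variances, lo=-5.0, hi=10.0)
```

For a belief with one sharp peak and a broad background, as in this made-up case, the mean is pulled toward the background while the maximum stays at the peak, which is one reason to offer both readouts.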

[0034] (Example of a 3×3 Markov network) FIG. 3 shows a simple 3×3 Markov network. For simplicity, all the data are one-dimensional so that they can be plotted. The "scene data" to be estimated are the 1D values x 201 at each node. The 1D image data y 202 arriving at each node are used to estimate x.

[0035] In normal use of the invention, randomly composed computer graphic scenes (known targets) and their corresponding rendered images (observation records) are generated to create a training set of images and scenes. These are used to generate vectors representing training patches of images and scenes, from which the desired prior and conditional statistics are gathered.

[0036] For this simple example, however, synthetic data corresponding to the vectors representing the training patches of images and scenes are constructed directly. The underlying joint probability relationships that govern the images and scenes are constructed as well.

[0037] FIG. 4 shows the joint probability relationship 300 of the variables x and y for this simple example. In FIG. 4, the variable x is along the horizontal axis 301 and the variable y is along the vertical axis 302. If y is zero, the variable x can take any of many possible values, as shown by the broad distribution of the blurred horizontal line 303 in the middle of FIG. 4. If the observation y is 2, then x is somewhere near 3.

[0038] Furthermore, in this simple example, the relationship between the values x of neighboring scene patches is as follows. Moving down one "row" 203 of the network 200 always multiplies the scene data x by 2, and moving one column 204 to the right multiplies the scene data x by 1.5.

[0039] For this simple example, the image data y arriving at the nodes are also constructed. Again for simplicity, y is set to 0 at every node except node 5.

[0040] Consequently, all nodes other than node 5 have broad uncertainty about their values. Node 5 has the observed value y = 2. In this case, the value x at the central node 5 should almost certainly be 3. Bayesian belief propagation then carries that knowledge to all the other nodes in the network 200. The final estimate should be x = 3 at node 5, with the x values of the other nodes increasing by a factor of 1.5 or 2 for each step to the right or downward, respectively, away from node 5 (and decreasing by factors of 1/1.5 and 1/2 for steps in the opposite directions).

[0041] The example network 200 is a nine-node tree in which the nodes are numbered consecutively, starting with 1 at the root of the tree. The local scene state of the i-th node is x_i, and the image evidence at the i-th node is y_i.

[0042] Following the steps of the general method 100 outlined above, we proceed as follows. Training data are gathered from computer graphic simulations of the problem. For this example problem, simulated data are generated by drawing from the known joint distributions of y and x, and of x_1 and the x_2 of its neighboring node.

[0043] For this simple 1D problem, it is not necessary to perform principal component analysis (PCA) to reduce the dimensionality of the data collected at each node. Next, the desired joint probabilities are estimated using mixture-of-Gaussians probability models. See Bishop, "Neural Networks for Pattern Recognition," Oxford, 1995.

[0044] FIG. 5 shows a histogram of the observed values of x, FIG. 6 shows a mixture of Gaussians fitted to the prior probability density, and FIG. 7 shows a pruned version of that mixture. For reasons described below, the mixture is pruned after each multiplication or fitting of probabilities.

[0045] FIG. 8 shows mixtures of Gaussians fitted to some of the required conditional probabilities 141. Using joint data on the co-occurrences of a and b, and given that P(a, b) / P(b) = P(a | b), a mixture of Gaussians is fitted to the model conditional probability P(a | b) by weighting each point by 1/P(b). FIG. 8(a) shows the mixture of Gaussians fitted to the probability density of y given x. FIG. 8(b) shows the mixture of Gaussians fitted to the probability density of the neighbor to the right of x given the value of x, which lies along a straight line of slope 1/1.5. FIG. 8(c) shows the mixture of Gaussians fitted to the probability density of the neighbor below x given the value of x, which lies along a straight line of slope 1/2.

[0046] The belief at each node is computed iteratively according to the rules described below. The first step is to determine what message to pass from each node to each of its neighbors.

[0047] FIGS. 9(a)-(d) graphically show each of the probabilities that are multiplied together to generate the message that node 5 passes to the node above it, node 4, on the first iteration. FIG. 9(a) is the probability from the image, FIG. 9(b) is from node 2, FIG. 9(c) is from node 6, and FIG. 9(d) is from node 8.

[0048] FIG. 9(e) is the product of the probabilities shown in FIGS. 9(a)-(d). Next, the dimensionality of the distribution shown in FIG. 9(e) is augmented to equal that of the distribution shown in FIG. 9(f), by keeping the distribution constant along the dimensions that are contained in FIG. 9(f) but not in FIG. 9(e). The augmented distribution is then multiplied by the conditional density shown in FIG. 9(f), and marginalized along the dimensions of the distribution contained in FIG. 9(e). The result is the message 161, shown in FIG. 9(g), that is sent from node 5 to node 4.

[0049] FIG. 10 graphically shows the probabilities that are multiplied together at node 5, in order: the prior probability, the message from the local image data, the messages from neighboring nodes 4, 2, 6, and 8, and the resulting final belief (estimate) at node 5 at the end of the first iteration.

[0050] FIGS. 11-13 show the "belief" at each node in the network during the first three iterations of the method. As shown in FIG. 11, no information has yet been passed between the nodes, and each node relies only on its local image information y to estimate its x value. Because y = 0 at every node except node 5, those nodes have received little information about their x values, and their beliefs about their x values are very broadly distributed. Node 5 knows that its x value is close to 3, because this is implied by y = 2. The belief shown at each node is P(y | x) P(x), for the appropriate value of y at each node.

[0051] On the second pass, shown in FIG. 12, each node has shared its information with its neighboring nodes. Nodes 2, 4, 6, and 8 have received informative messages from node 5, the only node that knows what value of x it probably has, and these nodes adjust their beliefs about their own x values accordingly. The distribution shown at each node is P(y | x) P(x) multiplied by the messages from each of that node's neighbors.

【００５２】 By the third pass, each node has heard from all nodes up to two away, and therefore every node has received the knowledge from node 5. After the third pass, the mean or maximum of each node's belief is approximately what it should be: $x$ at node 5 has a value of about 3, and the other $x$ values increase by a factor of 1.5 going to the right and by a factor of 2 going down.

【００５３】 (Mixture pruning) Multiplying a probability mixture of $N$ Gaussians by a probability mixture of $M$ Gaussians yields a mixture of $NM$ Gaussians. Because the number of Gaussians therefore grows rapidly when mixtures are multiplied together, the mixtures must be pruned. One could simply threshold away the Gaussians that have very small weights in the mixture, but doing so can give an inaccurate mixture fit.
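The growth in the number of components can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes one-dimensional Gaussians stored as hypothetical `(weight, mean, variance)` tuples, uses the standard closed form for the product of two Gaussian densities, and applies the simple weight threshold that the text warns may give an inaccurate fit. All numbers and the names `multiply_mixtures` and `prune_by_threshold` are made up for the example.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def multiply_mixtures(mix_a, mix_b):
    """Product of two mixtures, each a list of (weight, mean, var) tuples.

    The product of N components with M components has N*M components.
    Weights of the result are left unnormalized.
    """
    out = []
    for wa, ma, va in mix_a:
        for wb, mb, vb in mix_b:
            var = 1.0 / (1.0 / va + 1.0 / vb)
            mean = var * (ma / va + mb / vb)
            # Scale factor of the product of two Gaussian densities.
            scale = gauss_pdf(ma, mb, va + vb)
            out.append((wa * wb * scale, mean, var))
    return out

def prune_by_threshold(mix, rel_threshold):
    """Drop components whose share of the total weight is below the threshold,
    then renormalize the survivors (assumed pruning rule, for illustration)."""
    total = sum(w for w, _, _ in mix)
    kept = [(w, m, v) for w, m, v in mix if w / total >= rel_threshold]
    kept_total = sum(w for w, _, _ in kept)
    return [(w / kept_total, m, v) for w, m, v in kept]

# Hypothetical mixtures: N = 2 and M = 3 components.
mix_n = [(0.6, 0.0, 1.0), (0.4, 3.0, 0.5)]
mix_m = [(0.5, 0.5, 1.0), (0.3, 2.5, 1.0), (0.2, 9.0, 0.2)]

product = multiply_mixtures(mix_n, mix_m)
pruned = prune_by_threshold(product, rel_threshold=1e-3)
print(len(product), len(pruned))
```

Here the component of `mix_m` centered far from everything in `mix_n` contributes negligible weight after multiplication, so thresholding removes its products. A threshold that is too aggressive would also remove components that matter, which is the inaccuracy the paragraph above cautions against.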

【００５４】 (Factorizing the joint probability) The details of the factorization of the joint probability that is used to propagate local evidence to adjacent nodes are described with reference to FIG. 14. The network shown in FIG. 14 has four scene nodes and four image nodes, respectively:

【００５５】 $x_1, \ldots, x_4$ and $y_1, \ldots, y_4$

【００５６】 We seek a factorization of the joint probability that yields rules for propagating local evidence. The factorization repeatedly uses the following three probability manipulation rules.

【００５７】 Rule [1]: the elementary probability identity $P(a, b) = P(a \mid b) P(b)$.

【００５８】 Rule [2]: if node $b$ lies between node $a$ and node $c$, then $P(a, c \mid b) = P(a \mid b) P(c \mid b)$. This is the statement of the conditional independence of $a$ and $c$ given $b$.

【００５９】 Rule [3]: if node $b$ lies between node $a$ and node $c$, then $P(c \mid a, b) = P(c \mid b)$. This is the Markov property, which lets knowledge of the nearest node summarize knowledge of the rest of the chain.

【００６０】 Note that none of these three rules requires the edges connecting the nodes to be directed. This avoids having to make arbitrary choices about causal relationships in the network 200.
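Rules 2 and 3 can be checked numerically on a small example. The sketch below assumes a hypothetical three-node chain a - b - c with binary variables and made-up probability tables; it builds the joint distribution, computes every conditional from that joint by brute force, and reports the largest violation of each rule.

```python
from itertools import product

# Hypothetical chain a - b - c with binary variables; all numbers are made up.
p_a = [0.3, 0.7]
p_b_given_a = [[0.9, 0.1], [0.2, 0.8]]     # p_b_given_a[a][b]
p_c_given_b = [[0.6, 0.4], [0.25, 0.75]]   # p_c_given_b[b][c]

joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product(range(2), repeat=3)
}

def prob(a=None, b=None, c=None):
    """Marginal probability of a partial assignment; None means 'summed out'."""
    return sum(
        p for (ja, jb, jc), p in joint.items()
        if a in (None, ja) and b in (None, jb) and c in (None, jc)
    )

def rule_errors():
    """Largest absolute violation of rule 2 and of rule 3 over all assignments."""
    err2 = err3 = 0.0
    for a, b, c in product(range(2), repeat=3):
        p_ac_given_b = prob(a=a, b=b, c=c) / prob(b=b)
        p_a_given_b = prob(a=a, b=b) / prob(b=b)
        p_c_given_b_val = prob(b=b, c=c) / prob(b=b)
        p_c_given_ab = prob(a=a, b=b, c=c) / prob(a=a, b=b)
        # Rule 2: P(a, c | b) = P(a | b) P(c | b)
        err2 = max(err2, abs(p_ac_given_b - p_a_given_b * p_c_given_b_val))
        # Rule 3: P(c | a, b) = P(c | b)
        err3 = max(err3, abs(p_c_given_ab - p_c_given_b_val))
    return err2, err3

print(rule_errors())
```

Both errors come out at the level of floating-point rounding. The conditionals are computed from the joint table alone, without reference to the direction in which the table was built, which is consistent with the remark that the rules do not depend on edge direction.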

【００６１】 To estimate the maximum a posteriori (MAP) values of the parameters $x_1, x_2, x_3, x_4$, we want to determine $\arg\max_{x_1,x_2,x_3,x_4} P(x_1, x_2, x_3, x_4 \mid y_1, y_2, y_3, y_4)$. This conditional probability differs from the joint probability $P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4)$ only by a factor that is constant over the arguments being varied. We can therefore equivalently choose to find $\arg\max_{x_1,x_2,x_3,x_4} P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4)$, which is easier to determine.

【００６２】 Another useful estimate of each parameter $x_i$ is the mean of the marginal distribution $P(x_i \mid y_1, y_2, y_3, y_4)$. This mean can be found from the joint distribution $P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4)$ by marginalizing (integrating) over all $x$ parameters other than $x_i$. The marginalization yields $P(x_i, y_1, y_2, y_3, y_4)$, which is related to the distribution $P(x_i \mid y_1, y_2, y_3, y_4)$ by a constant scale factor, so the two distributions have the same mean. The factorization steps that follow for MAP estimation also apply to the mean of the marginal distribution, with the following changes: each operation $\arg\max_{x_j}$ is replaced by integration over the variable $x_j$, written $\int_{x_j}$, and the final argmax operation on the belief at a node is replaced by taking the mean of that belief.

【００６３】 For the calculation at each node, the joint probability is factorized differently. Each node $j$ accounts for $P(x_j)$ at the very end of its calculation and never passes that quantity on to its adjacent nodes. This leads to a uniform local evidence propagation algorithm whose output is always optimal given the number of nodes that have reported.

【００６４】 Continuing the example, four different cases are described, one for each of the four nodes in the network 200. First, the factorization performed at each node is described, such that the $\arg\max_{x_j}$ at node $j$ gives the same value as the following expression.

【００６５】 $\arg\max_{x_1,x_2,x_3,x_4} P(x_1, x_2, x_3, x_4 \mid y_1, y_2, y_3, y_4)$

【００６６】 After these four cases, general rules for propagating local evidence are presented. These rules carry out the computation of each of the factorizations.

【００６７】 (Calculation at node 1) Applying rule 1 and then rule 2 gives the following.

【００６８】 $P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4) = P(x_2, x_3, x_4, y_1, y_2, y_3, y_4 \mid x_1) P(x_1) = P(y_1 \mid x_1) P(x_2, x_3, x_4, y_2, y_3, y_4 \mid x_1) P(x_1)$

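【００６９】 Applying rule 1 and then rule 3, the factorization continues as follows.

【００７０】 $P(x_2, x_3, x_4, y_2, y_3, y_4 \mid x_1) = P(x_3, x_4, y_2, y_3, y_4 \mid x_1, x_2) P(x_2 \mid x_1) = P(x_3, x_4, y_2, y_3, y_4 \mid x_2) P(x_2 \mid x_1)$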

【００７１】 Applying rule 2 twice,

【００７２】 $P(x_3, x_4, y_2, y_3, y_4 \mid x_2) = P(y_2 \mid x_2) P(x_3, y_3 \mid x_2) P(x_4, y_4 \mid x_2)$

【００７３】 Applying rule 1 and then rule 3,

【００７４】 $P(x_3, y_3 \mid x_2) = P(y_3 \mid x_2, x_3) P(x_3 \mid x_2) = P(y_3 \mid x_3) P(x_3 \mid x_2)$ and $P(x_4, y_4 \mid x_2) = P(y_4 \mid x_2, x_4) P(x_4 \mid x_2) = P(y_4 \mid x_4) P(x_4 \mid x_2)$

【００７５】 Applying all of these substitutions gives the following.

【００７６】 $P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4) = P(x_1) P(y_1 \mid x_1) P(x_2 \mid x_1) P(y_2 \mid x_2) P(x_3 \mid x_2) P(y_3 \mid x_3) P(x_4 \mid x_2) P(y_4 \mid x_4)$

【００７７】 Letting each argmax slide past the factors that are constant with respect to its variable gives the following.

【００７８】 $\arg\max_{x_1,x_2,x_3,x_4} P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4) = \arg\max_{x_1} P(x_1) P(y_1 \mid x_1) \arg\max_{x_2} P(x_2 \mid x_1) P(y_2 \mid x_2) \arg\max_{x_3} P(x_3 \mid x_2) P(y_3 \mid x_3) \arg\max_{x_4} P(x_4 \mid x_2) P(y_4 \mid x_4)$

【００７９】 The above result is for finding the MAP estimate of the joint posterior probability. As noted above, to find instead the mean of the marginal distribution, take the mean with respect to $x_1$ of the following distribution.

【００８０】 $P(x_1, y_1, y_2, y_3, y_4) = P(x_1) P(y_1 \mid x_1) \int_{x_2} P(x_2 \mid x_1) P(y_2 \mid x_2) \int_{x_3} P(x_3 \mid x_2) P(y_3 \mid x_3) \int_{x_4} P(x_4 \mid x_2) P(y_4 \mid x_4)$
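The node 1 factorization can be compared against an exhaustive search on a small discrete example. The following is a sketch under stated assumptions, not the patent's implementation: the topology (node 2 linked to nodes 1, 3, and 4) is read off the conditionals in the factorization, each scene variable takes `K` hypothetical discrete states, and all probability tables are random placeholders. `evidence[i][x]` stands in for $P(y_{i+1} \mid x_{i+1} = x)$ at the observed image value.

```python
import itertools
import random

K = 3                      # assumed number of discrete states per scene node
rng = random.Random(0)     # fixed seed so the made-up tables are reproducible

def random_dist(n):
    """A random probability vector of length n (placeholder for learned tables)."""
    w = [rng.random() + 0.05 for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]

prior_x1 = random_dist(K)                              # P(x1)
p_x2_given_x1 = [random_dist(K) for _ in range(K)]     # [x1][x2]
p_x3_given_x2 = [random_dist(K) for _ in range(K)]     # [x2][x3]
p_x4_given_x2 = [random_dist(K) for _ in range(K)]     # [x2][x4]
evidence = [[rng.random() + 0.05 for _ in range(K)] for _ in range(4)]

def joint(x1, x2, x3, x4):
    """Joint probability written as the fully factorized product."""
    return (prior_x1[x1] * evidence[0][x1]
            * p_x2_given_x1[x1][x2] * evidence[1][x2]
            * p_x3_given_x2[x2][x3] * evidence[2][x3]
            * p_x4_given_x2[x2][x4] * evidence[3][x4])

def map_x1_brute_force():
    """x1 component of the joint maximizer, by enumerating all K**4 states."""
    best = max(itertools.product(range(K), repeat=4), key=lambda xs: joint(*xs))
    return best[0]

def map_x1_factorized():
    """Same quantity from nested maximizations that follow the factorization."""
    def from_node2(x2):
        m3 = max(p_x3_given_x2[x2][x3] * evidence[2][x3] for x3 in range(K))
        m4 = max(p_x4_given_x2[x2][x4] * evidence[3][x4] for x4 in range(K))
        return evidence[1][x2] * m3 * m4
    belief = [
        prior_x1[x1] * evidence[0][x1]
        * max(p_x2_given_x1[x1][x2] * from_node2(x2) for x2 in range(K))
        for x1 in range(K)
    ]
    return max(range(K), key=lambda x1: belief[x1])

print(map_x1_brute_force(), map_x1_factorized())
```

The nested version touches on the order of K squared products per edge instead of K to the fourth power joint states, which is the practical benefit of sliding each maximization inward. Replacing each inner `max` with `sum` gives the marginal over $x_1$ instead of the max-marginal.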

【００８１】 (Generalization) Rule 1 was used so that $P(x_a)$ appears at node $a$. By rule 2, each edge leaving node $a$ contributes a factor of the form $P(\text{other variables} \mid x_a)$. Each of these strings of "other variables" is factorized again using rules 1 and 2, and rule 3 is used to simplify any extra conditioning variables.

【００８２】 This factorizes the joint probability in a way that reflects the topology of the network as seen from node $a$. For a three-node chain in which nodes $b$ and $c$ branch off from node $a$, the result is as follows.

【００８３】 $P(x_a, x_b, x_c) = P(x_a) P(x_b \mid x_a) P(x_c \mid x_a)$

【００８４】 Including the images $y$ that branch off from each node gives the following.

【００８５】 $P(x_a, x_b, x_c, y_a, y_b, y_c) = P(x_a) P(y_a \mid x_a) P(x_b \mid x_a) P(y_b \mid x_b) P(x_c \mid x_a) P(y_c \mid x_c)$

【００８６】 (Calculation at node 2) Using the three manipulation rules, we write out the different factorization used at node 2. Now the only a priori probability over a single variable is $P(x_2)$.

【００８７】 $\arg\max_{x_1,x_2,x_3,x_4} P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4) = \arg\max_{x_2} P(x_2) P(y_2 \mid x_2) \arg\max_{x_1} P(x_1 \mid x_2) P(y_1 \mid x_1) \arg\max_{x_3} P(x_3 \mid x_2) P(y_3 \mid x_3) \arg\max_{x_4} P(x_4 \mid x_2) P(y_4 \mid x_4)$

【００８８】 (Calculation at node 3) Factorizing $P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4)$ so that the factor $P(x_3)$ is pulled out gives the following.

【００８９】 $\arg\max_{x_1,x_2,x_3,x_4} P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4) = \arg\max_{x_3} P(x_3) P(y_3 \mid x_3) \arg\max_{x_2} P(x_2 \mid x_3) P(y_2 \mid x_2) \arg\max_{x_1} P(x_1 \mid x_2) P(y_1 \mid x_1) \arg\max_{x_4} P(x_4 \mid x_2) P(y_4 \mid x_4)$

【００９０】 (Calculation at node 4) Factorizing $P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4)$ so that the factor $P(x_4)$ is pulled out gives the following.

【００９１】 $\arg\max_{x_1,x_2,x_3,x_4} P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4) = \arg\max_{x_4} P(x_4) P(y_4 \mid x_4) \arg\max_{x_2} P(x_2 \mid x_4) P(y_2 \mid x_2) \arg\max_{x_1} P(x_1 \mid x_2) P(y_1 \mid x_1) \arg\max_{x_3} P(x_3 \mid x_2) P(y_3 \mid x_3)$

【００９２】 (Local propagation rules) A single set of propagation rules results in each of the four calculations above arriving at the four different nodes.

【００９３】 During each iteration, each node $x_j$ gathers evidence and then passes an appropriate message to each connected node $x_k$. The evidence from node $k$ is the most recent message received from it. The evidence from the image $y_j$ is $P(y_j \mid x_j)$.

【００９４】 (1) The message sent from node $j$ to node $k$ starts with the product $Q(j;k)$ of the evidence at node $j$ from all nodes other than node $k$, the node that receives the message. This product includes the local evidence $P(y_j \mid x_j)$ at the node.

【００９５】 (2) The message sent to node $k$ is then $\arg\max_{x_j} P(x_j \mid x_k) Q(j;k)$. A different calculation is used to read out the optimal $x_j$ from node $j$.

【００９６】 (3) To find the $x_j$ that maximizes $P(x_1, x_2, x_3, x_4, y_1, y_2, y_3, y_4)$, take $\arg\max_{x_j}$ over the product of all the evidence at node $j$ and $P(x_j)$.

【００９７】 (Propagation rules, discrete case) The propagation operations may be easier to express for the case of a discrete probability representation. In this embodiment, a discrete probability representation is used during both the learning phase and the inference phase. During training, the co-occurrence histograms $H(y_j, x_j)$ and $H(x_j, x_k)$ are measured for a node $j$ that is adjacent to node $k$. From these histograms, $P(y_j \mid x_j)$ and $P(x_j \mid x_k)$ can be estimated. If a co-occurrence histogram $H(a, b)$ is stored as a matrix with rows indexed by $a$ and columns indexed by $b$, then $P(a \mid b)$ is the row-normalized version of that matrix, taken after a small constant has been added to each count to allow for Poisson arrival statistics. Each row sums to one.
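The step from co-occurrence counts to a conditional probability table can be sketched as follows. The layout is an assumption made for the example: `counts[b][a]` holds one row per value of the conditioning variable, so that normalizing each row yields a distribution over `a` that sums to one. The size of the added constant (`pseudo_count`) and the function name are hypothetical; the text only says the constant is small.

```python
def conditional_from_histogram(counts, pseudo_count=1e-3):
    """Estimate P(a | b) from co-occurrence counts.

    counts[b][a] is the number of times value a was seen together with
    value b (assumed layout: one row per conditioning value b).
    A small constant is added to every count before each row is normalized.
    """
    table = []
    for row in counts:
        smoothed = [c + pseudo_count for c in row]
        total = sum(smoothed)
        table.append([v / total for v in smoothed])
    return table

# Made-up counts for a variable a with three values and b with three values.
histogram = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 0, 0],   # a conditioning value that never occurred in training
]
p_a_given_b = conditional_from_histogram(histogram)
for row in p_a_given_b:
    print([round(p, 4) for p in row])
```

Adding the constant keeps every entry strictly positive, so a combination that was never observed in training does not force a product of messages to zero, and a row with no observations falls back to a uniform distribution.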

【００９８】 Node $j$ receives a column-vector message from each node. To send a message from node $j$ to node $k$, node $j$

【００９９】 (1) multiplies together, term by term, each incoming message (except the one from node $k$), multiplies the result by the column vector $P(y_j \mid x_j)$, and then

【０１００】 (2) performs a "max-matrix multiplication" of the resulting vector with $P(x_j \mid x_k)$.

【０１０１】 The resulting column vector is the message to node $k$.

【０１０２】 The term "max-matrix multiplication" means taking the term-by-term product of the column vector with each row of the matrix, and setting the output at the corresponding index of the output column vector equal to the maximum of those products. For minimum mean squared error (MMSE) estimation, the max-matrix multiplication step is replaced by a conventional product of a vector and a matrix.

【０１０３】 In the discrete probability representation, to read out the best estimate of $x$ at node $j$, multiply together term by term the most recent message from each connected node, multiply by the column vector $P(y_j \mid x_j)$, and multiply by the column vector $P(x_j)$. The index that maximizes the resulting column vector is the best estimate of $x$ in the target scene.
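The discrete message and read-out steps can be sketched with plain lists. This is an illustration under assumptions rather than the patent's code: the link matrix is stored as `link[k_state][j_state]`, standing for $P(x_j \mid x_k)$ with one row per state of the receiving node, and the function names and all numbers are hypothetical.

```python
def elementwise_product(vectors, size):
    """Term-by-term product of any number of equal-length vectors."""
    out = [1.0] * size
    for vec in vectors:
        out = [o * v for o, v in zip(out, vec)]
    return out

def max_matrix_multiply(link, vec):
    """For each row of the matrix, the maximum of the term-by-term products
    with the vector (the "max-matrix multiplication" of the text)."""
    return [max(l * v for l, v in zip(row, vec)) for row in link]

def sum_matrix_multiply(link, vec):
    """Conventional matrix-vector product, the MMSE counterpart."""
    return [sum(l * v for l, v in zip(row, vec)) for row in link]

def message(link_jk, local_evidence_j, other_incoming, use_max=True):
    """Message from node j to node k.

    other_incoming holds the latest messages into j from every neighbor
    except k; local_evidence_j stands for P(y_j | x_j).
    """
    q = elementwise_product([local_evidence_j] + other_incoming,
                            len(local_evidence_j))
    multiply = max_matrix_multiply if use_max else sum_matrix_multiply
    return multiply(link_jk, q)

def read_out(prior_j, local_evidence_j, incoming):
    """Index of the best estimate of x at node j."""
    belief = elementwise_product([prior_j, local_evidence_j] + incoming,
                                 len(prior_j))
    return max(range(len(belief)), key=lambda i: belief[i])

# Hypothetical two-node network j - k with two states per node.
link_jk = [[0.9, 0.1],     # P(x_j | x_k = 0)
           [0.2, 0.8]]     # P(x_j | x_k = 1)
evidence_j = [0.5, 1.0]    # P(y_j | x_j), favoring state 1
msg_max = message(link_jk, evidence_j, other_incoming=[])
msg_sum = message(link_jk, evidence_j, other_incoming=[], use_max=False)
best_k = read_out(prior_j=[0.5, 0.5], local_evidence_j=[1.0, 1.0],
                  incoming=[msg_max])
print(msg_max, msg_sum, best_k)
```

In this toy case node k has uninformative local evidence, so its estimate is driven entirely by the message from node j. In practice messages are often rescaled after each step to avoid numerical underflow; that detail is not taken from the text.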

【０１０４】 (The super-resolution problem) In one application of the invention, high-resolution detail is estimated from a blurred, that is, low-resolution, image. In this application, the image data are the image intensities of the low-resolution image, and the "scene" data are the image intensities of the high-resolution detail.

【０１０５】 The training images start from randomly shaped blobs covered with random surface markings, rendered by computer graphics techniques. First, an oriented band-pass filter is applied to obtain a band-pass image. A spatially varying local multiplicative gain-control factor is then applied to this band-pass image. The gain-control factor is computed as the square root of the squared and blurred value of the band-pass image. This gain control normalizes the strength of the image edges and eases the burden on the subsequent modeling steps. The resulting image represents the "observed" information.

【０１０６】 An oriented high-pass filter is also applied to the rendered image, and the spatially varying local gain-control factor computed from the band-pass image is then applied. The result represents the corresponding target, or "scene", information.

【０１０７】 Many such image/scene pairs were generated to form the training data. Each image/scene pair was divided into patches on the same grid structure at a single spatial scale. PCA was applied separately to the image patches and to the scene patches to obtain a low-dimensional representation of each patch.

【０１０８】 The required conditional and a priori probabilities were determined from the training data, and mixtures of Gaussians were fitted to those data. Local information was propagated to obtain the estimated high-resolution image. In this embodiment, a continuous probability representation is used during both the learning phase and the inference phase.

【０１０９】 (Hybrid probability density representation) The method of propagating probability densities described above can be improved in terms of processing speed. A continuous representation of the probability densities allows a good fit to the input image data during the learning phase. A discrete representation allows fast propagation during the inference phase. A hybrid method that allows both a good fit and fast propagation is described next.

【０１１０】 In this hybrid case, the a priori and conditional distributions 141 of FIG. 2 are evaluated only at a discrete set of scene values, which is different for each node 201 of the Markov network 200 of FIG. 3. The scene values are a sampling of the scenes that render to the image at that node. This focuses the computation on the locally feasible scene interpretations. The conditional probability $P(x_j \mid x_k)$ is the ratio of the Gaussian mixtures $P(x_j, x_k)$ and $P(x_k)$, evaluated at the scene samples at nodes $j$ and $k$, respectively. The conditional probability $P(y_k \mid x_k)$ is $P(y_k, x_k) / P(x_k)$, evaluated at the scene samples at node $k$ for the image value $y_k$ observed there.

【０１１１】 Thus, instead of multiplying mixtures of Gaussians together to propagate information between nodes as described above, probability samples are multiplied together at a discrete set of points in the scene domain. The set of scene samples is different at each node of the network 200 and depends on the image information arriving at that node. The belief at a node is a set of probability weights, one on each of the scene samples at that node. Propagating belief from node $j$ to node $k$ involves the point-by-point product of the vector of discrete samples from $Q(j;k)$ with each row of the link matrix $P(x_j \mid x_k)$ (from propagation rule 1). In accordance with propagation rule 2, each value of the resulting vector is the maximum of the products for the corresponding row. With, for example, 10 to 15 samples, the underlying scene can be described adequately while the processing time is reduced. Using a discrete representation during inference improves the processing speed by more than two orders of magnitude, from 24 hours to about 10 minutes.

【０１１２】 To select the scene samples, the mixture $P(y, x)$ can be conditioned on the image element $y$ observed at each node, and scene elements $x$ can be sampled from the resulting mixture of Gaussians. Better results can be obtained simply by using those scenes from the training set whose images most closely match the image observed at that node. This avoids one mixture-of-Gaussians modeling step.
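The second selection strategy, taking scenes from the training set whose images best match the observation at a node, can be sketched as a nearest-neighbor lookup. The sketch rests on assumptions that are not in the text: patches are represented as short lists of numbers (for instance low-dimensional coefficients), similarity is measured by squared Euclidean distance, and the name `select_scene_samples` and the toy data are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def select_scene_samples(observed_image, training_pairs, num_samples=10):
    """Return the scenes of the training pairs whose images are closest
    to the observed image patch.

    training_pairs is a list of (image_vector, scene_vector) tuples.
    The distance measure is an assumed choice for this illustration.
    """
    ranked = sorted(training_pairs,
                    key=lambda pair: squared_distance(pair[0], observed_image))
    return [scene for _, scene in ranked[:num_samples]]

# Made-up training pairs: (image patch vector, scene patch vector).
training = [
    ([0.0, 0.0], [1.0]),
    ([1.0, 1.0], [2.0]),
    ([0.9, 1.2], [2.5]),
    ([5.0, 5.0], [9.0]),
]
samples = select_scene_samples([1.0, 1.1], training, num_samples=2)
print(samples)
```

Each node ends up with its own short list of candidate scenes, which matches the statement that the set of scene samples differs from node to node and depends on the image information arriving there.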

【０１１３】 In the super-sampling application, this scene estimation method zooms low-resolution images with high quality.

【０１１４】 (Other applications) The invention can also be used to estimate scene motion from a sequence of images. In this application, the image data are the image intensities from two successive images of the sequence, and the scene data are successive velocity maps indicating the projected velocities of the visible objects at each pixel position.

【０１１５】 Another application of the invention is distinguishing shading from reflectance. An image can arise from shading effects on a surface as well as from changes in the reflectance of the surface itself. For example, an image of a shaded surface could arise from the shaded surface itself, or from a flat surface painted to look like the shaded surface (for example, a flat picture of it). The image data for that application would be the image itself. The underlying scene data to be estimated would be the underlying surface shape and reflectance pattern. The method can be used to make the best estimate of the 3D scene and the painted pattern represented by the image.

【０１１６】 The invention can also be used to make estimates about other complex digital signals. For example, it can be applied to any signal that can be statistically sampled and represented as a probability density function, such as speech, seismic data, and medical diagnostic data.

【０１１７】[0117]

【発明の効果】 [Effects of the Invention] The method according to the invention can be applied to a variety of low-level vision problems, for example estimating high-resolution scene detail from a low-resolution version of an image, and estimating the shape of an object from a line drawing. In these applications, spatially local statistical information, without any domain knowledge, is sufficient to arrive at a reasonable global scene interpretation.

【０１１８】 In this description of the invention, specific terms and examples have been used. It is to be understood that various other adaptations and modifications may be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.

[Brief description of the drawings]

【図１】 FIG. 1 is a flowchart of a method for estimating a target.

【図２】 FIG. 2 is a detailed flowchart of a method for estimating a scene from an image according to the invention.

【図３】 FIG. 3 is a graph of the network used by the method to propagate beliefs.

【図４】 FIG. 4 is a graph of the true underlying joint probability relating a scene variable $x$ to an image variable $y$.

【図５】 FIG. 5 is a histogram of the scene values observed in the training data.

【図６】 FIG. 6 is an initial mixture of Gaussians fitted to the distribution shown in the histogram of FIG. 5.

【図７】 FIG. 7 is a pruned version of the fit of FIG. 6.

【図８】 FIG. 8 shows a mixture-of-Gaussians fit to the conditional probabilities observed in the training data.

【図９】 FIG. 9 shows graphs of the probabilities at various nodes of the network (a-d), the product of the probabilities shown in a-d (e), a graph of a conditional density (f), and a graph of the probability passed in a message (g).

【図１０】 FIG. 10 is a graph of the probabilities that are combined to form the belief at a node.

【図１１】 FIG. 11 is a graph of the initial probabilities.

【図１２】 FIG. 12 is a graph of the probabilities after the first iteration.

【図１３】 FIG. 13 is a graph of the probabilities after the second iteration.

【図１４】 FIG. 14 is a graph of a Markov network having four scene nodes and four image nodes.

[Explanation of symbols]

1 general method, 2 training data, 11 discrete, 12 continuous, 31 target, 32 observation, 100 general method, 200 Markov network.

Continuation of the front page
(71) Applicant 597067574: 201 BROADWAY, CAMBRIDGE, MASSACHUSETTS 02139, U.S.A.
(72) Inventor: William T. Freeman, 16 Half Moon Hill, Acton, Massachusetts, United States of America
(72) Inventor: Egon C. Pasztor, 6 Warren Square, Jamaica Plain, Massachusetts, United States of America

Claims

[Claims]

1. A method for estimating an unknown target from observations of the unknown target and training data, comprising the steps of: generating a plurality of known targets and observations of the known targets to form training data; dividing the training data into corresponding subsets; quantifying each subset as a vector and modeling a probability for each vector; iteratively propagating local probability information, for the observations of the unknown target and the probabilities of the training data, to adjacent nodes of a network; and reading the probabilities at each node to estimate the unknown target from the observations of the unknown target and the training data.

2. The method of claim 1, wherein the dividing and the quantifying are performed during a learning phase, and the propagating and the reading are performed during an inference phase.

3. The method of claim 2, wherein the probabilities are represented as discrete functions during the learning phase and the inference phase.

4. The method of claim 2, wherein the probabilities are represented as continuous functions during the learning phase and the inference phase.

5. The method of claim 2, wherein the probabilities are represented as continuous functions during the learning phase and as discrete functions during the inference phase.

6. The method of claim 5, wherein the discrete functions comprise vectors and matrices, and the continuous functions are mixtures of Gaussians.

7. The method of claim 1, wherein the unknown target is a scene to be estimated, and the training data include random targets and corresponding images of the random targets.

8. The method of claim 1, wherein the network is a Markov network and the nodes of the network represent the observations of the unknown target.

9. The method of claim 1, further comprising the step of repeating the inference step for different observations of an unknown target.