JP4934810B2

JP4934810B2 - Motion capture method

Info

Publication number: JP4934810B2
Application number: JP2006355681A
Authority: JP
Inventors: 聖二石川; ジュークイタン
Original assignee: Kyushu Institute of Technology NUC
Current assignee: Kyushu Institute of Technology NUC
Priority date: 2006-12-28
Filing date: 2006-12-28
Publication date: 2012-05-23
Anticipated expiration: 2026-12-28
Also published as: JP2008165580A

Description

The present invention relates to a motion capture method for reproducing three-dimensional motion, for example that of an animation, a three-dimensional avatar, or a robot, from a two-dimensional motion image of a target moving object, in particular a person, acquired by an image input means such as a video camera.

Motion capture is a conventional technique for measuring human motion and building a three-dimensional model of it.
The main motion capture methods are, for example, the mechanical method, the magnetic method, and the optical method (see, for example, Patent Documents 1 to 4).
Of these, the optical method only requires photographing the person, who is the target moving object, with cameras, so it restricts motion the least and is widely used. It has been used to produce three-dimensional content in fields related to three-dimensional media that express movement or motion, such as movies, video games, sports, and dance. This is because these fields have studios equipped with many cameras; by performing the motions in such a studio as the scenario requires, motion data can be obtained and a three-dimensional model can be built.

Patent Documents: JP 2005-345161 A; JP 2004-101273 A; JP 2000-321044 A; JP H10-74249 A (Japanese Patent Laid-Open No. 10-74249)

However, the conventional motion capture methods still have the following problems.
The mechanical method requires a measuring device to be worn on the body, which limits the motions that can be expressed.
The magnetic method can be used only in an environment in which a magnetic field is generated, which restricts where it can be applied.
The optical method, although widely used, requires camera calibration (arrangement and position setting) in advance, and markers for measuring the person's movement must also be attached to the body, so workability is poor. Using the visual volume intersection method (back-projection method) removes the need for markers, but camera calibration is still required. Conversely, a factorization-based method removes the need for camera calibration but requires markers.

The present invention has been made in view of these circumstances, and its object is to provide a motion capture method that can reproduce three-dimensional motion automatically and at high speed, regardless of the observation direction, from a two-dimensional motion image of a target moving object acquired by an image input means.

A motion capture method according to the present invention that meets this object is a method for reproducing the three-dimensional motion of a target moving object from a two-dimensional motion image of the target moving object acquired by an image input means, and comprises:
an eigenspace data creation step in which an eigenspace data creation means creates in advance, for each basic motion of a moving object, eigenspace data A in which each frame image data A of that basic motion is represented as a point, and stores it in a storage means as a database;
a tree structure creation step in which a tree structure creation means decomposes the eigenspace data A stored in the database by the eigenspace data creation means into a group of tree structures (distributes it within the tree structure) according to the information held by the basic motions of the moving object, and stores it in the storage means in structured form; and
a discrimination step in which a discrimination means compares eigenspace data B, in which frame image data B of the motion of the target moving object to be discriminated is represented as a point, with the eigenspace data A for each basic motion structured by the tree structure creation means, selects the eigenspace data A closest to the eigenspace data B, and thereby identifies the three-dimensional motion of the target moving object.
In the tree structure group, the root and the nodes are built from keys, which are values representing boundaries and can be compared in magnitude; the eigenspace data A is stored only in the leaves; and the comparison between the eigenspace data B and the eigenspace data A in the discrimination step is performed using the keys.

Here, a tree structure is a way of partitioning the information held by the basic motions of a moving object, for example partitioning the basic motions by their features; known examples include the B-tree, the B*-tree, and the B+-tree.
When the moving object is a person, the basic motions include, for example, lifting a heavy load, picking up an object, crouching while holding the stomach because of abdominal pain, covering the head with both hands to avoid an object falling from overhead, walking, and falling down. Besides people, the moving object may be an animal, a vehicle such as a car, or a robot.

In the motion capture method according to the present invention, each frame image data A for each basic motion is preferably obtained by overlaying frame images two at a time, either consecutive or spaced apart, deleting the unchanged portions to obtain a plurality of difference images, and then superimposing those difference images.
Here, a difference image is, for example, an image obtained by deleting the background of the moving object; deleting the background reduces the amount of data to be processed.

In the motion capture method according to the present invention, the moving object in the eigenspace data creation step is preferably a pseudo-human model or a person, and the image input means is a group of virtual cameras when the moving object is a pseudo-human model and a group of cameras when it is a person; the basic motions of the moving object are photographed from multiple directions with the image input means to obtain a plurality of frame image data A for each basic motion.
Here, a pseudo-human model is a three-dimensional model of a person, generally called an avatar. Using a pseudo-human model makes it possible to standardize the moving object that performs the basic motions.

In the motion capture method according to the present invention, the eigenspace data A is preferably created after applying differential processing to the frame image data A.
Here, differential processing is performed with, for example, a LoG (Laplacian of Gaussian) filter or a Sobel filter. The LoG filter blurs the image data and then differentiates it.
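As an illustration of the differential processing mentioned above, the following is a minimal sketch of a Sobel gradient-magnitude filter in plain Python. The function name `sobel_magnitude` and the list-of-rows image representation are assumptions made for this example, and the Sobel filter is used here only because it is the shorter of the two filters the text names.

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude of a 2-D grayscale image.

    img is a list of equal-length rows of numbers. Border pixels are
    left at 0.0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal derivative (right column minus left column).
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical derivative (bottom row minus top row).
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```

Because the output responds only to intensity changes, uniform regions such as plain clothing contribute nothing, which is consistent with the stated aim of reducing sensitivity to differences in appearance.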

In the motion capture method according to the present invention, the eigenspace data A is preferably obtained by projecting the frame image data A onto an eigenspace created from the eigenvalues and eigenvectors obtained by applying the Karhunen-Loève transform to the frame image data A, and the eigenspace data B is preferably obtained by projecting the frame image data B onto the same eigenspace.
Here, the Karhunen-Loève transform, also called the Karhunen-Loève expansion, is a method of converting high-dimensional frame image data into low-dimensional data.

In the motion capture method according to claims 1 to 5, the eigenspace is built using the eigenspace data A and B, in which the frame image data A and B are represented as points, so the amount of data to be processed is reduced and the motion of the target moving object can be processed at high speed. In addition, because the eigenspace data A is decomposed into a group of tree structures, the eigenspace data A can be selected and the three-dimensional motion of the target moving object identified without comparing the eigenspace data B against all of the eigenspace data A, which speeds up processing further.
Therefore, by storing a large number of basic motions of the moving object in the storage means, the motion of an unspecified person at an arbitrary location, for example, can be photographed with an image input means such as a video camera, regardless of the observation direction, and reproduced in three dimensions.

In particular, in the motion capture method according to claim 2, each frame image data A for each basic motion is obtained by superimposing a plurality of difference images, so the amount of data to be processed is reduced and the processing time for motion recognition is shortened further.
In the motion capture method according to claim 3, when the plurality of frame image data A is obtained by having a pseudo-human model perform the basic motions, the data is that of a standardized person and differences in body shape are eliminated. Furthermore, because motion images of the pseudo-human model or the person observed from multiple directions are used, the motion can be discriminated no matter from which direction the target object is observed.

In the motion capture method according to claim 4, the eigenspace data A is created after applying differential processing to the frame image data A, so errors (noise) caused by, for example, differences in clothing are reduced and the pseudo-human model, for example, can be standardized further.
In the motion capture method according to claim 5, the eigenspace data A is obtained by projecting the frame image data A onto an eigenspace created from the eigenvalues and eigenvectors obtained by applying the Karhunen-Loève transform to the frame image data A, and the eigenspace data B is obtained by projecting the frame image data B onto that eigenspace, so the dimensionality is reduced and the processing time for motion recognition is shortened.

Next, an embodiment of the present invention is described with reference to the accompanying drawings to aid understanding of the invention.
The motion capture method according to one embodiment of the present invention reproduces the three-dimensional motion of a target person by comparing a two-dimensional motion image of the target person (an example of a target moving object), acquired by a single video camera (an example of an image input means), with a plurality of basic motions performed by a person (an example of a moving object) and registered in advance. The details follow.

First, the eigenspace data creation step, which creates the eigenspace data A of the basic motions performed by a person, is described.
A plurality of video cameras (for example, four; any camera capable of recording video will do) are arranged at equal distances from the person and at equal angular intervals, and each basic motion performed by the person (for example, lifting a heavy load, picking up an object, crouching while holding the stomach because of abdominal pain, covering the head with both hands to avoid an object falling from overhead, walking, and falling down) is recorded. The video camera may be, for example, a CCD camera, a high-speed camera, a handheld camera, a digital VTR, or a digital video camera.
Next, the video of each basic motion is loaded into a computer. The operations that follow are computed within the computer and processed by a program in the computer.

Of the images loaded into the computer, the consecutive frame images obtained from each video camera, for example at a rate of 1 to 50 frames per second, are superimposed by a preprocessing means in the computer. At this time, the unchanged portions, for example the background around the person (such as walls, floor, and sky), are deleted; portions of the person's image that do not move (which may or may not include slightly moving portions) may also be deleted. The frame images may also be sampled at intervals, for example every second or every third frame.
As a result, a single compressed image in which the sequence of a basic motion appears as an afterimage can be stored in the storage means of the computer as the frame image data A of the basic motion.
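One way to realize the superposition with unchanged portions removed is to accumulate the differences between consecutive frames. The sketch below is only an illustration under stated assumptions: frames are lists of rows of grayscale values, the function name `compress_motion` and the `threshold` parameter are hypothetical, and taking the per-pixel maximum is one possible way to "superimpose" the difference images.

```python
def compress_motion(frames, threshold=0):
    """Collapse a frame sequence into one image of accumulated motion.

    frames: list of 2-D images (lists of rows) of identical size.
    Pixels whose value changes by no more than `threshold` between
    consecutive frames (static background) contribute nothing.
    """
    h, w = len(frames[0]), len(frames[0][0])
    acc = [[0] * w for _ in range(h)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(h):
            for x in range(w):
                d = abs(cur[y][x] - prev[y][x])
                if d > threshold:
                    # Superimpose: keep the strongest change seen so far.
                    acc[y][x] = max(acc[y][x], d)
    return acc
```

With two one-row frames `[[5, 9, 0]]` and `[[5, 0, 9]]`, the static pixel of value 5 disappears and the two positions touched by the moving object remain, which is the afterimage-like result described in the text.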

Alternatively, the frame images may be overlaid two consecutive frames at a time, the unchanged portions (for example, the background around the person) subtracted out, and the resulting difference images superimposed to obtain a single compressed image.
The three-dimensional data of the basic motions described above was obtained by having a person actually perform them, but it may instead be created with a pseudo-human model using computer graphics, or with a pseudo-human model driven by data acquired in advance by another motion capture method.
In that case, the pseudo-human model performs the person's basic motions and is photographed by a virtual camera group consisting of many (for example, six or more) virtual video cameras arranged around the model at equal spacing and equal angular intervals in one or more of the horizontal, upward, and downward directions, to obtain the plurality of frame image data A.

Here, each frame image data A is a set of images from the video of a basic motion. If one image is, for example, 256 pixels high and 256 pixels wide, the total number of pixels is 65536, so 65536-dimensional (N-dimensional) data is obtained. If, for example, video is recorded for 2 seconds at 15 frames per second, 30 (P) frame images are obtained from one direction; as described above, preprocessing compresses the frame images representing the motion into a single image.
Next, the eigenspace data creation means in the computer creates in advance, for each basic motion of the person, the eigenspace data A in which each frame image data A of the basic motion is represented as a point. The eigenspace data A can be created in the same way as in Japanese Patent Application No. 2005-237785.

The obtained frame image data A of one basic motion (hereinafter also simply called an image) is normalized and scanned in the same manner as a conventional TV raster scan to obtain the vector in Equation (1).
$\mathbf{x}_p = (x_1, x_2, \ldots, x_N)^T$ ... (1)
Here, the elements of the vector are the pixel values in scan order, N is the number of pixels, T denotes transposition, and $\mathbf{x}_p$ is normalized so that $\|\mathbf{x}_p\| = 1$.
Next, the matrix X with N rows and P columns is defined by Equation (2).
$X \equiv (\mathbf{x}_1 - \mathbf{c},\ \mathbf{x}_2 - \mathbf{c},\ \ldots,\ \mathbf{x}_P - \mathbf{c})$ ... (2)
Here, $\mathbf{c}$ is the mean of the images and is calculated by Equation (3).
$\mathbf{c} = \frac{1}{P}\sum_{p=1}^{P} \mathbf{x}_p$ ... (3)
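The raster scan, normalization, mean image, and centered matrix X described above can be sketched as follows. This is a minimal illustration in plain Python; the function names `to_unit_vector` and `centered_matrix` and the list-based data layout are assumptions made for the example.

```python
import math

def to_unit_vector(img):
    """Raster-scan a 2-D image into a vector and scale it to unit length."""
    v = [float(p) for row in img for p in row]      # row by row, left to right
    norm = math.sqrt(sum(p * p for p in v))
    return [p / norm for p in v] if norm > 0 else v

def centered_matrix(vectors):
    """Build the mean image c and the N-row, P-column matrix X.

    vectors: list of P image vectors, each of length N.
    Column p of X is x_p - c.
    """
    P, N = len(vectors), len(vectors[0])
    c = [sum(v[i] for v in vectors) / P for i in range(N)]
    X = [[vectors[p][i] - c[i] for p in range(P)] for i in range(N)]
    return X, c
```

For a 256 x 256 image the vector has N = 65536 elements, so in practice an array library would be used; plain lists are kept here only to make each step explicit.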

The covariance matrix Q is defined from the matrix X by Equation (4).
$Q = XX^T$ ... (4)

By the Karhunen-Loève transform, the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$ of the covariance matrix Q are obtained using Equation (5), where $\lambda_1 > \lambda_2 > \cdots > \lambda_N$.
$Q\mathbf{u} = \lambda\mathbf{u}$ ... (5)
Here, $\mathbf{u}$ is a vector with N components.
From the obtained eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$, the eigenvectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_N$ are obtained.

Here, the k largest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$ and the corresponding eigenvectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_k$ are selected, and the space spanned by the k eigenvectors, that is, the k-dimensional eigenspace ES in Equation (6), is created.
$ES(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_k) \equiv ES$ ... (6)
Note that $k \ll N$, and the transformation matrix E that maps image data onto the eigenspace ES is given by Equation (7). For example, with k = 100, the dimensionality is reduced from N to k, that is, from 65536 to 100.
$E = (\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_k)$ ... (7)
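To make the construction of Q and the selection of the k leading eigenvectors concrete, here is a small sketch in plain Python. The source does not say how the eigenvalue problem is solved; power iteration with deflation is used here purely as an illustrative stand-in, and the names `covariance` and `top_k_eigen`, the iteration count, and the start vector are all assumptions.

```python
import math

def covariance(X):
    """Q = X X^T for an N-row, P-column matrix X given as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]

def top_k_eigen(Q, k, iters=500):
    """Return the k largest eigenvalues and unit eigenvectors of symmetric Q."""
    n = len(Q)
    Q = [row[:] for row in Q]                 # work on a copy
    values, vectors = [], []
    for idx in range(k):
        # Deterministic start vector (hypothetical choice for this sketch).
        u = [1.0 / (i + 1 + idx) for i in range(n)]
        for _ in range(iters):
            w = [sum(q * x for q, x in zip(row, u)) for row in Q]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            u = [x / norm for x in w]
        Qu = [sum(q * x for q, x in zip(row, u)) for row in Q]
        lam = sum(a * b for a, b in zip(u, Qu))   # Rayleigh quotient
        values.append(lam)
        vectors.append(u)
        # Deflate: remove the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                Q[i][j] -= lam * u[i] * u[j]
    return values, vectors
```

The returned vectors play the role of the columns of E. For N = 65536 an N x N matrix is impractical to form directly, so a real implementation would rely on a numerical library and a more economical formulation; this sketch is only meant for small N.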

Here, each frame image data A is projected onto the eigenspace ES by Equation (9) to obtain the set of points $\mathbf{g}_p$ as the eigenspace data A.
$\mathbf{g}_p = (\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_k)^T \mathbf{x}_p$ ... (9)
In this way, a person's posture is registered as a single point in the eigenspace.
The obtained set of points $\mathbf{g}_p$ is stored in the storage means as a database.
When creating the eigenspace data A, each image of the frame image data A loaded into the computer may first be passed through a conventional LoG filter, which blurs and differentiates it.
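The projection in Equation (9) is a set of k dot products, one per eigenvector. The following minimal sketch assumes E is held as a list of k eigenvectors, each of length N; the function name `project` is hypothetical.

```python
def project(E, x):
    """Project an N-dimensional image vector x onto the eigenspace.

    E: list of k eigenvectors (each a list of N numbers).
    Returns g = E^T x, a list of k coordinates.
    """
    return [sum(e_i * x_i for e_i, x_i in zip(e, x)) for e in E]
```

For example, `project([[1, 0, 0], [0, 1, 0]], [0.6, 0.8, 0.0])` gives the two coordinates of the point that represents the image in a 2-dimensional eigenspace.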

In the same way, eigenspace data A is created from each frame image data A of the basic motion photographed from the other directions, and the resulting sets of points are stored in the storage means as a database.
Likewise, eigenspace data A is created from all the frame image data A of the plurality of basic motions and stored in the storage means as a database.
Next, the tree structure creation step is described, in which the tree structure creation means in the computer decomposes the eigenspace data A stored in the database by the eigenspace data creation means into a group of tree structures according to the information held by the person's basic motions. Known tree structures include, for example, the B-tree, the B*-tree, and the B+-tree.

The idea of applying a B-tree to the eigenspace is to divide the eigenspace into a number of bins (storage boxes: the group of tree structures formed by decomposition according to the information held by the person's basic motions), each storing the postures represented as points, and to find quickly the bin that holds images similar to an input unknown posture.
By introducing the B-tree structure into the eigenspace and structuring it, the eigenspace in which compressed images are represented as points is divided into a plurality of bins, and the bins are represented by the B-tree structure.
This eigenspace representing human motion is called the motion database.

The B-tree is described here.
A tree satisfying the following conditions is called a B-tree T belonging to τ(m, H). Here, m is the maximum number of children that the root or a node can have, and H is the height of the tree, which affects search speed.
1. The root is either a leaf or has 2 to m children.
2. Every node other than the root and the leaves has [m/2] to m children, where [x] denotes the largest integer not exceeding x.
3. The paths from the root to all leaves have the same length.
In a B-tree, the "value representing a boundary" derived from the stored data, that is, the key, plays an important role, and the root and the nodes are built from keys. A key is a scalar value that can be compared in magnitude. Data is stored only in the leaves.
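The three structural conditions above can be checked mechanically. The sketch below is an illustration only: it assumes a tree written as nested Python lists, where a list is an internal node whose items are its children and anything else is a leaf, and the function name `is_btree` is hypothetical. It follows the text in using the floor of m/2 as the lower bound for internal nodes.

```python
def is_btree(root, m):
    """Check the three structural B-tree conditions on a nested-list tree."""
    depths = set()

    def walk(node, depth, is_root):
        if not isinstance(node, list):        # leaf: record its depth
            depths.add(depth)
            return True
        # Condition 1 for the root, condition 2 for other internal nodes.
        lower = 2 if is_root else m // 2
        if not (lower <= len(node) <= m):
            return False
        return all(walk(child, depth + 1, False) for child in node)

    # Condition 3: every leaf lies at the same depth.
    return walk(root, 0, True) and len(depths) == 1
```

For example, `is_btree([["a", "b"], ["c", "d"]], 4)` holds, while `is_btree([["a", "b"], "c"], 4)` fails because the leaves lie at different depths. Keys are omitted here; the sketch checks only the shape of the tree.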

When this B-tree is applied to the eigenspace, the coordinate values along each eigenspace axis $\mathbf{e}_k$ (k = 1, 2, ..., K) are divided into R sections of width L to build the tree structure.
When an image $I_P$ is projected by Equation (9) onto the point $\mathbf{g} = (g_1, g_2, \ldots, g_K)$ of the eigenspace, each $g_k$ (k = 1, 2, ..., K) falls in one of the sections, so it is assigned that section's unique number $S_{k,r}$ (r = 0, 1, ..., R−1).
As a result, $\mathbf{g}$ is converted by Equation (10) into $S_P$, a K-digit base-R number.
$S_P = S_{1,r_1} S_{2,r_2} S_{3,r_3} \cdots S_{K,r_K}$ ... (10)
The image is then distributed to and stored in the tree-structured B-tree T with $S_P$ as its key, and this is stored in the storage means in structured form.
By the above method, each basic motion of the person is stored in the database.
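The conversion of an eigenspace point into a K-digit base-R key can be sketched as follows. The lower bound `lo` of the coordinate range, the clamping of out-of-range values, and the function name `section_key` are assumptions made for this example; the text specifies only that each axis is divided into R sections of a given width.

```python
def section_key(g, lo, width, R):
    """Encode point g as a K-digit base-R integer, one digit per axis.

    Each coordinate is mapped to a section number in 0..R-1, where
    section r covers [lo + r*width, lo + (r+1)*width). Values outside
    the range are clamped to the first or last section (an assumption).
    """
    key = 0
    for gk in g:
        r = int((gk - lo) // width)
        r = min(max(r, 0), R - 1)
        key = key * R + r          # append r as the next base-R digit
    return key
```

With `lo=-1.0`, `width=0.5`, and `R=4`, the point `[-0.5, 0.1, 0.9]` falls in sections 1, 2, and 3, giving the base-4 number 123, that is, 27. Because the key is a single integer, it can be compared in magnitude, as a B-tree key must be.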

Next, the eigenspace data creation means in the computer creates the eigenspace data B, in which the frame image data B of the target person to be discriminated is represented as a point.
First, the motion of the target person is recorded with a single video camera.
The motion images are loaded into the computer to obtain the set y of frame image data B in Equation (11).
$y = (\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_P)$ ... (11)
Then, using the same preprocessing as for the eigenspace data A, the consecutive frames representing the motion are compressed into a single image to create the frame image data B.

Further, the frame image data B (denoted $\mathbf{y}'$) is projected by Equation (12) onto the eigenspace ES created from the eigenvalues and eigenvectors described above, to obtain the point $\mathbf{h}$, which is the eigenspace data B.
$\mathbf{h} = E^T \mathbf{y}' = (\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_k)^T \mathbf{y}'$ ... (12)
Next, the discrimination step is described, in which the discrimination means in the computer compares the eigenspace data B with the eigenspace data A for each basic motion of the person structured by the tree structure creation means.

In human posture recognition, an image $I_{P'}$ with an unknown posture is projected onto the eigenspace, and its section number $S_{P'}$ is obtained by Equation (10). Next, T is searched with $S_{P'}$ as the search key to obtain the candidate postures $\mathbf{g}_{pr}$ (r = 1, 2, ..., R).
Finally, applying Equation (13) selects the point $\mathbf{g}_i$ of the eigenspace data A that is closest (minimum distance) to the point $\mathbf{h}$ of the eigenspace data B, giving the closest posture $p' = p^*$.
$d_p^{*} = \min \|\mathbf{g}_{pr} - \mathbf{g}_p\|$ ... (13)
Since $R \ll P$ is expected here, the search speed is greatly improved.
Here, R is much smaller than the total number of basic motions registered in the motion database.
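The two-stage search described above, a key lookup followed by a nearest-point comparison among the few candidates in the bin, can be sketched as follows. A plain dictionary stands in for the B-tree here, which is a simplification and not the structure the text specifies; the names `build_index` and `nearest_candidate`, the `key_fn` argument, and the optional `max_distance` cutoff are assumptions made for the example.

```python
import math
from collections import defaultdict

def build_index(points, key_fn):
    """Group labelled eigenspace points into bins by their key.

    points: iterable of (label, coordinates) pairs.
    key_fn: maps coordinates to a comparable, hashable key.
    """
    index = defaultdict(list)
    for label, g in points:
        index[key_fn(g)].append((label, g))
    return index

def nearest_candidate(index, h, key_fn, max_distance=None):
    """Return (label, distance) of the closest point in h's bin, or None."""
    candidates = index.get(key_fn(h), [])
    if not candidates:
        return None
    label, g = min(candidates, key=lambda c: math.dist(c[1], h))
    d = math.dist(g, h)
    if max_distance is not None and d >= max_distance:
        return None                # too far away to count as a match
    return label, d
```

Only the points that share the query's key are compared, which is where the saving over comparing against every stored point comes from. A query that lies near a bin boundary can miss a closer point in a neighbouring bin; this sketch does not address that case.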

In this way, a person's motion is recorded on video from an arbitrary direction, the closest motion is sought by searching the motion database, and, if found, it is identified as the three-dimensional motion, which can then be reproduced in a three-dimensional medium such as an animation, an avatar, or a robot.
The search retrieves the compressed image closest to the eigenspace data B; because this compressed image carries the original motion information (that is, the basic motion of the moving object), the three-dimensional motion can be reproduced by referring to it.
When an unknown motion is recorded by a camera, preprocessing compresses the consecutive frame images representing the motion into a single image I, and the compressed image closest to image I is retrieved from the database of basic motions in the computer. Because this database has a B-tree structure, as described above, the search is fast. The image at the shortest distance from image I is thus retrieved, and if this distance is smaller than a certain threshold, the unknown motion is judged to be that motion.
By the above method, motion capture by motion database search is realized.

Next, an example carried out to confirm the effects of the present invention is described.
Here, a method is described in which the motion capture method of the present invention is applied to reproduce the three-dimensional motion of a target person from a two-dimensional motion image of the target person acquired by a video camera.
First, as shown in FIGS. 1(A), 1(B), 2(A), 2(B), 3(A), and 3(B), a person performs each of the basic motions, namely lifting a heavy load, picking up an object, crouching while holding the stomach because of abdominal pain, covering the head with both hands to avoid an object falling from overhead, walking, and falling down, and these motions are recorded continuously with a video camera and input to the computer. For convenience of explanation, only video taken from one direction is shown here.

Next, the preprocessing described above will be explained.
Here, the frames of each basic motion in FIGS. 1 to 3 are superimposed, motion by motion, to obtain the single images shown in (a) to (f) of FIG. 4(A). FIG. 1(A) corresponds to (a) of FIG. 4(A), FIG. 1(B) to (b), FIG. 2(A) to (c), FIG. 2(B) to (d), FIG. 3(A) to (e), and FIG. 3(B) to (f).
The background is then deleted from images (a) to (f) of FIG. 4(A) to obtain the extracted images shown in FIG. 4(B), and the normalization described above is applied to obtain the images shown in FIG. 5.

The preprocessing can also be performed as follows.
Here, for each basic motion in FIGS. 1 to 3, every two consecutive frame images are overlaid, and the unchanged portion, that is, the background around the person (for example, walls, floor, and sky), is subtracted out and deleted. Superimposing the resulting difference images yields, as shown in FIG. 6(A), a single compressed image in which the whole sequence of the basic motion appears as an afterimage. FIG. 1(A) corresponds to (a) of FIG. 6(A), FIG. 1(B) to (b), FIG. 2(A) to (c), FIG. 2(B) to (d), FIG. 3(A) to (e), and FIG. 3(B) to (f).
The background remaining in images (a) to (f) of FIG. 6(A) is then deleted to obtain the extracted images shown in FIG. 6(B), and the normalization described above is applied to obtain the images shown in FIG. 7.
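The difference-based compression can be sketched with NumPy as follows. The sketch assumes grayscale frames of equal size, treats small pixel differences as unchanged background, and superimposes the difference images by taking the per-pixel maximum. The text does not specify how the superimposition is computed, so the maximum, the function name `compress_motion`, and the parameter `change_threshold` are assumptions made for illustration.

```python
import numpy as np

def compress_motion(frames, change_threshold=10):
    """Collapse a sequence of grayscale frames into one afterimage.

    frames: array of shape (n_frames, height, width) with n_frames >= 2
        and pixel values in 0..255.
    change_threshold: hypothetical cutoff below which a pixel difference
        is treated as unchanged background.
    """
    frames = np.asarray(frames, dtype=np.int32)
    # Absolute difference between each pair of consecutive frames.
    diffs = np.abs(frames[1:] - frames[:-1])
    # Pixels that barely change belong to the background: zero them out.
    diffs[diffs < change_threshold] = 0
    # Superimpose the difference images, keeping the strongest change per pixel.
    return diffs.max(axis=0).astype(np.uint8)

# Toy sequence: a bright dot moving one pixel to the right in each frame
# over a uniform background.
demo_frames = np.full((3, 4, 4), 50)
for i in range(3):
    demo_frames[i, 1, i] = 200
afterimage = compress_motion(demo_frames)
```

In the toy sequence the static background cancels out and only the path of the moving dot remains in `afterimage`.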

In this way, each compressed image can be given the information needed to reproduce the corresponding three-dimensional motion.
From each frame image data A, eigenvalues and eigenvectors are calculated by the Karhunen-Loève transform to create eigenspace data A, and the eigenspace data A of the plural basic motions is stored in the storage means to form a database.
Next, this databased eigenspace data A is decomposed into a tree structure (partitioned according to information held by the basic motions of the moving object, for example, their image features) and stored in the storage means as a database. The eigenspace data A is thereby classified, for example, into groups of similar images (different basic motions may fall into the same bin).
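A minimal sketch of building the eigenspace is given below. It assumes that each compressed image is flattened into one row vector and that the Karhunen-Loève transform is computed as an eigen-decomposition of the sample covariance matrix, obtained here through a singular value decomposition. The function names `build_eigenspace` and `project` and the parameter `n_components` are hypothetical.

```python
import numpy as np

def build_eigenspace(images, n_components):
    """Compute the mean image, leading eigenvalues, and eigenvectors.

    images: (n_samples, n_pixels) array with n_samples >= 2, one flattened
        compressed image per row.
    Returns (mean, eigenvalues, basis); the rows of basis are orthonormal
    eigenvectors of the sample covariance matrix, largest eigenvalue first.
    """
    data = np.asarray(images, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    # The SVD of the centered data yields the covariance eigenvectors (rows
    # of vt) without forming the n_pixels x n_pixels covariance matrix.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = singular_values**2 / (len(data) - 1)
    return mean, eigenvalues[:n_components], vt[:n_components]

def project(image, mean, basis):
    """Map one flattened image to its point in the eigenspace."""
    return basis @ (np.asarray(image, dtype=float) - mean)
```

Each basic motion then becomes a single point, `project(image, mean, basis)`, and these points are what the database stores.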

Meanwhile, for the human motion that serves as the source data for obtaining a three-dimensional motion, the target person is filmed with a single video camera and the motion images are loaded into the computer to obtain frame image data B. The frame image data B, compressed into a single image in the same manner as the frame image data A, is then projected onto the eigenspace created from the eigenvalues and eigenvectors described above to obtain eigenspace data B.
The eigenspace data B is then compared with the eigenspace data A of each basic motion, the eigenspace data A at the shortest distance from the eigenspace data B is selected, and it is identified as the three-dimensional motion of the target person.
The motion can thereby be reproduced in a three-dimensional medium such as an animation, an avatar, or a robot.
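The projection of image B and the selection of the closest eigenspace data A can be sketched as follows. The point illustrated is that image B is projected with the mean and eigenvectors derived from data A, so that both sets of points lie in the same space. The distance is assumed to be Euclidean, the search is exhaustive rather than tree-based, and the function name `identify_motion`, the labels, and the four-pixel toy images are hypothetical.

```python
import numpy as np

def identify_motion(image_b, mean, basis, points_a, labels):
    """Project compressed image B and return (label, distance) of the closest point A.

    mean, basis: eigenspace built from the basic-motion images (data A).
    points_a: (n_motions, n_components) eigenspace coordinates of data A.
    labels: names of the basic motions, in the same order as points_a.
    """
    point_b = basis @ (np.asarray(image_b, dtype=float) - mean)
    distances = np.linalg.norm(points_a - point_b, axis=1)
    best = int(np.argmin(distances))
    return labels[best], float(distances[best])

# Toy data: three 4-pixel "compressed images" standing in for basic motions.
images_a = np.array([[9.0, 0.0, 0.0, 1.0],
                     [0.0, 9.0, 1.0, 0.0],
                     [0.0, 1.0, 9.0, 0.0]])
labels = ["lift", "walk", "fall"]
mean = images_a.mean(axis=0)
_, _, vt = np.linalg.svd(images_a - mean, full_matrices=False)
basis = vt[:2]                          # two leading eigenvectors
points_a = (images_a - mean) @ basis.T  # eigenspace data A

# An unseen image that resembles the "walk" image.
label, distance = identify_motion([0.0, 8.0, 2.0, 0.0], mean, basis, points_a, labels)
```

The returned distance can additionally be compared with a threshold to reject motions that match none of the stored basic motions.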

The motion capture method of the present invention makes it possible, and easy, to build three-dimensional models of, for example, unexpected, sudden events that conventional motion capture methods could not handle (events in which people move naturally without markers attached to their bodies, as when a street dance suddenly starts in a plaza). This is because the motion of the target moving object only needs to be filmed on site by the image input means.
In addition, unlike conventional methods, neither prior camera calibration nor markers are required, and only the motion of the target moving object needs to be filmed. The number of motions that can be modeled in three dimensions therefore increases, and a motion capture method that is simple to operate can be provided. This in turn makes a low-cost, easy-to-use motion capture system possible, so motion capture technology can be expected to spread to other fields.

The present invention has been described above with reference to embodiments, but it is not limited to the configurations described in those embodiments and also includes other embodiments and modifications conceivable within the scope of the matters described in the claims. For example, a motion capture method of the present invention configured by combining some or all of the embodiments and modifications described above also falls within the scope of rights of the present invention.
Furthermore, although the above embodiments describe the case where the moving object of the present invention is a person, the invention is not limited to this; the moving object may be, for example, an animal other than a human, a vehicle such as a car, or a robot.

FIG. 1: (A) is an explanatory diagram of consecutive frame images of the basic motion of a person lifting a heavy load, and (B) is an explanatory diagram of consecutive frame images of the basic motion of a person picking up an object.
FIG. 2: (A) is an explanatory diagram of consecutive frame images of the basic motion of a person crouching down while holding the stomach because of abdominal pain, and (B) is an explanatory diagram of consecutive frame images of the basic motion of a person covering the head with both hands to avoid an object falling from overhead.
FIG. 3: (A) is an explanatory diagram of consecutive frame images of the basic motion of a person walking, and (B) is an explanatory diagram of consecutive frame images of the basic motion of a person falling down.
FIG. 4: (A) is an explanatory diagram of the images for each basic motion shown in FIGS. 1 to 3, and (B) is an explanatory diagram of the images after the background of (A) has been deleted.
FIG. 5: An explanatory diagram of the images obtained by normalizing the images of FIG. 4(B).
FIG. 6: (A) is an explanatory diagram of the compressed images obtained by overlaying the consecutive images of each basic motion shown in FIGS. 1 to 3 and deleting the unchanged portions, and (B) is an explanatory diagram of the compressed images after the background of (A) has been deleted.
FIG. 7: An explanatory diagram of the images obtained by normalizing the images of FIG. 6(B).

Claims

1. A motion capture method for reproducing a three-dimensional motion of a target moving object from a two-dimensional motion image of the target moving object acquired by an image input means, the method comprising:
an eigenspace data creation step in which an eigenspace data creation means creates, for each basic motion of a moving object, eigenspace data A in which each frame image data A of that basic motion is represented as a point, and stores it in a storage means to form a database;
a tree structure creation step in which a tree structure creation means decomposes the eigenspace data A, databased by the eigenspace data creation means, into a tree structure group according to the information held by the basic motions of the moving object, and stores it in the storage means in structured form; and
a discrimination step in which a discrimination means compares eigenspace data B, in which frame image data B of the motion of the target moving object to be discriminated is represented as a point, with the eigenspace data A for each basic motion of the moving object structured by the tree structure creation means, selects the eigenspace data A at the shortest distance from the eigenspace data B, and identifies it as the three-dimensional motion of the target moving object,
wherein, in the tree structure group, the root and the nodes are composed of keys, which are values representing boundaries that can be compared by magnitude, the eigenspace data A is stored only in the leaves, and the comparison between the eigenspace data B and the eigenspace data A in the discrimination step is performed using the keys.

2. The motion capture method according to claim 1, wherein the frame image data A for each basic motion is obtained by overlaying two frame images that are consecutive or separated by an interval, deleting the unchanged portion to produce a difference image, and superimposing the resulting difference images.

3. The motion capture method according to claim 1, wherein the moving object in the eigenspace data creation step is a pseudo-human model or a person; the image input means is a virtual camera group when the moving object is a pseudo-human model and a camera group when the moving object is a person; and the basic motion of the moving object is photographed from multiple directions using the image input means to obtain a plurality of frame image data A for each basic motion.

4. The motion capture method according to claim 1, wherein the eigenspace data A is created by performing a differentiation process on the frame image data A.

5. The motion capture method according to claim 1, wherein the eigenspace data A is obtained by projecting the frame image data A onto an eigenspace created from eigenvalues and eigenvectors obtained by applying the Karhunen-Loève transform to the frame image data A, and the eigenspace data B is obtained by projecting the frame image data B onto the eigenspace.