JPH09198504A

JPH09198504A - Attitude detector

Info

Publication number: JPH09198504A
Application number: JP8005016A
Authority: JP
Inventors: Atsushi Otani; 淳大谷; Fumio Kishino; 文郎岸野
Original assignee: ATR TSUSHIN SYST KENKYUSHO KK
Current assignee: ATR TSUSHIN SYST KENKYUSHO KK
Priority date: 1996-01-16
Filing date: 1996-01-16
Publication date: 1997-07-31
Anticipated expiration: 2016-01-16
Also published as: JP2892610B2

Abstract

PROBLEM TO BE SOLVED: To improve the estimation accuracy of attitude parameters in an attitude detector that uses a genetic algorithm to generate genetic information specifying the attitude of a three-dimensional model and deforms the model accordingly. SOLUTION: The device is provided with a silhouette genetic algorithm execution part (S-GA) 10, which estimates the attitude of a person 1 from the person's silhouette, and a position genetic algorithm execution part (P-GA) 12, which re-estimates only the attitude of the torso of the person 1 based on the estimation result of the S-GA 10. A box genetic algorithm execution part (B-GA) 14 is further provided, which re-estimates the attitude of each body part of the person 1 based on the estimation result of the S-GA 10. Thus, starting from the attitude of a three-dimensional object estimated with a genetic algorithm, the attitude of only the main part of the object is re-estimated with a genetic algorithm, which improves the estimation accuracy of the attitude.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a posture detection device, and more particularly to a posture detection device that images a three-dimensional object with a plurality of imaging means, each arranged in a predetermined geometrical positional relationship with respect to the object, and detects the posture of the object based on the resulting plurality of images.

[0002]

[Prior Art] The human figure is a typical example of a flexibly moving object whose three-dimensional shape changes greatly with the movement of its joints. For this reason, the human figure is regarded as an important target in the fields of image processing and computer vision. In recent years, establishing techniques that detect a person's movement and posture automatically and without contact has become increasingly important for realizing various image communication systems and surveillance systems. The main bones forming the human body are connected by joints, and if the rotation angles of those joints can be detected, such applications are considered feasible.

[0003] Conventional methods for detecting a person's posture include contact-type methods in which markers or sensors are attached to the body. Although these are suited to real-time measurement, their range of application is limited. Among approaches based on image processing, there are methods that analyze a monocular image sequence, but they require various models and constraints, and it is not easy to detect postures with arbitrary combinations of joint angles. Methods that use low-level information such as image edges to obtain a parametric description have also been studied. If the fitting is accurate, posture estimation can be robust, but when the description is inaccurate because of image noise or the like, there is a risk of fatal errors.

[0004] To solve these problems, Japanese Patent Laid-Open No. 7-302341 discloses a posture detection device that estimates a person's posture parameters from multiple images based on a genetic algorithm. That device uses a genetic algorithm (GA) [D. E. Goldberg, "Genetic Algorithms in Search, Optimization, and Machine Learning", Addison-Wesley, 1989], which works without contact, requires no constraints, and is stable against noise in image processing. Specifically, posture parameters are assigned to the genes of the individuals in a population, and a three-dimensional person model created in advance is moved and deformed according to the genetic information. The model is observed with virtual cameras to generate synthesized person images, and the degree of silhouette overlap (the fitness) between these and the real person images from real cameras is evaluated. After the genetic operations have been repeated on the population for a certain number of generations, the genetic information of the individual with the best fitness is taken as the posture estimate.

[0005]

[Problems to Be Solved by the Invention] With the conventional posture detection device described above, the posture parameters can be brought reasonably close to their true values, but a saturation phenomenon is observed: once a certain value is reached, the estimate approaches no further. As shown in FIG. 11 of the above publication, the maximum value for FIG. 10(a) of that publication rises within about 200 generations, and the fitness is almost saturated by 500 generations. Likewise, as shown in FIG. 12 of that publication, the maximum value for FIG. 10(b) rises within about 200 generations, and the fitness is almost saturated by 500 generations. The saturated fitness is 86% in FIG. 11 and about 84% in FIG. 12. Continuing beyond 500 generations might raise the fitness further, but judging from the rate of increase between generations 200 and 500, the process would have to be repeated for a great many more generations. Repeating the process that far merely to raise the fitness is not efficient.

[0006] The present invention has been made to solve the above problem, and its object is to improve the estimation accuracy of posture parameters in a posture detection device that uses a genetic algorithm.

[0007]

[Means for Solving the Problems] According to the present invention, a posture detection device that images a three-dimensional object with a plurality of imaging means, each arranged in a predetermined geometrical positional relationship with respect to the object, and detects the posture of the object based on the resulting plurality of images includes first and second genetic algorithm execution means. The first genetic algorithm execution means includes: a first virtual three-dimensional model provided in correspondence with the three-dimensional object; a plurality of first virtual imaging means, each arranged with respect to the first virtual three-dimensional model in the same geometrical positional relationship as the above; first comparison means for comparing the plurality of virtual images obtained by the first virtual imaging means with the plurality of images to obtain a first fitness; first genetic information generation means for generating, starting from predetermined initial genetic information and in accordance with a genetic algorithm driven by the first fitness, first genetic information capable of specifying the posture of the first virtual three-dimensional model; and first deformation means for deforming the posture of the first virtual three-dimensional model in accordance with the first genetic information. The second genetic algorithm execution means includes: a second virtual three-dimensional model provided in correspondence with the main part of the three-dimensional object; a plurality of second virtual imaging means, each arranged with respect to the second virtual three-dimensional model in the same geometrical positional relationship as the above; second comparison means for comparing the plurality of virtual images obtained by the second virtual imaging means with the plurality of images to obtain a second fitness; second genetic information generation means for generating, starting from the first genetic information supplied by the first genetic algorithm execution means and in accordance with a genetic algorithm driven by the second fitness, second genetic information capable of specifying the posture of the second virtual three-dimensional model; and second deformation means for deforming the posture of the second virtual three-dimensional model in accordance with the second genetic information.

[0008] Preferably, the posture detection device further includes third genetic algorithm execution means. The third genetic algorithm execution means includes: a third virtual three-dimensional model provided in correspondence with the details of the three-dimensional object; a plurality of third virtual imaging means, each arranged with respect to the third virtual three-dimensional model in the same geometrical positional relationship as the above; third comparison means for comparing the plurality of virtual images obtained by the third virtual imaging means with the plurality of images to obtain a third fitness; third genetic information generation means for generating, starting from the second genetic information supplied by the second genetic algorithm execution means and in accordance with a genetic algorithm driven by the third fitness, third genetic information capable of specifying the posture of the third virtual three-dimensional model; and third deformation means for deforming the posture of the third virtual three-dimensional model in accordance with the third genetic information.

[0009] More preferably, the third virtual three-dimensional model has the same colors as the three-dimensional object, and the third comparison means obtains the third fitness by additionally comparing the colors of the plurality of virtual images with the colors of the plurality of images.

[0010]

[Embodiments of the Invention] An example of an embodiment of the present invention is described in detail below with reference to the drawings.

(1) Overall Configuration

FIG. 1 is a block diagram showing the overall configuration of a posture detection device according to an embodiment of the present invention. Referring to FIG. 1, the posture detection device detects the posture of a person 1 based on a target person multi-image 3, which is obtained when real multi-cameras R_1 to R_N, arranged at predetermined geometrical positions with respect to the person 1, image the person 1. The posture detection device includes a silhouette genetic algorithm execution unit (S-GA) 10 that estimates the posture of the person 1 from the person's silhouette, a position genetic algorithm execution unit (P-GA) 12 that re-estimates only the posture of the torso of the person 1 based on the estimation result of the S-GA 10, and a box genetic algorithm execution unit (B-GA) 14 that re-estimates the posture of each body part of the person 1 based on the estimation result of the S-GA 10.

[0011] The S-GA 10 corresponds to the conventional posture detection device described above. Of its estimation results, the S-GA 10 passes those concerning the torso of the person 1 to the P-GA 12 and those concerning the upper arms, forearms, hands, and head to the B-GA 14. The P-GA 12 uses the torso estimate supplied by the S-GA 10 as an initial value and re-estimates the posture of the torso of the person 1 in more detail. The B-GA 14 uses the estimates for the upper arms, forearms, hands, and head supplied by the S-GA 10 as initial values and re-estimates the posture of each body part in more detail.

[0012] (2) Posture Parameters to Be Estimated

A person's posture arises from the rotation of the joints that connect the main bones of the body. A person can also move through three-dimensional space, mainly by using the legs. This embodiment deals with the upper body. FIG. 2 shows the definition of the posture parameters for the upper body. In this embodiment the wrist joints are taken into account, but the finger joints are not. As shown in FIG. 2, the parameters concerning the movement of the person are the three-dimensional coordinates (X, Y, Z) of a reference point C1 on the person's chest in a reference coordinate system in three-dimensional space, and the rotations α, β, γ about the axes of that reference coordinate system, so that a total of six parameters must be estimated. In addition, as shown in FIG. 2, the upper body has a total of seven main joints C2 to C8. Each of the joints C2 to C8 has one or three degrees of freedom, giving the joint angle parameters θ1 to θ17 shown in FIG. 2.

[0013] For the head, there are rotations about the three coordinate axes whose origin is the head joint (center point) C8, giving three parameters: θ1 (looking up and down), θ2 (turning the head left and right), and θ3 (tilting the head). For the shoulder, taking the right arm as an example, there is a rotation θ4 about the bone connecting the shoulder and elbow, and two further rotations θ5 and θ6. Also for the right arm, the elbow has a rotation θ7 corresponding to its single degree of freedom, and the wrist has a rotation θ8 about the bone connecting the elbow and wrist and two further rotations θ9 and θ10. The right arm thus has the parameters θ4 to θ10, and the left arm likewise has the parameters θ11 to θ17. The posture detection device shown in FIG. 1 therefore estimates these 23 posture parameters: X, Y, Z, α, β, γ, and θ1 to θ17.
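The 23-parameter layout can be written down as a small lookup table, which is convenient when different stages work on different subsets of the parameters. The following is only a sketch: the identifier names are hypothetical, and the split of the left-arm parameters θ11 to θ17 into shoulder, elbow, and wrist is an assumption that mirrors the right arm, since the text states only that the left arm has θ11 to θ17.

```python
# Hypothetical names; the patent fixes only the symbols X, Y, Z, alpha, beta,
# gamma and theta1..theta17.
TORSO_PARAMS = ["X", "Y", "Z", "alpha", "beta", "gamma"]

JOINT_PARAMS = {
    "head": ["theta1", "theta2", "theta3"],
    "right_shoulder": ["theta4", "theta5", "theta6"],
    "right_elbow": ["theta7"],
    "right_wrist": ["theta8", "theta9", "theta10"],
    # Assumption: the left arm mirrors the right arm's 3 + 1 + 3 split.
    "left_shoulder": ["theta11", "theta12", "theta13"],
    "left_elbow": ["theta14"],
    "left_wrist": ["theta15", "theta16", "theta17"],
}


def all_parameter_names():
    """Return the 23 posture parameter names, torso first, then joints."""
    names = list(TORSO_PARAMS)
    for joint_names in JOINT_PARAMS.values():
        names.extend(joint_names)
    return names
```

Each joint entry has one or three angles, matching the one or three degrees of freedom stated for joints C2 to C8.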

[0014] (3) S-GA

As described above, the posture detection device must estimate 23 posture parameters for the upper body of a person. In general, unless some constraint is introduced, this leads to a combinatorial explosion, and estimating the correct parameters is extremely difficult. The genetic algorithm described above is a powerful technique for solving such combinatorial optimization problems. The S-GA 10 in FIG. 1 estimates the posture parameters of the upper body of the person 1 in accordance with such a genetic algorithm.

[0015] FIG. 3 is a block diagram showing the specific configuration of the S-GA 10 in FIG. 1. Referring to FIG. 3, the S-GA 10 includes a person model (virtual three-dimensional model) 7S corresponding to the person 1, virtual multi-cameras V_1S to V_NS that image the person model 7S, a comparison unit 11S that compares the synthesized person multi-image 9S obtained by imaging the person model 7S with the virtual multi-cameras V_1S to V_NS against the target person multi-image 3, a genetic information generation unit 13S that generates chromosomes 15S carrying genetic information in accordance with the fitness obtained by the comparison unit 11S, and a deformation unit 17S that deforms the posture of the person model 7S in accordance with the chromosomes 15S. The person is imaged by N (≥ 2) real multi-cameras R_1 to R_N because a person inherently has a three-dimensional structure and moves in three dimensions, and also in order to keep the probability of occlusion as low as possible. Furthermore, as described later, one feature of the present invention is that processing such as reconstructing the three-dimensional structure by stereo matching is unnecessary.

[0016] The person model 7S is created in advance for the person 1 whose posture parameters are to be obtained. Each joint of the person model 7S can rotate in the same way as that of an actual person. The genetic information generation unit 13S includes a natural selection gene operation unit 19S, which performs the genetic operation of natural selection in accordance with the fitness obtained by the comparison unit 11S, and a gene pool 21S, in which genetic operations such as mutation are performed. A chromosome 15S generated by the gene pool 21S has genes X_1 to X_n, and these genes X_1 to X_n correspond to the posture parameters.

[0017] Next, the operation of the S-GA 10 is described. The real multi-cameras R_1 to R_N image the person 1 in order to obtain three-dimensional information about the person 1. The resulting target person multi-image 3 is supplied to the comparison unit 11S. Meanwhile, so that posture parameters such as joint angles can be detected, the person model 7S, provided virtually for the person 1, is created in advance. The genes X_1 to X_n of a chromosome 15S represent the joint angles of the person 1. Initially, predetermined posture parameters are given to the chromosomes 15S as initial genetic information. The deformation unit 17S deforms the person model 7S by rotating its joints in accordance with the genes X_1 to X_n of such a chromosome 15S. The virtual multi-cameras V_1S to V_NS then image the deformed person model 7S, and the resulting synthesized person multi-image 9S is supplied to the comparison unit 11S. Like the real multi-cameras R_1 to R_N described above, the virtual multi-camera consists of N cameras V_1S to V_NS, and camera parameters such as mutual positions and angles of view are matched. That is, the geometrical arrangement of the virtual multi-cameras V_1S to V_NS with respect to the person model 7S is the same as that of the real multi-cameras R_1 to R_N with respect to the person 1. This arrangement allows the three-dimensional structure and three-dimensional motion of the person to be extracted, and keeps the probability of occlusion as low as possible.

[0018] The comparison unit 11S compares the target person multi-image 3 from the real multi-cameras R_1 to R_N with the synthesized person multi-image 9S from the virtual multi-cameras V_1S to V_NS and, regarding the target person multi-image 3 as the environment, computes the fitness to that environment. This fitness evaluates the area ratio of the overlap between the real person image and the virtual person image. FIG. 4 is a diagram for explaining the fitness. For each of the target person multi-image 3 and the synthesized person multi-image 9S, the difference from a previously acquired background image containing neither the person 1 nor the person model 7S is computed, and the image is binarized into a region that may correspond to the person 1 or the person model 7S and a background candidate region. Then, for each pair of corresponding real camera R_i and virtual camera V_iS (i = 1, ..., N), the fitness SF_i is computed according to the following expression (1).

[0019]

$$SF_i = \frac{A}{A + B + C} \qquad (1)$$

Here, A is the area of the portion where the person candidate region 23 from the real camera R_i and the synthesized person region 25 from the virtual camera V_iS overlap, B is the area of the portion where only the synthesized person region 25 from the virtual camera V_iS exists, and C is the area of the portion where only the person candidate region 23 from the real camera R_i exists. SF_i therefore takes a value between 0 and 1, and equals 1 when the real person image and the virtual person image overlap completely. The fitness SF of the multi-camera as a whole is the mean of the fitness values SF_i given by expression (1), that is, it is obtained by the following expression (2).

[0020]

[Equation 1]

$$SF = \frac{1}{N} \sum_{i=1}^{N} SF_i \qquad (2)$$
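The silhouette fitness of expressions (1) and (2) can be sketched in a few lines. This is a minimal illustration rather than the patented implementation: images are assumed to be 2-D lists of grayscale values, the difference threshold of 30 is an arbitrary placeholder, and returning 0.0 when both masks are empty is an assumption, since the text does not cover that case.

```python
def binarize_by_background_difference(image, background, threshold=30):
    """Mark pixels that differ from the background by more than `threshold`.

    `threshold` is a hypothetical value; the patent does not specify one.
    """
    return [
        [abs(pixel - bg_pixel) > threshold for pixel, bg_pixel in zip(row, bg_row)]
        for row, bg_row in zip(image, background)
    ]


def silhouette_fitness_single(real_mask, synth_mask):
    """SF_i = A / (A + B + C) for one real/virtual camera pair, expression (1)."""
    a = b = c = 0
    for real_row, synth_row in zip(real_mask, synth_mask):
        for real, synth in zip(real_row, synth_row):
            if real and synth:
                a += 1  # A: both regions overlap
            elif synth:
                b += 1  # B: synthesized person region only
            elif real:
                c += 1  # C: real person candidate region only
    total = a + b + c
    # Assumption: two empty silhouettes score 0.0 instead of dividing by zero.
    return a / total if total else 0.0


def silhouette_fitness(real_masks, synth_masks):
    """SF = mean of SF_i over the N camera pairs, expression (2)."""
    scores = [
        silhouette_fitness_single(real, synth)
        for real, synth in zip(real_masks, synth_masks)
    ]
    return sum(scores) / len(scores)
```

Because only region areas enter the score, a few misclassified pixels change SF_i only slightly, which is consistent with the noise robustness the text attributes to this fitness.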

[0021] The fitness calculation by expressions (1) and (2) is performed for all P individuals in the population. Each of the P individuals has genes representing the 23 parameters concerning the upper-body joints and the position of the person described above. The genetic information generation unit 13S thus uses a genetic algorithm to search for parameters that would in general suffer a combinatorial explosion unless constraints were introduced. The genetic algorithm is a powerful means of solving combinatorial optimization problems. First, the natural selection gene operation unit 19S sets selection probabilities in proportion to the fitness values obtained by the comparison unit 11S and, by random draws, selects parents two individuals at a time to leave offspring, thereby performing the genetic operation of natural selection. Parents with high fitness are therefore more likely to be selected.
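Selection with probability proportional to fitness, as described in this paragraph, is commonly implemented as roulette-wheel sampling. The sketch below uses the standard library's `random.choices` for the weighted draw. The function name is hypothetical, and two details are assumptions because the text does not settle them: parents are drawn with replacement (so an individual may be paired with itself), and at least one fitness value is assumed to be positive.

```python
import random


def select_parent_pairs(population, fitnesses, num_pairs, rng=random):
    """Draw `num_pairs` pairs of parents with probability proportional to fitness.

    Assumptions of this sketch: sampling is with replacement, and the
    fitness values are non-negative with at least one positive entry.
    """
    pairs = []
    for _ in range(num_pairs):
        first, second = rng.choices(population, weights=fitnesses, k=2)
        pairs.append((first, second))
    return pairs
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing fitness curves across generations.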

[0022] Then, in the gene pool 21S, two children are born from the parents. When the parents mate in this way, genetic operations are applied such that crossover and mutation occur with probabilities p_c and p_m, respectively. Here, each of the genes X_1 to X_n is represented as a bit string. In this way, the P individuals (chromosomes) 15S of the next generation are produced in the gene pool 21S. The cycle of posture deformation of the person model 7S by the deformation unit 17S, fitness calculation by the comparison unit 11S, and generation of individuals by the genetic information generation unit 13S is repeated. Among the population obtained after a certain number of generations, the genetic information of the individual giving the maximum fitness SF is taken as the estimate of the posture parameters of the person 1.
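The crossover and mutation step on bit-string genes might look as follows. The text gives only the probabilities p_c and p_m, so several choices here are assumptions: one-point crossover, independent per-bit mutation, chromosomes stored as lists of 0/1 integers, and a linear mapping from a bit string to a parameter range in `decode_gene`. All function names are hypothetical.

```python
import random


def one_point_crossover(parent_a, parent_b, p_c, rng=random):
    """With probability p_c, swap the tails of two equal-length bit lists.

    One-point crossover is an assumption; the patent does not name the type.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2 or rng.random() >= p_c:
        return list(parent_a), list(parent_b)
    cut = rng.randrange(1, len(parent_a))
    child_a = list(parent_a[:cut]) + list(parent_b[cut:])
    child_b = list(parent_b[:cut]) + list(parent_a[cut:])
    return child_a, child_b


def mutate(chromosome, p_m, rng=random):
    """Flip each bit independently with probability p_m (an assumed scheme)."""
    return [bit ^ 1 if rng.random() < p_m else bit for bit in chromosome]


def decode_gene(bits, low, high):
    """Map a non-empty bit list linearly onto [low, high] (assumed encoding)."""
    value = int("".join(str(bit) for bit in bits), 2)
    return low + (high - low) * value / (2 ** len(bits) - 1)
```

For example, `decode_gene([1, 1, 1, 1], -90.0, 90.0)` yields the upper bound of the range and an all-zero gene yields the lower bound; the ranges themselves would have to come from the joint limits of the person model.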

[0023] Thus, as image processing, the difference between the background and the person image is computed and binarized, and the fitness SF is computed from area information, so robustness to image noise is high.

(4) P-GA

To improve the accuracy of the estimation result of the S-GA 10, it is effective to refit the synthesized person multi-image to the target person multi-image for each body part, based on that estimation result. The P-GA 12 re-estimates the posture parameters of the torso, which is the largest of the body parts.

[0024] FIG. 5 is a block diagram showing the specific configuration of the P-GA 12 in FIG. 1. The P-GA 12 improves the estimation accuracy of the six posture parameters (X, Y, Z, α, β, γ) concerning movement that were obtained by the S-GA 10. As shown in FIG. 2, these parameters belong to the torso of the person 1. The torso is large compared with the other body parts, so silhouette overlap is considered effective for it. The P-GA 12 therefore uses a torso person model 7P in place of the person model 7S of the S-GA 10. Apart from this torso person model 7P, the P-GA 12, like the S-GA 10, includes virtual multi-cameras V_1P to V_NP, a comparison unit 11P that compares the synthesized person multi-image 9P with the target person multi-image 3, a genetic information generation unit 13P that generates chromosomes 15P, and a deformation unit 17P. The genetic information generation unit 13P includes a natural selection gene operation unit 19P and a gene pool 21P.

[0025] At the end of the calculation for the final generation of the S-GA 10, the top J (< P) of the P individuals, ranked by the fitness SF of expression (2), are extracted and given to the P-GA 12 as initial values (each initial value is shared by P/J individuals). The genes of each individual in the P-GA 12 are the six posture parameters described above. The genetic algorithm in the P-GA 12 is almost the same as that in the S-GA 10, except that the fitness PF is defined by expression (3).
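The seeding step described in this paragraph, in which the top J individuals of the S-GA 10 each initialize P/J individuals of the P-GA 12, can be sketched as below. The function name is hypothetical, and two points are assumptions: each S-GA individual is taken to be a sequence whose first six entries are (X, Y, Z, α, β, γ), and P is required to be divisible by J so that every seed is copied the same number of times.

```python
def seed_pga_population(population, fitnesses, j, num_torso_params=6):
    """Build the initial P-GA population from the top-J S-GA individuals.

    Assumptions of this sketch: the first `num_torso_params` entries of each
    individual are the torso parameters, and P is divisible by J.
    """
    p = len(population)
    if not 0 < j < p or p % j != 0:
        raise ValueError("this sketch assumes 0 < J < P and P divisible by J")
    # Rank individual indices by fitness SF, best first.
    ranked = sorted(range(p), key=lambda i: fitnesses[i], reverse=True)
    seeds = [list(population[i][:num_torso_params]) for i in ranked[:j]]
    copies = p // j
    # Each seed initializes P/J individuals, so the population size stays P.
    return [list(seed) for seed in seeds for _ in range(copies)]
```

Starting every copy from an already good torso estimate lets the second genetic algorithm spend its generations on refinement within a six-parameter space instead of the full 23.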

[0026]

[Equation 2]

【００２７】[0027] Here, A and B in equation (3) are defined in the same way as in equation (1). Equation (3) does not include C, for two reasons: the position of the torso in the image of the actual person has not yet been identified at the P-GA 12 stage, and, because the torso is the largest body part, any mismatch between the torso of the person model 7P and the image of the actual person is reflected in B. In this way, the P-GA 12 re-estimates in detail the posture parameters of the torso, which is the main part of the person 1, so the accuracy of the posture estimation for the person 1 improves.

(5) B-GA

FIG. 6 is a block diagram showing a specific configuration of the B-GA 14 in FIG. 1. After the six posture parameters of the torso have been re-estimated by the P-GA 12, the B-GA 14 re-estimates the posture parameters of the body parts to improve the estimation accuracy. The B-GA 14 uses a person model 7B on which rectangular-parallelepiped boxes 29 are set. Apart from this person model 7B, the B-GA 14 has the same configuration as the S-GA 10: it includes virtual multi-cameras V_1B to V_NB, a comparison unit 11B that compares the synthesized person multi-image 9B with the target person multi-image 3, a gene information generation unit 13B that generates chromosomes 15B, and a deformation unit 17B. The gene information generation unit 13B includes a natural selection and gene operation unit 19B and a gene pool 21B.
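
The hand-off from the S-GA 10 to the P-GA 12 described above, in which each of the top J individuals seeds P/J identical individuals, can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function name `seed_population`, the list-of-lists representation, and the idea that the six torso parameters occupy the first six gene positions are hypothetical and do not come from the specification.

```python
def seed_population(individuals, fitness, top_j, torso_gene_count=6):
    """Build a new population of the same size from the top_j fittest individuals.

    individuals: list of gene lists (one list per individual).
    fitness: list of fitness values SF, aligned with `individuals`.
    torso_gene_count: assumed number of leading genes that describe the torso.
    """
    population_size = len(individuals)
    if not 0 < top_j <= population_size or population_size % top_j != 0:
        raise ValueError("top_j must be positive and divide the population size")

    # Rank indices from highest to lowest fitness.
    ranked = sorted(range(population_size), key=lambda idx: fitness[idx], reverse=True)

    copies_per_seed = population_size // top_j  # P / J copies of each seed
    seeded = []
    for idx in ranked[:top_j]:
        torso_genes = list(individuals[idx][:torso_gene_count])
        seeded.extend(list(torso_genes) for _ in range(copies_per_seed))
    return seeded
```

With P = 1000 and J = 100, as in the experiment reported later in the description, each seed would be copied ten times.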

【００２８】[0028] The B-GA 14 fits each body part of the person model 7B to the person 1 in order, starting from the parts closest to the torso, using local information about each body part. Color information is used as this local information, because the silhouette overlap used in the S-GA 10 and the P-GA 12 cannot by itself handle occlusion between body parts in an image (for example, an arm in front of the torso). To use local information about the body parts, a rectangular-parallelepiped box 29 is set on each body part of the person model 7B, as shown in FIG. 7(A). As shown in FIG. 7(B), this box 29 is observed as a polygonal window 31 in the target person multi-image 3. The posture parameters of each body part are re-estimated by the genetic algorithm using the interior of this window 31.
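
How a box 29 could appear as a polygonal window 31 under parallel projection can be illustrated with the sketch below. The specification does not say how the window is computed; here it is assumed, for illustration only, that the window is the convex hull of the eight projected box corners, computed with Andrew's monotone chain algorithm. The names `box_corners`, `convex_hull`, and `window_polygon`, and the convention of dropping one coordinate axis per view, are hypothetical.

```python
from itertools import product


def box_corners(center, half_sizes):
    """Return the eight corners of an axis-aligned box."""
    cx, cy, cz = center
    hx, hy, hz = half_sizes
    return [(cx + sx * hx, cy + sy * hy, cz + sz * hz)
            for sx, sy, sz in product((-1, 1), repeat=3)]


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def window_polygon(corners, drop_axis):
    """Parallel-project corners by dropping one axis, then take the convex hull."""
    keep = [axis for axis in range(3) if axis != drop_axis]
    return convex_hull([(c[keep[0]], c[keep[1]]) for c in corners])
```

For an axis-aligned box the window is a rectangle; once the body part is rotated, the projected outline generally has up to six vertices, which is consistent with the description of the window as a polygon.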

【００２９】[0029] As described above, the posture parameter values estimated by the S-GA 10 are updated by the B-GA 14, starting from the body parts closest to the torso. That is, the posture parameters are updated in the order upper arm → forearm → hand → head. For every body part other than the head, the genetic-algorithm estimation is carried out for both the left and the right part. The processing for a body part k in the B-GA 14 is as follows. Each individual (chromosome 15B) has genes corresponding to 23 parameters. The six torso parameters obtained by the P-GA 12 are given to all individuals and are treated as fixed values that the B-GA 14 does not process. The genes processed in the B-GA 14 are those of body part k: as shown in FIG. 2, θ1 to θ3 for the head, θ4 to θ6 for the upper arm, θ7 for the forearm, and θ8 to θ10 for the hand (likewise for the opposite side). The initial values of each individual are the estimates for body part k taken from the top J (< P) individuals of the S-GA result. If there is a body part k′ closer to the torso than body part k (for example, the upper arm when k is the forearm), the best value from the B-GA 14 result for body part k′ is used as a fixed value.
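
The 23-parameter chromosome and the order in which the B-GA 14 refines the body parts can be sketched as follows. The gene counts per part (six for the torso, three for the head, three for each upper arm, one for each forearm, three for each hand) and the update order come from the text above; the position of each part within the chromosome and all identifiers (`build_layout`, `refinement_schedule`, and the part names) are assumptions made for illustration.

```python
TORSO_GENES = 6
PART_SIZES = {"head": 3, "upper_arm": 3, "forearm": 1, "hand": 3}
UPDATE_ORDER = ["upper_arm", "forearm", "hand", "head"]  # nearest to the torso first


def build_layout():
    """Map each body part to the (assumed) gene indices it owns."""
    layout = {"torso": list(range(TORSO_GENES))}
    cursor = TORSO_GENES
    layout["head"] = list(range(cursor, cursor + PART_SIZES["head"]))
    cursor += PART_SIZES["head"]
    for side in ("left", "right"):
        for part in ("upper_arm", "forearm", "hand"):
            size = PART_SIZES[part]
            layout[f"{side}_{part}"] = list(range(cursor, cursor + size))
            cursor += size
    return layout


def refinement_schedule():
    """List the body parts in the order they are refined, left and right for limbs."""
    steps = []
    for part in UPDATE_ORDER:
        if part == "head":
            steps.append("head")
        else:
            steps.extend([f"left_{part}", f"right_{part}"])
    return steps
```

While one part is being refined, only its own gene indices would be varied; the torso genes and the genes of parts already refined stay fixed.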

【００３０】[0030] The fitness bf_i for the window 31 in synthesized person multi-image i (i = 1, …, N) of body part k is given by the following equation (5).

bf_i = cf_i × SF_i   … (5)

Here, SF_i is the degree of silhouette overlap given by equation (1), and cf_i is the degree of overlap between the color information of the synthesized person multi-image 9B and that of the target person multi-image 3, expressed by the following equation (6).

cf_i = D / (D + E)   … (6)

Here, D is the area of the portion of the window 31 where the colors are the same, and E is the area of the portion where the colors differ. Whether the colors of pixels in the window 31 are the same is determined by whether the following expression (7) is satisfied.

【００３１】[0031]

|Rm − Rr| + |Gm − Gr| + |Bm − Br| < Th   … (7)

Here, R, G, and B denote the three primary color components of a pixel; the subscript m denotes the synthesized person multi-image and r the target person multi-image. Th in expression (7) is a threshold. The term SF_i is multiplied by cf_i in equation (5) to prevent the physically impossible case in which one body part penetrates another: when such penetration occurs, SF_i becomes small and, as a result, bf_i becomes small.
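
Equations (5) to (7) can be illustrated with the short sketch below. It assumes that the pixels inside the window 31 are supplied as two aligned lists of RGB tuples and that area is measured by counting pixels; the function names and the threshold used in any call are hypothetical, since the specification gives no value for Th.

```python
def colors_match(model_rgb, target_rgb, threshold):
    """Expression (7): the summed absolute RGB difference is below the threshold."""
    return sum(abs(m - r) for m, r in zip(model_rgb, target_rgb)) < threshold


def color_overlap(model_pixels, target_pixels, threshold):
    """Equation (6): cf = D / (D + E), counted over the pixels inside the window."""
    if len(model_pixels) != len(target_pixels):
        raise ValueError("pixel lists must be aligned")
    same = sum(1 for m, r in zip(model_pixels, target_pixels)
               if colors_match(m, r, threshold))
    different = len(model_pixels) - same
    total = same + different
    return same / total if total else 0.0


def window_fitness(cf, sf):
    """Equation (5): bf = cf * SF, so a poor silhouette overlap lowers the score."""
    return cf * sf
```

Because the two factors are multiplied, a pose whose colors agree well inside the window still scores poorly if its silhouette overlap is low, which is the penalty on interpenetrating body parts described above.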

【００３２】[0032] From the values bf_i obtained in this way for the synthesized person multi-images i, the fitness BF of body part k is obtained by the following equation (8).

【００３３】[0033]

【数３】 (Equation 3)

【００３４】[0034] Equation (8) imposes the rather strict condition of taking the minimum of bf_i so that body part k matches the target person multi-image 3 more closely. As described above, according to this embodiment of the invention, the P-GA 12 re-estimates the torso posture parameters using those estimated by the S-GA 10 as initial values, so the fitness of the person model improves further. In addition, the B-GA 14 re-estimates, body part by body part, the posture parameters other than those of the torso, using the values obtained by the S-GA 10 as initial values, so the fitness of the person model improves still further. Moreover, because the B-GA 14 uses color information, it can also detect postures such as "arms folded in front of the torso".
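
The effect of taking the minimum over the views can be illustrated as follows. The sketch assumes that equation (8) is simply the minimum of bf_i over the N views, which is what the paragraph above describes; the equation itself is not reproduced in this text, so this reading is an assumption, and the function name `part_fitness` and the sample values are hypothetical.

```python
from statistics import mean


def part_fitness(view_fitnesses):
    """Assumed form of equation (8): BF is the worst per-view fitness bf_i."""
    if not view_fitnesses:
        raise ValueError("at least one view is required")
    return min(view_fitnesses)


# A pose that matches two views well but one view badly.
views = [0.9, 0.95, 0.2]
strict_score = part_fitness(views)  # dominated by the badly matching view
lenient_score = mean(views)         # an average would hide the bad view
```

Under the minimum, a body part must agree with every camera view to score well, whereas an average would let two good views mask one bad view.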

【００３５】[0035]

[Examples] Experimental results obtained in accordance with the above embodiment are described below.

(1) Experimental conditions

Here, synthesized person multi-images generated from a person model are used as the target person multi-images. Three cameras are used, and each image is a parallel projection. The cameras are placed on the coordinate axes of a three-dimensional coordinate system whose origin is the center point C1 of the person's chest shown in FIG. 2. Each coordinate axis passes through the center of the corresponding image, taken from the front, from the side, and from above the person. Each image is 256 × 256 pixels, and the image length corresponds to 173 cm.
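
The camera arrangement above implies a scale of 173 cm / 256 pixels, or about 0.68 cm per pixel. The following sketch shows one way such a three-view parallel projection could be written. The axis convention (x to the side, y to the front, z upward), the rounding to the nearest pixel, and the names `project` and `VIEW_AXES` are assumptions for illustration only.

```python
IMAGE_SIZE_PX = 256
IMAGE_LENGTH_CM = 173.0
CM_PER_PIXEL = IMAGE_LENGTH_CM / IMAGE_SIZE_PX  # about 0.676 cm per pixel

# Assumed axes: index 0 = x (side), 1 = y (front), 2 = z (up).
# Each view keeps two axes: (horizontal image axis, vertical image axis).
VIEW_AXES = {"front": (0, 2), "side": (1, 2), "top": (0, 1)}


def project(point_cm, view):
    """Parallel-project a chest-centered 3D point (in cm) to (column, row) pixels."""
    horizontal_axis, vertical_axis = VIEW_AXES[view]
    u = point_cm[horizontal_axis]
    v = point_cm[vertical_axis]
    column = round(IMAGE_SIZE_PX / 2 + u / CM_PER_PIXEL)
    row = round(IMAGE_SIZE_PX / 2 - v / CM_PER_PIXEL)  # image rows grow downward
    return column, row
```

The chest center C1, being the origin, maps to the center pixel of all three images, which matches the statement that each axis passes through the image center.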

【００３６】[0036] The three-dimensional model of the person is built from three-dimensional shape data and color information acquired with a Cyberware color 3D digitizer. The modeling is done separately for each body part: based on the digitizer's measurement of the surface shape of each body part, the surface is approximated by a set of triangular patches to create a wireframe model. Because the digitizer captures the color texture at the same time, the texture is mapped onto the wireframe model. The body parts are connected to one another, and moving the vertices of the triangular patches as appropriate deforms the wireframe and rotates each joint. The color texture information is mapped onto the triangular patch at the corresponding location, and colors are interpolated or thinned out as the patch deforms.

【００３７】[0037] Two person models, a simple model and a precise model, are used to synthesize the target three-direction images. The precise model consists of 6079 polygons and therefore keeps a very smooth shape, as shown in FIG. 8. Its color texture is that of the actual person, used as is. The simple model, in contrast, is derived from the shape of the precise model with the number of polygons reduced to 364; for its texture, the color information of the corresponding region of the precise model is averaged and used as the color of each polygon (see FIG. 7).

【００３８】[0038] In this experiment, the precise model and the simple model are each used to produce target images: multi-images obtained by observing, from three directions, a person taking the three postures shown in FIGS. 8(A), 8(B), and 8(C). In general, a synthesized person multi-image contains fewer noise factors than a target person multi-image, but the image processing in the S-GA 10, the P-GA 12, and the B-GA 14 described above is little affected by noise. There is also the advantage that the posture parameter values of the model can be set freely, which makes the estimation accuracy easy to evaluate. Table 1 lists the parameter values (true values) of the three postures (A), (B), and (C) shown in FIG. 8. These parameter values are common to the target images of the precise and simple models. FIG. 5 shows the postures (A), (B), and (C) of the precise model observed from the front.
【００３９】[0039]

【表１】 [Table 1]

【００４０】[0040] An individual in each of the GAs 10, 12, and 14 is a bit string made up of genes that are 8-bit integers. Crossover exchanges bit strings between two individuals, and mutation inverts bits. Although both the precise and simple models are used for the target images, only the simple model is used in this experiment as the person model needed for processing in the GAs 10, 12, and 14, because the precise model would greatly increase the processing load.

(2) Study using target images from the simple model

First, the target images generated from the simple model described above were studied.
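
The bit-string encoding and the two genetic operators described above can be sketched as follows. The specification states only that crossover exchanges bit strings between two individuals and that mutation inverts bits; the choice of single-point crossover, the most-significant-bit-first encoding, and all function names here are assumptions for illustration.

```python
import random

GENE_BITS = 8


def encode(genes):
    """Concatenate 8-bit integer genes into one list of bits (MSB first)."""
    bits = []
    for gene in genes:
        if not 0 <= gene < 2 ** GENE_BITS:
            raise ValueError("each gene must fit in 8 bits")
        bits.extend((gene >> shift) & 1 for shift in range(GENE_BITS - 1, -1, -1))
    return bits


def decode(bits):
    """Inverse of encode: split the bit list into 8-bit integers."""
    genes = []
    for start in range(0, len(bits), GENE_BITS):
        value = 0
        for bit in bits[start:start + GENE_BITS]:
            value = (value << 1) | bit
        genes.append(value)
    return genes


def crossover(parent_a, parent_b, rng):
    """Single-point crossover (assumed): the two parents swap their tails."""
    point = rng.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def mutate(bits, mutation_prob, rng):
    """Flip each bit independently with probability mutation_prob."""
    return [bit ^ 1 if rng.random() < mutation_prob else bit for bit in bits]
```

With 23 parameters per individual, this encoding gives a chromosome of 23 × 8 = 184 bits.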

【００４１】[0041] Each of the GAs 10, 12, and 14 used 1000 individuals. The S-GA 10 ran for 1000 generations with crossover probability p_c = 0.1 and mutation probability p_m = 10^-4. The P-GA 12 ran for 1000 generations, took the estimates of the top 100 individuals of the S-GA 10 as initial values, and used p_c = 0.1 and p_m = 0.01. Mutation occurs with a higher probability than in the S-GA 10 because the S-GA 10 is considered to have already gathered a considerable number of individuals near the true values, so the higher rate serves as fine adjustment. The B-GA 14 ran 100 generations for each of the seven body parts. The initial values for each body part were taken from the above-mentioned 100 individuals of the S-GA 10 result. As in the P-GA 12, p_c = p_m = 0.1 was used to obtain a fine-adjustment effect.
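
The experimental settings above can be collected in a small configuration sketch. The numeric values are those stated in the paragraph; the `GAConfig` class, the constant names, and the helper function are hypothetical conveniences rather than part of the described device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GAConfig:
    population: int
    generations: int
    crossover_prob: float
    mutation_prob: float


S_GA = GAConfig(population=1000, generations=1000, crossover_prob=0.1, mutation_prob=1e-4)
P_GA = GAConfig(population=1000, generations=1000, crossover_prob=0.1, mutation_prob=0.01)
# For the B-GA, `generations` is the number of generations per body part.
B_GA = GAConfig(population=1000, generations=100, crossover_prob=0.1, mutation_prob=0.1)

SEED_COUNT = 100      # top individuals of the S-GA used as initial values
BODY_PART_COUNT = 7   # head plus left and right upper arm, forearm, and hand


def total_b_ga_generations():
    """Generations run by the B-GA across all body parts."""
    return B_GA.generations * BODY_PART_COUNT
```

Laid out this way, the later stages differ from the S-GA mainly in their higher mutation probability, which the text justifies as fine adjustment around already good estimates.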

【００４２】[0042] Table 2 below shows the fitness values obtained by the processing in each of the GAs 10, 12, and 14.

【００４３】[0043]

【表２】 [Table 2]

【００４４】[0044] As a result of the P-GA 12 applied to the torso, the value of PF improved for all three target images and reached almost 100%. Because the fitness used in the B-GA 14 is local to each body part, Table 2 instead evaluates the finally estimated posture by the degree of silhouette overlap SF. The values are higher in every case than the SF values of the initial S-GA. Incidentally, the B-GA 14 uses color information; when the degree of silhouette overlap is used inside the window 31 instead, as in the S-GA 10 and the P-GA 12, the SF values are (A) 78.2%, (B) 75.7%, and (C) 74.3%, about 20% lower than when color information is used, which shows the effectiveness of the B-GA 14. Table 1 also shows the posture estimates of the B-GA 14 for each parameter, together with those of the S-GA 10. The table shows that the B-GA 14 gives better results than the S-GA 10 for almost all parameters.

【００４５】[0045]

(3) Study using target images from the precise model

When both the target images and the person model were based on the simple model, the P-GA 12 and the B-GA 14 of the above embodiment proved effective. Next, the case was studied in which the simple model is used as the person model and is fitted to target images generated from the precise model. Because the shapes of the precise model and the simple model do not match exactly in each image, the degree of silhouette overlap SF is only about 90% even when the true values are estimated exactly.

【００４６】[0046] For posture (B) in FIG. 8, processing was carried out in the order S-GA 10 → P-GA 12 → B-GA 14, as for the target images from the simple model described above. As a result, SF = 84.2% after the S-GA 10 became SF = 80.6% after the B-GA 14; the estimation accuracy was worse than that of the S-GA 10. One cause, as noted above, is that the shapes of the simple model and the precise model do not match exactly. The color information also differs: the precise model has subtle shades of color, whereas each triangular patch of the simple model has a single uniform color. These mismatches in shape and color are considered to produce cases in which the degree of silhouette overlap is insufficient.

【００４７】[0047] Therefore, to reflect the degree of silhouette overlap more strongly, a second silhouette GA (S-GA2) is run after the S-GA 10 and the P-GA 12. In addition, the B-GA 14 is replaced with a B-GA2 that also evaluates the degree of silhouette overlap, as described below. In the S-GA2, each individual holds, as fixed values, the best estimates of the six torso parameters obtained by the P-GA 12, and the remaining 17 parameters are estimated. The initial values of the 17 parameters of each individual are taken from the top 100 results of the first S-GA 10.

【００４８】[0048] To reflect silhouette information in the local color information, the B-GA2 compares the BF (equation (8)) of each child produced by crossover with that of its parent. If the child is worse than the parent, the child is discarded. If the child has a higher BF than the parent, the degrees of silhouette overlap SF of the child and the parent are compared, and the child is kept as an individual of the next generation only if its SF is better. Table 3 shows the results of the GA processing described above. For target image (A), the B-GA2 gives a worse result than the S-GA 10, but the other two images show improvement.
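
The two-stage acceptance rule of the B-GA2 can be written as a small predicate. This is a sketch of the rule as stated in the text: a child survives only if it beats its parent first on BF and then on SF. How ties are handled is not specified, so rejecting a child on equality is an assumption, as is the function name `keep_child`.

```python
def keep_child(child_bf, parent_bf, child_sf, parent_sf):
    """Return True if the child should enter the next generation.

    Stage 1: the child must have a higher local fitness BF than its parent.
    Stage 2: the child must also have a better silhouette overlap SF.
    Ties are treated as failures (an assumption; the text does not cover them).
    """
    if child_bf <= parent_bf:
        return False
    return child_sf > parent_sf
```

The second test is what stops a locally good color match from being accepted when it makes the overall silhouette worse.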

【００４９】[0049]

【表３】 [Table 3]

【００５０】[0050] As described above, an improvement has been proposed to a posture detection device that estimates the posture parameters of a person from multi-images of the person on the basis of a genetic algorithm. Starting from the estimate produced by the silhouette-overlap method of the conventional posture detection device (S-GA), this device runs genetic algorithms that use local information about the person. First, the S-GA result is given as the initial values of the torso posture parameters, and a silhouette-based genetic algorithm (P-GA) is run to re-estimate the six torso parameters. Next, based on the results of the S-GA and the P-GA, a genetic algorithm that uses color information (B-GA) is run for each body part: upper arm, forearm, hand, and head.

【００５１】[0051] The effectiveness of the invention was examined using two three-dimensional upper-body person models (precise and simple) obtained from an actual person. Synthesized three-direction images generated from these models were used as the target person multi-images, and three postures were examined. The simple model was used as the three-dimensional person model in each GA process. For target three-direction images synthesized from the simple model, the B-GA was confirmed to give better estimation accuracy than the S-GA. For target images from the precise model, on the other hand, the B-GA gave a worse result. The likely cause is the mismatch in shape and color between the simple and precise models. To deal with this mismatch, a silhouette-based GA was run again on the upper body after the S-GA and the P-GA, using their results (S-GA2). Furthermore, a GA that evaluates the silhouette overlap of the whole upper body as well as that of each body part (B-GA2) was introduced in place of the B-GA, and a certain degree of improvement over the initial S-GA was confirmed.

【００５２】[0052]

[Effects of the Invention] As described above, according to the present invention, the posture of only the main part of a three-dimensional object is estimated again with a genetic algorithm, based on the result of estimating the posture of the three-dimensional object with a genetic algorithm, so the posture estimation accuracy improves. Furthermore, the posture of the details of the three-dimensional object is estimated again with a genetic algorithm, based on the result of the genetic-algorithm posture estimation, so the estimation accuracy improves further. Moreover, because color information is taken into account when the posture of the details is re-estimated, the posture of the three-dimensional object can be detected accurately.

[Brief description of drawings]

FIG. 1 is a block diagram showing the overall configuration of a posture detection device according to an embodiment of the present invention.

FIG. 2 is a diagram for explaining the definition of the rotation parameters of the joint angles in the upper half of the human body.

FIG. 3 is a block diagram showing a specific configuration of the S-GA in FIG. 1.

FIG. 4 is a diagram for explaining the fitness obtained by the comparison unit in FIG. 3.

FIG. 5 is a block diagram showing a specific configuration of the P-GA in FIG. 1.

FIG. 6 is a block diagram showing a specific configuration of the B-GA in FIG. 1.

FIG. 7(A) is a diagram for explaining the body-part boxes set on the person model, and FIG. 7(B) is a diagram for explaining the windows set on the person.

FIG. 8(A) to FIG. 8(C) are diagrams showing target images, viewed from the front, that represent the three postures.

[Explanation of symbols]

1: person
7S, 7P, 7B: person model
10: silhouette genetic algorithm execution unit (S-GA)
12: position genetic algorithm execution unit (P-GA)
14: box genetic algorithm execution unit (B-GA)
11S, 11P, 11B: comparison unit
13S, 13P, 13B: gene information generation unit
17S, 17P, 17B: deformation unit
R_1 to R_N: real multi-cameras
V_1S to V_NS, V_1P to V_NP, V_1B to V_NB: virtual multi-cameras

Claims

[Claims]

1. A posture detection device that images a three-dimensional object with a plurality of imaging means, each provided in a predetermined geometrical positional relationship with respect to the three-dimensional object, and detects the posture of the three-dimensional object based on the resulting plurality of images, the device comprising:
first genetic algorithm executing means including
a first virtual three-dimensional model provided corresponding to the three-dimensional object,
a plurality of first virtual imaging means, each provided with respect to the first virtual three-dimensional model in the same geometrical positional relationship as said geometrical positional relationship,
first comparing means for comparing the plurality of virtual images obtained by the plurality of first virtual imaging means with the plurality of images to obtain a first fitness,
first gene information generating means for generating, based on predetermined initial gene information, first gene information capable of specifying the posture of the first virtual three-dimensional model, in accordance with a genetic algorithm according to the first fitness, and
first deforming means for deforming the posture of the first virtual three-dimensional model according to the first gene information; and
second genetic algorithm executing means including
a second virtual three-dimensional model provided corresponding to a main part of the three-dimensional object,
a plurality of second virtual imaging means, each provided with respect to the second virtual three-dimensional model in the same geometrical positional relationship as said geometrical positional relationship,
second comparing means for comparing the plurality of virtual images obtained by the plurality of second virtual imaging means with the plurality of images to obtain a second fitness,
second gene information generating means for generating, based on the first gene information from the first genetic algorithm executing means, second gene information capable of specifying the posture of the second virtual three-dimensional model, in accordance with the genetic algorithm according to the second fitness, and
second deforming means for deforming the posture of the second virtual three-dimensional model according to the second gene information.

2. The posture detection device according to claim 1, further comprising third genetic algorithm executing means including
a third virtual three-dimensional model provided corresponding to details of the three-dimensional object,
a plurality of third virtual imaging means, each provided with respect to the third virtual three-dimensional model in the same geometrical positional relationship as said geometrical positional relationship,
third comparing means for comparing the plurality of virtual images obtained by the plurality of third virtual imaging means with the plurality of images to obtain a third fitness,
third gene information generating means for generating, based on the second gene information from the second genetic algorithm executing means, third gene information capable of specifying the posture of the third virtual three-dimensional model, in accordance with the genetic algorithm according to the third fitness, and
third deforming means for deforming the posture of the third virtual three-dimensional model according to the third gene information.

3. The posture detection device according to claim 2, wherein the third virtual three-dimensional model has the same colors as the three-dimensional object, and the third comparing means further compares the colors of the plurality of virtual images with the colors of the plurality of images to obtain the third fitness.