JP3400961B2

JP3400961B2 - Apparatus for estimating posture of human image and recording medium storing program for estimating posture of human image

Info

Publication number: JP3400961B2
Application number: JP27161799A
Authority: JP
Inventors: 和彦高橋; 淳大谷
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 1999-09-27
Filing date: 1999-09-27
Publication date: 2003-04-28
Anticipated expiration: 2019-09-27
Also published as: JP2001092978A

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a posture estimation device that estimates the posture of a human image captured by a camera, and to a recording medium storing a posture estimation program that estimates the posture of a human image captured by a camera.

【０００２】[0002]

[Prior Art] An example of this kind of prior art is disclosed in Japanese Patent Application Laid-Open No. 10-258044 [A61B5/10], laid open on September 29, 1998. This prior art estimates the posture of a person by extracting the person's silhouette from a thermal image captured by an infrared camera and analyzing the contour of the extracted silhouette image.

【０００３】[0003]

[Problems to be Solved by the Invention] However, this prior art requires a plurality of conditions to be set in advance (for example: the right foot is below the center of gravity and on the right side of the body; the hands are above the center of gravity and below the head, at a certain distance from the principal axis), which makes the processing for posture estimation complicated. In addition, with this prior art, feature points could not be detected when the limbs crossed each other or when a limb completely overlapped the torso.

[0004] Therefore, a main object of the present invention is to provide a posture estimation device capable of estimating the posture of a person with simple processing.

[0005] Another object of the present invention is to provide a recording medium storing a posture estimation program capable of estimating the posture of a person with simple processing.

[0006] A further object of the present invention is to provide a posture estimation device capable of accurately detecting feature points even when the person's posture changes. A still further object of the present invention is to provide a recording medium storing a posture estimation program capable of accurately detecting feature points even when the person's posture changes.

【０００７】[0007]

[Means for Solving the Problems] A first invention is a posture estimation device for estimating the posture of a human image captured by a camera, characterized by comprising: contour detection means for detecting the contour of the human image; distance detection means for detecting the distance from each position on the contour to each of a plurality of reference positions on the human image; and feature point detection means for detecting feature points of the human image based on the distances.

【０００８】[0008]

[0009] A second invention is a recording medium storing a posture estimation program for estimating the posture of a human image captured by a camera, characterized in that the posture estimation program includes: a contour detection step of detecting the contour of the human image; a distance detection step of detecting the distance from each position on the contour to each of a plurality of reference positions on the human image; and a feature point detection step of detecting feature points of the human image based on the distances.

【００１０】[0010]

【００１１】[0011]

[Operation] According to the first invention, the contour of the human image is detected by the contour detection means, and the distance from each position on the contour to each of a plurality of reference positions on the human image is detected by the distance detection means. The feature point detection means detects the feature points of the human image based on the distances detected in this way.

[0012] The plurality of reference positions preferably include the top-of-head position and the center-of-gravity position of the human image.

[0013] In one aspect of the present invention, the feature points are detected as follows. First, computing means individually squares the plurality of distances detected at a given position on the contour, and adding means then adds together the squared values obtained by the computing means. Maximum position detection means then detects the positions on the contour at which the sum produced by the adding means takes a local maximum. The detected positions are the feature points.

[0014] In another aspect of the present invention, the camera captures, at predetermined intervals, a human image whose posture changes over time, and the feature point detection means detects the feature points of each captured human image. Here, the initial posture of the human image is preferably one in which, as seen from the camera, the limbs do not overlap the torso and do not cross each other.

[0015] In one embodiment of the present invention, filter means applies Kalman filter processing to the feature points of the current human image detected by the feature point detection means. Feature point determination means then determines the feature points of the current human image based on the processing result of the filter means and a first autoregressive model equation.

[0016] Further, when feature point prediction means predicts the feature points of the next human image, to be captured next, based on the processing result of the filter means and a second autoregressive model equation, partial image region setting means sets partial image regions containing the predicted feature points on the next human image. The distance detection means then detects the distance from each position on the contour contained in the partial image regions to each of the plurality of reference positions.

【００１７】[0017]

【００１８】[0018]

【００１９】[0019]

[0020] According to the second invention, a posture estimation program for estimating the posture of a human image captured by a camera is recorded on the recording medium. In this posture estimation program, the contour of the human image is detected in the contour detection step, and the distance from each position on the contour to each of a plurality of reference positions on the human image is detected in the distance detection step. In the feature point detection step, the feature points of the human image are detected based on these distances.

【００２１】[0021]

【００２２】[0022]

[Effects of the Invention] According to the first and second inventions, the distance from each position on the contour to each of a plurality of reference positions on the human image is detected, and the feature points of the human image are detected based on the detected distances. The posture of a person can therefore be estimated with simple processing.

【００２３】[0023]

[0024] The above and other objects, features, and advantages of the present invention will become more apparent from the following detailed description of an embodiment given with reference to the drawings.

【００２５】[0025]

[Embodiment] Referring to FIG. 1, the posture estimation device 10 of this embodiment includes a color camera (hereinafter simply "camera") 12. The camera 12 captures a full-body image of a person (subject) whose posture changes over time, for example every 1/30 second, and inputs image data of the captured full-body image and the background to an image processing device 14. A posture estimation program recorded on an optical disc 16 is installed in the image processing device 14 in advance. The image processing device 14 estimates the posture of the subject based on the image data input from the camera 12 and the installed posture estimation program, and controls the posture of an avatar. That is, the posture of the avatar is made to follow changes in the posture of the subject. Here, the "avatar" is a two-dimensional CG model reproduced in a virtual environment. The image data of the avatar taking the desired posture is supplied to a model synthesis device 18, which combines the supplied avatar image data with background image data. The composite image data generated in this way is displayed on screen by an image display device 20.

[0026] The posture estimation program described above is configured as shown in FIG. 2 and FIG. 3. A CPU 14a provided in the image processing device 14 processes the image data input from the camera 12 in accordance with these flowcharts.

[0027] The CPU 14a first sets the count value K of a frame counter 14b to "0" in step S1 and initializes the search window (search area) in step S3. The initialized search window is the same size as the screen, so the entire captured image (human image and background) falls within it. The CPU 14a then captures an image from the camera 12 in step S4 and, in step S5, extracts the human image from the input image by background subtraction. Background subtraction is a technique that extracts only the human image by subtracting the background image from an image containing the human image. Note that, at least at the start of capture, the subject must face the camera 12 in a posture in which the limbs do not overlap the torso and do not cross each other.

[0028] In step S7, the CPU 14a binarizes the input image, setting the human image to "1" and the background to "0". This yields a silhouette image of the person. The CPU 14a then detects the center of gravity of this silhouette image in step S9 and detects the principal axis of the upper body of the same silhouette image in step S11. That is, when the silhouette image takes the posture shown in FIG. 4, the center of gravity G shown in that figure and the principal axis of inertia extending upward from the center of gravity G are detected.
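Steps S5 to S11 can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration: the function names, the grayscale difference threshold of 30, and the use of second-order central moments of the pixels at or above G to obtain the axis orientation. The source does not state how the principal axis of inertia is computed, and the sketch assumes a non-empty silhouette.

```python
import math

def silhouette(frame, background, threshold=30):
    """Background subtraction plus binarization (steps S5 and S7).
    frame, background: equal-sized 2D lists of grayscale values.
    threshold is an assumed value; the source does not give one."""
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def centroid(mask):
    """Center of gravity G of the foreground pixels (step S9), as (x, y)."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v]
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def upper_body_axis(mask, g):
    """Orientation in radians of the principal axis of the pixels at or
    above G (step S11), from second-order central moments.
    Image rows grow downward, so 'above' means y <= g[1]."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v and y <= g[1]]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    mu20 = sum((x - mx) ** 2 for x, _ in pts)
    mu02 = sum((y - my) ** 2 for _, y in pts)
    mu11 = sum((x - mx) * (y - my) for x, y in pts)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)
```

For an upright subject the returned angle is close to plus or minus pi/2, that is, a vertical axis through the upper body.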

[0029] In step S13, a contour line is detected from the part of the silhouette image that lies inside the search window. Specifically, a raster scan is performed from the center of gravity G, the first boundary pixel found is taken as the starting point, and the boundary is traced counterclockwise. Extracting a binary image that is "1" on the boundary and "0" elsewhere yields the contour line of the silhouette image. In the following step S15, the intersection of the principal axis of inertia and the contour line is detected and taken as the head-top point. That is, the intersection H shown in FIG. 4 is the head-top point.
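The head-top detection of step S15 could be sketched as a walk along the principal axis. This is an illustrative assumption rather than the source's procedure: the function name `head_top`, the half-pixel step, and the choice to march from G until the silhouette ends are all hypothetical, and the sketch assumes that G lies inside the silhouette and that the axis is not horizontal. Contour tracing itself (step S13) is not shown.

```python
import math

def head_top(mask, g, theta, step=0.5):
    """Walk from the center of gravity g along the axis direction theta
    toward the top of the image and return the last foreground pixel
    (x, y) reached, i.e. where the axis meets the contour.
    mask: 2D list of 0/1 values; image rows grow downward."""
    dx, dy = math.cos(theta), math.sin(theta)
    if dy > 0:                      # flip so the walk goes upward
        dx, dy = -dx, -dy
    height, width = len(mask), len(mask[0])
    x, y = g
    last = (round(x), round(y))
    while True:
        x += dx * step
        y += dy * step
        xi, yi = round(x), round(y)
        outside = not (0 <= xi < width and 0 <= yi < height)
        if outside or not mask[yi][xi]:
            return last             # previous pixel was on the boundary
        last = (xi, yi)
```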

[0030] In this way, the contour line of the human image and a plurality of reference positions on the human image (the center of gravity and the head-top point) are detected.

[0031] The CPU 14a then proceeds to step S16 and sets, as the measurement position, the position reached by moving a predetermined amount along the contour line from the head-top point. The direction of movement is counterclockwise. In step S17, the vectors from the current measurement position to the head-top point and to the center of gravity are obtained, and the norms of these two vectors are detected. That is, the distances from the current measurement position to the head-top point and to the center of gravity are obtained. In the following step S19, the sum of the squares of the two detected distances is calculated according to Equation 1.

【００３２】[0032]

[Equation 1]

In step S21, it is determined whether the current measurement position is the head-top point; if NO, the measurement position is updated in step S22. That is, the position reached by moving a predetermined amount counterclockwise along the contour line from the current measurement position becomes the next measurement position. The process then returns to step S17. As a result, the sum of the squared distances to the center of gravity and to the head-top point is obtained for every measurement position on the contour line. When the current measurement position is the position P shown in FIG. 4, step S17 obtains the distance a(s) from the position P to the center of gravity G and the distance b(s) from the position P to the head-top point H, and step S19 obtains the sum of the squares of these distances a(s) and b(s).
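The loop of steps S16 to S22 can be sketched as follows. Equation 1 is not reproduced in this text; from the surrounding description it is taken to be the sum a(s)^2 + b(s)^2, which is an inference. The function name and the representation of the contour as an ordered list of (x, y) points are assumptions made for illustration.

```python
def squared_distance_sums(contour, g, h):
    """For each measurement position on the contour, return the sum of
    the squared distance to the center of gravity g and the squared
    distance to the head-top point h.
    contour: list of (x, y) points ordered counterclockwise, starting
    just after the head-top point."""
    sums = []
    for px, py in contour:
        a2 = (px - g[0]) ** 2 + (py - g[1]) ** 2   # a(s)^2, distance to G
        b2 = (px - h[0]) ** 2 + (py - h[1]) ** 2   # b(s)^2, distance to H
        sums.append(a2 + b2)
    return sums
```

Working with squared norms avoids a square root per point, which fits the stated goal of simple processing.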

[0033] Once the sum of squares has been obtained at every position on the contour line, the CPU 14a detects, in step S23, the positions on the contour at which the sum of squares takes a local maximum. When the subject takes the posture shown in FIG. 4, the sum of squares varies as shown in FIG. 5. In the initial posture, the subject faces the camera 12, and the limbs neither overlap the torso nor cross each other as seen from the camera 12. The direction in which the measurement position moves is also fixed in advance as counterclockwise. Consequently, the local maximum position A shown in FIG. 5 is the right hand tip, the local maximum position B is the right foot tip, the local maximum position C is the left foot tip, and the local maximum position D is the left hand tip. In other words, the processing of step S23 detects the coordinates of four feature points on the contour line of the human image and also identifies which limb each detected feature point corresponds to.
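Step S23 could be sketched as a local-maximum search over the sequence of squared-distance sums, followed by labeling in traversal order. The names below are hypothetical, and keeping only the four largest local maxima is an assumption added here to tolerate small spurious peaks; the source only says that the positions where the sum takes a local maximum are detected.

```python
# Order in which the limb tips are met on a counterclockwise walk that
# starts at the head-top point, per the description of FIG. 5.
LIMB_ORDER = ["right hand", "right foot", "left foot", "left hand"]

def limb_feature_indices(sums):
    """Return {limb name: index into sums} for the four strongest local
    maxima, labeled by their order along the contour."""
    peaks = [i for i in range(1, len(sums) - 1)
             if sums[i - 1] < sums[i] >= sums[i + 1]]
    strongest = sorted(peaks, key=lambda i: sums[i], reverse=True)[:4]
    return dict(zip(LIMB_ORDER, sorted(strongest)))
```

The labeling is only valid under the condition the source states: the subject faces the camera with limbs neither overlapping the torso nor crossing each other.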

[0034] The CPU 14a obtains the sum of squares from the distances between the current measurement position and the two reference positions (the center of gravity and the head-top point). Therefore, as long as the human image is captured in a posture in which the limbs neither overlap the torso nor cross each other, the feature points of the human image are detected accurately. If the subject takes the posture shown in FIG. 6 at the start of capture, the sum of squares takes larger values at the feature points, so the feature points can be detected most reliably.

[0035] The head-top point detected in step S15 is also a feature point located on the contour line, so the coordinates of a total of five feature points have been detected by the processing so far. The detected feature point coordinates are written to the table shown in FIG. 7, which is formed in a memory 14c. The X and Y coordinates of a feature point detected in the K-th frame are defined as X(K) and Y(K). The coordinates of the right hand tip are associated with M = 1, those of the right foot tip with M = 2, those of the left foot tip with M = 3, those of the left hand tip with M = 4, and those of the head top with M = 5. Such a table is created for every frame.

[0036] The entries X_S(K), Y_S(K), X_S(K+1), Y_S(K+1), a_1 to a_p, and b_1 to b_p shown in this table are described later. Although a total of five feature points are detected in this embodiment, the number "5" was chosen for simplicity of processing; to improve the accuracy of posture estimation, the number of feature points to be detected can be increased.

[0037] Subsequently, in step S24, the CPU 14a sets the count value M of a feature point counter 14d to "1". In step S25, the feature point coordinates X(K) and Y(K) corresponding to the current count value M are read from the table shown in FIG. 7, and Kalman filter processing is applied to each coordinate according to Equations 2 and 3. The resulting coefficients a_1, a_2, ..., a_p and b_1, b_2, ..., b_p are written to the position in the table of FIG. 7 that corresponds to the current count value M. Note that X(K-1), X(K-2), ..., X(K-p) and Y(K-1), Y(K-2), ..., Y(K-p) are feature point coordinates detected in the previous frame and earlier; these data are read from other tables created in the same manner as that of FIG. 7.
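Equations 2 and 3 are not reproduced in this text, so the exact filter used in step S25 cannot be shown. As a hedged illustration only, the sketch below uses one common formulation: a Kalman filter whose state is the vector of autoregressive coefficients a_1 to a_p under a random-walk model, with each new coordinate treated as a scalar observation of the past p coordinates. The function name, the noise parameters q and r, and the initial covariance p0 are all assumptions, not values from the source.

```python
def kalman_ar_coefficients(series, p, q=0.0, r=1.0, p0=1e6):
    """Estimate AR(p) coefficients [a_1, ..., a_p] from a coordinate
    series (oldest first) with a scalar-observation Kalman filter.
    q: assumed process noise on the coefficients (0 = constant model).
    r: assumed observation noise variance.
    p0: assumed large initial variance (little prior knowledge)."""
    a = [0.0] * p
    P = [[p0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for k in range(p, len(series)):
        # Regressor: the p previous samples, newest first.
        h = [series[k - i] for i in range(1, p + 1)]
        Ph = [sum(P[i][j] * h[j] for j in range(p)) for i in range(p)]
        s = sum(h[i] * Ph[i] for i in range(p)) + r    # innovation variance
        gain = [v / s for v in Ph]                     # Kalman gain
        err = series[k] - sum(a[i] * h[i] for i in range(p))
        a = [a[i] + gain[i] * err for i in range(p)]
        # Covariance update (P is symmetric), then add process noise.
        P = [[P[i][j] - gain[i] * Ph[j] + (q if i == j else 0.0)
              for j in range(p)] for i in range(p)]
    return a
```

Applied once to the X(K) history and once to the Y(K) history, such a routine would yield one coefficient set per coordinate, corresponding to a_1 to a_p and b_1 to b_p in the text.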

【００３８】[0038]

【数２】 [Equation 2]

【００３９】[0039]

[Equation 3]

In the following step S27, the coefficients a_1, a_2, ..., a_p and b_1, b_2, ..., b_p obtained by Equations 2 and 3 are applied to the autoregressive model equation expressed by Equation 4 to calculate the feature point coordinates X_S(K) and Y_S(K).

【００４０】[0040]

[Equation 4]

Here, X_S(K-1), X_S(K-2), ..., X_S(K-p) and Y_S(K-1), Y_S(K-2), ..., Y_S(K-p) are likewise feature point coordinates calculated in the previous frame and earlier, and are read from the other tables. Note that e(K) denotes an error.

[0041] The feature point coordinates X(K) and Y(K) read from the table are only provisional. Kalman filter processing is therefore applied to the coordinates X(K) and Y(K), and the coordinates X_S(K) and Y_S(K) are calculated based on the resulting coefficients a_1, a_2, ..., a_p and b_1, b_2, ..., b_p and the autoregressive model equation. The coordinates of the feature point are thereby determined. The determined coordinates X_S(K) and Y_S(K) are written to the position in the table of FIG. 7 that corresponds to the current count value M.

[0042] In step S29, it is determined whether the count value M of the feature point counter 14d indicates "5"; if NO, the count value M is incremented in step S31 and the process returns to step S25. In this way, the Kalman filter processing and the calculation using the autoregressive model equation are applied to all five feature points stored in the memory 14c, and the coordinates of all the feature points are determined.

[0043] When the count value M reaches "5", the CPU 14a proceeds from step S29 to step S33 and controls the posture of the avatar based on the five determined feature point coordinates. The image data of the avatar whose posture has been controlled is supplied from the image processing device 14 to the model synthesis device 18. As a result, an avatar taking the same posture as the subject is displayed on the image display device 20. That is, an avatar taking the desired posture is reproduced in the virtual environment.

[0044] Subsequently, in step S35, the CPU 14a again sets the count value M of the feature point counter 14d to "1". Then, in step S37, it evaluates the autoregressive model equation expressed by Equation 5.

【００４５】[0045]

[Equation 5]

Here, the calculation uses the coefficients a_1, a_2, ..., a_p and b_1, b_2, ..., b_p calculated in the immediately preceding step S25, the feature point coordinates X_S(K) and Y_S(K) calculated in the immediately preceding step S27, and the feature point coordinates X_S(K-1), ..., X_S(K+1-p) and Y_S(K-1), ..., Y_S(K+1-p) calculated in step S27 for the previous frame and earlier. The feature point coordinates X_S(K+1) and Y_S(K+1) of the human image in the next frame are thereby predicted. In other words, the posture the subject will take in the next frame is predicted.
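The prediction of step S37 can be sketched as a one-step autoregressive forecast. Equation 5 is not reproduced in this text; from the inputs listed above it is assumed to be the weighted sum of the p most recent determined coordinates, X_S(K+1) = a_1 X_S(K) + ... + a_p X_S(K+1-p). The function name and argument layout are hypothetical.

```python
def ar_predict(coeffs, history):
    """Predict the next coordinate from an AR(p) model.
    coeffs: [a_1, ..., a_p].
    history: determined coordinates, oldest first, with at least p
    entries; the last entry is X_S(K)."""
    p = len(coeffs)
    recent = history[-p:][::-1]   # X_S(K), X_S(K-1), ..., X_S(K+1-p)
    return sum(a * x for a, x in zip(coeffs, recent))
```

For example, with coefficients [2.0, -1.0] (a constant-velocity model) and a history of 1.0, 2.0, 3.0, the forecast continues the linear trend. The Y coordinate would be handled the same way with b_1 to b_p.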

[0046] In step S39, it is determined whether the count value M has reached "5"; if NO, the count value M is incremented in step S41 and the process returns to step S37. The coordinates of all the feature points in the next frame are thereby predicted. The predicted feature point coordinates X_S(K+1) and Y_S(K+1) are also stored in the table shown in FIG. 7.

[0047] When M = 5, the CPU 14a proceeds to step S43 and sets, on the screen, search windows centered on the feature point coordinates X_S(K+1) and Y_S(K+1) predicted by the processing of step S37. If the subject is predicted to take the posture shown in FIG. 6 in the next frame, the search windows are set at the positions indicated by dotted lines in that figure. The number of search windows set is "5", the same as the number of feature points, and every search window is rectangular. Each search window is also much smaller than the screen. That is, initially there is one search window of the same size as the screen, but when the search windows are updated in step S43, their number increases to five and their size becomes much smaller than the screen. Once these search windows have been set, the CPU 14a increments the frame counter 14b in step S45. The same processing is then executed on the input image of the next frame from step S4 onward.
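The search-window update of step S43 could be sketched as follows. The source says only that the windows are rectangular, centered on the predicted feature points, and much smaller than the screen; the half-size parameter, the clipping to the screen, and the function names are assumptions made for illustration.

```python
def search_window(center, half_size, width, height):
    """Return a rectangle (left, top, right, bottom), inclusive, centered
    on the predicted feature point and clipped to the screen."""
    cx, cy = int(center[0]), int(center[1])
    return (max(0, cx - half_size), max(0, cy - half_size),
            min(width - 1, cx + half_size), min(height - 1, cy + half_size))

def points_in_windows(contour, windows):
    """Keep only the contour points that fall inside at least one window."""
    return [(x, y) for x, y in contour
            if any(l <= x <= r and t <= y <= b for l, t, r, b in windows)]
```

Restricting later frames to the contour points inside these windows is what removes the need to scan the whole contour line from the second frame onward.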

[0048] Because the number and size of the search windows have been updated, the next execution of step S13 extracts only the contour around the feature points, and the coordinates of the feature points (the head-top point and the local maximum points) are detected from these partial contour lines. Kalman filter processing is then applied to the detected feature point coordinates; based on the resulting coefficients, the feature point coordinates of the next frame are determined and those of the frame after it are predicted. The search windows are then updated based on the predicted feature point coordinates.

[0049] According to this embodiment, to detect the feature points of the human image (the hand tips and foot tips), the distances from each position on the contour line of the human image to the center of gravity and to the head-top point are detected, and the sum of the squares of the two distances detected at each position is obtained. The positions at which this sum of squares takes a local maximum are taken as the feature points. Therefore, as long as the human image takes a posture in which, as seen from the camera, the limbs do not overlap the torso and do not cross each other, each feature point can be detected easily, and the body part to which each detected feature point corresponds can be identified easily. In other words, the posture of the human image can be estimated easily.

[0050] Further, in this embodiment, Kalman filter processing is applied to the detected feature points (including the head-top point), and the feature points of the human image are determined based on the resulting coefficients and the autoregressive model equation. The feature points can therefore be identified accurately.

[0051] Furthermore, in this embodiment, the feature points of the human image to be captured in the next frame are predicted based on the coefficients obtained by the Kalman filter processing and another autoregressive model equation, and search windows containing the predicted feature points are set on the human image of the next frame. The distances from each position on the contour within the set search windows to the center of gravity and to the head-top point are then obtained, and the feature points are detected from these distances in the same manner as described above. From the second frame onward, therefore, there is no need to scan the entire contour line to detect the feature points.

[Brief description of drawings]

FIG. 1 is a block diagram showing one embodiment of the present invention.

FIG. 2 is a flowchart showing part of the operation of the embodiment of FIG. 1.

FIG. 3 is a flowchart showing another part of the operation of the embodiment of FIG. 1.

FIG. 4 is an illustrative view showing a silhouette image of a subject.

FIG. 5 is a graph showing the relationship between each measurement position on the contour line and the sum of squared distances.

FIG. 6 is an illustrative view showing a human image and the search windows set on it.

FIG. 7 is an illustrative view showing the table associated with the K-th frame.

[Explanation of symbols]

10 ... Posture estimation device
12 ... Color camera
14 ... Image processing device
16 ... Optical disc
18 ... Model synthesis device
20 ... Image display device

Continuation of the front page
(56) References: Shoichiro Iwasawa and 4 others, Construction of "Shall We Dance?", IEICE Technical Report (PRMU98-112 to 122), November 12, 1998, Vol. 98, No. 394, pp. 15-22
(58) Fields surveyed (Int.Cl.⁷, DB name): G06T 7/00 - 7/60, A61B 5/107, G06T 1/00, JICST file (JOIS)

Claims

(57) [Claims]

1. A posture estimating apparatus for estimating the posture of a human image captured by a camera, comprising: contour detecting means for detecting a contour of the human image; distance detecting means for detecting the distance from a position on the contour to each of a plurality of reference positions on the human image; and feature point detecting means for detecting a feature point of the human image based on the distance.

2. The posture estimating apparatus according to claim 1, wherein the plurality of reference positions include the head-top position of the human image and the center-of-gravity position of the human image.

3. The posture estimating apparatus according to claim 1 or 2, wherein the feature point detecting means includes: computing means for individually squaring the plurality of distances detected at a given position on the contour; adding means for adding together the plurality of squared values obtained by the computing means; and maximum position detecting means for detecting the position on the contour at which the sum obtained by the adding means is maximum.

4. The posture estimating apparatus according to any one of claims 1 to 3, wherein the camera photographs, at predetermined intervals, a human image whose posture changes with time, and the feature point detecting means detects a feature point of each photographed human image.

5. The posture estimating apparatus according to claim 4, wherein the initial posture of the human image is a posture in which, as viewed from the camera, the limbs do not overlap the body and the limbs do not cross one another.

6. The posture estimating apparatus according to claim 4, further comprising: filter means for performing Kalman filter processing on the feature points of the current human image detected by the feature point detecting means; and feature point determining means for determining the feature points of the current human image based on the processing result of the filter means and a first autoregressive model formula.

7. The posture estimating apparatus according to claim 6, further comprising: feature point predicting means for predicting the feature points of the next human image, to be captured next, based on the processing result of the filter means and a second autoregressive model formula; and partial image area setting means for setting, on the next human image, a partial image area containing the feature points predicted by the feature point predicting means, wherein the distance detecting means detects the distance from each position on the contour included in the partial image area to each of the plurality of reference positions.

8. A recording medium on which is recorded a posture estimation program for estimating the posture of a human image captured by a camera, wherein the posture estimation program includes: a contour detecting step of detecting a contour of the human image; a distance detecting step of detecting the distance from a position on the contour to each of a plurality of reference positions on the human image; and a feature point detecting step of detecting a feature point of the human image based on the distance.