JP2534617B2

JP2534617B2 - Real-time recognition and synthesis method of human image

Info

Publication number: JP2534617B2
Application number: JP5182744A
Authority: JP
Inventors: 淳大谷; 治雄竹村; 泰一北村; 文郎岸野
Original assignee: 株式会社エイ・ティ・アール通信システム研究所
Priority date: 1993-07-23
Filing date: 1993-07-23
Publication date: 1996-09-18
Anticipated expiration: 2011-09-18
Also published as: JPH0738873A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method for real-time recognition and synthesis of human images. More particularly, it relates to such a method applied to communication in which a model of the person on the transmitting side is constructed by color-texture-mapping a three-dimensional wire-frame model, the person model is placed in a virtual space generated using computer graphics technology, and the person's movements are reproduced and displayed stereoscopically on a stereoscopic display on the receiving side, so that multiple people at different locations can hold a meeting as if they were gathered in one place.

[0002]

[Prior Art] Conventionally, methods for detecting and recognizing a person's facial expressions and movements, and the synthesis (reproduction) of movements in a person model, have in most cases been studied separately.

[0003] For the detection of facial expressions and movements, attempts have been made to recognize expressions and movements by capturing a scene containing a person as an image with a television camera and processing that image. However, all of these methods had problems with processing accuracy, and their processing times were far from real time.

[0004]

[Problems to Be Solved by the Invention] The synthesis of movement in a person model, on the other hand, has mainly been handled in the field of computer graphics. Typically, the person's movement is recorded with a video camera or the like, several key frames are selected from the image sequence, a designer deforms the person model to match them, and the frames between key frames are interpolated to produce smooth motion. With this approach, however, only a limited set of movements prepared in advance can be synthesized. Furthermore, it was not possible to reproduce movement in the person model using information from a person motion detection unit.

[0005] In contrast, attempts have also been made to attach an exoskeleton-type displacement detection device to the face and reproduce expressions in a person image generated by computer graphics, but this requires wearing a large device and has serious usability problems.

[0006] Accordingly, the main object of the present invention is to provide a method for real-time recognition and synthesis of human images that combines a person motion detection unit with a motion synthesis unit for a person model, and that can reproduce the person's movements in the person model in real time.

[0007]

[Means for Solving the Problems] The present invention is a communication method in which a model of the person on the transmitting side is constructed by color-texture-mapping a three-dimensional wire-frame model, the person model is placed in a virtual space generated using computer graphics technology, and the person's movements are reproduced and displayed stereoscopically on a stereoscopic display on the receiving side. Markers are attached to the skin surface over the facial expression muscles of the person on the transmitting side, an image of the person's face is captured, and the markers in the image are tracked in real time. In addition, an image of the corneal reflection produced when light is directed at the person's cornea is captured to detect gaze and blinks, and motion information for the person's body, head, fingers, and so on is detected in real time.

[0008] Then, in the three-dimensional person model, the wire-frame model of the part corresponding to the face is driven based on the marker tracking results; the position of the iris of the eyeball and the opening and closing of the eyelids of the three-dimensional person model are set based on the gaze and blink detection results; and each corresponding part of the three-dimensional person model is driven by the detected motion information.

[0009]

[Operation] In the method according to the present invention, markers are attached over the facial expression muscles of the person's face, the markers and the corneal reflection image are tracked in real time, movements of the body, head, fingers, and so on are also detected in real time, and texture mapping is applied to the three-dimensional person model in real time, making it possible to reproduce movements in the three-dimensional person model in real time.

[0010]

[Embodiment] FIG. 1 is a block diagram showing the overall configuration of one embodiment of the present invention. This embodiment includes a person image three-dimensional model generation unit 20 for generating a three-dimensional model of a person 2 in real space 1, a real-time detection unit 30 that detects the person's movements, and a real-time generation unit 40 that generates a person image from the detected motion information. For each part 21 of the human body, the person image three-dimensional model generation unit 20 acquires a three-dimensional wire-frame model 23 with a shape input device 22 and a color texture 25 with a video input device 24. The wire-frame model 23 approximates the surface of the human body with triangular patches, and the color texture is the color information of the body surface.
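As an illustration only, a body part made of triangular patches with an associated color texture could be held in a structure like the one below. The class name `BodyPart`, its field names, and the use of per-vertex texture coordinates are assumptions made for this sketch; the patent does not specify a data layout.

```python
from dataclasses import dataclass
import math


@dataclass
class BodyPart:
    """One body part: a triangle mesh plus texture coordinates (illustrative layout)."""
    name: str
    vertices: list   # [(x, y, z), ...] three-dimensional points
    triangles: list  # [(i, j, k), ...] vertex indices, one tuple per triangular patch
    uvs: list        # [(u, v), ...] texture coordinate for each vertex

    def validate(self):
        """Check that every patch refers to three distinct, existing vertices."""
        n = len(self.vertices)
        if len(self.uvs) != n:
            raise ValueError("need one texture coordinate per vertex")
        for tri in self.triangles:
            if len(set(tri)) != 3 or not all(0 <= i < n for i in tri):
                raise ValueError(f"bad patch {tri}")
        return True

    def patch_area(self, index):
        """Area of one triangular patch, from the cross product of two edges."""
        a, b, c = (self.vertices[i] for i in self.triangles[index])
        e1 = [b[k] - a[k] for k in range(3)]
        e2 = [c[k] - a[k] for k in range(3)]
        cross = (e1[1] * e2[2] - e1[2] * e2[1],
                 e1[2] * e2[0] - e1[0] * e2[2],
                 e1[0] * e2[1] - e1[1] * e2[0])
        return 0.5 * math.sqrt(sum(v * v for v in cross))


# A unit square split into two patches, standing in for a scanned part.
upper_arm = BodyPart(
    name="21c upper arm",
    vertices=[(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)],
    triangles=[(0, 1, 2), (0, 2, 3)],
    uvs=[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
)
```

Keeping the texture coordinates fixed per vertex means the mesh can later be deformed without touching the texture image itself.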

[0011] During a teleconference, the person's movements are detected by the real-time detection unit 30. The real-time person motion detection unit 30 includes a facial expression detection unit 31, a head rotation and movement detection unit 32, a finger motion detection unit 33, and a body rotation and movement detection unit 34, each of which detects motion information for its part of the body. In the person image generation unit 40, a three-dimensional wire-frame model deformation unit 41 deforms the three-dimensional wire-frame model, based on the motion information for each body part detected by the motion detection unit 30 and on the three-dimensional wire-frame model 23 generated by the person image three-dimensional model generation unit 20, to reproduce the shape changes caused by body movement. A texture mapping unit 42 then applies the color texture 25 to the triangular patches of the wire-frame model deformed by the deformation unit 41. The three-dimensional person model 3 generated in this way is placed in a virtual space 44 generated using computer graphics technology from virtual space three-dimensional data 43. The virtual space 44 is displayed on the stereoscopic display 5 on the receiving side as a virtual space 4 containing the person image 3, so that the conference participant 6 on the receiving side can hold a meeting with the person 2 on the transmitting side as if they were in the same place.

[0012] The configuration of each part of FIG. 1 is described in detail below. FIG. 2 illustrates the operating principle of the person image three-dimensional model generation unit 20 of FIG. 1. The body of the person 2 is divided into parts 21a (head), 21b (upper body), 21c (upper arm), 21d (forearm), and 21e (hand), and shape data and color data are acquired for each. FIG. 2 shows only the upper half of the body, but the lower half is handled in the same way. To acquire the shape data, a laser range scanner 26, for example, is used, such as the Cyberware Color 3D Digitizer. The laser range scanner 26 projects a laser stripe onto the object while rotating around it and observes the deformation of the stripe on the object's surface to obtain a set of three-dimensional points. The color information (color texture) of the object's surface is acquired at the same time. For part 21c, for example, a three-dimensional wire-frame model 23c and its color texture 25c are obtained from the three-dimensional data set produced by the laser range scanner 26.

[0013] However, while the laser range scanner 26 is effective for parts that are close to a solid of revolution, such as the upper arm 21c, it is not suited to parts that are not, such as a hand with its fingers. For those parts, a television camera 27 is used to obtain the color texture 25e, and the three-dimensional point data 23e is acquired manually.

[0014] In this way, the three-dimensional wire-frame model 23 and color texture 25 of each body part are obtained. The parts are connected in a form that allows the movement of the joints between them to be reproduced.

[0015] In the motion detection unit 30, the facial expression detection unit 31 uses image processing as described later. The head and body rotation and movement detection units 32 and 34 use, for example, Polhemus magnetic sensors. Such a sensor can detect in real time a total of six pose parameters: translation along the coordinate axes and rotation about the coordinate axes in three-dimensional space. The finger motion detection unit 33 uses, for example, a DataGlove from VPL, which detects the degree of finger bending in real time using optical fibers.

[0016] FIG. 3 illustrates the principle of the facial expression detection unit 31. As shown in FIG. 3, color markers 51 are attached to the skin surface over the expression muscles of the face 50, the face is captured as an image 55 by a television camera 52, and the color markers 51 are tracked in the image 55. The image 55 is binarized so that only the regions corresponding to the color markers 51 are detected. To stabilize tracking of the color markers 51, the person wears a helmet 53 as shown in FIG. 3. A frame 54 is fixed to the helmet 53, and the television camera 52 is fixed to the frame 54. With this arrangement, the face image 55 is always obtained from a fixed position relative to the face 50, even when the face 50 changes direction. To speed up tracking, a window 56 is set for each color marker 51 in the face image 55, and the centroid of the pixels within the window 56 that are judged to belong to the color marker 51 is taken as the position of that marker.
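The windowed centroid step described above can be sketched as follows, assuming the binarized image is a list of rows of 0/1 values and a window is given as half-open pixel bounds. The function names and the re-centering helper are assumptions for illustration; the patent states only that a window is set per marker and that the centroid of the marker pixels within it is used.

```python
def track_marker(binary, window):
    """Return the centroid (x, y) of the set pixels inside the window, or None.

    binary: list of rows, each a list of 0/1 values (the binarized image).
    window: (x0, y0, x1, y1), half-open pixel bounds of the search window.
    """
    x0, y0, x1, y1 = window
    sum_x = sum_y = count = 0
    for y in range(max(y0, 0), min(y1, len(binary))):
        row = binary[y]
        for x in range(max(x0, 0), min(x1, len(row))):
            if row[x]:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # marker not visible inside this window
    return (sum_x / count, sum_y / count)


def recenter(window, center):
    """Move the window so it is roughly centered on the last marker position.

    Re-centering between frames is an assumption of this sketch.
    """
    x0, y0, x1, y1 = window
    w, h = x1 - x0, y1 - y0
    nx0 = int(center[0]) - w // 2
    ny0 = int(center[1]) - h // 2
    return (nx0, ny0, nx0 + w, ny0 + h)


# 8x8 test image: a 2x2 marker blob plus one stray pixel outside the window.
image = [[0] * 8 for _ in range(8)]
for px, py in [(3, 4), (4, 4), (3, 5), (4, 5)]:
    image[py][px] = 1
image[0][0] = 1  # noise that the window excludes

position = track_marker(image, (2, 3, 6, 7))
```

Restricting the scan to a small window is what keeps the per-frame cost low: only the pixels near each marker's last known position are examined.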

[0017] The facial expression detection unit 31 detects not only deformation of the skin surface but also gaze. A light bulb 57 serving as a light source is placed beside the television camera 52 on the frame 54 shown in FIG. 3, and the corneal reflection image 58 of the bulb produced when it is lit is captured by the television camera 52. The reflection is then tracked in the resulting image 55 in the same way as the color markers 51. Detection accuracy can also be improved by providing a camera 59, separate from the television camera 52, dedicated to detecting the corneal reflection image 58. The direction of gaze is detected in this way, and blinks are detected according to whether or not the corneal reflection image 58 is observed.
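A minimal sketch of blink detection from the presence or absence of the corneal reflection might look like this. The per-frame boolean input and the `min_frames` debounce parameter are assumptions of the sketch; the patent says only that a blink is inferred when the reflection is not observed.

```python
def detect_blinks(reflection_seen, min_frames=1):
    """Return (start, end) frame ranges, end exclusive, judged to be blinks.

    reflection_seen: one boolean per frame, True when the corneal
    reflection was found in that frame.
    min_frames: shortest run of missing frames counted as a blink
    (a hypothetical debounce setting).
    """
    blinks = []
    start = None
    for index, seen in enumerate(reflection_seen):
        if not seen and start is None:
            start = index                      # reflection just disappeared
        elif seen and start is not None:
            if index - start >= min_frames:
                blinks.append((start, index))  # reflection came back
            start = None
    if start is not None and len(reflection_seen) - start >= min_frames:
        blinks.append((start, len(reflection_seen)))  # still closed at the end
    return blinks


frames = [True, True, False, False, True, False, True]
```

With `min_frames=2`, the single missing frame at index 5 is treated as a tracking dropout rather than a blink.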

[0018] In the person image generation unit 40 shown in FIG. 1, the deformation unit 41 deforms the three-dimensional wire-frame model 23 of the person image based on the motion information detected by the detection unit 30. The deformation of the facial part of the wire-frame model is described first. The face 50 has a three-dimensional structure, whereas the movement of the color markers 51 is detected as two-dimensional movement in the image 55 acquired by the television camera 52, so knowledge and constraints are needed to drive the three-dimensional model. Here, the nose marker is taken as a fixed reference point, the distance from it to each marker in the neutral (expressionless) state is measured in advance, and the three-dimensional model 23 is driven, for example as described in the following paragraphs, based on the motion information of the markers whose distance has changed with a change of expression.
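The reference-distance idea in the paragraph above can be sketched as follows: distances from the nose marker are recorded for a neutral face, and a marker is treated as having moved when its current distance differs by more than a tolerance. The marker names and the tolerance `tol` are hypothetical; the patent does not give numeric values.

```python
import math


def distances_from_reference(markers, ref="nose"):
    """Distance in the image from the reference marker to every other marker."""
    rx, ry = markers[ref]
    return {name: math.hypot(x - rx, y - ry)
            for name, (x, y) in markers.items() if name != ref}


def moved_markers(neutral, current, ref="nose", tol=1.0):
    """Markers whose distance to the reference changed by more than tol pixels.

    neutral: distances recorded in advance for the expressionless face.
    current: marker positions in the current frame, name -> (x, y).
    Returns name -> signed change in distance.
    """
    now = distances_from_reference(current, ref)
    return {name: now[name] - neutral[name]
            for name in neutral
            if abs(now[name] - neutral[name]) > tol}


neutral_face = {"nose": (0, 0), "below_lower_lip": (0, 10), "mouth_corner": (5, 5)}
open_mouth = {"nose": (0, 0), "below_lower_lip": (0, 14), "mouth_corner": (5, 5)}

baseline = distances_from_reference(neutral_face)
changes = moved_markers(baseline, open_mouth)
```

Because the camera is fixed relative to the head, a change in these image distances can be attributed to the expression rather than to head motion.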

[0019] FIG. 4 shows the wire-frame driving method when the movement of the marker below the lower lip is small, FIG. 5 shows the method when that movement is large, FIG. 6 shows the method when the markers at both ends of the lips move outward, and FIG. 7 shows the method when they move inward.

[0020] First, the lower jaw is driven by the movement of the lower-lip marker. When the movement of this marker is small, it is judged that the mouth is slightly open, as shown in FIG. 4. In this case, following the observation that the lower jaw moves on a cylindrical surface whose central axis is the straight line through the connection points between the lower and upper jaws on both sides of the face, the part corresponding to the lower-jaw region, including the lips, is driven. When the marker moves beyond a certain threshold, it is judged that the mouth is wide open, as shown in FIG. 5, and the lower-jaw region is moved vertically downward.
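One way to sketch the two jaw-driving cases is shown below. The coordinate convention (x across the face along the hinge axis, y up, z forward), the conversion from marker displacement to a rotation angle, and all parameter names are assumptions of this sketch, not details given in the patent.

```python
import math


def drive_jaw(vertices, hinge_yz, marker_drop, marker_radius, threshold):
    """Move lower-jaw vertices according to how far the lower-lip marker dropped.

    vertices: [(x, y, z), ...] of the lower-jaw region.
    hinge_yz: (y, z) of the jaw hinge axis, which runs along x.
    marker_drop: downward displacement of the lower-lip marker.
    marker_radius: distance of that marker from the hinge axis.
    threshold: displacement above which the mouth counts as wide open.
    """
    if marker_drop <= threshold:
        # Small opening: rotate about the hinge axis, so every vertex stays
        # on a cylinder around that axis. Arc length / radius gives the angle.
        angle = marker_drop / marker_radius
        hy, hz = hinge_yz
        c, s = math.cos(angle), math.sin(angle)
        moved = []
        for x, y, z in vertices:
            dy, dz = y - hy, z - hz
            moved.append((x, hy + dy * c - dz * s, hz + dy * s + dz * c))
        return moved
    # Large opening: translate the whole region straight down.
    return [(x, y - marker_drop, z) for x, y, z in vertices]


chin = [(0.0, -5.0, 8.0)]
radius = math.hypot(-5.0, 8.0)
slightly_open = drive_jaw(chin, (0.0, 0.0), 0.5, radius, threshold=1.0)
wide_open = drive_jaw(chin, (0.0, 0.0), 3.0, radius, threshold=1.0)
```

In the small-opening branch the distance of each vertex from the hinge axis is unchanged, which is what moving on a cylindrical surface means here.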

[0021] The shape of the lips is determined from the movement of the markers at both ends of the lips (the mouth-corner points). When the mouth-corner points move outward relative to the lips, it is judged that the lips have been pulled back; when they move inward, it is judged that the mouth has been pushed forward. When the mouth-corner points move outward, as shown in FIG. 6, the model region around the lips is assumed to move on the plane determined by the mouth-corner points and the ears, and the three-dimensional coordinates of each vertex are determined accordingly. When the mouth-corner points move inward, as shown in FIG. 7, the coordinates of the vertices around the mouth are determined from previously obtained data so that the lips protrude.

[0022] The markers on the nasolabial grooves and the upper lip are used to adjust the movements described above. A wide range of vertices, including those around the nose, is driven to change the shape of the lips, and following only the mouth-corner markers would make the movement of these vertices unnatural; the adjustment is intended to prevent this. When movement of the upper-cheek markers is detected, the vertices around those markers are driven based on actually measured data.

[0023] The facial expression detection unit 31 detects gaze and blinks, and the deformation unit 41 accordingly sets the position of the iris and opens and closes the eyelids in the eyeball portion of the wire-frame model 23 of the person image.

[0024] Since the head rotation and movement detection unit 32 and the body rotation and movement detection unit 34 give the three-dimensional position and rotation of each part, the movement is reproduced on the three-dimensional wire-frame model 23 by computing the three-dimensional coordinates of the vertices of the triangular patches in the wire-frame model for the segments between joints. Similarly, the finger model is deformed as appropriate using the information obtained by the finger motion detection unit 33.
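As a sketch of how six pose parameters could be applied to the vertices of a rigid segment, the code below builds a rotation from three axis angles and adds a translation. The rotation order (about x, then y, then z) and the function names are assumptions; the patent does not state which convention the sensor or the model uses.

```python
import math


def rotation_matrix(rx, ry, rz):
    """3x3 rotation R = Rz * Ry * Rx for angles in radians (assumed order)."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_pose(vertices, pose):
    """Rotate then translate each vertex.

    pose: (tx, ty, tz, rx, ry, rz), three translations and three rotations,
    matching the six parameters a magnetic sensor reports.
    """
    tx, ty, tz, rx, ry, rz = pose
    r = rotation_matrix(rx, ry, rz)
    t = (tx, ty, tz)
    return [
        tuple(sum(r[i][j] * v[j] for j in range(3)) + t[i] for i in range(3))
        for v in vertices
    ]


# Turn the head a quarter turn about the z axis and raise it by 5 units in z.
head_vertices = [(1.0, 0.0, 0.0)]
moved = apply_pose(head_vertices, (0.0, 0.0, 5.0, 0.0, 0.0, math.pi / 2))
```

The same function would be applied per body segment, each with the pose reported for that segment.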

[0025] The wire-frame model 23 is deformed in this way, and the texture mapping unit 42 applies the color texture to the deformed triangular patches in real time. Because the color texture was acquired in a form corresponding to the triangular patches in their initial state, the color texture is decimated or interpolated as needed to follow the deformation of the triangular patches caused by the person's movement.
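The resampling described above can be illustrated with a common technique: barycentric coordinates locate a point inside the deformed patch, the same weights give the matching point in the original texture, and bilinear interpolation reads the color there. This is a standard approach swapped in for illustration, using a single-channel texture; the patent does not say how the workstation hardware performs the mapping.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    wa = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    wb = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return wa, wb, 1.0 - wa - wb


def bilinear(texture, u, v):
    """Interpolate a texture (list of rows of numbers) at pixel coordinates."""
    height, width = len(texture), len(texture[0])
    u = min(max(u, 0.0), width - 1)   # clamp to the texture bounds
    v = min(max(v, 0.0), height - 1)
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = u - x0, v - y0
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def sample_patch(p, patch_now, patch_uv, texture):
    """Color at point p of a deformed patch, looked up in the original texture."""
    wa, wb, wc = barycentric(p, *patch_now)
    u = wa * patch_uv[0][0] + wb * patch_uv[1][0] + wc * patch_uv[2][0]
    v = wa * patch_uv[0][1] + wb * patch_uv[1][1] + wc * patch_uv[2][1]
    return bilinear(texture, u, v)


texture = [[0, 10], [20, 30]]
patch_uv = [(0, 0), (1, 0), (0, 1)]    # patch as captured in the texture
patch_now = [(0, 0), (4, 0), (0, 4)]   # same patch after stretching 4x
value = sample_patch((2, 0), patch_now, patch_uv, texture)
```

A stretched patch reads between texture pixels (interpolation), while a shrunken one skips over them (decimation).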

[0026] The three-dimensional person model 23 created as described above is placed in the virtual space created by computer graphics technology using the virtual space data 43.

[0027] FIG. 8 shows an embodiment that implements the operation of the present invention, for two conference participants at two locations. The workstation 61 on one side is, for example, an IRIS WS Crimson, Reality Engine, made by Silicon Graphics, which is capable of high-speed texture mapping, and it is used for real-time three-dimensional display of the person 81 on the other side. The three-dimensional image of the person 81 is displayed on a three-dimensional screen 64 for stereoscopic viewing. One example of such a screen alternately displays images for the left and right eyes at about 60 Hz. The person 71 on the first side wears stereoscopic glasses 72 (whose left and right shutters open and close in synchronization with the screen 64) and a data glove 74 fitted with a magnetic sensor 73, and works cooperatively with the person 81 on the other side.

[0028] The facial expression of the person 81 is detected using one (63) of the two workstations 62 and 63 on the other side (for example, an IRIS WS 4D340/VGX). The image signal from the television camera 52 mounted on the helmet 53 shown in FIG. 3 is split in two. One branch is fed to a chroma key 67 for tracking the blue markers, which binarizes the image into blue regions and regions of other colors. The other branch is fed to a level key 68 for detecting the corneal reflection image, which also binarizes it. Tracking is then performed on the workstation 63.
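In the embodiment the two binarizations are performed by dedicated video equipment. A software analogue, given here only as a sketch, might threshold on blue dominance for the markers and on brightness for the corneal reflection. The threshold values, the function names, and the use of Rec. 601 luma weights are assumptions.

```python
def chroma_key_blue(rgb_image, min_blue=150, margin=50):
    """1 where a pixel is clearly blue, 0 elsewhere.

    rgb_image: list of rows of (r, g, b) tuples with 0-255 components.
    min_blue, margin: hypothetical thresholds; blue must be strong and
    exceed both red and green by at least margin.
    """
    return [[1 if b >= min_blue and b - max(r, g) >= margin else 0
             for (r, g, b) in row]
            for row in rgb_image]


def level_key(rgb_image, min_level=240):
    """1 where a pixel is very bright (candidate corneal reflection), 0 elsewhere."""
    return [[1 if 0.299 * r + 0.587 * g + 0.114 * b >= min_level else 0
             for (r, g, b) in row]
            for row in rgb_image]


# One row: a blue marker pixel, a skin-toned pixel, and a specular highlight.
frame = [[(20, 30, 200), (200, 180, 170), (255, 255, 255)]]
marker_mask = chroma_key_blue(frame)
reflection_mask = level_key(frame)
```

Requiring a margin over red and green keeps the white highlight out of the marker mask even though its blue component is high.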

[0029] The other workstation 62 on that side (for example, an IRIS WS 4D240/VGS) is used to process information from the data glove 85 worn by the person 81 and the magnetic sensor 84. A virtual space like the one on the first side is also constructed on the other side by this workstation 62 and displayed on the three-dimensional screen 65 for stereoscopic viewing, but because of the limits of its texture mapping capability, the person image of the person 81 may be displayed as a simplified computer graphics figure.

[0030] The motion information of the person 81 obtained by the two workstations 62 and 63 on the other side is transferred over Ethernet 66 to the workstation 61 on the first side, where it drives the three-dimensional model of the person 81. So that the persons 71 and 81 can communicate by voice, a microphone 94 and a speaker 93 are placed on the first side, and a microphone 91, a speaker 95, and a delay device 92 are provided on the other side. The delay device 92 delays the audio to account for the processing time needed to detect and generate facial expressions.
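The role of the delay device can be illustrated with a fixed-length sample delay line. The class below is a sketch under the assumption of digital audio samples; the patent does not describe how the delay device 92 is built or what delay it applies.

```python
from collections import deque


class DelayLine:
    """Delay a stream of audio samples by a fixed number of samples."""

    def __init__(self, delay_samples):
        if delay_samples < 0:
            raise ValueError("delay must be non-negative")
        # Pre-fill with silence so the first outputs are zeros.
        self._buffer = deque([0.0] * delay_samples)

    def process(self, sample):
        """Push one input sample and return the sample from delay_samples ago."""
        self._buffer.append(sample)
        return self._buffer.popleft()


# Delay by three samples: the input reappears three steps later.
line = DelayLine(3)
delayed = [line.process(s) for s in [1.0, 2.0, 3.0, 4.0, 5.0]]
```

In practice the delay length would be chosen as the video processing latency multiplied by the audio sample rate, so that speech and lip movement arrive together.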

[0031] The performance of the embodiment shown in FIG. 8 is as follows. Experiments on marker tracking and corneal reflection tracking under various conditions showed that tracking can be performed at about 11 frames per second. For motion generation, with the three-dimensional model 23 of the person's upper body built as a wire-frame model of about 6,700 vertices (1,400 of them for the head), experiments run independently of the motion detection system achieved a display rate of about 11 frames per second. When generation was run together with the expression detection system, the rate was 5 to 6 frames per second when the virtual space contained objects other than the person, and 6 to 7 frames per second when it contained only the person.
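To put the reported frame rates in terms of time per frame, the conversion is simply 1000 ms divided by the rate. The snippet below applies it to figures taken from the paragraph above; the labels and the choice of the 7 and 5 frames-per-second endpoints are illustrative.

```python
def frame_period_ms(frames_per_second):
    """Time available per frame, in milliseconds."""
    return 1000.0 / frames_per_second


reported_rates = {
    "tracking or generation alone": 11,
    "linked, person only (upper figure)": 7,
    "linked, with other objects (lower figure)": 5,
}
periods = {label: round(frame_period_ms(fps), 1)
           for label, fps in reported_rates.items()}
```

At 5 frames per second each displayed frame represents 200 ms, which gives a sense of the scale of delay the audio path has to absorb.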

[0032]

[Effects of the Invention] As described above, according to the present invention, markers are attached over the expression muscles of the face, an image of the face is captured and the markers are tracked in real time, movements of the body, head, fingers, and so on are detected in real time, and a device capable of real-time texture mapping is used for the three-dimensional person model, so that movements can be reproduced in the three-dimensional person model in real time.

[Brief description of drawings]

FIG. 1 is a block diagram showing the overall configuration of one embodiment of the present invention.

FIG. 2 is a diagram for explaining the principle of generating a three-dimensional person image.

FIG. 3 is a diagram for explaining the principle of the facial expression detection unit.

FIG. 4 is a diagram showing the wire-frame driving method when the movement of the marker below the lower lip is small.

FIG. 5 is a diagram showing the wire-frame driving method when the movement of the marker below the lower lip is large.

FIG. 6 is a diagram showing the wire-frame driving method when the markers at both ends of the lips move outward.

FIG. 7 is a diagram showing the wire-frame driving method when the markers at both ends of the lips move inward.

FIG. 8 is a diagram showing an embodiment that implements the operating principle of the present invention.

[Explanation of symbols]

1: real space
2: person
3: three-dimensional person model
4: virtual space
5: stereoscopic display
6: conference participant on the receiving side
20: person image three-dimensional model generation unit
21: body parts
21a: head
21b: upper body
21c: upper arm
21d: forearm
21e: hand
22: shape input device
23: three-dimensional wire-frame model
24: video input device
25: color texture
26: laser range scanner
27: television camera
30: real-time person motion detection unit
31: facial expression detection unit
32: head rotation and movement detection unit
33: finger motion detection unit
34: body rotation and movement detection unit
40: person image generation unit
41: three-dimensional wire-frame model deformation unit
42: color texture mapping unit
43: virtual space three-dimensional data
44: generated virtual space
50: face
51: color marker
52: television camera
53: helmet
54: frame
55: face image
56: window
57: light bulb
58: corneal reflection image
59: television camera
61, 62, 63: workstations
64, 65: three-dimensional screens for stereoscopic viewing
66: Ethernet
67: chroma key
68: level key
74, 85: data gloves
73, 84: magnetic sensors
91, 94: microphones
92: delay device
93, 95: speakers

Continuation of front page
(72) Inventor: 北村泰一, c/o ATR Communication Systems Research Laboratories, 5 Sanpeidani, Inuidani, Seika-cho, Soraku-gun, Kyoto Prefecture
(72) Inventor: 岸野文郎, c/o ATR Communication Systems Research Laboratories, 5 Sanpeidani, Inuidani, Seika-cho, Soraku-gun, Kyoto Prefecture
(56) References: Research Reports of the Faculty of Engineering, Yamanashi University, No. 42 (December 1991), pp. 16-20

Claims

(57) [Claims]

1. A method for real-time recognition and synthesis of a human image, in a communication method in which a model of a person on the transmitting side is constructed by color-texture-mapping a three-dimensional wire-frame model, the person model is placed in a virtual space generated using computer graphics technology, and the person's movements are reproduced and displayed stereoscopically on a stereoscopic display on the receiving side, wherein: markers are attached to the skin surface over the facial expression muscles of the person on the transmitting side, an image of the person's face is captured, and the markers in the image are tracked in real time; an image of the corneal reflection produced when light is directed at the person's cornea is captured to detect gaze and blinks; motion information for the person's body, head, fingers, and so on is detected in real time; in the three-dimensional person model, the wire-frame model of the part corresponding to the face is driven based on the marker tracking results; the position of the iris of the eyeball and the opening and closing of the eyelids of the three-dimensional person model are set based on the gaze and blink detection results; and each corresponding part of the three-dimensional person model is driven by the detected motion information.