JP2005523488A

JP2005523488A - Automatic 3D modeling system and method

Info

Publication number: JP2005523488A
Application number: JP2003522039A
Authority: JP
Inventors: Young Harvill (ハーヴィルヤング)
Original assignee: Pulse Entertainment, Inc. (パルスエンターテインメントインコーポレイテッド)
Priority date: 2001-08-14
Filing date: 2002-08-14
Publication date: 2005-08-04
Also published as: MXPA04001429A; WO2003017206A1; CA2690826A1; CN1628327B; EP1425720A1; JP2008102972A; WO2003017206A9; CN1628327A; JP2011159329A; CA2457839C; CA2690826C; CA2457839A1

Abstract

Problem to be solved: To provide a 3D modeling system and method and, more particularly, a system and method that combine image-based automatic model generation with interactive, real-time character orientation techniques to enable rapid creation of virtual 3D humans.
Solution: An automatic 3D modeling system and method that can generate a 3D model from a photograph or other image. For example, a 3D model of a person's face can be generated automatically. The system and method also make it possible to automatically generate gestures/behaviors associated with a 3D model, so that the gestures/behaviors can be applied to any 3D model.

Description

Related Applications
This application claims priority under 35 USC § 119 from US Provisional Patent Application Serial No. 60/312,384, entitled "Automatic 3D Modeling System and Method" and filed August 14, 2001, which is incorporated herein by reference.
The present invention relates to 3D modeling systems and methods and, more particularly, to a system and method that combine image-based automatic model generation with interactive, real-time character orientation techniques to enable rapid creation of virtual 3D humans.

Many different techniques exist for generating an animation of a three-dimensional object on a computer display. Early animation was not of high quality, so animated figures (for example, faces) looked as if they were made of wood. In particular, the user would typically see an animated face whose features and expressions were static. The mouth might open and close and the eyes might blink, but the facial expressions, and the animation in general, resembled a wooden puppet. The problem was that these animations were typically created from scratch as drawings and were not rendered using an underlying 3D model to capture a more realistic appearance, so the animation looked unrealistic and not very lifelike. More recently, animation has improved so that a skin can cover the skeleton of a figure, yielding a more realistic animated figure.

To capture a more realistic appearance, such animations are now rendered over one or more deformation grids, but the animation is still often rendered by a professional company and redistributed to users. This yields high-quality animation, but it is limited in that the user cannot customize a particular animation, for example an animation of the user himself or herself, for use as a virtual person. As the capabilities of the Internet or the World Wide Web advance, these virtual persons will extend the capabilities of, and interactions between, users. It is therefore desirable to provide a 3D modeling system and method that allow a typical user to quickly and easily generate, from an image such as a photograph, a 3D model that is useful as a virtual person.

US Provisional Patent Application Serial No. 60/312,384

Once a skilled animator has created a model, typical systems have also required that the same animator animate the various gestures one might want the model to perform. For example, the animator would create animations of a smile, a hand wave, or speech, and these animations would then be incorporated into the model to give it the desired gestures. This process of generating behavior/gesture data is slow and expensive, and it requires a skilled animator. It is desirable to provide an automatic mechanism for generating gestures and behaviors for a model without the help of a skilled animator. It is to these ends that the present invention is directed.

Broadly, the invention uses image processing techniques, statistical analysis, and 3D geometric deformation to automatically generate a photorealistic 3D model of an object, such as a human face, from one image (or from multiple images). For a human face, for example, the facial proportions and feature details obtained from a photograph (or a series of photographs) are identified and used to generate an appropriate 3D model. Image processing and texture mapping techniques also optimize how the photograph or photographs are used as detailed, photorealistic texture for the 3D model.

According to another aspect of the invention, a gesture of a person can be captured and abstracted so that it can be applied to any other model. For example, the animated smile of a particular person can be captured. The smile can then be converted into feature space to provide an abstraction of the gesture. The abstracted gesture (for example, the movements of the different parts of the model) is captured as one gesture, which can then be used for any other model. Thus, according to the invention, the system permits the generation of a gesture model that can be used with other models.

According to the invention, a method for generating a three-dimensional model of an object from an image is provided. The method includes determining the boundary of the object to be modeled and determining the locations of one or more landmarks on the object to be modeled. The method further includes determining the scale and orientation of the object in the image based on the locations of the landmarks, aligning the image of the object, with its landmarks, to a deformation grid, and generating a 3D model of the object based on the mapping of the image of the object onto the deformation grid.

According to another aspect of the invention, a computer-implemented system for generating a three-dimensional model from an image is provided. The system includes a three-dimensional model generation module that has instructions for receiving an image of an object and instructions for automatically generating a three-dimensional model of the object. The system further includes a gesture generation module that has instructions for generating a feature space and instructions for generating a gesture object corresponding to a gesture of the object, so that the gesture behavior can be applied to another model of an object.

According to yet another aspect of the invention, a method for automatically generating a gesture model is provided. The method includes receiving an image of an object performing a particular gesture and determining, from the movement of the object, the movements associated with the gesture in order to generate a gesture object. The gesture object includes a coloration change variable that stores the changes in coloration that occur during the gesture, a two-dimensional change variable that stores the changes of the surface that occur during the gesture, and a three-dimensional change variable that stores the changes of the vertices associated with the object during the gesture.

According to yet another aspect of the invention, a gesture object data structure that stores data associated with a gesture of an object is provided. The gesture object includes a texture change variable that stores the changes in coloration of the model during the gesture, a texture map change variable that stores the changes in the surface of the model during the gesture, and a vertex change variable that stores the changes in the vertices of the model during the gesture. Together, the texture change variable, the texture map change variable, and the vertex change variable allow the gesture to be applied to another model that has a texture and vertices. The gesture object data structure stores its data in a vector space in which the coloration, surface motion, and 3D motion can be used by many individual instances of a model.

Although the invention has broad applicability, it is described below in the context of generating a 3D model of a human face and the gestures associated with a human. Those skilled in the art will recognize that any other 3D model and gestures can be generated using the principles and techniques described herein. The following description is therefore merely an example of one particular application, and the invention is not limited to the facial models described herein.

To generate a 3D model of a human face, the invention preferably performs a series of complex image processing techniques to determine a set of landmark points 10 that serve as guides for generating the 3D model. FIG. 1 is a flowchart describing a preferred algorithm for generating a 3D model of a human face. Referring to FIG. 1, an image acquisition process (step 1) is used to load one or more photographs (or other images) of a human face (for example, a "head shot") into computer memory. Images are preferably loaded as JPEG images, although other image formats may be used without departing from the invention. An image may be loaded from a diskette, downloaded from the Internet, or otherwise loaded into memory using known techniques, so that the image processing techniques of the invention can be performed on the image to generate the 3D model.

Because different images may have different orientations, the proper orientation of an image should be determined by locating and grading appropriate landmark points 10. Determining the image orientation allows a more realistic rendering of the image onto the deformation grids. Locating the appropriate landmark points 10 is described in detail below.
Referring to FIG. 1, to locate landmark points 10 on an image, a "seed fill" operation is preferably performed on the image (step 2) to eliminate the variable background of the image so that the boundary of the head (in the case of a face) can be isolated. FIG. 4 is an exemplary image 20 of a human head that can be loaded into computer memory during the image acquisition process (FIG. 1, step 1). The "seed fill" operation (FIG. 1, step 2) is a well-known recursive paint-fill operation. It is accomplished by identifying one or more points 22 in the background 24 of the image 20, for example based on the color and luminosity of the point or points 22, and expanding a paint-fill area 26 outward from the points 22 wherever the color and luminosity are similar. Preferably, the "seed fill" operation successfully replaces the color and luminosity of the background 24 with an opaque background, so that the boundary of the head can be determined more easily.
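The seed fill step can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows of integer intensities, 4-connected neighbors, and a fixed similarity tolerance; the function name `seed_fill_background` and the `tolerance` parameter are illustrative choices, not names from the patent, and the patent's pseudocode figures may differ.

```python
from collections import deque


def seed_fill_background(image, seeds, tolerance=12):
    """Return a mask that is True wherever a pixel was reached by the fill.

    image: list of rows of grayscale intensities (assumed representation).
    seeds: (row, col) points assumed to lie in the background.
    tolerance: assumed maximum intensity difference from the seed pixel.
    """
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    for seed_row, seed_col in seeds:
        seed_value = image[seed_row][seed_col]
        mask[seed_row][seed_col] = True
        queue = deque([(seed_row, seed_col)])
        while queue:
            row, col = queue.popleft()
            # Grow the filled area outward over 4-connected neighbors.
            for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                r, c = row + d_row, col + d_col
                if (0 <= r < height and 0 <= c < width and not mask[r][c]
                        and abs(image[r][c] - seed_value) <= tolerance):
                    mask[r][c] = True
                    queue.append((r, c))
    return mask
```

An explicit queue is used instead of recursion so that large backgrounds do not exhaust the call stack. A color image would need a distance measure over color and luminosity rather than a single intensity difference.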

Referring again to FIG. 1, the boundary of the head 30 can be determined (step 3) by, for example, locating the vertical center of the image (line 32) and integrating across a horizontal area 34 from the center line 32 (using a non-fill operation) to determine the width of the head 30, and by locating the horizontal center of the image (line 36) and integrating across a vertical area 38 from the center line 36 (using a non-fill operation) to determine the height of the head 30. In other words, a statistically directed linear integration is performed over a field of pixels whose values differ depending on whether the object or the background is present. This is shown in FIG. 5, which shows the exemplary image 20 of FIG. 4 with an opaque background 24.
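One simplified reading of this integration can be sketched as follows. It assumes a boolean background mask (True for background, for example as produced by a seed fill) and that the image center lies on the head. The function name `head_extent` and the `band` parameter, which sets the thickness of the integrated strip, are hypothetical. The source does not give the statistical weighting it uses, so this sketch only counts non-fill pixels.

```python
def head_extent(background_mask, band=1):
    """Estimate head bounds as inclusive (left, right, top, bottom) indices.

    background_mask[r][c] is True for background pixels (assumed input).
    band is an assumed half-thickness of the strip that is integrated.
    """
    height, width = len(background_mask), len(background_mask[0])
    center_row, center_col = height // 2, width // 2

    def object_pixels_in_column(col):
        # Integrate non-fill pixels over a horizontal strip around the center row.
        rows = range(max(0, center_row - band), min(height, center_row + band + 1))
        return sum(1 for row in rows if not background_mask[row][col])

    def object_pixels_in_row(row):
        # Integrate non-fill pixels over a vertical strip around the center column.
        cols = range(max(0, center_col - band), min(width, center_col + band + 1))
        return sum(1 for col in cols if not background_mask[row][col])

    def run_around(center, limit, count):
        # Walk outward from the center line while object pixels are present.
        low = high = center
        while low - 1 >= 0 and count(low - 1) > 0:
            low -= 1
        while high + 1 < limit and count(high + 1) > 0:
            high += 1
        return low, high

    left, right = run_around(center_col, width, object_pixels_in_column)
    top, bottom = run_around(center_row, height, object_pixels_in_row)
    return left, right, top, bottom
```

The head width is then `right - left + 1` and the head height is `bottom - top + 1`.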

Referring again to FIG. 1, once the width and height of the head 30 have been determined, the boundary of the head 30 can be determined by using the statistical properties of the height of the head 30 together with the known properties of the integrated horizontal area 34 and the top of the head 30. Typically, the height of the head is about 2/3 of the image height and the width of the head is about 1/3 of the image width. The height of the head may also be taken as 1.5 times the width of the head, which is used as a first approximation.
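The stated proportions can be written out as a small worked example. The function names are illustrative only, and the ratios are the typical values quoted in the text, not measured results.

```python
def approximate_head_size(image_width, image_height):
    """Typical proportions from the text: width ~ 1/3, height ~ 2/3 of the image."""
    return image_width / 3.0, image_height * 2.0 / 3.0


def head_height_from_width(head_width):
    """First approximation from the text: head height ~ 1.5 x head width."""
    return 1.5 * head_width
```

For a 300 x 300 pixel image, this gives a head about 100 pixels wide and 200 pixels high. The 1.5 ratio would give 150 pixels for the same width, so the two estimates serve only as starting points to be refined by the measured bounds.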

Once the boundary of the head 30 has been determined, the locations of the eyes 40 can be determined (step 4). Because the eyes 40 are typically located in the upper half of the head 30, a statistical calculation can be used, and the head boundary can be divided into an upper half 42 and a lower half 44 to isolate the eye bound areas 46a and 46b. The upper half 42 of the head boundary can be further divided into a right portion 46a and a left portion 46b to isolate the left eye 40a and the right eye 40b, respectively. This is shown in detail in FIG. 6, which shows the exemplary image 20 of FIG. 4 with dashed lines marking the particular bound areas.
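The subdivision described here can be sketched as below, assuming the head boundary is an axis-aligned box given as `(left, top, right, bottom)` in pixel coordinates with exclusive right and bottom edges. The function name and the box convention are assumptions made for illustration.

```python
def eye_search_regions(left, top, right, bottom):
    """Split the upper half of a head box into two eye search boxes.

    Boxes are (left, top, right, bottom) with exclusive right/bottom edges
    (an assumed convention). Returns (image_left_box, image_right_box).
    """
    mid_row = top + (bottom - top) // 2   # boundary between upper and lower half
    mid_col = left + (right - left) // 2  # boundary between left and right portion
    image_left_box = (left, top, mid_col, mid_row)
    image_right_box = (mid_col, top, right, mid_row)
    return image_left_box, image_right_box
```

The boxes are named by their position in the image, not by the subject's left or right eye, because which one is which depends on the direction the subject faces.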

Still referring to FIG. 1, the centermost region of each eye 40a and 40b can be located by identifying a circular region 48 of high-contrast luminance within the respective eye bounds 46a and 46b (step 5). This operation can be performed recursively outward from the centermost point 48 over the bounded areas 46a and 46b, and the results can be graded to determine the proper bounds of the eyes 40a and 40b. FIG. 7 shows the exemplary image of FIG. 6 with the high-contrast luminance portions of the eyes identified by dashed lines.
Referring again to FIG. 1, once the eyes 40a and 40b have been identified, the scale and orientation of the head 30 can be determined (step 6) by analyzing a line 50 that connects the eyes 40a and 40b and determining the angular offset of the line 50 from the horizontal axis of the screen. The scale of the head 30 can be derived from the width of the bounds according to the formula: width of the bounds / width of the model.
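Step 6 reduces to two small calculations, sketched here under the assumption that each eye center is an `(x, y)` pixel coordinate. The function names are illustrative and do not come from the patent.

```python
import math


def head_orientation_degrees(first_eye, second_eye):
    """Angular offset of the line joining the eyes from the horizontal axis.

    Eye centers are (x, y) pixel coordinates (assumed representation). In image
    coordinates y usually grows downward, so the sign convention follows that axis.
    """
    (x1, y1), (x2, y2) = first_eye, second_eye
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def head_scale(bounds_width, model_width):
    """Scale from the text's formula: width of the bounds / width of the model."""
    return bounds_width / model_width
```

For example, eyes at (40, 50) and (80, 50) lie on a horizontal line, so the offset is 0 degrees. A head boundary 120 pixels wide matched to a model 60 units wide gives a scale of 2.0.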

Once the above information has been determined, the approximate landmark points 10 on the head 30 can be properly identified. Preferred landmark points 10 include a) the outer head bounds 60a, 60b, and 60c; b) the inner head bounds 62a, 62b, 62c, and 62d; c) the right eye bounds 64a to 64d and the left eye bounds 64w to 64z; d) the corners of the nose 66a and 66b; and e) the corners of the mouth 68a and 68e (the mouth line). Those skilled in the art will recognize that other landmark points may be used without departing from the invention. FIG. 8 shows an example of the above landmark points for the image of FIG. 4.

Once the appropriate landmark locations 10 on the head 30 have been determined, the image can be properly aligned with one or more deformation grids (described below) that define the 3D model 70 of the head (step 7). Several deformation grids that can be used to define the 3D model 70 are described below. Those skilled in the art will recognize that these are merely examples and that other deformation grids may be used without departing from the invention. FIG. 9 shows an example of a 3D model of a human face generated using the 3D model generation method according to the invention. The 3D model generation system will now be described in more detail.

FIG. 2 shows an example of a computer system 70 in which the 3D model generation method and the gesture model generation method can be implemented. In particular, the 3D model generation method and the gesture model generation method can be implemented as one or more pieces of software code (or compiled software code) executed by a computer system. The methods according to the invention can also be implemented on a hardware device into which the methods are programmed. Returning to FIG. 2, the computer system 70 shown is a personal computer system. However, the invention can be implemented on a variety of different computer systems, such as client/server systems, server systems, and workstations, and it is not limited to implementation on any particular computer system.
The illustrated computer system may include a display device 72, such as a CRT or LCD; a chassis 74; and one or more input/output devices, such as the keyboard 76 and mouse 78 shown, that allow the user to interact with the computer system. For example, the user may enter data or commands into the computer system using the keyboard or mouse and may receive output data from the computer system through the display device (visual data) or a printer (not shown). The chassis 74 may house the computing resources of the computer system. These may include one or more central processing units (CPUs) 80, which control the operation of the computer system in a well-known manner; a persistent storage device 82, such as a hard disk drive, an optical disk drive, or a tape drive, which stores the data and instructions executed by the CPU even when the computer system is not powered; and a memory 84, such as DRAM, which temporarily stores the data or instructions currently being executed by the CPU in a well-known manner and loses its data when the computer system is not powered. To implement the 3D model generation and gesture model generation methods according to the invention, the memory may store a 3D modeler 86, which is a series of instructions and data executed by the CPU 80 to carry out the 3D model and gesture generation methods described above. The 3D modeler is described in more detail below.

FIG. 3 is a more detailed diagram of the 3D modeler 86 shown in FIG. 2. In particular, the 3D modeler includes a 3D model generation module 88 and a gesture generation module 90, each of which is implemented using one or more computer program instructions. FIGS. 12A-12B and FIGS. 14A-14B show pseudocode that can be used to implement each of these modules. As shown in FIG. 3, an image of an object, such as a human face, is input into the system. The image is fed to the 3D model generation module as well as to the gesture generation module. The output of the 3D model generation module is a 3D model of the image, generated automatically as described above. The output of the gesture generation module is one or more gesture models, which can then be applied to and used with any 3D model, including any model generated by the 3D model generation module. The gesture generator is described in more detail below with reference to FIG. 11. In this manner, the system permits a 3D model of any object to be rapidly generated and implemented. In addition, the gesture generator permits one or more gesture models, such as a smile gesture or a hand wave, to be generated automatically from a particular image.
An advantage of the gesture generator is that the gesture models can then be applied to any 3D model. The gesture generator also eliminates the need for a skilled animator to implement a gesture. The deformation grids for 3D model generation are described next.

FIGS. 10A-10D show exemplary deformation grids that can be used to define the 3D model 70 of a human head. FIG. 10A shows a bounds space deformation grid 72, which is preferably the innermost deformation grid. A feature space deformation grid 74 (shown in FIG. 10B) overlays the bounds space deformation grid 72. An edge space deformation grid 76 (shown in FIG. 10C) preferably overlays the feature space deformation grid 74. FIG. 10D shows a detail deformation grid 78, which is preferably the outermost deformation grid.

The grids are preferably aligned according to the landmark locations 10 (shown in FIG. 10E), so that the head image 30 is approximately aligned with the deformation grids when its landmark locations 10 are aligned with the landmark locations 10 of the deformation grids. To properly align the head image 30 with the deformation grids, the user may manually refine the landmark location precision on the head image 30 (step 8), for example by using a mouse or other input device to "drag" a particular landmark to a different area of the image 30. Using the new landmark location information, the image 30 may be modified with respect to the deformation grids, where appropriate, to properly align the head image 30 with the deformation grids (step 9). A new model state can then be calculated, the detail grid 78 can then be detached (step 10), behaviors can be scaled for the resulting 3D model (step 11), and the model can be saved for use as a virtual person (step 12). Automatic gesture generation according to the invention is described in more detail below.

FIG. 11 is a flowchart showing an automatic gesture model generation method 100 according to the invention. In general, automatic gesture generation produces a gesture object that can then be applied to any 3D model, so that gesture behavior can be generated rapidly and reused with other models. Usually, separate gesture models will be needed for different types of 3D models. For example, a smile gesture may need to be generated automatically for a man, a woman, a boy, and a girl to make the gesture more realistic. The method begins at step 102, in which a common feature space is generated. The feature space is a common space used to store and represent an image of an object such as a face, the movements of the object during a gesture, and the object scalars that capture the differences between different objects. The gesture object generated by this method also stores a scalar field variable, which stores the mapping between model space and feature space that permits the transformation of motion and geometry data. The automatic gesture generation method uses a particular image of an object, such as a face, to generate an abstraction of a gesture of the object, such as a smile. The abstraction is then stored as a gesture object so that the gesture object can subsequently be applied to any 3D model.

Returning to FIG. 11, in step 104 the method determines the correlation between the feature space and the image space in order to determine the texture map changes, which represent changes in the motion of the image's surface during the gesture. In step 106, the method updates the texture map from the image (to check the correlation), applies the resulting texture map to the feature space, and generates the variable “stDeltaChange”, which stores the texture map changes, as shown in the exemplary pseudocode of FIGS. 14A and 14B. In step 108, the method determines the changes in the 3D vertices of the image model during the gesture, which capture the 3D movement that occurs during the gesture. In step 110, the vertex changes are applied to the feature space and captured in the variable “VertDeltaChange” in the gesture object, as shown in FIGS. 14A and 14B. In step 112, the method determines the texture coloring that occurs during the gesture and applies it to the feature space. The texture coloring is captured in the “DeltaMap” variable in the gesture object. In step 114, a gesture object is generated that contains the “stDeltaChange”, “VertDeltaChange”, and “DeltaMap” variables, which together hold the coloring and the 2D and 3D movements that occur during the gesture. These variables represent only the movement and color changes that occur during the gesture, so that the gesture object can then be applied to any 3D model. In short, the gesture object distills a gesture that exists in a particular image model into an abstract object containing the essential elements of that gesture, so that the gesture can then be applied to any 3D model.
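As an illustration of steps 104 through 114, the following Python sketch records only the differences between a neutral state and a gesture state. It is a minimal sketch under stated assumptions, not the pseudocode of FIGS. 14A and 14B (which is not reproduced in this text): it assumes both states have already been mapped into the common feature space so that entries correspond index by index, and the names `GestureObject` and `build_gesture_object` and the snake_case field names are hypothetical stand-ins for the “stDeltaChange”, “VertDeltaChange”, and “DeltaMap” variables.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


@dataclass
class GestureObject:
    """Holds only the changes between a neutral state and a gesture state."""
    st_delta_change: List[Vec2]    # 2D texture-coordinate (surface) motion
    vert_delta_change: List[Vec3]  # 3D vertex motion
    delta_map: List[Vec3]          # RGB color change, one entry per texel


def _sub(a, b):
    """Component-wise difference a - b for equal-length tuples."""
    return tuple(x - y for x, y in zip(a, b))


def build_gesture_object(neutral_st, gesture_st,
                         neutral_verts, gesture_verts,
                         neutral_texels, gesture_texels) -> GestureObject:
    """Record gesture-minus-neutral deltas (hypothetical helper).

    Assumes every pair of inputs is already aligned in feature space.
    """
    pairs = ((neutral_st, gesture_st),
             (neutral_verts, gesture_verts),
             (neutral_texels, gesture_texels))
    for neutral, gesture in pairs:
        if len(neutral) != len(gesture):
            raise ValueError("neutral and gesture data must align")
    return GestureObject(
        st_delta_change=[_sub(g, n) for n, g in zip(neutral_st, gesture_st)],
        vert_delta_change=[_sub(g, n)
                           for n, g in zip(neutral_verts, gesture_verts)],
        delta_map=[_sub(g, n) for n, g in zip(neutral_texels, gesture_texels)],
    )
```

Storing deltas rather than absolute positions and colors is what makes the object independent of the model it was captured from: nothing in it describes the first model's own shape or texture.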

The gesture object also includes a scalar field variable that stores the mapping between the feature space of the gesture and the model space of the model, allowing geometric and motion data to be transformed. The “scalerArray” has an entry for each geometric vertex in the gesture object. Each entry is a three-dimensional vector that holds the change in scale for that vertex at the feature level, from its undeformed state to its deformed state. The scale is computed for each vertex in the feature space by evaluating the scalar change in the distance from that vertex to each of its connected vertices. The scalar for a given gesture vertex is computed by weighted interpolation of that vertex's position when it is mapped into the UV space of a polygon at the feature level. The shape and size of the polygons at the feature level are chosen to match regions of similarly scaled movement, which are determined by analyzing the visual flow of typical facial gestures. The method described above is shown in more detail in the pseudocode of FIGS. 14A and 14B.
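The scalar-field computation described above can be sketched as follows. This is one possible reading of the description, not the method of FIGS. 14A and 14B: it assumes the per-axis scale change at a feature vertex is the ratio of summed neighbor offsets after and before deformation, that the feature-level polygon is a triangle, and that the weighted interpolation is barycentric. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def feature_scale_change(undeformed: Sequence[Vec3],
                         deformed: Sequence[Vec3],
                         neighbors: Sequence[Sequence[int]],
                         eps: float = 1e-9) -> List[Vec3]:
    """Per-vertex, per-axis ratio of neighbor offsets (deformed / undeformed).

    neighbors[i] lists the indices of the vertices connected to vertex i.
    An axis with no measurable offset before deformation gets scale 1.0.
    """
    scales = []
    for i, linked in enumerate(neighbors):
        entry = []
        for axis in range(3):
            before = sum(abs(undeformed[j][axis] - undeformed[i][axis])
                         for j in linked)
            after = sum(abs(deformed[j][axis] - deformed[i][axis])
                        for j in linked)
            entry.append(after / before if before > eps else 1.0)
        scales.append(tuple(entry))
    return scales


def barycentric_weights(p: Vec2, a: Vec2, b: Vec2, c: Vec2):
    """Weights of UV point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("degenerate triangle")
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w0, w1, 1.0 - w0 - w1


def interpolate_scaler(p_uv: Vec2,
                       tri_uv: Sequence[Vec2],
                       tri_scales: Sequence[Vec3]) -> Vec3:
    """Blend the three corner scale vectors at a gesture vertex's UV position."""
    w = barycentric_weights(p_uv, *tri_uv)
    return tuple(sum(w[k] * tri_scales[k][axis] for k in range(3))
                 for axis in range(3))
```

A “scalerArray” in this sketch would simply be the list of `interpolate_scaler` results, one per gesture vertex, each evaluated inside the feature-level triangle that contains that vertex.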
FIGS. 12A-12B and FIGS. 13A-13B contain, respectively, sample pseudocode algorithms and an exemplary workflow process for automatically generating a 3D model in accordance with the present invention.

An automatically generated model can incorporate built-in behavioral animation and interactivity. For a human face, for example, such expressions include gestures, mouth positions for lip synchronization (visemes), and head movements. These behaviors can be integrated with technologies such as automatic lip synchronization, speech synthesis, natural language processing, and speech recognition, and they can trigger, or be triggered by, user or data-driven events. For example, real-time lip synchronization of an automatically generated model can be associated with an audio track. In addition, real-time analysis of the speech spoken by an intelligent agent can be provided, and synchronized head and facial gestures can be initiated to provide automatic, lifelike movement accompanying the delivery of the speech.

Thus, a virtual human can be deployed to act as an intelligent agent that serves as an interactive, responsive front end to information contained in knowledge bases, customer resource management systems, and learning management systems, as well as in entertainment applications and communications via chat, instant messaging, and email. An example of a gesture that is generated from an image of a 3D model and then applied to another model in accordance with the present invention is described below.

FIG. 15 shows an example of a base 3D model for the first model, Kristen. The 3D model shown in FIG. 15 was previously generated as described above using the 3D model generation process. FIG. 16 shows a second 3D model generated as described above. These two models will be used to illustrate automatically generating a smile gesture from an existing model to produce a gesture object, and then applying that gesture object to another 3D model. FIG. 17 shows an example of the first model in a neutral gesture, and FIG. 18 shows an example of the first model in a smile gesture. The smile gesture of the first model is then captured as described above. FIG. 19 shows an example of the smile gesture map (a graphical version of the gesture object described above) generated from the first model based on the neutral gesture and the smile gesture. As described above, the gesture map abstracts the gesture behavior of the first model into a series of coloring changes, texture map changes, and 3D vertex changes, which can then be applied to any other 3D model that has a texture map and 3D vertices. This gesture map (which contains the variables described above) can then be used to apply the gesture object to another model in accordance with the present invention. In this way, the automatic gesture generation process allows various gestures of a 3D model to be abstracted and then applied to other 3D models.

FIG. 20 is an example of the feature space with both models overlaid on one another, showing that the feature spaces of the first and second models are aligned with each other. The application of the gesture map (and therefore the gesture object) to another model is now described in more detail. In particular, FIG. 21 shows the neutral gesture of the second model. FIG. 22 shows the smile gesture (from the gesture map generated from the first model) applied to the second model, giving the second model a smile gesture even though the second model was never actually captured smiling.
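Applying a stored gesture to a second model, as in FIGS. 21 and 22, might look like the following Python sketch. It assumes the target model's vertices, texture coordinates, and texels are already aligned index by index with the gesture data through the shared feature space, that colors lie in the range 0 to 1, and that the scalar array simply rescales each vertex delta per axis. The text does not spell out these details, so the function name `apply_gesture` and the `weight` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def _clamp01(value: float) -> float:
    """Keep a color channel inside the assumed [0, 1] range."""
    return max(0.0, min(1.0, value))


def apply_gesture(verts: Sequence[Vec3],
                  uvs: Sequence[Vec2],
                  texels: Sequence[Vec3],
                  vert_delta: Sequence[Vec3],
                  st_delta: Sequence[Vec2],
                  delta_map: Sequence[Vec3],
                  scaler_array: Sequence[Vec3],
                  weight: float = 1.0):
    """Return new (verts, uvs, texels) for a neutral model plus a gesture.

    weight blends the gesture in: 0.0 leaves the model neutral and
    1.0 applies the full gesture (an assumption made for illustration).
    """
    new_verts: List[Vec3] = [
        tuple(v[a] + weight * s[a] * d[a] for a in range(3))
        for v, d, s in zip(verts, vert_delta, scaler_array)
    ]
    new_uvs: List[Vec2] = [
        tuple(t[a] + weight * d[a] for a in range(2))
        for t, d in zip(uvs, st_delta)
    ]
    new_texels: List[Vec3] = [
        tuple(_clamp01(c[a] + weight * d[a]) for a in range(3))
        for c, d in zip(texels, delta_map)
    ]
    return new_verts, new_uvs, new_texels
```

Because the inputs are deltas, the same call works on any neutral model with matching topology in feature space, which is the reuse property the description attributes to the gesture object.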

Although the foregoing has been described with reference to a particular method for locating landmark points on an image and a particular method for generating gestures, those skilled in the art will recognize that other techniques may be used without departing from the invention as defined by the claims. For example, a technique such as a pyramid transform may be used, which performs frequency analysis of the image by downsampling at each level and analyzing the frequency differences between levels. Other techniques, such as side sampling and image pyramid techniques, can also be used to process the image. Furthermore, orthogonal (low-pass) filtering techniques can be used to increase the signal strength of facial features, and fuzzy logic techniques can be used to identify the overall location of the face. The locations of the landmarks can then be determined by a known corner-finding algorithm.
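The pyramid transform mentioned above can be illustrated with a small pure-Python sketch. It is a generic difference-of-levels pyramid (similar in spirit to a Laplacian pyramid), not a technique specified by this document: it assumes a grayscale image stored as a list of rows, 2x2 box averaging for downsampling, and pixel replication for upsampling. All function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]


def downsample(img: Image) -> Image:
    """Halve each dimension by averaging 2x2 blocks."""
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("image dimensions must be even")
    return [[(img[y][x] + img[y][x + 1]
              + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample(img: Image) -> Image:
    """Double each dimension by repeating every pixel."""
    out: Image = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def band_pyramid(img: Image, levels: int) -> Tuple[List[Image], Image]:
    """Split an image into per-level detail bands plus a coarse residual.

    Each band is the difference between a level and the upsampled copy
    of the next coarser level, i.e. the detail lost by downsampling.
    """
    bands: List[Image] = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        coarse = upsample(smaller)
        bands.append([[c - k for c, k in zip(row_c, row_k)]
                      for row_c, row_k in zip(current, coarse)])
        current = smaller
    return bands, current
```

In a band like this, strong values mark places where detail exists at that scale, which is the kind of cue a landmark or corner finder could use. A flat region produces a band of zeros.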

FIG. 1 is a flowchart describing a method for generating a 3D model of a human face.
FIG. 2 is a diagram showing an example of a computer system that can be used to implement the 3D modeling method according to the present invention.
FIG. 3 is a block diagram showing the 3D model generation system according to the present invention in more detail.
FIG. 4 is a diagram showing an exemplary image of a person's head that can be loaded into the memory of a computer during the image acquisition process.
FIG. 5 is a diagram showing the exemplary image of FIG. 4 with an opaque background after the image has been processed with a “seed fill” operation.
FIG. 6 is a diagram showing the exemplary image of FIG. 5 with dashed lines indicating the particular bounding areas around the positions of the eyes.
FIG. 7 is a diagram showing the exemplary image of FIG. 6 with the high-contrast luminance portions of the eyes identified by dashed lines.
FIG. 8 is an exemplary diagram showing various landmark location points for a human head.
FIG. 9 is a diagram showing an example of a 3D model of a human face according to the present invention.
FIGS. 10A-10D are diagrams each showing one of the deformation grids that can be used to generate a 3D model of a human head.
FIG. 10E is a diagram showing the deformation grids overlaid on one another.
FIG. 11 is a flowchart illustrating an automatic gesture behavior generation method according to the present invention.
FIGS. 12A and 12B are diagrams showing exemplary pseudocode for executing the image processing techniques of the present invention.
FIGS. 13A and 13B are diagrams showing an exemplary workflow process for automatically generating a 3D model in accordance with the present invention.
FIGS. 14A and 14B are diagrams showing exemplary pseudocode for executing an automatic gesture behavior model in accordance with the present invention.
FIG. 15 is a diagram showing an example of a base 3D model for the first model, Kristen.
FIG. 16 is a diagram showing an example of a base 3D model for the second model, Ellie.
FIG. 17 is a diagram showing an example of the first model in a neutral gesture.
FIG. 18 is a diagram showing an example of the first model in a smile gesture.
FIG. 19 is a diagram showing an example of a smile gesture map generated from the neutral gesture and the smile gesture of the first model.
FIG. 20 is a diagram showing an example of the feature space with both models overlaid on one another.
FIG. 21 is a diagram showing an example of a neutral gesture for the second model.
FIG. 22 is a diagram showing an example of the smile gesture generated from the first model and applied to the second model to produce a smile gesture on the second model.

Explanation of symbols

86 3D modeler
88 3D model generation module
90 Gesture generation module

Claims

A method for generating a three-dimensional model of an object from an image, the method comprising the steps of:
determining the boundary of the object to be modeled;
determining the position of one or more landmarks on the object to be modeled;
determining the scale and orientation of the object in the image based on the positions of the landmarks;
aligning the image of the object having the landmarks with a deformation grid; and
generating a 3D model of the object based on a mapping of the image of the object to the deformation grid.

The method of claim 1, wherein the step of determining the boundary further comprises a statistically directed linear integration of a field of pixels having different values depending on the presence of an object or the presence of a background.

The method of claim 1, wherein determining the boundary further comprises performing a statistically oriented seed fill operation to remove background around an image of the object.

The method of claim 3, wherein determining the landmark further comprises identifying features found by procedural correlation or bandpass filtering, and thresholding within a statistically characterized area as determined during the boundary determination.

The method of claim 4, wherein determining the landmark further comprises determining an additional landmark based on refinement of the boundary area.

The method of claim 5, wherein determining the landmark further comprises adjusting the landmark by a user.

A computer-implemented system for generating a three-dimensional model of an image, the system comprising:
a 3D model generation module further comprising instructions for receiving an image of an object and instructions for automatically generating a 3D model of the object; and
a gesture generation module further comprising instructions for generating a feature space and instructions for generating a gesture object corresponding to a gesture of the object, so that the behavior of the gesture can be applied to a model of another object.

A method for automatically generating an automatic gesture model, the method comprising:
receiving an image of an object performing a specific gesture; and
determining a movement associated with the gesture from the movement of the object to generate a gesture object,
wherein the gesture object further includes a color change variable that stores a color change that occurs during the gesture, a two-dimensional change variable that stores a surface change that occurs during the gesture, and a three-dimensional change variable that stores changes in the vertices associated with the object during the gesture.

The method of claim 8, further comprising generating a feature space to which the gesture is mapped during the automatic gesture generation process.

The method of claim 9, wherein determining the movement further comprises determining a correlation between the feature space and an image of the object.

The method of claim 9, further comprising transforming the geometric and motion vectors into and out of the feature space.

The method of claim 9, further comprising applying color scheme, texture motion, and geometric motion changes from one model to another using the feature space.

A gesture object data structure for storing data associated with a gesture of an object, the data structure comprising:
a scalar field variable that stores a mapping between the feature space of the gesture and the model space of a model to enable transformation of geometric and motion data;
a texture change variable that stores changes in the coloring of the model during the gesture;
a texture map change variable that stores changes in the surface of the model during the gesture; and
a vertex change variable that stores changes in the vertices of the model during the gesture,
wherein the texture change variable, the texture map change variable, and the vertex change variable allow the gesture to be applied to another model having a texture and vertices.