JP2002232782A - Image processor, method therefor and record medium for program

Info

Publication number: JP2002232782A
Application number: JP2001029149A
Authority: JP
Inventors: Takayuki Ashigahara; 隆之芦ヶ原
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2001-02-06
Filing date: 2001-02-06
Publication date: 2002-08-16

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus that can easily execute image compositing in which a specific replacement target object in a material video is replaced by a paste target object captured by a camera. SOLUTION: The image processor acquires shooting direction data for the specific replacement target object (for example, the face of a specific person on screen) in a material video such as a movie or drama, or in image data generated from 3DCG, and executes the pasting process by compositing the paste target object (for example, the face of a user), which is captured by a camera whose position is controlled based on the shooting direction data. With this configuration, image compositing can be achieved without forcing the paste target object (for example, the user's face) to perform various motions.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an image processing apparatus, an image processing method, and a program storage medium. More particularly, it relates to an image processing apparatus, an image processing method, and a program storage medium that capture a user's face with a camera moving along a predetermined trajectory in synchronization with a scene (material video) of a movie, drama, or the like, and composite it onto the face of a character in the material video, thereby generating video in which the user interactively takes the place of a character in the material video.

[0002]

2. Description of the Related Art

Several systems have been proposed that paste a user's own image onto a person appearing in a scene of a movie, drama, or the like, and thereby generate video in which the user interactively enters the scene in place of that character. One example is the following.

[0003] First, the three-dimensional shape and image of the user's face, to be pasted onto the face portion of a character in the movie image, are acquired. Next, compositing is executed based on fitting information of the face model to the character's face portion in the movie. This fitting process is called match moving and is common as a movie special effect. However, match moving requires processing that takes care, for example, that no temporal discontinuity or jitter arises across the frames constituting each scene of the movie, and as a result compositing must be carried out frame by frame.

[0004] The face model used in the fitting process can be moved in three-dimensional space, so that as the movie scene actually progresses the user can interactively manipulate his or her own face, which has replaced the character's. In addition, when the user speaks, for example, a line of dialogue into a microphone, the mouth shape of the composited user's face changes as if speaking those words. Facial expressions can also be changed at will by key operation. (For details see, for example, Shigeo Morishima, "Face Recognition/Synthesis and the Possibility of New Media," Proceedings of the 6th Image Sensing Symposium, pp. 415-424, June 2000.) As for motion such as facial expressions, it is also conceivable to measure the movement of the user's face and reflect it in the face model.

[0005] In the conventional fitting method described above, the user's three-dimensional shape must be captured as the data to be pasted into the image. Moreover, even if the technology advances along this approach, the mouth shapes and expressions that can be represented are limited, and unnatural mouth movements and expressions remain in the composited image. To eliminate this unnaturalness, one could have the user perform exactly the same movements as the character in the movie with respect to face orientation and position, and then extract and composite the face data for that orientation and position. However, reproducing a character's movements exactly is impossible, so this is not realistic. It is also conceivable to take out the composite image of each scene after compositing and perform fine adjustments, but such post-processing would make a real-time interactive system infeasible.

[0006]

PROBLEMS TO BE SOLVED BY THE INVENTION

The present invention has been made in view of the problems of the conventional fitting process described above. Its object is to provide an image processing apparatus and an image processing method that interactively and easily paste an image of a user's face, or of another object, onto various image data based on movies, dramas, 3DCG, or the like, and that make it possible to generate natural composite images.

[0007]

MEANS FOR SOLVING THE PROBLEMS

A first aspect of the present invention is an image processing apparatus that takes a specific object in material image data as a replacement target object and replaces the image of the replacement target object with image data of a paste target object captured by a camera. The apparatus comprises: image capturing means for capturing an image of the paste target object while controlling the capturing camera position according to camera position information generated based on shooting position information of the replacement target object; and image synthesizing means for executing image compositing that changes the image of the replacement target object in the image data to the image of the paste target object captured by the image capturing means.

[0008] Further, in one embodiment of the image processing apparatus of the present invention, the image processing apparatus includes material video storage means storing material image data that includes the replacement target object, and camera position information storage means storing camera position information generated based on shooting direction information of the replacement target object. The camera position information stored in the camera position information storage means is configured as time-series data corresponding to each frame of the image data stored in the material video storage means. The output of the camera position information recorded in the camera position information storage means to the image capturing means, and the output of the material video recorded in the material video storage means to the image synthesizing means, are executed as synchronized processes. The image synthesizing means receives in parallel the material video input from the material video storage means and the image of the paste target object input from the image capturing means and, based on this parallel input data, changes the image of the replacement target object in the material image to the image of the paste target object.

[0009] Further, in one embodiment of the image processing apparatus of the present invention, the image processing apparatus further includes captured image extracting means for extracting only the image of the paste target object from the image captured by the image capturing means, and the captured image extracting means outputs the extracted image of the paste target object to the image synthesizing means.

[0010] Further, in one embodiment of the image processing apparatus of the present invention, the image processing apparatus further includes hidden mask information storage means storing hidden mask information as occluded area information of the replacement target object in the material image data. The image synthesizing means receives the hidden mask information from the hidden mask information storage means and, for the mask area, executes compositing that selects as output data the material image data rather than the image data of the paste target object from the image capturing means.

[0011] Further, in one embodiment of the image processing apparatus of the present invention, the image capturing means includes a camera that captures the image of the paste target object, and camera moving means that moves the camera according to the camera position information.

[0012] Further, in one embodiment of the image processing apparatus of the present invention, the image processing apparatus further includes illumination position information storage means storing illumination position information generated based on luminance information of the replacement target object in the material image data, and the image capturing means includes a light that illuminates the paste target object and light moving means that moves the light according to the illumination position information.

[0013] Further, in one embodiment of the image processing apparatus of the present invention, the image processing apparatus further includes replacement target object position information storage means storing position information of the replacement target object in the material image data. Based on the position information of the replacement target object input from the replacement target object position information storage means, the image synthesizing means executes size adjustment and position adjustment of the image of the paste target object input from the image capturing means.
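The size and position adjustment described above can be illustrated with a minimal sketch. The text does not specify a scaling method or a data layout, so everything here is an assumption made for illustration: images are plain 2-D lists, `resize_nearest` and `paste_at` are hypothetical helper names, and nearest-neighbour sampling stands in for whatever scaling the apparatus would actually use.

```python
def resize_nearest(image, new_width, new_height):
    """Scale a 2-D list of pixels with nearest-neighbour sampling (illustrative choice)."""
    old_height, old_width = len(image), len(image[0])
    return [[image[y * old_height // new_height][x * old_width // new_width]
             for x in range(new_width)]
            for y in range(new_height)]


def paste_at(material, patch, left, top):
    """Return a copy of the material frame with the patch written at (left, top)."""
    out = [row[:] for row in material]
    for dy, patch_row in enumerate(patch):
        for dx, pixel in enumerate(patch_row):
            y, x = top + dy, left + dx
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = pixel
    return out


# Hypothetical data: a 2x2 captured face and a 5x3 material frame in which
# the replacement target object occupies a 4x2 box whose corner is (1, 1).
face = [[1, 2], [3, 4]]
material = [[0] * 5 for _ in range(3)]
scaled = resize_nearest(face, new_width=4, new_height=2)
result = paste_at(material, scaled, left=1, top=1)
```

In this sketch the box size and corner play the role of the stored position information of the replacement target object; a real system would read them per frame.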

[0014] Further, in one embodiment of the image processing apparatus of the present invention, the image processing apparatus further includes: three-dimensional model storage means storing three-dimensional model data; virtual camera information storage means storing virtual shooting viewpoint information for the replacement target object included in the three-dimensional model data stored in the three-dimensional model storage means; and material video generating means for generating material video data based on the three-dimensional model data and the virtual shooting viewpoint information. The image capturing means captures the image of the paste target object while controlling the capturing camera position according to camera position information generated based on the virtual shooting viewpoint information of the replacement target object included in the material video generated by the material video generating means.

[0015] Further, a second aspect of the present invention is an image processing method that takes a specific object in material image data as a replacement target object and replaces the image of the replacement target object with image data of a paste target object captured by a camera. The method comprises: an image capturing step of capturing an image of the paste target object while controlling the capturing camera position according to camera position information generated based on shooting position information of the replacement target object; and an image synthesizing step of executing image compositing that changes the image of the replacement target object in the image data to the image of the paste target object captured in the image capturing step.

[0016] Further, in one embodiment of the image processing method of the present invention, the image processing method further includes a camera position information storing step of generating and storing camera position information based on shooting direction information of the replacement target object. The output of the camera position information generated in the camera position information storing step to the image capturing means, and the output of the material video recorded in the material video storage means to the image synthesizing means, are executed as synchronized processes. In the image synthesizing step, the material video input from the material video storage means and the image of the paste target object input from the image capturing means are received in parallel and, based on this parallel input data, the image of the replacement target object in the material image is changed to the image of the paste target object.

[0017] Further, in one embodiment of the image processing method of the present invention, the image processing method further includes a captured image extracting step of extracting only the image of the paste target object from the image captured in the image capturing step.

[0018] Further, in one embodiment of the image processing method of the present invention, the image processing method further includes a hidden mask information storing step of generating and storing hidden mask information as occluded area information of the replacement target object in the material image data. In the image synthesizing step, the hidden mask information is received and, for the mask area, compositing is executed that selects as output data the material image data rather than the image data of the paste target object captured in the image capturing step.

[0019] Further, in one embodiment of the image processing method of the present invention, the image processing method further includes an illumination position information storing step of generating and storing illumination position information based on luminance information of the replacement target object in the material image data. In the image capturing step, the light that illuminates the paste target object is moved according to the illumination position information generated in the illumination position information storing step.

[0020] Further, in one embodiment of the image processing method of the present invention, the image processing method further includes a replacement target object position information storing step of generating and storing position information of the replacement target object in the material image data. In the image synthesizing step, size adjustment and position adjustment of the image of the paste target object captured in the image capturing step are executed based on the position information of the replacement target object generated in the replacement target object position information storing step.

[0021] Further, in one embodiment of the image processing method of the present invention, the image processing method further includes a virtual camera information storing step of generating and storing virtual shooting viewpoint information for the replacement target object included in the three-dimensional model data stored in the three-dimensional model storage means, and a material video generating step of generating material video data based on the three-dimensional model data and the virtual shooting viewpoint information. In the image capturing step, the image of the paste target object is captured while the capturing camera position is controlled according to camera position information generated based on the virtual shooting viewpoint information of the replacement target object included in the material video generated in the material video generating step.

[0022] Further, a third aspect of the present invention is a program storage medium that provides a computer program causing a computer system to execute a process that takes a specific object in material image data as a replacement target object and replaces the image of the replacement target object with image data of a paste target object captured by a camera. The computer program comprises: an image capturing step of capturing an image of the paste target object while controlling the capturing camera position according to camera position information generated based on shooting position information of the replacement target object; and an image synthesizing step of executing image compositing that changes the image of the replacement target object in the image data to the image of the paste target object captured in the image capturing step.

[0023] The program storage medium of the present invention is, for example, a medium that provides a computer program in computer-readable form to a general-purpose computer system capable of executing various program codes. The form of the medium is not particularly limited; it may be a recording medium such as a CD, FD, or MO, or a transmission medium such as a network.

[0024] Such a program storage medium defines a structural or functional cooperative relationship between the computer program and the storage medium for realizing the functions of a given computer program on a computer system. In other words, by installing the computer program on a computer system via the storage medium, a cooperative action is exerted on the computer system, and the same effects as those of the other aspects of the present invention can be obtained.

[0025] Other objects, features, and advantages of the present invention will become apparent from the more detailed description based on the embodiments of the present invention described later and the accompanying drawings.

[0026]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[Embodiment 1] FIG. 1 is a block diagram of an interactive image processing apparatus according to one embodiment of the image processing apparatus of the present invention. FIG. 2 shows a specific configuration example of the image capturing unit 100 in the image processing apparatus of FIG. 1. An overview of the image processing apparatus of the present invention is given with reference to FIGS. 1 and 2. The embodiment described here assumes a process in which the face of a character in moving image data such as a movie or drama is pasted over with a face image of a user captured by the image capturing unit 100 shown in FIG. 2. Although this embodiment describes an example in which the captured image is the user's face, other objects, for example a whole person, a car, a building, or a landscape, can also serve as the captured object.

[0027] The following description gives an example of replacing the face of a person in an original image of a movie, drama, or the like with the face of a user captured by a camera. In this description, an object subject to replacement, such as the face of a person in a material image of a movie or drama, is called the replacement target object, and an object pasted onto the original image, such as the user's face captured by the camera, is called the paste target object.

[0028] The camera position information storage unit 1 stores, as time-series sequential data, position information for the camera used to capture the user's face 5. This information corresponds to the orientation (direction) and position of, for example, the face of a particular character in a movie, within the material data stored in advance in the material video storage unit 8. That is, the camera position information storage unit 1 holds, as time-series sequential data, the camera position information needed to capture the face image that is the paste target object from a direction matching that of the image object to be replaced in the material video storage unit 8 (here, the face of a specific character).
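As a rough illustration of what such time-series camera position data might look like, the sketch below derives one camera pose per material-video frame from the character's face direction. The description does not define a coordinate system or a record format, so the `CameraPose` record, the yaw/pitch parameterization, and the fixed 0.6 m camera distance are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraPose:
    frame: int   # index of the material-video frame this pose belongs to
    x: float     # camera position relative to the user's head, in metres
    y: float
    z: float


def pose_from_face_direction(frame, yaw_deg, pitch_deg, distance=0.6):
    """Place the camera on a sphere around the head so that it sees the user's
    face from the same direction as the character's face is seen on screen."""
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    x = distance * math.cos(pitch) * math.sin(yaw)
    y = distance * math.sin(pitch)
    z = distance * math.cos(pitch) * math.cos(yaw)
    return CameraPose(frame, x, y, z)


# Hypothetical per-frame face directions (yaw, pitch in degrees) annotated
# on the material video; one entry per frame.
face_directions = [(0.0, 0.0), (15.0, 0.0), (30.0, 5.0)]
camera_track = [pose_from_face_direction(i, yaw, pitch)
                for i, (yaw, pitch) in enumerate(face_directions)]
```

Keeping one record per frame index is what lets the poses be read out in step with the material video.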

[0029] The camera position information stored in the camera position information storage unit 1 is sent to the camera movement control unit 2, which is part of the image capturing unit 100 (see, e.g., FIG. 2). The camera movement control unit 2 controls the camera moving unit 3 based on the camera position information input from the camera position information storage unit 1. As shown in FIG. 2, the camera moving unit 3 is a device, such as an arm robot, that can move to a desired position and orientation in space. The camera 4 attached to the tip of the camera moving unit 3 photographs the user's face 5. The images captured by the camera 4 are sent to the captured image extracting unit 6 as time-series data.

[0030] The captured image extracting unit 6 extracts, from the image captured by the camera 4 of the image capturing unit 100, only the partial data to be composited (in this example, the face area image data). The extracted image is sent to the image synthesizing unit 7, where compositing is executed.

[0031] The material video storage unit 8 stores video of movie or drama scenes (hereinafter, material video), and the stored images are sent to the image synthesizing unit 7 in time series. The hidden mask information storage unit 9 holds hidden mask information, which is sent to the image synthesizing unit 7 in time series in synchronization with the stored-image output of the material video storage unit 8. The hidden mask information stored in the hidden mask information storage unit 9 is occluded-area information for cases where part of the replacement target object in the material image is hidden behind an object in front of it, such as a pillar or a building.

[0032] In accordance with the hidden mask information, the image synthesizing unit 7 outputs, for the masked pixel area, the material video data input from the material video storage unit 8 rather than the image extracted by the captured image extracting unit 6, so that the captured image is pasted onto the material video faithfully. For example, when an image of a hand or a pillar lies in front of the face of a character in the material video so that part or all of the face is hidden, executing compositing based on the hidden mask information stored in the hidden mask information storage unit 9 reproduces the hidden portion faithfully in the composite image as well.
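The per-pixel selection rule described here can be sketched as follows. This is a minimal illustration under assumptions not stated in the text: frames are 2-D lists of equal size, `face_region` marks where the extracted face has pixels, `hidden_mask` marks where the character's face is occluded in the material video, and `composite_frame` is a hypothetical function name.

```python
def composite_frame(material, captured, face_region, hidden_mask):
    """Build one output frame.

    A captured pixel is used only where the face region is set and the
    hidden mask is not; everywhere else the material pixel is kept, so
    occluding objects such as a hand or pillar stay in front of the face.
    """
    output = []
    for y, material_row in enumerate(material):
        output_row = []
        for x, material_pixel in enumerate(material_row):
            if face_region[y][x] and not hidden_mask[y][x]:
                output_row.append(captured[y][x])
            else:
                output_row.append(material_pixel)
        output.append(output_row)
    return output


# Hypothetical 3x2 frame: "m" is a material pixel, "u" a pixel of the user's face.
material = [["m", "m", "m"], ["m", "m", "m"]]
captured = [["u", "u", "u"], ["u", "u", "u"]]
face_region = [[False, True, True], [False, True, True]]
hidden_mask = [[False, False, True], [False, False, False]]
frame_out = composite_frame(material, captured, face_region, hidden_mask)
```

Passing a hidden mask that is `False` everywhere reduces this to plain pasting, which corresponds to the case where nothing occludes the face.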

[0033] The image synthesizing unit 7 generates an image in which the material video and the image of the user's face extracted from the captured image are composited using the hidden mask information. If the face is never hidden by another object, however, compositing is possible without hidden mask information. The generated image is sent to the output unit 10 and output. The output unit is, for example, a monitor device such as a CRT or LCD, a recording device such as a CD, DVD, or video recorder, or communication means for data communication.

[0034] The camera position information recorded in the camera position information storage unit 1, the material video recorded in the material video storage unit 8, and the hidden mask information recorded in the hidden mask information storage unit 9 are read out in synchronization with one another in time series.
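One simple way to picture this synchronized readout is a loop that advances all three time series by the same frame index. The sketch below assumes each store can be presented as an in-memory sequence with one entry per frame; `synchronized_frames` is a hypothetical name, and a real system would also have to pace the loop to the frame rate and to the camera hardware.

```python
def synchronized_frames(camera_positions, material_frames, hidden_masks):
    """Yield (frame index, camera position, material frame, hidden mask)
    so that all three streams are consumed in lockstep."""
    if not (len(camera_positions) == len(material_frames) == len(hidden_masks)):
        raise ValueError("all three time series must cover the same frames")
    streams = zip(camera_positions, material_frames, hidden_masks)
    for index, (position, frame, mask) in enumerate(streams):
        yield index, position, frame, mask


# Hypothetical placeholder data standing in for the three stores.
steps = list(synchronized_frames(["p0", "p1"], ["f0", "f1"], ["m0", "m1"]))
```

Each yielded tuple carries everything needed for one frame: the position goes to the camera movement control, while the material frame and mask go to the compositing stage.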

[0035] The configuration of the image capturing unit 100 is described with reference to FIG. 2. FIG. 2 shows a configuration that captures video of the user's face as the paste target object. The user 20 fixes his or her face, which is the paste target object, on the fixed base 21. The positional relationship between the fixed base 21 and the camera moving unit 3 (here, an arm robot) is known. The camera 4 attached to the camera moving unit 3 captures images of the user's face while moving under the control of the camera movement control unit 2, in accordance with the camera position information stored in the camera position information storage unit 1 described above.

[0036] The camera moving unit 3 is not limited to the arm robot configuration shown in FIG. 2, as long as it can move the camera 4 to a desired position and orientation. The fixed base 21 is fitted, for example as shown in FIG. 2, with a board painted in a single color such as blue or green and having a hole through which the paste target object (the face) protrudes, so that the face image extracting unit can extract the face easily. Alternatively, the parts other than the paste target object (e.g., the user's face) may be covered with a blue, green, or similar cloth.

【００３７】Because the image captured by the camera 4 also includes the area surrounding the face, the captured image extracting unit 6 shown in FIG. 1 extracts only the face image to be pasted. For example, the paste target object (e.g., the user's face) is photographed by the camera 4 together with the fixed base 21, which has a single color such as blue or green, and the single-color data is removed from the captured image to extract the paste target object image as the desired region. In other words, chroma key technology can be used.
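The chroma key extraction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the key color, the Euclidean RGB distance test, the tolerance value, and the function names `chroma_key_mask` and `extract_foreground` are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B), each 0-255
Image = List[List[Pixel]]


def chroma_key_mask(image: Image,
                    key: Pixel = (0, 0, 255),
                    tol: int = 80) -> List[List[int]]:
    """Return 1 for foreground (face) pixels and 0 for pixels near the key color.

    The key color and tolerance are illustrative values, not from the patent.
    """
    tol_sq = tol * tol
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            dist_sq = (r - key[0]) ** 2 + (g - key[1]) ** 2 + (b - key[2]) ** 2
            mask_row.append(0 if dist_sq <= tol_sq else 1)
        mask.append(mask_row)
    return mask


def extract_foreground(image: Image,
                       mask: List[List[int]]) -> List[List[Optional[Pixel]]]:
    """Keep foreground pixels and replace keyed-out pixels with None."""
    return [[px if m else None for px, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]
```

In practice the comparison is often done in a hue-based color space so that shadows on the single-color plate are still removed; the RGB distance here only keeps the sketch short.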

【００３８】A display 25 that presents the progress of the material video and the lines of dialogue at that moment may be placed where the user can see it, so that the user watches the material video on the display 25 and, for example, moves his or her mouth in accordance with the expression or the lines of the character onto whom the face is pasted. With this configuration, a user's face whose expression more closely follows that of the character can be pasted into the material video.

【００３９】The camera position information storage unit 1 shown in FIG. 1 stores in advance camera position information derived from the material video, such as a movie, stored in the material video storage unit. Direction data is obtained for the specific object that is the replacement target object in the material video (for example, the face of a specific character), and the position of the camera that captures the user's face 5 from that same direction is used as the camera position information.

【００４０】The face region of the character in the material video may be specified manually or extracted automatically. A method for extracting a specific image from an image is described in detail in, for example, H. Rowley, S. Baluja, and T. Kanade, "Neural Network-Based Face Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 1, January 1998, pp. 23-38.

【００４１】A method for obtaining the camera position information will be described with reference to FIG. 3. FIG. 3 illustrates, for images (a) and (d) of two frames p and q of the material video, the process of obtaining the camera position information that sets the camera position for capturing the paste target object (the user's face).

【００４２】First, orthogonal coordinate axes are set from the position and orientation of the character 30, the replacement target object, in each of the frame images (a) and (d). Here, the origin is the base of the neck, Z points in the direction the face is facing, Y points toward the top of the head, and X is the direction perpendicular to the Y-Z plane. A method for detecting the orientation of a face in an image is described in detail in, for example, T. Horprasert, Y. Yacoob, and L. Davis, "Computing 3-D head orientation from a monocular image sequence," Proc. of the Second International Conference on Automatic Face and Gesture Recognition, pp. 242-247, 1996.

【００４３】In the environments (b) and (e), in which the user's face 40 is captured by the camera 50, the coordinate system is fixed to the user's face, and the position of the camera 50 is set so that the X, Y, and Z axes correspond to those of the character in the material video in the corresponding frame images (a) and (d). That is, the camera position for the frame image (a) of the material video is set, in the environment (b) where the camera 50 captures the user's face 40, as the camera position at which the coordinate axes of (a) and (b) substantially coincide.
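One way to read the axis-matching step numerically is sketched below. It assumes the head pose in the material frame is available as a rotation matrix R and translation t that map head coordinates to the material camera's coordinates (X_cam = R X_head + t). That representation, and the names `camera_pose_in_head_frame` and `rot_y`, are assumptions for illustration and are not stated in the patent. Because the user's head is fixed, the capture camera is placed at the inverse of that transform.

```python
import math
from typing import List, Tuple

Vec = List[float]
Mat = List[List[float]]


def transpose(m: Mat) -> Mat:
    return [list(col) for col in zip(*m)]


def mat_vec(m: Mat, v: Vec) -> Vec:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rot_y(deg: float) -> Mat:
    """Rotation about the Y axis (toward the top of the head) by deg degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def camera_pose_in_head_frame(R: Mat, t: Vec) -> Tuple[Vec, Mat]:
    """Invert X_cam = R @ X_head + t.

    Returns the camera position in head coordinates and a matrix whose
    columns are the camera's axes expressed in head coordinates.
    """
    Rt = transpose(R)
    position = [-c for c in mat_vec(Rt, t)]
    return position, Rt
```

For a face looking straight at the material camera from two units away (head Z pointing toward the camera), `camera_pose_in_head_frame(rot_y(180.0), [0.0, 0.0, 2.0])` places the capture camera approximately two units along the head's +Z axis.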

【００４４】As a result, as shown in (c), the camera 50 can capture the user's face 40 at the same position, orientation, and size as the face of the character 30 in (a). Note that (c) is an image in which the surrounding image has been removed by the chroma key technique described above and only the user's face 40 has been extracted as the captured image.

【００４５】Similarly, the camera position for the frame image (d) of the material video is set, in the environment (e) where the camera 50 captures the user's face 40, as the camera position at which the coordinate axes of (d) and (e) coincide. As a result, as shown in (f), the camera 50 can capture the user's face 40 at the same position, orientation, and size as the face of the character 30 in (d).

【００４６】Each of the sets (a), (b), (c) and (d), (e), (f) in FIG. 3 shows the camera position setting process for one frame image of the material image. This camera position setting process is performed for every frame of the material image, and the results are stored in the camera position information storage unit 1 as time-series information.

【００４７】The stored camera position information is output to the camera movement control unit 2, and the camera moving unit moves the camera 4 in accordance with the camera position information to capture the paste target object (e.g., the user's face). This makes it possible to capture the user's face in time series from the same direction as, for example, the face of a specific character in the material video. To further reduce any unnatural appearance in the composite image, it is preferable to also take the focal length of the camera and lens distortion into account and to perform appropriate parameter setting and correction. These parameter settings and corrections are performed in the image capturing unit 100 or in the image synthesizing unit 7.

【００４８】In this way, not only changes in the position and orientation of the face caused by camera work in the material video, such as camera movement and zooming, but also changes caused by the character moving or turning his or her head can all be reproduced simply by moving the camera, and the user can keep his or her face in a fixed position. As described above, the material video may be shown on the display 25 of FIG. 2 to present the character's expression and lines to the user. By moving the mouth, for example, in accordance with the expression or lines of the character onto whom the face is pasted, the user can enjoy acting with face and voice, and more realistic image data can be composited.

【００４９】Next, the composition processing using the hidden mask information in the image synthesizing unit 7 will be described with reference to FIG. 4. As described above, the hidden mask information is occlusion region information for cases in which part of the replacement target object is hidden behind a nearer object such as a pillar or a building.

【００５０】As shown in FIG. 4(a), when something such as a pillar 60 hides the face of the character 30 to be replaced in a frame of the material image, the image synthesizing unit 7 must draw, that is, output, the pillar 60 during composition. A mask such as the one in FIG. 4(b) is therefore set. For pixels inside this mask area 70, the image synthesizing unit always selects the material video output from the material video storage unit 8 rather than the output data of the captured image extracting unit 6. The user's face is then output as hidden behind the pillar 60, just as in the material video, which solves the occlusion problem. The image synthesizing unit 7 can achieve an even more natural composite by applying alpha blending (a blurring effect) around the mask area 70.
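The mask-based selection and the soft edge can be illustrated with a short sketch. It assumes single-channel images stored as nested lists and uses a box blur to feather the blend weights; the patent only says that alpha blending is applied around the mask area, so the blur, the `radius` parameter, and the function names are illustrative assumptions.

```python
from typing import List

Gray = List[List[float]]   # single-channel image, one float per pixel
Mask = List[List[int]]     # 1 = set, 0 = clear


def box_blur(a: Gray, radius: int) -> Gray:
    """Average each value with its neighbors in a (2*radius+1) square window."""
    h, w = len(a), len(a[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [a[yy][xx] for yy in ys for xx in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def composite(material: Gray, user: Gray, face_mask: Mask,
              hidden_mask: Mask, radius: int = 1) -> Gray:
    """Paste the user's face over the material frame.

    Pixels inside hidden_mask always come from the material frame, so the
    occluding object (e.g. the pillar) stays in front of the pasted face.
    """
    h, w = len(material), len(material[0])
    hard = [[1.0 if face_mask[y][x] and not hidden_mask[y][x] else 0.0
             for x in range(w)] for y in range(h)]
    soft = box_blur(hard, radius)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # min() keeps alpha at 0 outside the face and inside the mask,
            # while letting it fall off smoothly near their boundaries.
            alpha = min(hard[y][x], soft[y][x])
            out[y][x] = alpha * user[y][x] + (1.0 - alpha) * material[y][x]
    return out
```

With `radius=0` the function reduces to the plain per-pixel selection described in the paragraph; larger radii widen the blended border.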

【００５１】Next, the processing flow in the image processing apparatus of the present invention will be described using flowcharts, divided into the camera position information and mask information generation processing and the composition processing.

【００５２】First, the procedure for generating the camera position information and mask information will be described with reference to FIG. 5. In step S101, the frame number i (i = 1 to n, where n is the number of frames in the material video) is initialized to i = 1.

【００５３】Next, in step S102, the image of frame i of the material video is acquired. In step S103, the replacement target object image (e.g., the face of a specific character) is selected from the frame i image, and the position and orientation of the face are extracted. Based on this information, in step S104, camera position information is generated as the shooting direction and position for the paste target object (e.g., the face). This is the process described above with reference to FIG. 3. The camera position information is stored in the camera position information storage unit 1.

【００５４】Further, in step S105, the hidden mask information is generated. This is the process described above with reference to FIG. 4: for frame i of the material video, regions that hide the face of the character to be replaced are extracted and set as mask areas (see FIG. 4(b)). Mask information is generated for each frame of the material video and stored in the hidden mask information storage unit 9.

【００５５】Next, in step S106, it is determined whether all frames have been processed. If an unprocessed frame remains, the frame number i is incremented in step S107, and the processing from step S102 onward is performed for frame i + 1.

【００５６】When the camera position information and hidden mask information have been generated for all frames, the image parameters are adjusted in step S108 so that the frames connect smoothly. This adjusts the parameters so that no discontinuity or jitter appears when the images are viewed as a moving picture. This processing may be performed in a single batch at the end, or frame by frame, adjusting each frame's information as it is generated for continuity with the information of previously generated frames.
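The generation flow of FIG. 5 (steps S101 to S108) can be sketched as a loop followed by a smoothing pass. The patent does not specify how the parameters are adjusted in S108, so the centered moving average below is an assumption, as are the pose tuple layout and the callables `camera_pose_for_frame` and `find_hidden_mask`, which stand in for the per-frame analysis.

```python
from typing import Any, Callable, Iterable, List, Sequence, Tuple

Pose = Tuple[float, ...]  # e.g. (x, y, z, ...); the layout is an assumption


def smooth_track(track: Sequence[Pose], window: int = 3) -> List[Pose]:
    """Centered moving average over per-frame camera parameters (step S108)."""
    half = window // 2
    smoothed = []
    for i in range(len(track)):
        segment = track[max(0, i - half): i + half + 1]
        smoothed.append(tuple(
            sum(p[k] for p in segment) / len(segment)
            for k in range(len(track[i]))
        ))
    return smoothed


def build_tracks(frames: Iterable[Any],
                 camera_pose_for_frame: Callable[[Any], Pose],
                 find_hidden_mask: Callable[[Any], Any]):
    """Run the per-frame loop and return (camera track, mask track)."""
    camera_track, mask_track = [], []
    for frame in frames:                                    # S101, S106, S107
        camera_track.append(camera_pose_for_frame(frame))   # S102 to S104
        mask_track.append(find_hidden_mask(frame))          # S105
    return smooth_track(camera_track), mask_track           # S108
```

A plain average is only suitable for parameters that vary linearly, such as position; angles that wrap around would need separate handling.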

【００５７】Next, the procedure of the composition processing will be described with reference to FIG. 6. First, in step S201, an image of the paste target object (e.g., the user's face) captured by the camera 4 of FIGS. 1 and 2 is acquired.

【００５８】Next, in step S202, the captured image extracting unit 6 extracts the region of the paste target object (e.g., the face). As described above, this can use chroma key technology, which removes the single-color data from the captured image to extract the paste target object image as the desired region.

【００５９】Next, in step S203, the image synthesizing unit 7 composites the captured image with the material video. Because the replacement target object in the material video (e.g., the character's face) and the paste target object (for example, the user's face extracted from the captured image) already match in position, size, orientation, and so on, it suffices to overwrite the material image with the extracted face region of the user. As described above, an even more natural composite can be obtained by applying alpha blending around the pasted face region. In addition, when hidden mask information is output from the hidden mask information storage unit 9, the image synthesizing unit 7 selects the output data of the material image for the mask area.

【００６０】When, for example, the color or brightness of the paste target object varies with the shooting direction, the image synthesizing unit 7 may perform parameter adjustment in step S203', such as making these parameters consistent between frames or matching them to the image of each frame. Step S203' is optional and is not always required. This parameter adjustment can be performed automatically under given conditions. For example, when the luminance value of the paste target object differs from that of the previous frame by a given threshold or more because of a difference in shooting direction, the luminance value of the paste target object is changed. Likewise, when the average luminance value of the material image and the luminance value of the paste target object differ by a given threshold or more, the luminance value of the paste target object is changed. In addition, the contour of the character's face region in the material video differs from that of the user's face region. If the character's face does not fit entirely within the user's face region, part of it remains visible after overwriting. It is therefore desirable to generate background (or skin) and use it to replace the character's face region in the material video, particularly its periphery.
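The threshold-triggered luminance adjustment of step S203' might look like the following sketch. The patent states only that the luminance is changed when the difference reaches a threshold; scaling by a single gain so that the means match, the default threshold, and the name `match_luminance` are assumptions for illustration.

```python
from typing import List


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def match_luminance(face: List[float], reference_mean: float,
                    threshold: float = 20.0) -> List[float]:
    """Scale face luminance toward reference_mean when the gap is large.

    face holds the luminance values (0-255) of the pasted face pixels.
    reference_mean is either the previous frame's face mean or the
    material image's mean, depending on which condition is being checked.
    """
    current = mean(face)
    if current == 0 or abs(current - reference_mean) < threshold:
        return list(face)                      # within tolerance: unchanged
    gain = reference_mean / current
    return [min(255.0, v * gain) for v in face]
```

For example, `match_luminance([100.0, 100.0], 150.0)` brightens the face to a mean of 150, while a reference mean of 110 leaves it untouched because the gap is below the threshold.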

【００６１】The image composited in this way is output in step S204 to the output unit (e.g., a display monitor, data storage means, or communication means). In step S205, it is determined whether the material video has ended. If it has not, the processing from step S201 onward is repeated.
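The composition flow of FIG. 6 (steps S201 to S205) reduces to a loop over the material frames. The sketch below shows only the control flow; every callable passed in (`capture`, `extract_face`, `compose`, `output`, `adjust`) is a hypothetical placeholder for the corresponding unit, not an interface defined in the patent.

```python
from typing import Any, Callable, Iterable, Optional


def run_synthesis(material_frames: Iterable[Any],
                  capture: Callable[[], Any],
                  extract_face: Callable[[Any], Any],
                  compose: Callable[[Any, Any], Any],
                  output: Callable[[Any], None],
                  adjust: Optional[Callable[[Any], Any]] = None) -> None:
    """Process one captured image per material frame until the video ends."""
    for material in material_frames:       # S205: repeat until material ends
        captured = capture()               # S201: acquire camera image
        face = extract_face(captured)      # S202: extract the face region
        frame = compose(material, face)    # S203: composite with material
        if adjust is not None:             # S203': optional adjustment
            frame = adjust(frame)
        output(frame)                      # S204: send to the output unit
```

Passing `adjust=None` skips the optional step, matching the statement that S203' is not always required.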

【００６２】To make the composite look more accurate, a calibration process may be performed before step S201, when the user first fixes his or her face on the fixed base, to correct for displacement of the fixed base and for individual differences between users such as face size and skin color.

【００６３】Thus, according to the image processing apparatus of the present invention, the image composition process that replaces a specific replacement target object (e.g., the face of a specific character) in the images stored in the material video storage unit with a paste target object captured by a camera (e.g., the user's face) can be performed quickly and easily, in synchronization with playback of the stored images.

【００６４】The user only has to fix his or her face on the fixed base shown in FIG. 2. Without the user moving or changing direction, images matched to the replacement target object in the material video are obtained by moving the camera and can then be pasted.

【００６５】[Embodiment 2] FIG. 7 is a block diagram of an interactive image processing apparatus according to a second embodiment of the present invention. FIG. 8 shows a specific configuration example of the image capturing unit 80 in the image processing apparatus of FIG. 7. An overview of the second embodiment of the image processing apparatus of the present invention will be described with reference to FIGS. 7 and 8.

【００６６】The illumination position information storage unit 11 stores illumination position information. The illumination position information is the position information of the illumination 14, stored as time-series sequential data, for capturing the user's face 5 with a luminance distribution corresponding to the luminance information of the replacement target object in the material data stored in advance in the material video storage unit 8, for example the face of a specific character in a movie. In other words, the unit stores, as sequential data, the positions of the illumination 14 for capturing a face image of the paste target object whose luminance distribution corresponds to that of the image object to be replaced in the material video storage unit 8 (here, the face of a specific character).

【００６７】The illumination position information stored in the illumination position information storage unit 11 is sent to the illumination movement control unit 12, which is part of the image capturing unit 80 (see, e.g., FIG. 8). The illumination position information consists of, for example, time-series information on the position and direction of the tip of the illumination moving unit 13 (the illumination 14). The illumination movement control unit 12 controls the illumination moving unit 13 in accordance with the illumination position information. The illumination moving unit 13 is a device, such as an arm robot, that can move to a desired position and direction in space. The illumination 14 attached to the tip of the illumination moving unit 13 illuminates the user's face 5.

【００６８】The camera position information storage unit 1 stores, as time-series sequential data, camera position information for capturing the user's face 5 in accordance with the orientation (direction) and position of an object in the material data stored in advance in the material video storage unit 8, for example the face of a specific character in a movie. In other words, it stores, as sequential data, the camera positions for capturing the face image, which is the paste target object, from the direction matching the image object to be replaced in the material video storage unit 8 (here, the face of a specific character).

【００６９】The camera position information stored in the camera position information storage unit 1 is sent to the camera movement control unit 2, which is part of the image capturing unit 80 (see, e.g., FIG. 8). The camera movement control unit 2 controls the camera moving unit 3 based on the camera position information input from the camera position information storage unit 1. As shown in FIG. 8, the camera moving unit 3 is a device, such as an arm robot, that can move to a desired position and direction in space. The camera 4 attached to the tip of the camera moving unit 3 photographs the user's face 5. The images captured by the camera 4 are sent in time series to the captured image extracting unit 6.

【００７０】The captured image extracting unit 6 extracts only the partial data to be composited (in this example, the face region image data) from the image captured by the camera 4 of the image capturing unit 80. The extracted image is sent to the image synthesizing unit 7, where the composition processing is performed.

【００７１】The material video storage unit 8 stores video of scenes from movies or dramas (hereinafter referred to as material video), and the stored images are sent in time series to the image synthesizing unit 7. The hidden mask information storage unit 9 stores the hidden mask information, which is sent in time series to the image synthesizing unit 7 in synchronization with the output of the stored images from the material video storage unit 8. The composition processing in the image synthesizing unit 7 and the handling of the hidden mask information are the same as in Embodiment 1, so their description is omitted. The image generated by the image synthesizing unit 7 is sent to the output unit 10 and output. The output unit is, for example, a monitor device such as a CRT or LCD, a recording device for CD, DVD, or video, or a communication means for data communication.

【００７２】The camera position information recorded in the camera position information storage unit 1, the illumination position information stored in the illumination position information storage unit 11, the material video recorded in the material video storage unit 8, and the hidden mask information recorded in the hidden mask information storage unit 9 are read out in time series, in synchronization with one another.

【００７３】The configuration of the image capturing unit 80 will be described with reference to FIG. 8. FIG. 8 shows a configuration that captures video of the user's face as the paste target object. The user 20 fixes the face, which is the paste target object, on the fixed base 21. The positional relationship between the fixed base 21 and the camera moving unit 3 (here, an arm robot) is known. The camera 4 attached to the camera moving unit 3 captures images of the user's face while being moved under the control of the camera movement control unit 2, in accordance with the camera position information stored in the camera position information storage unit 1 described above.

【００７４】At the same time, the illumination 14 attached to the illumination moving unit 13 illuminates the user's face while being moved under the control of the illumination movement control unit 12, in accordance with the illumination position information stored in the illumination position information storage unit. The camera moving unit 3 and the illumination moving unit 13 need not be arm robots, as long as they can move the camera 4 and the illumination 14 to the desired positions and orientations. As another configuration example of the illumination moving unit 13 and the illumination 14, a large number of light sources may be arranged around the user and switched on and off to produce lighting effects from many directions.
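For the alternative with many fixed light sources, one simple control policy is to switch on the source whose direction from the user best matches the desired illumination direction. The patent does not describe a selection rule, so the nearest-direction policy below, the coordinate convention (positions relative to the user's face), and the name `light_states` are assumptions for illustration.

```python
import math
from typing import List, Sequence


def unit(v: Sequence[float]) -> List[float]:
    """Normalize a non-zero vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def light_states(light_positions: Sequence[Sequence[float]],
                 desired_direction: Sequence[float]) -> List[bool]:
    """Return on/off flags with only the best-aligned light switched on.

    light_positions are given relative to the user's face; the light with
    the largest cosine similarity to desired_direction is selected.
    """
    d = unit(desired_direction)
    scores = [sum(a * b for a, b in zip(unit(p), d)) for p in light_positions]
    best = scores.index(max(scores))
    return [i == best for i in range(len(light_positions))]
```

Calling this once per frame with the stored illumination direction yields a time series of on/off patterns; blending several neighboring lights would give smoother transitions than the single-light choice shown here.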

【００７５】As shown in FIG. 8, for example, the fixed base 21 is fitted with a plate that is painted in a single color such as blue or green and has a hole through which the paste target object (the face) protrudes, so that the face image extraction unit can easily extract the face. Alternatively, the parts other than the paste target object (e.g., the user's face) may be covered with a cloth of blue, green, or the like.

【００７６】Because the image captured by the camera 4 also includes the area surrounding the face, the captured image extracting unit 6 shown in FIG. 7 extracts only the face image to be pasted. For example, the paste target object (e.g., the user's face) is photographed by the camera 4 together with the fixed base 21, which has a single color such as blue or green, and the single-color data is removed from the captured image to extract the paste target object image as the desired region. In other words, chroma key technology can be used.

【００７７】A display 25 that presents the progress of the material video and the lines of dialogue at that moment may be placed where the user can see it, so that the user watches the material video on the display 25 and, for example, moves his or her mouth in accordance with the expression or the lines of the character onto whom the face is pasted. With this configuration, a user's face whose expression more closely follows that of the character can be pasted into the material video.

【００７８】In this embodiment, controlling the illumination makes it possible to obtain a composite image that is more natural and of higher quality than in the first embodiment. The illumination position information is created by analyzing the image of each frame of the material video and estimating the position of the light source. Accordingly, increasing the number of light sources enables higher-quality composition. In addition, when the material video shows, for example, a scene shot with a fixed camera in which the character turns his or her head, the positional relationship between the camera and the illumination does not change, so the illumination may also be mounted on the camera moving unit.

[0079] [Embodiment 3] FIG. 9 is a block diagram of an interactive image processing apparatus according to a third embodiment of the present invention. The camera position information recorded in the camera position information storage unit 1 is sent to the camera movement control unit 2. As in Embodiment 1, the camera position information is generated in advance based on the replacement target object in each frame of the material video, and consists of time-series information on the position and direction of the tip of the camera moving unit 3, which drives the camera 4 that photographs the paste target object (e.g., the user's face 5).

[0080] The camera movement control unit 2 controls the camera moving unit 3 in accordance with the camera position information recorded in the camera position information storage unit 1. The camera moving unit 3 is a device, such as a robot arm, that can move to a desired position and orientation in space. The camera 4 attached to the tip of the camera moving unit 3 photographs the user's face 5. The images captured by the camera 4 are sent in time series to the captured image extraction unit 6.

[0081] The captured image extraction unit 6 extracts only the face portion, as the paste target object, from the image captured by the camera 4, for example by using the chroma key technique, and sends the extracted image to the image synthesizing unit 7. The material video storage unit 8 stores video of movie or drama scenes (hereinafter referred to as material video), and its images are sent in time series to the image synthesizing unit 7. The hidden mask information storage unit 9 stores hidden mask information, which is sent in time series to the image synthesizing unit 7. The hidden mask information is used when, for example, an image of a hand or a pillar overlaps the face of the character that is the replacement target object in the material video and hides part or all of the face, so that the same occlusion is reproduced on the composited face (see FIG. 4).

[0082] The replacement target object position information storage unit 15 stores face position information, which is sent in time series to the image synthesizing unit 7. The face position information is used to overlay the photographed and extracted region of the user's face on the region of the character's face in the material video, and consists of the positional relationship, the scale of the region, and, if necessary, in-image rotation information.

[0083] The image synthesizing unit 7 generates a composite image using the material video input from the material video storage unit 8, the image of the user's face as the paste target object input from the captured image extraction unit 6, the hidden mask information input from the hidden mask information storage unit 9, and the face position information input from the replacement target object position information storage unit 15. The image synthesizing unit 7 adjusts the size and position of the paste target object image using the face position information input from the replacement target object position information storage unit 15. When the face is never hidden by another object, compositing is possible without the hidden mask information. The generated image is sent to the output unit 10 and output. The output unit is, for example, a monitor such as a CRT or LCD, a recording device for CD, DVD, video, or the like, or a communication means for data communication. The camera position information recorded in the camera position information storage unit 1, the material video recorded in the material video storage unit 8, the hidden mask information recorded in the hidden mask information storage unit 9, and the face position information recorded in the replacement target object position information storage unit 15 are read out in synchronization, in time series.
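
The per-frame compositing step can be sketched as a masked overwrite. The sketch assumes the face patch has already been scaled, that images are 2-D lists of pixels, and that the hidden mask is a frame-sized boolean grid; these data layouts and the function name `composite` are illustrative assumptions.

```python
def composite(frame, face, face_mask, top_left, hidden_mask=None):
    """Overwrite `frame` with `face` pixels, honoring the hidden mask.

    frame, face: 2-D lists of pixels.
    face_mask: True where `face` holds a subject pixel (after chroma key).
    top_left: (row, col) in the frame where the face patch is placed.
    hidden_mask: frame-sized grid, True where an occluder covers the face;
                 None when no occlusion occurs in this frame.
    """
    out = [row[:] for row in frame]           # leave the input frame untouched
    r0, c0 = top_left
    for r, (face_row, mask_row) in enumerate(zip(face, face_mask)):
        for c, (pixel, is_face) in enumerate(zip(face_row, mask_row)):
            fr, fc = r0 + r, c0 + c
            if not is_face:
                continue                      # backdrop pixel, nothing to paste
            if not (0 <= fr < len(out) and 0 <= fc < len(out[0])):
                continue                      # patch extends outside the frame
            if hidden_mask is not None and hidden_mask[fr][fc]:
                continue                      # occluder (hand, pillar) stays in front
            out[fr][fc] = pixel
    return out


frame = [["bg"] * 4 for _ in range(3)]
face = [["F", "F"], ["F", "F"]]
face_mask = [[True, True], [True, False]]
hidden = [[False] * 4 for _ in range(3)]
hidden[1][2] = True                           # an occluder covers this pixel
result = composite(frame, face, face_mask, (1, 1), hidden)
```

Passing `hidden_mask=None` corresponds to the case in which no other object hides the face, so the mask information is not needed.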

[0084] How the camera position information and face position information are obtained in this embodiment is described with a concrete example, referring to FIG. 10. The camera position information here has a different meaning from that in the first embodiment. For example, as shown in FIG. 10, for the image (a) of a frame of the material video that includes the face 90 of the character as the replacement target object, orthogonal coordinate axes are set from the position and orientation of the character. Here, the origin is the base of the neck, Z points in the direction the face is facing, Y points toward the top of the head, and X is perpendicular to the Y-Z plane.

[0085] In the environment (b), in which the user's face 92 as the paste target object is captured by the camera 94, the coordinate system is fixed to the user's face, and the camera position is set so that the axes are oriented in the same way as for the character in the material video.

[0086] When the user's face 92 as the paste target object is captured by the camera 94 with the camera position set in this way, a user's face 96 oriented the same way as the character's face 90 in (a) can be captured, as shown in (c). Note that (c) is an image in which only the user's face has been extracted from the image captured by the camera 94 in (b). In other words, the camera position information is the camera position at which the orientation of the face matches that of the face in the material video image, although the position and size of the face within the image differ.
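
The head-fixed coordinate system and the resulting camera direction can be sketched with basic vector arithmetic. The sketch assumes the character's neck position, face direction, and up direction are known in the material camera's coordinates (how they are estimated is outside this sketch), and all function names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)


def head_axes(face_dir, up_dir):
    """Right-handed head axes: Z along the face, Y toward the top of the head,
    X perpendicular to the Y-Z plane."""
    z = normalize(face_dir)
    x = normalize(cross(up_dir, z))    # perpendicular to both Y and Z
    y = cross(z, x)                    # "up" made exactly orthogonal to Z
    return x, y, z


def camera_direction_in_head_frame(neck, face_dir, up_dir, camera_pos):
    """Unit vector from the neck origin to the camera, in head coordinates."""
    x, y, z = head_axes(face_dir, up_dir)
    to_cam = tuple(c - n for c, n in zip(camera_pos, neck))
    return normalize((dot(to_cam, x), dot(to_cam, y), dot(to_cam, z)))


def capture_camera_position(direction, distance):
    """Camera position in the user's head frame at a fixed, assumed distance."""
    return tuple(distance * c for c in direction)


# Character 5 units in front of the material camera, looking straight at it.
d = camera_direction_in_head_frame((0, 0, 5), (0, 0, -1), (0, 1, 0), (0, 0, 0))
cam = capture_camera_position(d, 0.8)
```

Because only the direction is reused, the capture camera can stay at one distance from the user; position and size in the image are handled separately by the face position information.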

[0087] Furthermore, the position information of the character's face 90 in the image of the material video (a), together with the scaling information between the character's face 90 in the image of the material video (a) and the extracted face image 96 of (c), is used as the face position information.

[0088] During compositing in the image synthesizing unit 7, the position and scale of the extracted user's face, as the paste target object, are changed using the per-frame scaling information input from the replacement target object position information storage unit 15, generating the user's face image 98 of the paste target object as shown in (d).

[0089] In this way, the position information of the replacement target object (e.g., the character's face) in the image of the material video (a), and the scaling information based on the size ratio between the replacement target object and the captured image of the paste target object, are stored in the replacement target object position information storage unit 15. Using this position information and scaling information, the paste target object can be composited onto the replacement target object of the material video.
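
A minimal sketch of the face position information and its use follows. It assumes faces are described by axis-aligned bounding boxes `(x, y, width, height)` with similar aspect ratios, and it uses nearest-neighbor resampling for brevity; the `FacePosition` record and both functions are hypothetical names, not part of the patent.

```python
from dataclasses import dataclass


@dataclass
class FacePosition:
    x: int        # left edge of the character's face in the material frame
    y: int        # top edge of the character's face in the material frame
    scale: float  # character face size divided by captured face size


def face_position(char_box, captured_box):
    """Derive position and scale from two (x, y, width, height) boxes."""
    cx, cy, cw, _ = char_box
    _, _, uw, _ = captured_box
    return FacePosition(cx, cy, cw / uw)


def resize_nearest(image, scale):
    """Nearest-neighbor resize of a 2-D list of pixels."""
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * scale))
    dst_w = max(1, round(src_w * scale))
    return [[image[min(src_h - 1, int(r / scale))][min(src_w - 1, int(c / scale))]
             for c in range(dst_w)]
            for r in range(dst_h)]


pos = face_position((40, 30, 50, 60), (0, 0, 100, 120))
patch = [[r * 4 + c for c in range(4)] for r in range(4)]
small = resize_nearest(patch, pos.scale)
```

The resized patch would then be pasted with its top-left corner at `(pos.y, pos.x)` in the material frame; in-image rotation, which the text mentions as optional, is omitted here.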

[0090] Compositing in this way makes it possible to omit the processing, required in Embodiment 1, of changing the camera position to match the size of the replacement target object image, that is, moving the camera away from or toward the subject, so that images can be captured considering only the shooting direction of the camera. Note that image deformation due to the camera's focal length and lens distortion remains; it can either be corrected, or capture can proceed under the policy that "some deformation is tolerated for a face that appears small and distant."

[0091] The procedure for generating the camera position information, face position information, and mask information is described with reference to FIG. 11. In step S301, the frame number i (i = 1 to n, where n is the number of frames in the material video) is initialized to i = 1.

[0092] Next, in step S302, the image of frame i of the material video is acquired. In step S303, the replacement target object image (e.g., the face of a specific character) is selected from the frame-i image of the material video, and the position and orientation of the face are extracted. The face position information also includes the size. Assuming that the size of the paste target object (e.g., the user's face) captured by the camera 4 in FIG. 9 is essentially constant, the image scaling ratio can be set based on the size of the replacement target object image (e.g., the face of the specific character) in the material video.

[0093] Next, in step S304, based on this information, camera position information is generated as the shooting direction for the paste target object (e.g., the face), and face position information is generated as the position information and scaling information of the paste target object (e.g., the face). This is the processing described earlier with reference to FIG. 10. The generated camera position information is stored in the camera position information storage unit 1, and the face position information is stored in the replacement target object position information storage unit 15.

[0094] Further, in step S305, hidden mask information is generated. This is the processing described earlier with reference to FIG. 4: for frame i of the material video, regions that hide the face of the character to be replaced are extracted and set as mask regions (see FIG. 4(b)). Mask information is generated for each frame of the material video and stored in the hidden mask information storage unit 9.

[0095] Next, in step S306, it is determined whether all frames have been processed. If unprocessed frames remain, the frame number i is incremented in step S307, and the processing from step S302 onward is executed for frame i+1.
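
The loop of steps S301 to S307 can be sketched as below. The three analysis callables (`analyze_face`, `make_camera_info`, `make_mask`) are hypothetical placeholders for the image analysis the text describes; only the control flow is taken from the flowchart.

```python
def generate_frame_info(frames, analyze_face, make_camera_info, make_mask):
    """Generate per-frame camera, face-position, and hidden-mask information.

    analyze_face(frame)           -> (face_position, face_orientation)
    make_camera_info(orientation) -> camera position information
    make_mask(frame, position)    -> hidden mask information
    """
    camera_info, face_info, mask_info = [], [], []
    n = len(frames)
    i = 1                                                   # S301: i = 1
    while i <= n:                                           # S306: frames left?
        frame = frames[i - 1]                               # S302: get frame i
        position, orientation = analyze_face(frame)         # S303: face pose
        camera_info.append(make_camera_info(orientation))   # S304: camera info
        face_info.append(position)                          # S304: face info
        mask_info.append(make_mask(frame, position))        # S305: hidden mask
        i += 1                                              # S307: next frame
    return camera_info, face_info, mask_info


# Stub analyzers, used only to exercise the control flow.
frames = ["f1", "f2", "f3"]
cam, face, mask = generate_frame_info(
    frames,
    analyze_face=lambda f: ((10, 20, 0.5), (0.0, 0.0, 1.0)),
    make_camera_info=lambda orientation: orientation,
    make_mask=lambda f, position: f + "-mask",
)
```

The three returned lists correspond to the contents of the camera position information storage unit 1, the replacement target object position information storage unit 15, and the hidden mask information storage unit 9.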

[0096] When the camera position information, face position information, and hidden mask information have been generated for all frames, the image parameters are adjusted in step S308 so that the frames connect smoothly. This adjusts the parameters so that no discontinuity or jitter appears when the images are viewed as a moving picture. This processing may be performed in one batch at the end, or frame by frame, adjusting each frame's information as it is generated with regard to continuity with the information of previously generated frames.
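
The patent does not specify how the parameters are adjusted in step S308. As one possible illustration, the batch variant could be a centered moving average over a per-frame parameter, and the frame-by-frame variant an exponential blend with the previous frame. Both functions and their default settings are assumptions.

```python
def smooth(values, window=3):
    """Batch variant: centered moving average; the window shrinks at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def smooth_online(previous, current, alpha=0.5):
    """Per-frame variant: blend the new value with the last smoothed value."""
    return alpha * current + (1 - alpha) * previous


# A one-frame spike in, say, the horizontal face position is spread out.
batch = smooth([0, 0, 9, 0, 0])
step = smooth_online(10.0, 20.0)
```

Either function would be applied independently to each scalar parameter, such as a camera angle, a face coordinate, or the scale.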

[0097] The image compositing procedure of Embodiment 3 is almost the same as that of FIG. 6 described in Embodiment 1. However, the processing in the image synthesizing unit, which generates an image combining the material video with the user's face image extracted from the captured image, differs. Because the character's face in the material video and the user's face extracted from the captured image match only in orientation, an image whose position and size have been aligned using the face position information is generated and overwritten onto the material image.

[0098] By adding face position information to Embodiment 1, this embodiment reduces the camera's range of movement and makes the system compact. In Embodiment 1, for example, when a character in the material video is far away and the face appears small, the capture camera must be moved far away to capture the user's face. In this embodiment, only the direction of the camera needs to be controlled, and distance no longer needs to be considered.

[0099] [Embodiment 4] Next, as Embodiment 4 of the image processing apparatus of the present invention, a system is described in which the material video is not a movie or drama scene but video of a 3DCG (computer graphics) character, and a 3DCG object is the replacement target object. For example, the face of a 3DCG character is the replacement target object and the user's face is the paste target object, and the system generates video in which the user interactively takes the place of the character.

[0100] FIG. 12 shows a block diagram of an image processing apparatus that uses video of a 3DCG (computer graphics) character as the material video and generates video in which the user interactively takes the place of the character's face.

[0101] The 3DCG model storage unit 31 is a storage unit in which a three-dimensional model of the CG character and its motion information are recorded in time series. The virtual camera information storage unit 32 stores information on a virtual camera, that is, a camera assumed to be set at the viewpoint from which the 3DCG object is viewed. Specifically, the virtual camera parameters used for image rendering, such as the focal length, the image size, and the position of the virtual camera relative to the 3DCG model, are stored as time-series data.

[0102] The material video generation unit 33 performs rendering from the 3DCG model information stored in the 3DCG model storage unit 31 and the virtual camera information stored in the virtual camera information storage unit 32, and generates the material video.
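
How the stored virtual camera parameters determine where a point of the model lands in the rendered frame can be sketched with an ideal pinhole projection. This is a simplification, not the renderer itself: the camera is assumed to look along +Z with no rotation and no lens distortion, the focal length is expressed in pixels, and the `VirtualCamera` record is a hypothetical container for the parameters listed in the text.

```python
from dataclasses import dataclass


@dataclass
class VirtualCamera:
    focal_length: float   # in pixels (assumed unit)
    width: int            # image width in pixels
    height: int           # image height in pixels
    position: tuple       # camera center in model coordinates


def project(camera, point):
    """Project a 3-D model point to pixel coordinates (u, v), or None."""
    x, y, z = (p - c for p, c in zip(point, camera.position))
    if z <= 0:
        return None                                    # behind the camera
    u = camera.width / 2 + camera.focal_length * x / z
    v = camera.height / 2 - camera.focal_length * y / z   # rows grow downward
    return u, v


camera = VirtualCamera(focal_length=500.0, width=640, height=480,
                       position=(0.0, 0.0, 0.0))
pixel = project(camera, (1.0, 0.5, 5.0))
```

Storing one such record per frame gives the time-series virtual camera data described above, and the same projection shows why a face farther from the camera appears smaller in the material video.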

[0103] The camera position information generation unit 34 obtains the orientation and position of the replacement target object (e.g., the character's face) in the generated material video, from the face orientation information obtained from the 3DCG model information in the material video generation unit 33 and from the result of rendering based on the virtual camera information. In accordance with this information, it generates, as time-series sequential data, the position information of the camera for capturing the paste target object (e.g., the user's face) 5. That is, it generates and stores as sequential data the camera position information for capturing the face image, which is the paste target object, from a direction matching that of the image object to be replaced in the material video generation unit 33 (here, the face of a specific 3DCG character). In doing so, the camera position information is generated taking into account the parameters of the actual camera 4 (focal length, lens distortion, and so on).

[0104] That is, the image capturing means 100 in this embodiment captures images of the paste target object while controlling the position of the capture camera in accordance with camera position information generated from the virtual shooting viewpoint information of the replacement target object included in the material video generated by the material video generation unit 33.

[0105] The hidden mask information generation unit 35 generates hidden mask information from the 3DCG model information and the virtual camera information. The meaning of the hidden mask information is the same as in the system described in Embodiment 1. The subsequent processing is the same as in Embodiment 1.
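
With a 3DCG scene, occlusion can be decided from depth rather than by image analysis. The sketch below assumes, hypothetically, that the renderer can provide one depth map for the face alone and one for every other object, each as a 2-D list in which `None` marks pixels where nothing was rendered; the patent does not state how the mask is computed.

```python
def hidden_mask(face_depth, occluder_depth):
    """True where another object is nearer to the virtual camera than the face.

    face_depth, occluder_depth: 2-D lists of depths from the virtual camera;
    None means nothing was rendered at that pixel.
    """
    return [[f is not None and o is not None and o < f
             for f, o in zip(face_row, occ_row)]
            for face_row, occ_row in zip(face_depth, occluder_depth)]


face_depth = [[None, 5.0],
              [5.0, 5.0]]
occluder_depth = [[2.0, 2.0],
                  [None, 7.0]]
mask = hidden_mask(face_depth, occluder_depth)
```

Only the top-right pixel is masked: the face is present there and the occluder is closer, whereas the occluder at the bottom right lies behind the face.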

[0106] In this system, compositing with a background or other objects made of 3DCG data is possible at rendering time. A light source can also be set from an arbitrary direction when the material video is generated. Based on the light source information generated in such rendering, the direction of the illumination applied to the paste target object (e.g., the user's face) may be set, as in Embodiment 2 described earlier.

[0107] Instead of using stored information for the motion information of the 3DCG model, the position information of the virtual camera, and so on, the system may, for example, acquire the user's motion information and reflect the same motion in the motion of the 3DCG character, thereby generating an interactive 3DCG composite image.

[0108] The embodiments above describe a configuration in which the paste target object (e.g., the user's face) is held fixed in the environment where its video is captured. However, by attaching a sensor such as a gyro to the paste target object (e.g., the user's face) to measure the position and orientation of the user's face in real time, and adding that information to the illumination position information, camera position information, and face position information, it is also possible to acquire and composite captured images of the paste target object (e.g., the user's face) without fixing it.
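
Adding the measured head pose to the stored camera position information can be sketched as a change of coordinates. The sketch is restricted to rotation about the vertical axis (yaw) for brevity, whereas a real sensor would report a full 3-D orientation; the function names and the convention that the stored offset is expressed in head-fixed coordinates are assumptions.

```python
import math


def rotate_yaw(v, yaw):
    """Rotate a vector about the vertical Y axis by `yaw` radians."""
    x, y, z = v
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x + s * z, y, -s * x + c * z)


def camera_target_in_room(stored_offset, head_position, head_yaw):
    """Room-frame camera position for a head that has moved and turned.

    stored_offset: camera position relative to the head, in head coordinates.
    head_position, head_yaw: values measured in real time by the sensor.
    """
    rotated = rotate_yaw(stored_offset, head_yaw)
    return tuple(h + r for h, r in zip(head_position, rotated))


# Camera stored 1 m in front of the face; the user has turned 90 degrees.
target = camera_target_in_room((0.0, 0.0, 1.0), (0.0, 1.5, 0.0), math.pi / 2)
```

The same correction would apply to the illumination positions, so that the lamps follow the head in the same way as the camera does.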

[0109] The series of processes described in the embodiments above can be performed by hardware, and can also be performed by software. That is, they can be performed by having a general-purpose computer or a microcomputer execute a program. When the series of processes is performed by software, a program constituting that software is installed on, for example, a general-purpose computer or a one-chip microcomputer. FIG. 13 shows a configuration example of an embodiment of a computer on which a program that executes the series of processes described above is installed.

[0110] The program can be recorded in advance on the hard disk 205 or in the ROM 203, which are recording media built into the computer. Alternatively, the program can be stored (recorded) temporarily or permanently on a removable recording medium 210 such as a floppy (registered trademark) disk, CD-ROM (Compact Disc Read Only Memory), MO (Magneto-optical) disk, DVD (Digital Versatile Disc), magnetic disk, or semiconductor memory. Such a removable recording medium 210 can be provided as so-called packaged software.

[0111] Besides being installed on the computer from the removable recording medium 210 described above, the program can be transferred to the computer wirelessly from a download site via an artificial satellite for digital satellite broadcasting, or transferred to the computer by wire via a network such as a LAN (Local Area Network) or the Internet. The computer can receive a program transferred in this way at the communication unit 208 and install it on the built-in hard disk 205.

[0112] The computer has a built-in CPU (Central Processing Unit) 202. An input/output interface 211 is connected to the CPU 202 via a bus 201. When a command is input via the input/output interface 211 by the user operating the input unit 207, which consists of a keyboard, a mouse, and the like, the CPU 202 executes the program stored in the ROM (Read Only Memory) 203 accordingly. The image of the paste target object in the embodiments above is input via the camera 212 and, under the control of the CPU 202, composited with data such as material video or 3DCG data stored, for example, on the removable recording medium 210. Although the figure shows only one removable recording medium 210, the various data, such as the camera position information, illumination position information, material video, and hidden mask information, may each be stored on separate storage media that are connected to the system. Some of these data may also be stored on the hard disk 205.

[0113] The CPU 202 is not limited to programs stored in the ROM; it can also load into the RAM (Random Access Memory) 204 and execute a program stored on the hard disk 205, a program transferred from a satellite or a network, received by the communication unit 208, and installed on the hard disk 205, or a program read from the removable recording medium 210 mounted in the drive 209 and installed on the hard disk 205.

[0114] The CPU 202 thereby performs the processing according to the embodiments described above, or the processing performed according to the block diagrams and flowcharts described above. The CPU 202 then, as necessary, outputs the processing result via the input/output interface 211 from the output unit 206, which consists of an LCD (Liquid Crystal Display), a speaker, and the like, transmits it from the communication unit 208, or records it on the hard disk 205.

[0115] In this specification, the processing steps that describe a program for causing a computer to perform various kinds of processing need not be processed in time series in the order given in the flowcharts; they also include processing executed in parallel or individually (for example, parallel processing or object-based processing).

[0116] The program may be processed by a single computer or processed in a distributed manner by a plurality of computers. The program may also be transferred to a remote computer and executed there.

[0117] The present invention has been described in detail above with reference to specific embodiments. It is obvious, however, that those skilled in the art can modify or substitute the embodiments without departing from the gist of the present invention. In other words, the present invention has been disclosed by way of example and should not be interpreted restrictively. To determine the gist of the present invention, the claims section set out at the beginning should be taken into account. For example, although the embodiments provide a system that swaps a person's face with the face of a character in the material video, the replacement is not limited to human faces; any object in the material video can be interactively replaced with another object.

[0118]

[Effects of the Invention] As described above, according to the image processing apparatus, image processing method, and program storage medium of the present invention, the process of compositing and pasting a paste target object (e.g., the user's face) photographed by a camera onto a specific replacement target object (e.g., the face of a specific character) in material video such as a movie or drama, or in image data generated from 3DCG, is achieved without forcing the paste target object (e.g., the user's face) to make various movements.

[0119] Further, according to the image processing apparatus, image processing method, and program storage medium of the present invention, there is no need to acquire the three-dimensional shape of the paste target object (e.g., the user's face), and the face image from the camera video can be used as is. There is therefore no need to analyze the movement or voice of the replacement target object (e.g., the face of a specific character) or to synthesize facial expressions, and natural composite images can be generated easily. In addition, the camera position information and other information needed for compositing can be obtained in advance, so the computation required during operation is very small, which makes the invention effective as a real-time interactive system.

[Brief description of the drawings]

【図１】本発明の画像処理装置の第1実施例の構成を示
すブロック図である。FIG. 1 is a block diagram illustrating a configuration of a first embodiment of an image processing apparatus according to the present invention.

【図２】本発明の画像処理装置の第1実施例の画像撮り
込み部の構成例を示す図である。FIG. 2 is a diagram illustrating a configuration example of an image capturing unit of the first embodiment of the image processing apparatus according to the present invention.

【図３】本発明の画像処理装置におけるカメラ位置情報
の取得処理を説明する図である。FIG. 3 is a diagram illustrating a process of acquiring camera position information in the image processing apparatus of the present invention.

FIG. 4 is a diagram illustrating the process of acquiring hidden mask information in the image processing apparatus of the present invention.

FIG. 5 is a flowchart illustrating the process of acquiring camera position information and hidden mask information in the image processing apparatus of the present invention.

FIG. 6 is a flowchart illustrating the image synthesizing process in the image processing apparatus of the present invention.

FIG. 7 is a block diagram showing the configuration of a second embodiment of the image processing apparatus of the present invention.

FIG. 8 is a diagram showing a configuration example of the image capturing unit of the second embodiment of the image processing apparatus of the present invention.

FIG. 9 is a block diagram showing the configuration of a third embodiment of the image processing apparatus of the present invention.

FIG. 10 is a diagram illustrating the process of acquiring camera position information and face position information in the third embodiment of the image processing apparatus of the present invention.

FIG. 11 is a flowchart illustrating the process of acquiring camera position information and face position information in the third embodiment of the image processing apparatus of the present invention.

FIG. 12 is a block diagram showing the configuration of a fourth embodiment of the image processing apparatus of the present invention.

FIG. 13 is a block diagram showing the configuration of the processing means used when the processing in the image processing apparatus of the present invention is executed by software.

[Explanation of symbols]

1 Camera position information storage unit
2 Camera movement control unit
3 Camera moving unit
4 Camera
6 Captured image extraction unit
7 Image synthesis unit
8 Material video storage unit
9 Hidden mask information storage unit
10 Output unit
100 Image capturing unit
20 User
21 Fixed base
25 Display
30 Character
40 User's face
50 Camera
60 Pillar
70 Mask area
11 Lighting position information storage unit
12 Lighting movement control unit
13 Lighting moving unit
14 Lighting
80 Image capturing unit
15 Replacement target object position information storage unit
90 Character
92, 96, 98 User's face
94 Camera
31 3DCG model storage unit
32 Virtual camera information storage unit
33 Material video generation unit
34 Camera position information generation unit
35 Hidden mask information generation unit
201 Bus
202 CPU
203 ROM
204 RAM
205 Hard disk
206 Output unit
207 Input unit
208 Communication unit
209 Drive
210 Removable recording medium
211 Input/output interface
212 Camera

Continued from the front page

(51) Int.Cl.⁷ and FI:
Int.Cl.⁷ G06T 15/70; FI G06T 15/70 A
Int.Cl.⁷ H04N 5/225; FI H04N 5/225 Z
Int.Cl.⁷ H04N 5/262; FI H04N 5/262
Int.Cl.⁷ H04N 13/00; FI H04N 13/00
Int.Cl.⁷ H04N 13/02; FI H04N 13/02

F-terms (reference):
5B050 AA09 BA08 BA09 BA10 BA11 BA12 EA07 EA12 EA19 EA24 EA27 FA02
5B057 AA20 BA02 CA13 CB13 CC03 CD05 CE08 CH20 DB03 DC08
5C022 AB68 AC00 CA01 CA02
5C023 AA06 AA16 AA17 AA37 BA11 CA03 DA02 DA03 DA08
5C061 AA20 AB03 AB08 AB12

Claims

[Claims]

1. An image processing apparatus for executing a process of setting a specific object in material image data as a replacement target object and replacing an image of the replacement target object with image data of a paste target object photographed by a camera, the apparatus comprising: image capturing means for controlling a photographing camera position in accordance with camera position information generated based on photographing position information of the replacement target object, and capturing an image of the paste target object; and image synthesizing means for executing an image synthesizing process of changing the image of the replacement target object in the material image data into the image of the paste target object captured by the image capturing means.

2. The image processing apparatus according to claim 1, further comprising: material video storage means storing material image data including the replacement target object; and camera position information storage means storing camera position information generated based on photographing direction information of the replacement target object, wherein the camera position information stored in the camera position information storage means is configured as time-series data corresponding to each frame of the image data stored in the material video storage means; the output of the camera position information recorded in the camera position information storage means to the image capturing means and the output of the material video recorded in the material video storage means to the image synthesizing means are executed as synchronized processes; and the image synthesizing means receives, in parallel, the material video input from the material video storage means and the image of the paste target object input from the image capturing means, and executes a process of changing the image of the replacement target object in the material video into the image of the paste target object based on the parallel input data.

3. The image processing apparatus according to claim 1, further comprising captured image extracting means for extracting only the image of the paste target object from the image captured by the image capturing means, wherein the captured image extracting means outputs the extracted image of the paste target object to the image synthesizing means.

4. The image processing apparatus according to claim 1, further comprising hidden mask information storage means for storing hidden mask information as occluded area information of the replacement target object in the material image data, wherein the image synthesizing means receives the hidden mask information from the hidden mask information storage means and, for the mask area, executes a synthesizing process that selects the data of the material image data as output data instead of the image data of the paste target object from the image capturing means.
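
The selection rule for the mask area can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: images are represented as nested lists of pixel values of equal size, `face_mask` marks where the extracted paste target object is present, and `hidden_mask` marks the occluded area. The function name and both mask representations are hypothetical.

```python
from typing import List

Image = List[List[int]]


def composite_with_hidden_mask(
    material: Image, face: Image, face_mask: Image, hidden_mask: Image
) -> Image:
    """Per-pixel synthesis with an occlusion (hidden) mask.

    Assumptions: all four arrays have the same shape; a non-zero face_mask
    value means the paste target object covers that pixel; a non-zero
    hidden_mask value means something in the material video occludes the
    replacement target object there.
    """
    output = []
    for m_row, f_row, fm_row, hm_row in zip(material, face, face_mask, hidden_mask):
        output.append([
            # Inside the hidden mask the material pixel always wins,
            # so the occluding object stays in front of the pasted face.
            f if (fm and not hm) else m
            for m, f, fm, hm in zip(m_row, f_row, fm_row, hm_row)
        ])
    return output
```

For example, if a pillar in the material video passes in front of the character's face, the pixels of the pillar are flagged in `hidden_mask` and are copied from the material frame rather than from the camera image.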

5. The image processing apparatus according to claim 1 or 2, wherein the image capturing means comprises: a camera for capturing an image of the paste target object; and camera moving means for moving the camera in accordance with the camera position information.

6. The image processing apparatus according to claim 1 or 2, further comprising illumination position information storage means for storing illumination position information generated based on luminance information of the replacement target object in the material image data, wherein the image capturing means comprises: illumination for irradiating the paste target object; and illumination moving means for moving the illumination in accordance with the illumination position information.

7. The image processing apparatus according to claim 1, further comprising replacement target object position information storage means for storing position information of the replacement target object in the material image data, wherein the image synthesizing means executes size adjustment and position adjustment of the image of the paste target object input from the image capturing means, based on the position information of the replacement target object input from the replacement target object position information storage means.
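
A minimal sketch of such size and position adjustment is given below, assuming that the stored position information takes the form of a bounding box (left, top, width, height) in material-frame pixels. That representation, the nearest-neighbour scaling, and the function names are assumptions made for illustration and are not specified in the claims.

```python
from typing import List

Image = List[List[int]]


def resize_nearest(img: Image, new_w: int, new_h: int) -> Image:
    """Scale an image to new_w x new_h by nearest-neighbour sampling."""
    if new_w <= 0 or new_h <= 0:
        raise ValueError("target size must be positive")
    src_h, src_w = len(img), len(img[0])
    return [
        [img[y * src_h // new_h][x * src_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def paste_at(material: Image, face: Image,
             left: int, top: int, width: int, height: int) -> Image:
    """Fit the captured face to the replacement target's box and paste it.

    Assumption: (left, top, width, height) is the hypothetical stored
    position information of the replacement target object for this frame.
    """
    scaled = resize_nearest(face, width, height)   # size adjustment
    output = [row[:] for row in material]          # leave the input untouched
    for dy in range(height):
        for dx in range(width):
            y, x = top + dy, left + dx             # position adjustment
            if 0 <= y < len(output) and 0 <= x < len(output[0]):
                output[y][x] = scaled[dy][dx]
    return output
```

Pixels that would fall outside the material frame are skipped, so a box that partly leaves the frame is clipped rather than raising an error.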

8. The image processing apparatus according to claim 1, further comprising: three-dimensional model storage means for storing three-dimensional model data; virtual camera information storage means for storing virtual shooting viewpoint information of a replacement target object included in the three-dimensional model data stored in the three-dimensional model storage means; and material video generating means for generating material video data based on the three-dimensional model data and the virtual shooting viewpoint information, wherein the image capturing means is configured to control the photographing camera position in accordance with camera position information generated based on the virtual shooting viewpoint information of the replacement target object included in the material video generated by the material video generating means, and to capture an image of the paste target object.

9. An image processing method for executing a process of setting a specific object in material image data as a replacement target object and replacing an image of the replacement target object with image data of a paste target object photographed by a camera, the method comprising: an image capturing step of controlling a photographing camera position in accordance with camera position information generated based on photographing position information of the replacement target object, and capturing an image of the paste target object; and an image synthesizing step of executing an image synthesizing process of changing the image of the replacement target object in the material image data into the image of the paste target object captured in the image capturing step.

10. The image processing method according to claim 9, further comprising a camera position information accumulating step of generating and accumulating camera position information based on photographing direction information of the replacement target object, wherein the output of the camera position information generated in the camera position information accumulating step to the image capturing means and the output of the material video recorded in the material video storage means to the image synthesizing means are executed as synchronized processes, and the image synthesizing step receives the material video and the image of the paste target object in parallel and executes a process of changing the image of the replacement target object in the material video into the image of the paste target object based on the parallel input data.

11. The image processing method according to claim 9, further comprising a captured image extracting step of extracting only the image of the paste target object from the image captured in the image capturing step.

12. The image processing method according to claim 9, further comprising a hidden mask information accumulating step of generating and storing hidden mask information as occluded area information of the replacement target object in the material image data, wherein the image synthesizing step receives the hidden mask information and, for the mask area, executes a synthesizing process that selects the data of the material image data as output data instead of the image data of the paste target object captured in the image capturing step.

13. The image processing method according to claim 9 or 10, further comprising an illumination position information accumulating step of generating and storing illumination position information based on luminance information of the replacement target object in the material image data, wherein the image capturing step moves the illumination that irradiates the paste target object in accordance with the illumination position information generated in the illumination position information accumulating step.

14. The image processing method according to claim 9 or 10, further comprising a replacement target object position information accumulating step of generating and storing position information of the replacement target object in the material image data, wherein the image synthesizing step executes size adjustment and position adjustment of the image of the paste target object captured in the image capturing step, based on the position information of the replacement target object generated in the replacement target object position information accumulating step.

15. The image processing method according to claim 9, further comprising: a virtual camera information accumulating step of generating and storing virtual shooting viewpoint information of a replacement target object included in three-dimensional model data stored in three-dimensional model storage means; and a material video generating step of generating material video data based on the three-dimensional model data and the virtual shooting viewpoint information, wherein the image capturing step controls the photographing camera position in accordance with camera position information generated based on the virtual shooting viewpoint information of the replacement target object included in the material video generated in the material video generating step, and captures an image of the paste target object.

16. A program storage medium for providing a computer program that causes a computer system to execute a process of setting a specific object in material image data as a replacement target object and replacing an image of the replacement target object with image data of a paste target object photographed by a camera, the computer program comprising: an image capturing step of controlling a photographing camera position in accordance with camera position information generated based on photographing position information of the replacement target object, and capturing an image of the paste target object; and an image synthesizing step of executing an image synthesizing process of changing the image of the replacement target object in the material image data into the image of the paste target object captured in the image capturing step.