JP5474169B2

JP5474169B2 - Image processing apparatus and image processing method

Info

Publication number: JP5474169B2
Application number: JP2012260297A
Authority: JP
Inventors: 大松村; 要冨手
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2007-10-30
Filing date: 2012-11-28
Publication date: 2014-04-16
Anticipated expiration: 2028-07-17
Also published as: JP2013077307A; JP2009134693A

Description

The present invention relates to a technique for compositing a real space image with a virtual space image.

Mixed reality (MR) technology has long been proposed in which computer graphics (CG) are superimposed on a live-action scene and presented to a user, giving the user the experience that a virtual object actually exists at that location.

To give the user a highly immersive experience with MR technology, it is not enough simply to superimpose CG on the live-action scene. Interaction also matters: the user should be able to touch and manipulate the virtual object drawn in CG, or at least feel as if doing so. To realize such interaction, the subject that manipulates the virtual object, such as the user's hand, must be displayed in front of the virtual object (in the foreground). If a subject that should be in front of the virtual object is hidden by it, the sense of distance to the virtual object and the sense of reality break down, and the immersion is lost.

To solve this problem, the present applicant proposed in Patent Document 1 a technique that prevents the image of a subject that should be in the foreground from being hidden by a virtual object. In that technique, the background and the subject are acquired as a live-action image, and a "subject to be displayed in front of the virtual object" (a region having color information registered manually in the system beforehand as subject detection information) is extracted from the live-action image as a subject region. Drawing of the virtual object is then prohibited in the subject region. With this technique, the subject that should be in the foreground is displayed in front of the virtual object rather than hidden by it, enabling a highly immersive mixed reality experience.

FIG. 1 is a diagram showing an example of a real space image, a virtual space image, and a composite image in which the virtual space image is superimposed on the real space image.

In FIG. 1, reference numeral 101 denotes a real space image, which includes a hand region 150 as the subject. Reference numeral 102 denotes a virtual space image to be superimposed on the real space image 101. Reference numeral 103 denotes a composite image in which the virtual space image 102 is superimposed on the real space image 101. When the composite image 103 is generated, the virtual space image 102 is not superimposed on the hand region 150 of the real space image 101, so the hand region 150 appears unchanged in the composite image 103.
Patent Document 1: JP 2003-296759 A

The mixed reality experience system disclosed in Patent Document 1 works well when the subject seen by the user has a single color. When the subject has several different colors, however, CG drawing can be prohibited only in the region having one particular color. As a result, only part of the subject appears to float within the CG, which can impair the user's sense of reality.

FIG. 2 is a diagram showing an example of a real space image including a subject having a plurality of colors, a virtual space image, and a composite image in which the virtual space image is superimposed on that real space image.

In FIG. 2, reference numeral 201 denotes a real space image, which includes a hand region 150a and an arm region 150b as the subject. The regions 150a and 150b have different colors. Reference numeral 202 denotes a virtual space image to be superimposed on the real space image 201. Reference numeral 203 denotes a composite image in which the virtual space image 202 is superimposed on the real space image 201. Here, only the region having the color of the hand region 150a is excluded from superimposition of the virtual space image 202. As a result, as shown in FIG. 2, the virtual space image 202 is drawn over the arm region 150b, on which it should not have been superimposed.

Against this technical background, the following is desired. After the user's hand or a designated region is extracted from the live-action image, a region attached to the extracted region (such as the user's arm) is also extracted. The virtual space image is then superimposed on the real space image in such a way that it is not superimposed on the subject region (hand and arm) obtained by merging the extracted regions.

The present invention has been made in view of the above problem, and its object is to provide a technique for appropriately setting, in a real space image, a region on which a virtual space image is not superimposed.

To achieve this object, an image processing apparatus of the present invention has, for example, the following configuration.

That is, the apparatus comprises:
means for acquiring a real space image composed of a plurality of frames;
first setting means for setting, in a frame of interest of the real space image, a region composed of pixels satisfying a predetermined condition as a first region;
acquisition means for acquiring, within the frame of interest, a motion vector of the first region and a motion vector of another region in the frame of interest other than the first region;
first determination means for determining whether the magnitude of the acquired motion vector of the other region is greater than or equal to a threshold;
second determination means for determining, when the first determination means determines that the magnitude of the motion vector of the other region is greater than or equal to the threshold, whether the motion vector of the other region is similar to the motion vector of the first region;
second setting means for setting the other region as a second region when the second determination means determines that the motion vector of the other region is similar to the motion vector of the first region; and
compositing means for compositing a virtual space image onto a region of the frame of interest other than the set first region and second region.
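
The two determinations in the configuration above can be illustrated with a minimal Python sketch. The text here does not define when two motion vectors count as "similar", so the angle and magnitude criteria below, as well as the function name `select_second_regions` and all threshold values, are assumptions made only for illustration.

```python
import math


def select_second_regions(key_vec, other_vecs, mag_threshold=1.0,
                          angle_threshold_deg=20.0, mag_ratio=0.5):
    """Return ids of regions whose motion resembles that of the first (key) region.

    key_vec    -- (vx, vy) motion vector of the first region
    other_vecs -- dict mapping a region id to its (vx, vy) motion vector
    The similarity test (angle and magnitude difference) is a hypothetical choice.
    """
    selected = []
    key_mag = math.hypot(*key_vec)
    for region_id, (vx, vy) in other_vecs.items():
        mag = math.hypot(vx, vy)
        # First determination: skip regions whose motion is below the threshold.
        if mag < mag_threshold:
            continue
        # A static key region gives no direction to compare against.
        if key_mag == 0.0:
            continue
        # Second determination: compare direction and magnitude with the key region.
        cos_angle = (vx * key_vec[0] + vy * key_vec[1]) / (mag * key_mag)
        cos_angle = max(-1.0, min(1.0, cos_angle))
        angle = math.degrees(math.acos(cos_angle))
        similar_magnitude = abs(mag - key_mag) <= mag_ratio * key_mag
        if angle <= angle_threshold_deg and similar_magnitude:
            selected.append(region_id)
    return selected
```

For example, with a hand moving right at `(4.0, 0.0)`, an arm moving at `(3.5, 0.5)` is selected as a second region, while a nearly static wall and a plant moving perpendicular to the hand are not.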

According to the configuration of the present invention, a region on which a virtual space image is not superimposed can be set appropriately in a real space image.

FIG. 1 is a diagram showing an example of a real space image, a virtual space image, and a composite image in which the virtual space image is superimposed on the real space image.
FIG. 2 is a diagram showing an example of a real space image including a subject having a plurality of colors, a virtual space image, and a composite image in which the virtual space image is superimposed on that real space image.
FIG. 3 is a block diagram showing an example of the functional configuration of a system according to the first embodiment of the present invention.
FIG. 4 is a diagram showing a user observing, via an HMD, a composite image in which a virtual space image is superimposed on a real space image.
FIG. 5 is a flowchart of the series of processes by which the image processing apparatus 300 generates an image of the mixed reality space and sends it to the display unit 309 of the HMD 390.
FIG. 6 is a flowchart showing details of the processing in step S502.
FIG. 7 is a flowchart showing details of the processing in step S601.
FIG. 8 is a flowchart showing details of the processing in step S604.
FIG. 9 is a diagram showing an example of the result of clustering only the features of the key region in the feature space.
FIG. 10 is a diagram showing the class of the key region and features belonging to other classes.
FIG. 11 is a flowchart showing details of the processing in step S605.
FIG. 12 is a flowchart showing details of the mixed reality space image generation processing in step S505.
FIG. 13 is a diagram showing an example of an image of the mixed reality space generated by the first embodiment of the present invention.
FIG. 14 is a flowchart of the processing in step S502 performed in the third embodiment of the present invention.
FIG. 15 is a diagram showing an example of the hardware configuration of a computer applicable to the image processing apparatus 300.
FIG. 16 is a flowchart of the processing performed in step S602 in the fourth embodiment of the present invention.
FIG. 17 is a diagram showing the principle of calculating the position-change motion vector Tv projected onto the image plane.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. The embodiments described below are examples of concrete implementations of the present invention, and each is one specific example of the configuration recited in the claims.

[First Embodiment]
In this embodiment, when a virtual space image is superimposed on a real space image that contains a "hand" region and an "arm" region, the two regions are merged to generate a single subject region (merged region). The superimposition is then controlled so that this subject region is always displayed in front of the virtual space image. As described in detail later, the subject region is not limited to a merger of the "hand" region and the "arm" region; any regions may be merged to form it. In other words, the following description applies to any subject region that is displayed with a plurality of different pixel values.

FIG. 4 is a diagram showing a user observing, via an HMD, a composite image in which a virtual space image is superimposed on a real space image.

As shown in FIG. 4, a user 401 observes, via the HMD 390, a composite image (an image of the mixed reality space) in which a virtual space image generated based on the measurement result of a sensor 404 is superimposed on a real space image captured by the imaging unit 301. If the user's own hand 405 or arm 406 enters the field of view 409 of the imaging unit 301 during this observation, the hand 405 and the arm 406 are displayed in the mixed reality space image shown on the HMD 390. That is, the virtual space image (part of a virtual object 408) is not superimposed on the regions of the hand 405 and the arm 406, which are the foreground in the real space image. To achieve this, regions are distinguished: the real subject to be treated as foreground is "the region of the hand 405 and the region of the arm 406 of the user 401", and the real objects to be treated as background are "background real objects 407 such as a wall or a potted plant".

FIG. 3 is a block diagram showing an example of the functional configuration of the system according to this embodiment.

As shown in FIG. 3, the system according to this embodiment consists of the HMD 390, the position and orientation measurement unit 306, and the image processing apparatus 300. The HMD 390 and the position and orientation measurement unit 306 are each connected to the image processing apparatus 300.

First, the HMD 390 is described.

The HMD 390 is an example of a head-mounted display device and consists of an imaging unit 301 and a display unit 309.

The imaging unit 301 is a video camera that captures moving images of the real space. The image of each captured frame (real space image) is input as an image signal to the image processing apparatus 300 in the subsequent stage. The imaging unit 301 is attached to the HMD 390 so that it is located near the user's eyes when the user wears the HMD 390 on the head. It is oriented so that it approximately matches the frontal direction (line-of-sight direction) of the user wearing the HMD 390. The imaging unit 301 can therefore capture moving images of the real space as seen according to the position and orientation of the user's head. For this reason, the imaging unit 301 is sometimes referred to below as the "user's viewpoint".

The display unit 309 is, for example, a liquid crystal screen, attached to the HMD 390 so that it is positioned in front of the eyes of the user wearing the HMD 390 on the head. An image based on the video signal sent from the image processing apparatus 300 to the HMD 390 is displayed on the display unit 309. The image based on that video signal is thus presented in front of the eyes of the user wearing the HMD 390.

In this embodiment, the imaging unit 301 and the display unit 309 are both built into the HMD 390, arranged so that the optical system of the display unit 309 coincides with the imaging system of the imaging unit 301.

Next, the position and orientation measurement unit 306 is described.

The position and orientation measurement unit 306 measures the position and orientation of the imaging unit 301. A sensor system such as a magnetic sensor or an optical sensor can be used for it. When a magnetic sensor is used, for example, the position and orientation measurement unit 306 operates as follows.

First, when a magnetic sensor is used, the position and orientation measurement unit 306 consists of the following components.

・A transmitter that generates a magnetic field around itself
・A receiver that, within the magnetic field generated by the transmitter, detects changes in the magnetic field according to its own position and orientation
・A sensor controller that controls the operation of the transmitter and, based on the receiver's measurement results, generates position and orientation information of the receiver in the sensor coordinate system
The transmitter is placed at a predetermined position in the real space, and the receiver is attached to the imaging unit 301. When the transmitter generates a magnetic field, the receiver detects the change in the magnetic field according to its own position and orientation (the position and orientation of the imaging unit 301) and sends a signal indicating the detection result to the sensor controller. Based on this signal, the sensor controller generates position and orientation information indicating the position and orientation of the receiver in the sensor coordinate system. The sensor coordinate system is a coordinate system whose origin is the position of the transmitter and whose x-, y-, and z-axes are three axes orthogonal to one another at that origin. The sensor controller then sends the obtained position and orientation information to the image processing apparatus 300 in the subsequent stage.

However, any sensor system may be used for the position and orientation measurement unit 306. Whichever one is used, its operation is well known, so a description is omitted. Instead of a sensor system, a method that obtains the position and orientation of the imaging unit 301 from the images it captures may also be used. In that case, the position and orientation measurement unit 306 is omitted, and a computation unit that executes the method is provided in the image processing apparatus 300 in the subsequent stage.

Next, the image processing apparatus 300 is described. As shown in FIG. 3, the image processing apparatus 300 consists of a captured image capturing unit 302, a key region extraction unit 303, a motion vector detection unit 304, a subject region extraction unit 305, an image composition unit 308, an image generation unit 307, and a storage device 310. Each of these units is described below.

On receiving the image signal of each frame sent from the imaging unit 301, the captured image capturing unit 302 converts it sequentially into digital data and sends the data to the motion vector detection unit 304, the key region extraction unit 303, and the image composition unit 308.

The key region extraction unit 303 extracts a key region (first region) from the real space image represented by the digital data received from the captured image capturing unit 302. A key region is a region composed of pixels having a predetermined pixel value. In this embodiment, the key region is the region composed of pixels whose pixel values indicate the color of the user's hand. The key region extraction unit 303 generates key region data, which identifies the key region in the real space image, and sends it to the subject region extraction unit 305.

The motion vector detection unit 304 uses the real space image received from the captured image capturing unit 302 (the current frame) and the real space image of the immediately preceding frame to obtain an inter-frame motion vector for each pixel of the real space image of the current frame. It then sends the motion vector data obtained for each pixel to the subject region extraction unit 305.

The motion vector detection performed by the motion vector detection unit 304 can be carried out by computing optical flow with an existing block matching method. In this embodiment, motion vectors are detected (calculated) by block matching, but the following description is not limited to that method; any method capable of detecting inter-frame motion vectors may be used. For example, motion vectors may be detected using optical flow computed by a gradient method.
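
As an illustration of block matching, the following Python sketch estimates the motion of a single block between two grayscale frames by minimizing the sum of absolute differences (SAD). It is a simplified sketch under stated assumptions, not the detector used in the embodiment: the function name `block_motion_vector`, the block size, the search range, and the SAD cost are all illustrative choices, and the caller is assumed to pass a block that lies entirely inside the frame.

```python
def block_motion_vector(prev, curr, top, left, block=4, search=2):
    """Estimate the motion (dx, dy), from `prev` to `curr`, of one block.

    prev, curr -- grayscale frames as 2D lists of numbers with equal size
    top, left  -- position of the block in the current frame
    The block in `curr` is compared with shifted blocks in `prev`; the shift
    with the smallest sum of absolute differences is taken as the origin.
    """
    height, width = len(curr), len(curr[0])
    best_cost, best_offset = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = top + dy, left + dx
            # Skip candidate blocks that fall outside the previous frame.
            if y0 < 0 or x0 < 0 or y0 + block > height or x0 + block > width:
                continue
            cost = sum(
                abs(curr[top + r][left + c] - prev[y0 + r][x0 + c])
                for r in range(block)
                for c in range(block)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_offset = cost, (dx, dy)
    dx, dy = best_offset
    # The content came from (left + dx, top + dy), so it moved by (-dx, -dy).
    return (-dx, -dy)
```

A per-pixel motion field, as described above, could be approximated by running this for a block centered on every pixel, at a correspondingly higher cost.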

The subject region extraction unit 305 uses the key region data input from the key region extraction unit 303 and the motion vector data input from the motion vector detection unit 304 to extract the region of the subject (subject region) in the real space image. As described above, the subject region is the region obtained by merging the region of the user's hand with the region of the arm. The unit then sends subject region data, which identifies the extracted subject region, to the image composition unit 308.

The image generation unit 307 first constructs a virtual space using the virtual space data held in the storage device 310. The virtual space data includes data on the virtual objects placed in the virtual space, data on the light sources placed in the virtual space, and so on. When a virtual object is composed of polygons, for example, its data includes normal vector data for the polygons, color data for the polygons, and coordinate position data for each vertex of the polygons. When texture mapping is applied to a virtual object, texture map data is also included in the virtual object data. The light source data includes, for example, data indicating the type of light source and data indicating its position and orientation.

After constructing the virtual space, the image generation unit 307 sets a viewpoint in the virtual space at the position and orientation indicated by the position and orientation information received from the position and orientation measurement unit 306. The image generation unit 307 then generates an image of the virtual space as seen from that viewpoint (a virtual space image). Techniques for generating an image of a virtual space as seen from a viewpoint with a given position and orientation are well known, so a description is omitted. The data of the generated virtual space image is sent to the image composition unit 308.

The image composition unit 308 superimposes the virtual space image represented by the data received from the image generation unit 307 on the real space image represented by the digital data received from the captured image capturing unit 302. In this superimposition, the virtual space image is not superimposed on the subject region indicated by the subject region data received from the subject region extraction unit 305. The image composition unit 308 converts the resulting mixed reality space image into a video signal and sends it to the display unit 309 of the HMD 390. An image of the mixed reality space corresponding to the position and orientation of the user's own viewpoint is thus presented in front of the eyes of the user wearing the HMD 390 on the head. Moreover, in that image, the virtual space image is not superimposed on the subject region (the region of the user's own hand and arm).
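
The masked superimposition described above can be sketched in Python as follows. The data layout is an assumption made for illustration: images are 2D lists of pixel values, a virtual pixel is `None` where no virtual object was rendered, and the subject region is given as a 2D list of booleans. The function name `composite` is hypothetical.

```python
def composite(real, virtual, subject_mask):
    """Overlay `virtual` on `real`, except where the subject mask is set.

    real         -- 2D list of real space pixels
    virtual      -- 2D list of virtual space pixels, None where nothing is drawn
    subject_mask -- 2D list of booleans, True inside the subject region
    """
    output = []
    for real_row, virtual_row, mask_row in zip(real, virtual, subject_mask):
        output.append([
            # Keep the real pixel inside the subject region or where no
            # virtual object was rendered; otherwise use the virtual pixel.
            r if (m or v is None) else v
            for r, v, m in zip(real_row, virtual_row, mask_row)
        ])
    return output
```

With this layout, a pixel covered by both the virtual object and the subject mask keeps its real value, which is what makes the hand and arm appear in front of the virtual object.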

In this embodiment, the mixed reality space image is sent to the display unit 309 of the HMD 390, but its destination is not particularly limited. For example, a display device such as a CRT or liquid crystal screen may be connected to the image processing apparatus 300, and the mixed reality space image may be output to that display device.

As described above, the storage device 310 holds the virtual space data, and the image generation unit 307 reads and uses this data as needed. The storage device 310 also stores pixel value data indicating the color of the user's hand (key color data).

The key color data is now described. Key color data can be described as coordinate values in a multidimensional color space. Many well-known color systems exist, including RGB, YIQ, YCbCr, YUV, HSV, L*u*v*, and L*a*b* (Japanese Standards Association, JIS Color Handbook).

Any color system suited to the color characteristics of the subject may be used for the key color data. However, to cancel out changes in the subject's color characteristics caused by differences in lighting conditions, it is desirable to use a color system that separates luminance information from hue information, and to use only the hue information. YIQ and YCbCr are typical examples of such color systems. This embodiment uses the YCbCr color system. The key color data stored in the storage device 310 is therefore obtained by acquiring the color of the user's hand in advance and converting it into YCbCr data.
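
The idea of matching on color information while ignoring luminance can be sketched in Python as follows. The conversion coefficients are the common full-range BT.601 (JFIF) ones; the text does not say which YCbCr variant is used, and the Euclidean distance test, the `tolerance` value, and the function names are assumptions made only for illustration.

```python
import math


def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to the (Cb, Cr) pair, discarding luminance Y.

    Uses full-range BT.601 (JFIF) coefficients, an assumed variant.
    """
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def key_region_mask(image, key_rgb, tolerance=12.0):
    """Mark pixels whose (Cb, Cr) lies within `tolerance` of the key color.

    image   -- 2D list of (r, g, b) tuples
    key_rgb -- previously registered color of the user's hand, as (r, g, b)
    """
    key_cb, key_cr = rgb_to_cbcr(*key_rgb)
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            cb, cr = rgb_to_cbcr(r, g, b)
            # Compare chroma only, so moderate brightness changes are tolerated.
            mask_row.append(math.hypot(cb - key_cb, cr - key_cr) <= tolerance)
        mask.append(mask_row)
    return mask
```

With a key color of `(200, 150, 120)`, a slightly darker pixel `(180, 135, 108)` is still matched, while a blue pixel `(30, 60, 200)` is rejected. A large brightness change also shifts Cb and Cr, so the tolerance would need tuning in practice.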

Next, the series of processes by which the image processing apparatus 300 generates an image of the mixed reality space and sends it to the display unit 309 of the HMD 390 is described with reference to FIG. 5, which shows a flowchart of the processing.

First, in step S501, the captured image capturing unit 302 acquires the real space image sent from the imaging unit 301 as digital data.

Next, in step S502, the key region extraction unit 303 extracts a key region (first region) from the real space image acquired by the captured image capturing unit 302 in step S501. The key region extraction unit 303 then generates key region data, which identifies the key region in that real space image, and sends it to the subject region extraction unit 305.

また動きベクトル検出部３０４は、ステップＳ５０１で撮影画像取込部３０２が取得した現フレームの現実空間画像と、現フレームより１フレーム前の現実空間画像とを用いて、現フレームの現実空間画像を構成する画素毎にフレーム間の動きベクトルを求める。そして動きベクトル検出部３０４は、画素毎に求めた動きベクトルのデータを、被写体領域抽出部３０５に送出する。 In addition, the motion vector detection unit 304 uses the real space image of the current frame acquired by the captured image capturing unit 302 in step S501 and the real space image one frame before the current frame to obtain an inter-frame motion vector for each pixel constituting the real space image of the current frame. The motion vector detection unit 304 then sends the motion vector data obtained for each pixel to the subject area extraction unit 305.

そして被写体領域抽出部３０５は、キー領域抽出部３０３が生成したキー領域データと、動きベクトル検出部３０４が生成した動きベクトルのデータと、を用いて、ステップＳ５０１で撮影画像取込部３０２が取得した現実空間画像中の被写体領域を抽出する。そして、被写体領域抽出部３０５は、ステップＳ５０１で撮影画像取込部３０２が取得した現実空間画像中の被写体領域をマスクしたマスク画像のデータを、上記被写体領域データとして生成する。本実施形態では上述の通り、被写体とはユーザの手と腕であるので、現実空間画像から手と腕が存在する領域を抽出し、その領域からマスク画像を生成する。ステップＳ５０２における処理の詳細については後述する。 The subject area extraction unit 305 then uses the key area data generated by the key area extraction unit 303 and the motion vector data generated by the motion vector detection unit 304 to extract the subject area in the real space image acquired by the captured image capturing unit 302 in step S501. The subject area extraction unit 305 then generates, as the subject area data, data of a mask image that masks the subject area in that real space image. In this embodiment, as described above, the subject is the user's hand and arm, so the region where the hand and arm are present is extracted from the real space image, and a mask image is generated from that region. Details of the processing in step S502 will be described later.

次にステップＳ５０３では、画像生成部３０７は、位置姿勢計測部３０６から位置姿勢情報を取得する。係る位置姿勢情報は上述の通り、ＨＭＤ３９０を頭部に装着するユーザの視点の位置姿勢であり、撮像部３０１の位置姿勢である。 In step S 503, the image generation unit 307 acquires position and orientation information from the position and orientation measurement unit 306. As described above, the position and orientation information is the position and orientation of the viewpoint of the user who wears the HMD 390 on the head, and is the position and orientation of the imaging unit 301.

次にステップＳ５０４では、画像生成部３０７は、記憶装置３１０から仮想空間のデータを読み出し、読み出したデータに基づいて仮想空間を構築する。そして画像生成部３０７は、仮想空間の構築後、係る仮想空間中に、ステップＳ５０３において位置姿勢計測部３０６から取得した位置姿勢情報が示す位置姿勢で視点を設定する。そして画像生成部３０７は、係る視点から見える仮想空間の画像（仮想空間画像）を生成する。 In step S504, the image generation unit 307 reads virtual space data from the storage device 310, and constructs a virtual space based on the read data. Then, after constructing the virtual space, the image generation unit 307 sets a viewpoint in the virtual space based on the position and orientation indicated by the position and orientation information acquired from the position and orientation measurement unit 306 in step S503. Then, the image generation unit 307 generates a virtual space image (virtual space image) that can be seen from the viewpoint.

次にステップＳ５０５では、画像合成部３０８は、ステップＳ５０１で撮影画像取込部３０２が取得したディジタルデータが示す現実空間画像上に、ステップＳ５０４で画像生成部３０７が生成した仮想空間画像を重畳させる処理を行う。係る重畳処理では、ステップＳ５０２で被写体領域抽出部３０５が生成した被写体領域データが示す被写体領域には仮想空間画像が重畳されないようにする。ステップＳ５０５における処理の詳細については後述する。 Next, in step S505, the image composition unit 308 superimposes the virtual space image generated by the image generation unit 307 in step S504 on the real space image indicated by the digital data acquired by the captured image capturing unit 302 in step S501. In this superimposition processing, the virtual space image is not superimposed on the subject area indicated by the subject area data generated by the subject area extraction unit 305 in step S502. Details of the processing in step S505 will be described later.

次にステップＳ５０６では、画像合成部３０８は、ステップＳ５０５における重畳処理によって生成した複合現実空間の画像を映像信号に変換してからＨＭＤ３９０が有する表示部３０９に送出する。 In step S506, the image composition unit 308 converts the mixed reality space image generated by the superimposition processing in step S505 into a video signal, and then transmits the video signal to the display unit 309 included in the HMD 390.

次に、画像処理装置３００が有する不図示の操作部を介して本処理の終了指示がユーザから入力された、若しくは本処理の終了条件が満たされた場合にはステップＳ５０７を介して本処理が終了する。一方、画像処理装置３００が有する不図示の操作部を介して本処理の終了指示がユーザから入力されていないし、本処理の終了条件も満たされていない場合には、処理はステップＳ５０７を介してステップＳ５０１に戻す。そして、次のフレームの複合現実空間の画像を表示部３０９に送出すべく、ステップＳ５０１以降の処理を行う。 Next, if the user has input an instruction to end this processing via an operation unit (not shown) of the image processing apparatus 300, or if an end condition of this processing is satisfied, the processing ends via step S507. On the other hand, if no end instruction has been input and the end condition is not satisfied, the processing returns to step S501 via step S507. The processing from step S501 onward is then performed in order to send the mixed reality space image of the next frame to the display unit 309.

次に、上記ステップＳ５０２における処理の詳細について説明する。図６は、上記ステップＳ５０２における処理の詳細を示すフローチャートである。 Next, details of the processing in step S502 will be described. FIG. 6 is a flowchart showing details of the processing in step S502.

先ずステップＳ６０１では、キー領域抽出部３０３は、記憶装置３１０からキーカラーデータを読み出す。そして、キー領域抽出部３０３は、ステップＳ５０１で撮影画像取込部３０２が取得した現実空間画像を構成する各画素のうち、記憶装置３１０から読み出したキーカラーデータが示す画素値を有する画素の集合をキー領域として抽出する。 First, in step S601, the key area extraction unit 303 reads the key color data from the storage device 310. Then, from among the pixels constituting the real space image acquired by the captured image capturing unit 302 in step S501, the key area extraction unit 303 extracts, as the key area, the set of pixels having the pixel value indicated by the key color data read from the storage device 310.

具体的には、キー領域抽出部３０３は、ステップＳ５０１で撮影画像取込部３０２が取得した現実空間画像を構成する各画素のうち、記憶装置３１０から読み出したキーカラーデータが示す画素値を有する画素については「１」を割り当てる。一方、キー領域抽出部３０３は、ステップＳ５０１で撮影画像取込部３０２が取得した現実空間画像を構成する各画素のうち、記憶装置３１０から読み出したキーカラーデータが示す画素値を有していない画素については「０」を割り当てる。即ち、現実空間画像において、手の領域を構成する各画素については「１」を割り当て、それ以外の領域を構成する各画素については「０」を割り当てる。 Specifically, among the pixels constituting the real space image acquired by the captured image capturing unit 302 in step S501, the key area extraction unit 303 assigns "1" to each pixel that has the pixel value indicated by the key color data read from the storage device 310, and assigns "0" to each pixel that does not. That is, in the real space image, "1" is assigned to each pixel constituting the hand region, and "0" is assigned to each pixel constituting any other region.

ここで、ステップＳ６０１における処理をより詳細に説明する。図７は、上記ステップＳ６０１における処理の詳細を示すフローチャートである。なお、図７のフローチャートは、現実空間画像中の画像座標（ｉ，ｊ）における画素について行う処理のフローチャートである。従って、実際にステップＳ６０１では、図７のフローチャートに従った処理を、現実空間画像を構成する各画素について行うことになる。 Here, the process in step S601 will be described in more detail. FIG. 7 is a flowchart showing details of the processing in step S601. Note that the flowchart in FIG. 7 is a flowchart of processing performed on a pixel at image coordinates (i, j) in the real space image. Therefore, in step S601, the process according to the flowchart of FIG. 7 is actually performed for each pixel constituting the real space image.

先ずステップＳ７０１では、キー領域抽出部３０３は、ステップＳ５０１で撮影画像取込部３０２が取得した現実空間画像において画像座標（ｉ、ｊ）における画素の画素値（本実施形態ではＲＧＢ値で表されているものとする）をＹＣｒＣｂ値に変換する。画像座標（ｉ、ｊ）における画素のＲ値をＲ（ｉ、ｊ）、Ｇ値をＧ（ｉ、ｊ）、Ｂ値をＢ（ｉ、ｊ）とする。この場合、ステップＳ７０１では、ＲＧＢ値をＹＣｒＣｂ値に変換する為の関数color_conversion()を用いて、Ｒ（ｉ、ｊ）、Ｇ（ｉ、ｊ）、Ｂ（ｉ、ｊ）を変換し、Ｙ値、Ｃｒ値、Ｃｂ値を得る。 First, in step S701, the key area extraction unit 303 converts the pixel value of the pixel at image coordinates (i, j) in the real space image acquired by the captured image capturing unit 302 in step S501 (assumed in this embodiment to be expressed as RGB values) into YCrCb values. Let the R value of the pixel at image coordinates (i, j) be R(i, j), the G value be G(i, j), and the B value be B(i, j). In step S701, R(i, j), G(i, j), and B(i, j) are converted using the function color_conversion(), which converts RGB values into YCrCb values, to obtain the Y value, Cr value, and Cb value.

次にステップＳ７０２では、ステップＳ７０１で求めたＹ、Ｃｒ、Ｃｂのそれぞれの値が表現する色が、記憶装置３１０から読み出したキーカラーデータが示す色に略同じであるのか否かを判断する。例えば、ステップＳ７０１で求めたＹ、Ｃｒ、Ｃｂのそれぞれの値が表現する色が、記憶装置３１０から読み出したキーカラーデータが示す色に略同じであるのか否かを、関数Key_area_func()を用いて判断する。係る関数Key_area_func()は、略同じであれば１を返し、略同じでなければ０を返す関数である。 Next, in step S702, it is determined whether the color represented by the Y, Cr, and Cb values obtained in step S701 is substantially the same as the color indicated by the key color data read from the storage device 310. For example, this determination is made using the function Key_area_func(). The function Key_area_func() returns 1 if the colors are substantially the same and 0 otherwise.

ここで、関数Key_area_func()による判断方法としては、例えば、Ｃｂ、Ｃｒで規定されるCbCr平面上における座標値（Ｃｒ、Ｃｂ）が、キーカラーデータの色分布の領域に属するか否かを判定する。判定結果は、例えば、キーカラーデータの色分布に属するのであれば１、属さないのであれば０と二値で表してもよいが、属する度合いを０から１までの連続値でもって表現するようにしても良い。 Here, as a determination method for the function Key_area_func(), for example, it is determined whether the coordinate value (Cr, Cb) on the CbCr plane defined by Cb and Cr belongs to the region of the color distribution of the key color data. The determination result may be expressed as a binary value, for example 1 if the value belongs to the color distribution of the key color data and 0 if it does not, or the degree of belonging may instead be expressed as a continuous value from 0 to 1.

そして係る関数Key_area_func()が返す値は、配列Key_area(ｉ、ｊ)に代入される。この配列Key_area(ｉ、ｊ)は、画像座標（ｉ、ｊ）における画素がキー領域を構成する画素であるか否かを示す値を格納する為のものである。 The value returned by the function Key_area_func() is then assigned to the array Key_area(i, j). The array Key_area(i, j) stores a value indicating whether the pixel at image coordinates (i, j) is a pixel constituting the key area.

そして、全てのｉ、ｊについて図７のフローチャートに従った処理を行うことで、配列Key_areaには、現実空間画像を構成する各画素について「１」若しくは「０」が保持されることになる。係る配列Key_areaが、上記キー領域データとなる。 Then, by performing the processing according to the flowchart of FIG. 7 for all i and j, the array Key_area holds “1” or “0” for each pixel constituting the real space image. The array Key_area is the key area data.
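
ステップＳ７０１、Ｓ７０２の流れを示す最小限のスケッチ。 The per-pixel flow of steps S701 and S702 can be sketched as follows. This is only a minimal illustration under stated assumptions: the conversion uses full-range BT.601 coefficients, and the key color data is modeled as a rectangular box `(cr_min, cr_max, cb_min, cb_max)` on the CbCr plane. Neither choice is specified in this description, and `extract_key_area` is a hypothetical helper name. The functions `color_conversion` and `key_area_func` mirror color_conversion() and Key_area_func() in the text.

```python
def color_conversion(r, g, b):
    """Convert 8-bit RGB to (Y, Cr, Cb).

    Assumption: full-range BT.601 coefficients; the description does not
    say which variant of the conversion is used.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 + 0.564 * (b - y)
    cr = 128.0 + 0.713 * (r - y)
    return y, cr, cb


def key_area_func(cr, cb, key_color):
    """Return 1 if (Cr, Cb) falls inside the key color region, else 0.

    Assumption: key_color is a box (cr_min, cr_max, cb_min, cb_max).
    Luminance Y is deliberately ignored, as the text recommends.
    """
    cr_min, cr_max, cb_min, cb_max = key_color
    return 1 if (cr_min <= cr <= cr_max and cb_min <= cb <= cb_max) else 0


def extract_key_area(image, key_color):
    """Build the Key_area array (rows of 0/1) from rows of (R, G, B) tuples."""
    key_area = []
    for row in image:
        out_row = []
        for r, g, b in row:
            _, cr, cb = color_conversion(r, g, b)  # step S701
            out_row.append(key_area_func(cr, cb, key_color))  # step S702
        key_area.append(out_row)
    return key_area
```

Because only Cr and Cb are tested, a brightly lit and a shaded pixel of the same skin tone can both fall inside the box, which is the reason the text gives for discarding luminance.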

なお、本実施形態では、撮影画像取込部３０２が取得した現実空間画像を構成する各画素の画素値はＲＧＢで表されているものとしているが、YIQやYUVで表されていても良い。その場合には、ステップＳ７０１における処理を省略し、ステップＳ７０２において（Ｃｂ、Ｃｒ）の代わりにそれぞれIQ空間やUV空間における座標値を用いればよい。 In the present embodiment, the pixel value of each pixel constituting the real space image acquired by the captured image capturing unit 302 is represented by RGB, but may be represented by YIQ or YUV. In that case, the processing in step S701 may be omitted, and coordinate values in IQ space and UV space may be used instead of (Cb, Cr) in step S702.

以上説明したように、キー領域抽出部３０３は、撮影画像取込部３０２が取得した現実空間画像を構成する各画素が、キー領域（手）を構成するものであるのか否かを示すキー領域データを生成する。 As described above, the key area extraction unit 303 indicates whether or not each pixel constituting the real space image acquired by the captured image capturing unit 302 constitutes a key area (hand). Generate data.

図６に戻って、次にステップＳ６０２では、動きベクトル検出部３０４は、ステップＳ５０１で撮影画像取込部３０２が取得した現フレームの現実空間画像を構成する画素毎にフレーム間の動きベクトルを求める。なお、本実施形態では、動きベクトルは現実空間画像を構成する各画素について求めるものとしたが、これに限定するものではなく、現実空間画像上の複数箇所における動きベクトルを求めれば良い。例えば、キー領域の近辺のみの画素について動きベクトルを求めるようにしても良い。これにより動きベクトルを求める為に要する時間コストを削減することができる。 Returning to FIG. 6, in step S602, the motion vector detecting unit 304 obtains a motion vector between frames for each pixel constituting the real space image of the current frame acquired by the captured image capturing unit 302 in step S501. . In the present embodiment, the motion vector is obtained for each pixel constituting the real space image. However, the present invention is not limited to this, and the motion vector may be obtained at a plurality of locations on the real space image. For example, a motion vector may be obtained for pixels only in the vicinity of the key area. Thereby, the time cost required for obtaining the motion vector can be reduced.

次にステップＳ６０３では、被写体領域抽出部３０５は、ステップＳ６０２で求めた各画素についての動きベクトルのうち、キー領域以外（第１の領域以外）の領域（非キー領域、他領域）を構成する各画素について求めた動きベクトルの大きさの平均を求める。そして求めた平均（代表動きベクトルの大きさ）が、予め定めた閾値以上であるか否かを判断する。係る判断の結果、求めた平均が閾値以上であれば処理をステップＳ６０４に進め、閾値未満である場合には処理をステップＳ６０６に進める。 In step S603, the subject area extraction unit 305 configures an area (non-key area, other area) other than the key area (other than the first area) among the motion vectors for each pixel obtained in step S602. The average of the magnitudes of motion vectors obtained for each pixel is obtained. Then, it is determined whether or not the obtained average (representative motion vector magnitude) is greater than or equal to a predetermined threshold. As a result of such determination, if the obtained average is equal to or greater than the threshold, the process proceeds to step S604, and if it is less than the threshold, the process proceeds to step S606.
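
ステップＳ６０３の分岐を示すスケッチ。 The branch in step S603 can be sketched as follows, assuming motion vectors are stored as `(dx, dy)` tuples in the same row/column layout as Key_area and that the "magnitude" is the Euclidean length. The function names and the string return values are hypothetical and used only for illustration.

```python
import math


def mean_non_key_magnitude(vectors, key_area):
    """Average motion vector length over pixels outside the key area."""
    total, count = 0.0, 0
    for j, row in enumerate(vectors):
        for i, (dx, dy) in enumerate(row):
            if key_area[j][i] == 0:  # non-key pixels only
                total += math.hypot(dx, dy)
                count += 1
    return total / count if count else 0.0


def next_step(vectors, key_area, threshold):
    """Return the step that follows S603: "S604" or "S606"."""
    mean = mean_non_key_magnitude(vectors, key_area)
    return "S604" if mean >= threshold else "S606"
```

Key-area pixels are excluded from the average so that a moving hand alone does not push the result over the threshold.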

なお、ここで述べている「動きベクトルの大きさ」とは、動きベクトル距離成分を示している。もちろん、動きベクトルの角度成分から「大きさ」を求めても良い。このように、動きベクトルの大きさを求めるための方法については特に限定するものではない。ここで、ステップＳ６０３で行う判断処理の意義について説明する。 The “motion vector magnitude” described here indicates a motion vector distance component. Of course, the “size” may be obtained from the angle component of the motion vector. Thus, the method for obtaining the magnitude of the motion vector is not particularly limited. Here, the significance of the determination process performed in step S603 will be described.

手の領域の動きベクトルと類似度の高い領域は、腕の領域として抽出する。これは手と腕はほとんどの場合において一緒に動作しているので、それぞれの動きベクトルには類似性があるということから実現している。ただし手だけが動いて腕はほとんど動かないという場合も考えられる。例えば、手首のみを回した場合がこれに相当する。この場合、そのまま被写体領域を抽出しようとしても腕は抽出されない。また、手も腕も全く動かずさらには撮像部３０１も動かない場合は、動きベクトルは算出されないので正常に被写体領域を抽出することができない。 A region having a high similarity to the motion vector of the hand region is extracted as an arm region. This is achieved because the hand and arm are moving together in most cases, and the motion vectors are similar. However, there are cases where only the hand moves and the arm hardly moves. For example, this is the case when only the wrist is turned. In this case, the arm is not extracted even if the subject area is extracted as it is. Further, when neither the hand nor the arm moves at all, and further, the imaging unit 301 does not move, the motion vector is not calculated, so that the subject area cannot be normally extracted.

そこで本実施形態では、ステップＳ６０３において、手が動いていない場合と、手と腕と撮像部３０１の全部が動いていない場合とを判断する処理を行う。つまり、非キー領域の動きベクトルの大きさがほとんど０の場合は、腕の領域が動いていないか、手と腕と撮像部３０１の全部が動いていないと判断し、その場合はステップＳ６０６の処理を行うことで問題を回避する。ステップＳ６０６における処理については後述する。 Therefore, in this embodiment, step S603 performs processing to detect the case where the hand is not moving and the case where none of the hand, the arm, and the imaging unit 301 is moving. That is, when the magnitude of the motion vectors in the non-key area is almost 0, it is determined either that the arm region is not moving or that none of the hand, the arm, and the imaging unit 301 is moving, and in that case the problem is avoided by performing the processing of step S606. The processing in step S606 will be described later.

次にステップＳ６０４では、被写体領域抽出部３０５は、ステップＳ６０１で抽出したキー領域と、ステップＳ６０２で求めた動きベクトルと、に基づいて、キー領域にマージすべき第２の領域を特定する。そして特定した第２の領域とキー領域とをマージした領域を被写体領域として得る。本実施形態では、腕の領域を第２の領域として特定する。そして、特定した腕の領域を、キー領域としての手の領域にマージすることで被写体領域を得る。 In step S604, the subject area extraction unit 305 identifies a second area to be merged with the key area based on the key area extracted in step S601 and the motion vector obtained in step S602. Then, an area obtained by merging the specified second area and the key area is obtained as a subject area. In the present embodiment, the arm region is specified as the second region. Then, the subject area is obtained by merging the specified arm area with the hand area as the key area.

ここで、ステップＳ６０４における処理の詳細について説明する。図８は、上記ステップＳ６０４における処理の詳細を示すフローチャートである。 Here, details of the processing in step S604 will be described. FIG. 8 is a flowchart showing details of the processing in step S604.

先ず、ステップＳ８０１では、被写体領域抽出部３０５は、各動きベクトルを、距離成分と角度成分を特徴としてそれぞれの特徴軸で正規化する。これにより各特徴の単位の違いにより値の重み付けがなされてしまうことを回避する（一般的な正規化）。例えば、特徴のパターン相互の距離を最小にすることで正規化を行う。 First, in step S801, the subject region extraction unit 305 normalizes each motion vector with each feature axis using a distance component and an angle component as features. This avoids weighting of values due to differences in units of features (general normalization). For example, normalization is performed by minimizing the distance between feature patterns.
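
ステップＳ８０１の正規化を示すスケッチ。 As a rough illustration of step S801, the sketch below converts each motion vector to a (distance, angle) feature and rescales each feature axis to the range 0 to 1. Min-max scaling is a substitute chosen here for simplicity; it is not the normalization named in the text, which only requires that differences in units do not weight one feature over the other. `to_features` and `normalize_features` are hypothetical names.

```python
import math


def to_features(vectors):
    """Map (dx, dy) motion vectors to (distance, angle) features."""
    return [(math.hypot(dx, dy), math.atan2(dy, dx)) for dx, dy in vectors]


def normalize_features(features):
    """Min-max scale each feature axis independently to [0, 1]."""
    def scale(values):
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant axis carries no information; map it to 0.
        return [0.0 if span == 0 else (v - lo) / span for v in values]

    distances = scale([f[0] for f in features])
    angles = scale([f[1] for f in features])
    return list(zip(distances, angles))
```

Note that this simple version treats the angle as a linear quantity, so directions near +π and -π end up far apart; a fuller implementation would need to handle that wrap-around.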

次に、ステップＳ８０２では、被写体領域抽出部３０５は、ステップＳ８０１で正規化された動きベクトルのうち、キー領域の特徴のみを特徴空間上でクラスタリングする。つまり、図９に示すようにベクトル距離成分軸（縦軸）と角度成分軸（横軸）とで規定される特徴空間上で、キー領域の特徴をクラスタリングする（特徴パターンの学習）。図９は、キー領域の特徴のみを特徴空間上でクラスタリングした結果の例を示す図である。 Next, in step S802, the subject region extraction unit 305 clusters only the features of the key region on the feature space among the motion vectors normalized in step S801. That is, as shown in FIG. 9, the features of the key area are clustered on the feature space defined by the vector distance component axis (vertical axis) and the angle component axis (horizontal axis) (feature pattern learning). FIG. 9 is a diagram illustrating an example of a result of clustering only the features of the key area on the feature space.

ここで必要に応じてクラスタリングされたキー領域の特徴のうちノイズ成分を除外してもよい。具体的には、特徴の数の少ないクラスや距離成分の小さいクラスなどはノイズとして除外する。 Here, noise components may be excluded from the features of the clustered key regions as necessary. Specifically, a class having a small number of features or a class having a small distance component is excluded as noise.

また、キー領域の特徴をクラスタリングするとしたが、クラスタリングする特徴をキー領域のエッジ領域のみの特徴とすることでノイズを除外してもよい。エッジ領域の抽出は既存のラベリングアルゴリズムで実現できる。 Although the features of the key area are clustered in the above description, noise may instead be excluded by clustering only the features of the edge region of the key area. The edge region can be extracted with an existing labeling algorithm.

次に、ステップＳ８０３では、被写体領域抽出部３０５は、正規化した全ての動きベクトルの特徴のうち、ステップＳ８０２でクラスタリングされたキー領域のクラスに含まれるものとして判断された特徴を、このクラスに含める処理を行う。即ち、図１０に示すように、動きベクトルの特徴空間においてキー領域のクラスに属する特徴は被写体領域クラスであるとみなすことで、被写体領域クラスとそれ以外を識別する。これから、被写体領域は、キーカラーデータを有する領域に加え、キー領域と類似した動きベクトル成分を有する領域も含むこととなる。図１０は、キー領域のクラスとその他のクラスに属する特徴とを示す図である。 Next, in step S803, from among the features of all the normalized motion vectors, the subject area extraction unit 305 adds to the key area class those features judged to fall within the class clustered in step S802. That is, as shown in FIG. 10, features that belong to the key area class in the motion vector feature space are regarded as belonging to the subject area class, which distinguishes the subject area class from everything else. As a result, the subject area includes not only the region having the key color data but also regions whose motion vector components are similar to those of the key area. FIG. 10 is a diagram showing the key area class and features belonging to other classes.

係る処理は、例えば以下の式に基づいて行われる。 Such processing is performed based on, for example, the following equation.

target_area( i, j ) = discriminant_func( whole_vec( i, j ) )
ここで(i,j)は現実空間画像における画素座標、target_areaは被写体領域である。また、discriminant_func（）はキー領域のクラスの動きベクトルの特徴によって学習されている被写体領域識別関数、whole_vecは現実空間画像の全体の動きベクトルである。なお、上述したように、動きベクトルは現実空間画像を構成する全ての画素について求めることに限定するものではないので、ステップＳ８０３では、全ての画素について行うのではなく、動きベクトルを求めた全ての画素について行うものである。 Here, (i, j) is a pixel coordinate in the real space image, and target_area is the subject area. Further, discriminant_func() is a subject area discrimination function learned from the motion vector features of the key area class, and whole_vec is the set of motion vectors of the entire real space image. As described above, motion vectors are not necessarily obtained for every pixel constituting the real space image, so step S803 is performed not on all pixels but on all pixels for which a motion vector was obtained.
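
上記の式の一つの解釈を示すスケッチ。 One way to read the expression above is sketched below. The description does not say how discriminant_func() is learned, so this example makes a strong simplifying assumption: the key area features form a single cluster, modeled by its centroid and a radius (the largest member distance times a hypothetical `margin`). `make_discriminant_func` and `classify` are illustrative names, and `whole_vec` is modeled as a dict from pixel coordinates to normalized features so that only pixels with a motion vector are evaluated.

```python
import math


def make_discriminant_func(key_features, margin=1.5, min_radius=0.05):
    """Learn a one-cluster model from key area features.

    Assumption: a centroid-plus-radius model stands in for the unspecified
    learning method. Returns a function mapping a feature to 1 or 0.
    """
    n = len(key_features)
    center = (sum(f[0] for f in key_features) / n,
              sum(f[1] for f in key_features) / n)
    radius = max(math.dist(center, f) for f in key_features) * margin
    radius = max(radius, min_radius)  # avoid a zero-size cluster

    def discriminant_func(feature):
        return 1 if math.dist(center, feature) <= radius else 0

    return discriminant_func


def classify(whole_vec, discriminant_func):
    """Compute target_area(i, j) for every pixel that has a motion vector."""
    return {pixel: discriminant_func(f) for pixel, f in whole_vec.items()}
```

Pixels whose features land near the key area cluster receive 1, so an arm moving with the hand is picked up even though its color differs from the key color.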

次に、ステップＳ８０４では、被写体領域抽出部３０５は、キー領域と類似した動きベクトル成分を有する画素としてステップＳ８０３で特定されたそれぞれの画素についてラベリング処理を行う。以下ではラベリングされた画素で構成される領域を追加領域と呼称する。例えば、腕の領域を構成する画素のみをラベリングする。またここでラベリングされる画素にはユーザの手、腕以外の領域を構成する画素が含まれている可能性がある。本来それらの画素は、背景となる領域を構成する画素である。例えばそれはユーザ以外の人間の手や腕かもしれない。 Next, in step S804, the subject region extraction unit 305 performs a labeling process for each pixel identified in step S803 as a pixel having a motion vector component similar to the key region. Hereinafter, an area composed of labeled pixels is referred to as an additional area. For example, only the pixels constituting the arm region are labeled. In addition, the pixels labeled here may include pixels constituting an area other than the user's hand and arm. Originally, these pixels are pixels that constitute a background region. For example, it may be a human hand or arm other than the user.

次に、ステップＳ８０５では、被写体領域抽出部３０５は、ステップＳ８０４でラベリングされた追加領域が被写体領域として適切か否かを判定する。ステップＳ８０５における判定で用いる判定基準は２つある。 In step S805, the subject area extraction unit 305 determines whether the additional area labeled in step S804 is appropriate as the subject area. There are two criteria used in the determination in step S805.

１つ目の基準は、追加領域がキー領域と連結しているかどうか（連結関係）である。つまり、被写体となるべき腕はもちろん手とつながっているので、追加領域がキー領域と連結しているかを判定し、連結していればこの追加領域は被写体領域であるとする。そしてそうでないものは被写体領域から除外する。 The first criterion is whether or not the additional area is linked to the key area (linkage relationship). That is, since the arm to be the subject is connected to the hand as a matter of course, it is determined whether the additional area is connected to the key area, and if it is connected, the additional area is assumed to be the subject area. Those that are not are excluded from the subject area.

２つ目の基準は、追加領域が現実空間画像の端の領域に属しているかどうかである。つまり、被写体領域となるべきである腕はもちろんユーザとつながっているので、ユーザは自身の手を見ている際は必ず手につながった腕がユーザの視界の端までつながっているはずである。従って、ユーザが見る画像の端の領域に腕の領域が存在するはずである。これらの理由から、追加領域が現実空間画像の端の領域に属しているか否かを判定し、属していればこの追加領域は被写体領域であるとする。そしてそうでないものは被写体領域から除外する。 The second criterion is whether or not the additional region belongs to the end region of the real space image. In other words, since the arm that should be the subject area is connected to the user as a matter of course, when the user looks at his / her hand, the arm connected to the hand must be connected to the end of the user's field of view. Therefore, there should be an arm region in the end region of the image viewed by the user. For these reasons, it is determined whether or not the additional region belongs to the end region of the real space image, and if it belongs, it is assumed that this additional region is the subject region. Those that are not are excluded from the subject area.

これらの処理によって、背景領域を被写体領域と誤認識することが軽減される。 By these processes, erroneous recognition of the background area as the subject area is reduced.

ここで２つ目の基準をより厳しくするために、追加領域が現実空間画像の左右か下の端の領域に属しているかを判断するようにしても良い。 Here, to make the second criterion stricter, it may instead be determined whether the additional area belongs to the left, right, or bottom edge region of the real space image.

ステップＳ８０６では、ステップＳ８０５で被写体領域として認識された追加領域を、キー領域にマージすることで、被写体領域を形成する。 In step S806, the subject area is formed by merging the additional area recognized as the subject area in step S805 with the key area.

被写体領域はつまり下記のようになる。 That is, the subject area is as follows.

被写体領域＝キー領域＋追加領域 Subject area = key area + additional area
ただし、追加領域は、ステップＳ８０５で被写体領域として適切であると判定された領域である。 Here, the additional area is an area determined in step S805 to be appropriate as the subject area.
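
ステップＳ８０４からＳ８０６までの流れを示すスケッチ。 Steps S804 to S806 can be sketched as follows. The sketch assumes 4-connectivity, treats "belongs to the edge region" as "contains at least one pixel on the image border", and requires both criteria of step S805 to hold; the description does not fix any of these details. `candidate` is the 0/1 result of step S803, and all function names are hypothetical.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumption)


def label_additional_regions(candidate, key_area):
    """Step S804: group non-key candidate pixels into connected regions."""
    h, w = len(candidate), len(candidate[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for j in range(h):
        for i in range(w):
            if candidate[j][i] != 1 or key_area[j][i] == 1 or seen[j][i]:
                continue
            region, queue = [], deque([(i, j)])
            seen[j][i] = True
            while queue:
                x, y = queue.popleft()
                region.append((x, y))
                for dx, dy in NEIGHBORS:
                    nx, ny = x + dx, y + dy
                    if (0 <= nx < w and 0 <= ny < h and not seen[ny][nx]
                            and candidate[ny][nx] == 1 and key_area[ny][nx] == 0):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


def is_valid_additional_region(region, key_area):
    """Step S805: region must touch the key area and the image border."""
    h, w = len(key_area), len(key_area[0])
    touches_key = any(
        0 <= x + dx < w and 0 <= y + dy < h and key_area[y + dy][x + dx] == 1
        for x, y in region for dx, dy in NEIGHBORS)
    touches_edge = any(x in (0, w - 1) or y in (0, h - 1) for x, y in region)
    return touches_key and touches_edge


def merge_subject_area(candidate, key_area):
    """Step S806: subject area = key area + accepted additional regions."""
    subject = [row[:] for row in key_area]
    for region in label_additional_regions(candidate, key_area):
        if is_valid_additional_region(region, key_area):
            for x, y in region:
                subject[y][x] = 1
    return subject
```

A region that moves like the hand but is detached from it, or that never reaches the image border, is dropped, which corresponds to the two rejection cases described for step S805.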

そして、図６のステップＳ６０５にリターンする。 Then, the process returns to step S605 in FIG.

ここで、以上の説明では、動きベクトルの距離成分と角度成分とを特徴として用いたが、距離成分のみを特徴として用いても良い。即ち、動きベクトルの特徴として用いるものは特に限定するものではなく、動きベクトル間の類似性を求めることができるのであれば、何れの特徴を用いても良い。 Here, in the above description, the distance component and the angle component of the motion vector are used as features, but only the distance component may be used as a feature. That is, what is used as the feature of the motion vector is not particularly limited, and any feature may be used as long as the similarity between the motion vectors can be obtained.

図６に戻ってステップＳ６０５では、被写体領域抽出部３０５は、ステップＳ５０１で撮影画像取込部３０２が取得した現実空間画像からステップＳ６０４で抽出した被写体領域の領域をマスクしたマスク画像のデータを、上記被写体領域データとして生成する。 Returning to FIG. 6, in step S 605, the subject region extraction unit 305 stores mask image data obtained by masking the subject region extracted in step S 604 from the real space image acquired by the captured image capturing unit 302 in step S 501. It is generated as the subject area data.

ここで、ステップＳ６０５における処理の詳細について説明する。図１１は、上記ステップＳ６０５における処理の詳細を示すフローチャートである。なお、図１１のフローチャートは、現実空間画像中の画像座標（ｉ，ｊ）における画素について行う処理のフローチャートである。従って、実際にステップＳ６０５では、図１１のフローチャートに従った処理を、現実空間画像を構成する各画素について行うことになる。 Here, details of the processing in step S605 will be described. FIG. 11 is a flowchart showing details of the processing in step S605. In addition, the flowchart of FIG. 11 is a flowchart of the process performed about the pixel in the image coordinate (i, j) in a real space image. Therefore, in step S605, the process according to the flowchart of FIG. 11 is actually performed for each pixel constituting the real space image.

ステップＳ１１０１では、被写体領域抽出部３０５は、画像座標（ｉ、ｊ）における画素が被写体領域として上記ステップＳ８０５で認識されたのであれば、配列Key_area(ｉ、ｊ)に「１」を書き込む。係る動作は、関数mask_func()を実行することでなされる。これにより、配列Key_area(ｉ、ｊ)は、画像座標（ｉ、ｊ）における画素が被写体領域を構成する画素であるか否かを示す値を格納する為の２次元配列となる。 In step S1101, the subject area extraction unit 305 writes “1” in the array Key_area (i, j) if the pixel at the image coordinates (i, j) is recognized as the subject area in step S805. Such an operation is performed by executing the function mask_func (). Thereby, the array Key_area (i, j) becomes a two-dimensional array for storing values indicating whether or not the pixel at the image coordinate (i, j) is a pixel constituting the subject area.

そして、全てのｉ、ｊについて図１１のフローチャートに従った処理を行うことで、配列Key_areaには、現実空間画像を構成する各画素について「１」若しくは「０」が保持されることになる。係る配列Key_areaが、上記被写体領域データとなる。 Then, by performing processing according to the flowchart of FIG. 11 for all i and j, the array Key_area holds “1” or “0” for each pixel constituting the real space image. The array Key_area is the subject area data.

なお、本実施形態では生成された被写体領域データが示すマスク画像はマスク領域内にノイズを含む場合がある。この場合は既存の凸閉方処理を行う。 Note that in this embodiment the mask image indicated by the generated subject area data may contain noise inside the mask region. In that case, existing convex hull (convex closure) processing is applied.

一方、図６に戻って、ステップＳ６０６では、被写体領域抽出部３０５は、１フレーム前に生成した被写体領域データを、現フレームでも使用するものとして設定する。これは前述したように、動きベクトルから被写体領域を正常に抽出できない場合は前回生成したマスク画像で代用する処理である。しかし、ステップＳ６０６では、キー領域部分は常に更新し、追加領域のみ更新されていないマスク画像を示す被写体領域データを出力してもよい。キー領域部分を常に更新するということはキーカラーデータに基づいて領域を抽出するということである。従って係る処理を行えば、キー領域の形状（手の形状）は必ずキーカラーデータによって正確に抽出されることが保障される。 On the other hand, returning to FIG. 6, in step S606, the subject area extraction unit 305 sets the subject area data generated one frame before as being used in the current frame. As described above, this is a process for substituting the previously generated mask image when the subject region cannot be normally extracted from the motion vector. However, in step S606, the key area portion may be constantly updated, and subject area data indicating a mask image in which only the additional area is not updated may be output. To always update the key area portion means to extract the area based on the key color data. Therefore, if such processing is performed, it is ensured that the shape of the key area (hand shape) is always accurately extracted by the key color data.
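
ステップＳ６０６の二つの動作を示すスケッチ。 The two behaviors described for step S606 can be sketched as follows. The sketch assumes the previous frame's Key_area is still available, so that the previous additional area can be recovered as "previous subject area minus previous key area"; the description does not state how the additional area is retained. `fallback_subject_area` and its `refresh_key` flag are hypothetical.

```python
def fallback_subject_area(prev_subject, prev_key_area, current_key_area,
                          refresh_key=True):
    """Step S606: build the mask when motion vectors are unusable.

    refresh_key=False reuses the previous frame's mask unchanged.
    refresh_key=True keeps the previous additional area but replaces the
    key area with the one extracted from the current frame.
    """
    if not refresh_key:
        return [row[:] for row in prev_subject]
    result = []
    for j, row in enumerate(current_key_area):
        out_row = []
        for i, key in enumerate(row):
            prev_additional = (prev_subject[j][i] == 1
                               and prev_key_area[j][i] == 0)
            out_row.append(1 if key == 1 or prev_additional else 0)
        result.append(out_row)
    return result
```

With `refresh_key=True` the hand outline always follows the current key color extraction, while the arm region is carried over from the last frame in which it could be estimated.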

そして、ステップＳ６０５，Ｓ６０６の何れの処理の後も、図５のステップＳ５０３にリターンする。 Then, after any processing in steps S605 and S606, the process returns to step S503 in FIG.

次に、図５のステップＳ５０５における、複合現実空間の画像の生成処理の詳細について説明する。図１２は、上記ステップＳ５０５における、複合現実空間の画像の生成処理の詳細を示すフローチャートである。なお、図１２のフローチャートは、複合現実空間の画像中の画像座標（ｉ，ｊ）における画素について行う処理のフローチャートである。従って、実際にステップＳ５０５では、図１２のフローチャートに従った処理を、複合現実空間の画像を構成する各画素について行うことになる。 Next, details of the mixed reality space image generation processing in step S505 of FIG. 5 will be described. FIG. 12 is a flowchart showing details of the mixed reality space image generation processing in step S505. Note that the flowchart of FIG. 12 is a flowchart of processing performed on a pixel at image coordinates (i, j) in an image of the mixed reality space. Therefore, in step S505, the process according to the flowchart of FIG. 12 is actually performed for each pixel constituting the mixed reality space image.

先ずステップＳ１２０１では、画像合成部３０８は、次のような処理を行う。即ち、ステップＳ５０１で撮影画像取込部３０２が取得したディジタルデータが示す現実空間画像において画像座標（ｉ，ｊ）の画素real（ｉ，ｊ）を、画像処理装置３００内のフレームメモリbuffer(i,j)に転送する。 First, in step S1201, the image composition unit 308 performs the following processing. That is, it transfers the pixel real(i, j) at image coordinates (i, j) in the real space image indicated by the digital data acquired by the captured image capturing unit 302 in step S501 to the frame memory buffer(i, j) in the image processing apparatus 300.

次にステップＳ１２０２では、上記ステップＳ５０２で生成した被写体領域データが示すマスク画像のうち画像座標（ｉ、ｊ）に対応するデータKey_area(ｉ，ｊ)を画像処理装置３００内のステンシルバッファstencil(i,j)に転送する。 Next, in step S1202, the data Key_area(i, j) corresponding to image coordinates (i, j) in the mask image indicated by the subject area data generated in step S502 is transferred to the stencil buffer stencil(i, j) in the image processing apparatus 300.

次にステップＳ１２０３では、画像合成部３０８は、stencil(i,j)＝０である場合には、上記ステップＳ５０４で生成した仮想空間画像において画像座標（ｉ，ｊ）の画素CGI(i,j)を、フレームメモリbuffer(i,j)に上書きする。一方、画像合成部３０８は、stencil(i,j)＝１である場合には、フレームメモリbuffer(i,j)に対しては何も処理しない。即ち、被写体領域については、仮想空間画像の重畳対象外とする。 Next, in step S1203, if stencil(i, j) = 0, the image composition unit 308 overwrites the frame memory buffer(i, j) with the pixel CGI(i, j) at image coordinates (i, j) in the virtual space image generated in step S504. On the other hand, if stencil(i, j) = 1, the image composition unit 308 performs no processing on the frame memory buffer(i, j). That is, the subject area is excluded from superimposition of the virtual space image.

そして、全てのｉ、ｊについて図１２のフローチャートに従った処理を行うことで、フレームメモリbufferには、複合現実空間の画像が生成されることになる。そして、ステップＳ５０６では、この複合現実空間の画像を映像信号としてＨＭＤ３９０が有する表示部３０９に送出する。 Then, by performing processing according to the flowchart of FIG. 12 for all i and j, an image of the mixed reality space is generated in the frame memory buffer. In step S506, the mixed reality space image is sent to the display unit 309 included in the HMD 390 as a video signal.
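
ステップＳ１２０１からＳ１２０３までの流れを示すスケッチ。 Steps S1201 to S1203 can be sketched as follows, with images modeled as nested lists of pixel values. One detail is an assumption added here and not stated in the text: a `None` entry in the virtual space image stands for "no virtual object drawn at this pixel", in which case the real pixel is kept. `compose_mixed_reality` is a hypothetical name; `buffer`, `stencil`, `real`, and `cgi` follow the names in the description.

```python
def compose_mixed_reality(real, cgi, key_area):
    """Overlay the virtual image on the real image outside the subject area."""
    h, w = len(real), len(real[0])
    # Step S1201: copy the real space image into the frame memory.
    buffer = [[real[j][i] for i in range(w)] for j in range(h)]
    # Step S1202: copy the subject area mask into the stencil buffer.
    stencil = [[key_area[j][i] for i in range(w)] for j in range(h)]
    # Step S1203: write CG pixels only where the stencil is 0.
    for j in range(h):
        for i in range(w):
            if stencil[j][i] == 0 and cgi[j][i] is not None:
                buffer[j][i] = cgi[j][i]
    return buffer
```

Because pixels with stencil value 1 are never overwritten, the hand and arm remain visible in front of the virtual object regardless of where the object is rendered.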

以上の説明により本実施形態によれば、現実空間画像上に仮想空間画像を重畳させる際、現実空間画像中に「手」と「腕」とが被写体として含まれている場合は、この被写体は常に仮想空間画像よりも手前に表示されるように、重畳処理を制御することができる。 As described above, according to the present embodiment, when a virtual space image is superimposed on a real space image, if “hand” and “arm” are included as subjects in the real space image, the subject is The superimposition process can be controlled so that it is always displayed in front of the virtual space image.

図１３は、本実施形態によって生成される複合現実空間の画像の一例を示す図である。図１３に示した複合現実空間の画像１３０１は、現実空間画像、仮想空間画像がそれぞれ図２に示した現実空間画像２０１、仮想空間画像２０２である場合に生成されるものである。図１３に示す如く、手の領域１５０ａはもちろんのこと、手の領域１５０ａとは異なる画素値を有する腕の領域１５０ｂさえも、仮想空間画像２０２の手前に表示することができる。 FIG. 13 is a diagram illustrating an example of an image of the mixed reality space generated by the present embodiment. The mixed reality space image 1301 shown in FIG. 13 is generated when the real space image and the virtual space image are the real space image 201 and the virtual space image 202 shown in FIG. 2, respectively. As shown in FIG. 13, not only the hand region 150 a but also the arm region 150 b having a pixel value different from that of the hand region 150 a can be displayed in front of the virtual space image 202.

［第２の実施形態］ [Second Embodiment]
第１の実施形態の冒頭でも述べたように、被写体の領域は、「手」の領域と「腕」の領域とをマージしたものに限定するものではなく、どのような領域をマージして被写体の領域を形成しても良い。即ち、以下の説明は、被写体の領域が異なる複数の画素値で表示されるようなものであれば、どのような被写体の領域でも良い。 As described at the beginning of the first embodiment, the subject area is not limited to a merge of the "hand" area and the "arm" area; the subject area may be formed by merging any areas. That is, the following description applies to any subject area, provided that the subject area is displayed with a plurality of different pixel values.

例えば、ユーザが手で把持している現実物体を追加領域として判断して被写体領域を決定しても良い。これにより、ユーザの手、腕に加えて、手で把持している現実物体をも、仮想空間画像の手前に表示することができる。 For example, the subject area may be determined by determining a real object held by the user's hand as an additional area. Thereby, in addition to the user's hand and arm, a real object grasped by the hand can be displayed in front of the virtual space image.

この場合、第１の実施形態において、ステップＳ８０５における処理を以下のように変更すれば良い。 In this case, the processing in step S805 of the first embodiment may be changed as follows.

第１の実施形態では、ステップＳ８０５において２つの判断基準を設けていた。本実施形態ではそのうちの１つの判断基準を次のような判断基準に変更する。 In the first embodiment, two determination criteria are provided in step S805. In the present embodiment, one of the judgment criteria is changed to the following judgment criteria.

具体的には、追加領域が現実空間画像の端の領域に属しているかどうかの判定をなくす。これは、手で把持される現実物体が必ずしも現実空間画像の端にかかっているわけではないためである。 Specifically, the determination of whether the additional region belongs to the edge region of the real space image is removed. This is because a real object held in the hand does not necessarily reach the edge of the real space image.

本実施形態ではその代わりの判定処理として、キー領域と追加領域（ステップＳ８０５の判定済みの領域）とをマージすることで得られる被写体領域が現実空間画像の端にかかっているかどうかを基準として判定を行う。 In the present embodiment, the replacement determination uses, as its criterion, whether the subject region obtained by merging the key region and the additional region (the region already determined in step S805) reaches the edge of the real space image.

これより手に把持されていない領域の誤認識を回避する。 This avoids erroneous recognition of areas that are not gripped by the hand.
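
The modified criterion can be sketched as follows. This is only an illustration under assumptions: regions are represented as boolean masks of equal size, and the helper names `merge_masks`, `touches_image_edge`, and `accept_additional_region` are hypothetical. The sketch does not check that the additional region is connected to the key region.

```python
from typing import List

Mask = List[List[bool]]


def merge_masks(key: Mask, additional: Mask) -> Mask:
    """Pixel-wise union of the key region and the additional region."""
    return [[k or a for k, a in zip(key_row, add_row)]
            for key_row, add_row in zip(key, additional)]


def touches_image_edge(mask: Mask) -> bool:
    """True if any region pixel lies on the outermost rows or columns."""
    if not mask or not mask[0]:
        return False
    top_bottom = any(mask[0]) or any(mask[-1])
    left_right = any(row[0] or row[-1] for row in mask)
    return top_bottom or left_right


def accept_additional_region(key: Mask, additional: Mask) -> bool:
    """Second-embodiment criterion: test the merged region, not the additional region alone."""
    return touches_image_edge(merge_masks(key, additional))
```

An object held in the middle of the image is accepted here because the hand and arm it is merged with reach the image edge, whereas a merged region floating entirely inside the image is rejected.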

［第３の実施形態］ [Third Embodiment]
第１，２の実施形態では、キー領域と追加領域とを毎フレーム毎に算出し、それらに基づいて被写体領域を決定した。即ち、キー領域と追加領域とに基づいて被写体領域を求める処理をフレーム毎に行っていた。本実施形態では、初期領域を指定するためのみキー領域と追加領域とを算出し、その後の被写体領域の更新は自動輪郭抽出処理によって行う。 In the first and second embodiments, the key region and the additional region are calculated for every frame, and the subject region is determined based on them. That is, the process of obtaining the subject region from the key region and the additional region is performed for each frame. In the present embodiment, the key region and the additional region are calculated only to designate the initial region, and the subject region is thereafter updated by automatic contour extraction.

本実施形態では、初期登録された被写体領域を安定して毎回更新することが可能となる。ここでいう安定とは、例えば新しくキー領域に含まれる現実物体が現れても変わらない領域を抽出可能ということである。ここで、本実施形態では、第１、２の実施形態とステップＳ５０２における処理のみが異なる。 In the present embodiment, the initially registered subject area can be stably updated every time. Here, “stable” means that, for example, it is possible to extract a region that does not change even if a new real object included in the key region appears. Here, this embodiment is different from the first and second embodiments only in the processing in step S502.

図１４は、本実施形態で行う、ステップＳ５０２における処理のフローチャートである。図１４に示すフローチャートは、図６に示したフローチャートからステップＳ６０３における処理と、ステップＳ６０６における処理とが削除されている。そして代わりに、ステップＳ６０４とステップＳ６０５との間に、ステップＳ１４０１の動的輪郭対象登録の処理と、ステップＳ１４０２の動的輪郭抽出の処理が追加されている。 FIG. 14 is a flowchart of the processing in step S502 performed in the present embodiment. In the flowchart shown in FIG. 14, the process in step S603 and the process in step S606 are deleted from the flowchart shown in FIG. Instead, between the steps S604 and S605, the dynamic contour object registration processing at step S1401 and the dynamic contour extraction processing at step S1402 are added.

ステップＳ１４０１では、被写体領域抽出部３０５は、ステップＳ６０４で抽出された被写体領域を動的輪郭抽出の対象として登録する。 In step S1401, the subject region extraction unit 305 registers the subject region extracted in step S604 as a target for dynamic contour extraction.

次にステップＳ１４０２では、被写体領域抽出部３０５は、ステップＳ１４０１で登録された被写体領域の動的輪郭抽出を行う。動的輪郭抽出はスネークなどの既存のアルゴリズムを用いればよい。動的輪郭抽出は既存の技術であるので説明は省略する。 In step S1402, the subject region extraction unit 305 performs dynamic contour extraction of the subject region registered in step S1401. For the active contour extraction, an existing algorithm such as a snake may be used. Since the active contour extraction is an existing technique, description thereof is omitted.

ステップＳ６０５では、被写体領域抽出部３０５は、ステップＳ１４０２で抽出された被写体領域に基づきマスク画像（被写体領域データ）を生成して出力する。 In step S605, the subject region extraction unit 305 generates and outputs a mask image (subject region data) based on the subject region extracted in step S1402.

［第４の実施形態］ [Fourth Embodiment]
上記実施形態では、撮像部３０１が撮像した現実空間画像から算出される動きベクトルのみから被写体領域を特定した。しかし、被写体領域の特定は係る方法に限定するものではない。例えば、現実空間画像から算出される動きベクトルを、撮像部３０１の位置および姿勢の変化から生じる動きベクトルを用いて補正することで得られる動きベクトルから被写体領域を特定してもよい。 In the above embodiments, the subject region is specified only from the motion vector calculated from the real space image captured by the imaging unit 301. However, identification of the subject region is not limited to this method. For example, the subject region may be specified from a motion vector obtained by correcting the motion vector calculated from the real space image with the motion vector arising from the change in the position and orientation of the imaging unit 301.

撮像部３０１が移動または回転する場合、現実空間画像のみから算出される動きベクトルでは、被写体領域を抽出する際に誤差が生じやすい。なぜならば、現実空間画像のみから算出される動きベクトルは、被写体の動きベクトルだけではなく撮像部３０１の動きベクトルをも含んでいるからである。例えば、被写体の動きとは逆の方向に撮像部３０１が動いた場合、被写体の動きベクトルのいくつかを打ち消してしまう可能性がある。 When the imaging unit 301 moves or rotates, an error is likely to occur when extracting a subject area with a motion vector calculated only from a real space image. This is because the motion vector calculated only from the real space image includes not only the motion vector of the subject but also the motion vector of the imaging unit 301. For example, when the imaging unit 301 moves in the direction opposite to the movement of the subject, some of the motion vectors of the subject may be canceled.

そこで本実施形態では、現実空間画像から算出される動きベクトルから、撮像部３０１の位置および姿勢変化から生じる動きベクトルの影響を差し引くことで、被写体の動きベクトルを算出する。そして、その算出結果としての動きベクトルから被写体領域を特定する。この場合、第１の実施形態においてステップＳ５０２における処理を以下のように変更すればよい。 Therefore, in the present embodiment, the motion vector of the subject is calculated by subtracting, from the motion vector calculated from the real space image, the influence of the motion vector arising from the change in the position and orientation of the imaging unit 301. The subject region is then specified from the resulting motion vector. In this case, the processing in step S502 of the first embodiment may be changed as follows.

第１の実施形態では、ステップＳ５０２内の、更にステップＳ６０２では、現実空間画像からのみ動きベクトルを算出するとした。本実施形態では、動きベクトルの算出方法を下記のように変更する。 In the first embodiment, step S602, which is part of step S502, calculates the motion vector only from the real space image. In the present embodiment, the motion vector calculation method is changed as follows.

図１６は、ステップＳ６０２において本実施形態で行う処理のフローチャートである。本実施形態ではステップＳ６０２では、撮像部３０１（撮像装置）の位置および姿勢の変化に基づいて、現実空間画像からのみ動きベクトルを補正する。 FIG. 16 is a flowchart of the processing performed in step S602 in the present embodiment. In the present embodiment, step S602 corrects the motion vector calculated only from the real space image, based on the change in the position and orientation of the imaging unit 301 (imaging device).

ステップＳ１５０１では、動きベクトル検出部３０４は、現実空間画像から動きベクトルを算出する。ステップＳ１５０１における処理は、第１の実施形態で説明したステップＳ６０２における処理と同じである。 In step S1501, the motion vector detection unit 304 calculates a motion vector from the real space image. The process in step S1501 is the same as the process in step S602 described in the first embodiment.

次に、ステップＳ１５０２では、動きベクトル検出部３０４は、撮像部３０１の姿勢変化による動きベクトル（姿勢変化分動きベクトル）の情報を用いて、ステップＳ１５０１で算出された動きベクトルを補正する。 Next, in step S1502, the motion vector detection unit 304 corrects the motion vector calculated in step S1501 using information on a motion vector (posture change motion vector) due to a posture change of the imaging unit 301.

より詳しくは、先ず、動きベクトル検出部３０４は、位置姿勢計測部３０６から撮像部３０１の姿勢情報を得る。ここで、動きベクトル検出部３０４は、予め前フレームにおける撮像部３０１の位置および姿勢情報を保持しているとする。動きベクトル検出部３０４は、前フレームにおける撮像部３０１の姿勢情報と現フレームにおける撮像部３０１に姿勢情報とから、姿勢の変化量を算出する。この姿勢変化量から、姿勢変化によって生じる動きベクトル（姿勢変化分動きベクトル）を求める。なお、動きベクトルの算出技術は周知の技術であるので、これについての詳細な説明は省略する。なお、ここで姿勢変化とは、撮像部３０１のレンズ中心を軸として光軸が回転することをいう。 More specifically, first, the motion vector detection unit 304 obtains posture information of the imaging unit 301 from the position / orientation measurement unit 306. Here, it is assumed that the motion vector detection unit 304 holds the position and orientation information of the imaging unit 301 in the previous frame in advance. The motion vector detection unit 304 calculates the amount of change in posture from the posture information of the imaging unit 301 in the previous frame and the posture information of the imaging unit 301 in the current frame. From this posture change amount, a motion vector (posture change motion vector) generated by the posture change is obtained. Since the motion vector calculation technique is a well-known technique, a detailed description thereof will be omitted. Here, the posture change means that the optical axis rotates about the lens center of the imaging unit 301.

次に、算出された姿勢変化分動きベクトルを、撮像部３０１の画像平面に射影することで、画像上での動きベクトルとして変換する。 Next, the calculated posture change motion vector is projected onto the image plane of the imaging unit 301 to be converted as a motion vector on the image.

そして動きベクトル検出部３０４は、画像平面に射影された姿勢変化分動きベクトルを用いて、ステップＳ１５０１で算出された動きベクトルを補正する。係る補正は以下の式に基づいて行う。 Then, the motion vector detection unit 304 corrects the motion vector calculated in step S1501 using the posture change motion vector projected onto the image plane. Such correction is performed based on the following equation.

M’=M−Rv・I （式１） M' = M − Rv·I (Formula 1)
ここで、M’は姿勢変化分動きベクトルを現実空間画像内の動きベクトルから差し引いた動きベクトルを表す行列、Mは現実空間画像から算出された動きベクトルを表す行列である。また、Rvは画像平面上に射影された姿勢変化ベクトル、Iは単位行列（行列Mと大きさが同じ行列）である。 Here, M' is a matrix representing the motion vectors obtained by subtracting the posture change motion vector from the motion vectors in the real space image, and M is a matrix representing the motion vectors calculated from the real space image. Rv is the posture change vector projected onto the image plane, and I is a unit matrix (a matrix of the same size as the matrix M).

このように、現実空間画像から算出された動きベクトルから姿勢変化による動きベクトルを減じる。 Thus, the motion vector due to the posture change is subtracted from the motion vector calculated from the real space image.

図１６に戻って、次に、ステップＳ１５０３では、動きベクトル検出部３０４は、撮像部３０１の位置変化による動きベクトル（位置変化分動きベクトル）の情報を用いて、ステップＳ１５０１で算出された動きベクトルを補正する。 Returning to FIG. 16, next, in step S1503, the motion vector detection unit 304 uses the information on the motion vector (position change motion vector) due to the position change of the imaging unit 301, and the motion vector calculated in step S1501. Correct.

より詳しくは、先ず、動きベクトル検出部３０４は、位置姿勢計測部３０６から撮像部３０１の位置情報を得る。動きベクトル検出部３０４は、前フレームにおける撮像部３０１の位置情報と現フレームにおける撮像部３０１の位置情報とから、位置の変化量を算出する。この位置変化量から、位置変化によって生じる動きベクトルを求める。なお位置変化とは、撮像部３０１のレンズ中心を軸として並進移動した場合の位置変化のことをいう。 More specifically, the motion vector detection unit 304 first obtains position information of the imaging unit 301 from the position/orientation measurement unit 306. The motion vector detection unit 304 calculates the amount of change in position from the position information of the imaging unit 301 in the previous frame and that in the current frame. From this position change amount, the motion vector generated by the position change is obtained. Here, the position change refers to the change in position when the lens center of the imaging unit 301 undergoes a translational movement.

次に、位置変化分動きベクトルを、撮像部３０１の画像平面に射影することで、画像上での動きベクトルとして変換する。ここで、位置変化分動きベクトルを画像平面に射影する際は、姿勢変化分動きベクトルの画像平面射影とは異なり、被写体までの奥行き情報を考慮する必要がある。なぜならば、画像平面上に射影される位置変化分動きベクトルは、被写体までの奥行き距離に応じて異なるからである。具体的には、被写体までの距離が大きくなるにつれて、位置変化分動きベクトルの大きさは小さくなる。 Next, the position change motion vector is projected onto the image plane of the imaging unit 301 and thereby converted into a motion vector on the image. Unlike the image plane projection of the posture change motion vector, projecting the position change motion vector onto the image plane requires the depth to the subject to be taken into account. This is because the position change motion vector projected onto the image plane differs depending on the depth distance to the subject. Specifically, as the distance to the subject increases, the magnitude of the projected position change motion vector decreases, consistent with (Formula 2), in which Tv is inversely proportional to z.

従って、動きベクトル検出部３０４は、画像平面上に射影される位置変化分動きベクトルを算出するために被写体までの奥行き距離を測定する。 Therefore, the motion vector detection unit 304 measures the depth distance to the subject in order to calculate the position change motion vector projected onto the image plane.

本実施形態では、被写体はステレオビデオカメラから成るＨＭＤ３９０によって撮像されているので、ステレオマッチング法により奥行き距離を測定する。ステレオマッチング法は周知の技術であるので、これについての説明は省略する。 In this embodiment, since the subject is imaged by the HMD 390 composed of a stereo video camera, the depth distance is measured by the stereo matching method. Since the stereo matching method is a well-known technique, a description thereof will be omitted.

本実施形態では、奥行き距離の測定をステレオマッチング法によって行うとしているが、係る方法に限定するものではない。例えば、奥行き距離の測定を赤外式距離測定カメラを用いて行っても良い。即ち、距離を測定できる方法であればどのような方法を用いても良い。また、位置変化分動きベクトルを算出するために、ユーザが奥行き距離を設定するようにしても良い。 In the present embodiment, the depth distance is measured by the stereo matching method, but is not limited to this method. For example, the depth distance may be measured using an infrared distance measuring camera. In other words, any method may be used as long as it can measure the distance. Further, the depth distance may be set by the user in order to calculate the position change motion vector.

奥行き距離が測定されると、動きベクトル検出部３０４は、画像平面上に射影される位置変化分動きベクトルの算出を以下の式に基づいて行う。 When the depth distance is measured, the motion vector detection unit 304 calculates a position change motion vector projected on the image plane based on the following equation.

Tv＝f・t／z （式２） Tv = f·t / z (Formula 2)
ここで、Tvは画像平面上に射影された位置変化分動きベクトル、fは撮像部３０１のレンズから結像面までの距離である。また、tは撮像部３０１の位置変化によって生じた動きベクトル、zは被写体までの奥行き距離である。 Here, Tv is the position change motion vector projected onto the image plane, and f is the distance from the lens of the imaging unit 301 to the imaging plane. In addition, t is the motion vector generated by the change in the position of the imaging unit 301, and z is the depth distance to the subject.
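
(Formula 2) can be evaluated directly, as in the following minimal sketch. The function name `projected_translation_vector` and the two-component tuple representation of t are assumptions made for illustration; f, t, and z must be expressed in mutually consistent units.

```python
from typing import Tuple


def projected_translation_vector(f: float,
                                 t: Tuple[float, float],
                                 z: float) -> Tuple[float, float]:
    """Evaluate Tv = f * t / z component-wise for a camera translation t at depth z."""
    if z <= 0:
        raise ValueError("depth z must be positive")
    return (f * t[0] / z, f * t[1] / z)
```

For the same translation, doubling the depth z halves the projected vector, which is why the depth to the subject has to be measured or set before this correction can be applied.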

図１７は、画像平面上に射影された位置変化分動きベクトルTvを算出する原理を示す図である。図１７では、現在よりも一つ前のフレームから現在のフレームに変わった際に撮像部３０１が並進移動した（tだけ並進移動した）例を示している（被写体は固定とする）。また、図１７では、画像面を規定するＸＹ座標軸のうち、Ｘ軸方向に撮像部３０１が並進した場合を示している。ここでは説明を簡単にするため、Ｘ軸方向のみの移動を想定しているが、ここで説明する方法の原理は、移動方向がＹ軸成分を有する場合にも適用可能であることは言うまでもない。 FIG. 17 is a diagram illustrating the principle of calculating the position change motion vector Tv projected onto the image plane. FIG. 17 shows an example in which the imaging unit 301 is translated by t between the immediately preceding frame and the current frame (the subject is assumed to be stationary). FIG. 17 also shows the case where the imaging unit 301 translates in the X-axis direction of the XY coordinate axes that define the image plane. To simplify the explanation, movement only in the X-axis direction is assumed, but the principle described here is equally applicable when the movement has a Y-axis component.

図１７において、O₁は現在よりも一つ前のフレーム（前フレーム）における撮像部３０１のレンズ中心を示す。O₂は現在のフレーム（現フレーム）における撮像部３０１のレンズ中心を示す。P(ｘ、ｚ)は、撮像部３０１で撮像された被写体の一点（計測点）を示す。xはx座標の値、zはz座標の値を示す。ここで表現されている座標系は、前フレームにおける撮像部３０１のレンズ中心を原点とした現実空間の座標系である。つまり、zは撮像部３０１からの奥行き値である。 In FIG. 17, O₁ indicates the lens center of the imaging unit 301 in the frame immediately preceding the current one (previous frame). O₂ indicates the lens center of the imaging unit 301 in the current frame. P(x, z) indicates one point (measurement point) of the subject imaged by the imaging unit 301, where x is the x-coordinate value and z is the z-coordinate value. The coordinate system used here is a real-space coordinate system whose origin is the lens center of the imaging unit 301 in the previous frame. That is, z is the depth value from the imaging unit 301.

X₁は、前フレームにおいて計測点を画像平面上に投影したときのx座標である。X₂は現フレームにおいて計測点を画像平面上に投影したときのx座標である。つまり、X₂−X₁が、画像平面上での撮像部３０１の動きベクトルといえる。その他は（式２）と同じである。 X₁ is the x coordinate of the measurement point projected onto the image plane in the previous frame. X₂ is the x coordinate of the measurement point projected onto the image plane in the current frame. That is, X₂ − X₁ can be regarded as the motion vector on the image plane caused by the movement of the imaging unit 301. The other symbols are the same as in (Formula 2).

図１７から分かるように、位置変化分動きベクトルtが与えられると、撮像部３０１のレンズから結像面までの距離fと被写体までの距離zとの相似関係から、画像平面上の位置変化分動きベクトルTvを算出することが可能である。 As can be seen from FIG. 17, when a position change motion vector t is given, the position change on the image plane is calculated from the similarity between the distance f from the lens of the imaging unit 301 to the imaging plane and the distance z to the subject. It is possible to calculate the motion vector Tv.
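
The similar-triangle argument can be written out explicitly. The following derivation is a sketch under an assumed pinhole model in which the measurement point is P = (x, z) in the previous-frame coordinate system and the lens center moves from O₁ = (0, 0) to O₂ = (t, 0). The sign of the result depends on the axis convention chosen, so only the magnitude is compared with (Formula 2).

```latex
\begin{aligned}
X_1 &= \frac{f\,x}{z}, \\
X_2 &= \frac{f\,(x - t)}{z}, \\
X_2 - X_1 &= -\frac{f\,t}{z}
\quad\Longrightarrow\quad
\lvert T_v \rvert = \frac{f\,\lvert t \rvert}{z}.
\end{aligned}
```

The x coordinate of the measurement point cancels, so under these assumptions the projected shift depends only on f, t, and the depth z.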

そして動きベクトル検出部３０４は、画像平面に射影された位置変化分動きベクトルを用いて、ステップＳ１５０１で算出された動きベクトルを補正する。係る補正は下記の式に基づいて行う。 The motion vector detection unit 304 corrects the motion vector calculated in step S1501 using the position change motion vector projected onto the image plane. Such correction is performed based on the following equation.

M”＝M’−Tv・I M'' = M' − Tv·I
M”は、位置変化分動きベクトルを動きベクトルM’から差し引いた動きベクトルを表す行列、M’はステップＳ１５０２の処理において補正された動きベクトルを表す行列である。また、Tvは、画像平面上に射影された位置変化ベクトル、Iは単位行列（行列Mと大きさが同じ行列）である。このように、現実空間画像から算出された動きベクトルから、撮像部３０１の位置変化による動きベクトルを減じる。 M'' is a matrix representing the motion vectors obtained by subtracting the position change motion vector from the motion vectors M', and M' is a matrix representing the motion vectors corrected in step S1502. Tv is the position change vector projected onto the image plane, and I is a unit matrix (a matrix of the same size as the matrix M). In this way, the motion vector due to the position change of the imaging unit 301 is subtracted from the motion vector calculated from the real space image.

最終的に、撮像部３０１が撮像した現実空間画像から算出された動きベクトルから、撮像部３０１の位置及び姿勢によって生じた動きベクトルを減ずることとなる。従って、結果的に撮像部３０１による動きベクトルを排除した被写体の動きベクトルを算出することとなる。 Finally, the motion vector generated by the position and orientation of the imaging unit 301 is subtracted from the motion vector calculated from the real space image captured by the imaging unit 301. Therefore, as a result, the motion vector of the subject from which the motion vector by the imaging unit 301 is excluded is calculated.
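
The two corrections can be combined in one pass over the motion vector field, as in the sketch below. Several points are assumptions rather than details taken from the text: the field is stored as one two-component vector per pixel, the term "Rv·I" is read as subtracting the same vector Rv from every element, and a per-pixel depth map is used where the description speaks only of the depth to the subject. The function name `correct_motion_field` is hypothetical.

```python
from typing import List, Tuple

Vec = Tuple[float, float]


def correct_motion_field(field: List[List[Vec]],
                         rotation_vec: Vec,
                         f: float,
                         t: Vec,
                         depth: List[List[float]]) -> List[List[Vec]]:
    """Remove camera-induced motion: M'' = M - Rv - Tv, with Tv = f * t / z per pixel."""
    corrected = []
    for row, depth_row in zip(field, depth):
        out_row = []
        for (vx, vy), z in zip(row, depth_row):
            # Translation-induced vector at this pixel (assumed per-pixel depth z > 0).
            tvx, tvy = f * t[0] / z, f * t[1] / z
            out_row.append((vx - rotation_vec[0] - tvx,
                            vy - rotation_vec[1] - tvy))
        corrected.append(out_row)
    return corrected
```

Pixels whose corrected vector is close to zero moved only because the camera moved, while the remaining vectors can be attributed to the subject.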

本実施形態では、このようにして撮像部３０１の動きの影響を補正した動きベクトルに基づいて、被写体領域の特定を行う。ここで、本実施形態では、位置及び姿勢の両方の変化によって生じる動きベクトル分を補正するとした。しかし、動きベクトルの補正を、撮像部３０１の姿勢変化のみを考慮して行ってもよいし、移動変化のみを考慮して行っても良い。 In the present embodiment, the subject region is specified based on the motion vector in which the influence of the motion of the imaging unit 301 is corrected in this way. Here, in the present embodiment, it is assumed that the motion vector component caused by changes in both the position and orientation is corrected. However, the correction of the motion vector may be performed considering only the posture change of the imaging unit 301 or may be performed considering only the movement change.

また、本実施形態では、撮像部３０１の動きによって生じる動きベクトルを、位置姿勢計測部３０６から得られる位置及び姿勢情報を用いて求めた。しかし、係る動きベクトルは他の方法でもって求めても良い。即ち、必ずしも、磁気センサや光学式センサなどのセンサシステムから得られる位置及び姿勢情報に基づいて動きベクトルを求める必要はない。例えば、撮像部３０１が撮像した画像を用いて、撮像部３０１の動きによって生じる動きベクトルを求めるようにしても良い。 In the present embodiment, the motion vector generated by the movement of the imaging unit 301 is obtained using the position and orientation information obtained from the position and orientation measurement unit 306. However, the motion vector may be obtained by other methods. That is, it is not always necessary to obtain a motion vector based on position and orientation information obtained from a sensor system such as a magnetic sensor or an optical sensor. For example, a motion vector generated by the movement of the imaging unit 301 may be obtained using an image captured by the imaging unit 301.

例えば、撮像部３０１が撮像した画面全体の動きベクトルの平均を、撮像部３０１の動きにより生じた動きベクトルと仮定するようにしても良い。また、撮像した画像を領域分割することで背景領域が分かっている場合には、背景領域に生じている動きベクトルを、撮像部３０１の動きにより生じた動きベクトルと仮定するようにしても良い。 For example, the average of the motion vectors of the entire screen imaged by the imaging unit 301 may be assumed to be a motion vector generated by the movement of the imaging unit 301. Further, when the background area is known by dividing the captured image into areas, the motion vector generated in the background area may be assumed to be a motion vector generated by the movement of the imaging unit 301.
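
The image-only alternative can be sketched as an average over the motion vector field. The function name `estimate_camera_vector` and the optional boolean background mask are assumptions introduced for illustration.

```python
from typing import List, Optional, Tuple

Vec = Tuple[float, float]


def estimate_camera_vector(field: List[List[Vec]],
                           background: Optional[List[List[bool]]] = None) -> Vec:
    """Average the motion vectors, optionally restricted to known background pixels."""
    sx = sy = 0.0
    count = 0
    for i, row in enumerate(field):
        for j, (vx, vy) in enumerate(row):
            if background is not None and not background[i][j]:
                continue  # skip pixels not known to be background
            sx += vx
            sy += vy
            count += 1
    if count == 0:
        raise ValueError("no pixels available for averaging")
    return (sx / count, sy / count)
```

The whole-frame average is pulled toward the subject's own motion when the subject occupies a large part of the image, which is one reason to prefer the background-only average when a background region is known.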

［第５の実施形態］ [Fifth Embodiment]
上記各実施形態では、図３に示した画像処理装置３００を構成する各部はハードウェアで構成されているものとして説明したが、記憶装置３１０、撮影画像取込部３０２を除く他の各部はソフトウェアプログラムの形態で実現させても良い。その場合、記憶装置３１０、撮影画像取込部３０２を有するコンピュータに、係るソフトウェアプログラムをインストールし、係るソフトウェアプログラムを、このコンピュータが有するＣＰＵが実行することで、各部の動作を実現させることになる。即ち、画像処理装置３００には、一般のＰＣ（パーソナルコンピュータ）などのコンピュータを適用させることができる。 In each of the above embodiments, the units constituting the image processing apparatus 300 illustrated in FIG. 3 have been described as hardware. However, the units other than the storage device 310 and the captured image capturing unit 302 may be implemented as a software program. In that case, the software program is installed in a computer having the storage device 310 and the captured image capturing unit 302, and the CPU of the computer executes the software program, thereby realizing the operation of each unit. That is, a computer such as a general PC (personal computer) can be used as the image processing apparatus 300.

図１５は、画像処理装置３００に適用可能なコンピュータのハードウェア構成例を示す図である。 FIG. 15 is a diagram illustrating a hardware configuration example of a computer applicable to the image processing apparatus 300.

ＣＰＵ１５０１は、ＲＡＭ１５０２やＲＯＭ１５０３に格納されているプログラムやデータを用いて本コンピュータ全体の制御を行うと共に、画像処理装置３００が行うものとして上述した各処理を実行する。 The CPU 1501 controls the entire computer using programs and data stored in the RAM 1502 and the ROM 1503, and executes the processes described above as performed by the image processing apparatus 300.

ＲＡＭ１５０２は、外部記憶装置１５０６からロードされたプログラムやデータ、Ｉ／Ｆ（インターフェース）１５０７を介して外部から受信した各種のデータ等を一時的に記憶するためのエリアを有する。更にＲＡＭ１５０２は、ＣＰＵ１５０１が各種の処理を実行する際に用いるワークエリアも有する。更に、ＲＡＭ１５０２は、上記フレームメモリ、ステンシルバッファとしても機能する。即ち、ＲＡＭ１５０２は、各種のエリアを適宜提供することができる。 The RAM 1502 has an area for temporarily storing programs and data loaded from the external storage device 1506, various data received from the outside via an I / F (interface) 1507, and the like. The RAM 1502 also has a work area used when the CPU 1501 executes various processes. Further, the RAM 1502 also functions as the frame memory and stencil buffer. That is, the RAM 1502 can provide various areas as appropriate.

ＲＯＭ１５０３は、本コンピュータの設定データや、ブートプログラムなどを格納する。 The ROM 1503 stores setting data of the computer, a boot program, and the like.

操作部１５０４は、キーボードやマウスなどにより構成されており、本コンピュータの操作者が操作することで、各種の指示をＣＰＵ１５０１に対して入力することができる。例えば、処理の終了指示等はこの操作部１５０４を用いて入力することができる。 The operation unit 1504 is configured by a keyboard, a mouse, and the like, and can input various instructions to the CPU 1501 by being operated by an operator of the computer. For example, a process end instruction or the like can be input using the operation unit 1504.

表示部１５０５は、ＣＲＴや液晶画面などにより構成されており、ＣＰＵ１５０１による処理結果を画像や文字で表示することができる。例えば、画像処理装置３００が行うものとして上述した各処理を本コンピュータ（ＣＰＵ１５０１）が実行することで生成された複合現実空間の画像を表示することができる。 The display unit 1505 is configured by a CRT, a liquid crystal screen, or the like, and can display a processing result by the CPU 1501 as an image or text. For example, it is possible to display an image of the mixed reality space generated by the computer (CPU 1501) executing the above-described processes performed by the image processing apparatus 300.

外部記憶装置１５０６は、ハードディスクドライブ装置に代表される大容量情報記憶装置である。外部記憶装置１５０６には、ＯＳ（オペレーティングシステム）や、画像処理装置３００が行うものとして上述した各処理をＣＰＵ１５０１に実行させるためのプログラムやデータなどが保存されている。係るプログラムには、動きベクトル検出部３０４、キー領域抽出部３０３、被写体領域抽出部３０５、画像合成部３０８、画像生成部３０７のそれぞれの機能をＣＰＵ１５０１に実行させるためのプログラムが含まれている。また、外部記憶装置１５０６は記憶装置３１０も兼ねている。外部記憶装置１５０６に保存されているプログラムやデータは、ＣＰＵ１５０１による制御に従って適宜ＲＡＭ１５０２にロードされる。そしてＣＰＵ１５０１はこのロードされたプログラムやデータを用いて処理を実行するので、本コンピュータは、画像処理装置３００が行うものとして上述した各処理（上述の各フローチャートに従った処理）を実行することになる。 The external storage device 1506 is a large-capacity information storage device typified by a hard disk drive. The external storage device 1506 stores an OS (operating system) as well as programs and data for causing the CPU 1501 to execute the processes described above as being performed by the image processing apparatus 300. These programs include programs for causing the CPU 1501 to execute the functions of the motion vector detection unit 304, the key region extraction unit 303, the subject region extraction unit 305, the image composition unit 308, and the image generation unit 307. The external storage device 1506 also serves as the storage device 310. Programs and data stored in the external storage device 1506 are loaded into the RAM 1502 as appropriate under the control of the CPU 1501. Because the CPU 1501 executes processing using the loaded programs and data, the computer executes the processes described above as being performed by the image processing apparatus 300 (the processes according to the flowcharts described above).

Ｉ／Ｆ１５０７は、上述のＨＭＤ３９０や位置姿勢計測部３０６を本コンピュータに接続する為のもので、ＨＭＤ３９０、位置姿勢計測部３０６とはこのＩ／Ｆ１５０７を介して信号の送受信を行う。Ｉ／Ｆ１５０７は、撮影画像取込部３０２も兼ねている。 An I / F 1507 is used to connect the above-described HMD 390 and position / orientation measurement unit 306 to this computer. The I / F 1507 transmits / receives signals to / from the HMD 390 and position / orientation measurement unit 306 via the I / F 1507. The I / F 1507 also serves as the captured image capturing unit 302.

１５０８は上述の各部を繋ぐバスである。 A bus 1508 connects the above-described units.

なお、画像処理装置３００に適用可能なコンピュータのハードウェア構成については図１５に示した構成に限定しない。例えば、本コンピュータにグラフィックスカード（ボード）を取り付け、係るグラフィックスカードが仮想空間画像の生成や、複合現実空間の画像の生成を行うようにしても良い。 Note that the hardware configuration of the computer applicable to the image processing apparatus 300 is not limited to the configuration shown in FIG. For example, a graphics card (board) may be attached to the computer, and the graphics card may generate a virtual space image or a mixed reality space image.

［その他の実施形態］ [Other Embodiments]
また、本発明の目的は、以下のようにすることによって達成されることはいうまでもない。即ち、前述した実施形態の機能を実現するソフトウェアのプログラムコードを記録した記録媒体（または記憶媒体）を、システムあるいは装置に供給する。係る記憶媒体は言うまでもなく、コンピュータ読み取り可能な記憶媒体である。そして、そのシステムあるいは装置のコンピュータ（またはＣＰＵやＭＰＵ）が記録媒体に格納されたプログラムコードを読み出し実行する。この場合、記録媒体から読み出されたプログラムコード自体が前述した実施形態の機能を実現することになり、そのプログラムコードを記録した記録媒体は本発明を構成することになる。 Needless to say, the object of the present invention can also be achieved as follows. That is, a recording medium (or storage medium) on which program code of software that realizes the functions of the above-described embodiments is recorded is supplied to a system or apparatus. Such a storage medium is, of course, a computer-readable storage medium. The computer (or CPU or MPU) of the system or apparatus then reads and executes the program code stored in the recording medium. In this case, the program code itself read from the recording medium realizes the functions of the above-described embodiments, and the recording medium on which the program code is recorded constitutes the present invention.

また、コンピュータが読み出したプログラムコードを実行することにより、そのプログラムコードの指示に基づき、コンピュータ上で稼働しているオペレーティングシステム（ＯＳ）などが実際の処理の一部または全部を行う。その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。 Further, by executing the program code read by the computer, an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instruction of the program code. Needless to say, the process includes the case where the functions of the above-described embodiments are realized.

さらに、記録媒体から読み出されたプログラムコードが、コンピュータに挿入された機能拡張カードやコンピュータに接続された機能拡張ユニットに備わるメモリに書込まれたとする。その後、そのプログラムコードの指示に基づき、その機能拡張カードや機能拡張ユニットに備わるＣＰＵなどが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。 Furthermore, suppose that the program code read from the recording medium is written into a memory provided in a function expansion card inserted into the computer or in a function expansion unit connected to the computer. Needless to say, the present invention also covers the case where a CPU or the like provided in the function expansion card or function expansion unit then performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.

本発明を上記記録媒体に適用する場合、その記録媒体には、先に説明したフローチャートに対応するプログラムコードが格納されることになる。 When the present invention is applied to the recording medium, program code corresponding to the flowchart described above is stored in the recording medium.

Claims

Means for acquiring a real space image composed of a plurality of frames;
First setting means for setting, as a first region, a region composed of pixels satisfying a predetermined condition in a frame of interest of the real space image;
Acquisition means for acquiring, within the frame of interest, a motion vector of the first region and a motion vector of another region of the frame of interest other than the first region;
First determination means for determining whether or not the magnitude of the acquired motion vector of the other region is greater than or equal to a threshold;
Second determination means for determining, when the first determination means determines that the magnitude of the motion vector of the other region is greater than or equal to the threshold, whether or not the motion vector of the other region is similar to the motion vector of the first region;
Second setting means for setting the other region as a second region when the second determination means determines that the motion vector of the other region is similar to the motion vector of the first region;
Synthesizing means for synthesizing a virtual space image onto the region of the frame of interest other than the first region and the second region that have been set; the image processing apparatus is characterized by comprising all of the above means.

The second setting means:
When the magnitude of the motion vector of the other region is determined not to be greater than or equal to the threshold, sets the second region that was set in a frame preceding the frame of interest as the second region of the frame of interest. The image processing apparatus according to claim 1.

The second setting means:
When the magnitude of the motion vector of the other region is determined not to be greater than or equal to the threshold, sets the first region and the second region that were set in a frame preceding the frame of interest as the first region and the second region of the frame of interest. The image processing apparatus according to claim 1.

The second setting means:
Sets the other region as the second region when the first determination means determines that the magnitude of the motion vector of the other region is greater than or equal to the threshold, the second determination means determines that the motion vector of the other region is similar to the motion vector of the first region, and the other region is connected to the first region. The image processing apparatus according to any one of claims 1 to 3.

The first determination means determines whether or not the magnitude of a representative motion vector, determined based on the motion vectors obtained for each of a plurality of other regions, is greater than or equal to the threshold. The image processing apparatus according to any one of claims 1 to 4.

The acquisition means includes
Means for calculating an attitude change amount of the imaging device;
Means for calculating a motion vector generated by a posture change of the imaging device as a posture change motion vector based on the posture change amount;
The image processing apparatus according to claim 1, further comprising: means for correcting the motion vector of the first region based on the posture change motion vector.

The acquisition means includes
Means for calculating real space depth information;
Means for calculating a position change amount of the imaging device;
Means for calculating a motion vector caused by a position change of the imaging device as a position change motion vector based on the depth information and the position change amount;
The image processing apparatus according to claim 1 , further comprising: a unit that corrects a motion vector of the first area based on the position change motion vector.

The image processing apparatus according to any one of claims 1 to 7, wherein the virtual space image is generated based on position and orientation information indicating the position and orientation of an imaging device that captures the real space image.

The image processing apparatus according to any one of claims 1 to 8, wherein the first setting means sets, as the first region, a region of the real space image formed by pixels having predetermined color information.
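
Setting the first region from predetermined color information, as in the claim above, can be sketched as a per-pixel range test. The RGB representation and the inclusive lower and upper bounds are assumptions for illustration; the claim says only that the pixels have predetermined color information.

```python
def extract_first_region(image, lower, upper):
    """Return a boolean mask that is True where every channel of a pixel
    lies within the inclusive range [lower, upper].

    `image` is a list of rows of (r, g, b) tuples; `lower` and `upper`
    are (r, g, b) bounds standing in for the registered color information.
    """
    return [
        [all(lo <= ch <= hi for ch, lo, hi in zip(pixel, lower, upper))
         for pixel in row]
        for row in image
    ]
```

A usage example with hypothetical bounds: `extract_first_region(frame, (180, 120, 100), (255, 200, 170))` marks pixels whose color falls inside that box as belonging to the first region.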

The image processing apparatus according to claim 1, wherein the acquisition means acquires a motion vector for each pixel constituting the real space image.

The image processing apparatus according to any one of claims 1 to 10, further comprising means for outputting a composite image obtained by the composition processing by the composition means.

An image processing method comprising:
a step of acquiring a real space image composed of a plurality of frames;
a first setting step of setting, as a first region, a region consisting of pixels satisfying a predetermined condition in a frame of interest of the real space image;
an acquisition step of acquiring, within the frame of interest, a motion vector of the first region and a motion vector of another region other than the first region;
a first determination step of determining whether or not the magnitude of the acquired motion vector of the other region is equal to or greater than a threshold value;
a second determination step of determining, when it is determined in the first determination step that the magnitude of the motion vector of the other region is equal to or greater than the threshold value, whether or not the motion vector of the other region is similar to the motion vector of the first region;
a second setting step of setting the other region as a second region when it is determined in the second determination step that the motion vector of the other region is similar to the motion vector of the first region; and
a synthesis step of synthesizing a virtual space image with a region of the frame of interest other than the first region and the second region that have been set.
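
The determination, setting and synthesis steps of the method above can be sketched end to end. The sketch assumes that each region carries one motion vector, that "similar" means the Euclidean difference between two vectors is within a tolerance, and that the frame comes with a per-pixel region label map; the claim specifies none of these, and all names are hypothetical.

```python
import math


def select_second_regions(first_vector, other_vectors,
                          magnitude_threshold, similarity_tolerance):
    """Return the ids of other regions to be set as the second region."""
    second = set()
    for region_id, (dx, dy) in other_vectors.items():
        # First determination: skip regions that are barely moving.
        if math.hypot(dx, dy) < magnitude_threshold:
            continue
        # Second determination: similarity to the first region's motion
        # (Euclidean difference is an assumed similarity measure).
        if math.hypot(dx - first_vector[0], dy - first_vector[1]) <= similarity_tolerance:
            second.add(region_id)
    return second


def composite(real, virtual, labels, protected_labels):
    """Draw virtual pixels only outside the protected (first/second) regions.

    A virtual pixel of None means nothing is rendered at that position.
    """
    out = []
    for real_row, virtual_row, label_row in zip(real, virtual, labels):
        out_row = []
        for real_px, virtual_px, label in zip(real_row, virtual_row, label_row):
            if label in protected_labels or virtual_px is None:
                out_row.append(real_px)
            else:
                out_row.append(virtual_px)
        out.append(out_row)
    return out
```

In this sketch a region that moves together with the first region (for instance, a sleeve moving with a hand) joins the protected set, so the virtual space image is not drawn over it.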

A program for causing a computer to execute the image processing method according to claim 12.