JP2018063567A

JP2018063567A - Image processing device, image processing method and program

Info

Publication number: JP2018063567A
Application number: JP2016201487A
Authority: JP
Inventors: Kenji Hatori (羽鳥 健司)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-10-13
Filing date: 2016-10-13
Publication date: 2018-04-19

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device, an image processing method, and a program capable of presenting mixed reality without a sense of incongruity while reducing the possibility of contact with a real object.
SOLUTION: An image processing device includes: first acquiring means for acquiring a first image in which a real space is imaged; second acquiring means for acquiring a moving speed of an object imaged in the first image; determining means for determining, on the basis of the moving speed acquired by the second acquiring means, a transparency of a second image generated on the basis of the position and orientation of the imaging device that captured the first image; and display control means for displaying the second image on a display unit in accordance with the transparency determined by the determining means.
SELECTED DRAWING: None

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

In recent years, research on mixed reality (MR) technology, which aims at a seamless combination of real space and virtual space, has been actively conducted. An image display device that presents mixed reality in MR technology is realized by a video see-through method or an optical see-through method.
The video see-through method displays a composite image in which an image of a virtual space, generated according to the position and orientation of an imaging device such as a video camera, is superimposed on an image of the real space captured by that imaging device. Here, the image of the virtual space consists of virtual objects drawn by CG (computer graphics), character information, and the like. The optical see-through method displays an image of the virtual space, generated according to the position and orientation of the observer's viewpoint, on an optical see-through display mounted on the observer's head.

In either method, the image of the virtual space (virtual space image) is displayed in a way that blocks the observer's field of view, so a real object may be hidden behind the virtual space image, and the observer may strike a hand or the like against the real object. Patent Document 1 discloses an image processing apparatus that controls the transparency of a virtual space image according to the distance between the position of the observer's hand and the position of a real object. In the technique described in Patent Document 1, for example, the transparency of the virtual space image is increased as this distance becomes smaller.

Patent Document 1: JP 2009-25918 A

However, because the above conventional method makes the virtual space image transparent according to the distance between the hand and the real object, when the hand approaches the real object at high speed from a distant position, the virtual space image becomes transparent too late and the hand may collide with the real object. Conversely, when the hand is moving slowly near the real object, so that the impact would be small even if the hand touched it, the virtual space image still becomes transparent, which prevents the observer from viewing the virtual space image.
Therefore, an object of the present invention is to make it possible to present mixed reality without a sense of incongruity while reducing the possibility of contact with a real object.

In order to solve the above problem, one aspect of the image processing apparatus according to the present invention includes: first acquisition means for acquiring a first image in which a real space is imaged; second acquisition means for acquiring a moving speed of an object imaged in the first image; determining means for determining, based on the moving speed acquired by the second acquisition means, the transparency of a second image generated based on the position and orientation of the imaging apparatus that captured the first image; and display control means for displaying the second image on a display unit in accordance with the transparency determined by the determining means.

According to the present invention, it is possible to present mixed reality without a sense of incongruity while reducing the possibility of contact with a real object.

A diagram showing a configuration example of the mixed reality system in an embodiment of the present invention.
A diagram showing an example of a marker and a display example of a CG model.
A hardware configuration diagram of the image processing apparatus.
A flowchart showing the flow of processing of the image processing apparatus.
A flowchart showing the object estimation process.
A flowchart showing the CG rendering process.
A display example when all of the CG is made transparent.
A flowchart showing the CG rendering process of the second embodiment.
A display example when only the CG around the hand is made transparent.

DESCRIPTION OF EMBODIMENTS
Hereinafter, embodiments for carrying out the present invention will be described in detail with reference to the accompanying drawings. The embodiments described below are examples of means for realizing the present invention and should be modified or changed as appropriate depending on the configuration and various conditions of the apparatus to which the present invention is applied. The present invention is not limited to the following embodiments.
(First embodiment)
FIG. 1A is a diagram illustrating a configuration example of a mixed reality system (MR system) 10 according to an embodiment of the present invention. The MR system 10 in the present embodiment is a system for presenting an observer with a mixed reality space (MR space) in which a real space and a virtual space are fused.
The mixed reality system 10 includes imaging devices 21L and 21R, a display device 22, and an image processing device 30. In the present embodiment, the imaging devices 21L and 21R and the display device 22 are described as being provided in a head-mounted display (HMD), a display device worn on the head of an observer who experiences the MR space. Here, the HMD is a video see-through HMD that can present to the observer a composite image obtained by combining an image of the real space with an image of a virtual space such as computer graphics (CG).

The imaging devices 21L and 21R are cameras fixed relative to each other so that, when the observer wears the HMD on the head, they can image the real space in the line-of-sight direction from the observer's viewpoint position. These two cameras constitute a stereo camera. The imaging devices 21L and 21R image objects in the real space (real objects) that are visible from the observer's viewpoint position. For example, they image a part of the observer's body (for example, the hand 40) and the marker 50 used for measuring the position and orientation of the imaging devices 21L and 21R. The imaging devices 21L and 21R each output the captured image to the image processing device 30. The camera internal parameters of the stereo camera, such as the focal length and lens distortion coefficients, are assumed to have been obtained in advance by a predetermined method and to be known.

The display device 22 includes a display that shows the composite image output from the image processing device 30. The display is, for example, a CRT or a liquid crystal screen. Displays may be arranged for the observer's left and right eyes respectively. In this case, a composite image for the left eye is presented on the display corresponding to the left eye, and a composite image for the right eye is presented on the display corresponding to the right eye. The display device 22 may also include an optical system for guiding the image on the display to the eyeball. As long as the imaging devices 21L and 21R can capture the observer's field-of-view image and the display device 22 can present images to the observer, the imaging devices 21L and 21R and the display device 22 may be placed at any positions.

The image processing device 30 includes an image acquisition unit 31, an image storage unit 32, an object estimation unit 33, a model shape storage unit 34, a position/orientation estimation unit 35, an image generation unit 36, an image composition unit 37, and a display control unit 38.
The image acquisition unit 31 acquires the stereo images captured by the imaging devices 21L and 21R and outputs them to the image storage unit 32. These stereo images are used as processing images for stereo measurement. The image storage unit 32 temporarily stores the images received from the image acquisition unit 31. The image acquisition unit 31 transmits an image, for example, every 1/30 second.

The object estimation unit 33 acquires the stereo images stored in the image storage unit 32 and estimates the three-dimensional shape and position of the hand 40 from them. It also stores the position information of the hand 40 together with the time and estimates the moving speed and moving direction of the hand 40 from that history. The methods for estimating the three-dimensional shape and position of the hand 40, and its moving speed and moving direction, are described later. The object estimation unit 33 outputs the three-dimensional shape and position of the hand 40 to the model shape storage unit 34, and outputs the moving speed and moving direction of the hand 40 to the image generation unit 36.
The model shape storage unit 34 holds the data of the three-dimensional model (CG model) of the virtual object 60 to be displayed on the display device 22, as shown in FIG. 1B, and the data of the three-dimensional shape of the hand 40 received from the object estimation unit 33.

The position/orientation estimation unit 35 estimates the position and orientation of the imaging devices 21L and 21R. In the present embodiment, the position/orientation estimation unit 35 estimates the position and orientation of the imaging devices 21L and 21R based on the projected image of the rectangular marker 50 that appears in the captured image. For example, the position/orientation estimation unit 35 can binarize the image, extract the vertices of the quadrilateral by straight-line fitting, and minimize the projection error on the image by iterative calculation using a hill-climbing method to estimate the position and orientation of the imaging devices 21L and 21R.
The estimation method used by the position/orientation estimation unit 35 is not limited to the above. The position and orientation of the imaging devices 21L and 21R may instead be estimated by a method using a motion capture device, a magnetic sensor, or the like.

The image generation unit 36 receives the CG model data stored in the model shape storage unit 34 and the information on the three-dimensional shape and position of the hand 40. The image generation unit 36 also receives the position and orientation of the imaging devices 21L and 21R estimated by the position/orientation estimation unit 35, and the moving speed and moving direction of the hand 40 estimated by the object estimation unit 33. Based on this input, the image generation unit 36 generates an image (CG) of the virtual object 60.
Specifically, the image generation unit 36 determines the position and orientation of the virtual object 60 based on the CG model data and the position and orientation of the imaging devices 21L and 21R. The image generation unit 36 then compares, at each pixel where the virtual object 60 would be drawn, the front-rear (depth) relationship between the virtual object 60 and the hand 40, and decides whether to draw the virtual object 60 there. That is, when the hand 40 is determined to be in front of the virtual object 60, the virtual object 60 is not drawn at that pixel, and the image of the virtual object 60 is processed so that the hand 40 from the captured real image is visible in the composite image generated by the image composition unit 37 described later. In addition, the image generation unit 36 determines the transparency of the image of the virtual object 60 according to the moving speed and moving direction of the hand 40 estimated by the object estimation unit 33. Details are described later.
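The per-pixel front-rear comparison described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the image generation unit 36: the function name `occlusion_mask`, the depth-map representation (nested lists of distances from the camera, with `None` where nothing is present), and the tie-breaking rule are all assumptions.

```python
from typing import List, Optional

DepthMap = List[List[Optional[float]]]


def occlusion_mask(cg_depth: DepthMap, hand_depth: DepthMap) -> List[List[bool]]:
    """Return True at pixels where the virtual object should be drawn.

    Hypothetical convention: a depth value is the distance from the camera,
    and None means the virtual object (or the hand) does not cover that pixel.
    """
    mask = []
    for cg_row, hand_row in zip(cg_depth, hand_depth):
        row = []
        for cg_z, hand_z in zip(cg_row, hand_row):
            if cg_z is None:
                # No virtual object at this pixel, so there is nothing to draw.
                row.append(False)
            elif hand_z is not None and hand_z < cg_z:
                # The hand is nearer than the virtual object: leave the real image visible.
                row.append(False)
            else:
                row.append(True)
        mask.append(row)
    return mask
```

For a single row where the hand is in front at the first pixel, absent at the second, and the virtual object is absent at the third, the mask is `[False, True, False]`.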

The image composition unit 37 overwrites the image of the virtual object 60 generated by the image generation unit 36 onto each of the images of the imaging devices 21L and 21R stored in the image storage unit 32. The resulting composite image is output to the display control unit 38. The display control unit 38 performs display control for showing the composite image output from the image composition unit 37 on the display of the display device 22. As a result, the observer can view on the display a composite image in which the front-rear relationship between the virtual object 60 and the hand 40 is correct, and can experience the virtual object 60 as if it actually existed at that location.
In the present embodiment, the image storage unit 32 inputs to the image composition unit 37 the captured real image that was used in the processing of the object estimation unit 33, the position/orientation estimation unit 35, and the image generation unit 36. This is so that the image composition unit 37 combines the images while the image generated by the image generation unit 36 and the image in the image storage unit 32 are synchronized. For the image composition unit 37 to handle synchronized images, it is preferable that all processing by the object estimation unit 33, the position/orientation estimation unit 35, and the image generation unit 36 be completed within 1/30 second.

Next, an example of the marker 50 for measuring the position and orientation of the imaging devices 21L and 21R, and a display example of CG, will be described. FIG. 2A shows an example of the marker 50. As shown in FIG. 2A, the marker 50 may be attached to a box-shaped object 51 with wheels. The display device (display) 22 may display, for example, an image of the virtual object 60 as shown in FIG. 2B. In FIG. 2B, the virtual object 60 is an office cabinet displayed superimposed on the object 51 of FIG. 2A. If the observer's hand 40 is in front of the object 51 of FIG. 2A, the virtual object 60 is not displayed in the region where the hand 40 exists, as shown in FIG. 2B. This presents the front-rear relationship between the virtual object 60 and the hand 40.
Note that the virtual object 60 may be placed at any position that corresponds to the position and orientation of the imaging devices 21L and 21R, and is not limited to the position of the object 51 to which the marker 50 is attached.

FIG. 3 is a hardware configuration diagram of the image processing apparatus 30.
The image processing apparatus 30 includes a CPU 301, a RAM 302, a ROM 303, a keyboard 304, a mouse 305, an external storage device 306, a storage medium drive 307, an interface (I/F) 308, and a system bus 309.
The CPU 301 is a processor that comprehensively controls the operation of the image processing apparatus 30, and controls each component (302 to 308) via the system bus 309. The RAM 302 has an area for temporarily storing programs and data loaded from the external storage device 306 or the storage medium drive 307. The RAM 302 also has an area for temporarily storing data (in the present embodiment, stereo images of the real space) received from external devices (in the present embodiment, the imaging devices 21L and 21R) via the I/F 308, as well as a work area used when the CPU 301 executes each process. In other words, the RAM 302 can provide various areas as needed. For example, the RAM 302 can function as the image storage unit 32 and the model shape storage unit 34 in FIG. 1.

The ROM 303 stores computer setting data, a boot program, and the like. The keyboard 304 and the mouse 305 are examples of operation input devices, and the user of the computer can operate them to input various instructions to the CPU 301.
The external storage device 306 is a large-capacity information storage device, typified by a hard disk drive (HDD). The external storage device 306 stores an OS (operating system) and the programs and data for causing the CPU 301 to execute the processes described above as being performed by the image processing apparatus 30. These programs can include programs corresponding to the image acquisition unit 31, the object estimation unit 33, the position/orientation estimation unit 35, the image generation unit 36, and the image composition unit 37. The data can include the CG model data and the like.

Programs and data stored in the external storage device 306 are loaded into the RAM 302 as needed under the control of the CPU 301. The CPU 301 executes processing using the loaded programs and data, thereby realizing the functions of the units of the image processing apparatus 30 shown in FIG. 1. Note that the external storage device 306 can also function as the image storage unit 32 and the model shape storage unit 34 of FIG. 1.
The storage medium drive 307 can read programs and data recorded on a storage medium such as a CD-ROM or DVD-ROM, and can write programs and data to such a storage medium. Some or all of the programs and data described above as being stored in the external storage device 306 may instead be recorded on this storage medium. Programs and data read from the storage medium by the storage medium drive 307 are output to the external storage device 306 or the RAM 302.

The I/F 308 consists of an analog video port for connecting the imaging devices 21L and 21R, a digital input/output port such as IEEE 1394, an Ethernet (registered trademark) port, or the like. Data received via the I/F 308 is input to the RAM 302 or the external storage device 306.
As described above, the functions of the units of the image processing apparatus 30 shown in FIG. 1 can be realized by the CPU 301 executing programs. However, at least some of the units of the image processing apparatus 30 shown in FIG. 1 may operate as dedicated hardware. In this case, the dedicated hardware operates under the control of the CPU 301.

Hereinafter, the procedure of the processing executed in the image processing apparatus 30 will be described with reference to FIG. 4. The processing shown in FIG. 4 is started, for example, in response to an instruction input by the user. However, the start timing of the processing in FIG. 4 is not limited to this. The image processing apparatus 30 can realize the processing shown in FIG. 4 by having the CPU 301 read and execute the necessary programs. As described above, however, the processing of FIG. 4 may also be realized by at least some of the elements of the image processing apparatus 30 shown in FIG. 1 operating as dedicated hardware. In this case, the dedicated hardware operates under the control of the CPU 301. Hereinafter, the letter S denotes a step in a flowchart.

First, in S1, the image acquisition unit 31 acquires stereo images from the imaging devices 21L and 21R. Next, in S2, the image storage unit 32 temporarily stores the stereo images acquired from the image acquisition unit 31.
In S3, the object estimation unit 33 extracts the region of the hand 40 from the stereo images stored in the image storage unit 32 and estimates the three-dimensional shape and position of the hand 40. It also stores the position information of the hand 40 together with the time and estimates the moving speed and moving direction of the hand 40 from that history. Details of the object estimation process in S3 are described later.

In S4, the position/orientation estimation unit 35 estimates the position and orientation of at least one of the imaging devices 21L and 21R. The estimated position and orientation are used for the CG rendering process in the image generation unit 36. In S5, the image generation unit 36 generates (renders) an image of the virtual object 60 as viewed from the position and orientation of the imaging devices 21L and 21R. Details of the CG rendering process in S5 are described later.
In S6, the image composition unit 37 superimposes the CG generated in S5 on the captured real image stored in S2 to generate a composite image. In S7, the display control unit 38 performs display control for showing the composite image generated in S6 on the display of the display device 22.
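One way to picture how a transparency value could enter the superimposition in S6 is ordinary alpha blending of a CG pixel over the captured pixel. The source does not state the blending formula, so the sketch below is an assumption throughout: the names `blend_pixel` and `composite_pixel`, the RGB tuple representation, and the convention that a transparency of 0.0 is opaque and 1.0 is fully transparent are all hypothetical.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]


def blend_pixel(real: Pixel, cg: Pixel, transparency: float) -> Pixel:
    """Blend a CG pixel over a captured pixel (assumed linear alpha blend).

    transparency = 0.0 shows only the CG, 1.0 shows only the real image.
    """
    t = min(1.0, max(0.0, transparency))  # clamp to the valid range
    r, g, b = (round(t * rv + (1.0 - t) * cv) for rv, cv in zip(real, cg))
    return (r, g, b)


def composite_pixel(real: Pixel, cg: Pixel, draw_cg: bool, transparency: float) -> Pixel:
    """Keep the captured pixel where the CG is not drawn (e.g. where the hand is in front)."""
    if not draw_cg:
        return real
    return blend_pixel(real, cg, transparency)
```

With this convention, a half-transparent red CG pixel `(200, 0, 0)` over a gray captured pixel `(100, 100, 100)` gives `(150, 50, 50)`.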

(Object estimation processing)
FIG. 5 is a flowchart showing the flow of the object estimation process executed by the object estimation unit 33 in S3 of FIG. 4.
First, in S31, the object estimation unit 33 estimates the three-dimensional shape of the hand 40 from the stereo images stored in the image storage unit 32. This can be done by known processing. For example, the region of the hand 40, which is the foreground, is extracted by taking the difference between a background image acquired in advance and the current captured image. The two regions of the hand 40 extracted from the respective stereo images are then associated with each other. After that, based on the known arrangement information of the imaging devices 21L and 21R, the region of the hand 40 is measured in stereo and the three-dimensional shape of the hand 40 is estimated. The object estimation unit 33 outputs the estimated three-dimensional shape to the model shape storage unit 34.
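The background-difference step and the stereo measurement in S31 can be illustrated with a small sketch. It is not the patent's implementation: the function names, the grayscale nested-list image format, and the threshold value are assumptions, and the depth formula is the standard relation Z = f * B / d for a rectified, parallel stereo pair, which the source does not state explicitly.

```python
from typing import List


def foreground_mask(background: List[List[int]],
                    frame: List[List[int]],
                    threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the background by more than threshold.

    Images are assumed to be grayscale nested lists of equal size.
    """
    return [
        [abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a matched point for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px
```

In practice, a threshold on the raw difference is sensitive to lighting changes, so the threshold here is only a placeholder for whatever known foreground extraction method is used.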

In S32, the object estimation unit 33 estimates the position of the hand 40. The position of the hand 40 is derived from the three-dimensional shape estimated in S31 and is expressed, for example, by the coordinates of the center of gravity of the region of the hand 40. The position of the hand 40 is not limited to its center of gravity and may be, for example, the coordinates of the tip of a specific finger. The method of estimating the position of the hand 40 is also not limited to one based on its three-dimensional shape, and may be a method using a motion capture device, a magnetic sensor, or the like. The object estimation unit 33 outputs the estimated position to the model shape storage unit 34.
In S33, the object estimation unit 33 stores the position of the hand 40 estimated in S32 together with the time at that point. This time can be derived from, for example, the time code of the stereo images used to estimate the three-dimensional shape of the hand 40. Alternatively, it may be obtained from a timer. Storing the position of the hand 40 together with the time makes it possible to keep a history of past positions of the hand 40.
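S32 and S33 can be sketched as a centroid calculation plus a bounded, timestamped history. The names `centroid` and `PositionHistory`, the representation of the hand shape as a list of 3D points, and the history length are assumptions made for illustration only.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Point3 = Tuple[float, float, float]
Sample = Tuple[float, Point3]  # (timestamp in seconds, position)


def centroid(points: List[Point3]) -> Point3:
    """Center of gravity of the 3D points that make up the hand region."""
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


class PositionHistory:
    """Keeps the most recent (time, position) samples of the hand."""

    def __init__(self, max_len: int = 30) -> None:
        self._samples: Deque[Sample] = deque(maxlen=max_len)

    def add(self, timestamp: float, position: Point3) -> None:
        self._samples.append((timestamp, position))

    def latest_two(self) -> Optional[Tuple[Sample, Sample]]:
        """Return the two newest samples, or None if fewer than two are stored."""
        if len(self._samples) < 2:
            return None
        return self._samples[-2], self._samples[-1]
```

A bounded deque discards the oldest sample automatically, which keeps memory use constant when a new position arrives every frame.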

In S34, the object estimation unit 33 estimates the moving speed and moving direction of the hand 40 at that point in time. These can be estimated from the position of the hand 40 estimated in S32 and the history of past positions of the hand 40 stored in S33. For example, if the hand 40 is at coordinates (X1, Y1, Z1) at time T1 and at coordinates (X2, Y2, Z2) at time T2, the moving speed can be calculated by computing the distance between the two positions and dividing it by the time difference. The moving direction can be calculated by obtaining the vector between the two positions. Acceleration may be calculated instead of the moving speed. The object estimation unit 33 outputs the estimated moving speed and moving direction to the image generation unit 36 and ends the process of FIG. 5.
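The two-sample calculation in S34 amounts to speed = |P2 - P1| / (T2 - T1), with the direction given by the vector P2 - P1. A minimal sketch follows; the function name `estimate_motion`, the choice to return a unit vector for the direction, and the zero vector for a stationary hand are assumptions.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]


def estimate_motion(p1: Point3, t1: float, p2: Point3, t2: float) -> Tuple[float, Point3]:
    """Return (speed, unit direction) of the hand between two timestamped positions."""
    dt = t2 - t1
    if dt <= 0:
        raise ValueError("t2 must be later than t1")
    dx, dy, dz = (b - a for a, b in zip(p1, p2))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    speed = distance / dt
    if distance == 0.0:
        # The hand did not move; the direction is undefined, so return a zero vector.
        return speed, (0.0, 0.0, 0.0)
    return speed, (dx / distance, dy / distance, dz / distance)
```

For example, moving from (0, 0, 0) to (3, 4, 0) in 0.5 seconds covers a distance of 5, so the speed is 10 units per second along the direction (0.6, 0.8, 0).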

(CG rendering process)
FIG. 6 is a flowchart showing the flow of the CG rendering process executed by the image generation unit 36 in S5 of FIG. 4.
First, in S51, the image generation unit 36 determines the transparency of the image (CG) of the virtual object 60. The transparency is determined according to the moving speed of the hand 40 estimated in S34 of FIG. 5.
Specifically, when the moving speed of the hand 40 is less than a predetermined speed, the CG is displayed as is, opaque; when the moving speed is equal to or higher than the predetermined speed, the CG is made transparent. Thus, when the observer moves the hand 40 quickly, the CG becomes transparent, and any real object hidden behind the virtual object 60 is displayed and presented to the observer. On the other hand, when the observer does not move the hand 40 or moves it slowly, the CG does not become transparent, and the observer can observe the virtual object 60.

FIG. 7 shows a display example for the case where the image of the virtual object 60 shown in FIG. 2(b) is displayed using the object 51 and the marker 50 shown in FIG. 2(a) and the observer moves the hand 40 quickly. As shown in FIG. 7, when the hand 40 is moved at a speed equal to or higher than the predetermined speed, the CG becomes transparent, and the object 51 bearing the marker 50, which lies behind the virtual object 60, can be seen through it.
Alternatively, when the moving speed of the hand 40 is equal to or higher than the predetermined speed, the transparency may be increased in proportion to the moving speed, with the CG becoming completely transparent at or above a certain higher moving speed. In this case, the CG does not become transparent when the observer does not move the hand 40 or moves it slowly, but once the hand 40 moves at the predetermined speed or more, the transparency of the CG rises the faster the observer moves the hand 40, and the CG becomes completely transparent when the hand 40 moves at or above the higher speed. The moving speed of the hand 40 and the transparency may also be linked by a method other than the above. Moreover, when the object estimation unit 33 estimates acceleration instead of speed in S34 of FIG. 5, the acceleration and the transparency may be linked. For example, the transparency may be determined so as to make the CG transparent when the estimated acceleration is equal to or greater than a predetermined acceleration.
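The two mappings from moving speed to transparency described for S51 might be sketched as follows. The function names, the parameter names `threshold` and `full_speed`, and the choice of a linear ramp that starts at zero transparency at the threshold are assumptions for illustration; the description only states that transparency rises in proportion to speed and saturates at a certain speed.

```python
def binary_transparency(speed, threshold):
    """Opaque (0.0) below the predetermined speed, fully transparent (1.0) otherwise."""
    return 1.0 if speed >= threshold else 0.0


def ramp_transparency(speed, threshold, full_speed):
    """Map speed to a transparency in [0, 1].

    Below `threshold` the CG stays opaque; at or above `full_speed` it is
    completely transparent; in between, transparency grows linearly.
    Both parameters are hypothetical tuning values.
    """
    if full_speed <= threshold:
        raise ValueError("full_speed must be greater than threshold")
    if speed < threshold:
        return 0.0
    if speed >= full_speed:
        return 1.0
    return (speed - threshold) / (full_speed - threshold)
```

The same shape of function could take an acceleration instead of a speed, which corresponds to the acceleration-based variant mentioned above.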

In S52, the image generation unit 36 acquires the three-dimensional shape data of the hand 40 and the three-dimensional model data of the virtual object 60 stored in the model shape storage unit 34, together with the position and orientation of the imaging devices 21L and 21R estimated by the position and orientation estimation unit 35. The image generation unit 36 then generates an image of the virtual object 60 as viewed from the position and orientation of the imaging devices 21L and 21R. When generating the image, the distances of the hand 40 and of the virtual object 60 from the imaging devices 21L and 21R are compared for each drawing pixel, and for pixels where the hand 40 is nearer, the virtual object 60 is not drawn and the pixel is left transparent. That is, for pixels where the hand 40 is in front, the captured image is presented to the observer so that the hand 40 appears to be in front. In S52, the image generation unit 36 also applies the transparency determined in S51 when generating the image of the virtual object 60.
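The per-pixel behaviour of S52 can be sketched in Python as below. This is only an illustration under stated assumptions: the function `composite_pixel` is hypothetical, it folds the later composition step into the same function for brevity, and it treats transparency as a simple alpha blend, which the description does not prescribe.

```python
def composite_pixel(camera_rgb, cg_rgb, hand_depth, cg_depth, transparency):
    """Return the displayed colour for one pixel.

    camera_rgb   : colour of the captured real-space image at this pixel
    cg_rgb       : colour of the rendered virtual object at this pixel
    hand_depth   : distance from the camera to the hand, or None if no hand here
    cg_depth     : distance from the camera to the virtual object, or None if absent
    transparency : 0.0 (opaque CG) to 1.0 (fully transparent CG)
    """
    if cg_depth is None:
        # No virtual object covers this pixel: show the captured image.
        return tuple(camera_rgb)
    if hand_depth is not None and hand_depth < cg_depth:
        # The hand is in front of the virtual object: do not draw the CG.
        return tuple(camera_rgb)
    # Otherwise blend the CG over the captured image according to transparency.
    alpha = 1.0 - transparency
    return tuple(alpha * c + (1.0 - alpha) * r for c, r in zip(cg_rgb, camera_rgb))
```

In practice this comparison would be done for whole images, typically on a GPU with a depth buffer, rather than one pixel at a time.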

As described above, the image processing apparatus 30 in the present embodiment displays the image of the virtual object 60 based on the position and orientation of the imaging devices 21L and 21R that capture the real space. In doing so, the image processing apparatus 30 acquires the moving speed of an object captured in the images of the imaging devices 21L and 21R, determines the transparency of the image of the virtual object 60 based on the acquired moving speed, and displays the image of the virtual object 60 according to the determined transparency. Here, the object whose moving speed is acquired can be the hand 40 of the observer who observes the image of the virtual object 60.
The image processing apparatus 30 can determine the transparency so that the image of the virtual object 60 becomes transparent when the moving speed of the hand 40 is equal to or higher than a predetermined speed. The image processing apparatus 30 may also raise the transparency as the moving speed increases. When the image of the virtual object 60 is made transparent in this way, any real object that exists behind the virtual object 60 is displayed, and the observer can see it. The possibility that the observer's hand 40 contacts the real object can therefore be reduced.

Further, because the image of the virtual object 60 becomes transparent whenever the moving speed of the hand 40 is equal to or higher than the predetermined speed, the image can be made transparent while the hand 40 is moving fast even if the hand 40 is still far from the real object. The observer can therefore easily confirm the position of the real object and avoid striking it with the hand 40 with time to spare. On the other hand, when the moving speed of the hand 40 is less than the predetermined speed, the image of the virtual object 60 does not become transparent. Thus, when the hand 40 is moving slowly near the real object and any contact would cause little impact, the image of the virtual object 60 remains displayed without being made transparent, and the observer's view of the virtual object 60 is not hindered.

The display device on which the image processing apparatus 30 displays the image of the virtual object 60 is a head-mounted display (HMD) worn on the observer's head, and the HMD can be a video see-through HMD. The image processing apparatus 30 can display the image of the virtual object 60 superimposed on an image of the real space captured from the observer's viewpoint position.
When an observer wearing a head-mounted display moves while CG is displayed, the observer may come into contact with a real object hidden behind the CG without noticing it. In particular, when the observer's hand moves quickly, the hand is likely to contact a real object hidden behind the CG, and the impact on contact is large. In the present embodiment, by contrast, the transparency of the CG displayed on the head-mounted display worn by the observer can be controlled according to the moving speed of the observer's hand. It is therefore possible to present mixed reality without a sense of incongruity while appropriately reducing the possibility of contact described above.

(Modification)
In the above embodiment, the case where the three-dimensional shape and position of the hand 40 are estimated and the moving speed and moving direction are then estimated has been described. However, the object whose moving speed and moving direction are estimated is not limited to the observer's hand 40 and may be another part of the body, such as a foot or the head. The object is also not limited to a part of the observer's body and may be an object operated by the observer. Further, the object may be a robot arm remotely operated by the observer, and the imaging devices 21L and 21R are not limited to cameras that capture the real space from the observer's viewpoint position.

Furthermore, although the above embodiment describes the case where a video see-through HMD is used as the display device, the embodiment can also be realized with an optical see-through HMD. In the embodiment described above, the image composition unit 37 overwrites each of the images of the imaging devices 21L and 21R with the image of the virtual object 60 generated by the image generation unit 36. Instead of combining the image of the virtual object 60 with the images of the imaging devices 21L and 21R, however, the image of the virtual object 60 alone may be displayed on the display of the display device 22.
In addition, the display device is not limited to an HMD, and a handheld display (HHD) may be used. An HHD is a display held in the hand; that is, the display device may be one that the observer holds and looks into like binoculars to observe the image. The display device may also be a display terminal such as a tablet or a smartphone.

In S51 of FIG. 6, the image generation unit 36 may also determine the transparency based on the distance between the position of the hand 40 and the position of the real object 51 on which the virtual object 60 is superimposed (the position where the virtual object 60 is arranged). For example, the CG is made transparent when the moving speed of the hand 40 is equal to or higher than a predetermined speed and the distance to the real object 51 is equal to or less than a predetermined distance. That is, even when the moving speed of the hand 40 is equal to or higher than the predetermined speed, the CG is not made transparent if the distance to the real object 51 exceeds the predetermined distance.
In this case, when the imaging devices 21L and 21R capture the real object 51 including the marker 50, as in FIG. 2(a), the position of the real object 51 is estimated in the same manner as in the object estimation process in S3 of FIG. 4. Next, the distance between the hand 40 and the real object 51 is calculated from the difference between the position of the hand 40 and the position of the real object 51. The transparency of the CG is then determined according to the calculated distance. When the hand 40 is far from the real object 51, there is no risk of contacting the real object 51 even if the hand 40 is moved quickly. In such a case, therefore, the CG is not made transparent, and mixed reality without a sense of incongruity can be presented.
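The combined speed and distance condition of this modification can be sketched as follows. The function name and both threshold parameters are hypothetical, and the hand and the real object are each reduced to a single representative point, which is a simplification of the position estimate described above.

```python
import math


def should_make_transparent(hand_pos, object_pos, speed,
                            speed_threshold, distance_threshold):
    """Return True when the CG should be made transparent.

    The CG is made transparent only when the hand is moving at or above
    `speed_threshold` AND is within `distance_threshold` of the real object.
    """
    # Distance between the hand position and the real-object position.
    distance = math.dist(hand_pos, object_pos)
    return speed >= speed_threshold and distance <= distance_threshold
```

`math.dist` requires Python 3.8 or later; on older versions the Euclidean distance can be computed by hand.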

Furthermore, in S51 of FIG. 6, the image generation unit 36 may determine the transparency based on the moving direction of the hand 40. When the observer moves the hand 40 away from the real object 51, the hand 40 will not contact the real object 51 even if it is moved quickly. The CG may therefore be made transparent only when the hand 40 is moving toward the real object 51, so that it is not made transparent when the hand 40 is moving away.
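One way to test whether the hand is moving toward the real object is the sign of a dot product, sketched below. This test is an assumption chosen for illustration; the description does not state how "approaching" is decided, and the function name is hypothetical.

```python
def is_approaching(hand_pos, object_pos, direction):
    """Return True when `direction` points toward the real object.

    hand_pos, object_pos : (X, Y, Z) positions of the hand and the real object
    direction            : moving direction vector of the hand (any length)
    """
    # Vector from the hand to the real object.
    to_object = tuple(o - h for h, o in zip(hand_pos, object_pos))
    # A positive dot product means the motion has a component toward the object.
    return sum(d * t for d, t in zip(direction, to_object)) > 0.0
```

A stricter variant could require the angle between the two vectors to be below some limit, so that glancing motion past the object does not trigger transparency.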

(Second embodiment)
In the first embodiment described above, the case where the entire image of the virtual object 60 is made transparent according to the moving speed of the object (hand 40) has been described. In the second embodiment, a case where only part of the image of the virtual object 60 is made transparent is described.
The configuration of the MR system 10 including the image processing apparatus 30 in the present embodiment is the same as the configuration shown in FIG. 1. The operation of the image processing apparatus 30 in the present embodiment is also the same as the operation shown in FIG. 4, except that the CG rendering process in S5 of FIG. 4 differs from that of the first embodiment. The following description therefore focuses on the parts of the process that differ.

FIG. 8 is a flowchart showing the flow of the CG rendering process executed by the image generation unit 36 of the present embodiment in S5 of FIG. 4.
First, in S151, the image generation unit 36 determines, according to the position of the hand 40, the region in which the CG is to be made transparent (hereinafter referred to as the "transparent region"). The transparent region can be a region corresponding to a predetermined range including the hand 40, for example, the region within a certain distance from the hand 40. The transparent region can be determined from the three-dimensional shape and position of the hand 40 estimated in the object estimation process and from the preset distance.

FIG. 9(a) shows a state in which only the part of the virtual object 60 around the hand 40 is made transparent. In FIG. 9(a), the transparent region 71 is the region within a certain distance from the hand 40; only the part of the virtual object 60 inside the transparent region 71 is transparent, so that part of the real object 51 behind it is displayed. In this case, as the observer moves the hand 40, the real object 51 around the hand 40 is displayed according to the position of the hand 40, so the possibility that the hand 40 contacts the real object 51 can be reduced. Moreover, since the CG away from the hand 40 does not become transparent, the observer's view of the CG is not hindered.

In the present embodiment, the case where the region within a certain distance from the hand 40 is the transparent region has been described, but the transparent region may be set wider in the moving direction of the hand 40 than in other directions. FIG. 9(b) shows a display example for the case where the hand 40 is moving in the fingertip direction. In this case, the transparent region 72 extends further in the fingertip direction of the hand 40 than in other directions.
This widens the range over which the virtual object 60 lying in the moving direction of the hand 40 becomes transparent, and further reduces the possibility that the hand 40 contacts the real object 51 hidden behind the virtual object 60. The transparent region in this case can be determined from the three-dimensional shape and position of the hand 40 estimated in the object estimation process, the moving direction of the hand 40, and a preset distance. The amount by which the transparent region is widened may be constant regardless of the moving speed, or may be made larger as the moving speed increases.
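A transparent region that is wider in the moving direction could be modelled, for example, as a capsule: every point within a base radius of a line segment that starts at the hand and extends along the moving direction. This shape, the function name, and the parameters `base_radius` and `extension` are assumptions for illustration, and the hand is reduced to a single point rather than its full three-dimensional shape.

```python
import math


def in_transparent_region(point, hand_pos, direction, base_radius, extension):
    """Return True if `point` lies inside the capsule-shaped transparent region.

    direction   : unit vector of the hand's moving direction
    base_radius : the preset distance around the hand
    extension   : extra reach of the region along the moving direction
    """
    offset = tuple(p - h for p, h in zip(point, hand_pos))
    # Position of the point along the moving direction, clamped to the segment
    # [0, extension] so the region grows only ahead of the hand, not behind it.
    along = sum(o * d for o, d in zip(offset, direction))
    along = min(max(along, 0.0), extension)
    nearest = tuple(h + along * d for h, d in zip(hand_pos, direction))
    return math.dist(point, nearest) <= base_radius
```

Setting `extension` to zero gives the uniform region of FIG. 9(a). To widen the region more at higher speeds, `extension` could be computed as a hypothetical gain multiplied by the moving speed.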

Returning to FIG. 8, in S152 the image generation unit 36 determines the transparency of the image (CG) of the virtual object 60, taking into account the overlap between the transparent region determined in S151 and the virtual object 60. Specifically, this is done as follows.
First, the part of the virtual object 60 that does not overlap the transparent region is not made transparent.
The transparency of the part of the image of the virtual object 60 that overlaps the transparent region is determined according to the moving speed of the hand 40 estimated in S34. The method of determining the transparency may be the same as that used in S51. Alternatively, the transparency of the part of the image of the virtual object 60 that overlaps the transparent region may be weighted within the transparent region, so that areas of the transparent region close to the hand 40 have higher transparency and areas far from the hand 40 have lower transparency. In this way, the part of the virtual object 60 near the hand 40, where contact with a real object is more likely, can be made transparent, and the possibility of contact with the real object can be appropriately reduced.
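The weighting within the transparent region might look like the following sketch. The linear falloff, the function name, and the parameter names are assumptions; the description only says that areas nearer the hand are more transparent than areas farther away.

```python
def weighted_transparency(base_transparency, distance_to_hand, region_radius):
    """Scale a speed-based transparency by closeness to the hand.

    base_transparency : transparency in [0, 1] determined from the moving speed
    distance_to_hand  : distance from the hand to the CG point being drawn
    region_radius     : extent of the transparent region around the hand
    """
    if region_radius <= 0:
        raise ValueError("region_radius must be positive")
    if distance_to_hand >= region_radius:
        # Outside the transparent region: the CG is not made transparent.
        return 0.0
    # Weight is 1 at the hand and falls linearly to 0 at the region boundary.
    weight = 1.0 - distance_to_hand / region_radius
    return base_transparency * weight
```

A linear falloff avoids a hard edge at the boundary of the region; other monotonic falloffs would satisfy the same description.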

In S153, the image generation unit 36 generates an image of the virtual object 60 according to the transparency determined in S152. The detailed processing is the same as in S52, and its description is omitted.
As in the first embodiment, the transparency may also be determined in S152 of the present embodiment according to the distance between the real object and the hand. This process can be performed in the same manner as in the first embodiment, and its description is omitted.
As described above, the image processing apparatus 30 according to the present embodiment makes transparent the region of the image of the virtual object 60 that corresponds to a predetermined range including the hand 40. Since only part of the image of the virtual object 60, rather than the entire image, becomes the transparent region, the CG is not made transparent unnecessarily. In addition, since the transparent region is set wider in the moving direction of the hand 40 than in other directions, the possibility of contact with a real object can be reduced more reliably.

As described above, in each of the embodiments the image processing apparatus 30 controls the transparency of the CG according to the moving speed of an object such as a hand. When the hand or other object is moved quickly, the CG becomes transparent and a real object that is normally hidden behind the CG can be displayed, so the possibility that the hand or other object contacts the real object can be reduced. When the hand or other object does not move or moves slowly, the CG does not become transparent, so the observer's view of the CG is not hindered.

(Other embodiments)
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a recording medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

DESCRIPTION OF SYMBOLS: 10 ... Mixed reality system, 21L, 21R ... Imaging device, 22 ... Display device, 30 ... Image processing apparatus, 31 ... Image acquisition unit, 32 ... Image storage unit, 33 ... Object estimation unit, 34 ... Model shape storage unit, 35 ... Position and orientation estimation unit, 36 ... Image generation unit, 37 ... Image composition unit, 38 ... Display control unit

Claims

An image processing apparatus comprising:
first acquisition means for acquiring a first image obtained by imaging a real space;
second acquisition means for acquiring the moving speed of an object imaged in the first image;
determining means for determining, based on the moving speed acquired by the second acquisition means, the transparency of a second image generated based on the position and orientation of the imaging device that captured the first image; and
display control means for displaying the second image on a display unit according to the transparency determined by the determining means.

The image processing apparatus according to claim 1, wherein the determining means determines the transparency so that the second image is made transparent when the moving speed acquired by the second acquisition means is equal to or higher than a predetermined speed.

The image processing apparatus according to claim 1, wherein the determining means increases the transparency as the moving speed acquired by the second acquisition means increases.

The image processing apparatus according to claim 1, wherein the second image includes an image of a virtual object, and the determining means determines the transparency based on a distance between the position of the object and a position where the virtual object is arranged.

The image processing apparatus according to claim 1, wherein the determining means determines the transparency based on a distance between the position of a first object whose moving speed is acquired by the second acquisition means and the position of a second object, different from the first object, imaged in the first image.

The image processing apparatus according to claim 4, wherein the determining means determines the transparency so that the second image is made transparent when the distance is equal to or less than a predetermined distance.

The image processing apparatus according to claim 1, wherein the second acquisition means further acquires a moving direction of the object, and the determining means determines the transparency based on the moving speed and the moving direction acquired by the second acquisition means.

The image processing apparatus according to claim 1, wherein the determining means determines the transparency of a region of the second image corresponding to a predetermined range including the object.

The image processing apparatus according to claim 8, wherein the second acquisition means further acquires a moving direction of the object, and the determining means sets the range wider in the moving direction acquired by the second acquisition means than in other directions.

The image processing apparatus according to claim 1, wherein the object is a part of an observer's body.

The image processing apparatus according to claim 1, wherein the second acquisition means further acquires an acceleration of the object, and the determining means determines the transparency based on the acceleration acquired by the second acquisition means.

The image processing apparatus according to claim 1, wherein the display control means combines the second image with the first image and displays the result on the display unit.

The image processing apparatus according to claim 1, wherein the display unit is a head-mounted display device worn on the head of an observer, and the first image is an image obtained by capturing a real space from the viewpoint position of the observer.

An image processing method comprising:
acquiring a first image in which a real space is imaged;
acquiring a moving speed of an object imaged in the first image;
determining, based on the moving speed, the transparency of a second image generated based on the position and orientation of the imaging device that captured the first image; and
displaying the second image on a display unit in accordance with the transparency.

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 13.