JP2018077732A - Image processing apparatus, and image processing method

Info

Publication number: JP2018077732A
Application number: JP2016219953A
Authority: JP
Inventors: 彰尚三原; Akinao Mihara
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-11-10
Filing date: 2016-11-10
Publication date: 2018-05-17
Anticipated expiration: 2036-11-10
Also published as: JP6929043B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique capable of generating substitute information while a user holds an object with a part of his or her body, even when acquisition fails for the model superimposed on the body part or the position and orientation of the body part, or for the position and orientation of the object.

SOLUTION: The technique generates a first shape model of a body part of a user based on a measurement result for the body part, acquires the position and orientation of an object to be gripped by the body part, and generates a second shape model of the object. It then generates an image of a virtual space including the first shape model and the second shape model having the acquired position and orientation. When the result of the processing for acquiring the position and orientation of the object satisfies a predetermined condition, the technique estimates the position and orientation of the object based on the position and orientation of the body part and the position and orientation of the object acquired when the body part previously gripped the object, and generates the second shape model based on the estimated result.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for presenting mixed reality.

In recent years, the design and manufacturing fields have sought to shorten the time and reduce the cost of evaluations that use prototypes. Mixed reality (MR) systems have been introduced that use design data (shape and design) created with a CAD (computer-aided design) system to virtually evaluate ease of assembly and maintainability. For example, to evaluate ease of assembly, a user is assumed to grip and move a virtual object (hereinafter, an operation CG model) by hand, and contact between it and another virtual object (hereinafter, a non-operation CG model) is simulated in the virtual space. To simulate contact between the hand and the operation CG model, the hand must be modeled and the model superimposed on the hand. Likewise, to simulate contact between the operation CG model and the non-operation CG model, the position and orientation of the operation CG model must be acquired.

To model the hand, one option is the Leap Motion device made by Leap Motion. Leap Motion can measure the position and orientation of the hand, including the fingers. It detects the hand region using its built-in stereo camera and outputs, in real time, a three-dimensional polygon model imitating the shape of the hand (hereinafter, a hand model). Besides Leap Motion, the position and orientation of the hand and fingers can also be estimated from a depth sensor such as Kinect from Microsoft (registered trademark) (Non-Patent Document 1). The technique described in Non-Patent Document 1 starts from an initial position and, based on a depth image of the hand obtained by the depth sensor, iteratively optimizes a cost to generate a three-dimensional polygon model in which the poses of the hand and fingers are estimated. Hereinafter, Leap Motion and depth sensors capable of outputting a hand model are collectively referred to as sensors.

To obtain the position and orientation of an object, a known geometric pattern (hereinafter, a marker) is attached to the object, an image of the marker is captured with a camera, the geometric deformation of the marker in the image is calculated, and the relative position and orientation between the marker and the camera are computed from it. The camera here is assumed to be a stereo camera mounted on a video see-through head-mounted display (hereinafter, HMD), which is a display device for presenting mixed reality. By synchronizing the position and orientation of the virtual camera used to render the CG model with the position and orientation of the stereo camera mounted on the HMD, the hand model and the CG model can be displayed as if they existed in the real space.

Patent Document 1: JP 2008-40913 A

Non-Patent Document 1: Qian, C., Sun, X., Wei, Y., Tang, X., Sun, J.: Realtime and Robust Hand Tracking from Depth. In CVPR 2014, pp. 1106-1113. Columbus, USA (2014)

Non-Patent Document 2: Holz, D., Ullrich, S., Wolter, M., Kuhlen, T.: Multi-contact grasp interaction for virtual environments. Journal of Virtual Reality and Broadcasting, 5(7), 2008.

Non-Patent Document 3: Moehring, M., Froehlich, B.: Pseudo-Physical Interaction with a Virtual Car Interior in Immersive Environments. In Proceedings of IPT/EGVG-Workshop, 2005, Aalborg.

However, when the object is held in the hand, the hand may hide the marker attached to the object, so that the position and orientation of the object cannot be calculated correctly. Likewise, the object may hide the hand, so that the position, orientation, and shape of the hand cannot be calculated correctly.

Patent Document 1 proposes a method that, when detecting the position and orientation of an object to which a marker is attached, corrects the position and orientation calculated from the marker on the assumption that the shape of the object is known. The method therefore works only when the object shape is known. Moreover, the method of Patent Document 1 corrects a position and orientation that was calculated using a marker; the document does not describe what to do when the position and orientation could not be calculated using the marker.

The present invention has been made in view of these problems. It provides a technique that, while a user is gripping an object with a part of his or her body, can generate substitute information even when one of the following cannot be acquired: the model superimposed on the part or the position and orientation of the part, or the position and orientation of the object.

One aspect of the present invention comprises: first generation means for generating a first shape model of a part of a user based on a measurement result for the part; acquisition means for acquiring the position and orientation of an object to be gripped by the part; second generation means for generating a second shape model of the object; and third generation means for generating an image of a virtual space including the first shape model and the second shape model having the position and orientation acquired by the acquisition means. When the result of the processing by the acquisition means for acquiring the position and orientation of the object satisfies a predetermined condition, the second generation means estimates the position and orientation of the object based on the position and orientation of the part and the position and orientation of the object at a past time when the part gripped the object, and generates the second shape model based on the estimated result.

According to the configuration of the present invention, while a user is gripping an object with a part of his or her body, substitute information can be generated even when one of the following cannot be acquired: the model superimposed on the part or the position and orientation of the part, or the position and orientation of the object.

A block diagram showing an example of the functional configuration of the mixed reality space presentation system.
A diagram explaining the hand model and the operation CG model.
A flowchart of processing performed by the image processing apparatus 190.
A flowchart of processing performed by the image processing apparatus 190.
A block diagram showing an example of the hardware configuration of a computer apparatus.

Embodiments of the present invention are described below with reference to the accompanying drawings. Each embodiment described below is an example of a concrete implementation of the present invention and is one specific example of the configurations recited in the claims.

[First Embodiment]
This embodiment describes an example of an image processing apparatus configured as follows. The image processing apparatus generates a first shape model of a part of a user based on a measurement result for the part (first generation), acquires the position and orientation of an object to be gripped by the part, and generates a second shape model of the object (second generation). The image processing apparatus then generates an image of a virtual space including the first shape model and the second shape model having the acquired position and orientation (third generation). In the second generation, when the result of the processing for acquiring the position and orientation of the object satisfies a predetermined condition, the position and orientation of the object are estimated based on the position and orientation of the part and the position and orientation of the object at a past time when the part gripped the object. The second shape model is then generated based on the estimated result.

In the mixed reality space presentation system according to this embodiment, as shown in FIG. 2, a hand model 211, a three-dimensional virtual object imitating the shape of a hand 201 of a user wearing a head-mounted display device such as an HMD, is generated and placed at the position and orientation of the hand 201. The system further places an operation CG model 212, a three-dimensional virtual object imitating the shape of a driver 202 (a screwdriver, which is a real object), at the position and orientation of the driver 202. Because the operation CG model 212 is placed at the position and orientation of the driver 202, a marker 203 is attached to the driver 202 so that the system can recognize that position and orientation. Although FIG. 2 shows the marker 203 as a two-dimensional barcode, the marker 203 is not limited to this and may be any kind of index, and it may be attached anywhere on the driver 202. The system recognizes the marker 203 to obtain the position and orientation of the driver 202 and places the operation CG model 212 accordingly. However, the hand 201 may, for example, hide part or all of the marker 203. In that case, the system cannot recognize the marker 203 correctly and, as a result, cannot correctly recognize the position and orientation of the driver 202. As described later, there are other cases in which the position and orientation of the driver 202 cannot be recognized correctly. When that happens, the operation CG model 212 cannot be placed correctly at the position and orientation of the driver 202.

In this embodiment, when the position and orientation of the driver 202 cannot be recognized correctly, the current position and orientation of the driver 202 are obtained by referring to the position and orientation relationship between the hand 201 and the driver 202 at a past time when the hand 201 was gripping the driver 202.

First, an example of the functional configuration of the mixed reality space presentation system according to this embodiment is described with reference to the block diagram of FIG. 1. The configuration shown in FIG. 1 is only an example; any configuration may be adopted as long as it can handle the cases described above. As shown in FIG. 1, the mixed reality space presentation system according to this embodiment includes a sensor 101, an HMD 151 as an example of a head-mounted display device, an image processing apparatus 190, and a display device 142.

First, the sensor 101 is described. The sensor 101 is provided to measure the position, orientation, and shape of the hand (including the fingers) of the user wearing the HMD 151. For example, the Kinect sensor mentioned above in connection with Non-Patent Document 1 may be used as the sensor 101. The sensor 101 may be attached to the HMD 151, or it may be attached at a predetermined position in the real space instead.

Next, the HMD 151 is described. The display unit 111 is attached to the HMD 151 so as to be positioned in front of the eyes (right eye and left eye) of the user wearing the HMD 151, and displays images output from the image processing apparatus 190. The imaging unit 141 is attached to the HMD 151 so as to capture the user's line-of-sight direction from a position near the eyes (right eye and left eye) of the user wearing the HMD 151, and captures a moving image of the real space. The image of each frame captured by the imaging unit 141 (a captured image of the real space) is output sequentially to the image processing apparatus 190.

Next, the image processing apparatus 190 is described. Based on the result of measuring the user's hand with the sensor 101, the hand model generation unit 102 generates a shape model that is a three-dimensional virtual object imitating the shape of the hand, that is, a hand model (the hand model 211 in FIG. 2). The hand model generation unit 102 may generate the hand model using, for example, the Kinect SDK from Microsoft (registered trademark). That is, it extracts the three-dimensional region of the hand from the output of the sensor 101 and matches it against reference data, thereby outputting a hand model that reflects the shape of the hand as it changes in real time.

The generation success determination unit 103 determines whether the hand model generation unit 102 has succeeded in generating the hand model. Various criteria can be used for this determination.

For example, if the difference between the position and orientation of the hand in the current frame and the position and orientation of the hand in a past frame (for example, the frame immediately preceding the current frame) is equal to or greater than a threshold, the generation of the hand model is determined to have failed. The difference may be a difference in the position component only, in the orientation component only, or in both position and orientation. The position and orientation of each finger or joint may be used instead of the position and orientation of the hand.
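The frame-to-frame criterion above can be sketched as follows. This is a minimal illustration, not the method of the publication: the `Pose` representation (position plus unit quaternion), the function names, and the threshold values `max_shift` and `max_turn` are all assumptions introduced here, and the check uses both the position and the orientation difference.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """Hypothetical pose: position (x, y, z) and unit quaternion (w, x, y, z)."""
    position: tuple
    quaternion: tuple


def position_difference(a, b):
    """Euclidean distance between the two positions."""
    return math.dist(a.position, b.position)


def orientation_difference(a, b):
    """Angle in radians of the rotation that takes orientation a to orientation b."""
    dot = abs(sum(p * q for p, q in zip(a.quaternion, b.quaternion)))
    return 2.0 * math.acos(min(1.0, dot))


def generation_failed(current, previous, max_shift=0.10, max_turn=math.radians(45)):
    """Treat a pose jump at or above either (assumed) threshold as a failure."""
    if previous is None:
        return False  # no earlier frame to compare against
    return (position_difference(current, previous) >= max_shift
            or orientation_difference(current, previous) >= max_turn)
```

The same comparison could be applied per finger or per joint, as the text allows, by calling `generation_failed` on each tracked element.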

As another criterion, the generation of the hand model may be determined to have failed when another real object lies on the straight line from the sensor 101 to the hand, so that the sensor 101 could not provide sufficient measurement information (information sufficient to generate the hand model).

In any case, unless the generation success determination unit 103 determines that the generation of the hand model has failed, the generation is treated as successful.

The position and orientation recognition unit 112 recognizes the marker (the marker 203 in FIG. 2) appearing in the captured image output from the imaging unit 141, and obtains (recognizes) the position and orientation of the marker relative to the imaging unit 141 as the position and orientation of the driver (the driver 202 in FIG. 2). As long as the position and orientation of the driver can be acquired, the acquisition method is not limited to one that uses images captured by the imaging unit 141. For example, a magnetic sensor or an optical sensor may be attached to the driver, and the position and orientation of the driver may be obtained from the measurement result of that sensor.

The recognition success determination unit 113 determines whether the position and orientation recognition unit 112 has succeeded in recognizing the position and orientation of the driver. Various criteria can be used for this determination.

For example, if the marker could not be detected in the captured image output from the imaging unit 141, the recognition of the position and orientation of the driver is determined to have failed. Also, if the difference between the position and orientation of the driver in the current frame and the position and orientation of the driver in a past frame (for example, the frame immediately preceding the current frame) is equal to or greater than a threshold, the recognition is determined to have failed. The difference may be a difference in the position component only, in the orientation component only, or in both position and orientation.

The grip determination unit 121 determines whether the user's hand is gripping the driver. This determination can be implemented with well-known techniques; for example, the methods described in Non-Patent Document 2 or Non-Patent Document 3 may be used. For example, if the position and orientation relationship between the user's hand and the driver matches a prescribed relationship (the relationship between the hand and the driver in the state where the hand is gripping the driver), the user's hand is determined to be gripping the driver.

The information storage unit 122 functions as a memory for registering the various kinds of information that the estimation unit 123, described later, uses to estimate the position and orientation of the driver in the current frame.

When the generation of the hand model has succeeded but the recognition of the position and orientation of the driver has failed, the estimation unit 123 estimates the position and orientation of the driver using the information registered in the information storage unit 122. The operation of the estimation unit 123 is described in detail later.

The virtual space generation unit 132 first constructs a virtual space that includes the hand model, placed at the position and orientation of the hand given by the measurement result of the sensor 101, and the operation CG model, placed at the position and orientation of the driver recognized by the position and orientation recognition unit 112 or estimated by the estimation unit 123. The virtual space generation unit 132 then generates, as a virtual space image, an image of the constructed virtual space as seen from a viewpoint having the position and orientation of the imaging unit 141. The position and orientation of the imaging unit 141 may be obtained, for example, from natural features in the image captured by the imaging unit 141, or from the measurement result of a magnetic sensor or an optical sensor attached to the imaging unit 141.

The image generation unit 133 generates a composite image by combining the captured image output from the imaging unit 141 with the virtual space image generated by the virtual space generation unit 132. The image output unit 134 outputs the composite image generated by the image generation unit 133 to the HMD 151 (display unit 111) and the display device 142. The output destinations of the composite image are not limited to the HMD 151 and the display device 142.
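The text does not specify how the two images are combined. One common approach, shown here purely as an assumed illustration, is to overlay the rendered virtual pixels on the captured image and to let the captured image show through wherever nothing was rendered. The image representation (nested lists of pixels, with `None` marking an empty virtual pixel) and the function name `composite` are hypothetical.

```python
def composite(captured, virtual):
    """Overlay the virtual space image on the captured image.

    Both images are lists of rows with identical dimensions. A virtual pixel
    of None (an assumed convention) means no virtual object was drawn there,
    so the captured pixel is kept.
    """
    return [
        [v if v is not None else c for c, v in zip(captured_row, virtual_row)]
        for captured_row, virtual_row in zip(captured, virtual)
    ]
```

A real renderer would more likely use an alpha channel or a depth mask than `None` markers; the sketch only shows the per-pixel selection.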

Next, the processing that the image processing apparatus 190 performs to generate and output one frame of the composite image is described with reference to FIG. 3, which shows a flowchart of the processing. In other words, the image processing apparatus 190 performs the processing of the flowchart of FIG. 3 for the captured image of each frame output from the imaging unit 141.

In step S301, the hand model generation unit 102 generates a hand model based on the result of measuring the user's hand with the sensor 101. In step S302, the generation success determination unit 103 determines whether the hand model generation unit 102 has succeeded in generating the hand model. If the generation is determined to have succeeded, the process proceeds to step S303; if it is determined to have failed, the process proceeds to step S311.

In step S303, the generation success determination unit 103 registers, in the information storage unit 122, the position and orientation of the hand indicated by the measurement result of the sensor 101 (the position and orientation of the hand model) in association with frame information that identifies the current frame (such as a frame number or the capture date and time).

In step S311, the position and orientation recognition unit 112 recognizes the marker appearing in the captured image output from the imaging unit 141 and, based on the recognition result, obtains (recognizes) the position and orientation of the marker as the position and orientation of the driver (the position and orientation of the operation CG model).

In step S312, the recognition success determination unit 113 determines whether the position and orientation recognition unit 112 has succeeded in recognizing the position and orientation of the driver. If the recognition is determined to have succeeded, the process proceeds to step S313. If it is determined to have failed, the process proceeds to step S321.

In step S313, the recognition success determination unit 113 registers, in the information storage unit 122, the position and orientation of the driver recognized in step S311 (the position and orientation of the operation CG model) in association with frame information that identifies the current frame (such as a frame number or the capture date and time).

If, at this point, the success condition "the generation of the hand model has been determined to have succeeded and the recognition of the position and orientation of the driver has been determined to have succeeded" is satisfied, the process proceeds via step S321 to step S322. If the success condition is not satisfied, the process proceeds via step S321 to step S323.

In step S322, the grip determination unit 121 determines whether the user's hand is gripping the driver, and registers a value indicating the result (a determination value) in the information storage unit 122 in association with frame information that identifies the current frame (such as a frame number or the capture date and time). For example, the determination value is "1" when the user's hand is gripping the driver and "0" when it is not.

In step S323, if the hand model generation unit 102 has succeeded in generating the hand model, the process proceeds to step S324; if the generation of the hand model has failed, the process proceeds to step S330.

In step S324, the estimation unit 123 first refers to the information registered in the information storage unit 122 and, among the frame information registered in association with a determination value indicating that the user's hand is gripping the driver, identifies the frame information of the most recent frame. The estimation unit 123 then searches the information storage unit 122 for the position and orientation of the hand and the position and orientation of the driver registered in association with the identified frame information. If the search succeeds, the process proceeds to step S325; if it fails, the process proceeds to step S330. A successful search means that the estimation unit 123 has acquired the position and orientation of the hand and of the driver at the most recent time the hand was gripping the driver.
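The registration in steps S303, S313, and S322 and the search in step S324 can be sketched with a small in-memory store keyed by frame number. The class and method names (`FrameRecord`, `InformationStore`, `record`, `latest_grip`) and the use of an integer frame number as frame information are assumptions made for illustration; the publication does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class FrameRecord:
    """Data registered for one frame; a field stays None if its step was skipped."""
    hand_pose: Optional[object] = None    # registered in step S303
    driver_pose: Optional[object] = None  # registered in step S313
    gripping: Optional[bool] = None       # determination value from step S322


@dataclass
class InformationStore:
    """Hypothetical stand-in for the information storage unit 122."""
    records: Dict[int, FrameRecord] = field(default_factory=dict)

    def record(self, frame: int) -> FrameRecord:
        """Return the record for a frame, creating it on first use."""
        return self.records.setdefault(frame, FrameRecord())

    def latest_grip(self) -> Optional[Tuple[object, object]]:
        """Step S324: poses from the most recent frame marked as gripping."""
        for frame in sorted(self.records, reverse=True):
            r = self.records[frame]
            if r.gripping and r.hand_pose is not None and r.driver_pose is not None:
                return r.hand_pose, r.driver_pose
        return None  # search failed; the flow would go to step S330
```

For example, after `store.record(7).hand_pose = pose` in step S303, a later call to `store.latest_grip()` returns the hand and driver poses of the newest frame whose determination value indicated gripping, or `None` if no such frame exists.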

ステップＳ３２５では、推定部１２３は、上記の検索により取得した手の位置姿勢及びドライバの位置姿勢を用いて、手に対するドライバの相対的な位置姿勢Δを算出する。そして推定部１２３は、現フレームにおけるハンドモデルの位置姿勢に対する相対的な位置姿勢が、この求めた相対的な位置姿勢Δとなる位置姿勢を、現フレームにおける操作ＣＧモデルの位置姿勢として算出する。 In step S325, the estimation unit 123 calculates the relative position / posture Δ of the driver with respect to the hand, using the hand position / posture and the driver's position / posture acquired by the search. Then, the estimation unit 123 calculates a position and orientation in which the relative position and orientation relative to the position and orientation of the hand model in the current frame become the obtained relative position and orientation Δ as the position and orientation of the operation CG model in the current frame.
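The calculation in step S325 can be illustrated with rigid transforms. The sketch below assumes each position and orientation is a pair (3×3 rotation matrix, translation vector) mapping local coordinates to world coordinates; the text does not prescribe this representation, and the function names are hypothetical.

```python
from typing import List, Tuple

Mat3 = List[List[float]]
Vec3 = List[float]
Pose = Tuple[Mat3, Vec3]  # (rotation R, translation t): x_world = R * x_local + t


def mat_mul(a: Mat3, b: Mat3) -> Mat3:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(a: Mat3, v: Vec3) -> Vec3:
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(a: Mat3) -> Mat3:
    return [[a[j][i] for j in range(3)] for i in range(3)]


def relative_pose(hand: Pose, driver: Pose) -> Pose:
    """Delta: the driver's pose expressed in the hand's coordinate frame."""
    r_h, t_h = hand
    r_d, t_d = driver
    r_h_inv = transpose(r_h)  # the inverse of a rotation matrix is its transpose
    diff = [t_d[i] - t_h[i] for i in range(3)]
    return mat_mul(r_h_inv, r_d), mat_vec(r_h_inv, diff)


def apply_relative(hand_now: Pose, delta: Pose) -> Pose:
    """Pose whose pose relative to hand_now equals delta (the step S325 result)."""
    r_h, t_h = hand_now
    r_rel, t_rel = delta
    moved = mat_vec(r_h, t_rel)
    return mat_mul(r_h, r_rel), [moved[i] + t_h[i] for i in range(3)]
```

With this convention, `relative_pose` is evaluated once on the poses retrieved in step S324, and `apply_relative` is evaluated on the current hand model pose to place the operation CG model.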

ステップＳ３３０では、仮想空間生成部１３２は、ハンドモデル及び操作ＣＧモデルを上記の如く配置した仮想空間を、撮像部１４１の位置姿勢を有する視点から見た画像を仮想空間画像として生成する。なお、ハンドモデルの生成に失敗した場合や、操作ＣＧモデルの位置姿勢の認識に失敗した場合には、ハンドモデル及び操作ＣＧモデルは配置できない。このような場合における仮想空間画像としては、例えば、直前のフレームにおける仮想空間画像を現フレームの仮想空間画像として使用しても良い。 In step S330, the virtual space generation unit 132 generates, as a virtual space image, an image obtained by viewing the virtual space in which the hand model and the operation CG model are arranged as described above from the viewpoint having the position and orientation of the imaging unit 141. Note that the hand model and the operation CG model cannot be arranged when the generation of the hand model fails or when the recognition of the position and orientation of the operation CG model fails. As the virtual space image in such a case, for example, the virtual space image in the immediately preceding frame may be used as the virtual space image of the current frame.

ステップＳ３４０では、画像生成部１３３は、撮像部１４１から出力された撮像画像と、仮想空間生成部１３２によって生成された仮想空間画像と、を合成した合成画像を生成する。画像出力部１３４は、画像生成部１３３が生成した合成画像を、ＨＭＤ１５１（表示部１１１）及び表示装置１４２に対して出力する。 In step S340, the image generation unit 133 generates a composite image obtained by combining the captured image output from the imaging unit 141 and the virtual space image generated by the virtual space generation unit 132. The image output unit 134 outputs the composite image generated by the image generation unit 133 to the HMD 151 (display unit 111) and the display device 142.

＜変形例１＞ <Modification 1>
第１の実施形態では、ハンドモデルの生成に成功するたびに手の位置姿勢を情報保存部１２２に登録し、ドライバの位置姿勢の認識に成功するたびにドライバの位置姿勢を情報保存部１２２に登録していた。しかし、情報保存部１２２に登録した位置姿勢のうち実際に使用されるものは、最近に手がドライバを把持したときの手の位置姿勢及びドライバの位置姿勢である。然るに、ユーザの手がドライバを把持していると判断された場合にのみ、手の位置姿勢及びドライバの位置姿勢を登録するようにしても良い。また、登録する位置姿勢は最新のフレームにおけるもののみとしても良い。 In the first embodiment, the position and orientation of the hand are registered in the information storage unit 122 every time the hand model is successfully generated, and the position and orientation of the driver are registered in the information storage unit 122 every time the position and orientation of the driver are successfully recognized. However, of the positions and orientations registered in the information storage unit 122, the ones actually used are the position and orientation of the hand and the position and orientation of the driver from the most recent time the hand gripped the driver. Accordingly, the position and orientation of the hand and of the driver may be registered only when it is determined that the user's hand is gripping the driver. Furthermore, the registered positions and orientations may be limited to those of the latest frame.

また、このほかにも、手の位置姿勢やドライバの位置姿勢を登録する条件としては様々なものが考えられる。例えば、手とドライバとの間の相対位置の変化量が閾値以上となった場合や、ハンドモデルの形状の変化量が閾値以上となった場合に登録するようにしても良い。ハンドモデルの形状変化は、ハンドモデルの指や関節の位置姿勢変化から求めることができる。 In addition to this, various conditions for registering the position and orientation of the hand and the position and orientation of the driver are conceivable. For example, registration may be performed when the amount of change in the relative position between the hand and the driver is greater than or equal to a threshold, or when the amount of change in the shape of the hand model is greater than or equal to the threshold. The shape change of the hand model can be obtained from the position and orientation changes of the fingers and joints of the hand model.
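The threshold-based registration conditions above might look like the following sketch. The metrics (Euclidean distance for the relative position, largest single-joint displacement for the shape change), the default threshold values, and the function name are assumptions for illustration. Joint positions are assumed to be given in the hand's own coordinate frame, so that rigid motion of the whole hand does not count as a shape change.

```python
import math
from typing import Sequence


def should_register(prev_rel_pos: Sequence[float],
                    cur_rel_pos: Sequence[float],
                    prev_joints: Sequence[Sequence[float]],
                    cur_joints: Sequence[Sequence[float]],
                    pos_threshold: float = 0.01,
                    shape_threshold: float = 0.02) -> bool:
    """Return True when either registration condition holds.

    Condition 1: the hand-to-driver relative position changed by at least
    pos_threshold. Condition 2: the hand model shape changed by at least
    shape_threshold, measured here as the largest joint displacement.
    """
    rel_change = math.dist(prev_rel_pos, cur_rel_pos)
    shape_change = max(
        (math.dist(p, c) for p, c in zip(prev_joints, cur_joints)),
        default=0.0,
    )
    return rel_change >= pos_threshold or shape_change >= shape_threshold
```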

［第２の実施形態］ [Second Embodiment]
第１の実施形態は、ハンドモデルの生成には成功したものの、ドライバの位置姿勢の認識に失敗したケースに対処するものであった。本実施形態は、ドライバの位置姿勢の認識には成功したものの、ハンドモデルの生成に失敗したケースに対処するものである。本実施形態を含め、以下の各実施形態では、第１の実施形態との差分について重点的に説明し、以下で特に触れない限りは第１の実施形態と同様であるものとする。 The first embodiment deals with the case where the hand model is successfully generated but recognition of the driver's position and orientation fails. This embodiment deals with the case where the driver's position and orientation are successfully recognized but generation of the hand model fails. In this and each of the following embodiments, the description focuses on the differences from the first embodiment; anything not specifically mentioned below is the same as in the first embodiment.

本実施形態では、画像処理装置１９０は図３のフローチャートに従った処理を行う代わりに、図４のフローチャートに従った処理を行う。図４において図３に示した処理ステップと同じ処理ステップには同じステップ番号を付しており、該処理ステップに係る説明は省略する。 In the present embodiment, the image processing apparatus 190 performs processing according to the flowchart of FIG. 4 instead of performing processing according to the flowchart of FIG. 3. In FIG. 4, the same processing steps as those shown in FIG. 3 are denoted by the same step numbers, and description thereof will be omitted.

ステップＳ４００では、生成成功判断部１０３は、ステップＳ３０１で生成したハンドモデルと、該ハンドモデル（手）の位置姿勢と、現フレームを特定するフレーム情報（フレーム番号、撮像日時など）と、を関連づけて情報保存部１２２に登録する。 In step S400, the generation success determination unit 103 registers, in the information storage unit 122, the hand model generated in step S301, the position and orientation of that hand model (hand), and the frame information (frame number, imaging date and time, etc.) identifying the current frame, in association with one another.

そして、「ハンドモデルの生成が成功したと判断され且つドライバの位置姿勢の認識が成功したと判断された」という成功条件が満たされていない場合には、処理はステップＳ３２１を介してステップＳ４０１に進む。 If the success condition “it is determined that the hand model has been successfully generated and that the driver's position and orientation have been successfully recognized” is not satisfied, the process proceeds to step S401 via step S321.

そして処理がステップＳ４０１に進むと、位置姿勢認識部１１２によるドライバの位置姿勢の認識が成功している場合には、処理はステップＳ３２４に進み、ドライバの位置姿勢の認識が失敗した場合には、処理はステップＳ３３０に進む。そしてステップＳ３２４における検索が成功した場合には、処理はステップＳ４０３に進み、この検索が失敗した場合には、処理はステップＳ３３０に進む。 When the process proceeds to step S401, if the position and orientation recognition unit 112 has successfully recognized the position and orientation of the driver, the process proceeds to step S324; if recognition of the driver's position and orientation has failed, the process proceeds to step S330. If the search in step S324 succeeds, the process proceeds to step S403; if the search fails, the process proceeds to step S330.

ステップＳ４０３では、推定部１２３は、上記の検索により取得した手の位置姿勢及びドライバの位置姿勢を用いて、ドライバに対する手の相対的な位置姿勢Δを算出する。そして推定部１２３は、現フレームにおける操作ＣＧモデルの位置姿勢に対する相対的な位置姿勢が、この求めた相対的な位置姿勢Δとなる位置姿勢を、現フレームにおけるハンドモデルの位置姿勢として算出する。また、ハンドモデルそのものが生成できなかった場合には、ステップＳ３２４で特定したフレーム情報と関連づけて情報保存部１２２に登録されているハンドモデルを読み出し、該読み出したハンドモデルを現フレームにおけるハンドモデルとして使用してもよい。なお、ステップＳ４００ではハンドモデルの代わりに、情報保存部１２２におけるメモリ効率の観点から、該ハンドモデルのボーンを登録しても良いし、ハンドモデルを視点から見た２次元画像を登録しても良い。 In step S403, the estimation unit 123 calculates the relative position and orientation Δ of the hand with respect to the driver, using the position and orientation of the hand and the position and orientation of the driver acquired by the above search. The estimation unit 123 then calculates, as the position and orientation of the hand model in the current frame, the position and orientation whose position and orientation relative to that of the operation CG model in the current frame equals the obtained Δ. If the hand model itself could not be generated, the hand model registered in the information storage unit 122 in association with the frame information identified in step S324 may be read out and used as the hand model in the current frame. Note that in step S400, for the sake of memory efficiency in the information storage unit 122, the bones of the hand model may be registered instead of the hand model itself, or a two-dimensional image of the hand model as seen from the viewpoint may be registered.
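The relation used in step S403 can be written compactly if each position and orientation is represented as a 4×4 homogeneous transform from local to world coordinates. That representation and the symbols below are assumptions for illustration; the text does not prescribe a parameterization. Here $T_h^{\mathrm{past}}$ and $T_o^{\mathrm{past}}$ denote the hand and driver poses retrieved in step S324, and $T_o^{\mathrm{cur}}$ denotes the driver pose recognized in the current frame.

```latex
\Delta = \left(T_o^{\mathrm{past}}\right)^{-1} T_h^{\mathrm{past}},
\qquad
T_h^{\mathrm{cur}} = T_o^{\mathrm{cur}}\,\Delta
```

Under the same assumption, step S325 of the first embodiment swaps the roles of hand and driver: $\Delta' = \left(T_h^{\mathrm{past}}\right)^{-1} T_o^{\mathrm{past}}$ and $T_o^{\mathrm{cur}} = T_h^{\mathrm{cur}}\,\Delta'$.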

［第３の実施形態］ [Third Embodiment]
第１の実施形態では、ハンドモデルの生成に失敗した場合や、ドライバの位置姿勢の認識に失敗した場合に、現フレームのハンドモデルやその位置姿勢、現フレームの操作ＣＧモデルの位置姿勢、を求めるようにした。しかし、ハンドモデルの生成やドライバの位置姿勢の認識に成功したと判断したとしても、生成したハンドモデルの精度やその位置姿勢の精度、操作ＣＧモデルの位置姿勢の精度を考慮すると、その信頼度が低い場合がある。然るに、単にハンドモデルの生成に成功した／失敗した、ドライバの位置姿勢の認識に成功した／失敗した、に応じて情報保存部１２２への情報登録を制御するのではなく、その信頼度を考慮して制御するようにしても良い。 In the first embodiment, when generation of the hand model fails or recognition of the driver's position and orientation fails, the hand model of the current frame and its position and orientation, or the position and orientation of the operation CG model of the current frame, are estimated. However, even when it is determined that generation of the hand model or recognition of the driver's position and orientation has succeeded, the reliability of the result may be low in view of the accuracy of the generated hand model, the accuracy of its position and orientation, and the accuracy of the position and orientation of the operation CG model. Accordingly, instead of controlling the registration of information in the information storage unit 122 simply according to whether generation of the hand model succeeded or failed and whether recognition of the driver's position and orientation succeeded or failed, the registration may be controlled in consideration of this reliability.

例えば、センサ１０１から得られる手のデータ量が、規定のデータ量（本来センサ１０１から得られるデータ量）未満であれば信頼度＝０として、ステップＳ３０２からステップＳ３１１に処理を進める。一方、センサ１０１から得られる手のデータ量が規定のデータ量以上であれば信頼度＝１として、ステップＳ３０２からステップＳ３０３に処理を進める。 For example, if the data amount of the hand obtained from the sensor 101 is less than the prescribed data amount (data amount originally obtained from the sensor 101), the reliability is set to 0, and the process proceeds from step S302 to step S311. On the other hand, if the data amount of the hand obtained from the sensor 101 is equal to or greater than the prescribed data amount, the reliability is set to 1, and the process proceeds from step S302 to step S303.

また例えば、撮像部１４１による撮像画像中のマーカの数が多いほど、ドライバの位置姿勢の認識の信頼度が高いと判断しても良い。そして信頼度が閾値以上であれば、処理はステップＳ３１２からステップＳ３１３に処理を進め、信頼度が閾値未満であれば、処理はステップＳ３１２からステップＳ３２１に処理を進める。 For example, it may be determined that the greater the number of markers in the image captured by the imaging unit 141, the higher the reliability of recognition of the position and orientation of the driver. If the reliability is greater than or equal to the threshold, the process proceeds from step S312 to step S313. If the reliability is less than the threshold, the process proceeds from step S312 to step S321.
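The two reliability rules above can be sketched as small gating functions. The text only states that more markers mean higher reliability; the linear ratio used below, the threshold value, and all function names are assumptions for illustration.

```python
def hand_reliability(data_amount: int, prescribed_amount: int) -> int:
    """Reliability of the hand measurement: 1 if the sensor delivered at least
    the prescribed amount of hand data, otherwise 0."""
    return 1 if data_amount >= prescribed_amount else 0


def next_step_after_s302(reliability: int) -> str:
    """Reliability 1 continues to S303; reliability 0 branches to S311."""
    return "S303" if reliability == 1 else "S311"


def marker_reliability(visible_markers: int, total_markers: int) -> float:
    """Reliability of driver pose recognition, increasing with the number of
    markers visible in the captured image (assumed here to be a linear ratio)."""
    if total_markers <= 0:
        raise ValueError("total_markers must be positive")
    return min(visible_markers, total_markers) / total_markers


def next_step_after_s312(reliability: float, threshold: float) -> str:
    """At or above the threshold continue to S313; below it branch to S321."""
    return "S313" if reliability >= threshold else "S321"
```

For example, with four markers on the driver and a threshold of 0.5, two visible markers would still take the S313 branch, while one visible marker would take the S321 branch.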

［第４の実施形態］ [Fourth Embodiment]
図１に示した画像処理装置１９０に含まれている各機能部の全てをハードウェアで構成しても良いが、情報保存部１２２をメモリで実装し、それ以外の各機能部をソフトウェア（コンピュータプログラム）で実装するようにしても良い。このような場合、情報保存部１２２をメモリとして有し、且つ該コンピュータプログラムを実行可能なコンピュータ装置は、上記の画像処理装置１９０に適用可能である。画像処理装置１９０に適用可能なコンピュータ装置のハードウェア構成例について、図５のブロック図を用いて説明する。 All of the functional units included in the image processing apparatus 190 shown in FIG. 1 may be configured by hardware. Alternatively, the information storage unit 122 may be implemented by a memory and the other functional units may be implemented by software (a computer program). In that case, a computer apparatus that has the information storage unit 122 as a memory and can execute the computer program is applicable to the image processing apparatus 190 described above. An example hardware configuration of a computer apparatus applicable to the image processing apparatus 190 is described with reference to the block diagram of FIG. 5.

ＣＰＵ５１０は、ＲＯＭ５２０やＲＡＭ５３０に格納されているコンピュータプログラムやデータを用いて処理を実行する。これによりＣＰＵ５１０は、コンピュータ装置全体の動作制御を行うと共に、画像処理装置１９０が行うものとして上述した各処理を実行若しくは制御する。 CPU 510 executes processing using computer programs and data stored in ROM 520 and RAM 530. As a result, the CPU 510 controls the operation of the entire computer apparatus and executes or controls each process described above as being performed by the image processing apparatus 190.

ＲＯＭ５２０には、書き換え不要の本コンピュータ装置の設定データやブートプログラムなどが格納されている。ＲＡＭ５３０は、Ｉ／Ｆ（インターフェース）５４０を介して外部から受信したデータ、外部記憶装置５６０からロードされたコンピュータプログラムやデータを格納するためのエリアを有する。更にＲＡＭ５３０は、ＣＰＵ５１０が各種の処理を実行する際に用いるワークエリアを有する。このようにＲＡＭ５３０は、各種のエリアを適宜提供することができる。 The ROM 520 stores setting data, a boot program, and the like of the computer apparatus that do not require rewriting. The RAM 530 has an area for storing data received from the outside via the I / F (interface) 540, computer programs loaded from the external storage device 560, and data. Further, the RAM 530 has a work area used when the CPU 510 executes various processes. As described above, the RAM 530 can provide various areas as appropriate.

Ｉ／Ｆ５４０は、上記のセンサ１０１、ＨＭＤ１５１、表示装置１４２を接続するためのものである。センサ１０１から出力される手の測定結果やＨＭＤ１５１から出力される各フレームの撮像画像は、Ｉ／Ｆ５４０を介してＲＡＭ５３０や外部記憶装置５６０に入力される。また、本コンピュータ装置が生成した合成画像などの情報は、Ｉ／Ｆ５４０を介してＨＭＤ１５１（表示部１１１）や表示装置１４２に対して出力される。 The I / F 540 is for connecting the sensor 101, the HMD 151, and the display device 142 described above. The hand measurement result output from the sensor 101 and the captured image of each frame output from the HMD 151 are input to the RAM 530 and the external storage device 560 via the I / F 540. Further, information such as a composite image generated by the computer apparatus is output to the HMD 151 (display unit 111) and the display apparatus 142 via the I / F 540.

外部記憶装置５６０は、ハードディスクドライブ装置に代表される大容量情報記憶装置である。外部記憶装置５６０には、ＯＳ（オペレーティングシステム）や、画像処理装置１９０が行うものとして上述した各処理をＣＰＵ５１０に実行若しくは制御させるためのコンピュータプログラムやデータが保存されている。このコンピュータプログラムには、図１において画像処理装置１９０が有するものとして示した各機能部のうち情報保存部１２２を除く各機能部の機能をＣＰＵ５１０に実現させるためのコンピュータプログラムが含まれている。また、外部記憶装置５６０に保存されているデータには、ハンドモデルや操作ＣＧモデルのデータ、上記の説明において既知の情報として説明した情報、が含まれている。 The external storage device 560 is a large-capacity information storage device represented by a hard disk drive device. The external storage device 560 stores an OS (Operating System) and computer programs and data for causing the CPU 510 to execute or control each of the above-described processes performed by the image processing apparatus 190. This computer program includes a computer program for causing the CPU 510 to realize the functions of the functional units excluding the information storage unit 122 among the functional units shown as the image processing apparatus 190 in FIG. The data stored in the external storage device 560 includes hand model and operation CG model data and information described as known information in the above description.

外部記憶装置５６０に保存されているコンピュータプログラムやデータは、ＣＰＵ５１０による制御に従って適宜ＲＡＭ５３０にロードされ、ＣＰＵ５１０による処理対象となる。ＣＰＵ５１０、ＲＯＭ５２０、ＲＡＭ５３０、Ｉ／Ｆ５４０、外部記憶装置５６０は何れも、バス５００に接続されている。 Computer programs and data stored in the external storage device 560 are appropriately loaded into the RAM 530 under the control of the CPU 510 and are processed by the CPU 510. The CPU 510, ROM 520, RAM 530, I / F 540, and external storage device 560 are all connected to the bus 500.

［第５の実施形態］ [Fifth Embodiment]
上記の各実施形態では、ＨＭＤ１５１はビデオシースルー方式のものを用いるものとして説明したが、光学シースルー方式のものを採用しても良い。その場合、撮像部１４１による撮像画像は、マーカの撮像やＨＭＤ１５１の位置姿勢を算出するために使用され、表示用には使用されない。また、画像処理装置１９０も、仮想空間画像を生成すると、これを表示装置１４２及びＨＭＤ１５１に対して出力する。また、ＨＭＤ１５１の代わりに、スマートフォンやカメラ付タブレット端末装置を使用しても構わない。 In each of the above embodiments, the HMD 151 has been described as a video see-through type, but an optical see-through type may be employed instead. In that case, the image captured by the imaging unit 141 is used for capturing the markers and for calculating the position and orientation of the HMD 151, and is not used for display. Also in that case, once the image processing apparatus 190 has generated the virtual space image, it outputs that image to the display device 142 and the HMD 151. A smartphone or a camera-equipped tablet terminal device may also be used in place of the HMD 151.

なお、以上の各実施形態や変形例では、物体を把持するユーザの部位を手、把持対象となる対象物をドライバとして説明を行ったが、物体を把持するユーザの部位、把持対象となる対象物、のそれぞれは手、ドライバに限るものではない。 In each of the embodiments and modifications described above, the part of the user that grips an object has been described as a hand and the object to be gripped as a driver. However, the part of the user that grips an object and the object to be gripped are not limited to a hand and a driver, respectively.

また、操作ＣＧモデルと、ハンドモデル及び操作ＣＧモデル以外の仮想物体である非操作ＣＧモデルとの接触があった場合には、その旨を表示部１１１や表示装置１４２に通知するようにしても良い。例えば、操作ＣＧモデルと非操作ＣＧモデルとで接触のあった箇所（ポリゴンなど）を明示的に表示しても良い。また、以上説明した各実施形態や変形例の一部若しくは全部を適宜組み合わせても構わない。 When the operation CG model comes into contact with a non-operation CG model, that is, a virtual object other than the hand model and the operation CG model, the display unit 111 and the display device 142 may be notified of the contact. For example, the location (polygon or the like) at which the operation CG model and the non-operation CG model are in contact may be explicitly displayed. Some or all of the embodiments and modifications described above may also be combined as appropriate.

（その他の実施例） (Other Embodiments)
本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。 The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

１０２：ハンドモデル生成部　１０３：生成成功判断部　１１２：位置姿勢認識部　１１３：認識成功判断部　１２１：把持判定部　１２３：推定部　１３２：仮想空間生成部　１３３：画像生成部 102: hand model generation unit; 103: generation success determination unit; 112: position and orientation recognition unit; 113: recognition success determination unit; 121: grip determination unit; 123: estimation unit; 132: virtual space generation unit; 133: image generation unit

Claims

An image processing apparatus comprising:
first generation means for generating a first shape model of a part of a user based on a measurement result for the part;
acquisition means for acquiring the position and orientation of an object to be gripped by the part;
second generation means for generating a second shape model of the object; and
third generation means for generating an image of a virtual space including the first shape model and the second shape model having the position and orientation acquired by the acquisition means,
wherein, when the result of the processing by which the acquisition means acquires the position and orientation of the object satisfies a predetermined condition, the second generation means estimates the position and orientation of the object based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and generates the second shape model based on the estimated result.

The image processing apparatus according to claim 1, wherein, as the predetermined condition, when acquisition of the position and orientation of the object by the acquisition means fails, the second generation means estimates the position and orientation of the object based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and generates the second shape model based on the estimated result.

The image processing apparatus according to claim 1, wherein, as the predetermined condition, when an index provided on the object for recognizing the position and orientation of the object cannot be detected, the second generation means estimates the position and orientation of the object based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and generates the second shape model based on the estimated result.

The image processing apparatus according to claim 1, wherein, as the predetermined condition, when the recognized position and/or orientation of the object has changed by a threshold or more from the position and/or orientation recognized in the past, the second generation means estimates the position and orientation of the object based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and generates the second shape model based on the estimated result.

The image processing apparatus according to claim 1, wherein, as the predetermined condition, when acquisition of the position and orientation of the object by the acquisition means fails, the second generation means obtains the relative position and orientation relationship between the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, estimates the position and orientation of the object based on the obtained relationship and the current position and orientation of the first shape model, and generates the second shape model based on the estimated result.

An image processing apparatus comprising:
first generation means for generating a first shape model of a part of a user based on a measurement result of the position and orientation of the part;
second generation means for generating a second shape model of an object to be gripped by the part; and
third generation means for generating an image of a virtual space including the first shape model and the second shape model,
wherein, when the result of the processing for acquiring the position and orientation of the part satisfies a predetermined condition, the first generation means estimates the position and orientation of the part based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and generates the first shape model based on the estimated position and orientation.

The image processing apparatus according to claim 6, wherein, as the predetermined condition, when acquisition of the position and orientation of the part fails, the first generation means estimates the position and orientation of the part based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and generates the first shape model based on the estimated position and orientation.

The image processing apparatus according to claim 6, wherein, as the predetermined condition, when generation of the first shape model fails, the first generation means uses the first shape model from a time when the part gripped the object in the past as the current first shape model.

The image processing apparatus according to claim 1, further comprising means for acquiring a captured image of the real space,
wherein the third generation means generates a composite image of the image of the virtual space and the captured image.

An image processing method performed by an image processing apparatus, comprising:
a first generation step in which first generation means of the image processing apparatus generates a first shape model of a part of a user based on a measurement result for the part;
an acquisition step in which acquisition means of the image processing apparatus acquires the position and orientation of an object to be gripped by the part;
a second generation step in which second generation means of the image processing apparatus generates a second shape model of the object; and
a third generation step in which third generation means of the image processing apparatus generates an image of a virtual space including the first shape model and the second shape model having the position and orientation acquired in the acquisition step,
wherein, in the second generation step, when the result of the processing for acquiring the position and orientation of the object in the acquisition step satisfies a predetermined condition, the position and orientation of the object are estimated based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and the second shape model is generated based on the estimated result.

An image processing method performed by an image processing apparatus, comprising:
a first generation step in which first generation means of the image processing apparatus generates a first shape model of a part of a user based on a measurement result of the position and orientation of the part;
a second generation step in which second generation means of the image processing apparatus generates a second shape model of an object to be gripped by the part; and
a third generation step in which third generation means of the image processing apparatus generates an image of a virtual space including the first shape model and the second shape model,
wherein, in the first generation step, when the result of the processing for acquiring the position and orientation of the part satisfies a predetermined condition, the position and orientation of the part are estimated based on the position and orientation of the part and the position and orientation of the object at a time when the part gripped the object in the past, and the first shape model is generated based on the estimated position and orientation.

A computer program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 9.