JP2015069512A

JP2015069512A - Information processing apparatus and information processing method

Info

Publication number: JP2015069512A
Application number: JP2013204531A
Authority: JP
Inventors: Mikiko Nakanishi (中西 美木子); Tsutomu Horikoshi (堀越 力)
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2013-09-30
Filing date: 2013-09-30
Publication date: 2015-04-13
Anticipated expiration: 2033-09-30
Also published as: JP6169462B2

Abstract

PROBLEM TO BE SOLVED: To allow a user to perform input operations using an object in the user's surroundings. SOLUTION: In an input area recognition unit 105 of an information processing apparatus 100, an input area in which the user performs input operations relating to content is estimated, within a real space image acquired by an image acquisition unit 103, on the basis of first area specifying information held in advance; the content is then drawn by drawing means and displayed on the estimated input area. For example, information that identifies, among the objects around the user, the object the user wants to use for input is set as the first area specifying information, and the area of the real space image in which that object is captured is estimated to be the input area. Because the input area is thereby determined to be the area in which the user performs input operations, the user can perform input operations using an object in the user's surroundings.

Description

The present invention relates to an information processing apparatus and an information processing method that perform processing in response to an input operation by a user.

In recent years, services using AR (Augmented Reality) technology have been developed and provided. As a technology related to AR, Patent Document 1, for example, discloses a method that recognizes the features of a specific object from an image captured by a camera, tracks the object on that basis, and superimposes and displays an image related to the object.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2006-507722

Meanwhile, methods are known that superimpose virtual-space information on real space using a wearable terminal, such as a lightweight glasses-type terminal, in place of a conventional notebook PC or mobile terminal. When the user operates on the virtual-space information, however, the virtual space floats in the air and does not physically exist, so the user cannot tell the depth at which it is located and receives no tactile feedback from the operation.

One conceivable remedy is as follows. When the user performs an input operation on the terminal, the virtual-space information is superimposed on the image of the surface of a specific object contained in a real space image of the user's surroundings, and the surface of that object is then used like a touch panel. This would improve the user's sense of operation during input.

However, although the method described in Patent Document 1 can track an object on the basis of features acquired in advance, it has difficulty extracting a specific object from an image in which an unspecified number of objects are captured. Therefore, even with the method of Patent Document 1, it remains difficult for the user to casually make use of nearby objects when instructing the terminal to perform some processing, and difficult to improve the user's sense of operation.

The present invention has been made in view of the above, and its object is to provide an information processing apparatus and an information processing method that allow a user to perform input operations using an object in the user's surroundings.

To achieve the above object, an information processing apparatus according to the present invention comprises: image acquisition means for acquiring a real space image; input operation recognition means for recognizing an input operation by a user; input area estimation means for estimating, triggered by the input operation recognition means recognizing the input operation by the user, an input area in which the user performs input operations relating to content, within the real space image acquired by the image acquisition means, on the basis of first area specifying information held in advance; drawing means for drawing the content in an area corresponding to the input area estimated by the input area estimation means; and display means for displaying the content drawn by the drawing means.

An information processing method according to the present invention comprises: an image acquisition step of acquiring a real space image by image acquisition means; an input operation recognition step of recognizing an input operation by a user by input operation recognition means; an input area estimation step of estimating by input area estimation means, triggered by recognition of the input operation by the user in the input operation recognition step, an input area in which the user performs input operations relating to content, within the real space image acquired by the image acquisition means, on the basis of first area specifying information held in advance; a drawing step of drawing by drawing means the content in an area corresponding to the input area estimated in the input area estimation step; and a display step of displaying by display means the content drawn in the drawing step.

With the above information processing apparatus and method, the input area estimation means estimates, within the real space image acquired by the image acquisition means and on the basis of first area specifying information held in advance, the input area in which the user performs input operations relating to content; the drawing means then draws the content on the estimated input area, and the content is displayed. As a result, for example, information identifying the object that the user wants to use for input, among the objects in the user's surroundings, can be set as the first area specifying information, and the area of the real space image in which that object is captured can be estimated as the input area. Because that input area is determined to be the area in which the user performs input operations, the user can perform input operations using a nearby object.

As a configuration that achieves this effect well, the first area specifying information can be information that specifies the shape of the input area. When the shape of the object the user wants to use for input is known, using information that specifies that shape as the first area specifying information makes estimating the input area from the real space image simpler and more reliable.

The input area estimation means may also acquire second area specifying information from the real space image acquired by the image acquisition means, on the basis of the user's input operation recognized by the input operation recognition means, and estimate the input area in the real space image on the basis of both the first and the second area specifying information. Using the second area specifying information in this way makes the estimate of the input area more reliable.

The second area specifying information preferably includes color information that identifies the input area. Acquiring such color information as the second area specifying information makes the estimate of the input area more reliable.

The first area specifying information may be information that specifies the shape of the input area, and the input area estimation means may approximate the area identified by the second area specifying information with a polygon and, when an area formed from some of the vertices of the approximated polygon and the intersections of straight lines obtained by extending several of its sides has the shape specified by the first area specifying information, estimate that area to be the input area. With this configuration, the input area can be estimated even when the captured image shows a shape that does not match the first area specifying information, for example because the user's hand overlaps the object.
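The side-extension idea can be illustrated with a minimal Python sketch. It assumes the polygon approximation has already been done and that the required shape is a quadrilateral. The rule of keeping the four longest sides, and the function names `line_intersection` and `recover_quadrilateral`, are illustrative assumptions and are not taken from the patent.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of the infinite lines through p1-p2 and p3-p4, or None if parallel."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-12:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)

def recover_quadrilateral(polygon):
    """Extend the four longest sides of a polygon and intersect neighbouring ones.

    polygon: list of (x, y) vertices in boundary order, e.g. the outline of a
    notebook with one corner hidden by a hand. Returns four corner points, or
    None if no quadrilateral can be formed.
    """
    n = len(polygon)
    if n < 4:
        return None
    edges = [(i, polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    # Assumption: the four longest sides belong to the object, shorter ones to the occluder.
    longest = sorted(edges, key=lambda e: math.dist(e[1], e[2]), reverse=True)[:4]
    longest.sort(key=lambda e: e[0])  # restore boundary order
    corners = []
    for k in range(4):
        _, a1, a2 = longest[k]
        _, b1, b2 = longest[(k + 1) % 4]
        point = line_intersection(a1, a2, b1, b2)
        if point is None:
            return None
        corners.append(point)
    return corners

# A 10 x 10 square whose corner at (10, 10) is cut off by an overlapping hand.
occluded = [(0, 0), (10, 0), (10, 6), (6, 10), (0, 10)]
print(recover_quadrilateral(occluded))
```

In this example the hidden corner is recovered as the intersection of the two sides that the hand interrupts, so the result again has four corners even though the observed outline has five.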

The input operation recognition means preferably identifies an object moving within the input area on the basis of the color information of the part of the input area other than the area identified by the second area specifying information, and recognizes the movement of that object as an input operation by the user. Identifying the object that performs the input operation in this way, from the color information of the area not identified by the second area specifying information, allows the user's input operation to be detected more reliably.

The input operation recognition means may also recognize, as an input operation by the user, the movement of the end of the object that moves the most among the objects moving within the input area estimated by the input area estimation means, across a plurality of consecutive real space images acquired by the image acquisition means. When several objects move within the input area, identifying the object that performs the user's input operation and recognizing its movement as the input operation allows the input operation to be detected more reliably.
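As a rough illustration of choosing the object with the largest movement, the Python sketch below assumes that each moving object has already been tracked and reduced to one end-point position per frame. The track format, the use of summed frame-to-frame distance as the "movement amount", and the names `path_length` and `select_input_object` are assumptions made for this example.

```python
import math

def path_length(points):
    """Total distance travelled along a sequence of (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def select_input_object(tracks):
    """Return the key of the track with the largest total movement, or None if empty.

    tracks: dict mapping an object label to its end-point positions in
    consecutive real space images.
    """
    if not tracks:
        return None
    return max(tracks, key=lambda label: path_length(tracks[label]))

tracks = {
    "supporting_hand": [(5, 50), (5, 51), (6, 50)],      # barely moves
    "pointing_finger": [(40, 40), (50, 42), (62, 45)],   # sweeps across the panel
}
print(select_input_object(tracks))
```

The hand that merely holds the object jitters by a pixel or two, so the finger performing the gesture is selected.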

The drawing means may also change the content to be drawn according to the size (surface area) of the input area estimated by the input area estimation means. This improves convenience for the user.
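One way to picture area-dependent content is the Python sketch below, which measures the estimated input area with the shoelace formula and picks a layout from its share of the image. The layout names and the two threshold fractions are hypothetical; the source only states that the content may change with the area.

```python
def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula); vertices in boundary order."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

def choose_layout(input_area_vertices, image_size, small=0.05, large=0.25):
    """Pick a (hypothetical) layout name from the input area's share of the image."""
    width, height = image_size
    fraction = polygon_area(input_area_vertices) / (width * height)
    if fraction < small:
        return "single_button"
    if fraction < large:
        return "two_choices"
    return "full_menu"

# A 200 x 200 pixel panel seen in a 640 x 480 image covers about 13% of the frame.
panel = [(100, 100), (300, 100), (300, 300), (100, 300)]
print(choose_layout(panel, (640, 480)))
```

A small panel would then show a single large control, while a panel filling much of the view could carry a fuller menu.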

According to the present invention, an information processing apparatus and an information processing method are provided that allow a user to perform input operations using an object in the user's surroundings.

FIG. 1 is an external view of the information processing apparatus 100 of the present embodiment.
FIG. 2 is a block diagram showing the functions of the information processing apparatus 100 according to the present embodiment.
FIG. 3 is a hardware configuration diagram of the information processing apparatus 100.
FIG. 4 is a flowchart showing the processing of the information processing apparatus 100.
FIG. 5 is a diagram illustrating the analysis method used in the input data analysis unit 102 of the information processing apparatus 100.
FIG. 6 is a diagram illustrating the acquisition of area specifying information in the information processing apparatus 100.
FIG. 7 is a diagram illustrating the estimation of the input area in the information processing apparatus 100.
FIG. 8 is a diagram illustrating the estimation of the input area in the information processing apparatus 100.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the description of the drawings, identical elements are given identical reference numerals, and duplicate descriptions are omitted.

FIG. 1 is an external view of the information processing apparatus 100 of the present embodiment. As shown in FIG. 1, the information processing apparatus 100 comprises a glasses-type terminal 110 fitted with a camera serving as the image acquisition unit 103, a ring-type terminal 120, and a control terminal 130, connected to one another by wire. Instead of a wired connection, they may be connected by short-range wireless communication such as Bluetooth (registered trademark). The user of the information processing apparatus 100 wears the glasses-type terminal 110 and the ring-type terminal 120. The glasses-type terminal 110 can be, for example, a video see-through HMD (head-mounted display) or an optical see-through HMD, with an optical see-through HMD being preferable. The control terminal 130 can, for example, have a form that the user carries. Alternatively, the control terminal 130 may be implemented as a server in a wireless communication network and communicate over that network.

The information processing apparatus 100 of the present embodiment allows a user to perform input operations using a nearby object. Specifically, a quadrilateral object near the user, such as a notebook or a file folder, is treated as a touch panel. Within the image captured by the camera, the information processing apparatus 100 superimposes the desired content on the area in which the object serving as the pseudo touch panel is captured, so that the user can perform input operations on that content using the object. Input operations by the user also include operations that instruct the information processing apparatus 100 to start or end a specific process; the apparatus can be configured to start or end a specific process of its own (for example, launching a specific program) when the user performs an input operation using a nearby object.

The content that the information processing apparatus 100 superimposes is not particularly limited. The area in which the pseudo touch panel object is captured shows content that requires some input from the user, for example options for choosing whether to proceed to the next content or to finish.

The shape of the object serving as the pseudo touch panel is not particularly limited, but the following embodiment mainly describes the case in which the object is quadrilateral.

Next, the functional configuration of the information processing apparatus 100 is described. FIG. 2 is a block diagram showing the functions of the information processing apparatus 100 of the present embodiment. As shown in FIG. 2, the information processing apparatus 100 comprises an input unit 101 (input operation recognition means), an input data analysis unit 102 (input operation recognition means), an image acquisition unit 103 (image acquisition means), an image recognition unit 104 that includes an input area recognition unit 105 (input area estimation means) and a hand area recognition unit 106 (input operation recognition means), a content storage unit 107, a drawing unit 108 (drawing means), and a display unit 109 (display means). Of these, the input unit 101 is implemented by the ring-type terminal 120, and the image acquisition unit 103 and the display unit 109 are implemented by the glasses-type terminal 110. In the present embodiment the other functional units are implemented by the control terminal 130, but the configuration is not limited to this and can be modified in various ways.

FIG. 3 is a hardware configuration diagram of the information processing apparatus 100. Physically, the information processing apparatus 100 shown in FIG. 2 is configured, as shown in FIG. 3, as a computer system that includes one or more CPUs 11, a RAM 12 and a ROM 13 as main storage, an input device 14 such as a keyboard and mouse, an output device 15 such as a display, a communication module 16 that is a data transmission and reception device such as a network card, and an auxiliary storage device 17 such as a semiconductor memory. Each function in FIG. 2 is realized by loading predetermined computer software onto hardware such as the CPU 11 and the RAM 12 shown in FIG. 3, thereby operating the input device 14, the output device 15, and the communication module 16 under the control of the CPU 11 and reading and writing data in the RAM 12 and the auxiliary storage device 17.

Returning to FIG. 2, each functional block of the information processing apparatus 100 is described next.

The input unit 101 detects the user's actions. A user action here means, for example, tapping or clicking the surface of some nearby object with a finger; it may also be sliding or flicking a finger across the surface of an object. The input unit 101 is implemented by, for example, an acceleration sensor, a microphone, a distance sensor, or a camera, and is mounted on the ring-type terminal 120 shown in FIG. 1. When the user performs such an action with the finger wearing the ring-type terminal 120, the input unit 101 detects it and sends the information to the input data analysis unit 102.

The input data analysis unit 102 determines, from the information on the user's action sent by the input unit 101, whether the user has performed an input operation, and identifies the type of input operation from the specific action. In other words, the input unit 101 and the input data analysis unit 102 together recognize input operations by the user. Information on the input operation recognized by the input data analysis unit 102 is processed in combination with information on the finger position recognized by the hand area recognition unit 106, described later.

The image acquisition unit 103 acquires real space images, that is, images of the user's surroundings. It is implemented by, for example, a camera, and the real space images it acquires are sent to the image recognition unit 104.

The image recognition unit 104 applies various kinds of processing to the real space image acquired by the image acquisition unit 103 in order to estimate the input area in which the user performs input operations on the content provided by the information processing apparatus 100. It also recognizes the position of a finger moving within the input area. The image recognition unit 104 includes the input area recognition unit 105 and the hand area recognition unit 106.

The input area recognition unit 105 estimates the input area. It does so by judging, for each area of the real space image in which an object is captured, whether that area is the input area, on the basis of the first area specifying information held in advance in the information processing apparatus 100 and the second area specifying information acquired through the user's input operation. An example of the first area specifying information is the shape of the input area. If it has been decided in advance that quadrilateral objects are to be recognized as the input area, the first area specifying information is "being quadrilateral", and among the objects captured in the real space image, an area that appears to show a quadrilateral object is estimated to be the input area. The second area specifying information is information extracted from the real space image acquired by the image acquisition unit 103 on the basis of the user's input operation; an example is the color information of the input area. The input area recognition unit 105 estimates the input area using these two kinds of area specifying information. The specific processing is described later.
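The interplay of the two kinds of area specifying information can be sketched in Python as follows. The sketch assumes that a reference color for the object is already available as the second area specifying information, and it replaces the shape test with a deliberately crude check that only works for axis-aligned rectangles. The tolerance, the fill ratio, and all function names are assumptions for illustration; the patent defers its own procedure to a later part of the description.

```python
def color_mask(image, reference_color, tolerance=30):
    """Mark pixels whose RGB channels are all within `tolerance` of the reference color.

    image: list of rows, each row a list of (r, g, b) tuples.
    """
    return [
        [max(abs(c - r) for c, r in zip(pixel, reference_color)) <= tolerance
         for pixel in row]
        for row in image
    ]

def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the True pixels, or None if there are none."""
    coords = [(x, y) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))

def fills_bounding_box(mask, min_fill=0.9):
    """Crude stand-in for the shape check: does the region nearly fill its bounding box?"""
    box = bounding_box(mask)
    if box is None:
        return False
    x0, y0, x1, y1 = box
    box_area = (x1 - x0 + 1) * (y1 - y0 + 1)
    filled = sum(hit for row in mask for hit in row)
    return filled / box_area >= min_fill

W, B = (255, 255, 255), (20, 40, 200)   # white desk, blue notebook
image = [
    [W, W, W, W, W],
    [W, B, B, B, W],
    [W, B, B, B, W],
    [W, W, W, W, W],
]
mask = color_mask(image, reference_color=(25, 45, 190))
print(bounding_box(mask), fills_bounding_box(mask))
```

A practical implementation would need a perspective-tolerant shape test, since a notebook viewed through a head-mounted camera rarely appears as an axis-aligned rectangle.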

The hand area recognition unit 106 refers to the real space image and recognizes objects moving within the input area estimated by the input area recognition unit 105 as hand areas. Objects moving within the input area include, for example, the hand supporting the quadrilateral object (pseudo touch panel) that forms the input area and the user's hand performing an input operation by moving over the surface of that object. The hand area recognition unit 106 recognizes the areas in which such moving objects are captured as hand areas and then determines whether each corresponds to an input operation by the user.

The input area estimated by the input area recognition unit 105 and the hand area recognized by the hand area recognition unit 106 are sent to the drawing unit 108.

The content storage unit 107 stores the content to be displayed to the user. In response to an instruction from the drawing unit 108, it sends the content to be displayed to the drawing unit 108.

The drawing unit 108 acquires the content to be displayed to the user from the content storage unit 107, on the basis of the information sent from the image recognition unit 104 indicating the estimated input area and the recognized hand area, and draws that content in the area corresponding to the input area. The content drawn by the drawing unit 108 is sent to the display unit 109.

The display unit 109 displays the content drawn by the drawing unit 108, that is, the content corresponding to the input area. Examples of the display unit 109 include the display of the glasses-type terminal 110 and a projector. When the glasses-type terminal 110 is an optical see-through HMD, the content drawn by the drawing unit 108 is displayed at the area estimated to be the input area in the real space image captured by the image acquisition unit 103. The view of the user's surroundings formed by external light entering the glasses-type terminal 110 (the information representing real space) and the content shown on the display unit 109 then overlap, and the user perceives them as a superimposed image. When the glasses-type terminal 110 is a video see-through HMD, the content drawn by the drawing unit 108 is superimposed on the real space image captured by the image acquisition unit 103 and shown on the display unit 109, and the user sees that superimposed image.

The information processing method performed by the information processing apparatus 100 configured as above is described in more detail with reference to the flowchart of FIG. 4, which illustrates that method. FIGS. 5 to 8 illustrate the information processing in the information processing apparatus 100. The following description assumes that the first area specifying information held in advance in the information processing apparatus 100 is "being quadrilateral".

First, after putting on the information processing apparatus 100, which includes the glasses-type terminal 110 and the ring-type terminal 120, the user performs some action that serves as an input operation, for example on a nearby object, with the finger wearing the ring-type terminal 120. When the input unit 101 detects the user's action, the input data analysis unit 102 of the control terminal 130 determines whether the action is an input operation and thereby recognizes the user's input operation (S01: input operation recognition step). When the input data analysis unit 102 recognizes an input operation, this triggers the start-up of the image recognition system in the information processing apparatus 100, and the image acquisition unit 103 begins acquiring images of the user's surroundings (real space images) (S02: image acquisition step).

FIG. 5 shows an example of the data acquired by the input unit 101 when the input unit 101 is a microphone and the user's input operation is a double click. In FIG. 5, the horizontal axis represents time (seconds) and the vertical axis represents the amplitude of the sound received by the microphone. When the user's input operation is a double click on a nearby object, the microphone should receive some sound each time the object is clicked (tapped) with a finger. Therefore, for example, a threshold T0 is set in advance, and when sounds with an amplitude exceeding the threshold T0 are received in succession within a predetermined period (for example, when data such as that indicated by T1 is obtained), it is determined that the user has performed an input operation. When there are several types of input operation, it is preferable to set a separate recognition criterion (a threshold or the like) for each type.
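
The threshold test described above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the function name `detect_double_click` and the parameters `max_gap_s` and `min_gap_s` (the assumed "predetermined period" and a debounce interval for the ringing of a single tap) are hypothetical, and the microphone signal is assumed to be a plain list of amplitude samples.

```python
from typing import Optional, Sequence


def detect_double_click(
    amplitudes: Sequence[float],
    sample_rate_hz: float,
    threshold_t0: float,
    max_gap_s: float = 0.5,   # assumed "predetermined period" between two taps
    min_gap_s: float = 0.05,  # assumed debounce: crossings closer than this are one tap
) -> Optional[float]:
    """Return the time (s) of the second tap of a double click, or None."""
    last_tap_time: Optional[float] = None
    above = False
    for i, amplitude in enumerate(amplitudes):
        t = i / sample_rate_hz
        if abs(amplitude) > threshold_t0:
            if not above:  # rising edge: the signal has just exceeded T0
                if last_tap_time is None or t - last_tap_time > max_gap_s:
                    last_tap_time = t  # first tap, or the previous tap is too old
                elif t - last_tap_time >= min_gap_s:
                    return t  # second tap within the period: double click
                # otherwise: ringing of the same tap, ignored
            above = True
        else:
            above = False
    return None
```

With a 100 Hz signal that exceeds T0 at 0.1 s and again at 0.4 s, the function reports a double click at 0.4 s; a single tap, or two taps more than `max_gap_s` apart, yields `None`.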

When the user performs the input operation (S01) that activates the system (S02), it is preferable to first place the object to be used as a pseudo touch panel in front of the image acquisition unit 103, so that the object is more reliably recognized by the image recognition unit 104 at a later stage. The object to be used as a pseudo touch panel then appears in the images acquired by the image acquisition unit 103 after system activation (S02), so the input area can be estimated with higher accuracy.

Returning to FIG. 4, when the image recognition system of the information processing apparatus 100 is activated, triggered by the apparatus recognizing the user's input operation, the input area recognition unit 105 of the image recognition unit 104 instructs the display of an input area index, which is used to acquire the second area specifying information for estimating the input area. The display unit 109 displays the input area index to the user based on this instruction (S02). FIG. 6 shows an example of the input area index. FIG. 6(A) shows an example of an image acquired by the image acquisition unit 103, and FIG. 6(B) shows an example in which the input area index P is displayed by the display unit 109 over that image. Both FIG. 6(A) and FIG. 6(B) show the image S acquired by the image acquisition unit 103. In this example, the object to be used as a pseudo touch panel is a file F whose surface is a single color, and the system is activated (S02) with the file F positioned at the center of the image captured by the image acquisition unit 103. The input area index P is the quadrangular frame shown at the center of the image S in FIG. 6(B). In the example of FIG. 6(B), the input area index P is displayed overlapping the file F. If the input area index P does not overlap the file F, the image acquisition unit 103 is moved so that they overlap (in the case of the glasses-type terminal 110, the user changes his or her field of view).

When the user then performs an input operation again and the input data analysis unit 102 recognizes it (S03), this triggers the input area recognition unit 105 to acquire, from the image being acquired by the image acquisition unit 103, the color information of the area enclosed by the input area index P (S04). Because the file F is displayed overlapping the input area index P, the input area recognition unit 105 thereby acquires the color information of the file F as the second area specifying information.
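
One way to picture the color acquisition in S04 is the sketch below. It assumes, purely for illustration, that the image is a nested list of RGB tuples, that the input area index P is an axis-aligned frame given as `(x0, y0, x1, y1)`, and that "the same color" means each channel lies within a tolerance of the sampled mean. The function names and the tolerance value are hypothetical and do not come from the embodiment.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def mean_color_in_frame(
    image: Sequence[Sequence[Pixel]], frame: Tuple[int, int, int, int]
) -> Pixel:
    """Average RGB color of the pixels enclosed by the frame (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = frame
    pixels: List[Pixel] = [
        image[y][x] for y in range(y0, y1) for x in range(x0, x1)
    ]
    if not pixels:
        raise ValueError("frame encloses no pixels")
    n = len(pixels)
    return (
        sum(p[0] for p in pixels) / n,
        sum(p[1] for p in pixels) / n,
        sum(p[2] for p in pixels) / n,
    )


def is_same_color(pixel: Pixel, reference: Pixel, tolerance: float = 30.0) -> bool:
    """True if every channel of `pixel` is within `tolerance` of `reference`."""
    return all(abs(pixel[c] - reference[c]) <= tolerance for c in range(3))
```

A per-channel tolerance is only one possible similarity test; a real implementation would likely have to cope with lighting changes, which this sketch ignores.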

Next, the input area is estimated based on the information obtained so far (S05: input area estimation step). Here, the input area recognition unit 105 estimates the input area in the image acquired by the image acquisition unit 103 based on the first area specifying information (the input area has a quadrangular shape) and the second area specifying information (the color information of the file F).

At this point, if the input area recognition unit 105 recognizes that the object to be used as a pseudo touch panel has a quadrangular shape and that its color matches the color information of the area enclosed by the input area index P, the area in which the object is imaged is estimated to be the input area. In practice, however, as shown in the image S of FIG. 7, the user's hands H1 and H2 often overlap the file F that is to be recognized as the input area. In such an image S, a method that looks for an area satisfying both the first and second area specifying information within the area where the target object is imaged, and treats that area as the input area, often has difficulty delimiting the input area.

The input area recognition unit 105 therefore estimates the input area as follows. First, it identifies in the image S an area having the same color as the color information of the area enclosed by the input area index P. It then determines whether that area corresponds to the input area by determining whether it has the shape of a quadrangular area partly covered by some object. For example, when the user holds the file F by hand as shown in FIG. 7, part of the quadrangular object is covered by the hands H1 and H2, but the upper side of the file F (the side at the top of FIG. 7) is still captured in the image S. Even when part of a quadrangular object overlaps the image of the user's hands in this way, the object is rarely held by its upper side. Two assumptions are therefore made in order to recognize the quadrangular object: (1) the entire quadrangular outline lies within the image S, and (2) the upper side of the outline is not hidden (the upper side of the quadrangle can be recognized in the image S). Based on these assumptions, the area in which an object estimated to be quadrangular is imaged is detected.

The method for detecting the area in which an object estimated to be quadrangular is imaged is described with reference to FIG. 8. First, to recognize a quadrangle whose lower part is hidden in the image S, the area having the same color as the acquired color information is approximated by a polygon. Of the vertices of this polygon, the one closest to the upper left of the screen is labeled P0, and the remaining vertices are numbered counterclockwise in order from P0 (P0 to P5 in FIG. 8). Next, the rectangle circumscribing the polygon P0 to P5 is obtained, and two points are calculated: the intersection (PC1) of the base of the circumscribed rectangle (in FIG. 8, the side containing P2-P3) with the extension of P0-P1, and the intersection (PC2) of the base with the extension of P5-P4. The quadrangle P0-PC1-PC2-P5 can then be estimated to be the outline of the file F, that is, the input area. Estimating the input area in this way makes it possible to recover the outline of the file F even when the user's hands H1 and H2 overlap it.
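
The geometric step described above can be sketched as follows. This is an illustrative reading, not the embodiment's implementation: it assumes the polygon approximation has already been computed, that the vertices are given in the order P0, P1, ... with P0 at the upper left, that image coordinates have y increasing downward, and that the circumscribed rectangle is axis-aligned so that its base is the line y = max(y). The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def extend_to_y(p: Point, q: Point, y: float) -> Point:
    """Intersection of the line through p and q with the horizontal line at y."""
    (px, py), (qx, qy) = p, q
    if qy == py:
        raise ValueError("side is horizontal and never reaches the base")
    t = (y - py) / (qy - py)
    return (px + t * (qx - px), y)


def estimate_quadrilateral(polygon: Sequence[Point]) -> List[Point]:
    """Estimate the outline P0-PC1-PC2-P(last) of a partly hidden quadrangle.

    Assumes the upper side (P0 to the last vertex) is fully visible and that
    the first and last sides are the visible parts of the left and right edges.
    """
    if len(polygon) < 4:
        raise ValueError("need at least four vertices")
    base_y = max(y for _, y in polygon)  # base of the circumscribed rectangle
    p0, p1 = polygon[0], polygon[1]
    p_last, p_before_last = polygon[-1], polygon[-2]
    pc1 = extend_to_y(p0, p1, base_y)                # extension of P0-P1
    pc2 = extend_to_y(p_last, p_before_last, base_y)  # extension of P5-P4
    return [p0, pc1, pc2, p_last]
```

For a hexagon such as P0=(10, 10), P1=(10, 60), P2=(30, 80), P3=(70, 80), P4=(90, 60), P5=(90, 10), whose lower corners are cut off as if covered by hands, the sketch returns the rectangle with corners (10, 10), (10, 80), (90, 80), (90, 10).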

In parallel with the estimation of the input area by the input area recognition unit 105, the hand area recognition unit 106 recognizes the hand area. Any area that lies within the area estimated as the input area but has a color different from that of the input area is judged to be a hand area. In FIG. 8, for example, within the area enclosed by P0-PC1-PC2-P5, the area whose color information differs from the color information acquired by the input area recognition unit 105 is recognized as the hand area. The hand area recognition unit 106 further acquires the color information of the area recognized as the hand area and, based on this color information and the images acquired by the image acquisition unit 103, recognizes the movement of the hand area over time, which makes it possible to follow the movement of the user's hand. Because the user's input operation is performed within the input area, recognizing the movement of the hand area allows the information processing apparatus 100 to recognize the user's input operation, that is, the content of the user's instruction.
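
The hand area test described above (inside the estimated outline, but not the color of the input area) might be sketched as below. The image representation (nested list of RGB tuples), the ray-casting point-in-polygon test, the per-channel tolerance, and all function names are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Pixel = Tuple[float, float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # this edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def hand_pixels(
    image: Sequence[Sequence[Pixel]],
    outline: Sequence[Point],
    reference: Pixel,
    tolerance: float = 30.0,
) -> List[Tuple[int, int]]:
    """Pixels inside the estimated input area whose color differs from it."""
    result: List[Tuple[int, int]] = []
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            # Test the pixel center so pixels on the outline are unambiguous.
            if not point_in_polygon(x + 0.5, y + 0.5, outline):
                continue
            if any(abs(pixel[c] - reference[c]) > tolerance for c in range(3)):
                result.append((x, y))
    return result
```

As the description itself notes for patterned objects, any differently colored area inside the outline is treated as a hand by this rule, so the sketch shares that limitation.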

Once the input area has been estimated in this way, information indicating the input area is sent to the drawing unit 108, which performs the processing for drawing the content acquired from the content storage unit 107 (S06: drawing step). At this time, processing such as changing the shape of the content according to the size and shape of the input area is performed. If the display unit 109 is configured to display the content superimposed on the image acquired by the image acquisition unit 103, the processing for superimposition is also performed.

The drawing unit 108 can also change the content to be superimposed according to the size of the object (file F) used as the pseudo touch panel. The size of the object F is estimated from the size (area) of the region in which the object is imaged in the image S. In addition to changing the display size of the content, the drawing unit 108 can switch to a display format suited to the size of the quadrangular object. For example, when the screen is large, several icons are displayed and the content is operated by pointing and clicking; when the screen is small, only one icon is displayed and the content is operated by flicking and clicking alone. This improves operability.
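
The size-dependent choice of display format can be illustrated with the sketch below. The shoelace formula for the area is standard, but the decision rule is an assumption: the description does not say how "large" and "small" are distinguished, so the ratio threshold `large_ratio`, the function names, and the returned dictionary layout are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def choose_layout(
    outline: Sequence[Point],
    image_area: float,
    large_ratio: float = 0.25,  # hypothetical boundary between large and small
) -> Dict[str, object]:
    """Pick a display format from the share of the image the object occupies."""
    ratio = polygon_area(outline) / image_area
    if ratio >= large_ratio:
        return {"icons": "multiple", "gestures": ("pointing", "click")}
    return {"icons": "single", "gestures": ("flick", "click")}
```

For example, an outline of 80 x 70 pixels in a 100 x 100 image selects the multiple-icon format, while a 20 x 20 outline selects the single-icon format.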

When an optical see-through HMD is used as the glasses-type terminal 110, the viewing angle is likely to be relatively narrow. The area in which the video (content) superimposed on the real space image is displayed is therefore the part where the quadrangular object (the area of the quadrangular object F in the image S) overlaps the display area of the HMD. It is desirable to calibrate the apparatus before use so that the area the user sees through the optical see-through HMD matches the area of the image acquired by the image acquisition unit 103 (that is, the field of view of the camera).

The content drawn by the drawing unit 108 is sent to the display unit 109 and, triggered by recognition of an input operation performed by the user (S07), is displayed on the display unit 109 (S08: display step). The content displayed in correspondence with the input area is changed as appropriate according to the user's input operations. That is, the display on the display unit 109 (S08) is repeated each time a user input operation is recognized (S07), and the drawing and superimposition processing (S06) is also repeated as necessary.

As described above, the user's input operation on the pseudo touch panel corresponding to the input area is recognized from the movement of the hand area. In the image shown in FIG. 8, however, both the left and right hands H1 and H2 are within the recognized area, so the hand performing the input operation (in FIG. 8, the hand H1 wearing the ring-type terminal) may not be distinguishable from the hand holding the file F (H2 in FIG. 8). To avoid this, the apparatus relies on the expectation that the hand H2 holding the file moves little while the hand H1 performing the input operation moves more, and judges the hand whose detected position varies more (here, H1) to be the hand area performing the input operation. When the user performs an input operation (a pointing operation indicating a position in the content), only the index finger is extended as shown in FIG. 8, and its tip is expected to move the most. The end portion with the largest movement amount within the area recognized as the hand area is therefore judged to be the pointing position. Alternatively, when it is clear that the index finger is raised as shown in FIG. 8, the upper end of the hand area may be taken as the pointing position when recognizing the user's input operation. By tracing the movement of the hand area in the images acquired by the image acquisition unit 103 in this way, it becomes possible to recognize, for example, left and right flicks and vertical scrolling.
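
The two heuristics described above (the operating hand is the one that moves more; the pointing position is the upper end of the hand area) can be sketched as follows. The sketch assumes that each hand has already been segmented into a list of pixel coordinates per frame and that y increases downward; measuring movement by the path length of the centroid is one possible choice, and all names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def centroid(region: Sequence[Point]) -> Point:
    """Mean position of the pixels in one hand area."""
    n = len(region)
    return (sum(x for x, _ in region) / n, sum(y for _, y in region) / n)


def total_motion(frames: Sequence[Sequence[Point]]) -> float:
    """Path length travelled by the centroid over consecutive frames."""
    centers = [centroid(region) for region in frames]
    return sum(
        math.hypot(b[0] - a[0], b[1] - a[1])
        for a, b in zip(centers, centers[1:])
    )


def select_operating_hand(tracks: Dict[str, List[List[Point]]]) -> str:
    """Name of the hand area whose detected position varies the most."""
    return max(tracks, key=lambda name: total_motion(tracks[name]))


def pointing_position(region: Sequence[Point]) -> Point:
    """Upper end of the hand area (smallest y), taken as the fingertip."""
    return min(region, key=lambda p: p[1])
```

Given per-frame regions for two hands, `select_operating_hand` returns the key of the more mobile one, and `pointing_position` applied to its latest region gives a candidate fingertip whose frame-to-frame displacement could then be classified as a flick or a scroll.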

To perform input operations using the file F as a pseudo touch panel, it is necessary to detect not only the position of the user's finger (the pointing part) but also the timing at which an input operation such as "pressing a button" is performed. The input data analysis unit 102 therefore recognizes the moment at which the user taps the object (file F) serving as the pseudo touch panel as the input timing. Because it is difficult to recognize the moment at which the user taps the file F from the images acquired by the image acquisition unit 103 alone, it is preferable to recognize the timing of the user's click based on the information on the user's input operation acquired by the input unit 101 and the input data analysis unit 102.

As described above, in the information processing apparatus 100 and the information processing method according to the present embodiment, the input area recognition unit 105 estimates, in the real space image acquired by the image acquisition unit 103, the input area in which the user performs input operations relating to the content, based on the first area specifying information held in advance, and the content is drawn by the drawing means for the estimated input area and displayed. As a result, for example, information that identifies the object the user wants to use for input operations from among the objects around the user can be set as the first area specifying information, and the area in which that object is imaged in the real space image can be estimated as the input area. Because that input area is then treated as the area in which the user performs input operations, the user can perform input operations using a nearby object.

In addition, when the shape of the object the user wants to use for input operations is known, as in the above embodiment, using information that specifies this shape as the first area specifying information makes it simpler and more reliable to estimate the input area from the real space image.

Furthermore, as in the above embodiment, the input area recognition unit 105 can be configured to estimate the input area based on the shape of the input area specified by the first area specifying information and the shape of the area in which a specific object is imaged in the real space image. Because the real space image may not contain an area that exactly matches the shape of the input area specified by the first area specifying information, it is preferable to estimate the input area by comparing the two.

The input area recognition unit 105 also acquires the second area specifying information from the real space image acquired by the image acquisition unit 103, based on the user's input operation recognized by the input data analysis unit 102, and estimates the input area in the real space image based on both the first and the second area specifying information, which makes the estimation of the input area more reliable. Acquiring color information that specifies the input area as the second area specifying information makes it possible to reliably obtain information on the object to be used as the pseudo touch panel (for example, the file F), so the input area can be estimated more accurately. Note that if the file F is an object with a pattern of several colors, areas of a different color may be recognized as hand areas by the hand area recognition unit 106.

Preferred embodiments of the present invention have been described above, but the present invention is not necessarily limited to these embodiments, and various modifications are possible without departing from its gist. For example, the shape of the input area is not particularly limited and may be circular rather than quadrangular. In that case, the first area specifying information is changed and the logic for estimating the input area is modified accordingly.

DESCRIPTION OF SYMBOLS: 100: information processing apparatus; 101: input unit; 102: input data analysis unit; 103: image acquisition unit; 104: image recognition unit; 107: content storage unit; 108: drawing unit; 109: display unit.

Claims

Image acquisition means for acquiring a real space image;
An input operation recognition means for recognizing an input operation by a user;
Input area estimation means for estimating, triggered by the input operation recognition means recognizing the input operation by the user, an input area in which the user performs an input operation relating to content, in the real space image acquired by the image acquisition means, based on first area specifying information held in advance;
Drawing means for drawing the content in an area corresponding to the input area estimated by the input area estimating means;
Display means for displaying the content drawn by the drawing means;
An information processing apparatus comprising:

The information processing apparatus according to claim 1, wherein the first area specifying information is information for specifying a shape of the input area.

The information processing apparatus according to claim 1, wherein the input area estimation means acquires second area specifying information from the real space image acquired by the image acquisition means, based on the input operation of the user recognized by the input operation recognition means, and estimates the input area in the real space image based on the first area specifying information and the second area specifying information.

The information processing apparatus according to claim 3, wherein the second area specifying information includes color information for specifying the input area.

The information processing apparatus according to claim 3, wherein the first area specifying information is information for specifying the shape of the input area, and the input area estimation means approximates the area specified by the second area specifying information with a polygon and, when a region formed from some of the vertices of the approximated polygon and the intersections obtained by extending a plurality of sides constituting the polygon has the shape specified by the first area specifying information, estimates that region as the input area.

The information processing apparatus according to any one of claims 3 to 5, wherein the input operation recognition means identifies an object moving within the input area based on color information of an area, within the input area, other than the area specified by the second area specifying information, and recognizes the movement of the object as an input operation by the user.

The information processing apparatus according to any one of claims 1 to 6, wherein the input operation recognition means recognizes, as an input operation by the user, the movement of the end portion having the largest movement amount of an object moving within the input area estimated by the input area estimation means, across a plurality of consecutive real space images acquired by the image acquisition means.

The information processing apparatus according to claim 1, wherein the drawing unit changes content to be drawn according to an area of the input region estimated by the input region estimation unit.

An image acquisition step of acquiring a real space image by the image acquisition means;
An input operation recognition step for recognizing an input operation by a user by an input operation recognition means;
An input area estimation step of estimating, by input area estimation means and triggered by recognition of the input operation by the user in the input operation recognition step, an input area in which the user performs an input operation relating to content, in the real space image acquired by the image acquisition means, based on first area specifying information held in advance;
A drawing step of drawing the content in a region corresponding to the input region estimated in the input region estimation step by a drawing unit;
A display step of displaying the content drawn in the drawing step by a display means;
An information processing method comprising: