JP2014021893A

JP2014021893A - Information processing device, operation signal generation method, and program

Info

Publication number: JP2014021893A
Application number: JP2012162445A
Authority: JP
Inventors: Toshiyuki Ueno; 寿之上野
Original assignee: NEC Casio Mobile Communications Ltd
Current assignee: NEC Casio Mobile Communications Ltd
Priority date: 2012-07-23
Filing date: 2012-07-23
Publication date: 2014-02-03

Abstract

PROBLEM TO BE SOLVED: To provide an information processing device for realizing an input operation excellent in operability. SOLUTION: A line-of-sight information acquisition unit 31 acquires line-of-sight information indicating a line of sight of a user. A sound information acquisition unit 32 acquires sound information indicating a sound, and when the volume of the sound is equal to or larger than a sound volume threshold, determines that an input position is selected. An operation signal generation unit 33 generates an operation signal indicating an input operation to an operation surface on the basis of the line-of-sight information and the sound information.

Description

The present invention relates to an information processing apparatus, an operation signal generation method, and a program.

Input operations on an information processing apparatus are performed mainly by hand. For example, when the input device is a mouse, the input operation consists of moving the mouse by hand and pressing a button on the mouse by hand. When the user moves the mouse, the input position changes, and when the user presses the button, the input position at that moment is selected. When the input device is a touch-type input device, the input operation consists of bringing an operation body, such as a finger or a hand-held stylus, into contact with or close to the operation surface (hereinafter referred to as a touch). The touch position corresponds to the selected input position in the case of the mouse, and the type of input operation is determined based on the touch position, the direction in which the touch position moves, the duration of the touch, and the like.

Hand-based input operations of this kind are widespread because they are intuitive and offer excellent operability. In recent years, however, demand has been growing for information processing apparatuses that allow input operations using something other than the hands. For example, with the spread of information processing apparatuses such as smartphones and tablet terminals, opportunities to use such apparatuses while on the move are increasing. Because users often cannot move their hands freely while on the move, it is desirable to reduce hand-based input operations as much as possible. Tablet terminals in particular are becoming larger, and it is common to hold the terminal with one hand and perform input operations with the other. Consequently, when the user is carrying luggage in one hand, hand-based input is difficult, and input operations using something other than the hands are required.

One conceivable method for realizing input operations without the hands is to specify the input position on the operation surface by the line of sight and to select the input position by opening and closing the eyelids. In this method, the user specifies the input position by directing the line of sight onto the operation surface and selects the input position by closing the eyelids. For example, the user can perform an operation equivalent to a click by blinking, or an operation equivalent to a long press by closing the eyelids, keeping them closed for a predetermined time, and then opening them.

With this method, however, when the user attempts an operation equivalent to a drag by closing the eyelids and then moving the line of sight, the line of sight cannot be detected while the eyelids are closed, so the user cannot move the input position while keeping it selected.

Another conceivable method, instead of opening and closing the eyelids, is to determine that an input position has been selected when the user gazes at the same place for a certain period of time. With this method, however, the input position cannot be selected without waiting for that period, so operability is degraded.

In this regard, Patent Document 1 discloses a portable terminal that specifies an input position on the operation surface by the line of sight and selects the input position by the press of a button. With this portable terminal, when the user presses the button by hand while looking at a character on the display screen, the character displayed at the position the user is looking at can be input.

Patent Document 2, on the other hand, discloses a portable terminal to which information is input by a flow of air such as the user's breath. With this portable terminal, the type of input operation can be changed based on a pattern that combines the strength and the length of the user's blowing.

Patent Document 1: JP 2010-267071 A
Patent Document 2: JP 2004-177992 A

However, with the portable terminal described in Patent Document 1, the user must press a button by hand to select the input position. Operability is therefore low; for example, input operations become difficult in situations where neither hand can be moved freely.

The portable terminal described in Patent Document 2 changes the type of input operation according to a pattern that combines the strength and the length of the user's blowing. Because the number of such patterns is limited, complex input operations cannot be performed, and operability is low.

An object of the present invention is to provide an information processing apparatus, an operation signal generation method, and a program that realize input operations with better operability.

An information processing apparatus according to the present invention includes a line-of-sight information acquisition unit that acquires line-of-sight information indicating a user's line of sight, a sound information acquisition unit that acquires sound information indicating a sound, and an operation signal generation unit that generates, based on the line-of-sight information and the sound information, an operation signal indicating an input operation on an operation surface.

An operation signal generation method according to the present invention includes acquiring line-of-sight information indicating a user's line of sight, acquiring sound information indicating a sound, and generating, based on the line-of-sight information and the sound information, an operation signal indicating an input operation on an operation surface.

A program according to the present invention causes a computer to execute a procedure for acquiring line-of-sight information indicating a user's line of sight, a procedure for acquiring sound information indicating a sound, and a procedure for generating, based on the line-of-sight information and the sound information, an operation signal indicating an input operation on an operation surface.

According to the present invention, it is possible to realize an input operation with better operability.

A block diagram showing the functional configuration of the information processing apparatus according to the first embodiment of the present invention.
A block diagram showing a configuration example of the line-of-sight information acquisition unit according to the embodiment.
An explanatory diagram showing an example of a line-of-sight detection image used to detect the user's line of sight.
An explanatory diagram for explaining the result of edge detection processing on the line-of-sight detection image.
A block diagram showing a configuration example of the sound information acquisition unit according to the embodiment.
A flowchart for explaining an example of the determination processing in which the control unit according to the embodiment determines the processing to be executed for the input operation indicated by the operation signal.
A flowchart for explaining an example of the input operation determination processing performed when the operation signal generation unit generates an operation signal based on the line-of-sight information and the sound information.
A block diagram showing the functional configuration of the information processing apparatus according to the second embodiment of the present invention.
A flowchart for explaining an operation example of the operation mode setting unit according to the embodiment.
A block diagram showing the functional configuration of the information processing apparatus according to the third embodiment of the present invention.
A flowchart for explaining the operation of the information processing apparatus according to the embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. In this specification and the drawings, components having the same function are given the same reference numerals, and redundant description of them may be omitted.

(First embodiment)
First, a first embodiment of the present invention will be described.

FIG. 1 is a block diagram showing the functional configuration of the information processing apparatus according to the first embodiment of the present invention. The information processing apparatus 100 shown in FIG. 1 is, for example, a mobile phone, a smartphone, a game device, a PC (Personal Computer), a PDA (Personal Digital Assistant), a navigation device, a music playback device, a video processing device, or a server device. PCs include tablet PCs and notebook PCs. Video processing devices include cameras, recorders, players, and the like.

The information processing apparatus 100 includes a line-of-sight information acquisition unit 11, a sound information acquisition unit 12, an operation signal generation unit 13, a control unit 14, and a storage unit 15.

The line-of-sight information acquisition unit 11 acquires line-of-sight information indicating the user's line of sight. Here, the line of sight is the line connecting the center of the user's eye and the object the user is looking at. In the following description, the center of the eye, which is the starting point of the line of sight, is called the viewpoint, and the point where the line of sight intersects the operation surface is called the gaze point. The line-of-sight information includes, for example, the user's viewpoint and the line-of-sight direction from the viewpoint. Alternatively, the line-of-sight information may be position information of the gaze point. The line-of-sight information acquisition unit 11 outputs the acquired line-of-sight information to the operation signal generation unit 13.
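As an illustration of how a gaze point could be derived from a viewpoint and a line-of-sight direction, the following is a minimal sketch that intersects the line of sight with the operation surface. It assumes the operation surface is a flat plane given by a point and a normal vector; the function name `gaze_point`, the coordinate system, and the default plane are hypothetical and do not appear in the specification.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def gaze_point(viewpoint: Vec3, direction: Vec3,
               plane_point: Vec3 = (0.0, 0.0, 0.0),
               plane_normal: Vec3 = (0.0, 0.0, 1.0)) -> Optional[Vec3]:
    """Intersect the line of sight with the operation surface.

    Assumption: the operation surface is modelled as the plane through
    plane_point with normal plane_normal. Returns None when the line of
    sight is parallel to the surface or points away from it.
    """
    def dot(a: Vec3, b: Vec3) -> float:
        return sum(x * y for x, y in zip(a, b))

    denom = dot(direction, plane_normal)
    if abs(denom) < 1e-12:
        return None  # line of sight is parallel to the surface
    diff = tuple(p - v for p, v in zip(plane_point, viewpoint))
    t = dot(diff, plane_normal) / denom
    if t < 0:
        return None  # surface lies behind the viewer
    # Point on the line of sight at parameter t: viewpoint + t * direction
    return tuple(v + t * d for v, d in zip(viewpoint, direction))
```

With the default plane z = 0, a viewpoint at (1, 2, 5) looking along (1, 0, -1) meets the surface at (6, 2, 0).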

The sound information acquisition unit 12 acquires sound information indicating a sound. The sound information includes, for example, volume information indicating the volume of the sound. The sound information may also include information indicating whether the volume is equal to or higher than a predetermined threshold (hereinafter referred to as the volume threshold). The sound information acquisition unit 12 outputs the acquired sound information to the operation signal generation unit 13.

The operation signal generation unit 13 generates and outputs an operation signal indicating an input operation on the operation surface based on the line-of-sight information and the sound information. The information processing apparatus 100 then executes processing according to the operation signal output from the operation signal generation unit 13.

Specifically, the operation signal generation unit 13 specifies an input position on the operation surface based on the line-of-sight information and determines whether the input position has been selected based on the sound information. The operation signal generation unit 13 then generates the operation signal based on the result of this determination and the specified input position.

For example, the operation signal generation unit 13 uses the gaze point as the input position. The line-of-sight information includes information indicating the line of sight of the right eye and information indicating the line of sight of the left eye. The operation signal generation unit 13 specifies a first input position, which corresponds to the right eye, based on the information indicating the line of sight of the right eye, and a second input position, which corresponds to the left eye, based on the information indicating the line of sight of the left eye. When the user's right-eye and left-eye lines of sight intersect on the operation surface, the right-eye gaze point and the left-eye gaze point coincide, and the operation signal generation unit 13 specifies a single input position from the two lines of sight. When the two lines of sight do not intersect on the operation surface, the right-eye gaze point and the left-eye gaze point are at different positions, and the operation signal generation unit 13 specifies two input positions on the operation surface.

When the distance between the right-eye gaze point and the left-eye gaze point on the operation surface is equal to or smaller than a predetermined distance, the operation signal generation unit 13 may specify a single input position from the right-eye and left-eye lines of sight. For example, the operation signal generation unit 13 specifies, as the input position, the midpoint of the line segment connecting the right-eye gaze point and the left-eye gaze point on the operation surface.
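The rule for deriving one or two input positions from the two gaze points can be sketched as follows. This is only an illustration: the function name `input_positions`, the 2D coordinate representation, and the parameter name `merge_distance` (standing in for the "predetermined distance") are assumptions, not part of the specification.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) on the operation surface


def input_positions(right: Point, left: Point,
                    merge_distance: float) -> List[Point]:
    """Return one input position when the two gaze points are close
    enough, otherwise the right-eye and left-eye positions separately."""
    if math.dist(right, left) <= merge_distance:
        # Midpoint of the segment connecting the two gaze points
        mid = ((right[0] + left[0]) / 2, (right[1] + left[1]) / 2)
        return [mid]
    # First input position (right eye), second input position (left eye)
    return [right, left]
```

Coinciding gaze points fall out of the same rule, since their distance is zero and the midpoint equals both points.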

The operation signal generation unit 13 also determines that the specified input position has been selected when the volume of the sound is equal to or higher than the volume threshold. In this case, the user can perform a selection operation by blowing on the microphone that picks up the sound.
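A minimal sketch of the volume comparison is shown below. The specification does not say how the volume is measured; this example assumes the volume of a short frame of audio samples is taken as its root-mean-square (RMS) amplitude. The function names and the sample format are hypothetical.

```python
import math
from typing import Sequence


def frame_volume(samples: Sequence[float]) -> float:
    """RMS amplitude of one audio frame (an assumed volume measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_selected(samples: Sequence[float], volume_threshold: float) -> bool:
    """The input position counts as selected while the volume is equal to
    or higher than the volume threshold."""
    return frame_volume(samples) >= volume_threshold
```

Calling `is_selected` once per audio frame yields a stream of selected/not-selected states that can be combined with the input positions obtained from the line of sight.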

The operation signal generation unit 13 may also determine whether an input position has been selected based on sound information other than the volume. For example, the operation signal generation unit 13 may use speech recognition to make this determination. In this case, the operation signal generation unit 13 analyzes the input sound and determines that the input position has been selected when it detects a predetermined word. The following description assumes that sound is input by blowing on the microphone.

With this configuration, when the user directs the line of sight onto the operation surface, the position of the point on the operation surface specified by the line of sight is treated as the input position (pointing position), and when the user blows on the microphone, this input position is determined to have been selected.

Input devices such as mice and touch-type input devices support various input operations based on the input position and the selection of that position. A touch-type input device, for example, supports click, long-press, drag, and pinch operations. A mouse supports a pointing operation in addition to these input operations.

Here, a click operation is an operation of touching a specific place on the operation surface and releasing it immediately. A long-press operation is an operation of touching a specific place on the operation surface, waiting for a certain time, and then releasing it. A drag operation is an operation of touching a specific place on the operation surface, moving the operation body while it remains in touch, and then releasing it. A pinch operation is an operation of touching two points on the operation surface and changing the distance between them. A pointing operation is an operation of moving the mouse without pressing a button.

Instead of touching the operation surface of a touch-type input unit or moving a mouse and pressing its button, the user can perform operations equivalent to the input operations listed above by moving the line of sight and blowing on the microphone.

For example, to perform a pointing operation, the user simply directs the line of sight onto the operation surface without blowing on the microphone. In this case, the operation signal generation unit 13 specifies the input position based on the line-of-sight information and determines, based on the sound information, that the input position has not been selected. The operation signal generated by the operation signal generation unit 13 then corresponds to a pointing operation, and the information processing apparatus 100 executes the same processing as when a pointing operation is performed.

To perform a click operation, the user directs the line of sight to the position to be clicked on the operation surface and, keeping the line of sight fixed, blows on the microphone for a short time no longer than a time threshold. In this case, the operation signal generation unit 13 specifies the input position based on the line-of-sight information and determines, based on the sound information, that the input position was selected and that the selection was released within a short time no longer than the time threshold. The operation signal generated by the operation signal generation unit 13 then corresponds to a click operation, and the information processing apparatus 100 executes the same processing as when a click operation is performed.

To perform a long-press operation, the user directs the line of sight to the position to be long-pressed on the operation surface and, keeping the line of sight fixed, blows on the microphone for at least the time threshold. In this case, the operation signal generation unit 13 specifies the input position based on the line-of-sight information and determines, based on the sound information, that the input position was selected, that the selected state continued for at least the time threshold, and that the selection was then released. The operation signal generated by the operation signal generation unit 13 then corresponds to a long-press operation, and the information processing apparatus 100 executes the same processing as when a long-press operation is performed.

To perform a drag operation, the user directs the line of sight to the position on the operation surface where the drag is to start, moves the line of sight in the desired drag direction while blowing on the microphone for at least the time threshold, and stops blowing once the line of sight reaches the position where the drag is to end. In this case, the operation signal generation unit 13 specifies the input position based on the line-of-sight information and determines, based on the sound information, that the input position was selected, that it moved while remaining selected, and that the selection was then released. The operation signal generated by the operation signal generation unit 13 then corresponds to a drag operation, and the information processing apparatus 100 executes the same processing as when a drag operation is performed.

To perform a pinch operation, the user, with the line of sight directed onto the operation surface, changes the distance between the two lines of sight while blowing on the microphone for at least the time threshold. The user can narrow the distance between the lines of sight by crossing the eyes, and can widen it by turning the eyes outward (as in exotropia). In this case, the operation signal generation unit 13 specifies the input positions based on the line-of-sight information and detects that the distance between the two input positions changes when the distance between the lines of sight is changed. The operation signal generated by the operation signal generation unit 13 then corresponds to a pinch operation, and the information processing apparatus 100 executes the same processing as when a pinch operation is performed.
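The pointing, click, long-press, drag, and pinch cases described above can be summarized in a small classifier. The following is a sketch under stated assumptions, not the determination procedure of the specification: the `Sample` record, the function name `classify`, the default `time_threshold`, and the `move_tolerance` used to decide whether a position "moved" are all hypothetical. It also assumes a single contiguous selection interval, at least one input position per sample, and treats a duration exactly equal to the time threshold as a long press.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Sample:
    t: float                    # time in seconds
    selected: bool              # True while volume >= volume threshold
    positions: Sequence[Point]  # one or two input positions from the gaze


def classify(samples: List[Sample], time_threshold: float = 0.5,
             move_tolerance: float = 20.0) -> str:
    """Classify one gesture from gaze and sound samples (illustrative)."""
    held = [s for s in samples if s.selected]
    if not held:
        return "pointing"  # gaze only, no blowing on the microphone
    first, last = held[0], held[-1]
    # Selection duration, approximated by first and last selected samples
    if last.t - first.t < time_threshold:
        return "click"
    # Two input positions whose spacing changed while selected: pinch
    if len(first.positions) == 2 and len(last.positions) == 2:
        gap_change = abs(math.dist(*last.positions)
                         - math.dist(*first.positions))
        if gap_change > move_tolerance:
            return "pinch"
    # Input position moved while it stayed selected: drag
    if math.dist(first.positions[0], last.positions[0]) > move_tolerance:
        return "drag"
    return "long_press"
```

For example, samples that stay selected for one second while a single input position travels from (0, 0) to (100, 0) are classified as a drag, whereas the same duration with no movement is classified as a long press.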

The control unit 14 reads a program stored in the storage unit 15, executes the program, and operates according to the operation signal. The control unit 14 executes processing according to the operation signal output from the operation signal generation unit 13. For example, when the operation signal indicates an input position that is not selected, the control unit 14 displays a pointer at the input position indicated by the operation signal. This allows the user to recognize the position on the display screen to which the line of sight is currently pointing.

The control unit 14 generates a display screen and causes a display device to display it. The control unit 14 may be configured to connect to an external display device or may itself include a display device. The display device is, for example, a display panel or a projector that projects the display screen.

The storage unit 15 is a storage medium that stores information. The storage unit 15 is, for example, a non-volatile memory such as a flash memory, an MRAM (Magnetoresistive Random Access Memory), an FeRAM (Ferroelectric Random Access Memory), a PRAM (Phase change Random Access Memory), or an EEPROM (Electrically Erasable Programmable Read-Only Memory), or a magnetic recording medium such as an HDD (Hard Disk Drive). The storage unit 15 stores, for example, a program that defines the operation of the information processing apparatus 100.

An example of the detailed configuration of the line-of-sight information acquisition unit 11 will now be described. FIG. 2 is a block diagram showing a configuration example of the line-of-sight information acquisition unit according to the present embodiment. The line-of-sight information acquisition unit 11 includes a camera 21, a line-of-sight detection image storage memory 22, and a line-of-sight detection processing unit 23.

The camera 21 is an imaging device that captures a line-of-sight detection image of the user. As the line-of-sight detection image, the camera 21 acquires an image that includes the user's face, in particular the eyeballs. FIG. 3 is an explanatory diagram showing an example of a line-of-sight detection image used to detect the user's line of sight.

The line-of-sight detection image storage memory 22 stores the line-of-sight detection image captured by the camera 21. The line-of-sight detection image storage memory 22 may be, for example, physically the same storage device as the storage unit 15 shown in FIG. 1, or a storage device separate from the storage unit 15.

The line-of-sight detection processing unit 23 detects the user's line of sight based on the line-of-sight detection image. For example, the line-of-sight detection processing unit 23 performs edge detection processing and face detection processing on the line-of-sight detection image to detect the line of sight. Edge detection processing detects the boundaries between regions in an image. Because the change in luminance is large at a region boundary, the line-of-sight detection processing unit 23 calculates the luminance Y of each pixel in the line-of-sight detection image and detects region boundaries by identifying the places where the change in Y is large.

Specifically, the line-of-sight detection processing unit 23 calculates the luminance of each pixel of the line-of-sight detection image using, for example, formula (1) below. Formula (1) is the ITU-R BT.601 conversion formula, which calculates the luminance from the intensities (R, G, B) of red, green, and blue.

Ｙ＝（７７×Ｒ／２５６）＋（１５０×Ｇ／２５６）＋（２９×Ｂ／２５６）・・・数式（１） Y = (77 × R / 256) + (150 × G / 256) + (29 × B / 256) ... Formula (1)

そして視線検出処理部２３は、各画素を輝度信号に変換した後、隣接する画素の輝度信号の値の差分の絶対値に基づいてエッジ検出を行う。図４は、視線検出画像のエッジ検出処理結果について説明するための説明図である。図３の視線検出画像に対してエッジ検出を行うと、画像中の輝度の変化が大きい箇所、すなわち領域の境界部分が図４に示すように検出される。 After converting each pixel into a luminance signal, the line-of-sight detection processing unit 23 performs edge detection based on the absolute value of the difference between the luminance values of adjacent pixels. FIG. 4 is an explanatory diagram showing the result of the edge detection process on the line-of-sight detection image. When edge detection is performed on the line-of-sight detection image of FIG. 3, the portions of the image where the luminance changes sharply, that is, the boundaries between regions, are detected as shown in FIG. 4.
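
The luminance conversion of formula (1) and the adjacent-pixel difference test can be sketched as follows. This is a minimal illustration, not the implementation of the line-of-sight detection processing unit 23: the function names, the image representation (rows of (R, G, B) tuples), the choice of right and lower neighbours, and the threshold value of 32 are all assumptions made for the example.

```python
def luminance(r, g, b):
    """Formula (1): Y = (77*R + 150*G + 29*B) / 256 (the weights sum to 256)."""
    return (77 * r + 150 * g + 29 * b) / 256


def detect_edges(image, threshold=32):
    """Mark pixels whose luminance differs sharply from an adjacent pixel.

    image: list of rows, each row a list of (R, G, B) tuples.
    threshold: hypothetical value; the source does not specify one.
    Returns a grid of booleans with the same shape as image.
    """
    ys = [[luminance(*px) for px in row] for row in image]
    height = len(ys)
    width = len(ys[0]) if height else 0
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Absolute luminance difference to the right and lower neighbours.
            right = abs(ys[y][x] - ys[y][x + 1]) if x + 1 < width else 0
            below = abs(ys[y][x] - ys[y + 1][x]) if y + 1 < height else 0
            edges[y][x] = max(right, below) >= threshold
    return edges
```

With this sketch, a row of two black pixels followed by a white pixel marks only the middle pixel, because that is where the luminance jumps.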

視線検出処理部２３は、上記のエッジ検出結果および顔検出技術を用いて、視線検出画像中の顔部分を検出する。例えば視線検出処理部２３は、エッジ検出結果から下に凸形状の部分を検出し、さらに視線検出画像の色を解析することによって、検出した部分が肌色であれば当該部分を顔部分とすることができる。 The line-of-sight detection processing unit 23 detects the face portion in the line-of-sight detection image using the above edge detection result and a face detection technique. For example, the line-of-sight detection processing unit 23 detects a downwardly convex portion from the edge detection result and then analyzes the color of the line-of-sight detection image; if the detected portion is skin-colored, it can treat that portion as the face portion.

また視線検出処理部２３は、視線検出画像中の顔部分を検出すると、当該顔部分の中の目を検出する。このとき視線検出処理部２３は、エッジ検出情報を用いる。また視線検出処理部２３は、検出した目の内部のエッジ検出情報を用いて、眼球の位置および向きを検出する。そして視線検出処理部２３は、検出した眼球の位置および向きから視線を検出する。 Further, when detecting the face portion in the line-of-sight detection image, the line-of-sight detection processing unit 23 detects eyes in the face portion. At this time, the line-of-sight detection processing unit 23 uses edge detection information. The line-of-sight detection processing unit 23 detects the position and orientation of the eyeball using the detected edge detection information inside the eye. The line-of-sight detection processing unit 23 detects the line of sight from the detected position and orientation of the eyeball.

以上、視線情報取得部１１の詳細な構成例について説明した。ここでは、視線情報取得部１１は、視線検出画像を取得するカメラ２１、視線検出画像格納メモリ２２、および視線検出画像を解析する視線検出処理部２３を有することとしたが、本発明はかかる例に限定されない。例えば、情報処理装置１００と接続される外部装置が、カメラ２１、視線検出画像格納メモリ２２、および視線検出処理部２３の一部または全部の機能を有してもよい。 The detailed configuration example of the line-of-sight information acquisition unit 11 has been described above. Here, the line-of-sight information acquisition unit 11 includes the camera 21 that acquires the line-of-sight detection image, the line-of-sight detection image storage memory 22, and the line-of-sight detection processing unit 23 that analyzes the line-of-sight detection image, but the present invention is not limited to this example. For example, an external device connected to the information processing apparatus 100 may have some or all of the functions of the camera 21, the line-of-sight detection image storage memory 22, and the line-of-sight detection processing unit 23.

また音声情報取得部１２の詳細な構成の一例について説明する。図５は、音声情報取得部の構成例を示すブロック図である。音声情報取得部１２は、マイク２４と、Ａ／Ｄ（Ａｎａｌｏｇ／Ｄｉｇｉｔａｌ）変換機２５と、音データ格納メモリ２６と、音量検出部２７とを有する。 An example of a detailed configuration of the voice information acquisition unit 12 will be described. FIG. 5 is a block diagram illustrating a configuration example of the audio information acquisition unit. The audio information acquisition unit 12 includes a microphone 24, an A / D (Analog / Digital) converter 25, a sound data storage memory 26, and a volume detection unit 27.

マイク２４は、音を電気信号に変換する機器である。マイク２４は、音をアナログ信号に変換して、Ａ／Ｄ変換機２５に出力する。 The microphone 24 is a device that converts sound into an electrical signal. The microphone 24 converts the sound into an analog signal and outputs it to the A / D converter 25.

Ａ／Ｄ変換機２５は、マイク２４から出力された音のアナログ信号をデジタル信号に変換する。Ａ／Ｄ変換機２５は、例えばΣΔ変調を用いてアナログ信号をデジタル信号へ変換する。また、Ａ／Ｄ変換機２５は、例えばアナログ信号を１秒間に８０００回サンプリングして、デジタル量で示されるデジタル信号を取得する。 The A / D converter 25 converts the analog signal of the sound output from the microphone 24 into a digital signal. The A / D converter 25 converts an analog signal into a digital signal using, for example, ΣΔ modulation. Further, the A / D converter 25 samples an analog signal 8000 times per second, for example, and acquires a digital signal indicated by a digital quantity.

音データ格納メモリ２６は、Ａ／Ｄ変換機２５にてデジタル信号に変換された音データを格納する記憶部である。音データ格納メモリ２６は、例えば図１に示す記憶部１５と物理的に同じ記憶装置でもよいし、記憶部１５とは別体の記憶装置でもよい。 The sound data storage memory 26 is a storage unit that stores sound data converted into a digital signal by the A / D converter 25. The sound data storage memory 26 may be a storage device physically the same as the storage unit 15 shown in FIG. 1, for example, or may be a separate storage device from the storage unit 15.

音量検出部２７は、音データ格納メモリ２６に格納された音データから、音の音量を検出する。より具体的には、音量検出部２７は、音データのデジタル値の絶対値に基づいて音量を検出する。音量検出部２７は、このデジタル量の絶対値を音量情報として、操作信号生成部１３に出力する。 The sound volume detector 27 detects the sound volume from the sound data stored in the sound data storage memory 26. More specifically, the volume detector 27 detects the volume based on the absolute value of the digital value of the sound data. The volume detection unit 27 outputs the absolute value of the digital quantity as volume information to the operation signal generation unit 13.

また音量検出部２７は、音量が音量閾値以上であるかを判断し、音量が音量閾値以上であるか否かを示す情報を、音量情報として、操作信号生成部１３に出力してもよい。ここで音量検出部２７は、例えばデジタル量の絶対値が音量閾値以上であるサンプルの数が一定割合以上である場合に、この音の音量が音量閾値以上であると判断する。 The sound volume detection unit 27 may determine whether the sound volume is equal to or higher than the sound volume threshold value and output information indicating whether the sound volume is equal to or higher than the sound volume threshold value to the operation signal generation unit 13 as sound volume information. Here, the volume detector 27 determines that the volume of this sound is equal to or greater than the volume threshold when, for example, the number of samples whose digital value absolute value is equal to or greater than the volume threshold is equal to or greater than a certain ratio.
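
The ratio-based volume decision described above can be sketched as follows. This is an illustrative sketch only: the function name, the signed-integer sample representation, and the default ratio of 0.5 are assumptions, since the source says only "a certain ratio".

```python
def volume_exceeds_threshold(samples, volume_threshold, min_ratio=0.5):
    """Decide whether a block of sound data counts as loud.

    samples: digitized sound data (signed integers from the A/D converter).
    volume_threshold: minimum absolute sample value that counts as loud.
    min_ratio: hypothetical fraction of loud samples required.
    """
    if not samples:
        return False
    loud = sum(1 for s in samples if abs(s) >= volume_threshold)
    return loud / len(samples) >= min_ratio
```

At the 8000 samples per second mentioned for the A/D converter 25, a block of 800 samples would correspond to 0.1 seconds of sound; the block length itself is not specified in the source.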

以上、情報処理装置１００の機能構成について説明した。ここで示す機能を実現するためのハードウェア構成については、様々な形態をとることができる。例えば上記実施形態においては、情報処理装置１００は、カメラ２１およびマイク２４を有することとしたが、本発明はかかる例に限定されない。例えばカメラ２１およびマイク２４は、ヘッドセットなどの外部装置に設けられ、情報処理装置１００は、この外部装置から視線検出画像および音データを取得してもよい。また視線検出画像および音データを解析する機能についても、外部装置が有することとしてもよい。 Heretofore, the functional configuration of the information processing apparatus 100 has been described. The hardware configuration for realizing the functions shown here can take various forms. For example, in the above embodiment, the information processing apparatus 100 includes the camera 21 and the microphone 24, but the present invention is not limited to such an example. For example, the camera 21 and the microphone 24 may be provided in an external device such as a headset, and the information processing apparatus 100 may acquire a line-of-sight detection image and sound data from the external device. The external device may have a function of analyzing the line-of-sight detection image and the sound data.

次に、制御部１４が、操作信号生成部１３の生成した操作信号が示す入力操作により実行する処理を判定する判定処理の一例について説明する。図６は、本実施形態にかかる制御部が、操作信号が示す入力操作により実行する処理を判定する判定処理の一例を説明するためのフローチャートである。なお図６の判定処理は、視線に基づいて入力位置が特定されている状態で実行される。 Next, an example of determination processing in which the control unit 14 determines processing to be executed by an input operation indicated by the operation signal generated by the operation signal generation unit 13 will be described. FIG. 6 is a flowchart for explaining an example of a determination process in which the control unit according to the present embodiment determines a process to be executed by an input operation indicated by the operation signal. Note that the determination process of FIG. 6 is executed in a state where the input position is specified based on the line of sight.

まず制御部１４は、操作信号に基づいて、入力位置は選択されているか否かを判断する（ステップＳ１００）。そして、入力位置が選択されていない場合、制御部１４は、ポインティング操作に対応づけられた処理を実行する（ステップＳ１０５）。 First, the control unit 14 determines whether or not the input position is selected based on the operation signal (step S100). When the input position is not selected, the control unit 14 executes a process associated with the pointing operation (step S105).

一方、入力位置は選択されている場合、制御部１４は、次に選択した状態は時間閾値以上継続しているか否かを判断する（ステップＳ１１０）。そして選択した状態は時間閾値以上継続していないとき、制御部１４は、クリック操作に対応付けられた処理を実行する（ステップＳ１１５）。 On the other hand, when the input position is selected, the control unit 14 next determines whether or not the selected state has continued for at least a time threshold (step S110). When the selected state has not continued for at least the time threshold, the control unit 14 executes the process associated with the click operation (step S115).

一方、選択した状態は時間閾値以上継続している場合、制御部１４は、次に入力位置が移動したか否かを判断する（ステップＳ１２０）。そして入力位置が移動していない場合、制御部１４は、長押し操作に対応付けられた処理を実行する（ステップＳ１２５）。 On the other hand, when the selected state continues for the time threshold or more, the control unit 14 determines whether or not the input position has moved (step S120). If the input position has not moved, the control unit 14 executes a process associated with the long press operation (step S125).

一方、入力位置が移動した場合、次に制御部１４は、左右の入力位置の間隔が変化したか否かを判断する（ステップＳ１３０）。左右の入力位置の間隔が変化していない場合、制御部１４は、ドラッグ操作に対応づけられた処理を実行する（ステップＳ１３５）。一方、左右の入力位置の間隔が変化した場合、制御部１４は、ピンチ操作に対応付けられた処理を実行する（ステップＳ１４０）。 On the other hand, when the input position has moved, the control unit 14 next determines whether or not the interval between the left and right input positions has changed (step S130). When the interval between the left and right input positions has not changed, the control unit 14 executes a process associated with the drag operation (step S135). On the other hand, when the interval between the left and right input positions changes, the control unit 14 executes a process associated with the pinch operation (step S140).
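
The determination flow of FIG. 6 (steps S100 to S140) can be summarized in a short sketch. The function name, the parameter names, and the default time threshold of 0.5 seconds are assumptions for illustration; the source does not give a concrete threshold value.

```python
from enum import Enum


class Operation(Enum):
    POINTING = "pointing"
    CLICK = "click"
    LONG_PRESS = "long press"
    DRAG = "drag"
    PINCH = "pinch"


def classify_operation(selected, held_seconds, moved, gap_changed,
                       time_threshold=0.5):
    """Map the state of an operation signal to an input operation.

    selected: whether the input position is selected (S100).
    held_seconds: how long the selected state has continued (S110).
    moved: whether the input position has moved (S120).
    gap_changed: whether the left/right input position gap changed (S130).
    time_threshold: hypothetical value in seconds.
    """
    if not selected:
        return Operation.POINTING      # S105
    if held_seconds < time_threshold:
        return Operation.CLICK         # S115
    if not moved:
        return Operation.LONG_PRESS    # S125
    if not gap_changed:
        return Operation.DRAG          # S135
    return Operation.PINCH             # S140
```

The checks are ordered exactly as in the flowchart, so each later branch is reached only when every earlier condition has been satisfied.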

以上説明したように本実施形態によれば、視線情報および音声情報に基づいて、操作面に対する入力操作を示す操作信号を生成することができる。このため、ユーザは、手以外のものを用いた入力操作を実現することができる。また視線のみ、または音声のみを用いる場合と比較して、多様な入力操作を可能とする。したがって、例えば手を使うことが難しい状況であっても、ユーザは、操作を行うことが可能となるため、操作性に優れた入力操作を実現することができる。 As described above, according to the present embodiment, an operation signal indicating an input operation on the operation surface can be generated based on line-of-sight information and audio information. For this reason, the user can realize an input operation using something other than the hand. In addition, various input operations are possible as compared with the case where only the line of sight or only the voice is used. Therefore, for example, even in a situation where it is difficult to use a hand, the user can perform an operation, and thus an input operation with excellent operability can be realized.

また本実施形態では、操作信号生成部１３は、視線情報に基づいて操作面上の入力位置を特定し、音声情報に基づいて入力位置を選択したか否かを判断し、当該判断の結果および入力位置に基づいて、操作信号を生成する。このため、ユーザは、視線の入力により操作面上の位置を指定し、この指定した位置に対して選択するか否かを音声の入力により操作することができる。操作面上の入力位置を選択してこの選択した入力位置を動かすことにより行う入力操作は、マウスやタッチセンサなどにより広く用いられている。上記の構成によれば、ユーザは、マウスやタッチセンサなどで行っていた使い慣れた操作を、視線と音声とを用いて行うことができる。 In the present embodiment, the operation signal generation unit 13 specifies the input position on the operation surface based on the line-of-sight information, determines whether or not the input position has been selected based on the audio information, and generates the operation signal based on the result of that determination and the input position. The user can therefore designate a position on the operation surface by line-of-sight input and control, by sound input, whether or not the designated position is selected. Input operations performed by selecting an input position on the operation surface and moving the selected position are widely used with mice, touch sensors, and the like. With the above configuration, the user can perform the familiar operations previously carried out with a mouse or touch sensor by using line of sight and sound instead.

また本実施形態では、音声情報は、音声の音量を示す音量情報を含み、操作信号生成部１３は、音量情報に基づいて、入力位置を選択したか否かを判断する。具体的には、操作信号生成部１３は、音量が音量閾値以上となる場合、入力位置を選択したと判断する。入力位置を選択したか否かを判断するために音量を用いることで、情報処理装置１００は、例えば音声を取得するマイクに息を吹きかけることによって、入力操作を行うことができる。操作信号生成部１３は、音量以外の音声情報に基づいて、例えば音声認識を用いて入力位置が選択されたか否かを判断してもよいが、音声認識を用いる場合には、入力操作に言語依存性があり、人ではなく装置に向かって声を発することに対して、ユーザが抵抗感を感じることがある。これに対して、音量を用いて、マイクに対して息を吹きかけることにより音声を入力する場合には、言語依存性がなく、上記の抵抗感を軽減することができるという利点がある。 In the present embodiment, the audio information includes volume information indicating the volume of the sound, and the operation signal generation unit 13 determines whether or not the input position has been selected based on the volume information. Specifically, the operation signal generation unit 13 determines that the input position has been selected when the volume is equal to or greater than the volume threshold. Because the volume is used to determine whether the input position has been selected, the information processing apparatus 100 can accept an input operation made, for example, by blowing on the microphone that acquires the sound. The operation signal generation unit 13 may instead determine whether the input position has been selected based on audio information other than the volume, for example by using voice recognition. With voice recognition, however, the input operation is language dependent, and the user may feel reluctant to speak to a device rather than to a person. In contrast, inputting sound by blowing on the microphone and using only the volume has the advantage that there is no language dependence and that this reluctance can be reduced.
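
How a gaze-derived input position and a volume reading might be combined into an operation signal can be sketched as follows. The `OperationSignal` type and the function name are hypothetical; the source specifies only that the signal is generated from the input position and the result of the selection decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationSignal:
    position: tuple   # input position on the operation surface, from the gaze
    selected: bool    # True when the volume reached the volume threshold


def generate_operation_signal(gaze_position, volume, volume_threshold):
    """Combine line-of-sight and volume information into one signal."""
    return OperationSignal(position=gaze_position,
                           selected=volume >= volume_threshold)
```

For example, a gaze position of (120, 80) with a volume at or above the threshold yields a signal whose `selected` field is true, which downstream processing could treat like a mouse button press at that position.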

また本実施形態では、操作信号生成部１３は、視線情報に含まれる右目の視線を示す情報に基づいて右目に対応する入力位置である第１の入力位置を特定し、左目の視線を示す情報に基づいて左目に対応する入力位置である第２の入力位置を特定する。右目の注視点と左目の注視点とが一致している場合には、第１の入力位置と第２の入力位置とは重なり合うため、１つの入力位置として取り扱うことができる。一方、ユーザがいわゆる「より目」や「外斜視」の状態にする動作を行うと、第１の入力位置は第２の入力位置とは異なる位置となる。したがってこの場合、ユーザは、操作面上に２つの位置を入力することができる。 In the present embodiment, the operation signal generation unit 13 specifies a first input position, which is the input position corresponding to the right eye, based on the information indicating the line of sight of the right eye included in the line-of-sight information, and specifies a second input position, which is the input position corresponding to the left eye, based on the information indicating the line of sight of the left eye. When the gaze point of the right eye coincides with that of the left eye, the first and second input positions overlap and can be handled as a single input position. When the user deliberately crosses the eyes inward (so-called "cross-eyed") or turns them outward (exotropia), the first input position differs from the second input position. In this case, the user can input two positions on the operation surface.
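
The handling of the two per-eye input positions can be sketched as follows. The tolerance of 10 units and the use of the midpoint for coinciding gaze points are assumptions; the source states only that overlapping positions can be handled as one input position.

```python
import math


def resolve_input_positions(right_point, left_point, tolerance=10.0):
    """Return one position when both gaze points coincide, otherwise two.

    right_point, left_point: (x, y) gaze points on the operation surface.
    tolerance: hypothetical distance below which the points are treated
    as the same input position.
    """
    if math.dist(right_point, left_point) <= tolerance:
        midpoint = ((right_point[0] + left_point[0]) / 2,
                    (right_point[1] + left_point[1]) / 2)
        return [midpoint]
    return [right_point, left_point]
```

When two positions are returned, the change in distance between them over time is the quantity that a pinch-style decision would examine.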

（第２の実施形態） (Second Embodiment)

次に、本発明の第２の実施形態について説明する。 Next, a second embodiment of the present invention will be described.

図７は、本発明の第２の実施形態にかかる情報処理装置の機能構成を示すブロック図である。図７に示す情報処理装置２００は、視線情報取得部１１と、音声情報取得部１２と、操作信号生成部１３と、制御部１４と、記憶部１５と、タッチ式入力部１６と、動作モード設定部１７とを有する。すなわち、情報処理装置２００は、第１の実施形態にかかる情報処理装置１００の構成に加えて、タッチ式入力部１６と、動作モード設定部１７とをさらに有する。以下、第１の実施形態にかかる情報処理装置１００と同様の構成については説明を省略し、差異部分について主に説明する。 FIG. 7 is a block diagram showing the functional configuration of the information processing apparatus according to the second embodiment of the present invention. The information processing apparatus 200 illustrated in FIG. 7 includes a line-of-sight information acquisition unit 11, an audio information acquisition unit 12, an operation signal generation unit 13, a control unit 14, a storage unit 15, a touch-type input unit 16, and an operation mode setting unit 17. That is, in addition to the configuration of the information processing apparatus 100 according to the first embodiment, the information processing apparatus 200 further includes the touch-type input unit 16 and the operation mode setting unit 17. Hereinafter, the description of the configuration shared with the information processing apparatus 100 according to the first embodiment is omitted, and the differences are mainly described.

タッチ式入力部１６は、操作面を有し、操作面上に接触または近接する操作体の位置を検知する。タッチ式入力部１６は、検知した操作体の位置情報を操作信号生成部１３に出力する。 The touch-type input unit 16 has an operation surface and detects the position of the operation body that is in contact with or close to the operation surface. The touch input unit 16 outputs the detected position information of the operating tool to the operation signal generation unit 13.

動作モード設定部１７は、操作信号生成部１３に動作モードを設定する。動作モード設定部１７は、タッチ式入力部１６の検知した位置に基づいて操作信号を生成する第１の動作モードと、視線情報および音声情報に基づいて操作信号を生成する第２の動作モードと、のうちいずれかの動作モードを操作信号生成部１３に設定する。また動作モード設定部１７は、操作信号生成部１３に第２の動作モードを設定するとき、タッチ式入力部１６への電力の供給を停止する。なお動作モード設定部１７は、例えばユーザが入力する動作モード設定操作を検知して、この動作モード設定操作に従って動作モードを設定する。 The operation mode setting unit 17 sets an operation mode in the operation signal generation unit 13. Specifically, it sets either a first operation mode, in which the operation signal is generated based on the position detected by the touch-type input unit 16, or a second operation mode, in which the operation signal is generated based on the line-of-sight information and the audio information. When setting the second operation mode in the operation signal generation unit 13, the operation mode setting unit 17 stops the supply of power to the touch-type input unit 16. The operation mode setting unit 17 detects, for example, an operation mode setting operation input by the user and sets the operation mode according to that operation.

ここで本発明の第２の実施形態にかかる情報処理装置２００の動作モード設定部１７の動作例について説明する。図８は、本実施形態にかかる動作モード設定部の動作例を説明するためのフローチャートである。 Here, an operation example of the operation mode setting unit 17 of the information processing apparatus 200 according to the second embodiment of the present invention will be described. FIG. 8 is a flowchart for explaining an operation example of the operation mode setting unit according to the present embodiment.

まず動作モード設定部１７は、操作信号生成部１３から供給される操作信号に基づいて、動作モード設定操作を検知する（ステップＳ２００）。そして動作モード設定部１７は、動作モード設定操作に従って動作モードを設定する（ステップＳ２０５）。 First, the operation mode setting unit 17 detects an operation mode setting operation based on the operation signal supplied from the operation signal generation unit 13 (step S200). Then, the operation mode setting unit 17 sets the operation mode according to the operation mode setting operation (step S205).

続いて、動作モード設定部１７は、設定した動作モードが第２の動作モードであるか否かを判断する（ステップＳ２１０）。ここで設定した動作モードが第２の動作モードである場合、動作モード設定部１７は、タッチ式入力部１６への電力の供給を停止する（ステップＳ２１５）。 Subsequently, the operation mode setting unit 17 determines whether or not the set operation mode is the second operation mode (step S210). When the operation mode set here is the second operation mode, the operation mode setting unit 17 stops the supply of power to the touch input unit 16 (step S215).
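
The mode-setting flow of FIG. 8 (steps S205 to S215) can be sketched as follows. The class and function names are hypothetical stand-ins for the touch-type input unit 16 and the operation signal generation unit 13. The source does not describe what happens to the power supply when the first operation mode is set, so this sketch leaves the power state unchanged in that case.

```python
from enum import Enum


class OperationMode(Enum):
    TOUCH = 1            # first operation mode
    GAZE_AND_SOUND = 2   # second operation mode


class TouchInput:
    """Hypothetical stand-in for the touch-type input unit 16."""
    def __init__(self):
        self.powered = True

    def power_off(self):
        self.powered = False


class SignalGenerator:
    """Hypothetical stand-in for the operation signal generation unit 13."""
    def __init__(self):
        self.mode = OperationMode.TOUCH


def set_operation_mode(requested, generator, touch_input):
    generator.mode = requested                     # S205: set the mode
    if requested is OperationMode.GAZE_AND_SOUND:  # S210: second mode?
        touch_input.power_off()                    # S215: stop power supply
```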

以上説明したように本実施形態においては、情報処理装置２００は、操作面上に接触または近接した位置を検知するタッチ式入力部１６を有し、このタッチ式入力部１６が検知した位置に基づいて操作信号を生成する第１の動作モードと、視線情報および音声情報に基づいて操作信号を生成する第２の動作モードと、のうちいずれかの動作モードを操作信号生成部１３に設定する動作モード設定部１７をさらに有する。そして動作モード設定部１７は、操作信号生成部１３を第２の動作モードに設定するとき、タッチ式入力部１６への電力の供給を停止する。これにより、情報処理装置２００は、視線情報および音声情報に基づいて操作信号を生成する第２の動作モードで動作する場合に、タッチ式入力部１６の動作電力の分の消費電力を低減することができる。 As described above, in the present embodiment, the information processing apparatus 200 includes the touch-type input unit 16, which detects a position in contact with or close to the operation surface, and further includes the operation mode setting unit 17, which sets in the operation signal generation unit 13 either the first operation mode, in which the operation signal is generated based on the position detected by the touch-type input unit 16, or the second operation mode, in which the operation signal is generated based on the line-of-sight information and the audio information. When setting the operation signal generation unit 13 to the second operation mode, the operation mode setting unit 17 stops the supply of power to the touch-type input unit 16. As a result, when the information processing apparatus 200 operates in the second operation mode, its power consumption is reduced by the amount of the operating power of the touch-type input unit 16.

（第３の実施形態） (Third Embodiment)

次に、本発明の第３の実施形態について説明する。 Next, a third embodiment of the present invention will be described.

図９は、本発明の第３の実施形態にかかる情報処理装置の機能構成を示すブロック図である。図９に示す情報処理装置３００は、視線情報取得部３１と、音声情報取得部３２と、操作信号生成部３３とを有する。 FIG. 9 is a block diagram showing a functional configuration of an information processing apparatus according to the third embodiment of the present invention. An information processing apparatus 300 illustrated in FIG. 9 includes a line-of-sight information acquisition unit 31, an audio information acquisition unit 32, and an operation signal generation unit 33.

視線情報取得部３１は、ユーザの視線を示す視線情報を取得する。 The line-of-sight information acquisition unit 31 acquires line-of-sight information indicating the user's line of sight.

音声情報取得部３２は、音声を示す音声情報を取得する。 The sound information acquisition unit 32 acquires sound information indicating sound.

操作信号生成部３３は、視線情報および音声情報に基づいて、操作面に対する入力操作を示す操作信号を生成する。 The operation signal generation unit 33 generates an operation signal indicating an input operation on the operation surface based on the line-of-sight information and the sound information.

次に本実施形態にかかる情報処理装置３００の動作について説明する。図１０は、本実施形態にかかる情報処理装置の動作を説明するためのフローチャートである。 Next, the operation of the information processing apparatus 300 according to the present embodiment will be described. FIG. 10 is a flowchart for explaining the operation of the information processing apparatus according to the present embodiment.

視線情報取得部３１は、視線情報を取得する（ステップＳ３００）。音声情報取得部３２は、音声情報を取得する（ステップＳ３０５）。そして操作信号生成部３３は、視線情報および音声情報に基づき、操作信号を生成する（ステップＳ３１０）。 The line-of-sight information acquisition unit 31 acquires line-of-sight information (step S300). The audio information acquisition unit 32 acquires audio information (step S305). The operation signal generation unit 33 then generates an operation signal based on the line-of-sight information and the audio information (step S310).

本実施形態でも、視線情報および音声情報に基づいて、操作面に対する入力操作を示す操作信号を生成することができる。このため、ユーザは、手以外のものを用いた入力操作を実現することができる。また視線のみ、または音声のみを用いる場合と比較して、多様な入力操作を可能とする。したがって、例えば手を使うことが難しい状況であっても、ユーザは、操作を行うことが可能となるため、操作性に優れた入力操作を実現することができる。 Also in the present embodiment, an operation signal indicating an input operation on the operation surface can be generated based on the line-of-sight information and audio information. For this reason, the user can realize an input operation using something other than the hand. In addition, various input operations are possible as compared with the case where only the line of sight or only the voice is used. Therefore, for example, even in a situation where it is difficult to use a hand, the user can perform an operation, and thus an input operation with excellent operability can be realized.

以上、本発明の好適な実施形態について説明した。しかし上記において説明した各実施形態で示した構成は単なる一例であって、本発明はその構成に限定されるものではない。 The preferred embodiments of the present invention have been described above. However, the configuration shown in each embodiment described above is merely an example, and the present invention is not limited to the configuration.

例えば情報処理装置１００、２００、および３００の機能は、その機能を実現するためのプログラムを、コンピュータにて読取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータに読み込ませ実行させることで、実現されてもよい。 For example, the functions of the information processing apparatuses 100, 200, and 300 may be realized by recording a program for realizing those functions on a computer-readable recording medium and causing a computer to read and execute the program recorded on that medium.

また上記実施形態で記載した入力操作は一例であり、入力位置と選択された状態か否かを示す情報とを用いた様々な入力操作が用いられてよい。例えば、ダブルクリック操作、フリック操作などと同様の入力操作が用いられてよい。 The input operations described in the above embodiments are examples, and various input operations that use an input position together with information indicating whether or not that position is in the selected state may be used. For example, input operations similar to a double-click operation or a flick operation may be used.

また上記実施形態では、手を用いない入力操作の例を記載したが、本発明はかかる例に限定されない。例えば、手を使った操作と、上記実施形態で記載した視線情報および音声情報に基づいた操作とが組み合わせて用いられることも可能である。 Moreover, although the example of input operation which does not use a hand was described in the said embodiment, this invention is not limited to this example. For example, the operation using the hand and the operation based on the line-of-sight information and the audio information described in the above embodiment can be used in combination.

１００，２００，３００情報処理装置 100, 200, 300: Information processing apparatus
１１，３１視線情報取得部 11, 31: Line-of-sight information acquisition unit
１２，３２音声情報取得部 12, 32: Audio information acquisition unit
１３，３３操作信号生成部 13, 33: Operation signal generation unit
１４制御部 14: Control unit
１５記憶部 15: Storage unit
１６タッチ式入力部 16: Touch-type input unit
１７動作モード設定部 17: Operation mode setting unit

Claims

1. An information processing apparatus comprising:
a line-of-sight information acquisition unit that acquires line-of-sight information indicating a user's line of sight;
an audio information acquisition unit that acquires audio information indicating a sound; and
an operation signal generation unit that generates, based on the line-of-sight information and the audio information, an operation signal indicating an input operation on an operation surface.

2. The information processing apparatus according to claim 1, wherein
the operation signal generation unit specifies an input position on the operation surface based on the line-of-sight information, determines whether or not the input position has been selected based on the audio information, and generates the operation signal based on the result of the determination and the input position.

3. The information processing apparatus according to claim 2, wherein
the audio information includes volume information indicating a volume of the sound, and
the operation signal generation unit determines whether or not the input position has been selected based on the volume information.

4. The information processing apparatus according to claim 3, wherein
the operation signal generation unit determines that the input position has been selected when the volume is equal to or higher than a volume threshold.

5. The information processing apparatus according to any one of claims 2 to 4, wherein
the operation signal generation unit specifies, as the input position, a first input position corresponding to the right eye and a second input position corresponding to the left eye, based respectively on information indicating the line of sight of the right eye and information indicating the line of sight of the left eye included in the line-of-sight information, and
the operation signal generation unit generates the operation signal based on the first input position and the second input position.

6. The information processing apparatus according to any one of claims 1 to 5, further comprising:
a touch-type input unit that detects a position in contact with or close to the operation surface; and
an operation mode setting unit that sets, in the operation signal generation unit, either a first operation mode in which the operation signal is generated based on the position detected by the touch-type input unit or a second operation mode in which the operation signal is generated based on the line-of-sight information and the audio information, wherein
the operation mode setting unit stops supplying power to the touch-type input unit when setting the second operation mode in the operation signal generation unit.

7. An operation signal generation method comprising:
acquiring line-of-sight information indicating a user's line of sight;
acquiring audio information indicating a sound; and
generating, based on the line-of-sight information and the audio information, an operation signal indicating an input operation on an operation surface.

8. A program that causes a computer to execute:
a procedure for acquiring line-of-sight information indicating a user's line of sight;
a procedure for acquiring audio information indicating a sound; and
a procedure for generating, based on the line-of-sight information and the audio information, an operation signal indicating an input operation on an operation surface.