JP2019016141A

JP2019016141A - Input device and input method

Info

Publication number: JP2019016141A
Application number: JP2017132755A
Authority: JP
Inventors: Akihiro Okamoto (岡本 明浩)
Original assignee: Asahi Kasei Corp
Current assignee: Asahi Kasei Corp
Priority date: 2017-07-06
Filing date: 2017-07-06
Publication date: 2019-01-31

Abstract

To provide an input device and an input method capable of suppressing erroneous detection of an instruction input. SOLUTION: An input device comprises: a foreground mask generation part 12 that generates a foreground mask indicating whether the pixel value of each pixel of a current image, which is one frame of a captured video, differs from the pixel value of the pixel at the same coordinates in a background image; a foreground image generation part 13 that masks the current image with the foreground mask to generate a foreground image; an instruction area setting part 14 that sets an instruction area at the coordinates of pixels, among the pixels of an imaging frame of an imaging part, that are not included in the pixel area corresponding to the foreground image; an operation image display part 17 that displays an operation image formed by overwriting, on an operation object image, an instruction area icon image indicating the instruction area and a second foreground image, which is a foreground image generated after the instruction area is set; and an instruction input determination part 18 that determines that an instruction input for a predetermined operation has been performed when the instruction area and the pixel area corresponding to the second foreground image are determined to overlap among the pixels of the imaging frame. SELECTED DRAWING: Figure 2

Description

The present invention relates to an input device and an input method.

Conventionally, an input device has been proposed that captures an image of a user and determines that an instruction input for a predetermined operation has been made when the user's hand in the captured image touches an instruction area (see Patent Document 1). The input device described in Patent Document 1 detects the position of the user's eyes and the position of the user's hand in the image, sets an instruction area at an intermediate position between the detected eye position and hand position, and determines that the user's hand in the image has touched the instruction area when the pixel values of the pixels in the instruction area change.

Patent Document 1: Japanese Patent Application No. 2009-259117

However, when the input device described in Patent Document 1 is used in a home or a similar setting, a moving body other than the user, such as a pet, may appear in the image. In this case, for example, if a moving body such as a pet appears between the position of the user's eyes and the position of the user's hand in the image, an instruction input that the user did not intend is erroneously detected.
The present invention has been made in view of the above problem, and an object thereof is to provide an input device and an input method capable of suppressing erroneous detection of an instruction input.

In order to solve the above problem, one aspect of the present invention is an input device comprising: (a) a current image acquisition unit that acquires, as a current image, one frame of the video output by an imaging unit that continuously images a predetermined space; (b) a background image storage unit that stores a background image generated from video captured in advance at the same angle of view as the current image; (c) a foreground mask generation unit that generates a foreground mask indicating whether the pixel value of each pixel of the current image differs from the pixel value of the pixel at the same coordinates in the background image; (d) a foreground image generation unit that generates a foreground image by masking the current image with the foreground mask; (e) an instruction area setting unit that sets an instruction area at the coordinates of pixels, among the pixels of the imaging frame of the imaging unit, that are not included in the pixel area corresponding to the foreground image; (f) an operation image display unit that causes an image display unit directed toward the predetermined space to display an operation image formed by overwriting, on a predetermined operation target image, an instruction area icon image indicating the instruction area and a second foreground image, which is a foreground image generated from a current image acquired after the instruction area is set; (g) a pixel area overlap determination unit that determines whether the instruction area and the pixel area corresponding to the second foreground image overlap; and (h) an instruction input determination unit that determines that an instruction input for a predetermined operation has been made when the instruction area and the pixel area corresponding to the second foreground image are determined to overlap.

Another aspect of the present invention is an input device comprising (a) an imaging unit that continuously images a predetermined space and (b) an image display unit that displays an operation image, wherein: (c) the operation image is an image in which an instruction area is set, formed by overwriting a current foreground image and an instruction area icon image indicating the instruction area on a predetermined operation target image; (d) the current foreground image is an image formed of those pixels of a current image, acquired from one frame of the video output by the imaging unit, whose pixel values differ from those of the pixels at the same coordinates in a background image captured in advance at the same angle of view as the current image; and (e) the instruction area is set in an area not included in the pixel area corresponding to a past foreground image generated at the time the instruction area is set, and is used to determine whether at least part of the current foreground image generated after the instruction area is set lies within at least part of the instruction area.

Another aspect of the present invention is an input method comprising: (a) a step of acquiring, as a current image, one frame of the video output by an imaging unit that continuously images a predetermined space; (b) a step of generating a foreground mask indicating whether the pixel value of each pixel of the current image differs from the pixel value of the pixel at the same coordinates in a background image captured in advance at the same angle of view as the current image; (c) a step of generating a foreground image by masking the current image with the foreground mask; (d) a step of selecting one or more pixels from among the pixels of the imaging frame of the imaging unit that are not included in the pixel area corresponding to the foreground image, and setting an instruction area at their coordinates; (e) a step of displaying, toward the predetermined space, an operation image formed by overwriting, on a predetermined operation target image, an instruction area icon image indicating the instruction area and a second foreground image, which is a foreground image generated after the instruction area is set; (g) a step of determining whether, among the pixels of the imaging frame of the imaging unit, the instruction area and the pixel area corresponding to the second foreground image overlap; and (h) a step of determining that an instruction input for a predetermined operation has been made when the instruction area and the pixel area corresponding to the second foreground image are determined to overlap.
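The area-setting and overlap-determination steps of the method above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names `place_instruction_area` and `overlaps`, the rectangular shape of the instruction area, and the row-by-row scan for the first free position are all assumptions made for the example. Masks are lists of rows whose entries are 255 (foreground) or 0.

```python
def place_instruction_area(first_mask, width, height):
    """Return the top-left (x, y) of the first width x height rectangle that
    contains no foreground pixel (255) in first_mask, or None if none fits.
    The scan order and rectangular shape are assumptions of this sketch."""
    rows, cols = len(first_mask), len(first_mask[0])
    for y in range(rows - height + 1):
        for x in range(cols - width + 1):
            if all(first_mask[y + dy][x + dx] == 0
                   for dy in range(height) for dx in range(width)):
                return (x, y)
    return None


def overlaps(area, width, height, second_mask):
    """True if any pixel of the instruction area is foreground (255) in the
    mask generated after the area was set (the second foreground image)."""
    x0, y0 = area
    return any(second_mask[y0 + dy][x0 + dx] == 255
               for dy in range(height) for dx in range(width))


# Foreground (user, pet, ...) occupies the left half when the area is set.
first_mask = [[255, 255, 0, 0],
              [255, 255, 0, 0],
              [0,   0,   0, 0]]
area = place_instruction_area(first_mask, 2, 2)

# Later, a foreground pixel (for example, the user's hand) enters the area.
second_mask = [[255, 255, 0, 0],
               [255, 255, 0, 255],
               [0,   0,   0, 0]]
instruction_detected = overlaps(area, 2, 2, second_mask)
```

Because the area is placed only where the first mask is empty, anything that was already foreground at that moment cannot trigger an input simply by staying where it is.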

According to the present invention, a portion showing a moving body other than the user, such as a pet, is also included in the foreground image, and no instruction area is set in the pixel area corresponding to the foreground image. An input device and an input method capable of suppressing erroneous detection of an instruction input can therefore be provided.

A conceptual diagram showing the configuration of an input device according to an embodiment of the present invention.
A block diagram showing the internal configuration of the processor.
A block diagram showing the internal configuration of the storage device.
A flowchart showing the background image storage process.
A block diagram showing the internal configuration of a processor according to a modification.
An explanatory diagram for explaining the background image.
A flowchart showing the execution permission determination process.
A block diagram showing the internal configuration of a processor according to a modification.
A flowchart showing the instruction detection loop process.
An explanatory diagram for explaining the current image.
An explanatory diagram for explaining the method of generating the foreground mask.
An explanatory diagram for explaining the method of generating the foreground image.
An explanatory diagram for explaining the method of setting the instruction area.
An explanatory diagram for explaining the method of setting the instruction area, in which (a) shows the current image and (b) shows the foreground image.
An explanatory diagram for explaining the operation target image.
An explanatory diagram for explaining the operation image.
An explanatory diagram for explaining the mirror image of the foreground image.
An explanatory diagram for explaining resizing of the foreground image.
An explanatory diagram for explaining the method of determining an instruction input.

Hereinafter, an input device according to an embodiment of the present invention will be described with reference to the drawings. The input device according to the embodiment of the present invention is used, for example, to change the pitch in karaoke.
The embodiment described below exemplifies a device and a method for embodying the technical idea of the present invention, and the technical idea of the present invention does not limit the shape, structure, arrangement, and the like of the components to those described below. Various modifications can be made to the technical idea of the present invention within the technical scope defined by the claims.

(Configuration)
As shown in FIG. 1, the input device 1 according to the embodiment of the present invention includes an imaging unit 2, an image display unit 3, and a computer 4 connected to the imaging unit 2 and the image display unit 3.
The imaging unit 2 continuously images the space in front of the image display unit 3 (hereinafter also referred to as the "predetermined space") and outputs the video data of the captured video to the computer 4. As the imaging unit 2, for example, a web camera, a video camera, or a camera built into a personal computer can be used.

The image display unit 3 displays a karaoke movie or the like toward the predetermined space in accordance with a signal from the computer 4. As the image display unit 3, for example, a liquid crystal display can be used.
The computer 4 includes hardware resources such as a processor 5 and a storage device 6.

As shown in FIG. 2, the processor 5 logically includes hardware resources such as a background image storage determination unit 7, a face image detection unit 8, a background image writing unit 9, a current image acquisition unit 10, an execution permission determination unit 11, a foreground mask generation unit 12, a foreground image generation unit 13, an instruction area setting unit 14, a pixel area overlap determination unit 15, an operation target image acquisition unit 16, an operation image display unit 17, and an instruction input determination unit 18. In FIG. 2, the background image storage determination unit 7 and the other units are a formal representation of hardware resources in terms of their logical functions. That is, the internal structure of the processor 5 shown in FIG. 2 does not necessarily mean dedicated integrated circuits or functional blocks that each exist independently as physical regions on a semiconductor chip; functions equivalent to the background image storage determination unit 7 and the other units can also be realized by controlling the circuits of a general-purpose computer system in software.

As shown in FIG. 3, the storage device 6 logically includes hardware resources such as a background image storage unit 19 and a foreground information storage unit 20. In FIG. 3, the background image storage unit 19, the foreground information storage unit 20, and the like are a formal representation of hardware resources in terms of their logical functions. The representation of the storage device 6 shown in FIG. 3 therefore does not necessarily mean storage devices that each exist independently as physical regions on a substrate. The background image storage unit 19, the foreground information storage unit 20, and the like can take the form of, for example, files stored in mutually different areas of a single main storage device, and a data structure can be constructed accordingly.

(Background image storage process)
Next, the background image storage process executed by the processor 5 will be described. The background image storage process is executed when the user performs a predetermined start operation on the computer 4, and is thereafter executed each time a predetermined time (for example, 1 second) elapses. When an instruction area, described later, has been set, execution of the process is suspended, and updating of the background image is thereby suspended.
As shown in FIG. 4, in step S101, the background image storage determination unit 7 acquires one frame of the video indicated by the video data output from the imaging unit 2 (hereinafter also referred to as the "candidate image"), that is, an image generated from video captured at the same angle of view as the current image described later.

The process then proceeds to step S102, in which the background image storage determination unit 7 and the face image detection unit 8 determine, based on the candidate image acquired in step S101, whether to update and store the background image. For example, the following method can be used for this determination. First, the face image detection unit 8 attempts to detect a face image in the candidate image. When the face image detection unit 8 has not detected a face image of a predetermined size or larger, the background image storage determination unit 7 determines that the background image is to be updated and stored (Yes), and the process proceeds to step S103. When the face image detection unit 8 has detected a face image of the predetermined size or larger, the background image storage determination unit 7 determines that the candidate image is not to be stored as the background image (No), and this process ends.

In the input device 1 according to the embodiment of the present invention, an example has been shown in which the background image storage determination unit 7 determines that the background image is to be updated and stored when the face image detection unit 8 has not detected a face image of the predetermined size or larger, but other configurations that do not include the face image detection unit 8 can also be adopted. For example, the background image may be determined to be updated and stored when it is determined that the user has performed an operation for causing the computer 4 to store the background image. As another example, as shown in FIG. 5, a human detection unit 21, such as a human presence sensor that detects a specific heat source within a predetermined distance (for example, 50 cm) of the imaging unit 2, may be provided, and the background image may be determined to be acquired when the human detection unit 21 has not detected the specific heat source.

In step S103, as shown in FIG. 6, the background image writing unit 9 causes the background image storage unit 19 to store, as the background image, the candidate image acquired in step S101, that is, the image generated from video captured at the same angle of view as the current image, and this process then ends. As a result, the background image storage unit 19 stores, as the background image, one frame of video in which the user's face does not appear at the predetermined size or larger, and updates and stores the background image each time the predetermined time elapses.

In the input device 1 according to the embodiment of the present invention, an example has been shown in which the candidate image is stored as the background image in place of the background image already stored in the background image storage unit 19, that is, an example in which the background image is immediately replaced by the candidate image, but other configurations may be adopted. For example, the update may be deferred until a predetermined time (for example, 1 second) has elapsed, that is, the update may use the background image stored the predetermined time earlier. Alternatively, the following method can be used: for each pixel of the candidate image, the difference between the pixel value of that pixel and the pixel value of the corresponding pixel of the stored background image is calculated; the absolute value of the quotient obtained by dividing the difference by a predetermined integer N is rounded up to an integer, and the sign is then restored; and the resulting integer is added to the pixel value of the pixel of the background image. As the pixel value, for example, the Y value of YCbCr (hereinafter also referred to as the "luminance value") or RGB values can be used.

Thus, for example, when the pixel value is represented by 8 bits (that is, an integer from 0 to 255) and the integer N is 255, the pixel value of each pixel of the background image approaches the pixel value of the corresponding pixel of the candidate image by 1 per update. Consequently, if the background image contains an image of a black wall and direct sunlight then strikes the wall so that it appears as a wall shining white, the image of the black wall in the background image will have changed to an image of a wall shining white after 255 frames. If the direct sunlight later stops striking the wall and the white-shining wall returns to a black wall, the image of the white-shining wall in the background image will have returned to an image of a black wall after a further 255 frames.
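The gradual update described above can be sketched for a single pixel as follows. This is a minimal illustration under stated assumptions: the function name `step_toward` is hypothetical, and the rounding is implemented as a ceiling of the absolute quotient, which is the reading consistent with the N = 255 example in the text (a step of 1 per update).

```python
def step_toward(background_value, candidate_value, n):
    """Move one background pixel value toward the candidate pixel value.

    The step is ceil(|difference| / n) with the sign of the difference
    restored, so any nonzero difference moves the value by at least 1.
    """
    diff = candidate_value - background_value
    magnitude = (abs(diff) + n - 1) // n  # integer ceiling of |diff| / n
    return background_value + (magnitude if diff > 0 else -magnitude)


# Black wall (0) lit by sunlight (255), with N = 255: one level per update.
value = 0
for _ in range(255):
    value = step_toward(value, 255, 255)
after_sunlight = value

# Sunlight goes away again: the value walks back down, one level per update.
for _ in range(255):
    value = step_toward(value, 0, 255)
after_shade = value
```

A smaller N makes the background adapt faster: with N = 1 the step equals the full difference, which reduces to the immediate replacement described earlier.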

In the input device 1 according to the embodiment of the present invention, an example has also been shown in which one frame (the candidate image) of the video indicated by the video data output from the imaging unit 2 is used as the background image stored in the background image storage unit 19, but other configurations can be adopted. For example, an image generated from video captured by a camera or the like other than the imaging unit 2, at the same angle of view as the current image described later, can be used. In this case, the background image stored in the background image storage unit 19 is not updated.

(Execution permission determination process)
Next, the execution permission determination process executed by the processor 5 will be described. The execution permission determination process is executed when the user performs a predetermined start operation on the computer 4. As the start operation, for example, the same operation as the start operation of the background image storage process can be used.
As shown in FIG. 7, in step S201, the current image acquisition unit 10 acquires one frame of the video indicated by the video data output from the imaging unit 2 (hereinafter also referred to as the "current image").

The process then proceeds to step S202, in which the execution permission determination unit 11 determines, based on the current image acquired in step S201, whether to permit execution of the instruction detection loop process (described later). As the determination method, for example, execution of the instruction detection loop process can be permitted when a face image of at least a predetermined size (defined, for example, by the number of horizontal × vertical pixels) is detected in the current image. The predetermined size of the face image may, for example, be configurable for each user. When it is determined that execution of the instruction detection loop process is permitted (Yes), the process proceeds to step S203. When it is determined that execution is not permitted (No), the process returns to step S201.
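The face-size test used in step S202, and its negation used in step S102 of the background image storage process, can be sketched as follows. This is an illustration only: the face detector itself is outside the sketch and is represented by a list of bounding boxes `(x, y, width, height)`, and the function names as well as the rule that both width and height must reach the threshold are assumptions.

```python
def has_large_face(face_boxes, min_width, min_height):
    """True if any detected face box is at least min_width x min_height.
    face_boxes is a list of (x, y, width, height) tuples from some detector."""
    return any(w >= min_width and h >= min_height
               for (_x, _y, w, h) in face_boxes)


def permit_instruction_detection(face_boxes, min_width, min_height):
    """Step S202: permit the instruction detection loop when the user is close
    enough for a sufficiently large face to be detected."""
    return has_large_face(face_boxes, min_width, min_height)


def should_store_background(face_boxes, min_width, min_height):
    """Step S102: store the candidate image as background only when no
    sufficiently large face is detected."""
    return not has_large_face(face_boxes, min_width, min_height)


# A small face far from the camera and a large face close to it.
faces = [(0, 0, 40, 40), (10, 10, 120, 130)]
permitted = permit_instruction_detection(faces, 100, 100)
store = should_store_background(faces, 100, 100)
```

With the same threshold in both places, a frame is either usable as a background candidate or usable for instruction detection, never both.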

In the input device 1 according to the embodiment of the present invention, an example has been shown in which execution of the instruction detection loop process is permitted when a face image of the predetermined size or larger is detected in the current image, but other configurations can be adopted. For example, as shown in FIG. 5, a human detection unit 21, such as a human presence sensor that detects a specific heat source within a predetermined distance (50 cm) of the imaging unit 2, may be provided, and execution of the instruction detection loop process may be permitted when the human detection unit 21 detects the specific heat source.

As another example, as shown in FIG. 8, a personal authentication unit 22 that authenticates an individual based on the current image may be provided, and execution of the instruction detection loop process may be permitted only when the personal authentication unit 22 has authenticated a specific individual. As the authentication method, for example, face image authentication can be used. This allows only a specific user to input instructions to the computer 4.
In step S203, the instruction detection loop process is executed, and the process then returns to step S201.

(Instruction detection loop process)
Next, the instruction detection loop process executed by the processor 5 will be described.
As shown in FIG. 9, in step S301, the current image acquisition unit 10 acquires one frame of the video indicated by the video data output from the imaging unit 2, that is, the current image, as shown in FIG. 10.

The process then proceeds to step S302, in which the foreground mask generation unit 12 and the foreground image generation unit 13 generate, based on the current image acquired in step S301, foreground information including a foreground mask and a foreground image. For example, the following method can be used to generate the foreground information. First, the foreground mask generation unit 12 generates a foreground mask indicating whether the pixel value of each pixel of the current image differs from the pixel value of the pixel at the same coordinates in the background image stored in the background image storage unit 19. For example, the foreground mask is generated by assigning "255" to each pixel whose pixel value differs between the current image and the stored background image at the same coordinates, and "0" to every other pixel. Specifically, as shown in FIG. 11, the difference between the luminance value of each pixel of the background image stored in the background image storage unit 19 and the luminance value of the corresponding pixel of the current image is calculated for each pixel, and the foreground mask is obtained by assigning "255" to each pixel for which the absolute value of the difference is equal to or greater than a predetermined value, and "0" to every other pixel.
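The thresholded luminance difference described above can be sketched as follows. The function name `make_foreground_mask` and the representation of images as lists of rows of luminance values are assumptions of this sketch; the threshold value in the example is arbitrary.

```python
def make_foreground_mask(current_luma, background_luma, threshold):
    """Return a mask with 255 where |current - background| >= threshold
    and 0 elsewhere. Both inputs are same-sized lists of rows of luminance."""
    return [[255 if abs(c - b) >= threshold else 0
             for c, b in zip(current_row, background_row)]
            for current_row, background_row in zip(current_luma, background_luma)]


background = [[10, 10, 10],
              [10, 10, 10]]
current = [[12, 200, 10],
           [10, 40, 9]]
mask = make_foreground_mask(current, background, 30)
```

Small differences such as sensor noise (12 versus 10) stay below the threshold and are treated as background, while the pixels changed by a person or object entering the scene are marked 255.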

When the background image storage unit 19 stores a plurality of background images, the background image used to generate the foreground mask is one that was stored at least a predetermined time (for example, 1 second) before step S301 was executed, that is, before the current image was acquired. As described above, execution of the instruction detection loop process is permitted when, for example, the user has come sufficiently close to the imaging unit 2 and a face image of the predetermined size or larger has been detected, or when a specific heat source has been detected within the predetermined distance of the imaging unit 2. A background image acquired immediately before the current image therefore differs little from the current image, and an appropriate foreground mask might not be generated from it. In contrast, between a background image stored at least the predetermined time before the current image was acquired and the current image, the user's position is highly likely to have changed greatly. Using a background image stored at least the predetermined time before the current image was acquired therefore allows an appropriate foreground mask to be generated.
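The selection of a sufficiently old background image can be sketched as follows. The text only requires that the image was stored at least the predetermined time before the current image; choosing the most recent image that satisfies this, the function name `pick_background`, and the `(stored_time, image)` history format are assumptions of this sketch.

```python
def pick_background(history, current_time, min_age=1.0):
    """Return the most recently stored background image that is at least
    min_age seconds older than current_time, or None if there is none.
    history is a list of (stored_time_in_seconds, image) pairs."""
    eligible = [(stored, image) for stored, image in history
                if current_time - stored >= min_age]
    if not eligible:
        return None
    return max(eligible, key=lambda entry: entry[0])[1]


history = [(7.0, "bg_t7"), (8.0, "bg_t8"), (9.5, "bg_t9_5")]
chosen = pick_background(history, current_time=10.0)
```

Here the image stored at 9.5 s is skipped because it is only 0.5 s old and may already contain the approaching user, so the image stored at 8.0 s is used instead.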

Next, the foreground image generation unit 13 generates a foreground image by masking the current image with the foreground mask generated by the foreground mask generation unit 12. For example, as shown in FIG. 12, the foreground image consists of those pixels of the current image that correspond to pixels whose value in the foreground mask is “255”. At this time, dilation and erosion may be applied repeatedly to the foreground mask to remove salt-and-pepper noise from the resulting foreground image. So that the foreground image can easily be superimposed on the background image, a frame with the same number of horizontal × vertical pixels as each frame of the video captured by the imaging unit 2 (hereinafter also referred to as the “imaging frame of the imaging unit 2”) may be used, with the pixel values at coordinates outside the foreground image set to “0”. The foreground image generation unit 13 then generates foreground information including the generated foreground mask and foreground image, and stores it in the foreground information storage unit 20. As the foreground image included in the foreground information, an edge image representing only the contour lines of the foreground image, that is, the edge portions where the luminance value or the like changes abruptly, may be stored instead. Alternatively, the outline of the foreground image (identical to the boundary between “255” and “0” in the foreground mask) may be stored.
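The masking step and the noise removal can be sketched as below. The text does not fix the order or number of the morphological operations. A single erosion followed by a dilation with a 3 × 3 neighbourhood (a morphological opening) is shown here as one assumed choice, and all function names are hypothetical.

```python
def _neighbourhood(mask, y, x):
    """Values of the in-bounds 3 x 3 neighbourhood around (y, x)."""
    h, w = len(mask), len(mask[0])
    return [mask[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def erode(mask):
    """A pixel stays 255 only if every in-bounds neighbour is 255."""
    return [[255 if min(_neighbourhood(mask, y, x)) == 255 else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def dilate(mask):
    """A pixel becomes 255 if any in-bounds neighbour is 255."""
    return [[255 if max(_neighbourhood(mask, y, x)) == 255 else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def apply_mask(current, mask):
    """Keep current-image pixels where the mask is 255, set the rest to 0."""
    return [[c if m == 255 else 0 for c, m in zip(cur_row, mask_row)]
            for cur_row, mask_row in zip(current, mask)]


# A 3 x 3 foreground block plus one isolated noise pixel at the bottom right.
noisy = [[255, 255, 255, 0, 0],
         [255, 255, 255, 0, 0],
         [255, 255, 255, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 255]]
cleaned = dilate(erode(noisy))
foreground = apply_mask([[7] * 5 for _ in range(5)], cleaned)
```

The isolated pixel disappears during erosion, while the block is restored by the subsequent dilation.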

Next, in step S303, the instruction area setting unit 14 determines whether an area for instructing a pitch change (hereinafter also referred to as the “instruction area”) has already been set. If it determines that the instruction area has been set (Yes), the process proceeds to step S310. If it determines that the instruction area has not been set (No), the process proceeds to step S304.

In step S304, the instruction area setting unit 14 selects one or more pixels from those pixels of the imaging frame of the imaging unit 2 that are not included in the pixel area corresponding to the foreground image contained in the foreground information generated in step S302, sets an instruction area at their coordinates, and then proceeds to step S306. For example, the instruction area is set at the coordinates of pixels of the current image that are not included in the pixel area corresponding to the foreground image. A predetermined operation (for example, a pitch raising operation or a pitch lowering operation) is associated with the instruction area. As one setting method, the instruction area may be set at those of the four corner coordinates of the frame captured by the imaging unit 2 that are not included in the pixel area corresponding to the foreground image. The instruction area may consist of, for example, one or more pixels, the pixels on the boundary of a circular or polygonal area, or the pixels inside such an area. When an instruction area icon image (described later) has been drawn in advance on the operation target image (described later), a shape identical or similar to that of the instruction area icon image may also be used.

When the foreground information storage unit 20 stores foreground information for a plurality of frames, the foreground information (foreground images) used to set the instruction area is not limited to one frame; several may be used. That is, the instruction area may be set at the coordinates of pixels that are not included in the pixel area corresponding to the foreground image of any of the frames.
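The four-corner method combined with several stored frames can be sketched as follows: a corner qualifies only if it is background in every stored foreground mask. The names `union_masks` and `free_corners` are hypothetical, and masks are assumed to be 2-D lists holding 255 for foreground and 0 otherwise.

```python
def union_masks(masks):
    """Mark a pixel 255 if it is foreground in at least one of the masks."""
    height, width = len(masks[0]), len(masks[0][0])
    return [[255 if any(m[y][x] == 255 for m in masks) else 0
             for x in range(width)] for y in range(height)]


def free_corners(masks):
    """Return the (x, y) frame corners that are foreground in none of the masks."""
    union = union_masks(masks)
    height, width = len(union), len(union[0])
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    return [(x, y) for x, y in corners if union[y][x] == 0]


# Two stored 2 x 3 masks; each one blocks a different corner.
mask_a = [[255, 0, 0], [0, 0, 0]]
mask_b = [[0, 0, 0], [0, 0, 255]]
candidates = free_corners([mask_a, mask_b])
```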

As a specific method of setting the instruction area, for example, the face image area in the foreground image may be detected, and the coordinates of a pixel near the detected area that is not included in the pixel area corresponding to the foreground image may be used. For example, when a candidate pixel in a predetermined positional relationship with the face image area is determined not to be included in the pixel area corresponding to the foreground image, the instruction area is set at the coordinates of that candidate pixel. As shown in FIG. 13, the candidate pixel is preferably the pixel located a predetermined distance (for example, twice the width of the face image) in a predetermined direction (for example, to the right or to the left) from a predetermined base point (for example, the center) of the face image area. Whether the right or the left is used may take the user's dominant hand into account. In the example of FIG. 13, the foreground image is represented by its contour lines, that is, by edges extracted as pixels whose luminance value or the like changes abruptly relative to the surrounding pixels. A suitable algorithm for extracting the contour lines is, for example, the Canny method proposed by John F. Canny in 1986.
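The candidate-pixel rule can be sketched as below, assuming the face detector returns a bounding box as (left, top, width, height). The function name, the box layout, and the decision to return `None` when the candidate falls outside the frame are assumptions. The factor of 2 and the use of the face center follow the examples given in the text.

```python
def choose_instruction_pixel(face_box, mask, direction="right", factor=2.0):
    """Return the (x, y) candidate pixel, or None if it cannot be used.

    `face_box` is (left, top, width, height) in pixels and `mask` holds 255
    for foreground pixels. The candidate lies `factor` face widths to the
    left or right of the face center. It is rejected when it is foreground
    or, as an added assumption, when it lies outside the frame.
    """
    left, top, width, height = face_box
    center_x = left + width // 2
    center_y = top + height // 2
    offset = int(round(factor * width))
    x = center_x + offset if direction == "right" else center_x - offset
    y = center_y
    inside = 0 <= y < len(mask) and 0 <= x < len(mask[0])
    if not inside or mask[y][x] == 255:
        return None
    return (x, y)


empty_mask = [[0] * 20 for _ in range(10)]
right_pixel = choose_instruction_pixel((8, 2, 4, 4), empty_mask, "right")
left_pixel = choose_instruction_pixel((8, 2, 4, 4), empty_mask, "left")
```

Passing `direction` explicitly leaves room for choosing the side according to the user's dominant hand.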

As shown in FIGS. 14(a) and 14(b), the instruction area need not be set when the candidate pixel is included in the pixel area corresponding to the foreground image. In the example of FIGS. 14(a) and 14(b), a chair carrying a dog-shaped robot, which was absent from the background image, appears in the current image; the candidate pixel lies in the area showing the chair and is therefore included in the pixel area corresponding to the foreground image. In this case only the operation target image and the foreground image are displayed on the image display unit 3, and the instruction area icon image described later is not displayed, which lets the user notice why the icon is missing. The user can then, for example, lean the upper body slightly to the left or right to move the position of the face in the image away from the chair, so that the instruction area icon image is displayed on the image display unit 3. This prevents erroneous detection of instruction inputs the user did not intend, and hence malfunctions. In the example of FIG. 14(b), the foreground image is represented by contour lines, as in FIG. 13.

Incidentally, with a method that sets the instruction area at the coordinates of the candidate pixel even when the candidate pixel is included in the pixel area corresponding to the foreground image, a moving object such as a pet that appears in the current image can cause the pixel area corresponding to its image within the foreground image to overlap the instruction area, so that a malfunction occurs as soon as the instruction area is set.
In the input device 1 according to the embodiment of the present invention, whether the instruction area has been set is determined, and the instruction area is set when it is determined not to have been set; other configurations may also be adopted. For example, the instruction area may be set and updated successively so as to follow changes in the position of the face in the image.

In step S305, the pixel area overlap determination unit 15 determines whether, among the pixels of the imaging frame of the imaging unit 2, the instruction area set in step S304 overlaps the pixel area corresponding to the foreground image contained in the foreground information generated in step S302, that is, the foreground image generated after the instruction area was set (hereinafter also referred to as the “second foreground image”). As described above, the foreground image has the same shape as the area of pixels whose value in the foreground mask is “255”. The overlap may therefore also be determined without the second foreground image, for example by determining whether the area of pixels whose value in the foreground mask is “255” overlaps the pixels of the instruction area.
Another possible method is to determine that the instruction area overlaps the pixel area corresponding to the second foreground image when the absolute value of the difference between the pixel value of a pixel in the instruction area of the present current image and the pixel value of the pixel at the same coordinates in the current image acquired when the instruction area was set (hereinafter also referred to as the “pixel value difference”) is equal to or greater than a predetermined value. When the instruction area contains a plurality of pixels, an overlap may be determined when the number of pixels whose pixel value difference has an absolute value equal to or greater than the predetermined value is at least a predetermined proportion of all the pixels in the instruction area. If an overlap is determined (Yes), the process proceeds to step S310. If no overlap is determined (No), the process proceeds to step S306.
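The proportion-based overlap test for an instruction area with several pixels can be sketched as follows. The function name `region_touched`, the difference threshold of 30, and the proportion of 0.5 are assumed values, since the text calls both simply predetermined.

```python
def region_touched(current, reference, region, diff_threshold=30, ratio=0.5):
    """Decide whether the instruction area overlaps the foreground.

    `current` is the present frame and `reference` the frame captured when
    the instruction area was set, both 2-D lists of luminance values.
    `region` lists the (x, y) pixels of the instruction area. The area counts
    as touched when the share of pixels whose absolute difference is at least
    `diff_threshold` reaches `ratio`. Both defaults are illustrative.
    """
    if not region:
        return False
    changed = sum(1 for x, y in region
                  if abs(current[y][x] - reference[y][x]) >= diff_threshold)
    return changed / len(region) >= ratio


reference = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
current = [[100, 100, 10], [10, 10, 10], [10, 10, 10]]
region = [(0, 0), (1, 0), (2, 0), (0, 1)]
touched = region_touched(current, reference, region)
```

In this example two of the four area pixels changed, which just reaches the assumed proportion of 0.5.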

In step S306, as shown in FIG. 15, the operation target image acquisition unit 16 acquires one frame of the karaoke movie of the karaoke system to be operated (hereinafter also referred to as the “operation target image”). In the example of FIG. 15, the operation target image is one frame of a karaoke movie of “Kimigayo” showing the title, a graphic of the Japanese archipelago, and the lyrics. In this example, the number of horizontal × vertical pixels of the operation target image is larger than that of the imaging frame of the imaging unit 2, that is, of the foreground mask, while the display area of the graphic of the Japanese archipelago within the operation target image has the same number of horizontal × vertical pixels as the imaging frame of the imaging unit 2 (and the foreground mask). If step S306 has already been executed at least once since the instruction detection loop process started, that is, if an operation target image has already been acquired at least once, the frame following the most recently acquired operation target image is acquired as the operation target image, so that the frames are acquired in sequence.

In the input device 1 according to the embodiment of the present invention, one frame of a karaoke movie, that is, one frame of a moving image, is used as the operation target image and the frames are acquired in sequence; other configurations may also be adopted. For example, a still image may be used. In that case, the operation target image overwritten with the instruction area icon image, that is, the still image, is generated only once and then reused, so the operation target image (still image) does not have to be acquired repeatedly. The still image may be, for example, one page of an operation manual for factory equipment, or the background image itself.

In step S307, the operation image display unit 17 generates an image for performing the pitch change operation (hereinafter also referred to as the “operation image”). The operation image can be generated, for example, as shown in FIG. 16, by overwriting the operation target image acquired in step S306 with the foreground image contained in the foreground information generated in step S302 and with an instruction area icon image indicating the instruction area set in step S304. The instruction area icon image is written over those pixels of the operation target image whose coordinates correspond to the instruction area. For example, an image corresponding to the operation associated with the instruction area is used. In the example of FIG. 16, two instruction areas are set, and the instruction area icon images “▲” and “▼” are written over them: “▲” is the icon image for raising the pitch and “▼” is the icon image for lowering it. Also in the example of FIG. 16, the foreground image is written over the display area of the graphic of the Japanese archipelago in the operation target image. Because the size of that display area matches the size of the imaging frame of the imaging unit 2, the foreground image and the instruction area icon images need not be enlarged or reduced; it suffices to shift the coordinates of their pixels according to the upper-left coordinates of the display area and overwrite them.
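The coordinate shift used when composing the operation image can be sketched as below. The function name `overlay`, the representation of icons as single (x, y, value) pixels, and the drawing order (foreground first, icons second) are assumptions made for illustration.

```python
def overlay(target, foreground, mask, origin, icons=()):
    """Compose an operation image without modifying `target`.

    `origin` is the (x, y) of the upper-left corner of the display area inside
    the operation target image. Foreground pixels (mask == 255) and icon pixels,
    given in camera-frame coordinates, are shifted by `origin` and written over
    the copy. Drawing icons after the foreground is an assumed order.
    """
    origin_x, origin_y = origin
    out = [row[:] for row in target]
    for y, mask_row in enumerate(mask):
        for x, m in enumerate(mask_row):
            if m == 255:
                out[origin_y + y][origin_x + x] = foreground[y][x]
    for x, y, value in icons:
        out[origin_y + y][origin_x + x] = value
    return out


target = [[1] * 5 for _ in range(4)]       # operation target image
foreground = [[9, 9], [9, 9]]              # camera-sized foreground image
mask = [[255, 0], [0, 0]]                  # only the top-left pixel is foreground
result = overlay(target, foreground, mask, origin=(2, 1), icons=[(1, 1, 7)])
```

No scaling is needed here because the display area is assumed to have the same size as the camera frame, as in the FIG. 16 example.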

The foreground image may be overwritten, for example, as it is. Alternatively, a foreground image processed so that the operation target image shows through it may be overwritten. An image representing the pixel area corresponding to the foreground image in a single color or in the complementary color of the operation target image may also be overwritten, as may an image representing the outline of the foreground image (the boundary between “255” and “0” in the foreground mask) in a single color or in the complementary color of the operation target image. A further option is to overwrite an image representing the contour lines of the foreground image, that is, the edge portions extracted as pixels whose luminance value or the like changes abruptly relative to the surrounding pixels, in a single color or in the complementary color of the operation target image. Two or more of these methods may also be combined.

The contour lines may be, for example, those extracted from the luminance value Y of YCbCr. Alternatively, contour lines may be extracted for each of the three RGB primary colors and the three results merged into one. As a further alternative, an image may be generated by subtracting the contour lines of the entire background image from the contour lines of the entire current image (taking the absolute value where the result is negative), binarized (for example, to “255” and “0”) according to whether each pixel value is equal to or greater than a predetermined value, and then masked with the foreground mask. With this method, even when changes in lighting or sunlight between the background image and the current image prevent a good foreground mask from being generated, taking the difference between the contour lines already leaves little more than the contour lines of the moving object. Of course, if the lighting and sunlight do not change between the background image and the current image, masking with the foreground mask in the subsequent stage yields the same contour lines as those extracted after generating the foreground image as described above.
In the example of FIG. 16, contour lines are overwritten as the foreground image.

In the input device 1 according to the embodiment of the present invention, the operation target image is overwritten with the foreground image as it is or with its contour lines or the like; other configurations may also be adopted. For example, as shown in FIG. 17, a mirror image of the foreground image may be overwritten as the foreground image. Because the user then sees a familiar mirror image of himself or herself, the operability of the input device 1 improves. Furthermore, using a mirror image of the background image as the operation target image, for example, makes it possible to realize a digital mirror that can detect the captured user's instruction gestures without error. A digital mirror can switch between a full-body image and a close-up of the face without the user changing position, or can show the user wearing various clothes one after another, and the function of the present invention allows such switching instructions to be given reliably.

Alternatively, for example, an image of a hand may be detected in the foreground image and only the detected hand image overwritten as the foreground image. This allows the operation target image to be displayed more clearly.
As the instruction area icon image, for example, an image whose shape is identical or similar to that of the instruction area can be used. When the instruction area is set with one or only a few coordinates, so that an icon image with the same shape as the instruction area would be hard to see in the operation image, a circle or polygon containing those coordinates may be used instead.

In the input device 1 according to the embodiment of the present invention, the imaging frame of the imaging unit 2 fits within the operation target image as it is; other configurations may also be adopted. For example, the size of the operation target image may differ from the size of the imaging frame of the imaging unit 2. If the aspect ratio of the operation target image equals that of the imaging frame, the foreground image may be enlarged or reduced by a predetermined factor, that is, resized, and written over the operation target image. If the aspect ratios differ, the foreground image may be resized, its drawing position adjusted, and so on, so that it does not extend beyond the operation target image, before being written over the operation target image.
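One way to compute the resize factor and drawing position so that the foreground image stays inside the operation target image is sketched below. Preserving the aspect ratio of the camera frame and centering the result are assumptions. The text requires only that the foreground image not extend beyond the operation target image. The name `fit_rect` is hypothetical.

```python
def fit_rect(src_w, src_h, dst_w, dst_h):
    """Return (scale, offset_x, offset_y) for drawing src inside dst.

    The source keeps its aspect ratio and is scaled by the largest factor
    that still fits. Centering the result is an assumed placement policy.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    out_w = int(src_w * scale)
    out_h = int(src_h * scale)
    return scale, (dst_w - out_w) // 2, (dst_h - out_h) // 2


same_ratio = fit_rect(640, 480, 1280, 960)      # equal aspect ratios
different_ratio = fit_rect(640, 480, 1920, 1080)  # 4:3 camera, 16:9 target
```

When the aspect ratios match, the offsets are zero and only the scale factor matters. Otherwise the scaled image is narrower than the target and is shifted horizontally.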

In the input device 1 according to the embodiment of the present invention, the instruction area icon image is written over the operation target image; other configurations may also be adopted. For example, when an instruction area icon has been drawn on the operation target image in advance, overwriting of the instruction area icon image may be omitted. In this case, as shown in FIG. 18, the foreground image is resized, its drawing position adjusted, and so on, before it is overwritten, so that the instruction area coincides with the previously drawn instruction area icon image. The shape of the instruction area is also set to match the shape of the instruction area icon image.

Next, in step S308, the operation image display unit 17 outputs to the image display unit 3 video data for displaying the operation image generated in step S307. In accordance with this signal, the image display unit 3 displays, as shown in FIG. 16, the operation image in which the operation target image, that is, one frame of the karaoke movie, is overwritten with the foreground image and the instruction area icon images.

Next, in step S309, the operation image display unit 17 determines, based on the current image acquired in step S301, whether to end the instruction detection loop process. For example, it determines that the process is to end when no face image of a predetermined size or larger has been detected in the current image a predetermined number of times in succession. If it determines that the instruction detection loop process is to end (Yes), this processing ends. If it determines that the process is not to end (No), the process returns to step S301.
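The end condition of step S309 amounts to counting consecutive frames without a sufficiently large face. A minimal sketch, with the hypothetical class name `LoopTerminator` and an assumed limit of 30 frames, is:

```python
class LoopTerminator:
    """Track consecutive frames in which no large-enough face was detected."""

    def __init__(self, max_misses=30):
        # max_misses stands for the predetermined count; 30 is an assumed value.
        self.max_misses = max_misses
        self.misses = 0

    def update(self, face_detected):
        """Record one frame and return True when the loop should end."""
        self.misses = 0 if face_detected else self.misses + 1
        return self.misses >= self.max_misses


terminator = LoopTerminator(max_misses=3)
frames = [False, False, True, False, False, False]
decisions = [terminator.update(face) for face in frames]
```

A single detection resets the counter, so brief detection failures do not end the loop.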

In step S310, on the other hand, the instruction input determination unit 18 determines that an instruction input for a predetermined operation has been made, performs the predetermined operation, and then returns to step S301. The pitch raising and pitch lowering operations are used here as examples of the predetermined operation. As shown in FIG. 16, “▼” is written as the instruction area icon image over the instruction area associated with the pitch lowering operation, and “▲” over the instruction area associated with the pitch raising operation. As shown in FIG. 19, when the pixel area corresponding to the foreground image is determined to overlap the instruction area overwritten with “▼”, an instruction input for the pitch lowering operation is deemed to have been made and the pitch of the karaoke performance is lowered. Likewise, when the pixel area corresponding to the foreground image is determined to overlap the instruction area overwritten with “▲”, an instruction input for the pitch raising operation is deemed to have been made and the pitch of the karaoke performance is raised.

In the input device 1 according to the embodiment of the present invention, when the pixel area corresponding to the foreground image is determined to overlap the instruction area, an instruction input for the predetermined operation associated with that instruction area is deemed to have been made and the operation is executed; other configurations may also be adopted. For example, a plurality of instruction areas may be set, and when the pixel area corresponding to the foreground image overlaps two or more of them, a predetermined operation may be selected according to the combination of overlapped instruction areas, an instruction input for the selected operation deemed to have been made, and the selected operation executed. In this case, for example, the times at which the pixel area corresponding to the foreground image came to overlap each of the two or more instruction areas may be measured, and the direction of the user's movement determined from the difference between those times.
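Determining the movement direction from the overlap times of two instruction areas can be sketched as follows. The area names "left" and "right", the function name `swipe_direction`, and the maximum gap of 1 second between the two overlaps are assumptions introduced for illustration.

```python
def swipe_direction(touch_times, order=("left", "right"), max_gap=1.0):
    """Infer the movement direction from the times two areas were first touched.

    `touch_times` maps an instruction area name to the time (in seconds) at
    which the foreground first overlapped it. Returns e.g. "left_to_right",
    or None when an area was not touched, both were touched at the same time,
    or the gap exceeds `max_gap` (an assumed limit).
    """
    first, second = order
    if first not in touch_times or second not in touch_times:
        return None
    delta = touch_times[second] - touch_times[first]
    if delta == 0 or abs(delta) > max_gap:
        return None
    return f"{first}_to_{second}" if delta > 0 else f"{second}_to_{first}"


direction = swipe_direction({"left": 1.0, "right": 1.4})
```

The returned direction could then be mapped to the operation to execute, for example through a dictionary keyed by direction.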

As another example, exploiting the human ability to repeat the same pattern of movement, the timings at which the instruction area and the pixel area corresponding to the foreground image overlap may be detected, a pattern of overlap timings derived from them, and an instruction input for the predetermined operation determined to have been made when the same pattern is detected two or more times. Then, even if the instruction area and the pixel area corresponding to the foreground image are erroneously determined to overlap, for example because of noise in the imaging unit 2, erroneous detection of instruction inputs the user did not intend can be suppressed. In addition to the pattern of overlap timings, the pattern of change in the pixel values of the overlapping pixels (for example, from how bright to how dark they changed) may be included in the determination.
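The text does not define what a timing pattern is. As one loose interpretation, the sketch below treats the interval between successive overlap onsets as the pattern and accepts an input when two consecutive intervals agree within a tolerance. The function name `repeated_pattern`, this definition of a pattern, and the tolerance of 0.2 seconds are all assumptions.

```python
def repeated_pattern(onsets, tolerance=0.2):
    """Return True when the same overlap interval occurs twice in a row.

    `onsets` is a sorted list of times (in seconds) at which the instruction
    area began to overlap the foreground. Treating the interval between
    onsets as the "pattern" is an assumed interpretation of the text.
    """
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    return any(abs(second - first) <= tolerance
               for first, second in zip(intervals, intervals[1:]))


deliberate = repeated_pattern([0.0, 0.5, 1.0])   # three evenly spaced touches
spurious = repeated_pattern([0.0])               # one isolated overlap
```

Under this interpretation a single noise-induced overlap never triggers an input, because at least three onsets are needed to form two intervals.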

(Operation and others)
Next, the operation of the input device 1 according to the embodiment of the present invention will be described.
Suppose that the user performs a start operation on the computer 4 and then moves out of the imaging range of the imaging unit 2. The processor 5 of the computer 4 then starts the background image storage process, acquires one frame of the video represented by the video data output from the imaging unit 2, that is, a candidate image (step S101 in FIG. 4), and, when no face image of a predetermined size or larger is detected in the acquired candidate image, determines that the background image is to be updated and stored (“Yes” in step S102 in FIG. 4).

Subsequently, as shown in FIG. 6, the processor 5 stores the candidate image in the background image storage unit 19 as the background image (step S103 in FIG. 4). The background image storage unit 19 thereby stores, as the background image, one frame of video in which the user's face does not appear at the predetermined size or larger. The above flow is repeated each time a predetermined time elapses, so that background images are stored one after another. The background image may be updated immediately, or updated after waiting for a predetermined time (for example, one second) to elapse.

At the same time, the processor 5 starts the execution permission determination process and acquires one frame of the video represented by the video data output from the imaging unit 2, that is, the current image (step S201 in FIG. 7). Assume here that the user has come sufficiently close to the imaging unit 2 in order to give a pitch change instruction. When a face image of a predetermined size or larger is detected in the acquired current image, the processor 5 determines that execution of the instruction detection loop process is permitted (step S202 "Yes" in FIG. 7) and executes the instruction detection loop process (step S203 in FIG. 7).

When the instruction detection loop process is executed, the processor 5 acquires, as shown in FIG. 10, one frame of the video represented by the video data output from the imaging unit 2, that is, the current image (step S301 in FIG. 9). Subsequently, as shown in FIG. 11, a foreground mask is generated by assigning "255" to each pixel of the acquired current image whose pixel value differs from that of the pixel at the same coordinates in the background image stored in the background image storage unit 19, and "0" to every other pixel. Subsequently, as shown in FIG. 12, the current image is masked with the generated foreground mask to generate a foreground image, and foreground information containing the generated foreground mask and foreground image is stored in the foreground information storage unit 20 (step S302 in FIG. 9).
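The mask generation and masking of step S302 can be sketched as follows. This is a minimal Python illustration that assumes single-channel (grayscale) images stored as nested lists; the function names and the optional `tolerance` parameter are not from the source (with the default of 0 the comparison is the plain "differs" test described in the text).

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values (assumed layout)


def make_foreground_mask(current: Image, background: Image,
                         tolerance: int = 0) -> Image:
    """Assign 255 where the current pixel differs from the background, else 0.

    tolerance is an assumption for noisy cameras; 0 means "any difference".
    """
    return [[255 if abs(c - b) > tolerance else 0
             for c, b in zip(cur_row, bg_row)]
            for cur_row, bg_row in zip(current, background)]


def make_foreground_image(current: Image, mask: Image) -> Image:
    """Keep current-image pixels where the mask is 255; blank the rest to 0."""
    return [[c if m == 255 else 0 for c, m in zip(cur_row, mask_row)]
            for cur_row, mask_row in zip(current, mask)]
```

The mask is kept together with the foreground image because later steps, such as choosing where to place the instruction area, need to know which pixels belong to the foreground.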

Subsequently, the processor 5 determines that no instruction area has been set yet (step S303 "No" in FIG. 9), selects one or more pixels from those pixels of the imaging frame of the imaging unit 2 that are not included in the pixel area corresponding to the foreground image contained in the foreground information, and sets the instruction area at their coordinates (step S304 in FIG. 9). For example, as shown in FIG. 13, the instruction area is set at the coordinates of a pixel located to the right of the center of the face image on the foreground image, at a distance of twice the face image width.
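The placement rule of step S304 might look like the following Python sketch. It assumes the face center and face width are already available from a face detector (not shown), that the foreground mask uses 255 for foreground pixels, and that a candidate outside the frame is simply rejected; the function name and return convention are hypothetical.

```python
from typing import List, Optional, Tuple


def set_instruction_area(mask: List[List[int]],
                         face_center: Tuple[int, int],
                         face_width: int,
                         factor: int = 2) -> Optional[Tuple[int, int]]:
    """Return the (x, y) of the instruction area, or None if it cannot be set.

    The candidate pixel lies to the right of the face center at a distance of
    `factor` times the face width (2 in the example of FIG. 13).
    """
    cx, cy = face_center
    x, y = cx + factor * face_width, cy
    height = len(mask)
    width = len(mask[0]) if mask else 0
    if not (0 <= x < width and 0 <= y < height):
        return None  # candidate falls outside the imaging frame (assumed handling)
    if mask[y][x] != 0:
        return None  # candidate belongs to the foreground: do not place it here
    return (x, y)
```

Rejecting candidates that lie on the foreground is what keeps the instruction area from being placed on top of the user or another moving object already in view.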

Subsequently, as shown in FIG. 15, the processor 5 acquires one frame of the karaoke movie of the karaoke system to be operated, that is, the operation target image (step S306 in FIG. 9), and, as shown in FIG. 16, overwrites the acquired operation target image with the foreground image contained in the foreground information and the instruction area icon image indicating the instruction area, thereby generating an operation image for performing the pitch change operation (step S307 in FIG. 9). A signal for displaying the generated operation image is then output to the image display unit 3. In accordance with the signal from the processor 5, the image display unit 3 displays the operation image, in which the foreground image and the instruction area icon image are overwritten on the operation target image, as shown in FIG. 16.
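Generating the operation image in step S307 amounts to compositing two overlays onto the operation target image. The Python sketch below assumes grayscale nested-list images of equal size, a small rectangular icon, and a drawing order of foreground first, icon second; none of these details are fixed by the text, and the function name is hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]


def compose_operation_image(target: Image, foreground: Image, mask: Image,
                            icon: Image, icon_origin: Tuple[int, int]) -> Image:
    """Overwrite the target with foreground pixels, then with the icon."""
    out = [row[:] for row in target]  # copy so the target frame is untouched
    # 1) Overwrite with the foreground wherever the mask marks foreground.
    for y, mask_row in enumerate(mask):
        for x, m in enumerate(mask_row):
            if m == 255:
                out[y][x] = foreground[y][x]
    # 2) Overwrite with the instruction area icon at its top-left origin.
    ox, oy = icon_origin
    for dy, icon_row in enumerate(icon):
        for dx, value in enumerate(icon_row):
            yy, xx = oy + dy, ox + dx
            if 0 <= yy < len(out) and 0 <= xx < len(out[0]):
                out[yy][xx] = value
    return out
```

Drawing the icon last keeps it visible even where the foreground is nearby; the opposite order would be equally consistent with the description.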

Subsequently, the processor 5 detects a face image of a predetermined size or larger in the current image, determines that the instruction detection loop process is not to be ended (step S309 in FIG. 9), and repeats the above flow. Because the instruction area has already been set, it is determined to have been set (step S303 "Yes" in FIG. 9), and the instruction area is not set again.

Assume that, while the above flow is repeated, the user moves a hand or finger so that, as shown in FIG. 19, the pixel area corresponding to the foreground image overlaps the instruction area. The processor 5 then determines that, among the pixels of the imaging frame of the imaging unit 2, the instruction area and the pixel area corresponding to the foreground image generated after the instruction area was set have an overlapping portion (step S305 "Yes" in FIG. 9). Subsequently, the predetermined operation associated with the instruction area that overlaps the pixel area corresponding to the foreground image is performed (step S310 in FIG. 9). In the example of FIG. 19, a pitch lowering operation is associated with the instruction area that overlaps the contour line of the foreground image, that is, the pixel area corresponding to the foreground image (the area on which the instruction area icon image "▼" was overwritten); it is therefore determined that an instruction input for the pitch lowering operation has been made, and the pitch of the karaoke performance is lowered.

As described above, the input device 1 according to the embodiment of the present invention includes: a current image acquisition unit 10 that acquires, as a current image, one frame of the video output by the imaging unit 2, which continuously images a predetermined space; a background image storage unit 19 that stores a background image generated in advance from video captured at the same angle of view as the current image; a foreground mask generation unit 12 that generates a foreground mask indicating whether the pixel value of each pixel of the current image differs from the pixel value of the pixel at the same coordinates in the background image; a foreground image generation unit 13 that generates a foreground image by masking the current image with the foreground mask; an instruction area setting unit 14 that selects one or more pixels from those pixels of the imaging frame of the imaging unit 2 that are not included in the pixel area corresponding to the foreground image and sets an instruction area at their coordinates; an operation image display unit 17 that causes the image display unit 3, which faces the predetermined space, to display an operation image formed by overwriting a predetermined operation target image with an instruction area icon image indicating the instruction area and a second foreground image, which is a foreground image generated after the instruction area was set; a pixel area overlap determination unit 15 that determines whether, among the pixels of the imaging frame, the instruction area and the pixel area corresponding to the second foreground image have an overlapping portion; and an instruction input determination unit 18 that determines that an instruction input for a predetermined operation has been made when the instruction area and the pixel area corresponding to the second foreground image are determined to have an overlapping portion.

In other words, the input device 1 includes an imaging unit 2 that continuously images a predetermined space and an image display unit 3 that displays an operation image. The operation image is an image in which an instruction area is set and in which a predetermined operation target image is overwritten with the current foreground image and an instruction area icon image indicating the instruction area. The current foreground image is an image formed from those pixels of the current image, acquired from one frame of the current video output by the imaging unit 2, whose pixel values differ from those of the pixels at the same coordinates in a background image captured in advance at the same angle of view as the current image. The instruction area is set in an area not included in the pixel area corresponding to the past foreground image generated at the time the instruction area is set, and is used to determine whether at least part of the current foreground image generated after the instruction area was set is present in at least part of the instruction area. Consequently, a portion showing a moving body other than the user, such as a pet, is also included in the foreground image, and no instruction area is set in the pixel area corresponding to the foreground image, so an input device 1 capable of suppressing erroneous detection of instruction inputs can be provided.

Furthermore, unlike the conventional technique, which detects the positions of the user's eyes and hand on the image, sets the instruction area at an intermediate position between the detected eye position and hand position, and determines that the user's hand on the image has touched the instruction area when the pixel values of the pixels in the instruction area change, the present embodiment can omit detection of the user's eye and hand positions, which reduces the amount of computation.

In addition, the instruction area setting unit 14 sets the instruction area at the coordinates of a candidate pixel that has a predetermined positional relationship with the face image area on the foreground image when it determines that the candidate pixel is not included in the pixel area corresponding to the foreground image. The instruction area can therefore be set at more appropriate coordinates, and erroneous detection of instruction inputs can be suppressed more reliably.
Furthermore, the candidate pixel is a pixel located at a predetermined distance in a predetermined direction from a predetermined base point (for example, the center) of the face image area on the foreground image. The instruction area can therefore be set at more appropriate coordinates, and instruction inputs can be made relatively easily.

In addition, a background image storage determination unit 7 that determines whether to update and store the background image is provided, and it makes this determination each time a predetermined time elapses. The background image can therefore be made more appropriate.
Furthermore, the background image storage determination unit 7 determines whether to update and store the background image when no instruction area is set. Whether the instruction area and the pixel area corresponding to the foreground image (second foreground image) have an overlapping portion can therefore be determined more appropriately.

In addition, the device includes the background image storage determination unit 7, which determines whether to update and store the background image, and a background image writing unit 9. When the background image storage determination unit 7 determines that the background image is to be updated and stored, the background image writing unit 9 calculates, for each pixel of the candidate image (one frame of the video), the difference between that pixel's value and the value of the corresponding pixel in the stored background image, divides the difference by a predetermined integer, rounds the absolute value of the quotient up to an integer, restores the sign, and adds the resulting integer to the pixel value of the background image pixel. The background image can therefore be brought gradually closer to the candidate image.
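The per-pixel update performed by the background image writing unit 9 can be written out as a short Python sketch. The divisor value 4 and the function names are assumptions for illustration; the text only says the divisor is a predetermined integer.

```python
from typing import List


def update_background_pixel(bg: int, cand: int, divisor: int = 4) -> int:
    """Move one background pixel a fraction of the way toward the candidate.

    The step is ceil(|cand - bg| / divisor) with the sign of the difference.
    """
    diff = cand - bg
    step = (abs(diff) + divisor - 1) // divisor  # integer ceiling of |diff| / divisor
    return bg + step if diff >= 0 else bg - step


def update_background(background: List[List[int]],
                      candidate: List[List[int]],
                      divisor: int = 4) -> List[List[int]]:
    """Apply the pixel update to every pixel of the stored background."""
    return [[update_background_pixel(b, c, divisor)
             for b, c in zip(bg_row, cand_row)]
            for bg_row, cand_row in zip(background, candidate)]
```

With a divisor of 4, a background pixel of 100 and a candidate pixel of 110 give a difference of 10, a quotient of 2.5, and a step of 3, so the pixel becomes 103. Because the magnitude is rounded up, any nonzero difference produces a step of at least 1, so repeated updates reach the candidate value instead of stalling just short of it.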

Furthermore, a face image detection unit 8 that detects a face image from one frame of the video is provided, and the background image storage determination unit 7 determines that the background image is to be updated and stored when the face image detection unit 8 detects no face image. A more appropriate background image can therefore be stored.
In addition, a human detection unit 21 that detects a specific heat source within a predetermined distance from the imaging unit 2 is provided, and the background image storage determination unit 7 determines that the background image is to be updated and stored when the human detection unit 21 detects no specific heat source. A more appropriate background image can therefore be stored.

Furthermore, the operation image display unit 17 uses a mirror image of the foreground image as the foreground image overwritten on the operation target image. The user can therefore make instruction inputs easily, and the operability of the input device 1 is improved.
In addition, the operation image display unit 17 uses, as the foreground image overwritten on the operation target image, an image showing only the contour lines of the foreground image, that is, the edge portions extracted as pixels whose luminance or similar values change sharply relative to the surrounding pixels. The area of the operation target image that is hidden can therefore be reduced, and the operation target image can be displayed more clearly.

Furthermore, the operation image display unit 17 uses, as the foreground image overwritten on the operation target image, a foreground image processed so that the operation target image shows through it. Because the operation target image is visible through the foreground image, the operation target image can be displayed more clearly.
In addition, the operation image display unit 17 uses, as the foreground image overwritten on the operation target image, an image in which the portion corresponding to the foreground image is rendered in a single color or in the complementary color of the operation target image. The foreground image therefore stands out and can be displayed more clearly.
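Two of the display variations described above, the mirror image and the see-through foreground, are easy to illustrate. The Python sketch below assumes grayscale nested-list images; the blending formula and the `alpha` parameter are one common way to make an overlay translucent and are not taken from the source.

```python
from typing import List

Image = List[List[int]]


def mirror(image: Image) -> Image:
    """Flip the image left to right so the user sees a mirror view."""
    return [row[::-1] for row in image]


def blend_foreground(target: Image, foreground: Image, mask: Image,
                     alpha: float = 0.5) -> Image:
    """Overlay the foreground so the operation target image shows through.

    alpha is the assumed opacity of the foreground (1.0 = fully opaque).
    Pixels outside the foreground mask keep the target value unchanged.
    """
    return [[round(alpha * f + (1 - alpha) * t) if m == 255 else t
             for t, f, m in zip(t_row, f_row, m_row)]
            for t_row, f_row, m_row in zip(target, foreground, mask)]
```

When the mirror image is displayed, the foreground mask would need to be mirrored in the same way so that the two stay aligned.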

Furthermore, the operation image display unit 17 overwrites the operation target image with an image in which the outline of the foreground image (the boundary between "255" and "0" in the foreground mask) is rendered in a single color or in the complementary color of the operation target image. The foreground image therefore stands out and can be displayed more clearly.
In addition, the pixel area overlap determination unit 15 determines that the instruction area and the pixel area corresponding to the foreground image have an overlapping portion when it determines that the absolute value of the difference (pixel value difference) between the pixel value of a pixel in the instruction area of the present current image and the pixel value of the pixel at the same coordinates in the current image acquired when the instruction area was set is equal to or greater than a predetermined value. Whether the instruction area and the pixel area corresponding to the foreground image have an overlapping portion can therefore be determined relatively simply.

Furthermore, the pixel area overlap determination unit 15 determines that the instruction area and the pixel area corresponding to the second foreground image have an overlapping portion when the number of pixels in the instruction area for which the absolute value of the pixel value difference is determined to be equal to or greater than the predetermined value is equal to or greater than a predetermined proportion of the total number of pixels in the instruction area. Whether there is an overlapping portion can therefore be determined more appropriately.
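The overlap test of the pixel area overlap determination unit 15, combining the pixel value difference threshold with the proportion criterion, can be sketched in Python as follows. The threshold values 30 and 0.5, the representation of the instruction area as a list of (x, y) coordinates, and the function name are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def area_overlaps(current: List[List[int]],
                  reference: List[List[int]],
                  area: Sequence[Tuple[int, int]],
                  diff_threshold: int = 30,
                  ratio_threshold: float = 0.5) -> bool:
    """Decide whether the foreground overlaps the instruction area.

    current   : the present current image
    reference : the current image acquired when the instruction area was set
    area      : (x, y) coordinates of the pixels in the instruction area
    """
    if not area:
        return False
    # Count pixels whose value changed by at least the threshold.
    changed = sum(1 for x, y in area
                  if abs(current[y][x] - reference[y][x]) >= diff_threshold)
    # Require a sufficient proportion of the area to have changed.
    return changed / len(area) >= ratio_threshold
```

Requiring a proportion of changed pixels, not just one, makes the test less sensitive to isolated noisy pixels inside the instruction area.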

DESCRIPTION OF SYMBOLS 1 ... input device, 2 ... imaging unit, 3 ... image display unit, 4 ... computer, 5 ... processor, 6 ... storage device, 7 ... background image storage determination unit, 8 ... face image detection unit, 9 ... background image writing unit, 10 ... current image acquisition unit, 11 ... execution permission determination unit, 12 ... foreground mask generation unit, 13 ... foreground image generation unit, 14 ... instruction area setting unit, 15 ... pixel area overlap determination unit, 16 ... operation target image acquisition unit, 17 ... operation image display unit, 18 ... instruction input determination unit, 19 ... background image storage unit, 20 ... foreground information storage unit, 21 ... human detection unit, 22 ... personal authentication unit

Claims

A current image acquisition unit that acquires, as a current image, one frame of video output by an imaging unit that continuously images a predetermined space;
A background image storage unit that stores a background image generated in advance from a video imaged at the same angle of view as the current image;
A foreground mask generating unit that generates a foreground mask indicating whether a pixel value of a pixel of the current image is different from an image value of a pixel having the same coordinates of the background image;
A foreground image generation unit that generates a foreground image by masking the current image with the foreground mask;
An instruction area setting unit that selects one or more pixels from pixels that are not included in the pixel area corresponding to the foreground image among the pixels of the imaging frame of the imaging unit, and sets the instruction area at the coordinates;
An operation image display unit that causes an image display unit directed toward the predetermined space to display an operation image formed by overwriting a predetermined operation target image with an instruction area icon image indicating the instruction area and a second foreground image that is a foreground image generated after the instruction area is set;
A pixel area overlap determination unit that determines whether or not there is an overlapping portion between the instruction area and the pixel area corresponding to the second foreground image among the pixels of the imaging frame of the imaging unit;
An input device including an instruction input determination unit that determines that there is an instruction input for a predetermined operation when it is determined that there is an overlapping portion between the instruction area and a pixel area corresponding to the second foreground image.

The input device according to claim 1, wherein the instruction area setting unit sets the instruction area at the coordinates of a candidate pixel having a predetermined positional relationship with the face image area on the foreground image when it determines that the candidate pixel is not included in the pixel area corresponding to the foreground image.

The input device according to claim 2, wherein the candidate pixel is a pixel at a predetermined distance in a predetermined direction from a predetermined base point of a face image region on the foreground image.

A background image storage determination unit for determining whether to update and store the background image;
The input device according to claim 1, wherein the background image storage determination unit determines whether to update and store the background image every time a predetermined time elapses.

The input device according to claim 4, wherein the background image storage determination unit determines whether to update and store the background image when the instruction area is not set.

A face image detection unit for detecting a face image from one frame of the video;
The input device according to claim 4, wherein the background image storage determination unit determines to update and store the background image when a face image of a predetermined size or larger is not detected by the face image detection unit.

A human detection unit that detects a specific heat source from a range within a predetermined distance from the imaging unit;
The input device according to claim 4, wherein the background image storage determination unit determines to update and store the background image when a specific heat source is not detected by the human detection unit.

The input device according to claim 1, wherein the operation image display unit uses a mirror image of the second foreground image as a foreground image to be overwritten on the operation target image.

The input device according to claim 1, wherein the operation image display unit uses an outline of the second foreground image as a foreground image to be overwritten on the operation target image.

The input device described above, wherein the operation image display unit uses, as the foreground image overwritten on the operation target image, the second foreground image processed so that the operation target image shows through it.

The input device according to any one of the above claims, wherein the operation image display unit uses, as the foreground image overwritten on the operation target image, an image in which a portion corresponding to the second foreground image is rendered in a single color or in a complementary color of the operation target image.

The input device according to claim 1, wherein the operation image display unit uses, as the foreground image overwritten on the operation target image, an image in which an outline of the second foreground image is rendered in a single color or in a complementary color of the operation target image.

The input device according to claim 1, wherein the pixel area overlap determination unit determines that there is an overlapping portion between the instruction area and the pixel area corresponding to the second foreground image when it determines that the absolute value of the difference between the pixel value of a pixel in the instruction area of the present current image and the pixel value of the pixel at the same coordinates in the current image acquired when the instruction area was set is equal to or greater than a predetermined value.

An input device comprising an imaging unit that continuously images a predetermined space and an image display unit that displays an operation image, wherein
the operation image is an image in which an instruction area is set and in which a predetermined operation target image is overwritten with a current foreground image and an instruction area icon image indicating the instruction area,
the current foreground image is an image formed from those pixels of a current image, acquired from one frame of the current video output by the imaging unit, whose pixel values differ from those of the pixels at the same coordinates in a background image captured in advance at the same angle of view as the current image, and
the instruction area is set in an area not included in a pixel area corresponding to a past foreground image generated at the time the instruction area is set, and is used to determine whether at least part of the current foreground image generated after the instruction area is set is present in at least part of the instruction area.

Acquiring one frame of video output by an imaging unit that continuously images a predetermined space as a current image;
Generating a foreground mask indicating whether a pixel value of a pixel of the current image is different from an image value of a pixel having the same coordinates of a background image captured in advance with the same angle of view as the current image;
Masking the current image with the foreground mask to generate a foreground image;
Selecting one or more pixels from pixels not included in the pixel area corresponding to the foreground image among the pixels of the imaging frame of the imaging unit, and setting an instruction area at the coordinates;
Displaying, on an image display unit directed toward the predetermined space, an operation image formed by overwriting a predetermined operation target image with an instruction area icon image indicating the instruction area and a second foreground image that is a foreground image generated after the instruction area is set;
Determining whether or not there is an overlapping portion between the instruction area and the pixel area corresponding to the second foreground image among the pixels of the imaging frame of the imaging unit;
An input method comprising: determining that there is an instruction input for a predetermined operation when it is determined that there is an overlapping portion between the instruction area and a pixel area corresponding to the second foreground image.