JP2000125196A

JP2000125196A - Input image processing method, input image processor, and recording medium recording input image processing program

Info

Publication number: JP2000125196A
Application number: JP10315348A
Authority: JP
Inventors: Junko Saito; 潤子斎藤; Shunichi Takeuchi; 俊一竹内; Hideyoshi Tominaga; 英義富永
Original assignee: Telecommunications Advancement Organization; Matsushita Electric Industrial Co Ltd
Current assignee: National Institute of Information and Communications Technology; Panasonic Holdings Corp
Priority date: 1998-10-20
Filing date: 1998-10-20
Publication date: 2000-04-28

Abstract

PROBLEM TO BE SOLVED: To provide an input image processor that measures the depth of an image accurately with a small amount of computation, so that photographed images can be processed in real time, and that also copes with cases where the background plane of the photographed image is wide. SOLUTION: The input image processor 100 is provided with camera sections 102, 103 that photograph an object moving on a background plane to acquire input images, object extracted image generating sections 108, 109 that generate object extracted images from the input images, a difference image generating section 110 that generates a difference image from the object extracted images, an occlusion image generating section 111 that generates an occlusion image from an object extracted image and the difference image, and a distance measurement section that measures the position of the object with respect to the background plane based on the occlusion image. Thus, the depth of the image is measured accurately with a small amount of computation and the photographed image is processed in real time.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an input image processing method for processing an input image, an input image processing apparatus, and a recording medium on which an input image processing program is recorded. More particularly, the present invention relates to an input image processing method for processing an image input from an image input unit such as a CCD camera, an input image processing apparatus, and a recording medium on which an input image processing program is recorded.

[0002]

2. Description of the Related Art

In recent years, input image processing apparatuses have been researched and developed that take input through a camera such as a CCD camera, rather than through a mouse or keyboard, and process the resulting input image.

[0003] FIG. 11 shows a computer system including a conventional input image processing apparatus that uses a camera such as a CCD camera as its image input unit. The computer system includes an input image processing apparatus 1100 that processes input images and an image processing apparatus 1111 connected to the input image processing apparatus 1100.

[0004] In FIG. 11, the input image processing apparatus 1100 includes: camera unit A 1102 and camera unit B 1103, each a camera such as a CCD camera for capturing images; a synchronization signal generation unit 1101 that sends a capture synchronization signal to camera unit A 1102 and camera unit B 1103; A/D conversion units 1104 and 1105 that convert the analog signals of the images captured by camera unit A 1102 and camera unit B 1103 into digital signals; feature point extraction units 1106 and 1107 that extract feature points from the digital images output by the A/D conversion units 1104 and 1105; a corresponding point search unit 1108 that searches for one-to-one correspondences between the feature points extracted by the feature point extraction units 1106 and 1107; a distance measurement unit 1109 that measures, by triangulation, the three-dimensional coordinates of the feature points matched by the corresponding point search unit 1108; and an input processing unit 1110 that recognizes image motion and the like from the distance measurement results of the distance measurement unit 1109 and inputs to the image processing apparatus 1111 a control signal based on that recognition, for example a control signal equivalent to a mouse click.

[0005] The image processing apparatus 1111 includes a control unit 1112 that receives the control signal from the input processing unit 1110 and performs processing based on it, a ROM (Read Only Memory) 1115 that stores control programs and the like, a RAM (Random Access Memory) 1116 that stores data and the like, a display unit 1114 that displays images, and a display control unit 1113 that controls the display unit 1114.

[0006] In the image processing system made up of the input image processing apparatus and the image processing apparatus 1111 described above, a camera such as a CCD camera is used in place of a mouse or keyboard as the input device. The camera photographs the actions of the operator (user), the input image processing apparatus processes the camera images, and a control signal is input to the image processing apparatus 1111 by recognizing the operator's actions.

[0007] The operation of the input image processing apparatus is described below, taking automatic input of a piano performance as the example. In this case, the images captured by camera unit A 1102 and camera unit B 1103 are processed and the resulting control signals are input to the image processing apparatus 1111. A piano keyboard that actually produces sound is therefore unnecessary; a picture of a keyboard is sufficient.

[0008] FIG. 12 is a conceptual diagram showing the processing of captured images. FIGS. 12(a1) and 12(a2) are camera images of a piano performance captured by the two camera units A 1102 and B 1103, which are installed in the vertical direction relative to the picture of the keyboard. FIG. 12(a1) is the image from camera unit A 1102 (left camera), and FIG. 12(a2) is the image from camera unit B 1103 (right camera). FIGS. 12(b1) and 12(b2) show the epipolar lines and feature points in the camera images of FIGS. 12(a1) and 12(a2), respectively.

[0009] The processing for inputting performance data from the camera images to the image processing apparatus 1111 is described with reference to FIGS. 11 and 12. First, to synchronize the captured frames of the camera images (FIGS. 12(a1) and 12(a2)) input from camera unit A 1102 and camera unit B 1103, the synchronization signal generation unit 1101 sends a synchronization signal to camera unit A 1102 and camera unit B 1103. On receiving the synchronization signal from the synchronization signal generation unit 1101, camera unit A 1102 and camera unit B 1103 photograph the piano performance and obtain analog signals of captured images 1211 and 1212. The A/D conversion units 1104 and 1105 convert the analog signals of the captured images 1211 and 1212 into digital signals (digital images).

[0010] Next, to detect the position of the key 1214 pressed by a fingertip 1213, the three-dimensional depth of the fingertip 1213 from the positions of camera unit A 1102 and camera unit B 1103 is obtained.

[0011] To obtain the three-dimensional depth of each fingertip 1213, the feature point extraction units 1106 and 1107 extract the feature points of the fingertips 1213 from the digital images (FIGS. 12(a1) and 12(a2)). The feature points are extracted by deciding in advance which objects in the captured images 1211 and 1212, such as the fingertips 1213, are to serve as feature points.

[0012] Next, the corresponding point search unit 1108 searches for one-to-one correspondences among the feature points 1201 to 1210 extracted from the captured images 1211 and 1212 of camera unit A 1102 and camera unit B 1103. The search examines the feature points 1201 to 1210 along each epipolar line 1215 and extracts the cases in which only one of the feature points 1201 to 1210 lies on a given epipolar line 1215. One-to-one corresponding points are thereby determined, such as 1202 and 1207, 1203 and 1208, and 1204 and 1209 in FIGS. 12(b1) and 12(b2). When several feature points are found on the same epipolar line 1215, as with feature points 1201, 1205, 1206, and 1210, the correspondence cannot be decided from the constraint of the epipolar line 1215 alone. In that case the corresponding points 1201 and 1206, and 1205 and 1210, are determined from the coordinates of the feature points 1201, 1205, 1206, and 1210 or from new epipolar lines obtained by adding a camera unit.
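
As a non-limiting illustration of the epipolar search described above, the following Python sketch pairs feature points only when exactly one point lies on an epipolar line in each image, and reports the remaining lines as unresolved. It assumes, purely for illustration, that the two images are aligned so that each epipolar line coincides with an image row; the function name and the (row, column) point format are hypothetical and do not come from the source.

```python
from collections import defaultdict


def match_unique_on_epipolar_lines(left_points, right_points):
    """Pair feature points that are alone on their epipolar line.

    Each point is a (row, col) tuple. Assumption: the epipolar line of a
    point is its image row, as for horizontally aligned cameras.
    Returns (matches, unresolved_rows).
    """
    left_by_row = defaultdict(list)
    right_by_row = defaultdict(list)
    for row, col in left_points:
        left_by_row[row].append((row, col))
    for row, col in right_points:
        right_by_row[row].append((row, col))

    matches = []
    unresolved_rows = []
    for row in sorted(set(left_by_row) | set(right_by_row)):
        lefts = left_by_row.get(row, [])
        rights = right_by_row.get(row, [])
        if len(lefts) == 1 and len(rights) == 1:
            # Exactly one candidate per image: the epipolar constraint
            # alone fixes the correspondence.
            matches.append((lefts[0], rights[0]))
        else:
            # Several (or missing) candidates: extra information such as
            # point coordinates or a further camera would be needed.
            unresolved_rows.append(row)
    return matches, unresolved_rows
```

Rows returned in `unresolved_rows` correspond to the ambiguous case in the text, where the epipolar constraint alone does not decide the pairing.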

[0013] The distance measurement unit 1109 measures, by triangulation, the three-dimensional coordinates of the feature points 1201 to 1210 matched in this way.
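
The source does not spell out the triangulation formula. As a hedged sketch, the textbook relation for two parallel cameras, depth = focal length × baseline / disparity, can be written as follows; the parallel-camera model, the parameter names, and the units are assumptions made only for this example.

```python
def depth_from_disparity(x_left, x_right, focal_length_px, baseline):
    """Depth of a matched point pair under a parallel two-camera model.

    x_left and x_right are the column coordinates (pixels) of the same
    feature in the left and right images; focal_length_px is the focal
    length in pixels; baseline is the camera separation. The result is
    in the same unit as baseline. This is the standard stereo relation,
    not a formula quoted from the source.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline / disparity
```

A larger disparity means the point is closer to the cameras, which is why a mismatched pair of feature points yields a wrong depth.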

[0014] Next, based on the depth values of the fingertips 1213 measured by the distance measurement unit 1109, the input processing unit 1110 determines whether a key 1214 has been pressed and outputs to the image processing apparatus 1111 a signal indicating the position of each key judged to be pressed.
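
The press determination above can be pictured with a small sketch. The source gives no decision rule, so the rule used here (a fingertip counts as pressing when its depth is within a tolerance of the keyboard depth, and the key index is the fingertip's horizontal position divided by a uniform key width) is an assumption, as are all names and units.

```python
def pressed_keys(fingertips, keyboard_depth, key_width, tolerance):
    """Return the sorted indices of keys judged to be pressed.

    fingertips is a list of (x, depth) pairs, with x measured along the
    keyboard and depth measured from the cameras. All quantities share
    one length unit. The threshold rule is illustrative only.
    """
    keys = set()
    for x, depth in fingertips:
        if abs(depth - keyboard_depth) <= tolerance:
            # Fingertip has reached the keyboard surface: map x to a key.
            keys.add(int(x // key_width))
    return sorted(keys)
```

For example, with lengths in millimetres, `pressed_keys([(50, 500), (120, 470)], 500, 23, 5)` reports only the key under the first fingertip, because the second fingertip is still 30 mm above the keyboard.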

[0015] The control unit 1112 of the image processing apparatus 1111 controls the image processing programs and data stored in the ROM 1115, the RAM 1116, and the like, based on the signal from the input processing unit 1110. The display control unit 1113 displays an image on the display unit 1114 based on the processing result of the control unit 1112.

[0016]

PROBLEMS TO BE SOLVED BY THE INVENTION

In the conventional input image processing apparatus shown in FIGS. 11 and 12, however, the feature points 1201 to 1210 that can be identified as fingertips 1213 must be extracted by searching the entire epipolar lines 1215 of the images 1211 and 1212. Extracting the feature points 1201 to 1210 therefore requires a large amount of computation and a long processing time, so the image processing lacks real-time performance.

[0017] In addition, when several objects with similar features, such as the fingertips 1213, are present in the same image, corresponding points may be mismatched, and an incorrect three-dimensional measurement results from the incorrect feature point correspondence.

[0018] Furthermore, the range in which three-dimensional measurement is possible is limited to the region where the imaging areas of the cameras used for capture (two in FIGS. 11 and 12) overlap, so the capture range of the cameras used for input has to be narrow.

[0019] Accordingly, an object of the present invention is to solve the above problems by providing an input image processing method, an input image processing apparatus, and a recording medium on which an input image processing program is recorded that can measure accurate depth in a three-dimensional image with a small amount of computation, process captured images in real time, and also handle cases where the background surface of the captured image is wide.

[0020]

MEANS FOR SOLVING THE PROBLEMS

To solve the above problems, an input image processing method according to a first aspect of the present invention processes images of a subject moving over a background surface and is characterized by: photographing the subject moving over the background surface simultaneously with a plurality of imaging units to acquire a plurality of input images; extracting only the subject from each input image to generate a plurality of subject extraction images; generating a difference image from the generated subject extraction images; generating an occlusion image from the difference image and an arbitrary one of the subject extraction images; and measuring the position of the subject relative to the background surface based on the occlusion image. Here, the imaging units have a known relative positional relationship and are installed so that their optical axes are parallel to one another and their imaging planes lie on a plane parallel to the background surface.

[0021] To solve the above problems, an input image processing apparatus according to the first aspect of the present invention processes images of a subject moving over a background surface and is characterized by comprising: a plurality of imaging means that simultaneously photograph the subject moving over the background surface to acquire a plurality of input images; extraction image generating means that extracts only the subject from the input images acquired by the imaging means to generate a plurality of subject extraction images; difference image generating means that generates a difference image from the subject extraction images generated by the extraction image generating means; occlusion image generating means that generates an occlusion image from the difference image and an arbitrary one of the subject extraction images; and distance measuring means that measures the position of the subject relative to the background surface based on the occlusion image generated by the occlusion image generating means. Here, the imaging means preferably have a known relative positional relationship and are installed so that their optical axes are parallel to one another and their imaging planes lie on a plane parallel to the background surface.

[0022] As a result, when motion images of a subject captured with a camera or the like are processed, the depth of the subject can be measured accurately with a small amount of computation. This enables fast and accurate input image processing that can run in real time.

[0023] To solve the above problems, an input image processing method according to a second aspect of the present invention processes images of a subject moving over a background surface and is characterized by: photographing the subject moving over the background surface simultaneously with a plurality of imaging units to acquire a plurality of input images; generating a normalized image from each of the input images; extracting only the subject from each normalized image to generate a plurality of subject extraction images; generating a difference image from the generated subject extraction images; generating an occlusion image from the difference image and an arbitrary one of the subject extraction images; and measuring the position of the subject relative to the background surface based on the occlusion image. Here, the imaging units preferably have a known relative positional relationship and are installed so that their optical axis directions are known and converge.

[0024] To solve the above problems, an input image processing apparatus according to the second aspect of the present invention processes images of a subject moving over a background surface and is characterized by comprising: a plurality of imaging means that simultaneously photograph the subject moving over the background surface to acquire a plurality of input images; normalized image generating means that generates a normalized image from each of the input images acquired by the imaging means; extraction image generating means that extracts only the subject from each normalized image generated by the normalized image generating means to generate a plurality of subject extraction images; difference image generating means that generates a difference image from the subject extraction images generated by the extraction image generating means; occlusion image generating means that generates an occlusion image from the difference image and an arbitrary one of the subject extraction images; and distance measuring means that measures the position of the subject relative to the background surface based on the occlusion image generated by the occlusion image generating means. Here, the imaging means preferably have a known relative positional relationship and are installed so that their optical axis directions are known and converge.

[0025] This makes it possible to process motion images of a subject even when the background surface required for input is wide.

[0026] To solve the above problems, an input image processing method according to a third aspect of the present invention is the method of the first or second aspect in which the step of generating an occlusion image generates two or more occlusion images from the difference image and two or more arbitrary subject extraction images among the plurality of subject extraction images, and the step of measuring the position of the subject measures the position of the subject relative to the background surface for each of the two or more occlusion images and determines the position of the subject based on the average of the positions so measured.

[0027] To solve the above problems, an input image processing apparatus according to the third aspect of the present invention is the apparatus of the first or second aspect in which the occlusion image generating means is configured to generate two or more occlusion images from the difference image and two or more arbitrary subject extraction images among the plurality of subject extraction images, and the distance measuring means is configured to measure the position of the subject relative to the background surface for each of the two or more occlusion images and to determine the position of the subject based on the average of the positions so measured.

[0028] This improves the accuracy of the three-dimensional measurement.
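
The averaging in the third aspect can be sketched as follows. The sketch treats each per-occlusion-image measurement as a single number (for example a distance from the background surface); that representation and the function name are assumptions for illustration.

```python
def averaged_position(measured_positions):
    """Average the positions measured from two or more occlusion images.

    measured_positions holds one measured value per occlusion image.
    At least two values are required, matching the "two or more
    occlusion images" of the third aspect.
    """
    if len(measured_positions) < 2:
        raise ValueError("at least two measured positions are required")
    return sum(measured_positions) / len(measured_positions)
```

Averaging independent measurements of the same subject reduces the effect of noise in any single occlusion image.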

[0029] To solve the above problems, an input image processing method according to a fourth aspect of the present invention is the method of any of the first to third aspects in which, in the step of measuring the position of the subject, the measured position is corrected by a predetermined correction value. Here, the position of the subject is a value Zh' indicating the distance from the background surface to the subject, the predetermined correction value is a value Z0 of the width of the subject in the direction perpendicular to the background surface, and, where Zh is the value of the measured position, the value Zh' indicating the distance to the subject can be obtained as Zh' = Zh - Z0.

[0030] To solve the above problems, an input image processing apparatus according to the fourth aspect of the present invention is the apparatus of any of the first to third aspects in which the distance measuring means corrects the measured position by a predetermined correction value. Here, in the distance measuring means, the position of the subject is a value Zh' indicating the distance from the background surface to the subject, the predetermined correction value is a value Z0 of the width of the subject in the direction perpendicular to the background surface, and, where Zh is the value of the measured position, the value Zh' indicating the distance to the subject can be obtained as Zh' = Zh - Z0.

[0031] This improves the accuracy of the three-dimensional measurement.
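
The correction Zh' = Zh - Z0 of the fourth aspect amounts to a single subtraction. The sketch below uses the symbols from the text; the function name and the example values are hypothetical.

```python
def corrected_distance(zh, z0):
    """Apply the correction Zh' = Zh - Z0.

    zh is the measured position value Zh, and z0 is the predetermined
    correction value Z0 (the width of the subject in the direction
    perpendicular to the background surface). Both use the same unit.
    """
    return zh - z0
```

For instance, with a measured value Zh of 12.0 and a subject width Z0 of 2.0 in the same unit, the corrected distance Zh' is 10.0.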

[0032] To solve the above problems, in the input image processing methods of the first to fourth aspects, the step of acquiring a plurality of input images may photograph the subject moving over the background surface simultaneously with the plurality of imaging units while keeping them synchronized. In the step of generating the plurality of subject extraction images, only the subject may be extracted based on a background image in which the subject does not appear. In the step of generating the difference image, the difference image may be generated by taking the exclusive OR (EXOR: EXclusive-OR) of corresponding pixels of the generated subject extraction images. In the step of generating the occlusion image, the occlusion image may be generated by taking the logical product (AND) of corresponding pixels of the arbitrary subject extraction image and the difference image. Further, after the step of measuring the position of the subject, the motion of the subject may be recognized from the measured position.

[0033] To solve the above problems, in the input image processing apparatuses of the first to fourth aspects, the plurality of imaging means may be configured to photograph the subject moving over the background surface simultaneously while keeping synchronized, thereby acquiring the plurality of input images. The extraction image generating means may extract only the subject based on a background image in which the subject does not appear. The difference image generating means may generate the difference image by taking the exclusive OR (EXOR: EXclusive-OR) of corresponding pixels of the generated subject extraction images. The occlusion image generating means may generate the occlusion image by taking the logical product (AND) of corresponding pixels of the arbitrary subject extraction image and the difference image. Further, the apparatus may include means for recognizing the motion of the subject from the position measured by the distance measuring means.
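
The pixel-wise exclusive OR and logical product described above can be sketched on binary images as follows. The images are represented here as lists of rows of 0/1 integers, and the function names are hypothetical; only the XOR and AND operations themselves come from the text.

```python
def difference_image(extraction_a, extraction_b):
    """Pixel-wise exclusive OR of two binary subject extraction images."""
    return [
        [a ^ b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(extraction_a, extraction_b)
    ]


def occlusion_image(extraction, difference):
    """Pixel-wise AND of one subject extraction image with the difference image."""
    return [
        [e & d for e, d in zip(row_e, row_d)]
        for row_e, row_d in zip(extraction, difference)
    ]


# One image row in which the subject appears shifted between the two views.
view_a = [[0, 1, 1, 1, 0, 0]]
view_b = [[0, 0, 0, 1, 1, 1]]
diff = difference_image(view_a, view_b)   # pixels set in exactly one view
occl_a = occlusion_image(view_a, diff)    # pixels set in view A only
```

In this example the AND keeps the pixels where the subject appears in view A but not in view B, so both operations need only one pass over the pixels and no search for corresponding points.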

[0034] The input image processing methods of the first to fourth aspects of the present invention described above can also be recorded, as an input image processing program, on a computer-readable recording medium.

[0035]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The input image processing method, the input image processing apparatus, and the recording medium on which the input image processing program is recorded according to the present invention are described below with reference to the drawings.

[0036] <Embodiment 1> FIG. 1 is a block diagram of a computer system including the input image processing apparatus of the present invention. The computer system includes an input image processing apparatus 100 that processes input images and an image processing apparatus 114 connected to the input image processing apparatus 100.

[0037] In FIG. 1, the input image processing apparatus 100 includes: two camera units, camera unit A 102 and camera unit B 103, for photographing a subject moving over a textured background surface 120; a synchronization signal generation unit 101 that sends a capture synchronization signal to camera unit A 102 and camera unit B 103; A/D conversion units 104 and 105 that convert the analog signals of the images captured by camera unit A 102 and camera unit B 103 into digital signals (digital images); background image storage units 106 and 107 that store images of the background surface 120 alone; subject extraction image generation units 108 and 109 that extract only the subject, based on the digital images output from the A/D conversion units 104 and 105 and the corresponding background images stored in the background image storage units 106 and 107, and binarize the extraction result (signal) to generate subject extraction images; a difference image generation unit 110 that generates a difference image between the two subject extraction images generated by the subject extraction image generation units 108 and 109; an occlusion detection unit 111 that detects occlusion regions from the difference image generated by the difference image generation unit 110 and the subject extraction image generated by one of the subject extraction image generation units (the subject extraction image generation unit 108 in FIG. 1); a distance measurement unit 112 that measures the three-dimensional position of the subject relative to the background surface 120 from the occlusion regions detected by the occlusion detection unit 111; and an input processing unit 113 that recognizes image motion and the like from the distance measurement results of the distance measurement unit 112 and inputs a control signal based on that recognition to the image processing apparatus 114.

[0038] Here, the camera unit A102 and the camera unit B103 have a known relative positional relationship. Their optical axes are parallel to each other, and their imaging planes lie in a common plane that is parallel to the background surface.

[0039] The image processing device 114 includes a control unit 115 that receives the control signal from the input processing unit 113 and performs processing based on that control signal, a ROM (Read Only Memory) 119 that stores control programs and the like, a RAM (Random Access Memory) 118 that stores data and the like, a display unit 117 that displays images, and a display control unit 116 that controls the display unit 117.

[0040] Image processing in the computer system configured as described above is explained below with reference to FIGS. 1 to 4. The explanation uses the example of performance input using a picture of a piano keyboard.

[0042] FIG. 2 is a flowchart showing the operation of the input image processing device 100 of the present invention. Here, only the background of the background surface 120 (the picture of a piano keyboard) is photographed in advance by the camera units A102 and B103, digitized by the A/D conversion units 104 and 105, and stored as background images in the background image storage units 106 and 107.

[0043] In FIG. 2, the operator first performs a playing motion on the background surface 120, which is a picture of a piano keyboard. At this time, the camera units A102 and B103 capture images of the fingers within their angle of view as analog signals, in synchronization with the synchronization signal generated by the synchronization signal generation unit 101 (steps 201A and 201B).

[0044] These analog signals are converted into digital signals (digital images) by the A/D conversion units 104 and 105. The generation of the difference image used for distance measurement by the distance measurement unit 112 is described below.

[0045] FIG. 3 is a conceptual diagram showing the generation of the difference image. In FIGS. 1 and 3, the subject extraction image generation units 108 and 109 first compute, for each camera, the difference between the image 301 or 303 containing only the background (texture) 309, stored in the background image storage units 106 and 107, and the image 302 or 304 in which the subject (fingers) 310 appears against the background 309 (the digital image converted by the A/D conversion units 104 and 105), thereby generating difference images (steps 202A and 202B).

[0046] Based on these difference images, binarization is performed so that portions where the subject 310 is present are set to "1" and portions containing only the background 309 are set to "0", generating the subject extraction images 305 and 306 (steps 203A and 203B). In the subject extraction images 305 and 306 of FIG. 3, the portions "1" where the subject 310 is present are filled in black.
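The background subtraction and binarization of steps 202 and 203 can be sketched as follows. This is a minimal illustration, not the device's actual implementation: the function name `extract_subject`, the grayscale list-of-lists image representation, and the threshold value of 30 are all assumptions, since the text does not state how the difference is thresholded.

```python
def extract_subject(frame, background, threshold=30):
    """Return a binary mask: 1 where the frame differs from the stored
    background by more than `threshold` (assumed value), otherwise 0."""
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask.append([1 if abs(f - b) > threshold else 0
                     for f, b in zip(frame_row, bg_row)])
    return mask


# Tiny grayscale example: the background is uniform, the subject is darker.
background = [[100, 100, 100, 100],
              [100, 100, 100, 100]]
frame = [[100, 20, 25, 100],
         [100, 100, 22, 100]]

mask = extract_subject(frame, background)
print(mask)
```

The same function would be applied once per camera to obtain the two subject extraction images.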

[0047] Next, the difference image generation unit 110 generates a difference image 307 by taking the exclusive OR (EXOR: EXclusive-OR) of each pair of corresponding pixels of the subject extraction images 305 and 306 (step 204). The positional relationship (pixel correspondence) between the subject extraction images 305 and 306 when taking the difference is set so that the texture (picture of a piano keyboard) 309 of the background surface 120 captured in the background images 301 and 303 overlaps without misalignment. In the difference image 307 of FIG. 3, the portion where the subject 310 overlaps in the subject extraction images 305 and 306 is "0" (white).
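The pixel-wise exclusive OR of step 204 can be sketched as below. The function name `xor_images` and the one-row example masks are illustrative assumptions; the two masks are assumed to be already aligned so that the background texture coincides, as the text requires.

```python
def xor_images(mask_a, mask_b):
    """Pixel-wise exclusive OR of two aligned binary masks."""
    return [[a ^ b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


# The subject appears shifted by one pixel between the two views.
mask_305 = [[0, 1, 1, 1, 0, 0]]
mask_306 = [[0, 0, 1, 1, 1, 0]]

diff_307 = xor_images(mask_305, mask_306)
print(diff_307)
```

Pixels where both views see the subject cancel to 0, so only the non-overlapping fringes on either side remain set.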

[0048] Subsequently, the occlusion detection unit 111 generates an occlusion image 308 by taking the logical AND of each pair of corresponding pixels of one of the subject extraction images (the subject extraction image 305 in FIG. 3) and the difference image 307 (step 205). The positional relationship (pixel correspondence) between the subject extraction image 305 and the difference image 307 when taking the AND is set to be the same as the positional relationship set in the difference image generation unit 110 described above.
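The pixel-wise AND of step 205 can be sketched in the same style. The function name `and_images` and the example masks are illustrative assumptions; the inputs are assumed to use the same pixel alignment as the exclusive-OR step.

```python
def and_images(subject_mask, diff_mask):
    """Pixel-wise AND: keep only the difference pixels that belong to
    the subject as seen in one chosen view."""
    return [[s & d for s, d in zip(row_s, row_d)]
            for row_s, row_d in zip(subject_mask, diff_mask)]


subject_305 = [[0, 1, 1, 1, 0, 0]]   # subject mask from one camera
diff_307 = [[0, 1, 0, 0, 1, 0]]      # XOR of the two subject masks

occlusion_308 = and_images(subject_305, diff_307)
print(occlusion_308)
```

Of the two fringes left by the exclusive OR, the AND keeps only the one on the side of the chosen view, which is the occlusion region for that camera.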

[0049] Subsequently, the distance measurement unit 112 determines, from the occlusion region 311 of the occlusion image 308, the number of pixels across the width of the occlusion region 311 along the boundary line of the subject 310. Based on this pixel count, it determines the coordinate values of the subject 310 in the three-dimensional coordinate system defined by the background surface 120 (step 206).
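One way to obtain the pixel width of the occlusion region is to measure, for each image row, the longest horizontal run of set pixels. This is only a sketch under assumptions: the text does not specify how the width is sampled along the subject boundary, and the helper names `longest_run` and `occlusion_widths` are hypothetical.

```python
def longest_run(row):
    """Length of the longest consecutive run of 1s in a row."""
    best = current = 0
    for pixel in row:
        current = current + 1 if pixel else 0
        best = max(best, current)
    return best


def occlusion_widths(occlusion):
    """Per-row width, in pixels, of the occlusion region."""
    return [longest_run(row) for row in occlusion]


occlusion = [[0, 1, 1, 1, 0, 0],
             [0, 0, 1, 1, 0, 0],
             [0, 0, 0, 0, 0, 0]]

widths = occlusion_widths(occlusion)
print(widths)
```

A caller could then take the width at the row of interest, for example the row containing the fingertip, as the pixel count used for distance measurement.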

[0050] FIG. 4 illustrates how the coordinate values of the subject 310 are obtained. FIG. 4(a) shows the background surface 120, the subject 310, and the coordinate system 403 defined there. FIG. 4(b) shows the positional relationship among the camera focal points (pinhole positions) 404 and 405, the imaging planes 406 and 407, the subject 310, and the background surface 120, as viewed from above along the Y axis.

[0051] Suppose that the camera unit A102 and the camera unit B103 are installed at the coordinates (xc1, yc, -zc) and (xc2, yc, -zc), respectively. In FIG. 4, the camera units A102 and B103 are modeled as pinhole cameras. Let B be the distance between the pinholes 404 and 405, L the distance from the pinholes 404 and 405 to the background surface 120, F the distance (focal length) from the pinholes 404 and 405 to their respective imaging planes 406 and 407, p the number of pixels across the width of the occlusion region 311 determined from the occlusion image 308 (FIG. 3), and x0 the actual length of the width of the occlusion region 311. Then Equation 1 below holds.

[Equation 1] $p : x_0 = F : L \quad\therefore\quad x_0 = (p \times L) / F$

[0052] Further, letting Zh be the distance from the background surface 120 to the subject (in this case, the fingertip) 310, Equation 2 below holds by the similarity of triangles.

[Equation 2] $B : x_0 = (L - Z_h) : Z_h \quad\therefore\quad Z_h = (L \times x_0) / (B + x_0)$

[0053] Substituting x0 obtained from Equation 1 into Equation 2 yields Equation 3 below.

[Equation 3] $Z_h = L / [1 + \{(B \times F) / (p \times L)\}]$

[0054] In Equation 3, the values of L, p, F, and B are all known, so the distance Zh from the background surface 120 to the subject 310 can be obtained.
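The following sketch evaluates Equation 3 and checks it against the two-step route through Equations 1 and 2. The function names and the numeric values are illustrative assumptions. In particular, for the units to be consistent the sketch assumes that F is expressed in pixels while L and B are in millimetres, and it treats a zero-width occlusion region as a subject touching the background surface (Zh = 0), which is the limit of Equation 3 as p approaches zero.

```python
def height_above_background(p, L, F, B):
    """Equation 3: Zh = L / (1 + (B * F) / (p * L))."""
    if p <= 0:
        # Assumption: no occlusion means the subject lies on the background.
        return 0.0
    return L / (1.0 + (B * F) / (p * L))


def height_via_x0(p, L, F, B):
    """Equations 1 and 2 applied in sequence, for cross-checking."""
    x0 = p * L / F               # Equation 1: real width of the occlusion
    return L * x0 / (B + x0)     # Equation 2: similar triangles


# Assumed example values: F in pixels, L and B in millimetres.
p, L, F, B = 16, 500.0, 800.0, 100.0

zh = height_above_background(p, L, F, B)
print(round(zh, 3))
```

With these values the occlusion width x0 is 10 mm, and both routes give Zh = 500/11 mm, about 45.5 mm.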

[0055] The distance measurement unit 112 calculates the distance Zh from the background surface 120 to the subject 310 using Equation 3, and outputs this value together with the x-y coordinate values of the subject 310 within the background surface 120.

[0056] Subsequently, the input processing unit 113 determines whether a key has been pressed, based on the distance Zh from the background (the picture of a piano keyboard) 309 of the background surface 120 to the subject (fingertip) 310 (step 207).

[0057] When the input processing unit 113 determines, based on a predetermined threshold or the like, that a key has been pressed, it sends the (x, y) coordinate values indicating the position of the key corresponding to the subject (fingertip) 310 to the image processing device 114 as a control signal (step 208). When it does not determine that a key has been pressed, processing simply continues (step 209).
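The press decision of steps 207 to 209 can be sketched as a simple threshold test. The function name `detect_key_press`, the threshold of 5.0 (in the same length unit as Zh), and the use of `None` to mean "no control signal" are assumptions made for illustration; the text only says that a predetermined threshold or the like is used.

```python
def detect_key_press(x, y, zh, press_threshold=5.0):
    """Return the (x, y) key position as a control signal when the
    fingertip is within `press_threshold` of the background surface,
    otherwise None so that processing simply continues."""
    if zh <= press_threshold:
        return (x, y)
    return None


pressed = detect_key_press(120, 40, zh=2.0)
hovering = detect_key_press(120, 40, zh=30.0)
print(pressed, hovering)
```

In a real loop the returned coordinates would be forwarded to the image processing device, and a `None` result would lead directly to the next frame.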

[0058] Thereafter, the processing from steps 201A and 201B through step 209 described above is repeated until the image measurement processing ends (in the example described above, until the piano performance ends) (step 210).

[0059] The control unit 115 of the image processing device 114 processes programs and data stored in the ROM 119, the RAM 118, and the like, based on the coordinate values (signal) from the input processing unit 113. Based on the processing result, the control unit 115 controls the display control unit 116 to display an image on the display unit 117, or outputs sound from an audio output unit (not shown).

[0060] As described above, the input image processing device and input image processing method of the present invention make it possible, when inputting images using cameras, to measure the distance between the background surface and the subject accurately with a small amount of computation. This enables fast and stable input image processing that can process images in real time.

[0061] <Embodiment 2> Next, an input image processing device and an input image processing method of the present invention are described with reference to the drawings.

[0062] FIG. 5 is a block diagram of a computer system including the input image processing device of the present invention. In FIG. 5, components that are the same as in FIG. 1 are given the same reference numerals.

[0063] In FIG. 5, this computer system includes an input image processing device 500 that processes input images, and an image processing device 114 connected to the input image processing device 500.

[0064] The input image processing device 500 includes: two camera units A502 and B503 for photographing a subject moving on the textured background surface 120; a synchronization signal generation unit 101 that sends a photographing synchronization signal to the camera units A502 and B503; A/D conversion units 104 and 105 that convert the analog signals of the images photographed by the camera units A502 and B503 into digital signals (digital images); normalized image generation units 506 and 507 that normalize the digital images converted by the A/D conversion units 104 and 105 into images identical to those that would be captured by two camera units whose optical axes are parallel and whose imaging planes are parallel to the background surface 120 (hereinafter also simply called "normalized images"); background image storage units 106 and 107 that store only the background image of the background surface 120; subject extraction image generation units 108 and 109 that extract only the subject based on the normalized images output from the normalized image generation units 506 and 507 and the background images stored in the background image storage units 106 and 107, and binarize the extraction result (signal) to generate subject extraction images; a difference image generation unit 110 that generates a difference image between the two subject extraction images generated by the subject extraction image generation units 108 and 109; an occlusion detection unit 111 that detects an occlusion region from the difference image generated by the difference image generation unit 110 and the subject extraction image generated by one of the subject extraction image generation units (the subject extraction image generation unit 108 in FIG. 1); a distance measurement unit 112 that measures the three-dimensional position of the subject relative to the background surface 120 from the occlusion region detected by the occlusion detection unit 111; and an input processing unit 113 that recognizes image motion and the like based on the distance measurement result of the distance measurement unit 112 and inputs a control signal based on that recognition to the image processing device 114.

[0065] Here, the camera unit A502 and the camera unit B503 have a known relative positional relationship, and are arranged so that their optical axes, whose directions are known, converge.

[0066] The image processing device 114 includes a control unit 115 that receives the control signal from the input processing unit 113 and performs processing based on that control signal, a ROM 119 that stores control programs and the like, a RAM 118 that stores data and the like, a display unit 117 that displays images, and a display control unit 116 that controls the display unit 117.

[0067] The input image processing device 500 shown in FIG. 5 differs from the input image processing device 100 shown in FIG. 1 in that the camera units A502 and B503 are arranged with a known relative positional relationship and with known, converging optical axis directions, and in that it includes the normalized image generation units 506 and 507 that normalize the digital images into normalized images.

[0068] Configuring the input image processing device of the present invention in this way widens the range of the background surface 120 that can be photographed by the camera units A502 and B503.

[0069] The operation of the input image processing device 500 shown in FIG. 5 is described below, again using a piano performance as the example.

[0070] FIG. 6 is a flowchart showing image processing in the input image processing device 500. In FIG. 6, steps that perform the same processing as in FIG. 2 described above are given the same step numbers. Here, only the background of the background surface 120 (the picture of a piano keyboard) is photographed in advance by the camera units A102 and B103, digitized by the A/D conversion units 104 and 105, and stored as background images in the background image storage units 106 and 107.

[0071] In FIG. 6, as in FIG. 2, the operator first performs a playing motion on the background surface 120, which is a picture of a piano keyboard. At this time, the camera units A502 and B503 capture images of the fingers within their angle of view as analog signals, in synchronization with the synchronization signal generated by the synchronization signal generation unit 101 (steps 601A and 601B). These analog signals are converted into digital signals (digital images) by the A/D conversion units 104 and 105.

[0072] Next, based on the known converging optical axis directions of the camera units A502 and B503, the normalized image generation units 506 and 507 normalize the digital images converted by the A/D conversion units 104 and 105 into images (normalized images) identical to those that would be captured by two camera units whose optical axes are parallel and whose imaging planes are parallel to the background surface 120 (steps 602A and 602B). Such normalized images may be generated by the general normalization (rectification) processing used in binocular stereo vision and the like.

[0073] Thereafter, in the same manner as described with reference to FIG. 2, the subject extraction image generation units 108 and 109 generate difference images using these normalized images (steps 202A and 202B). Binarization is then performed based on these difference images to generate the subject extraction images 305 and 306 (FIG. 3) (steps 203A and 203B). Next, the difference image generation unit 110 generates the difference image 307 (FIG. 3) based on the subject extraction images 305 and 306 (step 204).

[0074] Subsequently, the occlusion detection unit 111 generates the occlusion image 308 (FIG. 3) based on one of the subject extraction images (the subject extraction image 305 in FIG. 3) and the difference image 307 (step 205). Next, the distance measurement unit 112 determines the coordinate values of the subject 310 (FIG. 3) based on the occlusion image 308 (step 206).

[0075] Subsequently, the input processing unit 113 determines whether a key has been pressed, using a predetermined threshold or the like, based on the coordinate values of the subject 310 (step 207). When it determines that a key has been pressed, the input processing unit 113 sends the coordinate values of the key corresponding to the subject (fingertip) 310 to the image processing device 114 as a control signal (step 208). When it does not determine that a key has been pressed, processing simply continues (step 209).

[0076] Thereafter, the processing from steps 601A and 601B through step 209 described above is repeated until the image measurement processing ends (in the example described above, until the piano performance ends) (step 210).

[0077] The control unit 115 of the image processing device 114 processes programs and data stored in the ROM 119, the RAM 118, and the like, based on the coordinate values (signal) from the input processing unit 113. Based on the processing result, the control unit 115 controls the display control unit 116 to display an image on the display unit 117, or outputs sound from an audio output unit (not shown).

[0078] As described above, the input image processing device and input image processing method of the present invention can process images showing the operator's actions even when the background surface required for image input is wide.

[0079] <Embodiment 3> Next, another embodiment of the present invention is described. The feature of this embodiment is that measurement accuracy is further improved by detecting a plurality of occlusion regions. The input image processing device in this embodiment has the same configuration as the input image processing device 100 shown in FIG. 1. The operation of the input image processing device in this embodiment is described next.

[0080] FIG. 7 is a flowchart showing image processing in the input image processing device 100. In FIG. 7, steps that perform the same processing as in FIG. 2 described above are given the same step numbers. Here, only the background of the background surface 120 (the picture of a piano keyboard) is photographed in advance by the camera units A102 and B103, digitized by the A/D conversion units 104 and 105, and stored as background images in the background image storage units 106 and 107.

[0081] In FIG. 7, as in FIG. 2, the operator first performs a playing motion on the background surface 120, which is a picture of a piano keyboard. At this time, the camera units A102 and B103 capture images of the fingers within their angle of view as analog signals, in synchronization with the synchronization signal generated by the synchronization signal generation unit 101 (steps 201A and 201B). These analog signals are converted into digital signals (digital images) by the A/D conversion units 104 and 105.

[0082] FIG. 8 is a conceptual diagram showing the generation of the difference image. In FIGS. 1 and 8, the subject extraction image generation units 108 and 109 first compute, for each camera, the difference between the image 301 or 303 containing only the background (texture) 309, stored in the background image storage units 106 and 107, and the image 302 or 304 in which the subject (fingers) 310 appears against the background 309 (the digital image converted by the A/D conversion units 104 and 105), thereby generating difference images (steps 202A and 202B).

[0083] Based on these difference images, binarization is performed so that portions where the subject 310 is present are set to "1" and portions containing only the background 309 are set to "0", generating the subject extraction images 305 and 306 (steps 203A and 203B). In the subject extraction images 305 and 306 of FIG. 8, the portions "1" where the subject 310 is present are filled in black.

[0084] Next, the difference image generation unit 110 generates a difference image 307 by taking the exclusive OR (EXOR: Exclusive-OR) of each pair of corresponding pixels of the subject extraction images 305 and 306 (step 204). The positional relationship (pixel correspondence) between the subject extraction images 305 and 306 when taking the difference is set so that the texture (picture of a piano keyboard) 309 of the background surface 120 captured in the background images 301 and 303 overlaps without misalignment. In the difference image 307 of FIG. 8, the portion where the subject 310 overlaps in the subject extraction images 305 and 306 is "0" (white).

[0085] Subsequently, the occlusion detection unit 111 generates an occlusion image 808 by taking the logical AND of each pair of corresponding pixels of the subject extraction image 305 and the difference image 307 (step 705A). The positional relationship (pixel correspondence) between the subject extraction image 305 and the difference image 307 when taking the AND is set to be the same as the positional relationship set in the difference image generation unit 110 described above. Similarly, the occlusion detection unit 111 generates an occlusion image 809 by taking the logical AND of each pair of corresponding pixels of the subject extraction image 306 and the difference image 307 (step 705B).

[0086] Next, in the same manner as described with reference to FIG. 2, the distance measurement unit 112 determines the coordinate value Zh of the subject 310 based on the occlusion regions 311A and 311B of the occlusion images 308 and 309 (step 206). This coordinate value Zh may be taken as the average of the coordinate values ZhA and ZhB calculated from the occlusion regions 311A and 311B, respectively.
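The averaging of the two estimates can be sketched as below. The function names, the example pixel widths, and the geometry values are assumptions for illustration (F in pixels, L and B in millimetres); each per-region height is computed with Equation 3 from the first embodiment.

```python
def height_from_width(p, L, F, B):
    """Equation 3 applied to one occlusion region of pixel width p."""
    if p <= 0:
        # Assumption: no occlusion means the subject lies on the background.
        return 0.0
    return L / (1.0 + (B * F) / (p * L))


def averaged_height(p_a, p_b, L, F, B):
    """Average of ZhA and ZhB obtained from the two occlusion regions."""
    zh_a = height_from_width(p_a, L, F, B)
    zh_b = height_from_width(p_b, L, F, B)
    return (zh_a + zh_b) / 2.0


# Assumed widths measured in the two occlusion regions.
zh = averaged_height(p_a=16, p_b=20, L=500.0, F=800.0, B=100.0)
print(round(zh, 2))
```

Here the two regions give 500/11 mm and 500/9 mm, and their mean is used as Zh, so a one-sided error in either width is halved in the result.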

[0087] Subsequently, the input processing unit 113 determines whether a key has been pressed, using a predetermined threshold or the like, based on the coordinate value Zh of the subject 310 (step 207). When it determines that a key has been pressed, the input processing unit 113 sends the coordinate values of the key corresponding to the subject (fingertip) 310 to the image processing device 114 as a control signal (step 208). When it does not determine that a key has been pressed, processing simply continues (step 209).

[0088] Thereafter, the processing from steps 201A and 201B through step 209 described above is repeated until the image measurement processing ends (in the example described above, until the piano performance ends) (step 210).

[0089] The control unit 115 of the image processing device 114 processes programs and data stored in the ROM 119, the RAM 118, and the like, based on the coordinate values (signal) from the input processing unit 113. Based on the processing result, the control unit 115 controls the display control unit 116 to display an image on the display unit 117, or outputs sound from an audio output unit (not shown).

[0090] As described above, according to this embodiment of the present invention, two occlusion images 808 and 809 are generated and the average of the coordinate values Zh of the subject 310 is obtained from them, so the accuracy of distance measurement can be improved.

[0091] <Embodiment 4> Next, another embodiment of the present invention is described. In the input image processing of the present invention, taking the thickness of the subject into account when obtaining the distance Zh from the background surface 120 (FIG. 4) to the subject 310 (FIG. 4) can raise the accuracy of the image motion determination (press determination). The feature of this embodiment is therefore that the input processing unit 113 models the shape of the subject 310 and performs the image motion determination based on that model, which further improves the determination accuracy.

[0092] The input image processing device in this embodiment has the same configuration as the input image processing device 100 shown in FIG. 1. The operation of the input image processing device in this embodiment is described next.

[0093] FIG. 9 is a flowchart showing image processing in the input image processing apparatus 100. In FIG. 9, steps that perform the same processing as in FIG. 2 described above are given the same step numbers. Here, only the background (a picture of a piano keyboard) of the background surface 120 is photographed in advance by the camera unit A 102 and the camera unit B 103, digitally converted by the A/D conversion units 104 and 105, and stored as background images in the background image storage units 106 and 107.

[0094] In FIG. 9, as in FIG. 2, the operator first performs a playing motion on the background surface 120, which bears a picture of a piano keyboard. At this time, the camera unit A 102 and the camera unit B 103 capture images of the fingers within the angle of view as analog signals, in synchronization with the synchronization signal generated by the synchronization signal generation unit 101 (steps 201A and 201B). These analog signals are converted into digital signals (digital images) by the A/D conversion units 104 and 105.

[0095] Next, the subject extraction image generation units 108 and 109 each calculate the difference between the images 301 and 303 (FIG. 3), which contain only the background (texture) 309 (FIG. 3) and are stored in the background image storage units 106 and 107, and the images 302 and 304 (the digital images converted by the A/D conversion units 104 and 105), in which the subject (fingers) 310 (FIG. 3) appears against the background 309, and generate difference images (steps 202A and 202B). Based on these difference images, binarization is performed so that portions where the subject 310 exists are "1" and portions containing only the background 309 are "0", generating the subject extraction images 305 and 306 (FIG. 3) (steps 203A and 203B).
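The background subtraction and binarization of steps 202 and 203 can be sketched as follows. This is only an illustration under stated assumptions: images are taken to be grayscale and represented as nested lists, and the function name `extract_subject` and the parameter `threshold` are hypothetical, since the publication does not specify how the difference is thresholded.

```python
def extract_subject(background, frame, threshold=30):
    """Binarize the absolute difference between a frame and the background.

    background, frame: equally sized 2-D lists of gray levels (assumed format).
    threshold: hypothetical gray-level difference above which a pixel is
    treated as belonging to the subject.
    Returns a 2-D list with 1 where the subject is present and 0 elsewhere.
    """
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


# Hypothetical 2x3 background and a frame with a bright subject in column 1.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [10, 210, 12]]
mask = extract_subject(background, frame)
```

The same function would be applied once per camera, yielding the two subject extraction images.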

[0096] Next, the difference image generation unit 110 takes the exclusive OR (EXOR: Exclusive-OR) of each pair of corresponding pixels of the subject extraction images 305 and 306 to generate the difference image 307 (FIG. 3) (step 204). Subsequently, the occlusion detection unit 111 generates the occlusion image 308 (FIG. 3) based on one of the subject extraction images (the subject extraction image 305 in FIG. 3) and the difference image 307 (step 205).
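Steps 204 and 205 reduce to per-pixel logical operations on the binary subject extraction images. The sketch below assumes the images are nested lists of 0 and 1; the function names are hypothetical. The use of a logical product (AND) for the occlusion image follows claim 11 of this publication.

```python
def difference_image(mask_a, mask_b):
    """Per-pixel exclusive OR of two binary subject extraction images."""
    return [[a ^ b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


def occlusion_image(mask, diff):
    """Per-pixel logical product (AND) of one subject extraction image
    and the difference image."""
    return [[m & d for m, d in zip(row_m, row_d)]
            for row_m, row_d in zip(mask, diff)]


# Hypothetical one-row masks: the subject appears shifted between cameras.
mask_a = [[0, 1, 1, 0, 0]]
mask_b = [[0, 0, 1, 1, 0]]
diff = difference_image(mask_a, mask_b)
occ_a = occlusion_image(mask_a, diff)
```

In this example the occlusion image keeps only the pixel that the subject covers in camera A's image but not in camera B's.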

[0097] Next, the distance measurement unit 112 obtains the coordinate value Zh of the subject 310 (FIG. 3) based on the occlusion image 308.

[0098] FIG. 8 shows an example in which the subject (fingertip) 310 has thickness. When the subject 310 has thickness in this way, the coordinate value Zh does not become 0 even when a pressing operation is performed on the background surface. The input processing unit 113 therefore models the thickness of the subject 310 in advance and stores a correction value (Z0 in FIG. 8) corresponding to that thickness. The input processing unit 113 then subtracts the stored correction value Z0 from the coordinate value Zh output by the distance measurement unit 112 to obtain the actual coordinate value Zh' (Zh' = Zh − Z0) (step 906). The thickness of the subject 310 is thereby corrected for. Note that the correction of the coordinate value of the subject 310 may instead be performed by the distance measurement unit 112.

[0099] Subsequently, the input processing unit 113 determines whether a key has been pressed, based on the coordinate value Zh' of the subject 310 and a predetermined threshold (step 907).
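The correction of step 906 and the determination of step 907 can be sketched as follows. The correction Zh' = Zh − Z0 is taken from the description; the function names, the sample values, and the direction of the comparison (a press is assumed when the corrected height is at or below the threshold) are assumptions for illustration, since the publication states only that a predetermined threshold is used.

```python
def corrected_height(zh, z0):
    """Apply the thickness correction Zh' = Zh - Z0 (step 906)."""
    return zh - z0


def is_key_pressed(zh, z0, threshold):
    """Hypothetical press test (step 907): a press is assumed when the
    corrected height Zh' is at or below the threshold."""
    return corrected_height(zh, z0) <= threshold


# Hypothetical values in arbitrary length units: fingertip thickness 10,
# press threshold 3.
pressed = is_key_pressed(zh=12.0, z0=10.0, threshold=3.0)
hovering = is_key_pressed(zh=25.0, z0=10.0, threshold=3.0)
```

Without the correction, a fingertip resting on the surface would report Zh close to Z0 rather than 0, so a small threshold would never be met.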

[0100] If the input processing unit 113 determines that a key has been pressed, it sends the coordinate value of the key corresponding to the subject (fingertip) 310 to the image processing device 114 as a control signal (step 208). If it does not determine that a key has been pressed, processing simply continues (step 209).

[0101] Thereafter, the processing of steps 201A and 201B through 209 described above is repeated until the image measurement processing is completed (in the case described above, for example, until the piano performance ends) (step 210).

[0102] The control unit 115 of the image processing device 114 processes programs and data stored in the ROM 119, the RAM 118, and the like, based on the coordinate value (signal) from the input processing unit 113. Based on the processing result, the control unit 115 controls the display control unit 116 to display an image on the display unit 117, and outputs sound from an audio output unit (not shown).

[0103] As described above, according to this embodiment of the present invention, the shape of the subject is modeled and used for correction, so the accuracy of the image operation determination can be improved.

[0104] The input image processing apparatus and the input image processing method of the present invention have been described above. Embodiment 3 or Embodiment 4 can be applied to Embodiment 2 described above. Furthermore, both Embodiment 3 and Embodiment 4 can be applied to Embodiment 2. Embodiment 4 can also be applied to Embodiment 3.

[0105] The input image processing method described above can also be recorded on a recording medium as a computer-executable input image processing program.

【０１０６】[0106]

[Effects of the Invention] As described above, according to the input image processing apparatus, the input image processing method, and the recording medium recording the input image processing program of the present invention, the accurate depth of a three-dimensional image can be measured with a small amount of calculation so that captured images can be processed in real time, and cases where the background surface of the captured image is wide can also be handled.

[0107] In addition, taking the average of the coordinate values of the subject enables distance measurement with high accuracy. Furthermore, applying a modeled correction value to the coordinate value of the subject improves the accuracy of the image operation determination.

[Brief description of the drawings]

FIG. 1 is a block diagram illustrating a configuration of an input image processing apparatus of the present invention.

FIG. 2 is a flowchart showing input image processing of the present invention.

FIG. 3 is a conceptual diagram illustrating generation of a difference image.

FIG. 4 is a diagram illustrating a method of obtaining a coordinate value of a subject.

FIG. 5 is a block diagram illustrating a configuration of an input image processing apparatus of the present invention.

FIG. 6 is a flowchart showing input image processing of the present invention.

FIG. 7 is a flowchart showing input image processing of the present invention.

FIG. 8 is a conceptual diagram illustrating generation of a difference image.

FIG. 9 is a flowchart showing input image processing of the present invention.

FIG. 10 is a diagram illustrating a method of obtaining a coordinate value of a subject.

FIG. 11 is a block diagram illustrating a configuration of a conventional input image processing apparatus.

FIG. 12 is a diagram showing conventional processing of captured images.

[Explanation of symbols]

100, 500, 1100 Input image processing apparatus
101 Synchronization signal generation unit
102, 502, 1102 Camera unit A
103, 503, 1103 Camera unit B
104, 105, 1104, 1105 A/D conversion unit
106, 107 Background image storage unit
108, 109 Subject extraction image generation unit
110 Difference image generation unit
111 Occlusion detection unit
112, 1109 Distance measurement unit
113, 1110 Input processing unit
114, 1111 Image processing device
115, 1112 Control unit
116, 1113 Display control unit
117, 1114 Display unit
118, 1116 RAM
119, 1115 ROM
120 Background surface
301, 303 Background image
302, 304 Digital image
305, 306 Subject extraction image
307 Difference image
308, 808, 809 Occlusion image
311, 311A, 311B Occlusion area
403 Coordinate system
404, 405 Camera focal point
406, 407 Imaging plane
506, 507 Normalized image generation unit
1201 to 1210 Feature points
1211, 1212 Captured image
1213 Fingertip
1214 Keyboard
1215 Epipolar line

Continued from the front page
(72) Inventor: Junko Saito, c/o Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo
(72) Inventor: Shunichi Takeuchi, c/o Waseda Research Center, Telecommunications Advancement Organization, 1-21-1 Nishiwaseda, Shinjuku-ku, Tokyo
(72) Inventor: Hideyoshi Tominaga, c/o Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo
F-terms (reference): 2F065 AA04 DD03 DD06 FF01 FF05 JJ03 JJ26 QQ03 QQ24 QQ25 QQ42 UU05; 5B057 CE09 CH08 DA07 DB03; 5C023 AA01 AA06 AA08 AA10 AA34 AA38 BA01 BA02 CA01 DA02 DA03 DA08

Claims

[Claims]

1. An input image processing method for processing an image of a subject moving on a background surface, comprising the steps of: simultaneously photographing the subject moving on the background surface with a plurality of photographing units to acquire a plurality of input images; extracting only the subject from each input image to generate a plurality of subject extracted images; generating a difference image from the generated plurality of subject extracted images; generating an occlusion image from an arbitrary one of the plurality of subject extracted images and the difference image; and measuring a position of the subject with respect to the background surface based on the occlusion image.

2. The input image processing method according to claim 1, wherein the plurality of photographing units have a known relative positional relationship, their optical axis directions are parallel to each other, and their respective imaging planes are set on a plane parallel to the background surface.

3. An input image processing method for processing an image of a subject moving on a background surface, comprising the steps of: simultaneously photographing the subject moving on the background surface with a plurality of photographing units to acquire a plurality of input images; generating respective normalized images from the plurality of input images; extracting only the subject from each of the normalized images to generate a plurality of subject extracted images; generating a difference image from the generated plurality of subject extracted images; generating an occlusion image from an arbitrary one of the plurality of subject extracted images and the difference image; and measuring a position of the subject with respect to the background surface based on the occlusion image.

4. The input image processing method according to claim 3, wherein the plurality of photographing units are installed so that their relative positional relationship is known, their optical axis directions are known, and their optical axes converge.

5. The input image processing method according to any one of claims 1 to 4, wherein the step of generating the occlusion image is a step of generating two or more occlusion images from two or more arbitrary subject extracted images of the plurality of subject extracted images and the difference image, and the step of measuring the position of the subject is a step of measuring the position of the subject with respect to the background surface for each of the two or more occlusion images and determining the position of the subject based on an average value of the positions measured for each of the two or more occlusion images.

6. The input image processing method according to claim 1, wherein, in the step of measuring the position of the subject, the measured position is corrected by a predetermined correction value.

7. The input image processing method according to claim 6, wherein, in the step of measuring the position of the subject, where the position of the subject is a value Zh' indicating the distance from the background surface to the subject, the predetermined correction value is a value Z0 of the width of the subject in the direction perpendicular to the background surface, and the value of the measured position is Zh, the value Zh' indicating the distance to the subject is obtained by Zh' = Zh − Z0.

8. The input image processing method according to claim 1, wherein the step of acquiring the plurality of input images is a step of acquiring a plurality of input images by simultaneously photographing the subject moving on the background surface with the plurality of photographing units in synchronization.

9. The input image processing method according to any one of claims 1 to 8, wherein the step of generating the plurality of subject extracted images extracts only the subject based on a background image in which the subject does not appear.

10. The input image processing method according to claim 1, wherein the step of generating the difference image is a step of generating the difference image by taking an exclusive OR (EXOR: Exclusive-OR) for each corresponding pixel of the generated plurality of subject extracted images.

11. The input image processing method according to claim 1, wherein the step of generating the occlusion image is a step of generating the occlusion image by taking, for each corresponding pixel, a logical product (AND) of the arbitrary subject extracted image and the difference image.

12. The input image processing method according to claim 1, further comprising, after the step of measuring the position of the subject, a step of recognizing a motion of the subject from the measured position of the subject.

13. An input image processing apparatus for processing an image of a subject moving on a background surface, comprising: a plurality of photographing means for simultaneously photographing the subject moving on the background surface to acquire a plurality of input images; extracted image generating means for extracting only the subject from the plurality of input images acquired by the photographing means to generate a plurality of subject extracted images; difference image generating means for generating a difference image from the plurality of subject extracted images generated by the extracted image generating means; occlusion image generating means for generating an occlusion image from an arbitrary one of the plurality of subject extracted images and the difference image; and distance measuring means for measuring a position of the subject with respect to the background surface based on the occlusion image generated by the occlusion image generating means.

14. The input image processing apparatus according to claim 13, wherein the plurality of photographing means have a known relative positional relationship, their optical axis directions are parallel to each other, and their respective imaging planes are set on a plane parallel to the background surface.

15. An input image processing apparatus for processing an image of a subject moving on a background surface, comprising: a plurality of photographing means for simultaneously photographing the subject moving on the background surface to acquire a plurality of input images; normalized image generating means for generating respective normalized images from the plurality of input images acquired by the photographing means; extracted image generating means for extracting only the subject from each of the normalized images generated by the normalized image generating means to generate a plurality of subject extracted images; difference image generating means for generating a difference image from the plurality of subject extracted images generated by the extracted image generating means; occlusion image generating means for generating an occlusion image from an arbitrary one of the plurality of subject extracted images and the difference image; and distance measuring means for measuring a position of the subject with respect to the background surface based on the occlusion image generated by the occlusion image generating means.

16. The input image processing apparatus according to claim 15, wherein the plurality of photographing means are installed so that their relative positional relationship is known, their optical axis directions are known, and their optical axes converge.

17. The input image processing apparatus according to claim 13, wherein the occlusion image generating means is configured to generate two or more occlusion images from two or more arbitrary subject extracted images of the plurality of subject extracted images and the difference image, and the distance measuring means is configured to measure the position of the subject with respect to the background surface for each of the two or more occlusion images and to determine the position of the subject based on an average value of the positions measured for each of the two or more occlusion images.

18. The input image processing apparatus according to any one of claims 13 to 17, wherein the distance measuring means corrects the measured position by a predetermined correction value.

19. The input image processing apparatus according to claim 18, wherein, where the position of the subject is a value Zh' indicating the distance from the background surface to the subject, the predetermined correction value is a value Z0 of the width of the subject in the direction perpendicular to the background surface, and the value of the measured position is Zh, the distance measuring means obtains the value Zh' indicating the distance to the subject by Zh' = Zh − Z0.

20. The input image processing apparatus according to any one of claims 13 to 19, wherein the plurality of photographing means are configured to acquire the plurality of input images by simultaneously photographing the subject moving on the background surface in synchronization.

21. The input image processing apparatus according to claim 13, wherein said extracted image generating means extracts only said subject based on a background image in which said subject is not shown.

22. The input image processing apparatus according to any one of claims 13 to 21, wherein the difference image generating means generates the difference image by taking an exclusive OR (EXOR: Exclusive-OR) for each corresponding pixel of the generated plurality of subject extracted images.

23. The input image processing apparatus according to claim 13, wherein the occlusion image generating means generates the occlusion image by taking, for each corresponding pixel, a logical product (AND) of the arbitrary subject extracted image and the difference image.

24. The image input apparatus according to claim 13, further comprising means for recognizing a motion of said subject from a position of said subject measured by said distance measuring means.

25. A computer-readable recording medium recording an input image processing program for causing a computer to execute an input image processing method comprising the steps of: simultaneously photographing a subject moving on a background surface with a plurality of photographing units to acquire a plurality of input images; extracting only the subject from each input image to generate a plurality of subject extracted images; generating a difference image from the generated plurality of subject extracted images; generating an occlusion image from an arbitrary one of the plurality of subject extracted images and the difference image; and measuring a position of the subject with respect to the background surface based on the occlusion image.

26. The recording medium according to claim 25, wherein the step of acquiring the plurality of input images is a step of acquiring a plurality of input images photographed by the plurality of photographing units, whose relative positional relationship is known, whose optical axis directions are parallel to each other, and whose respective imaging planes are set on a plane parallel to the background surface.

27. A computer-readable recording medium recording an input image processing program for causing a computer to execute an input image processing method comprising the steps of: simultaneously photographing a subject moving on a background surface with a plurality of photographing units to acquire a plurality of input images; generating respective normalized images from the plurality of input images; extracting only the subject from each of the normalized images to generate a plurality of subject extracted images; generating a difference image from the generated plurality of subject extracted images; generating an occlusion image from an arbitrary one of the plurality of subject extracted images and the difference image; and measuring a position of the subject with respect to the background surface based on the occlusion image.

28. The recording medium according to claim 27, wherein the step of acquiring the plurality of input images is a step of acquiring a plurality of input images photographed by the plurality of photographing units, which are installed so that their relative positional relationship is known, their optical axis directions are known, and their optical axes converge.

29. The recording medium according to claim 25, wherein the step of generating the occlusion image is a step of generating two or more occlusion images from two or more arbitrary subject extracted images of the plurality of subject extracted images and the difference image, and the step of measuring the position of the subject is a step of measuring the position of the subject with respect to the background surface for each of the two or more occlusion images and determining the position of the subject based on an average value of the positions measured for each of the two or more occlusion images.

30. The recording medium according to claim 25, wherein the step of measuring the position of the subject corrects the measured position by a predetermined correction value.

31. The recording medium according to claim 30, wherein, in the step of measuring the position of the subject, where the position of the subject is a value Zh' indicating the distance from the background surface to the subject, the predetermined correction value is a value Z0 of the width of the subject in the direction perpendicular to the background surface, and the value of the measured position is Zh, the value Zh' indicating the distance to the subject is obtained by Zh' = Zh − Z0.

32. The step of acquiring a plurality of input images is a step of acquiring a plurality of input images by simultaneously photographing a subject moving on the background surface with a plurality of photographing units while synchronizing. 32. The recording medium according to claim 25, wherein:

33. The recording medium according to claim 25, wherein the step of generating the plurality of subject extracted images extracts only the subject based on a background image not including the subject.

34. The recording medium according to any one of claims 25 to 33, wherein the step of generating the difference image is a step of generating the difference image by taking an exclusive OR (EXOR: Exclusive-OR) for each corresponding pixel of the generated plurality of subject extracted images.

35. The recording medium according to claim 25, wherein the step of generating the occlusion image is a step of generating the occlusion image by taking, for each corresponding pixel, a logical product (AND) of the arbitrary subject extracted image and the difference image.

36. The recording medium according to claim 25, wherein the input image processing method further comprises, after the step of measuring the position of the subject, a step of recognizing a motion of the subject from the measured position of the subject.