JP5951966B2 - Image processing apparatus, image processing system, image processing method, and program

Info

Publication number: JP5951966B2
Application number: JP2011254294A
Authority: JP
Inventors: 隼平佐藤
Original assignee: Lapis Semiconductor Co Ltd
Current assignee: Lapis Semiconductor Co Ltd
Priority date: 2011-11-21
Filing date: 2011-11-21
Publication date: 2016-07-13
Anticipated expiration: 2031-11-21
Also published as: JP2013109590A

Description

The present invention relates to an image processing apparatus, an image processing system, an image processing method, and a program that recognize the movement of a feature portion of a user's face.

Conventionally, as shown for example in FIG. 8, a technique has been proposed in which an imaging means such as a camera 210 captures the user's face, the movement of the nose, mouth, or the like in the face is recognized from the captured image, and an image (a pointer or the like) 218 displayed on a display screen 216 is moved in accordance with that movement (see, for example, Patent Document 1 and Patent Document 2).

Patent Document 1: JP-A-7-93089

Patent Document 2: JP 2009-31368 A

However, with the conventional technique, the pointer 218 also moves when the user changes posture while operating it, even though the user does not intend to move the pointer 218. This can be inconvenient.

Specifically, suppose the user goes from sitting upright to sitting deep in the chair, slumped toward the lower left. The posture change moves the nose or other recognition target, and the pointer 218 moves with it to the left edge of the display screen 216. If the user then tries, in that posture, to move the pointer 218 to the upper right of the display screen 216 by moving the face by the usual amount, the pointer does not reach the intended position, or the user loses sight of the pointer 218.

The present invention has been proposed to solve the above problem. Its object is to provide an image processing apparatus, an image processing system, an image processing method, and a program with which, while the user is moving an image on a display screen by moving the face, the image can easily be placed at a reference position on the display screen even if the user changes posture or the like, so that the user can easily continue moving the image after the posture change.

To achieve the above object, an image processing apparatus according to the present invention includes: feature detection means for detecting a feature portion of a user's face from a captured image of the user; orientation detection means for detecting the orientation of the user's face from the captured image; storage processing means for storing in storage means, each time the orientation detection means detects a predetermined face orientation, the position of the feature portion detected by the feature detection means at that time, identifiably as the latest reference point; first control means for controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature portion detected by the feature detection means relative to the latest reference point stored in the storage means; and second control means for controlling display of the image so that, when the orientation detection means detects the predetermined face orientation, the image is placed at a predetermined reference position on the display screen.

An image processing system according to the present invention includes: imaging means for capturing an image of a user; feature detection means for detecting a feature portion of the user's face from the captured image captured by the imaging means; orientation detection means for detecting the orientation of the user's face from the captured image; storage processing means for storing in storage means, each time the orientation detection means detects a predetermined face orientation, the position of the feature portion detected by the feature detection means at that time, identifiably as the latest reference point; first control means for controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature portion detected by the feature detection means relative to the latest reference point stored in the storage means; and second control means for controlling display of the image so that, when the orientation detection means detects the predetermined face orientation, the image is placed at a predetermined reference position on the display screen.

An image processing method according to the present invention includes: detecting a feature portion of a user's face from a captured image of the user and detecting the orientation of the user's face; storing in storage means, each time a predetermined face orientation of the user is detected from the captured image, the position of the feature portion detected at that time, identifiably as the latest reference point; controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature portion relative to the latest reference point stored in the storage means; and controlling display of the image so that, when the predetermined face orientation is detected, the image is placed at a predetermined reference position on the display screen.

A program according to the present invention causes a computer to function as: feature detection means for detecting a feature portion of a user's face from a captured image of the user; orientation detection means for detecting the orientation of the user's face from the captured image; storage processing means for storing in storage means, each time the orientation detection means detects a predetermined face orientation, the position of the feature portion detected by the feature detection means at that time, identifiably as the latest reference point; first control means for controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature portion detected by the feature detection means relative to the latest reference point stored in the storage means; and second control means for controlling display of the image so that, when the orientation detection means detects the predetermined face orientation, the image is placed at a predetermined reference position on the display screen.

As described above, according to the present invention, while the user is moving an image on the display screen by moving the face, the image can easily be placed at the reference position of the display screen even if the user changes posture or the like, and the user can easily continue moving the image after the posture change.
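The interplay of the storage processing, the first control, and the second control can be sketched in a few lines of Python. This is only an illustrative sketch: the class name `PointerController`, the `gain` parameter, and the choice to treat image and screen axes as aligned (no mirroring) are assumptions, not part of the described apparatus.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class PointerController:
    """Hypothetical controller; all names here are illustrative."""
    screen_reference: Point                   # predetermined reference position on the screen
    gain: float = 1.0                         # assumed scale from image pixels to screen pixels
    latest_reference: Optional[Point] = None  # latest stored reference point

    def update(self, feature_pos: Point, facing_predetermined: bool) -> Point:
        """Return the pointer position for one captured frame."""
        if facing_predetermined or self.latest_reference is None:
            # Storage processing: remember the feature position as the
            # latest reference point.
            # Second control: place the pointer at the reference position.
            # (Using the first frame as a reference is an assumption.)
            self.latest_reference = feature_pos
            return self.screen_reference
        # First control: move the pointer by the feature's position
        # relative to the latest reference point.
        dx = feature_pos[0] - self.latest_reference[0]
        dy = feature_pos[1] - self.latest_reference[1]
        return (self.screen_reference[0] + self.gain * dx,
                self.screen_reference[1] + self.gain * dy)
```

Because the reference point is replaced every time the predetermined orientation is seen, a posture change only shifts the pointer until the user next faces that orientation; after that, movement is again measured from the new posture.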

FIG. 1 is a diagram showing an image processing system according to an embodiment, together with a processing device connected to the image processing system and a monitor.

FIG. 2 is a diagram showing an example of the configuration of the image processing apparatus.

FIG. 3 is an example of a flowchart showing the flow of the face detection processing performed by the image processing apparatus.

FIG. 4 is an explanatory diagram of face and face part detection.

FIG. 5 is a diagram schematically showing the detected face part regions and face region of the user, together with the display position of the pointer corresponding to the detection result.

FIG. 6(A) is a diagram showing an example of pointer display control when the whole face is moved with the face tilted, and FIG. 6(B) is a diagram showing an example of pointer display control when the whole face is moved with the face facing front.

FIG. 7 is a diagram showing a monitor display screen composed of a plurality of screen regions, with a reference position set individually for each screen region.

FIG. 8 is an explanatory diagram of the conventional technique in which a user moves a pointer by moving the face.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 shows an image processing system 1 according to the present embodiment, a processing device 14 connected to the image processing system 1, and a monitor 16.

The image processing system 1 includes a camera 10 and an image processing apparatus 12. In the image processing system 1, the camera 10 first captures the user's face as a recognition object 30 (FIG. 1 shows the back of the user's head, but in this embodiment the user's face, which is turned toward the camera 10, is the recognition object 30). The image processing apparatus 12 detects the user's face from the captured image and outputs the detection result to the processing device 14. Based on the output detection result, the processing device 14 moves an image 18 (a pointer in this embodiment) displayed on a display screen 16a of the monitor 16, which serves as display means. The processing device 14 may be, for example, a personal computer.

FIG. 2 shows the configuration of the image processing apparatus 12.

The image processing apparatus 12 is a general-purpose computer that includes a CPU (Central Processing Unit) 20, a ROM (Read Only Memory) 22, and a RAM (Random Access Memory) 24, which are connected to one another via a bus (not shown). The image processing apparatus 12 also includes an image processing memory 26.

The program stored in the ROM 22 is loaded into the RAM 24 when the CPU 20 starts up, and the CPU 20 executes the program stored in the ROM 22 to control the entire image processing apparatus 12 and to perform the face detection processing described later (see also FIG. 3). The ROM 22 may be a flash memory. The storage medium that holds the program executed by the CPU 20 is not limited to the ROM 22. For example, it may be an HDD (Hard Disk Drive), a removable USB memory, or a storage device connected to a communication network.

The image processing memory 26 temporarily stores image data representing the images captured by the camera 10. The CPU 20 performs image processing on the image data stored in the image processing memory 26 and outputs the processing result (detection result) to the processing device 14.

Although not shown, the processing device 14 includes a CPU, a RAM, a ROM, and designation means for specifying various settings. A system administrator uses the designation means to specify settings such as the size of the display screen 16a of the monitor 16 and the accuracy of face detection, and the specified settings are input from the processing device 14 to the image processing apparatus 12. The image processing apparatus 12 operates according to the input settings. For example, it may adjust the scale of the coordinate values of the recognition result output to the processing device 14 based on the input size of the display screen 16a. It may also adjust the resolution used for face detection, or restrict the face sizes to be detected, depending on whether the input accuracy information indicates, for example, that accuracy or speed has priority. These settings are examples only, and the invention is not limited to them.

The image data of the images captured by the camera 10 may also be input from the image processing apparatus 12 to the processing device 14, which makes it possible to display the captured images on the monitor 16. The captured images may or may not be displayed on the display screen 16a of the monitor 16.

The camera 10 that captures the images includes, for example, a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor and is configured to capture a subject continuously. Image data representing each captured image is input sequentially to the image processing apparatus 12 and temporarily stored in the image processing memory 26.

FIG. 3 is an example of a flowchart showing the flow of the face detection processing performed by the image processing apparatus 12. This processing starts when the function of the system is started, for example when a command to start face detection processing is input from the processing device 14. Until step 116 below starts and the user moves the face, the pointer 18 remains displayed at a predetermined reference position on the display screen 16a of the monitor 16 (in this embodiment, the coordinate position of the center of the display screen 16a). In the following, "position" alone means an absolute position in the XY coordinate system, while "position with respect to some point" or "relative position" means a position in the XY coordinate system relative to that point.

In step 100, the image data input from the camera 10 is read from the image processing memory 26 and subjected to image processing to obtain, from the captured image that the image data represents, a predicted face detection region 40 in which a face is predicted to be present (face region prediction). The captured image processed here is called the initial image.

Here, the position and size of the face are predicted using Hue, Sat (saturation), and Lum (luminance). Specifically, for each of hue, saturation, and luminance, a histogram is generated along each of the X axis and the Y axis showing the frequency distribution of pixels whose pixel values (hue value, saturation value, luminance value) correspond to a human face (skin color) in the captured image. The logical sum (OR) of the hue, saturation, and luminance histograms is then taken to obtain the region in which a human face is predicted to exist. As shown in FIG. 4(A), this processing determines the predicted face detection region 40 for the recognition object 30.
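One possible reading of this histogram step is sketched below in plain Python. The value ranges, the `min_count` threshold, the image representation (rows of `(hue, saturation, luminance)` tuples), and the function names are all assumptions made for illustration; the source does not specify them.

```python
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[float, float, float]   # (hue, saturation, luminance)
Range = Tuple[float, float]          # inclusive (low, high)


def axis_flags(image: Sequence[Sequence[Pixel]], channel: int,
               value_range: Range, min_count: int) -> Tuple[List[bool], List[bool]]:
    """Build X-axis and Y-axis histograms of in-range pixels for one
    channel, then threshold each histogram bin into a boolean flag."""
    lo, hi = value_range
    height, width = len(image), len(image[0])
    col_hist = [0] * width    # histogram along the X axis
    row_hist = [0] * height   # histogram along the Y axis
    for y, row in enumerate(image):
        for x, px in enumerate(row):
            if lo <= px[channel] <= hi:
                col_hist[x] += 1
                row_hist[y] += 1
    return ([c >= min_count for c in col_hist],
            [r >= min_count for r in row_hist])


def predict_face_region(image: Sequence[Sequence[Pixel]],
                        ranges: Sequence[Range],
                        min_count: int = 1) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom), inclusive, or None if nothing matches."""
    height, width = len(image), len(image[0])
    cols = [False] * width
    rows = [False] * height
    for channel, value_range in enumerate(ranges):
        c, r = axis_flags(image, channel, value_range, min_count)
        # Logical sum (OR) of the per-channel results.
        cols = [a or b for a, b in zip(cols, c)]
        rows = [a or b for a, b in zip(rows, r)]
    xs = [i for i, flag in enumerate(cols) if flag]
    ys = [i for i, flag in enumerate(rows) if flag]
    if not xs or not ys:
        return None
    return (xs[0], ys[0], xs[-1], ys[-1])
```

Projecting onto the two axes keeps the cost at one pass over the image per channel, which suits a small embedded CPU better than a full connected-component analysis would.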

Next, in step 102, predicted face part regions 42, in which the eyes, nose, and mouth (hereinafter called face parts) are predicted to be present, are obtained from the predicted face detection region 40 (face part region prediction). In a human face, the approximate position of each face part relative to the face is fixed: the eyes are at the upper left and right, the nose is near the center, and the mouth is at the bottom. This relative position information is stored in the ROM 22 in advance, and the prediction is made from this information and the size of the face in the predicted face detection region 40. As shown in FIG. 4(B), this yields predicted face part regions 42 corresponding to the eyes, nose, and mouth within the predicted face detection region 40. Although the eyes, nose, and mouth are used here as specific examples, the face parts are not limited to these and may also include, for example, the eyebrows.
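As a rough sketch of this prediction, the relative position information can be held as fractions of the face box and scaled by the detected face size. The table name `FACE_PART_LAYOUT` and every fraction in it are hypothetical values chosen for illustration, not the values stored in the ROM 22.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]   # (x, y, width, height)

# Hypothetical layout: each part as (x, y, width, height) fractions of the face box.
FACE_PART_LAYOUT: Dict[str, Box] = {
    "left_eye":  (0.125, 0.25,   0.25, 0.125),
    "right_eye": (0.625, 0.25,   0.25, 0.125),
    "nose":      (0.375, 0.375,  0.25, 0.25),
    "mouth":     (0.25,  0.6875, 0.5,  0.1875),
}


def predict_part_regions(face_box: Box) -> Dict[str, Box]:
    """Scale the stored relative layout to the predicted face region."""
    fx, fy, fw, fh = face_box
    return {
        part: (fx + rx * fw, fy + ry * fh, rw * fw, rh * fh)
        for part, (rx, ry, rw, rh) in FACE_PART_LAYOUT.items()
    }
```

For a predicted face region of `(100, 50, 200, 160)`, this layout places the predicted nose region at `(175.0, 110.0, 50.0, 40.0)`.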

In step 104, each face part (face part region 44) is detected from the predicted face part regions 42 in the initial image. For example, to detect an eye, the same approach as in step 100 is applied to the predicted face part region 42 in which the eye is predicted to be present: for each of hue, saturation, and luminance, a histogram is generated along each of the X axis and the Y axis showing the frequency distribution of pixels whose pixel values (hue value, saturation value, luminance value) correspond to an eye in the captured image, and the logical sum (OR) of the hue, saturation, and luminance histograms is taken to obtain the region in which the eye is predicted to exist. Alternatively, a technique such as pattern matching may be used. As shown in FIG. 4(C), a face part region 44 is thereby detected for each face part.

In step 106, it is determined whether all face parts were detected from the initial image. If the determination is negative, the user's face is judged not to have been facing the camera 10, so the process returns to step 100 and repeats the above processing.

If the determination in step 106 is affirmative, each face part is judged to have been detected with the user's face facing the camera 10, so the process proceeds to step 108, where region information indicating the detected face part regions 44 is stored in the RAM 24. The region information includes information such as position and size. During the tracking described later, this region information is updated whenever it changes.

Next, in step 110, a face region 46 is set based on the face part regions 44 that have been set. As noted above, the approximate position of each face part relative to a human face is fixed (eyes at the upper left and right, nose near the center, mouth at the bottom), so this position information is used in the reverse direction from step 102 to obtain an accurate face region from the face part regions 44. Region information indicating the detected face region 46 is then stored in the RAM 24. This region information is also updated whenever it changes during the tracking described later.
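Running the relative layout in reverse might look like the following sketch, in which each detected part yields its own estimate of the face box and the estimates are averaged. The layout fractions and the averaging rule are assumptions for illustration; the source only says that the position information is applied inversely.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]   # (x, y, width, height)

# Hypothetical layout: each part as (x, y, width, height) fractions of the face box.
FACE_PART_LAYOUT: Dict[str, Box] = {
    "left_eye":  (0.125, 0.25,   0.25, 0.125),
    "right_eye": (0.625, 0.25,   0.25, 0.125),
    "nose":      (0.375, 0.375,  0.25, 0.25),
    "mouth":     (0.25,  0.6875, 0.5,  0.1875),
}


def estimate_face_box(parts: Dict[str, Box]) -> Box:
    """Invert the layout for every detected part and average the results."""
    if not parts:
        raise ValueError("at least one face part region is required")
    estimates = []
    for name, (px, py, pw, ph) in parts.items():
        rx, ry, rw, rh = FACE_PART_LAYOUT[name]
        fw, fh = pw / rw, ph / rh                  # face size implied by this part
        estimates.append((px - rx * fw, py - ry * fh, fw, fh))
    n = len(estimates)
    return (sum(e[0] for e in estimates) / n,
            sum(e[1] for e in estimates) / n,
            sum(e[2] for e in estimates) / n,
            sum(e[3] for e in estimates) / n)
```

Averaging is only one way to reconcile the per-part estimates; a median or a bounding box of the parts with fixed margins would serve the same purpose.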

In step 112, among the face part regions 44 detected from the initial image, the position (in this embodiment, the centroid) of a predetermined face part region 44 (in this embodiment, the nose region) is taken as a reference point 48, and the position information of the reference point 48 is stored in the RAM 24. The nose region corresponds to the feature portion of the face in the present invention.

Although the nose region is used here as the feature portion of the face, the feature portion may instead be the mouth region, the left eye region, or the right eye region. It may also be a region lying between the left eye region and the right eye region (for example, a region containing the midpoint between the left end of the left eye region and the right end of the right eye region, or that midpoint itself).

The position of the nose region is taken to be its centroid, but this is not a limitation. For example, it may be the position of the upper left corner of the nose region, the position of the lower right corner, or the midpoint between the two nostrils.
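The alternative definitions of the region's position can be compared with a short sketch. It assumes, purely for illustration, that a detected region is available as a list of `(x, y)` pixel coordinates; the function names are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def centroid(pixels: Sequence[Point]) -> Point:
    """Mean position of the region's pixels."""
    if not pixels:
        raise ValueError("region is empty")
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def upper_left(pixels: Sequence[Point]) -> Point:
    """Upper left corner of the region's bounding box (y grows downward)."""
    return (min(x for x, _ in pixels), min(y for _, y in pixels))


def lower_right(pixels: Sequence[Point]) -> Point:
    """Lower right corner of the region's bounding box (y grows downward)."""
    return (max(x for x, _ in pixels), max(y for _, y in pixels))
```

Whichever definition is chosen, it has to be used consistently for both the stored reference point and the tracked position, since only their difference drives the pointer.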

In step 114, information on the position of the reference point 48 relative to the face region 46 (hereinafter, reference point relative position information) is stored and held in the RAM 24. When the face part regions 44 of all face parts have been detected from the initial image, the user is considered to be facing the camera 10. In the following, therefore, this relative position of the reference point 48 is treated as the relative position in the state where the user is facing the camera 10.

This determines a reference point update region 50 (see also FIG. 6). In this embodiment, the reference point update region 50 is a square region of predetermined size whose centroid is the position determined from the relative position indicated by the reference point relative position information with respect to the latest detected face region 46. The reference point update region 50 is used to judge whether the face is facing the camera 10. Because the reference point 48 is updated, as described later, when the face is detected to be facing front, the region is called the reference point update region here.
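The geometry of the reference point update region can be sketched as follows. Several things here are assumptions: the relative position is measured in pixels from the upper left corner of the face region, the side length is a free parameter, and the containment test at the end is only one plausible way such a region might be used (the document describes its own judgment method separately).

```python
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]   # (x, y, width, height)


def relative_to_face(point: Point, face_box: Box) -> Point:
    """Reference point position relative to the face region's upper left corner."""
    return (point[0] - face_box[0], point[1] - face_box[1])


def update_region(latest_face_box: Box, rel: Point, side: float) -> Box:
    """Square of the given side centred on the stored relative position,
    applied to the latest detected face region."""
    cx = latest_face_box[0] + rel[0]
    cy = latest_face_box[1] + rel[1]
    return (cx - side / 2, cy - side / 2, side, side)


def contains(region: Box, point: Point) -> bool:
    """True if the point lies inside the region (edges included)."""
    x, y, w, h = region
    return x <= point[0] <= x + w and y <= point[1] <= y + h
```

Because the square is anchored to the latest face region rather than to the image, it follows the user when the whole face shifts without turning.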

In step 116, tracking of the face parts and the face is started. Here, tracking of the face parts and the face means detecting and following the detected face part regions 44 and face region 46 in each of the images captured continuously or intermittently by the camera 10. When the user's face moves, the positions of the face part regions 44 and the face region 46 move accordingly. The tracking method is not particularly limited. For example, tracking may be performed by applying well-known contour tracking processing to the face part regions 44 and the face region 46, or by repeating the processing of steps 104 and 110.

In step 118, it is determined, based on the face part regions 44 and the face region 46 detected during tracking, whether the face is facing front. A specific example of this determination method is described later. If the face is determined not to be facing front, the process proceeds to step 120, where it is determined whether any of the face part regions 44 is no longer detected. If the region of any of the face parts (left eye, right eye, nose, mouth) is determined to be no longer detected, the user's face is regarded as lost, tracking is stopped in step 124, and the process returns to step 100 to detect the user's face region and face parts again. At this time, if the pointer 18 is not displayed at the reference position, it is controlled so as to return to the reference position.

Tracking only the nose among the face parts can cause the nose region to be falsely detected, for example when a person passes between the user and the camera 10. In this embodiment, therefore, every face part is detected, and the process returns to step 100 when at least one of them is not detected.

However, in operation where false detection is unlikely (for example, where nobody passes between the user and the camera 10), the process may instead return to step 100 only when at least two, or at least three, face parts are no longer detected. Alternatively, the apparatus may be configured so that, as long as the face part region 44 used to operate the pointer 18 (here, the nose region) is detected, the pointer 18 can continue to be operated without returning to step 100.
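The three restart policies described above differ only in a threshold, which a small sketch makes explicit. The function name `should_restart` and its parameters are hypothetical.

```python
from typing import Collection, Optional

REQUIRED_PARTS = ("left_eye", "right_eye", "nose", "mouth")


def should_restart(detected: Collection[str],
                   min_missing: int = 1,
                   pointer_part: Optional[str] = None) -> bool:
    """Decide whether tracking should stop and detection restart at step 100.

    min_missing=1 models the embodiment (any missing part restarts);
    2 or 3 models the relaxed variants. If pointer_part is given, only
    that part (for example "nose") has to remain detected.
    """
    if pointer_part is not None:
        return pointer_part not in detected
    missing = [p for p in REQUIRED_PARTS if p not in detected]
    return len(missing) >= min_missing
```

A stricter threshold trades responsiveness for robustness: with `min_missing=1`, a brief occlusion of one eye restarts detection, but a passer-by's nose is much less likely to hijack the pointer.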

If, on the other hand, the determination in step 120 is negative, the process proceeds to step 124, where the position of the nose region with respect to the latest reference point 48 stored in the RAM 24 (the position of the nose region relative to the reference point 48) is obtained, and information indicating this relative position (hereinafter, relative position information) is input to the processing device 14. The position of the nose region is taken here to be its centroid but, as noted above, is not limited to the centroid. The processing device 14 moves the pointer 18 to the coordinate position on the display screen 16a that corresponds to the input relative position information. By inputting the relative position information to the processing device 14 in this way, the image processing apparatus 12 controls the pointer 18 displayed on the display screen 16a of the monitor 16 so that it moves.
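The hand-off between the two devices can be sketched as a pair of functions: one for the image processing apparatus side, which produces the relative position information, and one for the processing device side, which turns it into a screen coordinate. The `gain` factor and the clamping to the screen edges are assumptions added for illustration, and image and screen axes are assumed to be aligned.

```python
from typing import Tuple

Point = Tuple[float, float]


def relative_position(nose: Point, reference: Point) -> Point:
    """Image processing apparatus side: nose position relative to the
    latest reference point, in image pixels."""
    return (nose[0] - reference[0], nose[1] - reference[1])


def pointer_position(rel: Point, screen_size: Tuple[int, int],
                     gain: float = 1.0) -> Point:
    """Processing device side: offset the pointer from the screen centre
    (the reference position in this embodiment) and keep it on screen."""
    width, height = screen_size
    x = width / 2 + gain * rel[0]
    y = height / 2 + gain * rel[1]
    # Clamping is an assumption; the source does not say what happens at the edges.
    return (min(max(x, 0), width - 1), min(max(y, 0), height - 1))
```

For example, a nose at `(330, 236)` with a reference point at `(320, 240)` gives a relative position of `(10, -4)`, which a gain of 8 on a 1920x1080 screen maps to `(1040.0, 508.0)`.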

The display control of the pointer 18 by tracking is now described with reference to FIG. 5. FIG. 5 schematically shows the detected face part regions 44 and face region 46 of the user, together with the resulting display of the pointer 18.

In the initial stage, when the user is facing the front and has not yet moved the face, the pointer 18 is placed at the reference position of the display screen 16a of the monitor 16, as shown in FIG. 5(A).

Suppose the user then turns the face to the right as viewed facing the camera 10. As shown in FIG. 5(B), the nose region moves to position 48a. The image processing device 12 inputs relative position information corresponding to this movement to the processing device 14, and the processing device 14 moves the pointer 18 to the right of the reference position of the display screen 16a by a distance corresponding to the relative position information.

Likewise, when the user faces upward, the nose region moves to position 48b, as shown in FIG. 5(C). The image processing device 12 inputs relative position information corresponding to this movement to the processing device 14, and the processing device 14 moves the pointer 18 above the reference position of the display screen 16a by a distance corresponding to the relative position information.

In the present embodiment, the pointer 18 can be moved simply by changing the orientation of the face, but the nose region can also be moved by moving the whole face while it is tilted. In the latter case, the nose region moves as a result of both the movement of the whole face and the tilt of the face, so the pointer 18 moves by the sum of the two movement amounts. Accordingly, for fine pointer operations or operations with a small movement amount, the user need only tilt the face from the front orientation to a different orientation; for operations with a large movement amount, the user can move the whole face while keeping it tilted in that orientation.

After step 124, the process returns to step 118.

On the other hand, if it is determined in step 118 that the face is facing the front, the process proceeds to step 126, where the position information of the current centroid of the nose region is stored in the RAM 24 and set as the reference point 48. The position information of the reference point 48 in the RAM 24 may be overwritten, or the new position information may be stored in a storage area separate from the position information already stored. In the latter case, however, the latest reference point 48 must be made identifiable, for example by setting a flag on its position information.

In step 128, the position of the pointer 18 is reset. Coordinate information corresponding to the reference position (or a reset command) is input to the processing device 14 so that the pointer 18 is placed at the reference position of the display screen 16a of the monitor 16. The pointer 18 is thereby placed at the reference position of the display screen 16a, and the process returns to step 118.

In other words, in the present embodiment, while the user is facing the front, the reference point 48 is updated along with any movement of the face, so however far the user moves the face and thereby the position of the nose region (here, its centroid), the display position of the pointer 18 does not move from the reference position of the display screen 16a. When the user turns the face away from the front, on the other hand, the display position of the pointer 18 moves with the position of the nose region.
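The behaviour of steps 118, 124, 126, and 128 can be summarised in a short sketch. It is a simplification under stated assumptions: the face part detection check of step 120 is omitted, the pointer moves one screen pixel per image pixel, and the names `TrackerState`, `track_step`, and `SCREEN_REFERENCE` are hypothetical.

```python
from dataclasses import dataclass

SCREEN_REFERENCE = (960, 540)  # assumed reference position (screen centre)


@dataclass
class TrackerState:
    reference_point: tuple  # latest reference point 48, in image coordinates
    pointer: tuple          # position of pointer 18, in screen coordinates


def track_step(state, nose, facing_front):
    """One pass of the tracking loop for the current nose centroid."""
    if facing_front:
        # Steps 126 and 128: update the reference point and reset the pointer.
        return TrackerState(reference_point=nose, pointer=SCREEN_REFERENCE)
    # Step 124: move the pointer by the nose position relative to the
    # latest reference point; the reference point itself is unchanged.
    dx = nose[0] - state.reference_point[0]
    dy = nose[1] - state.reference_point[1]
    pointer = (SCREEN_REFERENCE[0] + dx, SCREEN_REFERENCE[1] + dy)
    return TrackerState(reference_point=state.reference_point, pointer=pointer)
```

While `facing_front` is true, the pointer stays at the reference position however far the nose centroid moves, because each pass replaces the reference point with the current centroid.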

In the present embodiment, whether the user's face is facing the front with respect to the camera 10 is determined as follows.

As described above, the reference point update region 50 is determined from the reference point relative position information stored in the RAM 24 and the position of the face region 46. If the current centroid of the nose region detected by the tracking lies outside the reference point update region 50 (see also the upper part of FIG. 6(A)), the face is judged not to be facing the front; if the centroid lies within the reference point update region 50 (see also the upper part of FIG. 6(B)), the face is judged to be facing the front.
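The region test can be sketched as follows, treating the nose centroid lying inside the reference point update region 50 as the front-facing case. The sketch assumes that the position of the face region 46 is given by its top-left corner and that the update region is a square centred on the stored relative position; `half_size` is a hypothetical parameter.

```python
def update_region_center(face_origin, ref_relative):
    """Centre of update region 50: face region position plus stored offset."""
    return (face_origin[0] + ref_relative[0], face_origin[1] + ref_relative[1])


def is_facing_front(nose_centroid, face_origin, ref_relative, half_size=10):
    """True if the nose centroid lies inside the square update region."""
    cx, cy = update_region_center(face_origin, ref_relative)
    return (abs(nose_centroid[0] - cx) <= half_size
            and abs(nose_centroid[1] - cy) <= half_size)
```

Because the region is anchored to the face region rather than to the image, translating the whole face moves the region with it, whereas turning the face shifts the nose centroid out of the region.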

Following the flowchart of FIG. 3, when the recognition object 30 moves (see 30a in FIG. 6(A)) and the position of the nose region changes while the face is not facing the front and each face part is detected, as shown in FIG. 6(A), negative determinations are made in steps 118 and 120. The pointer 18 therefore moves according to the position 48a of the nose region after the change relative to the latest reference point 48.

Conversely, as shown in FIG. 6(B), when the position of the nose region changes while the face is facing the front (that is, when the whole face moves while still facing the camera 10; see 30b in FIG. 6(B)), an affirmative determination is made in step 118. The reference point 48 is therefore updated to the position 48b of the nose region after the change, and the pointer 18 is displayed at the reference position of the display screen 16a. In other words, the pointer 18 can be kept stationary at the reference position.

As described above, in the present embodiment, when the predetermined face orientation is detected, the reference point 48 is updated and the pointer 18 is controlled so as to be placed at the predetermined reference position of the display screen 16a. Consequently, even if the user changes posture, the image can easily be returned to the reference position of the display screen without the user pressing a reset button or the like by hand, and movement of the image can easily be continued after the change of posture.

In the present embodiment, the reference position of the pointer 18 is the center of the display screen 16a, but it is not limited to this. For example, it may be a position slightly offset from the center, it may be determined in advance according to the application to which the image processing system 1 is applied, or it may be specified manually by an administrator.

In the present embodiment, the image processing memory 26 has been described as a memory for temporarily storing image data representing the image captured by the camera 10, but it is not limited to this. For example, the image processing memory 26 may be used as a temporary storage area for information generated in the course of image processing. More specifically, it may be used to temporarily store the hue, saturation, and luminance information extracted when obtaining the predicted face detection region in which a face is predicted to be present in the captured image.

Image data representing the captured image can also be held, for example, within the LSI that constitutes the CPU. Therefore, if the algorithm does not refer to past image data during processing such as face part detection, the image processing memory 26 may be omitted.

In the present embodiment, the state in which the user faces the front with respect to the camera 10 has been described as the "predetermined orientation", but the predetermined orientation is not limited to this and may be an orientation other than facing the front.

The monitor 16 that displays the pointer 18 may be a monitor connected to a personal computer or the like, or it may be a large monitor installed on the street for purposes such as advertising and viewable by passers-by.

In the latter case, as shown in FIG. 7 for example, the display screen 16a of the monitor 16 may be made up of a plurality of screen regions 17, with a reference position set individually for each of the screen regions 17. In this case, the reference position at which the pointer 18 is placed may be chosen, for example, as follows.

A table in which reference position information and position information indicating a face region position are registered in association with each of a plurality of pieces of identification information identifying the screen regions is stored in advance in the ROM 22.

Then, when a face orientation facing the front with respect to the camera 10 is detected, the screen region corresponding to the position information indicating the position of the user's face region 46 at that time is extracted from the plurality of screen regions by referring to the table. Information indicating the reference position stored in the table for the extracted screen region is input to the processing device 14, and the display is controlled so that the pointer 18 is placed at that reference position.

As a result, if the table associates each face region position with the screen region at approximately the same position, then when face detection finds the user's face region 46 near the right edge of the image captured by the camera 10 and the user is detected to be facing the front, the pointer 18 can be placed at the reference position of the rightmost screen region. This keeps the pointer 18 from being placed at the reference position of a screen region far from the user's face, and prevents the user from losing sight of the pointer 18.
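A minimal sketch of such a table lookup follows. The specification does not give the table contents, so every value here is hypothetical: a 640-pixel-wide captured image divided into three horizontal ranges, a 1920x1080 display divided into three screen regions, and the region identifiers.

```python
# Hypothetical table: screen region id -> face x-range in the captured image
# and the reference position of that screen region on the display.
REGION_TABLE = [
    {"id": "left",   "face_x_range": (0, 213),   "reference": (320, 540)},
    {"id": "center", "face_x_range": (213, 427), "reference": (960, 540)},
    {"id": "right",  "face_x_range": (427, 640), "reference": (1600, 540)},
]


def reference_position_for_face(face_x, table=REGION_TABLE):
    """Return the reference position of the screen region matching face_x.

    face_x is the horizontal position of face region 46 in the captured
    image at the moment the front-facing orientation is detected.
    Returns None if no table entry matches.
    """
    for row in table:
        low, high = row["face_x_range"]
        if low <= face_x < high:
            return row["reference"]
    return None
```

A face region detected near the right edge of the captured image (for example, `face_x = 600`) selects the reference position of the rightmost screen region.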

In the above embodiment, the face part region 44 used to operate the pointer 18 and the face part region 44 on which the reference point update region 50 used to detect the face orientation is based are both the nose region, but this is not a limitation. The two may be different face part regions 44. For example, the face part region 44 used to operate the pointer 18 may be the nose region, while the face part region 44 on which the reference point update region 50 is based may be the mouth region. In this case, for example, in step 114, information on the position of the centroid of the detected mouth region relative to the face region 46 (hereinafter, face part relative position information) is stored and held in the RAM 24, and the reference point update region 50 may be a square region of predetermined size whose centroid is the position determined from the relative position indicated by the face part relative position information with respect to the latest detected face region 46. Then, in step 118, whether the user's face is facing the front may be determined according to whether the centroid of the latest detected mouth region lies outside the reference point update region 50. Meanwhile, the reference point 48 used to move the pointer 18 remains the centroid of the nose region and can be controlled in the same way as in the above embodiment.

Furthermore, the reference point update region 50 used to detect the face orientation may be a region of predetermined size whose centroid is the midpoint between the positions (for example, the centroids) of two different face part regions 44.

The reference point update region 50 need not be square; it may be rectangular or circular.
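These two variations can be sketched together: a region centred on the midpoint of two face part positions, tested with either a rectangular or a circular shape. The function names and the sizes used in the usage note are hypothetical.

```python
import math


def midpoint(p, q):
    """Midpoint between the positions of two face part regions 44."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def in_rectangle(point, center, half_width, half_height):
    """Containment test for a rectangular (or square) update region."""
    return (abs(point[0] - center[0]) <= half_width
            and abs(point[1] - center[1]) <= half_height)


def in_circle(point, center, radius):
    """Containment test for a circular update region."""
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= radius
```

For instance, with face part centroids at (100, 120) and (100, 160), the region is centred on (100, 140). A point at (108, 148) then lies inside a square of half-size 10 but outside a circle of radius 10, so the choice of shape affects how strictly the orientation is judged near the corners.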

Furthermore, the technique for detecting the face orientation is not limited to the example described above. For example, the technique described in JP 2003-141551 A may be used. Briefly, this is a face orientation calculation method that takes as input image data obtained by photographing a person's face with a camera and either compares the image data with a plurality of face orientation templates for which face orientation information is stored in advance, or extracts feature points from the image data and calculates the face orientation from their positional relationship. Taking one rotation axis in three-dimensional space as the central axis of the body, the method outputs the face orientation as rotation angles ψ, θ, and φ, where ψ is the angle of rotation about the central axis and θ and φ are the angles of rotation about the other two axes. The most frequent face orientation in the input image data is defined as the front orientation, and 0 is output for all three rotation angles ψ, θ, and φ for a face in that orientation. For a face rotated only about the central axis, the method outputs the rotation angle ψ, and values close to 0 for the other two rotation angles θ and φ. Accordingly, if all the rotation angles are 0, it can be determined that the user is facing the front; otherwise, it can be determined that the face is not facing the front.
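The final decision from the three rotation angles can be sketched as below. The text describes an exact test (all angles equal to 0); the `tol_deg` tolerance is an added assumption, since estimated angles are unlikely to be exactly zero in practice, and it defaults to the exact test.

```python
def is_front_from_angles(psi, theta, phi, tol_deg=0.0):
    """Front-facing decision from rotation angles (in degrees).

    psi        : rotation about the central axis of the body
    theta, phi : rotations about the other two axes
    tol_deg    : hypothetical tolerance; 0.0 reproduces the exact test
    """
    return all(abs(angle) <= tol_deg for angle in (psi, theta, phi))
```

With the default tolerance, only (0, 0, 0) counts as facing the front; a call such as `is_front_from_angles(2.0, 1.0, 0.0, tol_deg=5.0)` accepts small deviations.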

In the above embodiment, the camera 10 and the image processing device 12 have been described as separate devices, but the image processing system 1 may be configured as a module in which the camera 10 and the image processing device 12 are integrated. Further, this module may have a connection interface in common with the processing device 14 and be attachable to and detachable from the processing device 14 through that interface. More specifically, the module and the processing device 14 may each have, for example, a USB (Universal Serial Bus) interface, so that connecting them through the USB interface enables the function of imaging the user's face and operating the pointer.

The image processing system 1 according to the present embodiment can be applied to a variety of applications. For example, it may be applied to an application that displays advertisements, signs, and the like on the large street monitor mentioned above. In that case, for example, when it is detected that a user has approached the monitor, the face detection processing program shown in FIG. 3 is started and the pointer 18 is made to appear on the display screen 16a; the user then moves the pointer 18 by moving the face as described above. The system can also be applied, for example, to an application in which the user moves the pointer 18 to select a type of advertisement displayed on the display screen 16a and thereby switches the advertisement shown.

With such a configuration, even a user whose hands are full, for example with shopping, can operate the pointer 18 and switch the display by moving the face. The user can therefore view the desired advertisement without using the hands, which greatly improves operability.

The image processing system 1 of the present embodiment is also useful when a personal computer is operated by a person with limited use of the hands. Because the pointer 18 can be moved simply by moving the face, such a person can easily operate a personal computer or the like.

The image processing system 1 can be applied not only to designation operations, in which the pointer 18 is moved to designate a coordinate position, but also, for example, to gesture operations. A gesture operation is an operation in which the user moves the face so that the pointer 18 traces a predetermined trajectory, thereby inputting a command: for example, drawing one circle with the pointer 18 for a click, or two circles for a double click. In this way, the image processing system 1 can be applied to a variety of applications.
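One possible way to recognise the circle gestures mentioned above is to count how many full turns the pointer trajectory makes around its own centroid. This is a sketch of a swapped-in technique, not a method described in the specification; it assumes the trajectory is sampled densely enough that consecutive samples are less than half a turn apart, and the function names and the 0.05 tolerance are hypothetical.

```python
import math


def count_circles(points):
    """Count full turns of a pointer trajectory around its centroid."""
    if len(points) < 3:
        return 0
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    total = 0.0
    prev = math.atan2(points[0][1] - cy, points[0][0] - cx)
    for x, y in points[1:]:
        angle = math.atan2(y - cy, x - cx)
        delta = angle - prev
        # Unwrap so that each step is the shortest signed rotation.
        if delta > math.pi:
            delta -= 2 * math.pi
        elif delta < -math.pi:
            delta += 2 * math.pi
        total += delta
        prev = angle
    # Small tolerance so a turn that closes with rounding error still counts.
    return int(abs(total) / (2 * math.pi) + 0.05)


def gesture_command(points):
    """Map one circle to a click and two circles to a double click."""
    return {1: "click", 2: "double_click"}.get(count_circles(points))
```

A trajectory that does not complete a full turn, such as a straight sweep of the pointer, yields no command and can be treated as an ordinary designation operation.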

DESCRIPTION OF SYMBOLS
1 Image processing system
10 Camera
12 Image processing device
14 Processing device
16 Monitor
16a Display screen
17 Screen region
18 Pointer
20 CPU
22 ROM
24 RAM
26 Image processing memory
30 Recognition object
40 Predicted face detection region
42 Predicted face part region
44 Face part region
46 Face region
48 Reference point
50 Reference point update region

Claims

An image processing apparatus comprising:
feature detection means for detecting a feature of a user's face from a captured image of the user;
orientation detection means for detecting the orientation of the user's face from the captured image;
storage processing means for storing, each time a predetermined face orientation is detected by the orientation detection means, the position of the feature detected by the feature detection means at that time in storage means so that the position is identifiable as the latest reference point;
first control means for controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature detected by the feature detection means relative to the latest reference point stored in the storage means; and
second control means for controlling display of the image so that, when the predetermined face orientation is detected by the orientation detection means, the image is arranged at a predetermined reference position on the display screen.

The image processing apparatus according to claim 1, wherein the display screen is made up of a plurality of predetermined screen regions, and the reference position is set individually for each of the plurality of screen regions, and
the second control means controls display of the image so that the image is arranged at the reference position set for the screen region, among the plurality of screen regions, that corresponds to the position of the user's face when the predetermined face orientation is detected by the orientation detection means.

An image processing system comprising:
imaging means for imaging a user;
feature detection means for detecting a feature of the user's face from a captured image captured by the imaging means;
orientation detection means for detecting the orientation of the user's face from the captured image;
storage processing means for storing, each time a predetermined face orientation is detected by the orientation detection means, the position of the feature detected by the feature detection means at that time in storage means so that the position is identifiable as the latest reference point;
first control means for controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature detected by the feature detection means relative to the latest reference point stored in the storage means; and
second control means for controlling display of the image so that, when the predetermined face orientation is detected by the orientation detection means, the image is arranged at a predetermined reference position on the display screen.

An image processing method comprising:
detecting a feature of a user's face from a captured image of the user, and detecting the orientation of the user's face;
storing, each time a predetermined face orientation of the user is detected from the captured image, the position of the feature detected at that time in storage means so that the position is identifiable as the latest reference point;
controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature relative to the latest reference point stored in the storage means; and
controlling display of the image so that, when the predetermined face orientation is detected, the image is arranged at a predetermined reference position on the display screen.

A program for causing a computer to function as:
feature detection means for detecting a feature of a user's face from a captured image of the user;
orientation detection means for detecting the orientation of the user's face from the captured image;
storage processing means for storing, each time a predetermined face orientation is detected by the orientation detection means, the position of the feature detected by the feature detection means at that time in storage means so that the position is identifiable as the latest reference point;
first control means for controlling display of a predetermined image displayed on a display screen so that the image moves based on the position of the feature detected by the feature detection means relative to the latest reference point stored in the storage means; and
second control means for controlling display of the image so that, when the predetermined face orientation is detected by the orientation detection means, the image is arranged at a predetermined reference position on the display screen.