JP2011166305A

JP2011166305A - Image processing apparatus and imaging apparatus

Info

Publication number: JP2011166305A
Application number: JP2010024623A
Authority: JP
Inventors: Masanori Murakami; 正典村上; Takashi Tsujimura; 貴辻村; Yoshifumi Tochi; 佳史土地; Kenji Nakamura; 健次中村
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2010-02-05
Filing date: 2010-02-05
Publication date: 2011-08-25

Abstract

PROBLEM TO BE SOLVED: To specify and track the head of a subject person even when the face of the subject person cannot be detected in a moving image.
SOLUTION: The image processing apparatus includes: a face detection unit 21 for detecting the face of the subject person from captured image data obtained by an imaging unit; a memory 7 for storing head information including the face of the subject person detected by the face detection unit 21; and a head tracking unit 22 for continuously detecting the head of the subject person, including the part from the profile to the back of the head, in the present captured image data on the basis of the head information stored in the memory 7.
COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to an image processing apparatus and an imaging apparatus that track a subject appearing in a moving image, and more particularly to a technique for detecting and tracking the head of a subject when the subject's face cannot be detected.

In recent years, techniques for detecting and tracking a human face in a moving image have been used in digital still cameras and video cameras. A typical face tracking process runs as follows. First, human faces are detected in the image by using facial features common to all people, and the face regions are specified. Next, a tracking target is selected from the detected faces. Once the tracking target is selected, face detection is executed around the position where the target face was detected in the previous pass. Tracking is then achieved by checking whether each face detected there is the target face. Improvements to this face tracking process have been proposed, as shown in Patent Documents 1 to 3 below.
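The step of re-running face detection "around the position where the target face was detected in the previous pass" can be sketched as a search window computed from the previous face box. This is only an illustration: the function name `search_window`, the `(x, y, w, h)` box layout, and the fixed pixel `margin` are assumptions, not details given in the text.

```python
def search_window(prev_box, margin, frame_w, frame_h):
    """Expand the previous face box by `margin` pixels and clamp it to the frame.

    prev_box is (x, y, w, h) in pixels (hypothetical layout).
    Returns the window as (x, y, w, h).
    """
    x, y, w, h = prev_box
    left = max(0, x - margin)
    top = max(0, y - margin)
    right = min(frame_w, x + w + margin)
    bottom = min(frame_h, y + h + margin)
    return (left, top, right - left, bottom - top)
```

Face detection for the current frame would then be restricted to the returned window instead of scanning the whole image.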

For example, Patent Document 1 describes a technique that extracts the target face from imaging data, detects a motion vector of the extracted face, and, based on that motion vector, tracks the target face so that it stays at a fixed position in the image (for example, the image center).
Patent Document 2 describes a technique that detects the target face by a combination of personal recognition, face recognition, and skin color recognition, and, when the target cannot be recognized by that combination, tracks and detects the target by voice direction recognition or position prediction from the movement direction.
Patent Document 3 describes a technique that tracks the target subject by combining the face detection result with information about the area around the face, mainly the part corresponding to the body (for example, the color and shape of clothes).

Patent Document 1: JP 2008-301162 A
Patent Document 2: JP 2004-283959 A
Patent Document 3: JP 2007-42072 A

However, the techniques described in Patent Documents 1 to 3 have the following disadvantages.
First, with the technique described in Patent Document 1, tracking cannot be performed when no face is detected, so the face orientations that can be tracked are limited to frontal ones.
With the technique described in Patent Document 2, tracking can continue through voice direction recognition or position prediction from the movement direction when the face cannot be detected, but the target is likely to be lost when it makes no sound or when the environment is noisy. Moreover, position prediction from the movement direction cannot detect the target's head or its size.

With the technique described in Patent Document 3, tracking fails when the part corresponding to the body, which is away from the face, is hidden behind an obstacle or lies outside the screen. In addition, the color of the body part (clothes) can differ greatly for the same person from day to day or even within a single day, so tracking accuracy depends heavily on that color. Furthermore, because the way the target size is determined differs greatly between face-based tracking and color-based tracking, the accuracy with which the target's size can be followed continuously is also a problem.

The present invention has been made in view of the above points, and its object is to make it possible to specify and track the head of a target person even when the target person's face cannot be detected in a moving image.

To solve the above problem, the present invention includes: a face detection unit that detects the face of a target person from imaging data obtained from an imaging unit; a storage unit that stores head information including the face of the target person detected by the face detection unit; and a head tracking unit that, based on the head information stored in the storage unit, continues to detect the head of the target person in the current imaging data, including the range from the profile to the back of the head.

With this configuration, the head is detected based on head information that includes the target person's face, so once the face has been detected, the head can be detected whichever direction the face subsequently turns.

According to the present invention, the head of the target person can be tracked stably and accurately even when the target person's face cannot be detected in the moving image.

The drawings, each according to an embodiment of the present invention, show the following, in order:
A block diagram illustrating the configuration of the imaging apparatus (FIG. 1).
A flowchart showing an example of processing by the face detection unit and the head tracking unit.
An example of a frame illustrating feature point extraction.
An example of a frame illustrating color information extraction.
A flowchart showing processing by the feature point tracking unit.
An example of a frame processed by the feature point tracking unit.
A flowchart showing an example of processing by the color tracking unit.
An example of a frame processed by the color tracking unit.
An example of a color histogram in a frame.
A flowchart showing an example of display frame change processing.
An example of a frame when the determination processing in the display frame change processing is performed.
A flowchart showing an example of the comparison and display frame change processing.
An example of a frame in which the head region is smaller or larger than the reference display frame in the comparison and display frame change processing.
A flowchart showing an example of processing by the optical system driving unit.
An example of a diagram illustrating the distance relationship between the imaging apparatus and the target person's face.
An example of the frame (image) displayed on the display unit with and without use of the optical system driving unit.

Hereinafter, an embodiment of the present invention will be described in the following order with reference to the accompanying drawings.
1. Configuration of an imaging apparatus that performs head tracking processing
2. Flow of head tracking processing
3. Processing in the feature point tracking unit
4. Processing in the color tracking unit
5. Display control processing that displays the head region in the reference display frame

[1. Configuration of an imaging apparatus that performs head tracking processing]
The configuration of an imaging apparatus in an example of an embodiment of the present invention will be described with reference to FIG. 1. Any apparatus with a video shooting function can serve as the imaging apparatus, for example a digital still camera or a video camera.

FIG. 1 is a block diagram showing the configuration of an imaging apparatus according to an example of an embodiment of the present invention.
The imaging apparatus 1 includes a control unit 2, an image RAM 3, a display unit 4, an operation unit 5, an image processing unit 6, a storage unit 7, an external interface 8, an optical system 9, an optical system driving unit 10, an image sensor 11, a signal processing unit 12, a pan/tilt driving unit 13, a data bus 14, a face detection unit 21, and a head tracking unit 22. Data is exchanged between the units via the data bus 14.

Each component of the imaging apparatus 1 is described below.
First, the control unit 2 is an arithmetic control device and includes, as an example, a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory), none of which are shown. The control unit 2 is connected to each unit in the imaging apparatus 1 via the data bus 14, over which control signals are transmitted. The control unit 2 communicates with each unit through the data bus 14 and controls the processing of each unit.

The image RAM 3 is an example of a work memory (storage unit) that temporarily stores imaging data. Imaging data is passed between the units via the image RAM 3. This example is described on the assumption that every unit passes imaging data via the image RAM 3, but the configuration is not limited to this. For example, the display unit 4, the image processing unit 6, and the face detection unit 21 described later may be able to receive the imaging data output from the signal processing unit 12 directly, without going through the data bus 14.
The image RAM 3 may also store the feature point information, color information, and imaging data information handled in the processing of the face detection unit 21 and the head tracking unit 22, both described later.

The display unit 4 displays imaging data output by the signal processing unit 12, the image processing unit 6, or the like, as well as imaging data processed by the head tracking unit 22 described later. The display unit 4 is used as a viewfinder of the imaging apparatus 1 and also as a monitor for images reproduced from the storage unit 7. For example, an LCD (Liquid Crystal Display) can be used as the display unit 4.

The operation unit 5 generates an operation signal corresponding to a user's operation input and operates each unit via the data bus 14. It consists of a touch panel, direction keys, buttons, and the like. For example, the user can operate the operation unit 5 to designate the face of a person shown on the display unit 4, and then shoot while the apparatus automatically tracks that face so that it always remains displayed on the display unit 4.

The image processing unit 6 receives the imaging data output from the signal processing unit 12 or the imaging data stored in the image RAM 3, compresses and encodes it, and outputs it to the storage unit 7 as a moving image or still image data file. The image processing unit 6 also decodes imaging data files read from the storage unit 7 described later and supplies the decoded data to the display unit 4 via the image RAM 3.

The storage unit 7 stores the imaging data encoded and generated by the image processing unit 6, together with a plurality of reference face data covering various face angles. As the storage unit 7, for example, a drive for a portable recording medium such as a magnetic tape or an optical disc, or an HDD (Hard Disc Drive), can be used. The storage unit 7 may also store the feature point information, color information, and imaging data information handled in the processing of the face detection unit 21 and the head tracking unit 22, both described later.

The external interface 8 is configured so that, when an external device (not shown) such as a personal computer is connected, data is read from the storage unit 7 and supplied to that external device.

The optical system 9 is a light incident mechanism that includes a lens, a diaphragm, and the like (not shown), captures light from the desired subject, and directs it onto the image sensor 11. Although not shown, the optical system 9 may further include a shutter mechanism, a zoom mechanism, an automatic focus adjustment mechanism, an automatic exposure adjustment mechanism, an infrared (IR) cut filter, an optical low-pass filter (LPF), and the like.

The optical system driving unit 10 drives the optical system 9 under the control of the control unit 2, based on an operation signal from the operation unit 5 or on the zoom value of the display frame output from the head tracking unit 22 described later. The optical system driving unit 10 also drives the optical system 9 in response to an operation signal from the operation unit 5.

The image sensor 11 photoelectrically converts the incident light output from the optical system 9 and outputs the resulting electrical signal to the signal processing unit 12. Detected blur can be corrected optically by moving or tilting a blur correction lens, deforming or tilting a blur correction prism, moving the image sensor 11, or similar means.

The signal processing unit 12 applies various kinds of signal processing to the electrical signal output from the image sensor 11 and temporarily stores the processed imaging data in the image RAM 3. It also supplies the imaging data to the storage unit 7, the external interface 8, the image processing unit 6, and the face detection unit 21 described later. The signal processing performed by the signal processing unit 12 includes noise reduction, level correction, A/D conversion, and color correction. In addition, the signal processing unit 12 performs various kinds of image processing on images input from the other units in accordance with instructions from the control unit 2.
The optical system 9, the optical system driving unit 10, the image sensor 11, and the signal processing unit 12 can together be regarded as the imaging unit.

The pan/tilt driving unit 13 controls the operation of the pan/tilt mechanism in the pan and tilt directions. The pan/tilt driving unit 13 is driven either by automatic control from the control unit 2 or by an operation signal from the operation unit 5.

The data bus 14 is a common path used to transfer data and control commands among the units that make up the imaging apparatus 1.

The face detection unit 21 detects human faces in the imaging data sent from the signal processing unit 12 and in the imaging data stored in the storage unit 7 or elsewhere, and identifies the face of the target person among the detected faces. The target person is the person designated as the tracking target. As an example, a dedicated processor such as a DSP (Digital Signal Processor) may be used as the face detection unit 21.
The face detection unit 21 also includes a face orientation determination unit 21A. Based on the identified face of the target person, the face orientation determination unit 21A determines which way the face is turned, such as front, left, or right. This processing may, for example, detect how far the face is turned sideways by template matching against a reference template corresponding to the outline of a typical full profile.

The head tracking unit 22 consists of a feature point tracking unit 23 and a color tracking unit 24, and tracks the head of the target person, including the face, based on the face detected by the face detection unit 21.
More specifically, when the target person's face is not detected in the imaging data, the feature point tracking unit 23 tracks the target person's head using the coordinates and luminance values of the feature points obtained by feature point extraction. When the target person's face is not detected in the input imaging data, the color tracking unit 24 tracks the target person's head using the color information obtained by color extraction. As an example, a dedicated processor such as a DSP (Digital Signal Processor) may be used as the head tracking unit 22.
Details of the processing by the feature point tracking unit 23 and the color tracking unit 24 will be described later.

The feature point tracking unit 23 includes a feature point information extraction unit 23A and a vector calculation unit 23B.
The feature point information extraction unit 23A identifies regions such as the eyes, nose, and mouth in the target person's face and extracts feature point information for the identified regions. Details of the processing by the feature point information extraction unit 23A will be described later.
The feature point information includes the coordinates and luminance values in a given region.

The vector calculation unit 23B calculates the motion vectors of the feature points on the target person's face, using the coordinates in the feature point information of the frame immediately preceding the current imaging data (or of another recent frame) together with the feature point information of the current imaging data.
Details of the processing by the vector calculation unit 23B will be described later.
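As a rough illustration of what the vector calculation unit 23B computes, the sketch below derives one displacement vector per feature point from its coordinates in an earlier frame and in the current frame. The dictionary layout keyed by feature name, and the helper `mean_vector`, are assumptions made for the example; the text does not specify how the vectors are stored or combined.

```python
def motion_vectors(prev_points, curr_points):
    """Return the displacement (dx, dy) of each feature point found in both frames.

    prev_points and curr_points map a feature name (e.g. "eye") to (x, y)
    pixel coordinates. This layout is hypothetical.
    """
    vectors = {}
    for name, (px, py) in prev_points.items():
        if name in curr_points:
            cx, cy = curr_points[name]
            vectors[name] = (cx - px, cy - py)
    return vectors


def mean_vector(vectors):
    """Average the per-feature vectors into one head motion estimate, or None."""
    if not vectors:
        return None
    n = len(vectors)
    return (sum(dx for dx, _ in vectors.values()) / n,
            sum(dy for _, dy in vectors.values()) / n)
```

Feature points that are missing from the current frame are simply skipped, so the estimate degrades gracefully as the face turns away.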

The color tracking unit 24 includes a color information extraction unit 24A. The color information extraction unit 24A extracts color information for a region (the occipital region) located a predetermined distance away from the region containing the target person's face (the face region).
Details of the processing by the color information extraction unit 24A will be described later.

The color information is, as an example, a color histogram of the extraction region. The color histogram is the distribution of the frequency (number of pixels) of the color difference signals in the extraction region. A color histogram is obtained for each of the color difference signals Cr and Cb.
The colors used to build the color histogram are not limited to the color difference signals Cr and Cb; other color components such as R, G, and B may be used. Nor is the color information limited to a color histogram; other color-related information may of course be used. In this example, the occipital region refers to the region where the hair on the back of the target person's head is located, and its size is chosen as appropriate.
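The color histogram described above can be sketched as a per-channel pixel count over the extraction region. This is a minimal sketch under stated assumptions: the image is a nested list of `(cr, cb)` tuples with 8-bit values, the region is `(x, y, w, h)`, and the number of bins (`bins=16`) is an arbitrary choice not taken from the text.

```python
def color_histogram(pixels, region, bins=16):
    """Count pixels per Cr bin and per Cb bin inside a rectangular region.

    pixels: 2D list (rows) of (cr, cb) tuples, each value in 0..255 (assumed).
    region: (x, y, w, h) of the extraction region, e.g. the occipital region.
    Returns (hist_cr, hist_cb), two lists of length `bins`.
    """
    x0, y0, w, h = region
    hist_cr = [0] * bins
    hist_cb = [0] * bins
    for row in pixels[y0:y0 + h]:
        for cr, cb in row[x0:x0 + w]:
            # Map an 8-bit value to a bin index in 0..bins-1.
            hist_cr[cr * bins // 256] += 1
            hist_cb[cb * bins // 256] += 1
    return hist_cr, hist_cb
```

Coarse bins make the histogram less sensitive to small lighting changes, at the cost of distinguishing similar hair and background colors less well.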

[2. Flow of head tracking processing]
Processing by the face detection unit and the head tracking unit in an example of an embodiment of the present invention will be described with reference to FIGS. 2 to 4.
FIG. 2 is a flowchart of the processing by the face detection unit and the head tracking unit.

First, the face detection unit 21 reads the current frame of imaging data sent from the signal processing unit 12 and detects human faces in that frame (step S1). This processing may, for example, detect a face by template matching against a reference template corresponding to the contour of a typical face from the ears to the chin. The face detection unit 21 may also be configured to detect a face by template matching based on facial components (eyes, nose, ears, and so on).
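Template matching of the kind mentioned for step S1 can be illustrated with a brute-force search that slides a reference template over a grayscale image and keeps the position with the smallest sum of absolute differences (SAD). SAD is chosen here only for simplicity; the text does not say which similarity measure is used, and the function name and nested-list image format are assumptions.

```python
def match_template(image, template):
    """Find the best match of `template` in `image` by sum of absolute differences.

    image and template are 2D lists of grayscale values (assumed format).
    Returns (x, y, sad) of the best match, or None if the template does not fit.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            sad = sum(abs(image[y + j][x + i] - template[j][i])
                      for j in range(th) for i in range(tw))
            if best is None or sad < best[2]:
                best = (x, y, sad)
    return best
```

A detector built on this would compare the returned SAD against a threshold to decide whether a face is present at all; that threshold is left out because the text gives no value for it.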

The face detection unit 21 may also be configured to detect a vertex, such as the top of the head, by chroma key processing and to detect the face based on that vertex. It may be configured to detect a region close to skin color and treat that region as a face. It may be configured to perform supervised learning with a neural network using teacher signals and to detect face-like regions as faces. Any other existing technique may also be applied to implement the face detection processing of the face detection unit 21.

The face detection unit 21 then performs matching to find, among the faces detected in step S1, a face that matches the target person's face (step S2). For example, the face detection unit 21 determines whether any face detected in step S1 matches the target person's face registered in advance in the storage unit 7. Alternatively, the target is determined by the user operating the operation unit 5 to designate a person's face in the imaging data.
If the target person's face is not detected the first time the detection processing of step S2 is performed, the processing ends there.

Information on the head including the face found in step S2 to match the target person's face (hereinafter referred to as "head information") is stored in the storage unit 7 (step S3). It may also be stored in the image RAM 3 in addition to the storage unit 7.
The head information includes, for example, the region of the target person's face (in particular the hair portion), the contour of the target person's entire face, the positional relationship between the main parts of the face such as the eyes and nose, and the size of the face area (which reflects the distance from the camera).

The face orientation determination unit 21A then determines, based on the head information, which way the target person's face is turned, such as front, left, or right (step S4). This processing may, for example, detect how far the face is turned sideways by template matching against a reference template corresponding to the outline of a typical full profile.
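The head information and the face orientation result described above could be held in a small record like the one below. This is purely a hypothetical container for illustration: the class names, field names, and coordinate conventions are not given in the text, which only lists the kinds of information involved.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Orientation(Enum):
    """Face orientations named in the text (front, left, right)."""
    FRONT = "front"
    LEFT = "left"
    RIGHT = "right"


@dataclass
class HeadInfo:
    """Hypothetical record of the head information stored for the target person."""
    face_region: Tuple[int, int, int, int]            # (x, y, w, h) of the face
    contour: List[Tuple[int, int]] = field(default_factory=list)       # face outline points
    landmarks: Dict[str, Tuple[int, int]] = field(default_factory=dict)  # e.g. "eye", "nose"
    orientation: Optional[Orientation] = None         # filled in by orientation determination

    @property
    def face_area(self) -> int:
        """Face area in pixels; a larger area suggests a subject closer to the camera."""
        _, _, w, h = self.face_region
        return w * h
```

Keeping the orientation optional reflects the order in the text, where the head information is stored first and the orientation is determined from it afterwards.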

The feature point information extraction unit 23A extracts feature point information according to the orientation of the target person's face detected in the face orientation determination processing of step S4 (step S5). For example, when the target person's face is determined to be facing left, the regions of the left eye 33L, the nose 34L, and the mouth 35L are extracted from the left-facing face region 31L, and feature point information is extracted for each of those regions (see FIG. 3A). Similarly, when the target person's face is determined to be facing right, the regions of the right eye 33R, the nose 34R, and the mouth 35R are extracted from the right-facing face region 31R, and feature point information is extracted for each of those regions (see FIG. 3B).

Next, according to the orientation of the target person's face detected in the face orientation determination processing of step S4, the color information extraction unit 24A extracts color information for a region (the occipital region) located a predetermined distance away from the region containing the target person's face (the face region) (step S6). For example, when the target person's face is determined to be facing left, the upper right occipital region 43L at a predetermined distance from the left-facing face region 42L is recognized, and the color information of that occipital region 43L is extracted (see FIG. 4A). Similarly, when the target person's face is determined to be facing right, the upper right occipital region 43R at a predetermined distance from the right-facing face region 42R is recognized, and the color information of that occipital region 43R is extracted (see FIG. 4B).

The face orientation information, the feature point information, and the color information are then output to the image RAM 3 (step S7). These data may also be output to the storage unit 7 and the display unit 4.

When the face of the target person is not detected in the determination of step S2, the feature point tracking unit 23 tracks the feature points of the target person on the basis of the feature point information from the previous frame (step S8). In this process, the unit determines whether a feature point of the target person's face exists (step S9) and, if it does, identifies the region containing that feature point as the head of the target person. Here, the previous frame means the frame immediately before the frame currently being processed, or another recent frame, among the frames constituting the moving image.
Details of the processing by the feature point tracking unit 23 are described later.

When a feature point is found in the determination of step S9 by the feature point tracking unit 23, data such as the feature point information and the color information for the identified head of the target person are output to the image RAM 3 (step S7). These data may also be output to the storage unit 7 and the display unit 4.

When no feature point of the target person's head is detected in the determination of step S9, the color tracking unit 24 tracks the color of the back of the target person's head on the basis of the color information from the previous frame (step S10).

The color tracking unit 24 takes the color histogram acquired from the previous frame as the reference color histogram and determines whether the current frame contains a region whose color histogram (hereinafter the "extracted color histogram") is close to the reference color histogram (step S11). When a region matching the extracted color histogram exists, that region is identified as the head of the target person. When no such region exists, the unit determines that the head of the target person is not present and ends the color tracking process. In that case, for example, the color tracking unit 24 may send information to the control unit 2 indicating that the head of the target person is not present, and the display unit 4, under the control of the control unit 2, may display a message that the head of the target person was not detected.
Details of the color tracking process are described later.

After the processing by the color tracking unit 24 in step S11, data such as the color information for the head of the target person are output to the image RAM 3 (step S7). These data may also be output to the storage unit 7 and the display unit 4.

In the color tracking of the back of the head in step S10 of FIG. 2, color tracking is performed when neither face detection nor feature point tracking succeeds. Alternatively, when the face cannot be detected in step S2, the color tracking of step S10 may be performed directly, without the feature point tracking of step S8.
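The order of fallbacks in FIG. 2 can be sketched as follows. This is only an illustration: the three callables, the `Region` tuple layout, and the `use_feature_tracking` flag are hypothetical names introduced here, and each callable is assumed to return a region or `None` on failure.

```python
from typing import Callable, Optional, Tuple

# Hypothetical region layout: (x, y, width, height).
Region = Tuple[int, int, int, int]


def locate_head(
    detect_face: Callable[[], Optional[Region]],
    track_features: Callable[[], Optional[Region]],
    track_color: Callable[[], Optional[Region]],
    use_feature_tracking: bool = True,
) -> Tuple[Optional[Region], str]:
    """Try each stage in the order of FIG. 2 and report which one succeeded."""
    region = detect_face()                  # face detected? (step S2)
    if region is not None:
        return region, "face"
    if use_feature_tracking:
        region = track_features()           # feature point tracking (steps S8, S9)
        if region is not None:
            return region, "feature"
    region = track_color()                  # color tracking (steps S10, S11)
    if region is not None:
        return region, "color"
    return None, "lost"                     # head judged not present
```

Setting `use_feature_tracking=False` corresponds to the variant in which step S8 is skipped and color tracking follows a failed face detection directly.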

[3. Processing at the feature point tracking unit]
The feature point tracking process performed by the feature point tracking unit 23 is described in detail with reference to FIGS. 5 and 6. FIG. 5 is a flowchart of the processing by the feature point tracking unit 23. FIG. 6A shows an example of the previous frame, and FIG. 6B shows an example of the current frame.
This processing is executed when the face of the target person is not detected in the determination of step S2 of FIG. 2.

First, the latest (previous frame) feature point information (coordinates and luminance values) is read from the image RAM 3 (step S21). The latest feature point information is, for example, the feature point information extracted in step S5 of FIG. 2 for the frame immediately before the frame currently being processed, or for another recent frame. It may also be that of the most recent frame for which extraction succeeded. Because the latest feature point information may be stored in the storage unit 7 as well as in the image RAM 3, it may instead be read from the storage unit 7.

Next, a motion vector is calculated for each feature point included in the latest (previous frame) feature point information (steps S22 to S26). For example, when the feature points in the previous frame's feature point information are the eye and the mouth, there are two feature points, so steps S22 to S26 are executed twice.

It is determined whether the current frame contains coordinates whose luminance value is closest to the luminance value of the feature point in the previous frame (step S23).
As an example, the case where the feature point extracted from the previous frame 51L1 is an eye (the region containing the eye is hereinafter called the "eye region") is described with reference to FIG. 6. The same region as the eye region 53L1 of the previous frame 51L1 is set on the current frame 51L2 as the eye region 53L2. The current frame is then scanned, centered on the eye region 53L2, for coordinates whose luminance value is close to that of the eye region 53L1. As a result of the scan, the eye region 53L3, whose luminance value is close to that of the eye region 53L1, is detected in the current frame 51L2.

When the determination of step S9 of FIG. 2 finds a luminance value closest to that of the previous frame's feature point information, the coordinates of that luminance value and the luminance value at those coordinates are stored (step S24).
When no such luminance value is found, the process moves on to the color tracking of the back of the head (step S10 of FIG. 2).
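The luminance search of steps S23 and S24 might look like the following minimal sketch. The text does not specify the scan radius or how close a luminance value must be to count as found, so `radius` and `max_diff` are assumptions, as are the function name and the representation of the frame as a list of rows of luminance values.

```python
from typing import List, Optional, Tuple


def find_closest_luminance(
    frame: List[List[int]],
    prev_xy: Tuple[int, int],
    prev_luma: int,
    radius: int = 2,      # assumed scan radius around the previous position
    max_diff: int = 10,   # assumed limit beyond which no match is reported
) -> Optional[Tuple[int, int, int]]:
    """Return (x, y, luminance) of the closest match near prev_xy, or None."""
    px, py = prev_xy
    best = None
    best_diff = max_diff + 1
    for y in range(max(0, py - radius), min(len(frame), py + radius + 1)):
        for x in range(max(0, px - radius), min(len(frame[0]), px + radius + 1)):
            diff = abs(frame[y][x] - prev_luma)
            if diff < best_diff:
                best_diff = diff
                best = (x, y, frame[y][x])   # stored in step S24
    return best   # None means: fall back to color tracking (step S10)
```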

Next, the vector calculation unit 23B calculates a motion vector (amount and direction of movement) from the coordinates in the previous frame's feature point information and the coordinates in the current frame's feature point information (step S25). For example, the motion vector from the coordinates of the feature point in the previous frame to its coordinates in the current frame is calculated using a well-known technique such as motion vector search (ME: Motion Estimation) or block matching.
For example, as shown in FIG. 6B, the eye region 53L1 of the previous frame 51L1 has moved to the eye region 53L3 in the current frame 51L2. The motion vector 54 of the eye region between the previous frame and the current frame is calculated from this movement.

After this processing has been repeated for every feature point included in the previous frame's feature point information, the motion vectors calculated from all the feature points are averaged (step S27). For example, the average amount of movement is calculated from the amounts of movement obtained for all the feature points in steps S22 to S26. Next, the direction of movement is calculated from the directions obtained for all the feature points. The averaged motion vector is then obtained from the average amount of movement and the calculated direction.
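A minimal sketch of the averaging in step S27 follows. The text says that the amounts of movement are averaged and that a direction is derived from all the directions, but it does not say how the directions are combined; the sketch assumes a circular mean (the direction of the sum of unit vectors). The function name is hypothetical.

```python
import math
from typing import List, Tuple


def average_motion_vector(vectors: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Average per-feature motion vectors: mean magnitude plus a combined direction."""
    if not vectors:
        return (0.0, 0.0)
    magnitudes = [math.hypot(dx, dy) for dx, dy in vectors]
    mean_magnitude = sum(magnitudes) / len(vectors)
    # Assumed rule for combining directions: sum the unit vectors.
    ux = sum(dx / m for (dx, _), m in zip(vectors, magnitudes) if m > 0)
    uy = sum(dy / m for (_, dy), m in zip(vectors, magnitudes) if m > 0)
    if ux == 0 and uy == 0:
        return (0.0, 0.0)   # no movement, or directions cancel out
    angle = math.atan2(uy, ux)
    return (mean_magnitude * math.cos(angle), mean_magnitude * math.sin(angle))
```

With a single feature point, as in the eye-only example of FIG. 6B, the result is simply that feature point's own motion vector.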

On the basis of the averaged motion vector, a region in the current frame is identified and set as the new head region of the target person (step S28). For example, as shown in FIG. 6B, when the only feature point extracted in the previous frame is the eye, the motion vector 54 is the only motion vector, so the averaged motion vector is the motion vector 54. On the basis of the motion vector 54, the head region of the target person in the current frame becomes the head region 52L3.
Here, the head region means the smallest region in the frame that contains the head of the target person.
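Step S28 can be illustrated by shifting the previous head region by the averaged motion vector. The `(x, y, width, height)` layout, the function name, and the clamping to the frame boundary are assumptions made for this sketch; the text only states that the region is updated on the basis of the vector.

```python
from typing import Tuple

Region = Tuple[int, int, int, int]  # hypothetical layout: (x, y, width, height)


def shift_head_region(
    region: Region,
    vector: Tuple[float, float],
    frame_size: Tuple[int, int],
) -> Region:
    """Move the head region by the averaged motion vector, keeping it inside the frame."""
    x, y, w, h = region
    frame_w, frame_h = frame_size
    new_x = min(max(x + round(vector[0]), 0), frame_w - w)
    new_y = min(max(y + round(vector[1]), 0), frame_h - h)
    return (new_x, new_y, w, h)
```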

[4. Processing in the color tracking unit]
The processing of the color tracking unit 24 is described with reference to FIGS. 7 to 9. This processing is executed when no feature point of the target person's face is detected in step S9 of FIG. 2.
FIG. 7 is a flowchart of the processing of the color tracking unit 24. FIG. 8A shows an example of the previous frame, and FIG. 8B shows an example of the current frame. FIG. 9A shows an example of the reference color histogram of the extraction region in the previous frame, and FIG. 9B shows the extracted color histogram of the extraction region in the current frame. In FIGS. 9A and 9B, the horizontal axis represents gradation and the vertical axis represents frequency (number of pixels).

The color histogram information is acquired for each of the color difference signals Cr and Cb. Because the color tracking unit 24 processes Cr and Cb in the same way, the description below refers simply to the color difference signal.
The colors used to obtain the color histogram are not limited to the color difference signals Cr and Cb; other color components such as R, G, and B may be used. Likewise, the color information is not limited to a color histogram, and other color-related information may of course be used.

First, the color information of the previous frame is read from the image RAM 3 (step S31). This is the color information extracted for the previous frame by the color information extraction of step S6 of FIG. 2. Because it may be stored in the storage unit 7 as well as in the image RAM 3, it may instead be read from the storage unit 7.

Next, the color information of the previous frame is taken as the color reference value (step S32). The color histogram included in this color reference value is the reference color histogram.

A predetermined range that includes the head region of the previous frame is set in the current frame. This range (the color search range) is scanned with a region of the size used to calculate the reference color histogram (the search region), and the calculation of the color histogram and the matching squared error is repeated once for each search region (steps S33 to S39). For example, the same region as the color search range 62L of the previous frame 61L is set on the current frame 61B as the color search range 62B (see FIG. 8). The extracted color histogram and the matching squared error are then calculated for each search region while the color search range 62B is scanned with the search region.

First, in step S34, the extracted color histogram of the search region is calculated within the color search range 62B. Then, in step S35, the matching squared error is calculated from the extracted color histogram and the reference color histogram.
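Computing the extracted color histogram of one search region (step S34) could be sketched as below. The sketch assumes that one color difference plane (Cr or Cb) is available as a list of rows of integer values in the range 0 to `levels - 1`; the function name, the region layout, and the default of 256 levels are illustrative choices, not taken from the text.

```python
from typing import List, Tuple


def region_histogram(
    plane: List[List[int]],
    region: Tuple[int, int, int, int],   # hypothetical layout: (x, y, width, height)
    levels: int = 256,                   # assumed number of gradations
) -> List[int]:
    """Count how many pixels of the region fall on each gradation."""
    x, y, w, h = region
    hist = [0] * levels
    for row in plane[y:y + h]:
        for value in row[x:x + w]:
            hist[value] += 1
    return hist
```

The same routine would be run once on the Cr plane and once on the Cb plane, since the text states that both signals are processed identically.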

The method of calculating the matching squared error is described below with reference to FIG. 9. Here, i denotes an arbitrary gradation in the extracted color histogram (FIG. 9B) and the reference color histogram (FIG. 9A), and imax denotes the gradation at which the frequency is highest in the extracted color histogram (FIG. 9B) and the reference color histogram (FIG. 9A). The gradation range B is a range centered on imax, and its width is set as appropriate.
The frequency at each gradation of the reference color histogram is denoted StdH[i], with StdH[imax] = X when i = imax. The frequency at each gradation of the extracted color histogram is denoted GetH[i], with GetH[imax] = Y when i = imax.

The matching squared error SE is determined by Equation 1.

In step S36 of FIG. 7, the maximum frequency error peakE is calculated. The maximum frequency error peakE is the difference between the maximum frequency in the reference color histogram and the maximum frequency in the extracted color histogram.
It is determined by the following equation.
peakE = StdH[imax] - GetH[imax]
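The two error measures can be sketched as follows. Equation 1 is not reproduced in this text, so the form of SE used here, a sum of squared differences over the gradation range B, is an assumption based on the name "matching squared error". Taking imax from the reference histogram and the `half_width` parameter for B are also assumptions; only the peakE expression comes directly from the text.

```python
from typing import List, Sequence, Tuple


def gradation_range(std_h: List[int], half_width: int) -> Tuple[int, range]:
    """Return imax (assumed: peak of the reference histogram) and the range B around it."""
    i_max = max(range(len(std_h)), key=lambda i: std_h[i])
    low = max(0, i_max - half_width)
    high = min(len(std_h) - 1, i_max + half_width)
    return i_max, range(low, high + 1)


def matching_squared_error(std_h: List[int], get_h: List[int], band: Sequence[int]) -> int:
    """Assumed form of Equation 1: sum of squared differences over B."""
    return sum((std_h[i] - get_h[i]) ** 2 for i in band)


def peak_error(std_h: List[int], get_h: List[int], i_max: int) -> int:
    """peakE = StdH[imax] - GetH[imax], as given in the text."""
    return std_h[i_max] - get_h[i_max]
```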

Next, in step S37, it is determined whether the matching squared error SE is smaller than the matching squared error from the previous frame.
When SE is smaller, the extracted color histogram obtained from the current frame, the matching squared error SE, and the maximum frequency error peakE are stored in the image RAM 3 (step S38). When SE is larger, the extracted color histogram, the matching squared error SE, and the maximum frequency error peakE obtained from the previous frame are kept as they are, without being updated.

However, when no matching squared error SE exists for the previous frame, the extracted color histogram, the matching squared error SE, and the maximum frequency error peakE calculated for the current frame are stored.

After the entire color search range has been searched, it is determined whether the maximum frequency error peakE finally stored is within a predetermined threshold (step S40). The threshold can be set as appropriate, but it should not be set too large: with a large threshold, a region that does not actually contain the target person's head may be identified as the head. In the present embodiment, the threshold is set to 10% of the reference peak value as an example.

When the maximum frequency error peakE is within the predetermined threshold, the occipital region for which that peakE was calculated is set as the updated head region (step S41). When peakE is not within the threshold, the color tracking process ends (step S42).
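The whole search of steps S33 to S42 might be sketched as below: the color search range is scanned with a window, the candidate with the smallest matching squared error is kept, and the result is accepted only if its peakE is within the threshold. Several points are assumptions: the sum-of-squares form of SE (Equation 1 is not reproduced in this text), the reading of "reference peak value" as StdH[imax], the use of the absolute value of peakE, and all names and parameters.

```python
from typing import List, Optional, Tuple

Region = Tuple[int, int, int, int]  # hypothetical layout: (x, y, width, height)


def track_by_color(
    plane: List[List[int]],
    search_range: Region,
    window: Tuple[int, int],
    std_h: List[int],
    half_width: int = 2,            # assumed half-width of gradation range B
    threshold_ratio: float = 0.10,  # 10% of the reference peak, per the embodiment
) -> Optional[Region]:
    """Scan the color search range and return the best-matching region, or None."""
    sx, sy, sw, sh = search_range
    ww, wh = window
    i_max = max(range(len(std_h)), key=lambda i: std_h[i])
    band = range(max(0, i_max - half_width), min(len(std_h), i_max + half_width + 1))
    best = None  # (SE, peakE, region) of the best candidate so far
    for y in range(sy, sy + sh - wh + 1):
        for x in range(sx, sx + sw - ww + 1):
            hist = [0] * len(std_h)                       # step S34
            for row in plane[y:y + wh]:
                for value in row[x:x + ww]:
                    hist[value] += 1
            se = sum((std_h[i] - hist[i]) ** 2 for i in band)   # step S35 (assumed form)
            peak_e = std_h[i_max] - hist[i_max]                 # step S36
            if best is None or se < best[0]:                    # steps S37, S38
                best = (se, peak_e, (x, y, ww, wh))
    if best is None:
        return None
    if abs(best[1]) <= threshold_ratio * std_h[i_max]:          # step S40
        return best[2]                                          # step S41
    return None                                                 # step S42
```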

Next, the display frame changing process in the color tracking unit 24 is described with reference to FIGS. 10 to 13. So that the head of the target person is always displayed at a constant size, this process calculates the rate of change of the size of the head's display frame in the current frame relative to the size of the reference display frame. According to this rate of change, a zoom value for the head of the target person is determined so that the size of the head's display frame in the current frame stays constant, matching the reference display frame.
The reference display frame is the display size of the head when the target person was first designated, or a display size of the head designated in advance.

FIG. 10 is a flowchart of the display frame changing process in the color tracking unit 24, and FIG. 11 shows examples of frames in the display frame changing process. FIG. 12 is a flowchart of the comparison and display frame changing process. FIG. 13A shows an example of a frame in which the head region is smaller than the reference display frame, and FIG. 13B shows an example of a frame in which the head region is larger than the reference display frame.

The display frame changing process is executed when the color tracking unit 24 has succeeded in color tracking.

First, as shown in FIG. 11A, the color search range 81 is scanned with a region 83A obtained by scaling the head region of the previous frame by 1/2 both vertically and horizontally. It is then determined whether the current frame contains a region matching the extracted color histogram for this half-size region (step S51).

Next, as shown in FIG. 11B, the color search range 81 is scanned with a region 83B obtained by scaling the head region of the previous frame by 3/2 both vertically and horizontally. It is then determined whether the current frame contains a region matching the extracted color histogram for this enlarged region (step S52).

As shown in FIG. 11C, the color search range 81 is scanned with a region 83C equal in size to the head region of the previous frame, and it is determined whether the current frame contains a region matching the extracted color histogram for the head region of the previous frame (step S53).
The scale factors of 1/2 and 3/2 applied to the previous frame's head region in steps S51 to S53 are only examples; various other factors, such as 2/3 or 3/4, may be used. Doing so allows the rate of change of the head region's display frame to be calculated more accurately.

The results of the determinations in steps S51 to S53 are then compared, the rate of change of the head region's display frame is calculated from the comparison, and that rate of change is used to determine a zoom value that matches the reference display frame (step S54).

The comparison of the search results and the display frame changing process in step S54 of FIG. 10 are described in detail with reference to FIGS. 12 and 13.
First, from the results of the determinations in steps S51 to S53, the proportion of the reference display frame occupied by the head region of the target person in the current frame is calculated (step S61). For example, as shown in FIG. 13A, this is the proportion of the reference display frame 91 occupied by the head region 92 in the current frame.

The method of calculating the proportion of the reference display frame occupied by the head region of the target person in the current frame is as follows.
Here, GetH_j[i] is the frequency at each gradation in the gradation range B, centered on the gradation with the highest frequency, in the extracted color histogram obtained when the previous frame's head region scaled by the factor j is used as the search region. GetP_j is the sum of the frequencies over the gradations of the gradation range B in that extracted color histogram. The subscript j is the vertical and horizontal scale factor applied to the previous frame's head region, and i is an arbitrary gradation within the gradation range B.
The sums of the frequencies over the gradation range B, GetP_1/2, GetP_3/2, and GetP_1, are obtained by Equations 2 to 4.

GetP_1/2 is the sum of the frequencies over the gradation range B when a region obtained by scaling the previous frame's head region by 1/2 vertically and horizontally is used. GetP_3/2 is the corresponding sum for a region obtained by scaling the previous frame's head region by 3/2 vertically and horizontally. GetP_1 is the corresponding sum for a region the same size as the previous frame's head region.
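Equations 2 to 4 are not reproduced in this text. Read directly from the definitions of GetH_j[i] and GetP_j above, each of them would presumably be a sum over the gradation range B, as in the following reconstruction. This is an inference from the surrounding definitions, not the published equations.

```latex
\mathrm{GetP}_{j} \;=\; \sum_{i \in B} \mathrm{GetH}_{j}[i],
\qquad j \in \left\{ \tfrac{1}{2},\; 1,\; \tfrac{3}{2} \right\}
```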

Next, in step S62 of FIG. 12, it is determined whether GetP_1 is less than 7/8 of GetP_1/2 and GetP_3/2 is less than 7/18 of GetP_1/2.

When GetP_1 is less than 7/8 of GetP_1/2 and GetP_3/2 is less than 7/18 of GetP_1/2, the rate of change of the head region's size in the current frame relative to the size of the reference display frame is calculated (step S63). In this case the rate of change is, for example, 7/8.
As shown in FIG. 13A, when the head is 7/8 the size of the previous frame's head region, the sums of the frequencies obtained from the color histograms of the respective regions ideally stand in the following ratio.
GetP_1/2 : GetP_1 : GetP_3/2 = 1 : 7/8 : 7/18

When the condition of step S62 is not satisfied, it is determined whether GetP_3/2 is at least 1/2 of GetP_1 and GetP_3/2 is at least 1/2 of GetP_1/2 (step S64).

When GetP_3/2 is at least 1/2 of GetP_1 and GetP_3/2 is at least 1/2 of GetP_1/2, the rate of change of the head region's size in the current frame relative to the size of the reference display frame is calculated (step S65). In this case the rate of change is, for example, 9/8.
As shown in FIG. 13B, when the head is 9/8 the size of the previous frame's head region, the sums of the frequencies obtained from the color histograms of the respective regions ideally stand in the following ratio.
GetP_1/2 : GetP_1 : GetP_3/2 = 1 : 1 : 1/2

The ratios of the frequency sums over the gradation range B given above are ideal values. They vary somewhat with conditions such as the state of the captured image data and the choice of the gradation range B, so the values used in the calculation should be set as appropriate. These ratios are used to calculate the rate of change of the size of the head's display frame, and the size of the head's display frame during color tracking is changed accordingly.

When the condition of step S64 is not satisfied (that is, unless GetP_3/2 is at least 1/2 of both GetP_1 and GetP_1/2), the size of the display frame for the target person's head is not changed and remains as it was in the previous frame (step S66).
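The decisions of steps S62 to S66 can be written compactly as below. The thresholds and the resulting rates of 7/8 and 9/8 are the example values from the text, which also notes that they should be tuned in practice; the function and parameter names are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def display_frame_change_rate(
    get_p_half: int,           # GetP_1/2
    get_p_one: int,            # GetP_1
    get_p_three_halves: int,   # GetP_3/2
) -> Optional[Fraction]:
    """Return the rate of change of the head display frame, or None for no change."""
    if (get_p_one < Fraction(7, 8) * get_p_half
            and get_p_three_halves < Fraction(7, 18) * get_p_half):   # step S62
        return Fraction(7, 8)                                          # step S63
    if (get_p_three_halves >= Fraction(1, 2) * get_p_one
            and get_p_three_halves >= Fraction(1, 2) * get_p_half):    # step S64
        return Fraction(9, 8)                                          # step S65
    return None                                                        # step S66
```

`Fraction` is used so that the comparisons with 7/8, 7/18, and 1/2 are exact rather than subject to floating-point rounding.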

According to the rate of change obtained by the display frame changing process of the color tracking unit 24, a zoom value for the head region of the target person is determined so that the size of the head region in the current frame stays constant, matching the reference display frame.
This zoom value may be a control signal with which the control unit 2 controls the optical system driving unit 10.

[5. Display control processing for displaying the head region in the reference display frame]
The display control processing by which the optical system driving unit 10 displays the head region of the target person in a constant display frame is described with reference to FIGS. 14 to 16. FIG. 14 is a flowchart of the display control processing, and FIG. 15 shows an example of the distance relationship between the imaging apparatus 1 and the face of the target person. FIG. 16 shows examples of the frames (images) displayed on the display unit 4 with and without the use of the optical system driving unit 10.

First, it is determined whether the control unit 2 has received the zoom value signal calculated by the color tracking unit 24 (step S71). When the signal has been received, the control unit 2 controls the optical system driving unit 10 according to the zoom value signal and adjusts the frame (image) so that the head is displayed at the same size as the reference display frame (step S72). When no zoom value signal has been received, it is determined whether the head region of the target person in the current frame is equivalent to the reference display frame (step S73).
When the head region of the target person is equivalent to the reference display frame, the process ends without controlling the optical system driving unit 10.

When it is determined in step S73 that the head region of the target person is not equivalent to the reference display frame, the control unit 2 calculates the difference between the reference display frame and the head region in the current frame. It then controls the optical system driving unit 10 according to the calculated difference and adjusts the frame (image) so that the head is displayed at the same size as the reference display frame (step S74).
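The branching of steps S71 to S74 can be sketched as follows. The text speaks of a "difference" between the reference display frame and the head region without giving a formula; the sketch assumes sizes are compared as linear dimensions and expresses the correction as a magnification factor. The `tolerance` used to decide that the two sizes are "equivalent" and all names are assumptions.

```python
from typing import Optional


def zoom_factor(
    reference_size: float,
    head_size: float,
    zoom_signal: Optional[float] = None,
    tolerance: float = 0.02,   # assumed margin within which the sizes count as equivalent
) -> Optional[float]:
    """Return the magnification to apply, or None when no adjustment is needed."""
    if zoom_signal is not None:                  # step S71: zoom value signal received
        return zoom_signal                       # step S72: follow the signal
    ratio = head_size / reference_size           # step S73: compare with the reference
    if abs(ratio - 1.0) <= tolerance:
        return None                              # equivalent: leave the optics alone
    return reference_size / head_size            # step S74: correct the difference
```

A factor above 1 corresponds to zooming in (the head has become smaller than the reference display frame), and a factor below 1 to zooming out.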

For example, as illustrated in FIG. 15, when the head 105 moves back from the reference position by a distance L1, away from the imaging apparatus 1, to the position of the head 104, zoom-in driving of the optical system 9 is performed so that the head is displayed at the same size as the reference display frame 107. That is, in normal display the result is the frame (image) 101a in FIG. 16A, in which the head of the target person is displayed small. With this display control process, however, the head of the target person in the frame (image) 101b of FIG. 16B does not become smaller and is displayed at the same size as the head 105b in the frame (image) 102b at the reference position.

Conversely, when the head 105 advances from the reference position by a distance L2, toward the imaging apparatus 1, to the position of the head 106, zoom-out driving of the optical system 9 is performed so that the head is displayed at the same size as the reference display frame 107. That is, in normal display the result is the frame (image) 103a in FIG. 16A, in which the head of the target person is displayed large. With this display control process, however, the head of the target person in the frame (image) 103b of FIG. 16B is not enlarged and is displayed at the same size as the head 105b in the frame (image) 102b at the reference position.
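
The direction of the zoom drive in these two cases can be checked with a small calculation. The sketch below assumes a simple pinhole-camera model, in which the displayed head size is inversely proportional to the distance from the imaging apparatus. This model, the function name, and the example distances are illustrative assumptions and do not appear in the description.

```python
def zoom_for_distance_change(reference_distance: float,
                             displacement: float) -> float:
    """Zoom factor that keeps the head at its reference display size.

    reference_distance -- distance to the head at the reference position
    displacement       -- positive when the head moves away (the L1 case),
                          negative when it approaches (the L2 case)

    Assumes displayed size is proportional to 1 / distance (pinhole model).
    """
    new_distance = reference_distance + displacement
    if reference_distance <= 0 or new_distance <= 0:
        raise ValueError("distances must be positive")
    # The head shrinks by reference_distance / new_distance, so the zoom
    # must grow by the inverse ratio to compensate.
    return new_distance / reference_distance


# Head moves back by L1 = 1.0 from a 2.0 reference distance: zoom in.
zoom_in = zoom_for_distance_change(2.0, 1.0)    # factor greater than 1
# Head advances by L2 = 1.0 from the same reference distance: zoom out.
zoom_out = zoom_for_distance_change(2.0, -1.0)  # factor less than 1
```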

As described above, by using the hair color of the back of the head of the photographed target person, the present invention can track the head of the target person in a moving image regardless of the direction in which the face is turned, once the face of the target person has been detected a single time. The head of the target person can likewise be tracked in the moving image whether the target person approaches or moves away.

In addition, because using the hair color of the back of the head allows the size of the target person's head to be detected regardless of face direction or posture, stable and accurate head tracking can be performed.
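
The claims describe comparing a reference color histogram with an extracted color histogram in order to identify the head region. A minimal sketch of such a comparison is given below. Histogram intersection is used here as the similarity measure purely for illustration, since the document does not name a specific measure; the function names, the bin count, and the sample pixel values are likewise assumptions.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Normalized histogram of (r, g, b) pixels with channels in 0..255.

    Each channel is quantized into `bins` levels (an assumed setting).
    """
    counts = Counter(
        (r * bins // 256, g * bins // 256, b * bins // 256)
        for r, g, b in pixels
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one pixel is required")
    return {key: value / total for key, value in counts.items()}


def histogram_intersection(reference, extracted):
    """Similarity in [0, 1]; 1 means the two color distributions coincide."""
    return sum(min(weight, extracted.get(key, 0.0))
               for key, weight in reference.items())


# Hypothetical sample data: dark hair pixels stored earlier as the reference,
# and two candidate regions taken from the current frame.
reference = color_histogram([(20, 15, 10)] * 4)
same_hair = color_histogram([(25, 18, 12)] * 4)
background = color_histogram([(200, 180, 160)] * 4)

best_is_hair = (histogram_intersection(reference, same_hair)
                > histogram_intersection(reference, background))
```

Because only the color distribution is compared, a region of hair seen from behind can match the reference even when no facial features are visible.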

In addition, by adjusting the size of the head display frame through the color tracking process and continuing to detect the head at a constant size, head tracking can be performed stably.

The target face (target head) can also be designated by various methods regardless of face orientation, such as touching the face of the target person on a touch panel, specifying it with direction keys, or automatic personal identification using a registered face (head) of the target person. As a result, even when only part of the face or only the back of the head can be photographed among a large number of people, the head of the target person can always be tracked once the face of the target person has been detected.

In addition, by setting the reference display frame in advance so that the head of the target person is displayed at a fixed size, the head of the target person can always be displayed at the same size as the reference display frame, even when only part of the face or only the back of the head can be photographed. Furthermore, by setting the face (head) of the target person and its (display frame) size in advance, the head can be zoomed automatically to the specified size only when the target face is found in a crowd.

Furthermore, by providing the imaging apparatus with a pan/tilt driving unit and combining it with the color tracking process, the head of the target person can be tracked over a wide range in the vertical and horizontal directions.

Further, by using a trimming function for zoom control, the control unit 2 can also cope with lateral movement: even when the face of the target person appears at the edge of the image, the control unit 2 performs image processing that cuts out only the region around the face of the target person and displays it enlarged, for example at the center of the screen.
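
One way such a trimming step could choose its cut-out region is sketched below. The function name, the integer pixel coordinates, and the policy of shifting the window back inside the image when the face is near an edge are assumptions for illustration; the description does not specify how the region is chosen.

```python
def crop_window(image_w, image_h, center_x, center_y, crop_w, crop_h):
    """Return (left, top, right, bottom) of a crop around a face.

    The window is centered on (center_x, center_y) where possible and is
    shifted to stay inside the image when the face is near an edge.
    """
    if crop_w > image_w or crop_h > image_h:
        raise ValueError("crop must fit inside the image")
    # Clamp the top-left corner so that the whole window stays in bounds.
    left = min(max(center_x - crop_w // 2, 0), image_w - crop_w)
    top = min(max(center_y - crop_h // 2, 0), image_h - crop_h)
    return left, top, left + crop_w, top + crop_h


# Face near the right edge of a 1920x1080 frame, trimmed to 640x360.
edge_crop = crop_window(1920, 1080, 1900, 540, 640, 360)
# Enlargement factor needed to show the trimmed region full screen.
scale = 1920 / 640
```

The trimmed region is then enlarged by `scale` for display, which acts as a zoom without driving the optical system.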

Note that the processing of the imaging apparatus and the image processing apparatus according to the present invention can be executed by software or by hardware. Needless to say, the functions that execute this processing can also be realized by a combination of hardware and software. When the processing is executed by software, the program constituting the software is installed from a program recording medium onto a computer incorporated in dedicated hardware, or onto, for example, a general-purpose computer that can execute various functions when various programs are installed.

For example, the description above covers the case in which the processing elements in the flowcharts of FIGS. 2, 5, 7, 10, and 11 are executed by software, but they may instead be executed by hardware.

In the present embodiment, the frame obtained as a result of tracking the head of the target person is displayed on the display unit 4. The frame may, however, be stored (recorded) in the storage unit 7 as well as displayed on the display unit 4. The recorded data tracking the head of the target person can then be viewed at any time.

In the present embodiment, the head tracking process for the target person is performed on imaging data captured in real time, but it may also be performed on previously recorded imaging data.

An example of one embodiment of the present invention has been described above. The present invention is not, however, limited to that example and includes other modifications and applications insofar as they do not depart from the gist of the present invention described in the claims.

DESCRIPTION OF SYMBOLS 1 ... imaging apparatus, 2 ... control unit, 3 ... image RAM, 4 ... display unit, 5 ... operation unit, 6 ... image processing, 7 ... storage unit, 8 ... external interface, 9 ... optical system, 10 ... optical system driving unit, 11 ... image sensor, 12 ... signal processing unit, 13 ... pan/tilt driving unit, 14 ... data bus, 21 ... face detection unit, 21A ... face orientation determination unit, 22 ... head tracking unit, 23 ... feature point tracking unit, 23A ... feature point information extraction unit, 23B ... vector calculation unit, 24 ... color tracking unit, 24A ... color information extraction unit

Claims

An image processing apparatus comprising:
a face detection unit that detects the face of a target person from imaging data obtained from an imaging unit;
a storage unit that stores head information including the face of the target person detected by the face detection unit; and
a head tracking unit that, based on the head information including the face of the target person stored in the storage unit, continues to detect, in current imaging data, the head of the target person including the range from the profile to the back of the head.

The image processing apparatus according to claim 1, further comprising
a face orientation determination unit that determines the orientation of the face of the target person in the current imaging data detected by the face detection unit,
wherein information on the face of the target person is extracted according to the face orientation determined by the face orientation determination unit.

The image processing apparatus according to claim 2, wherein
the head tracking unit further comprises a color tracking unit having a color information extraction unit that, when the face orientation determination unit cannot determine the face orientation, extracts color information of the back of the head and the face of the target person in the current imaging data as information on the face of the target person, and
the color tracking unit detects a region having corresponding color information in the current imaging data from the color information obtained by the color information extraction unit and the color information of the back of the head of the target person stored in the storage unit.

The image processing apparatus described, wherein
the color information extraction unit acquires, from the current imaging data, an extracted color histogram indicating the color distribution of the face and the back of the head of the target person in combination, and
the color tracking unit identifies, as the head of the target person, a region having corresponding color information in the current imaging data, from the extracted color histogram of the back of the head of the target person in the current imaging data acquired by the color information extraction unit and a reference color histogram that indicates the color distribution of the face and the back of the head of the target person in combination and is based on past head information including the face of the target person stored in the storage unit.

The image processing apparatus according to claim 4, wherein
the color tracking unit compares the size of a reference display frame, which is a display frame set according to the size of the face or head of the target person displayed on a display unit at the time the extracted color histogram was acquired, with the size of the display frame of the head in the current imaging data and, based on the result of the comparison, calculates the rate of change of the size of the display frame of the head in the current imaging data relative to the size of the reference display frame, and
a zoom value for the head of the target person is determined according to the rate of change obtained by the color tracking unit so that the size of the display frame of the head in the current imaging data remains constant in accordance with the reference display frame.

The image processing apparatus according to claim 2, wherein
the head tracking unit further comprises a feature point tracking unit having:
a feature point information extraction unit that, when the face orientation determination unit cannot determine the face orientation, extracts the coordinates and luminance of feature points of the target person as information on the face of the target person in the current imaging data; and
a motion vector calculation unit that calculates motion vectors of the feature points from the coordinates and luminance of the feature points of the face of the target person in the most recent imaging data stored in the storage unit and the coordinates and luminance of the feature points of the face of the target person in the current imaging data, both extracted by the feature point information extraction unit, and
the feature point tracking unit identifies the face of the target person in the current imaging data from the motion vectors calculated by the motion vector calculation unit.

An imaging apparatus comprising:
an imaging unit that images a subject;
a face detection unit that detects the face of a target person from imaging data obtained from the imaging unit;
a storage unit that stores head information including the face of the target person detected by the face detection unit; and
a head tracking unit that, based on the head information including the face of the target person stored in the storage unit, continues to detect, in current imaging data, the head of the target person including the range from the profile to the back of the head.