JP5247419B2 - Imaging apparatus and subject tracking method

Info

Publication number: JP5247419B2
Application number: JP2008331756A
Authority: JP
Inventors: 仁志保田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2008-12-26
Filing date: 2008-12-26
Publication date: 2013-07-24
Anticipated expiration: 2028-12-26
Also published as: JP2010154374A

Description

The present invention relates to an imaging apparatus and a subject tracking method, and more particularly to an imaging apparatus and a subject tracking method that detect and track a specific subject, or a part of a subject, such as a person, an animal, or an object included in an image.

Image processing methods that automatically detect and track a specific subject in an image are very useful and can be used, for example, to identify a human face region in a moving image.
Such methods can be used in many fields, such as teleconferencing, man-machine interfaces, security, monitoring systems for tracking human faces, and image compression.
Digital cameras and digital video cameras detect a face in a captured image and optimize focus and exposure using the detection result as the control target.
For example, Patent Document 1 discloses a photographing apparatus that detects the position of a face in an image, focuses on the face, and shoots with the exposure best suited to the face.

Apparatuses and methods are also known that track a face detected in this way and thereby enable control that is stable over time.
For example, Patent Document 2 proposes an apparatus and method that use template matching to track a specific subject automatically. In template matching, a partial image cut out from the image region to be tracked is registered as a reference image, the region of the image with the highest correlation to the reference image is estimated, and the specific subject is tracked.
FIG. 8 is a flowchart of an example of subject tracking using template matching.
FIG. 9 is an explanatory diagram of an example of subject tracking using template matching.
Here, an example in which a face is tracked as the target subject is shown.
In FIG. 9, 1101 is the input image at frame t = 0, 1102 is the subject detection result for the input image at frame t = 0, 1103 is the reference image registered from the input image at frame t = 0, and 1104 is the input image at frame t = 1.
1105 is the matching result for the input image at frame t = 1, 1106 is the reference image updated from the input image at frame t = 1, and 1107 is the input image at frame t = 2.
1108 is the matching result for the input image at frame t = 2, and 1109 is the reference image updated from the input image at frame t = 2.

Next, the method of tracking a specific subject described above is explained with reference to FIGS. 8 and 9.
First, the input image 1101 at frame t = 0 is read by an imaging apparatus such as a video camera (S1001).
Next, a subject region is extracted from the input image 1101 by subject detection processing, and a subject detection result such as 1102 is obtained (S1002).
The initial reference image 1103 is registered from this subject detection result (S1003). Then the input image 1104 at frame t = 1 is read (S1004).
Matching processing is performed between the input image 1104 and the reference image 1103 registered from the input image 1101 at frame t = 0 (S1005).
If the matching processing has not been completed over the predetermined matching area (NO in S1006), the matching processing continues (S1005).
If it has been completed (YES in S1006), the region with the highest correlation is taken as the subject region at frame t = 1, giving the matching result 1105 (S1007).
The reference image 1106 is then updated based on the estimated subject region (S1008).

Next, the input image 1107 at frame t = 2 is read (S1004).
Matching processing is performed between the input image 1107 and the reference image 1106 updated from the input image 1104 at frame t = 1 (S1005).
If the matching processing has not been completed over the predetermined matching area (NO in S1006), the matching processing continues (S1005).
If it has been completed (YES in S1006), the region with the highest correlation is taken as the subject region at frame t = 2, giving the matching result 1108 (S1007).
The reference image 1109 is then updated based on the estimated subject region (S1008).
In this way, the target subject is tracked by correlating each successively input image with the reference image obtained from the matching result of the previous frame.
Patent documents cited: JP 2005-318554 A; JP 2001-60269 A
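
The conventional flow S1001 to S1008 can be sketched in Python as a loop that detects the subject once and then alternates matching with reference image updates. This is only an illustrative sketch: the names `track`, `crop`, `detect`, and `match` are hypothetical and do not come from the cited documents, and the detector and matcher are passed in as plain callables so that the loop structure stays visible.

```python
from typing import Callable, Iterable, List, Tuple

Image = List[List[int]]          # grayscale frame as rows of luminance values
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def crop(image: Image, box: Box) -> Image:
    """Cut the partial image given by box out of a frame."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def track(frames: Iterable[Image],
          detect: Callable[[Image], Box],
          match: Callable[[Image, Image], Box]) -> List[Box]:
    """Detect the subject once, then match and update the reference per frame."""
    frame_iter = iter(frames)
    first = next(frame_iter)              # S1001: read the frame at t = 0
    box = detect(first)                   # S1002: subject detection
    reference = crop(first, box)          # S1003: register the initial reference image
    results = [box]
    for frame in frame_iter:              # S1004: read the next frame
        box = match(frame, reference)     # S1005 to S1007: region of highest correlation
        reference = crop(frame, box)      # S1008: update the reference image
        results.append(box)
    return results
```

Because the reference image is replaced on every frame, a single wrong match carries over into all later frames, which is the weakness discussed next.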

However, because the conventional method of tracking a specific subject described above relies on the similarity of patterns between the current frame image and the reference image, it has the following problem.
Namely, when the tracking target is hidden behind an obstacle, when the size of the tracking target changes because of a zoom operation, or when the screen itself is unstable because of panning or camera shake, the conventional tracking method extracts, on the basis of similarity, a region different from the actual tracking target as the subject region.
In other words, erroneous subject tracking occurs.

In view of the above problem, an object of the present invention is to provide an imaging apparatus and a subject tracking method capable of reducing erroneous subject tracking when detecting and tracking a specific subject, or a part of a subject, included in a moving image.

The present invention provides an imaging apparatus and a subject tracking method configured as follows.
An imaging apparatus of the present invention has subject tracking means that detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image registered in advance.
The subject tracking means comprises reference image registration means for registering the reference image, matching processing means for performing the matching processing, and tracking end determination means for ending the tracking of the subject.
The tracking end determination means stops the tracking operation of the subject when the distance between the subject region registered as the reference image and the similar region extracted by the matching means
exceeds a predetermined amount that is set so as to change according to the size of the detected subject.
Another imaging apparatus of the present invention has subject tracking means that detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image registered in advance.
The subject tracking means comprises reference image registration means for registering the reference image, matching processing means for performing the matching processing, and tracking end determination means for ending the tracking of the subject.
While the subject is being tracked continuously, the tracking end determination means stops the tracking operation of the subject when the distance between the subject region first registered as the reference image and the similar region extracted by the matching means
exceeds a predetermined amount that is set so as to change according to the size of the detected subject.
Another imaging apparatus of the present invention has subject tracking means that detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image registered in advance.
The subject tracking means comprises reference image registration means for registering the reference image, matching processing means for performing the matching processing, tracking end determination means for ending the tracking of the subject, and focus evaluation value generation means for extracting a high-frequency component of the image signal of the region selected by the tracking end determination means and generating a focus evaluation value for that region.
The tracking end determination means stops the tracking operation of the subject when the focus evaluation value becomes smaller than a predetermined amount.
Another imaging apparatus of the present invention has subject tracking means that detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image registered in advance.
The subject tracking means comprises reference image registration means for registering the reference image, matching processing means for performing the matching processing, tracking end determination means for ending the tracking of the subject, and scaling means for scaling the image of the subject.
The tracking end determination means stops the tracking operation of the subject when the amount of scaling by the scaling means reaches a predetermined amount.
A subject tracking method of the present invention detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image, and comprises:
a reference image registration step of registering in advance, as the reference image, a subject region in one of the frame images of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image; and
a tracking end determination step of ending the tracking of the subject based on information other than the information used in the matching processing.
In the tracking end determination step, the tracking operation of the subject is stopped when the distance between the subject region registered as the reference image and the similar region extracted by the matching means
exceeds a predetermined amount that is set so as to change according to the size of the detected subject.
Another subject tracking method of the present invention detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image, and comprises:
a reference image registration step of registering in advance, as the reference image, a subject region in one of the frame images of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image; and
a tracking end determination step of ending the tracking of the subject based on information other than the information used in the matching processing.
In the tracking end determination step, while the subject is being tracked continuously, the tracking operation of the subject is stopped when the distance between the subject region first registered as the reference image and the similar region extracted by the matching means
exceeds a predetermined amount that is set so as to change according to the size of the detected subject.
Another subject tracking method of the present invention detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image, and comprises:
a reference image registration step of registering in advance, as the reference image, a subject region in one of the frame images of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image;
a tracking end determination step of ending the tracking of the subject based on information other than the information used in the matching processing; and
a focus evaluation value generation step of extracting a high-frequency component of the image signal of the region selected in the tracking end determination step and generating a focus evaluation value for that region.
In the tracking end determination step, the tracking operation of the subject is stopped when the focus evaluation value becomes smaller than a predetermined amount.
Another subject tracking method of the present invention detects a specific subject, or a part of a subject, included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image, and comprises:
a reference image registration step of registering in advance, as the reference image, a subject region in one of the frame images of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image;
a tracking end determination step of ending the tracking of the subject based on information other than the information used in the matching processing; and
a scaling step of scaling the image of the subject.
In the tracking end determination step, the tracking operation of the subject is stopped when the amount of scaling in the scaling step reaches a predetermined amount.
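
The end conditions listed above (a distance limit that depends on subject size, a focus evaluation floor, and a scaling limit) can be illustrated with a short Python sketch. Everything concrete in it is an assumption made for illustration: the text states each condition as a separate configuration, whereas this sketch checks them together in one function; the distance is measured between region centers; the limit scales linearly with the larger side of the subject; and all threshold values and names (`EndCriteria`, `should_stop_tracking`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


@dataclass(frozen=True)
class EndCriteria:
    distance_factor: float = 1.5    # hypothetical: allowed travel in multiples of subject size
    min_focus_value: float = 100.0  # hypothetical focus evaluation floor
    max_zoom_change: float = 2.0    # hypothetical limit on the amount of scaling


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a region."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def should_stop_tracking(reference_box: Box,
                         matched_box: Box,
                         focus_value: float,
                         zoom_change: float,
                         criteria: EndCriteria = EndCriteria()) -> bool:
    """Return True when any of the end conditions is met."""
    rx, ry = center(reference_box)
    mx, my = center(matched_box)
    distance = math.hypot(mx - rx, my - ry)
    # The distance limit grows with the size of the detected subject.
    subject_size = max(reference_box[2], reference_box[3])
    if distance > criteria.distance_factor * subject_size:
        return True
    # A low focus evaluation value suggests the region no longer holds the subject.
    if focus_value < criteria.min_focus_value:
        return True
    # A large amount of scaling changes the subject size too much for the reference.
    if zoom_change >= criteria.max_zoom_change:
        return True
    return False
```

Passing the region of the first registered reference image as `reference_box` corresponds to the variant that measures distance from the initially registered subject region.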

According to the present invention, erroneous subject tracking can be reduced when detecting and tracking a specific subject, or a part of a subject, included in a moving image.

The best mode for carrying out the present invention is described by way of the following embodiment.

An imaging apparatus according to an embodiment of the present invention, which detects a specific subject or a part of a subject included in a moving image and tracks the subject based on similarity extracted by matching processing against a reference image registered in advance, is described in detail below with reference to the drawings.
FIG. 1 is a block diagram illustrating an imaging apparatus, such as a video camera, according to the embodiment of the present invention.
In FIG. 1, 101 is a zoom lens, 102 is a focus lens, 103 is an image sensor, 104 is an analog signal processing unit, 105 is an A/D conversion unit, 106 is a camera signal processing unit, 107 is a display unit, 108 is a recording medium, 109 is a camera control unit, and 110 is a subject detection unit.
111 is a subject tracking unit, 112 is a reference image registration unit, 113 is a matching processing unit, 114 is a feature pixel determination unit, 115 is an end determination unit, and 116 is a vibration gyro.

In the imaging apparatus of this embodiment, the zoom lens 101 changes the size of the subject image to be captured, and the focus lens 102 condenses the light rays representing the subject image, which then enter the image sensor 103, such as a CCD image sensor or a CMOS image sensor.
The image sensor 103 outputs, pixel by pixel, an electrical signal corresponding to the intensity of the incident light. This electrical signal is the video signal.
The video signal output from the image sensor 103 undergoes analog signal processing such as correlated double sampling (CDS) in the analog signal processing unit 104.
The video signal output from the analog signal processing unit 104 is converted into digital data by the A/D conversion unit 105 and input to the camera signal processing unit 106.
The camera signal processing unit 106 performs camera signal processing such as gamma correction, white balance processing, and AF/AE evaluation value generation.
The AF evaluation value is generally obtained by extracting the high-frequency component of the luminance of the camera signal and summing it over a specific area of the screen.
The AE evaluation value is generally obtained by summing the luminance of the camera signal over a specific area of the screen.
In addition to normal camera signal processing, the camera signal processing unit 106 can perform camera signal processing that uses information about a specific subject region in the image, supplied from the subject detection unit 110 and the subject tracking unit 111 described later.
The video signal output from the camera signal processing unit 106 is sent to the display unit 107. The display unit 107 is, for example, an LCD or an organic EL display, and displays the video signal.
Displaying images captured continuously in time series on the display unit 107 one after another allows the display unit 107 to function as an electronic viewfinder (EVF).
The video signal is also recorded on the recording medium 108, for example a removable memory card.
The recording destination may also be the camera's built-in memory or a communicably connected external device.
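
The AF and AE evaluation values described above can be sketched in a few lines of Python. The sketch assumes a grayscale frame stored as nested lists and, as a stand-in for the unspecified high-pass filter, uses the absolute difference between horizontally adjacent pixels. The function names `ae_evaluation` and `af_evaluation` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # rows of luminance values
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def ae_evaluation(luma: Image, area: Box) -> int:
    """Sum the luminance over the given area of the screen."""
    x, y, w, h = area
    return sum(sum(row[x:x + w]) for row in luma[y:y + h])


def af_evaluation(luma: Image, area: Box) -> int:
    """Sum a simple high-frequency measure of the luminance over the area.

    Assumption: adjacent-pixel differences stand in for the high-pass filter.
    """
    x, y, w, h = area
    total = 0
    for row in luma[y:y + h]:
        window = row[x:x + w]
        total += sum(abs(b - a) for a, b in zip(window, window[1:]))
    return total
```

With this measure, a sharply focused area with strong edges yields a larger AF evaluation value than a blurred area with the same average brightness, while the AE evaluation value depends only on the brightness.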

The video signal output from the camera signal processing unit 106 is also supplied to the subject detection unit 110.
The subject detection unit 110 detects the specific target subject in the image and identifies the number of subjects and the subject regions.
A person's face is a typical target subject. A known face detection method is used for detection.
Known face detection techniques include methods that use knowledge about faces (skin color information and parts such as the eyes, nose, and mouth) and methods that build a classifier for face detection with a learning algorithm, typified by neural networks.
To improve the recognition rate, face recognition is generally performed by combining these.
A specific example is the method described in JP 2002-251380 A, which detects faces using the wavelet transform and image feature quantities.
The subject tracking unit 111 tracks a specific subject across video signals of different times output from the camera signal processing unit 106, based on the similarity of the video signal patterns.
The subject tracking unit 111 comprises the reference image registration unit 112, the matching processing unit 113, the feature pixel determination unit 114, and the end determination unit 115.

Based on the result of the subject detection unit 110 or the subject tracking unit 111, the reference image registration unit 112 registers a partial region of the video signal output from the camera signal processing unit 106 as the reference image.
During the initial operation of the subject tracking unit 111, no tracking result exists yet, so the reference image registration unit 112 registers the subject region given by the result of the subject detection unit 110 as the reference image.
After the initial operation of the subject tracking unit 111, the result of the subject tracking unit 111 can be registered as the reference image.
Registering the reference image from whichever of the two results, that of the subject detection unit 110 or that of the subject tracking unit 111, is more reliable improves the accuracy of the subject tracking unit 111.
The matching processing unit 113 performs matching processing between the frame image of the video signal in the current frame and the reference image registered by the reference image registration unit 112.
Through the matching processing, the region of the current frame image that has the highest correlation with the reference image is extracted as the region of the target subject.
The matching processing unit 113 operates only when a reference image whose time differs from that of the current frame's video signal is registered.
The matching processing unit 113 also divides the image into many regions and extracts, as the region of highest correlation, the region whose luminance values differ least from those of the region contained in the reference image.

Within the region extracted by the matching processing unit 113, the feature pixel determination unit 114 judges a pixel to be a feature pixel when its color information falls within a predetermined color model that represents the features of the target subject.
In this embodiment the target subject is a face, so the color model is a skin color model. The predetermined color model may be fixed or may change dynamically according to the extracted subject.
The end determination unit 115, a characteristic feature of the present invention, determines the end of tracking based on the tracking result of the subject tracking unit together with various information in the imaging apparatus (AF evaluation value, zoom magnification, and blur information).
Based on the AF/AE evaluation value signal output from the camera signal processing unit 106 and the shake signal output from the vibration gyro 116 described later, the camera control unit 109 controls the focus lens 102, an exposure control circuit (not shown), and an image stabilization circuit (not shown).
For this focus lens control and exposure control, the camera control unit 109 uses the extraction result for the target subject region supplied from the subject detection unit 110 and the subject tracking unit 111.
The imaging apparatus of this embodiment therefore has a function of performing shooting processing that takes into account information about a specific subject region in the captured image.
The vibration gyro 116 is an angular velocity sensor, and the camera control unit detects the amount of image blur from the sensor output.
Although this embodiment uses an angular velocity sensor such as a vibration gyro to detect shake, motion vector detection from the image, without a sensor, may be used instead.
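
The feature pixel determination described above can be illustrated with a small Python sketch. The text does not specify the skin color model, so the model here is an assumption: a fixed rectangular range in the Cb and Cr chroma channels, with illustrative bounds that are not taken from the source. The names `ColorModel`, `feature_pixel_mask`, and `count_feature_pixels` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (Y, Cb, Cr)


@dataclass(frozen=True)
class ColorModel:
    """Hypothetical fixed color model: a rectangular range in Cb and Cr."""
    cb_range: Tuple[int, int] = (77, 127)   # illustrative bounds, not from the source
    cr_range: Tuple[int, int] = (133, 173)  # illustrative bounds, not from the source

    def contains(self, pixel: Pixel) -> bool:
        _, cb, cr = pixel
        return (self.cb_range[0] <= cb <= self.cb_range[1]
                and self.cr_range[0] <= cr <= self.cr_range[1])


def feature_pixel_mask(region: List[List[Pixel]], model: ColorModel) -> List[List[bool]]:
    """Mark each pixel of the matched region that falls inside the color model."""
    return [[model.contains(p) for p in row] for row in region]


def count_feature_pixels(region: List[List[Pixel]], model: ColorModel) -> int:
    """Count the feature pixels in the matched region."""
    return sum(sum(row) for row in feature_pixel_mask(region, model))
```

A model that adapts to the extracted subject, which the text also allows, could be represented by constructing a new `ColorModel` with different ranges for each frame.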

FIG. 2 is a flowchart illustrating the subject tracking method according to the present embodiment.
FIG. 3 is an explanatory diagram of the subject tracking method according to the present embodiment.
The following description uses the example of tracking a person's face as the subject.
In FIG. 2, S202 is performed by the matching processing unit 113, S203 by the feature pixel determination unit 114, and S204 by the reference image registration unit 112.
In FIG. 3, reference numeral 301 denotes a reference image registered by the reference image registration unit 112, and 302 denotes an input image of the subject tracking unit 111.
Reference numeral 303 denotes the subject extraction result of the matching processing unit 113, 304 denotes the feature pixel determination result of the feature pixel determination unit 114, and 306 denotes a reference image registered by the reference image registration unit 112.
First, a captured image such as 302 is read as an input image from an imaging device such as a video camera (S201).
Next, a matching process is performed between the input image and a previously registered reference image such as 301 (S202).
The matching processing means, for example, matches a frame image generated at a timing different from that of the reference image against the reference image and extracts a region similar to the reference image.
In the matching process, the sum of the differences between the luminance values of the pixels in a partial region of the input image, having the same size as the reference image, and the luminance values of the corresponding pixels of the reference image is calculated.
As in the flowchart of the conventional example of subject tracking by template matching, the position of this partial region within the input image is varied, and the position at which the calculated difference sum is smallest is taken as the region with the highest degree of correlation (see S1005 to S1007 in FIG. 8).
Since various methods of matching two images are conceivable, the processing described in this embodiment is only an example, and the present invention is not limited to this matching method.
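The matching step (S202) can be illustrated with a minimal sketch. It assumes that the "difference sum" is a sum of absolute luminance differences and that images are plain 2-D lists of luminance values; the function name `sad_match` and the sample data are hypothetical and do not come from the embodiment.

```python
def sad_match(frame, template):
    """Slide `template` over `frame` and return ((top, left), difference_sum)
    for the window with the smallest sum of absolute luminance differences."""
    frame_h, frame_w = len(frame), len(frame[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best_pos, best_sum = None, None
    # Vary the position of the partial region over the whole input image.
    for top in range(frame_h - tmpl_h + 1):
        for left in range(frame_w - tmpl_w + 1):
            diff_sum = 0
            for y in range(tmpl_h):
                frame_row = frame[top + y]
                tmpl_row = template[y]
                for x in range(tmpl_w):
                    diff_sum += abs(frame_row[left + x] - tmpl_row[x])
            # The smallest difference sum is treated as the highest correlation.
            if best_sum is None or diff_sum < best_sum:
                best_pos, best_sum = (top, left), diff_sum
    return best_pos, best_sum


# Toy input image (luminance only) and a 2x2 reference image.
frame = [
    [10, 10, 10, 10, 10],
    [10, 10, 200, 180, 10],
    [10, 10, 190, 210, 10],
    [10, 10, 10, 10, 10],
]
template = [
    [200, 180],
    [190, 210],
]
position, score = sad_match(frame, template)
```

An exhaustive search like this is quadratic in both image and template size; it is shown only to make the "vary the position, keep the minimum" logic concrete.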

Because the matching process is based on the similarity between the reference image and the input image, it does not necessarily yield the correct subject region.
In the matching result shown in 303, a region partially offset from the correct subject region has been extracted.
This happens because the reference image 301 contains areas other than the subject region, and because the appearance of the subject differs between the reference image 301 and the input image 302.
The result shown in 303 is obtained because the non-subject area within the reference image 301 correlates strongly with a partial region of the input image 302.
Next, within the region obtained by the matching process, feature pixels that exhibit the characteristics of the target subject are determined (S203).

Next, the feature pixel determination process is described in detail.
FIG. 4 is a flowchart illustrating the feature pixel determination process.
First, the color information of a pixel at a predetermined position in the region obtained by the matching process is acquired (S401).
Next, it is determined whether the acquired color information falls within a color model that satisfies a predetermined condition (S402).
If the color information falls within the predetermined color model (YES in S402), the pixel is determined to be a feature pixel (S403).
If the color information does not fall within the predetermined color model (NO in S402), the pixel is not determined to be a feature pixel.
Next, it is determined whether all pixels in the region obtained by the matching process have been processed (S404).
If not all pixels have been processed (NO in S404), the color information of an unprocessed pixel is acquired (S401). If all pixels have been processed (YES in S404), the process ends.
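The loop of FIG. 4 can be sketched as follows. This is an illustration only: the region is assumed to be a 2-D list of per-pixel color values, and the color model is passed in as a predicate. The names `find_feature_pixels` and `in_color_model`, and the toy predicate used in the example, are hypothetical.

```python
def find_feature_pixels(region, in_color_model):
    """Return (mask, count): mask holds 1 for feature pixels and 0 otherwise."""
    mask = []
    count = 0
    for row in region:
        mask_row = []
        for color in row:                       # S401: acquire color information
            is_feature = in_color_model(color)  # S402: inside the color model?
            mask_row.append(1 if is_feature else 0)  # S403 on YES, skipped on NO
            if is_feature:
                count += 1
        mask.append(mask_row)
    # Reaching this point corresponds to YES in S404 (all pixels processed).
    return mask, count


# Toy region of (Cb, Cr) pairs and a stand-in model: "Cr above 140".
region = [
    [(100, 150), (128, 128)],
    [(110, 160), (200, 50)],
]
mask, count = find_feature_pixels(region, lambda c: c[1] > 140)
```

Passing the color model as a function keeps the loop independent of the color format, in line with the statement that the format of the color information is not restricted.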

The feature pixel determination yields a result such as 304. In 304, feature pixels are shown in white and all other pixels in black.
Here, the predetermined color model is a color model that represents the characteristics of the target subject; when the target subject is a person's face, a skin color model is used.
FIG. 5 shows an example of a color model.
The color information acquired from each pixel is the color difference components CbCr of YCbCr data; in FIG. 5 the vertical axis is Cb and the horizontal axis is Cr.
The hatched area in FIG. 5 is the skin color model, and it is determined whether the acquired CbCr components fall within the hatched area.
The color model shown in FIG. 5 is assumed to be set in advance.
Since various formats are conceivable for the acquired color information and for the color model, this embodiment is only an example, and the present invention is not limited to these formats.
For example, the acquired color information may be RGB data, or hue (H) data obtained by conversion to the HSV color system.
Likewise, since various methods are conceivable for determining feature pixels from the color model and the pixel color information, this embodiment is only an example, and the present invention is not limited to this method.
The present invention is also applicable whether the region in which feature pixel determination is performed is the same as the region extracted by the matching process or a predetermined region centered on the centroid of the extracted region.
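As an illustration of a CbCr skin color test, the sketch below converts RGB to CbCr and checks a rectangular region of the CbCr plane. Two assumptions are made that the embodiment does not state: the conversion uses the full-range BT.601 (JFIF) coefficients, and the skin region is a simple box with commonly quoted bounds. The hatched area of FIG. 5 may have a different shape, and the names `rgb_to_cbcr`, `is_skin`, `SKIN_CB_RANGE` and `SKIN_CR_RANGE` are hypothetical.

```python
# Assumed rectangular skin model in the CbCr plane (illustrative values only).
SKIN_CB_RANGE = (77, 127)
SKIN_CR_RANGE = (133, 173)


def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to (Cb, Cr) using full-range BT.601 (JFIF) coefficients."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def is_skin(cb, cr):
    """Return True if (Cb, Cr) lies inside the assumed skin color box."""
    return (SKIN_CB_RANGE[0] <= cb <= SKIN_CB_RANGE[1]
            and SKIN_CR_RANGE[0] <= cr <= SKIN_CR_RANGE[1])


skin_like = is_skin(*rgb_to_cbcr(224, 172, 138))  # a light skin tone
pure_blue = is_skin(*rgb_to_cbcr(0, 0, 255))
mid_gray = is_skin(*rgb_to_cbcr(128, 128, 128))
```

Because only Cb and Cr are tested, the decision ignores luminance, which is one reason color-difference planes are often chosen for this kind of model.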

Next, the region extracted by the matching process is registered as a reference image, as indicated by 306 (S204).
For example, the extraction result of the matching means is used to select the region in which the subject is estimated to be present, and the selected region is registered as a new reference image.
The reference image 306 is used in the matching process for the next frame.
By successively updating the reference image, subject tracking can continue even when the appearance of the subject changes over time, for example when the orientation of the subject changes.
When both the subject detection unit 110 and the subject tracking unit 111 are operating, the reference image may be registered based on whichever result is more reliable.
On the other hand, when changes in the appearance of the subject over time are not taken into account, the present invention is applicable even if the initially registered reference image is retained without being updated.
Although the reference image 306 is shown as a rectangle, various shapes are conceivable for the reference image; this embodiment is only an example, and the present invention is not limited to this shape.
For example, the present invention is applicable whether the reference image is circular or polygonal.
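The registration step (S204) can be sketched as cropping the matched region out of the current frame and storing it for the next frame. The sketch assumes rectangular reference images held as 2-D lists; the names `crop` and `ReferenceImageRegistry`, and the `update` flag that models the "keep the initial reference image" variant, are hypothetical.

```python
def crop(frame, top, left, height, width):
    """Cut a height x width window out of a 2-D list image."""
    return [row[left:left + width] for row in frame[top:top + height]]


class ReferenceImageRegistry:
    """Holds the reference image used by the matching process."""

    def __init__(self, initial_reference, update=True):
        self.reference = initial_reference
        self.update = update  # False keeps the initially registered image

    def register(self, frame, top, left):
        """Register the matched region at (top, left) as the new reference."""
        if self.update:
            height = len(self.reference)
            width = len(self.reference[0])
            self.reference = crop(frame, top, left, height, width)
        return self.reference


frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
updating = ReferenceImageRegistry([[5, 6], [9, 10]])
new_reference = updating.register(frame, 0, 2)

fixed = ReferenceImageRegistry([[5, 6], [9, 10]], update=False)
kept_reference = fixed.register(frame, 0, 2)
```

Updating every frame follows appearance changes but also lets small matching errors accumulate, which is why the embodiment pairs it with a tracking end determination.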

Next, tracking end determination is performed based on various information (S205).
FIG. 6 is a flowchart illustrating the tracking end determination process based on various information.
FIG. 7 shows examples of scenes in which tracking is ended (the tracking operation is stopped) according to the embodiment of the present invention.
First, if the number of feature pixels obtained by the feature pixel determination unit 114 is less than a predetermined value th1 (YES in S601), subject tracking is ended (S609).
As in FIG. 7(a), when the tracking target of the subject tracking unit 111 is hidden by an obstacle, the number of feature pixels in the region extracted by the matching process decreases, and tracking by the subject tracking unit 111 is ended.
Next, a threshold th2 for the tracking movement amount from the previous time to the current time, used in the following step, is set according to the size of the subject being tracked (S602).
As in FIG. 7(b), it is determined whether the amount of movement of the tracked subject within the screen from the previous time to the current time is greater than the predetermined value th2; if it is (YES in S603), subject tracking is interrupted and the reference image is reset (S609).
Here, th2 is made larger when the subject detected by the subject detection unit 110 is large and smaller when it is small.
This is because the movement of a subject is considered to be bounded by an amount determined by its size, so a large movement by a small subject indicates a high likelihood of mistracking.
Next, a threshold th3 for the tracking movement amount, used in the following step, is set according to the size of the subject being tracked (S604).
As in FIG. 7(c), it is determined whether the amount of movement within the screen of the subject, from the start of tracking to the current time while tracking has continuously succeeded, is greater than the predetermined value th3; if it is (YES in S605), subject tracking is ended and the reference image is reset (S609).
Here, th3 is changed according to the size of the subject detected by the subject detection unit 110: larger when the subject is large and smaller when it is small.
This is because the movement of a subject is considered to be bounded by an amount determined by its size, so a large movement by a small subject indicates an even higher likelihood of mistracking.
These are examples of stopping the subject tracking operation when the distance to the similar region extracted by matching exceeds a predetermined amount.
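Steps S601 to S605 can be sketched as below. The embodiment states only that th2 and th3 grow with subject size; the linear scaling, the coefficients `k_step` and `k_total`, the use of Euclidean distance as the "movement amount", and the function names are all assumptions made for illustration.

```python
import math


def movement_thresholds(subject_size, k_step=0.5, k_total=3.0):
    """Return (th2, th3), both proportional to the subject size (assumed rule)."""
    return k_step * subject_size, k_total * subject_size


def movement_end_reason(feature_count, th1, start_pos, prev_pos, cur_pos,
                        subject_size):
    """Return a reason string if tracking should end, otherwise None."""
    if feature_count < th1:                            # S601
        return "too_few_feature_pixels"
    th2, th3 = movement_thresholds(subject_size)       # S602 and S604
    if math.dist(prev_pos, cur_pos) > th2:             # S603: previous to current
        return "frame_to_frame_jump"
    if math.dist(start_pos, cur_pos) > th3:            # S605: start to current
        return "drift_from_start"
    return None


# A small subject (size 40) gives th2 = 20 and th3 = 120.
occluded = movement_end_reason(5, 20, (0, 0), (0, 0), (0, 0), 40)
jumped = movement_end_reason(50, 20, (100, 100), (100, 100), (130, 100), 40)
drifted = movement_end_reason(50, 20, (0, 0), (120, 0), (130, 0), 40)
# The same 30-pixel step is accepted for a larger subject (size 100, th2 = 50).
accepted = movement_end_reason(50, 20, (100, 100), (100, 100), (130, 100), 100)
```

In the flowchart a non-None result corresponds to S609, where tracking ends and, for the movement checks, the reference image is reset.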

Next, as in FIG. 7(d), it is determined whether the AF evaluation value extracted from the area of the tracked subject is smaller than a predetermined value th4; if it is (YES in S606), subject tracking is ended and the reference image is reset (S609).
This is because a subject that is being tracked should yield a reasonably high AF evaluation value, so a low value indicates a high likelihood of mistracking.
For this purpose, focus evaluation value generating means is provided that extracts the high-frequency component of the image signal of the region selected by the tracking end determination means and generates a focus evaluation value for that region.
The tracking end determination means stops the subject tracking operation when the focus evaluation value falls below a predetermined amount.
Next, as in FIG. 7(e), it is determined whether the magnification set by the variable magnification lens has changed by a predetermined amount th5; if it has (YES in S607), subject tracking is ended and the reference image is reset (S609).
This is because matching two images taken at different magnifications is meaningless and is likely to result in mistracking.
Next, as in FIG. 7(f), if the camera shake detected by the camera shake detection means exceeds a predetermined amount th6 (YES in S608), subject tracking is ended and the reference image is reset (S609).
This is because, if the two images being compared differ because of camera shake, matching them is meaningless and is likely to result in mistracking.
These measures reduce the possibility of erroneously tracking a region other than the target subject region.
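Steps S606 to S608 can be sketched in the same style. The embodiment says only that the focus evaluation value is derived from the high-frequency component of the image signal; here a sum of absolute horizontal luminance differences stands in for that filter. The function names, the zoom and shake units, and the sample thresholds are hypothetical.

```python
def focus_evaluation_value(region):
    """Crude high-frequency measure: sum of absolute horizontal differences."""
    total = 0
    for row in region:
        for left_px, right_px in zip(row, row[1:]):
            total += abs(right_px - left_px)
    return total


def camera_state_end_reason(region, th4, zoom_prev, zoom_cur, th5, shake, th6):
    """Return a reason string if tracking should end, otherwise None."""
    if focus_evaluation_value(region) < th4:   # S606: AF evaluation value too low
        return "low_af_evaluation"
    if abs(zoom_cur - zoom_prev) >= th5:       # S607: magnification changed by th5
        return "zoom_changed"
    if shake > th6:                            # S608: camera shake exceeds th6
        return "camera_shake"
    return None


sharp = [[0, 255, 0, 255], [255, 0, 255, 0]]  # strong edges
flat = [[100, 100, 100, 100], [100, 100, 100, 100]]  # no detail

blurred = camera_state_end_reason(flat, 500, 1.0, 1.0, 0.2, 0.0, 5.0)
zoomed = camera_state_end_reason(sharp, 500, 1.0, 1.5, 0.2, 0.0, 5.0)
shaken = camera_state_end_reason(sharp, 500, 1.0, 1.0, 0.2, 8.0, 5.0)
stable = camera_state_end_reason(sharp, 500, 1.0, 1.0, 0.2, 1.0, 5.0)
```

A real camera would typically take the AF evaluation value from its autofocus signal path rather than recompute it in software; the simple difference filter is used here only so the example is self-contained.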

According to the configuration of the above embodiment, the subject tracking process can perform the tracking end determination, and end tracking, using the tracking status and various information from the imaging device in addition to the luminance matching information used for tracking.
This improves the accuracy of the subject tracking process and reduces the possibility of erroneously tracking a region other than the target subject region.
Although the above embodiment has been described using a video camera as an example, it is of course also applicable to a digital still camera with a moving image function.
An application running on a general-purpose computer such as a personal computer can also track a subject by performing the same processing as in the above embodiment.

FIG. 1 is a block diagram of an imaging apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a subject tracking method according to the embodiment of the present invention.
FIG. 3 is an explanatory diagram of the subject tracking method according to the embodiment of the present invention.
FIG. 4 is a flowchart illustrating the feature pixel determination process according to the embodiment of the present invention.
FIG. 5 is an example of a color model according to the embodiment of the present invention.
FIG. 6 is a flowchart illustrating the tracking end determination process according to the embodiment of the present invention.
FIG. 7 is an example of scenes in which tracking is ended according to the embodiment of the present invention.
FIG. 8 is a flowchart illustrating subject tracking by template matching in a conventional example.
FIG. 9 is an explanatory diagram of subject tracking by template matching in a conventional example.

Explanation of symbols

101: Variable magnification lens (zoom lens)
102: Focus lens
103: Image sensor
104: Analog signal processing unit
105: A/D conversion unit
106: Camera signal processing unit
107: Display unit
108: Recording medium
109: Camera control unit
110: Subject detection unit
111: Subject tracking unit
112: Reference image registration unit
113: Matching processing unit
114: Feature pixel determination unit
115: End determination unit
116: Vibration gyro
301: Reference image registered by the reference image registration unit 112
302: Input image
303: Subject extraction result by the matching processing unit 113
304: Feature pixel determination result by the feature pixel determination unit 114
306: Reference image registered by the reference image registration unit 112
501: Example of a skin color model in the feature pixel determination unit 114

Claims

An imaging apparatus having a subject tracking unit that detects a specific subject or a part of a subject included in a moving image and tracks the subject based on similarity extracted by matching processing with a reference image registered in advance, wherein
the subject tracking unit includes a reference image registration unit that registers the reference image, a matching processing unit that performs the matching process, and a tracking end determination unit that ends the tracking of the subject, and
the tracking end determination unit stops the subject tracking operation when the distance between the subject area registered as the reference image and the similar area extracted by the matching processing unit exceeds a predetermined amount that is set to vary with the size of the detected subject.

An imaging apparatus having a subject tracking unit that detects a specific subject or a part of a subject included in a moving image and tracks the subject based on similarity extracted by matching processing with a reference image registered in advance, wherein
the subject tracking unit includes a reference image registration unit that registers the reference image, a matching processing unit that performs the matching process, and a tracking end determination unit that ends the tracking of the subject, and
the tracking end determination unit, while the subject is being tracked continuously, stops the subject tracking operation when the distance between the subject area first registered as the reference image and the similar area extracted by the matching processing unit exceeds a predetermined amount that is set to vary with the size of the detected subject.

An imaging apparatus having a subject tracking unit that detects a specific subject or a part of a subject included in a moving image and tracks the subject based on similarity extracted by matching processing with a reference image registered in advance, wherein
the subject tracking unit includes a reference image registration unit that registers the reference image, a matching processing unit that performs the matching process, a tracking end determination unit that ends tracking of the subject, and a focus evaluation value generating unit that extracts a high-frequency component of the image signal of the region selected by the tracking end determination unit and generates a focus evaluation value for that region, and
the tracking end determination unit stops the tracking operation of the subject when the focus evaluation value becomes smaller than a predetermined amount.

An imaging apparatus having a subject tracking unit that detects a specific subject or a part of a subject included in a moving image and tracks the subject based on similarity extracted by matching processing with a reference image registered in advance, wherein
the subject tracking unit includes a reference image registration unit that registers the reference image, a matching processing unit that performs the matching process, a tracking end determination unit that ends tracking of the subject, and a scaling unit that scales the image of the subject,
The imaging apparatus according to claim 1, wherein the tracking end determination unit stops the tracking operation of the subject when a scaling amount by the scaling unit reaches a predetermined amount.

A subject tracking method for detecting a specific subject or a part of a subject included in a moving image and tracking the subject based on similarity extracted by matching processing with a reference image, the method comprising:
a reference image registration step of registering in advance, as the reference image, a subject area in any frame image of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image; and
a tracking end determination step of ending tracking of the subject based on information other than the information used for the matching process,
wherein, in the tracking end determination step, the subject tracking operation is stopped when the distance between the subject area registered as the reference image and the similar area extracted in the matching step exceeds a predetermined amount that is set to vary with the size of the detected subject.

A subject tracking method for detecting a specific subject or a part of a subject included in a moving image and tracking the subject based on similarity extracted by matching processing with a reference image, the method comprising:
a reference image registration step of registering in advance, as the reference image, a subject area in any frame image of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image; and
a tracking end determination step of ending tracking of the subject based on information other than the information used for the matching process,
wherein, in the tracking end determination step, while the subject is being tracked continuously, the subject tracking operation is stopped when the distance between the subject area first registered as the reference image and the similar area extracted in the matching step exceeds a predetermined amount that is set to vary with the size of the detected subject.

A subject tracking method for detecting a specific subject or a part of a subject included in a moving image and tracking the subject based on similarity extracted by matching processing with a reference image, the method comprising:
a reference image registration step of registering in advance, as the reference image, a subject area in any frame image of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image;
a tracking end determination step of ending tracking of the subject based on information other than the information used for the matching process; and
a focus evaluation value generation step of extracting a high-frequency component of the image signal of the region selected in the tracking end determination step and generating a focus evaluation value for that region,
wherein, in the tracking end determination step, the subject tracking operation is stopped when the focus evaluation value becomes smaller than a predetermined amount.

A subject tracking method for detecting a specific subject or a part of a subject included in a moving image and tracking the subject based on similarity extracted by matching processing with a reference image, the method comprising:
a reference image registration step of registering in advance, as the reference image, a subject area in any frame image of the moving image;
a matching step of matching a frame image generated at a timing different from that of the reference image against the reference image and extracting a region similar to the reference image;
a tracking end determination step of ending tracking of the subject based on information other than the information used for the matching process; and
a scaling step of scaling the image of the subject,
wherein, in the tracking end determination step, the subject tracking operation is stopped when the amount of scaling in the scaling step reaches a predetermined amount.