JP5677080B2

JP5677080B2 - Image processing apparatus and control method thereof

Info

Publication number: JP5677080B2
Application number: JP2010291130A
Authority: JP
Inventors: 篤史藤田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-12-27
Filing date: 2010-12-27
Publication date: 2015-02-25
Anticipated expiration: 2030-12-27
Also published as: JP2012138845A

Description

The present invention relates to an image processing apparatus, and in particular to an image processing apparatus having a main subject detection function that automatically detects a subject in an image and a subject tracking function that keeps capturing the movement of that subject.

In image processing apparatuses typified by digital cameras, performing appropriate imaging control and image processing on the subject of greatest interest to the user is very effective for improving the quality of captured images. To apply such processing to a moving subject, digital cameras have been developed with a subject tracking function that keeps capturing the subject by pattern matching or a similar method. Tracking requires specifying the subject to be tracked, for example the subject of high interest to the user (hereinafter referred to as the main subject). However, a configuration in which the user designates the main subject area manually, by touch panel or button operation, makes the user's operation cumbersome.

To address this problem, digital cameras with a main subject detection function that automatically detects the main subject and separates it from the background have been developed in recent years, as disclosed in Patent Document 1. Various image processing techniques related to main subject detection have also been proposed.

Patent Document 2 proposes constructing a neural network based on the extracted subject area, thereby achieving accurate subject recognition that is robust to positional shift. Patent Document 3 divides each frame of a moving image into pixel blocks of a fixed size, computes the in-block average of the pixel data for each block, and calculates the difference in block averages between the current frame and the previous frame. A region in which blocks with large differences appear consecutively is regarded as subject motion, and the image display area is determined from the amount of that motion.

Patent Document 1: JP 2007-129310 A
Patent Document 2: JP 05-282275 A
Patent Document 3: JP 2000-295517 A

A digital camera having both a main subject detection function and a subject tracking function generally keeps capturing a moving subject by applying the subject tracking function to the subject area detected by the main subject detection function. However, the main subject detection algorithm, which extracts the subject, generally differs from the subject tracking algorithm, which follows a moving subject by pattern matching or a similar method. Consequently, when a main subject area detected as in Patent Documents 2 and 3 is passed unchanged to subject tracking, tracking performance may deteriorate.

The present invention has been made in view of the above problem, and its object is to improve tracking performance when tracking an automatically detected main subject area.

To achieve the above object, an image processing apparatus according to one aspect of the present invention has the following configuration. Namely, the apparatus comprises:
detection means for detecting a subject area from at least one frame of a moving image;
setting means for setting, from an area set based on the position of the subject area detected by the detection means and using a method different from that of the detection means, an area from which image information used in tracking processing is extracted; and
tracking means for executing the tracking processing based on the image information of the area set by the setting means,
wherein the setting means divides one of the frames into a plurality of blocks, sets as a reference range a predetermined number of blocks consisting of the block that contains the position of the subject area and its surrounding blocks, and selects, from the blocks included in the reference range, one or more blocks as the area from which the image information is extracted.

According to the present invention, it is possible to provide an image processing apparatus that automatically detects the main subject without cumbersome input operations by the user, tracks the subject accurately even when it is moving, and performs imaging control and image processing appropriate to the subject.

FIG. 1 is a block diagram of a digital camera according to the embodiment.
FIG. 2 is a flowchart of main subject detection processing according to the embodiment.
FIG. 3 is a diagram explaining histogram division in the main subject detection processing.
FIG. 4 is an explanatory diagram relating to calculation of the subject evaluation value in the main subject detection processing.
FIG. 5 is an explanatory diagram relating to calculation of the subject evaluation value in the main subject detection processing.
FIG. 6 is a flowchart of the overall processing of the digital camera according to the embodiment.
FIG. 7 is a diagram explaining the operation of the position information update unit.
FIG. 8 is a flowchart explaining subject tracking processing.
FIG. 9 is an explanatory diagram relating to template selection.
FIG. 10 is a diagram explaining template selection by the subject tracking unit.
FIG. 11 is a diagram showing an example distribution of similar-color pixels in the subject tracking unit.
FIG. 12 is a diagram explaining the template shape in the subject tracking unit.
FIG. 13 is a diagram showing an example of the weight table design curve in the embodiment.
FIG. 14 is a diagram showing an example of the weight table in the embodiment.
FIG. 15 is a diagram showing an example of the evaluation value conversion function.

A preferred embodiment of the present invention is described in detail below with reference to the accompanying drawings.

FIG. 1 shows a configuration example of a digital camera 100 serving as the image processing apparatus of the present invention. In FIG. 1, reference numeral 10 denotes a photographing lens, 12 a mechanical shutter, and 14 an aperture. Reference numeral 16 denotes an image sensor that converts an optical image into an electrical signal; a CCD is used in this embodiment. Reference numeral 18 denotes a gain circuit that applies gain to the output signal of the image sensor 16, and 22 denotes an A/D converter that converts the analog signal output into a digital signal. Besides the mechanical shutter 12, the accumulation time can also be controlled by an electronic shutter, that is, by controlling the reset timing of the image sensor 16, so the digital camera 100 can be used for moving image shooting and the like.

Reference numeral 50 denotes an image processing circuit, which performs predetermined pixel interpolation and color conversion processing on the data from the A/D converter 22. The image processing circuit 50 also performs predetermined arithmetic processing on the captured image data and provides the result to the system control circuit 60. Based on that result, the system control circuit 60 controls the exposure control unit and the lens driving circuit 42 to perform AE (automatic exposure) processing and AF (autofocus) processing. The exposure control unit includes an aperture driving circuit 26, a mechanical shutter driving circuit 28, and a flash light emitting circuit 46. The image processing circuit 50 further performs AWB (auto white balance) processing based on results computed from the captured image data. The system control circuit 60 implements its various processes by having a CPU (not shown) execute a program stored in a memory (not shown). The main subject detection processing described later, and the subject tracking processing that uses its result, are executed by the system control circuit 60.

The image data digitized by the A/D converter 22 is converted into the desired image data by the image processing circuit 50. The image data is then compressed by the compression/decompression circuit 32 and written to the image storage medium 82 via the image storage medium I/F 80.

Reference numeral 108 denotes an image display unit, which displays the display image data written to the VRAM 34. Sequentially displaying the image data captured by the image sensor 16 on the image display unit 108 realizes an electronic viewfinder (EVF) function. Photometry area information and ranging area information for AE and AF can be superimposed on the EVF live image. In addition, to indicate the subject recognition status, a frame can be superimposed on the EVF live image at the subject area obtained as the recognition result for a human face or the main subject. If a touch panel is mounted on the image display unit 108, it can also serve as an input device. Besides the image display unit 108 built into the digital camera 100, image data processed by the image processing circuit 50 can also be displayed on an external device 86, such as a TV display, connected via the external device I/F 84.

Reference numeral 30 denotes a temporary storage memory that stores captured still images and moving images. It has sufficient capacity to store a predetermined number of still images or a predetermined duration of moving images, and can also be used as a work area for the system control circuit 60.

The mechanical shutter driving circuit 28 controls the mechanical shutter 12 and, working together with the flash light emitting circuit 46, also realizes a flash light control function. The lens driving circuit 42 controls focusing of the photographing lens 10. The system control circuit 60 controls the entire digital camera 100. Reference numeral 70 denotes an operation unit made up of the operation members of the digital camera 100; the user uses it to instruct the system control circuit 60 to turn the power on or off, start shooting, switch shooting modes, and change various settings.

When shooting is instructed with the shutter switch of the operation unit 70, the image sensor 16 is exposed for the exposure time determined by the AE processing. The signal is then read out from the image sensor 16, converted into digital values by the A/D converter 22, and written to the temporary storage memory 30. Development processing in the image processing circuit 50 and compression in the compression/decompression circuit 32 follow, and recording processing writes the image data to the image storage medium 82 via the image storage medium I/F 80. The image storage medium 82 is a recording medium such as a memory card or a hard disk.

The image processing circuit 50 also includes various signal processing circuits for the main subject discrimination and subject tracking described later, and can execute processing such as block division of an image, histogram calculation, and difference calculation at high speed.

Next, a method for automatically detecting the main subject (main subject detection processing) in the digital camera configured as above is described. Its purpose is to shoot with focus and brightness optimized for the main subject without requiring cumbersome operations by the user. The main subject detection processing described below is only an example; any known main subject detection processing may be used.

The main subject detection procedure of this embodiment is described in detail below with reference to FIG. 2. As an example, the method described here extracts one or more subject candidate regions from the image and calculates, from feature quantities, an evaluation value of how subject-like each region is, thereby extracting the subject region.

FIG. 2 is a flowchart of the main subject detection processing according to the embodiment. First, in step S101, the system control circuit 60 divides the image into a plurality of blocks of a fixed size. In step S102, it calculates the average hue of the pixels in each block and takes it as the representative value of that block. In step S103, it creates a hue histogram whose elements are the representative hue values obtained in step S102. Because hue values are unreliable for blocks with low saturation, only blocks whose saturation is at or above a predetermined threshold may be used to build the histogram. In step S104, it divides the hue histogram into a plurality of groups, each regarded as a same-color region.
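Steps S101 to S103 can be illustrated with a minimal Python sketch. It is not taken from the embodiment: the function name `block_hue_histogram`, the bin count, the saturation threshold, and the use of a plain arithmetic mean of hue (which ignores the wrap-around of hue at red) are all assumptions made for illustration.

```python
import colorsys

def block_hue_histogram(image, block, bins=36, sat_threshold=0.2):
    """image: 2-D list of (r, g, b) tuples in 0..255; block: block edge length in pixels.

    Returns (reps, hist): the representative hue of each block (None when the block
    is too unsaturated to be trusted) and the hue histogram of the kept blocks.
    """
    rows, cols = len(image) // block, len(image[0]) // block
    reps = [[None] * cols for _ in range(rows)]
    hist = [0] * bins
    for by in range(rows):
        for bx in range(cols):
            hues, sats = [], []
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    r, g, b = image[y][x]
                    h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
                    hues.append(h)
                    sats.append(s)
            # Low-saturation blocks have unreliable hue, so they are skipped.
            if sum(sats) / len(sats) < sat_threshold:
                continue
            hue = sum(hues) / len(hues)  # simple mean; an assumption of this sketch
            reps[by][bx] = hue
            hist[min(int(hue * bins), bins - 1)] += 1
    return reps, hist
```

A real implementation would likely average hue on the color circle rather than linearly; the linear mean is kept here only to keep the sketch short.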

A method of dividing the hue histogram into n (n > 1) same-color groups is described below with reference to FIGS. 3(A) and 3(B). First, the system control circuit 60 scans all sections of the histogram and detects the highest peak, HighestPeak1. It then scans from that peak toward the valleys on both the left and right sides, and treats as one same-color region the section extending until the histogram frequency falls to TH_Freq or below, or until the distance from HighestPeak1 reaches TH_HueRange. The system control circuit 60 groups the blocks whose representative values fall within this range as same-color region 1, and records the section as grouped. Next, it scans all sections of the histogram again, excluding those already grouped, and detects the highest remaining peak, HighestPeak2 (FIG. 3(B)). Repeating the same processing until every section of the histogram has been grouped divides the hue histogram into n same-color groups. If, while scanning from a peak toward a valley, a section that has already been grouped is reached before a section whose frequency is TH_Freq or below, the same-color region extends only up to that grouped section.
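The peak-and-valley grouping described above can be sketched as follows. This is an illustrative reading, not the embodiment's code: the function name `group_hue_histogram` is hypothetical, the histogram is treated as linear rather than circular, and the bin at which the frequency first drops to `th_freq` or below is assumed to belong to the group being grown.

```python
def group_hue_histogram(hist, th_freq, th_hue_range):
    """Return, for each histogram bin, the index of the same-color group it belongs to."""
    group_of = [None] * len(hist)
    group = 0
    while None in group_of:
        # Highest peak among the bins that have not been grouped yet.
        peak = max((i for i, g in enumerate(group_of) if g is None),
                   key=lambda i: hist[i])
        group_of[peak] = group
        for step in (-1, 1):  # scan toward the left valley, then the right valley
            i = peak + step
            while (0 <= i < len(hist)
                   and group_of[i] is None            # stop at an already grouped bin
                   and abs(i - peak) <= th_hue_range):  # stop at the hue-range limit
                group_of[i] = group
                if hist[i] <= th_freq:  # valley reached: this bin closes the section
                    break
                i += step
        group += 1
    return group_of
```

For example, `group_hue_histogram([1, 5, 9, 4, 0, 0, 3, 7, 2], th_freq=1, th_hue_range=3)` yields two groups, one around the peak at bin 2 and one around the peak at bin 7.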

Next, in step S105, the system control circuit 60 scans all blocks and performs labeling processing in which adjacent, contiguous same-color regions are assigned to the same group and non-adjacent same-color regions to different groups. Performing this labeling for each of the colors makes it possible to recognize separate objects that fell into the same color region but lie apart from each other as different regions.
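The labeling of step S105 corresponds to connected-component labeling on the block grid. The sketch below assumes 4-connectivity and a breadth-first flood fill; the source does not specify either, and the name `label_regions` is hypothetical.

```python
from collections import deque

def label_regions(groups):
    """groups: 2-D list of same-color group ids per block (None = block not grouped).

    Returns a grid of region labels in which blocks share a label only when they
    have the same color group and are connected through 4-neighbor adjacency.
    """
    rows, cols = len(groups), len(groups[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 0
    for y in range(rows):
        for x in range(cols):
            if groups[y][x] is None or labels[y][x] is not None:
                continue
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] is None
                            and groups[ny][nx] == groups[y][x]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels
```

With this labeling, two objects of the same color group that do not touch receive different labels, which is the behavior the paragraph above describes.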

Next, in step S106 of FIG. 2, the system control circuit 60 calculates a subject evaluation value P for each of the regions classified by the labeling processing of step S105. The following methods can be used to calculate P.
・A method of assigning an evaluation value (center distance evaluation value) according to the distance from the screen center. For example, as shown in FIG. 4, the evaluation value is set so that it decreases as the distance from the screen center increases. In FIG. 4, the subject evaluation value becomes 0 when the distance from the screen center exceeds R.
・A method of obtaining the area S of a region by counting the total number of blocks in the same group, and assigning a higher evaluation value (subject area evaluation value) according to that area. For example, as shown in FIG. 5(A), a larger area is given a higher subject evaluation value. For an image such as FIG. 5(B), the subject 200 has a larger area than the subject 201, so the subject 200 receives the higher subject evaluation value.
・A method of obtaining the saturation within each region and assigning a higher evaluation value (subject saturation evaluation value) according to the saturation.
・A method of calculating the subject evaluation value P by weighted addition of one or more of the above evaluation values, for example by the following equation:
P = w₁ × p₁ + w₂ × p₂ + w₃ × p₃
w₁, w₂, w₃: weighting coefficients
p₁: center distance evaluation value
p₂: subject area evaluation value
p₃: subject saturation evaluation value
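The weighted addition can be sketched as below. The shapes of the individual evaluation values are assumptions: a linear falloff to 0 at distance `r_max` for the center distance value, the region's share of all blocks for the area value, and a saturation already normalized to 0..1. The weights and function names are likewise hypothetical.

```python
def center_distance_score(distance, r_max):
    """Decreases with distance from the screen center and is 0 beyond r_max."""
    return max(0.0, 1.0 - distance / r_max)

def subject_score(distance, area, saturation, r_max, total_blocks,
                  weights=(0.5, 0.3, 0.2)):
    """P = w1*p1 + w2*p2 + w3*p3 with assumed, normalized component scores."""
    p1 = center_distance_score(distance, r_max)  # center distance evaluation value
    p2 = area / total_blocks                     # subject area evaluation value
    p3 = saturation                              # subject saturation evaluation value
    w1, w2, w3 = weights
    return w1 * p1 + w2 * p2 + w3 * p3
```

Normalizing each component to the range 0..1 keeps the weights directly comparable; the source does not state how its evaluation values are scaled.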

Finally, in step S107 of FIG. 2, the system control circuit 60 determines the main subject region from the subject evaluation values P calculated in step S106 and extracts it from the image. The region judged to be the main subject region may be, for example, the region with the highest evaluation value calculated in step S106, or a region combining multiple regions whose evaluation values are at or above a predetermined threshold.
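Both selection rules mentioned for step S107 fit in a few lines. The function name `select_main_subject` and the dictionary representation of the scores are assumptions of this sketch.

```python
def select_main_subject(scores, threshold=None):
    """scores: dict mapping region label -> subject evaluation value P.

    Without a threshold, returns the single best region; with a threshold,
    returns every region whose P is at or above it (to be merged by the caller).
    """
    if threshold is None:
        return [max(scores, key=scores.get)]
    return sorted(label for label, p in scores.items() if p >= threshold)
```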

Next, the processing that passes information on the main subject region detected by the above main subject detection processing to the subject tracking processing is described with reference to FIGS. 6 and 7.

FIG. 6 is a flowchart covering the flow from main subject detection processing to subject tracking processing in this embodiment. The system control circuit 60 executes the main subject detection processing described above and the position information update processing and tracking processing described below. The position information update processing (S302) controls, based on the main subject information detected by the main subject detection processing (S301), the updating of the subject information passed to the subject tracking processing (S303). For example, when no subject is being tracked and a subject with a sufficiently high subject evaluation value P (a main subject) is detected, the position information update processing acquires a specific position within the main subject region in that frame as the main subject position and passes it to the subject tracking processing. For instance, the position information update unit passes the position information (x, y) of the center or the centroid of the main subject region as the main subject position. The main subject region used to obtain this position may be, for example, the detected subject region itself or a rectangle enclosing it.
Based on the received position information (main subject position), the subject tracking processing re-sets the tracking frame best suited to tracking according to the algorithm of the subject tracking unit, independently of the main subject region detected by the main subject detection processing, and then executes tracking. When a main subject is already being tracked, the position information update processing decides whether the information should be updated, based on the subject evaluation value and position information of the subject newly detected by the main subject detection processing.
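As an illustration of the position handed to the tracker, the following sketch computes the centroid of a labeled region in block coordinates. The name `region_centroid` and the label-grid input are assumptions; the center of an enclosing rectangle would serve equally well according to the text.

```python
def region_centroid(labels, target):
    """Return the centroid (x, y), in block units, of the blocks labeled `target`."""
    cells = [(x, y)
             for y, row in enumerate(labels)
             for x, value in enumerate(row)
             if value == target]
    n = len(cells)
    return (sum(x for x, _ in cells) / n, sum(y for _, y in cells) / n)
```

Only this single position, not the region outline, would then be passed on, which matches the description that the tracking frame is set independently of the detected region.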

An example of the decision criteria used in the position information update processing is described below with reference to FIG. 7.

When no subject is being tracked, the position information update processing (S302) provides the position of the main subject to the subject tracking processing upon its detection, as described above. When the subject tracking processing is already tracking a subject, the main subject position is passed to it, causing the tracking target to change, under the following conditions. Suppose, as in FIG. 7(A), that Object1 and Object2 are both in the screen, Object1 is currently identified as the main subject, and it is being tracked by the subject tracking processing. In this state, if any of the following conditions is satisfied, the position information update unit passes the position information (x2, y2) of Object2 to the subject tracking unit as the new tracking target.
・Object1 has moved out of the frame, as in FIG. 7(B). Frame-out can be detected from the tracking result of the subject tracking processing.
・Object1 has moved to the edge of the screen, Object2 is located near the center of the screen, and the subject evaluation value P2 of Object2 exceeds a predetermined threshold P_Th (P2 > P_Th), as in FIG. 7(C). That is, the tracking position is far from the screen center and Object2 is at the screen center.
・The evaluation value P2 of Object2 is higher than the evaluation value P1 of Object1 (P1 < P2). In this case, however, the evaluation is made with a larger weight given to the position of Object1, which is being tracked (the tracking position).
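The three switching conditions can be written as one predicate. Everything beyond the conditions themselves is assumed here: the function and parameter names, the use of distances from the screen center to express "edge" and "near the center", and the modeling of the extra weight for the tracked position as a multiplicative bias on P1.

```python
def should_switch_target(tracked_in_frame, tracked_dist, cand_dist, p1, p2,
                         p_th, edge_dist, center_dist, tracked_bias=1.2):
    """Decide whether tracking should move from Object1 (tracked) to Object2 (candidate).

    tracked_dist / cand_dist: distances of each object from the screen center.
    """
    # Condition 1: the tracked object has left the frame.
    if not tracked_in_frame:
        return True
    # Condition 2: tracked object at the edge, candidate near the center, P2 > P_Th.
    if tracked_dist >= edge_dist and cand_dist <= center_dist and p2 > p_th:
        return True
    # Condition 3: candidate scores higher even after favoring the tracked object.
    return p2 > p1 * tracked_bias
```

The bias in the third condition adds hysteresis, so that small fluctuations in the evaluation values do not make the tracker jump between objects.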

Next, the subject tracking processing is described in detail. When the position information of the main subject has been updated by the position information update processing (S302) described above, the subject tracking processing (S303) tracks the subject according to the subject tracking algorithm and keeps capturing the moving object. FIG. 8 is a flowchart of the subject tracking algorithm according to this embodiment. FIG. 9 shows, as an example of a subject tracking scene, an input image containing the main subject and the image obtained after block division.

The subject tracking algorithm of this embodiment divides broadly into a template setting unit, which determines the template (feature pixels), and a tracking unit, which detects image patterns similar to the template within the target image. Subject tracking in this embodiment follows the subject by template matching on color information. To improve tracking performance, it is important to select a template that tolerates small changes in the subject's color and shape while suppressing false tracking.

The processing flow of the template setting unit is described in detail below. The template setting unit sets the reference range used to generate the tracking template based on the main subject position acquired by the main subject detection processing, independently of the main subject region that processing detected. It then generates a template for tracking the main subject position from the image information of this reference range. In this embodiment, the frame is divided into a plurality of blocks, and the template is generated with the reference range set to a predetermined number of blocks consisting of the block that contains the main subject position and its surrounding blocks.

First, in step S401, the system control circuit 60 divides the image into a plurality of blocks, calculates the average of the pixel information of the pixels in each block, and takes it as the representative value of that block. This operation converts an input image 401 such as that in FIG. 9(A) into a coarse-resolution image 403 such as that in FIG. 9(B). The block division of the subject tracking processing (S401) may or may not be the same as that of the main subject detection processing (S101).
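Step S401 amounts to block averaging. The sketch below assumes RGB tuples as the pixel information and integer averaging; the name `block_average` is hypothetical.

```python
def block_average(image, block_w, block_h):
    """Reduce a 2-D list of (r, g, b) pixels to one averaged (r, g, b) per block."""
    rows, cols = len(image) // block_h, len(image[0]) // block_w
    coarse = []
    for by in range(rows):
        row = []
        for bx in range(cols):
            pixels = [image[y][x]
                      for y in range(by * block_h, (by + 1) * block_h)
                      for x in range(bx * block_w, (bx + 1) * block_w)]
            n = len(pixels)
            # Integer mean per channel; rounding mode is an assumption of this sketch.
            row.append(tuple(sum(p[c] for p in pixels) // n for c in range(3)))
        coarse.append(row)
    return coarse
```

Each entry of the returned grid then plays the role of one pixel of the coarse-resolution image.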

Suppose that the subject position detected by the main subject detection processing (the main subject position) is given by coordinates 402. In step S402, based on the coordinates 402, a region 405 consisting of the block 404 that contains the subject position coordinates on the image 403 and its eight neighboring blocks is set as the feature candidate colors of the tracking target (the reference range). As shown in FIG. 10, the region 405 is a nine-block region made up of blocks A to I, where the block 404 corresponds to block E at the center of the region 405. In this embodiment, as an example, the entire screen is divided into 32 × 24 blocks, as shown in FIG. 10. From these nine feature candidate blocks, the blocks that can serve as features of the tracking target, that is, blocks that are distinctive with respect to the image as a whole, are extracted. First, for each feature candidate block, the processing examines where in the screen similar blocks are distributed and how many there are. In this embodiment, whether a block is similar to a feature candidate block is determined, for example, by whether it has a similar color to that block.
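Setting the reference range in step S402 can be sketched as a mapping from the pixel coordinates of the main subject position to a 3 × 3 neighborhood on the 32 × 24 block grid. The function name `reference_blocks` is hypothetical, and dropping neighbors that fall outside the grid is an assumption, since the source does not say how positions at the screen edge are handled.

```python
def reference_blocks(x, y, image_w, image_h, cols=32, rows=24):
    """Return the (bx, by) indices of the block containing (x, y) and its neighbors.

    Blocks are listed row by row, so for an interior position the result has
    nine entries (A..I) and the containing block (E) is at index 4.
    """
    bx = min(x * cols // image_w, cols - 1)
    by = min(y * rows // image_h, rows - 1)
    return [(cx, cy)
            for cy in range(by - 1, by + 2)
            for cx in range(bx - 1, bx + 2)
            if 0 <= cx < cols and 0 <= cy < rows]
```

For a 640 × 480 frame, `reference_blocks(320, 240, 640, 480)` is centered on block (16, 12).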

FIG. 11 shows the distribution obtained by extracting, from the screen, the colors similar to those of feature candidate blocks A to I. The shaded blocks in FIG. 11 are the blocks that have a color similar to the corresponding candidate block. Here, as expressed by (Equation 1), a block is judged to have a similar color when the difference sums of its R, G, and B components with respect to the feature candidate block are all within a predetermined signal level ThRGB. The difference sum of each of the R, G, and B components takes a value from 0 to 255.
In (Equation 1), ThRGB = 30, n = feature color (A…I), x = horizontal coordinate (0…31), and y = vertical coordinate (0…23).
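One plausible reading of the similar-color test is a per-channel absolute-difference comparison against ThRGB. The sketch below assumes that reading, treats "within" as less than or equal, and uses hypothetical names (`is_similar`, `similar_map`); only the threshold value 30 comes from the text.

```python
TH_RGB = 30  # threshold value stated in the text


def is_similar(candidate_rgb, block_rgb, th=TH_RGB):
    """True when every channel differs by at most `th` (assumed inclusive)."""
    return all(abs(c - b) <= th for c, b in zip(candidate_rgb, block_rgb))


def similar_map(grid, candidate_rgb, th=TH_RGB):
    """Mark every block of the coarse image that is similar to the candidate."""
    return [[is_similar(candidate_rgb, rgb, th) for rgb in row] for row in grid]


grid = [
    [(200, 40, 40), (190, 60, 50)],
    [(90, 90, 90), (200, 80, 40)],
]
mask = similar_map(grid, candidate_rgb=(200, 40, 40))
```

Running `similar_map` once per feature candidate A to I would give nine masks of the kind illustrated by the shaded blocks in FIG. 11.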

When the ball in the example scene of FIG. 9 is to be the tracking target, the distribution in FIG. 11 shows that blocks B, C, and E have few similar-color blocks around them; they have high specificity and can be presumed to be colors that characterize the ball. By contrast, for block G and the like, many similar-color pixels are distributed in the surroundings, so it is likely that the color of the floor has been extracted, and it is doubtful that such a block can serve as a feature of the ball. Block G is therefore excluded from the feature candidates. Here, a block is excluded from the feature candidates when the ratio of the number of similar-color pixels extracted from the entire screen to the number of similar-color pixels extracted from the feature candidate pixels A to I is equal to or greater than a threshold. After the blocks unsuitable as a tracking target have been excluded from the nine feature candidate blocks in this way, the block or blocks that finally remain are used as a template unique to the subject. In other words, the blocks to be used for tracking are selected based on the specificity, within the frame, of the image information of each of the predetermined number of blocks in the reference range.
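The ratio-based exclusion can be illustrated as below. This is a sketch under assumptions: the ratio is read as (similar blocks in the whole screen) / (similar blocks inside the reference range), counts are taken per block rather than per pixel, and the threshold value 2.0 is invented for the example because the text does not give one.

```python
def is_excluded(mask, reference_blocks, ratio_threshold):
    """Decide whether a feature candidate should be dropped.

    mask[y][x] is True where a block has a colour similar to the candidate.
    reference_blocks lists the (x, y) positions of the reference range.
    The candidate block matches itself, so `inside` is at least 1.
    """
    total = sum(cell for row in mask for cell in row)
    inside = sum(mask[y][x] for (x, y) in reference_blocks)
    return total / inside >= ratio_threshold


F, T = False, True
reference = [(x, y) for y in range(3) for x in range(3)]

# Candidate whose colour appears only near the subject (ball-like).
ball_mask = [[F, F, F, F], [F, T, T, F], [F, F, F, F], [F, F, F, F]]
# Candidate whose colour also fills a row outside the range (floor-like).
floor_mask = [[F, F, F, F], [F, F, F, F], [F, T, F, F], [T, T, T, T]]

ball_excluded = is_excluded(ball_mask, reference, ratio_threshold=2.0)
floor_excluded = is_excluded(floor_mask, reference, ratio_threshold=2.0)
```

With these inputs the ball-like candidate has a ratio of 1.0 and is kept, while the floor-like candidate has a ratio of 5.0 and is excluded.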

In addition to the above criterion, feature candidate blocks may be extracted by combining one or more of the following criteria:
・A feature candidate whose area of similar-color pixels is too large is excluded (FIG. 11(A)).
・A feature candidate whose similar-color pixels partly touch the edge of the screen is judged likely to be a background image and is excluded (FIG. 11(I)).
・A feature candidate whose similar colors are widely distributed in the screen is excluded, because erroneous tracking of the background is considered likely (FIG. 11(D)).
With a feature block extracted by the above method as the center (C), the five blocks consisting of that block and the upper (U), lower (D), left (L), and right (R) blocks shown in FIG. 12 are set as a template (tracking frame), and subject tracking is realized by template matching. As many templates are generated as there are blocks extracted from the candidate blocks as suitable for tracking. When a plurality of feature pixels remain, the most appropriate template is selected using any of the variance, area, or saturation of the similar colors. Alternatively, the tracking unit described later may use the plurality of templates in the first tracking operation and perform pattern matching for each template to improve the accuracy of capturing the subject position.
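The five-block template of FIG. 12 could be represented as a small dictionary keyed by C, U, D, L, and R. The helper name `cross_template` and the assumption that the center block does not lie on the border of the grid are illustrative choices only.

```python
def cross_template(grid, center_xy):
    """Collect the centre block and its four direct neighbours.

    grid[y][x] holds the representative (R, G, B) value of a block.
    Assumption: the centre block is not on the border of the grid.
    """
    x, y = center_xy
    offsets = {"C": (0, 0), "U": (0, -1), "D": (0, 1), "L": (-1, 0), "R": (1, 0)}
    return {key: grid[y + dy][x + dx] for key, (dx, dy) in offsets.items()}


grid = [
    [(0, 0, 0), (1, 1, 1), (2, 2, 2)],
    [(3, 3, 3), (4, 4, 4), (5, 5, 5)],
    [(6, 6, 6), (7, 7, 7), (8, 8, 8)],
]
template = cross_template(grid, (1, 1))
```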

Next, the operation of the tracking unit in the subject tracking process is described in detail. The basic idea of tracking is to search the target screen (a subsequent frame) for pixels similar to the feature color RGB stored as a feature block (template) by the template setting unit, and to take a pixel position with a high similarity as the tracking target position. In addition, when another object with a color similar to that of the tracking target exists in the screen, or when a color similar to that of the tracking target exists in the background, the similarity is calculated taking the previously detected position into account, so that the similarity becomes higher the closer a position is to the previous tracking target detection position. This prevents hunting of the tracking target detection result and erroneous tracking onto a similar-color background.

First, in step S403 of FIG. 8, the system control circuit 60 generates a weight table so that positions closer to the previous tracking target detection position receive a higher similarity and are preferentially selected as the tracking target position. The weight table takes the previous tracking target detection position as its base point and assigns each position in the screen a weight according to its distance from the base point; the amount by which the weight changes with distance is varied adaptively according to various conditions. FIG. 13 shows an example of a design curve of the weight table as a function of the distance r obtained by (Equation 2).
$(x_{t-1}, y_{t-1})$: previous tracking target detection position
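A weight table of this kind can be sketched as follows. Two assumptions are made loudly here: the distance r is taken to be the Euclidean distance from the previous detection position, and the design curve is replaced by a simple linear falloff that reaches zero at a hypothetical radius `r_max`. The actual curve of FIG. 13 is not specified in the text.

```python
import math


def weight_table(prev_xy, grid_size=(32, 24), r_max=8.0):
    """Build a grid of weights centred on the previous detection position.

    Assumptions: r is the Euclidean distance in block units, and the weight
    falls linearly from 1.0 at r = 0 to 0.0 at r >= r_max.
    """
    px, py = prev_xy
    gx, gy = grid_size
    table = []
    for y in range(gy):
        row = []
        for x in range(gx):
            r = math.hypot(x - px, y - py)
            row.append(max(0.0, 1.0 - r / r_max))
        table.append(row)
    return table


w = weight_table((16, 12))
```

Blocks whose weight is 0.0 can be skipped during matching, which corresponds to the narrowing of the effective search area.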

FIG. 14 shows an example of a weight table expanded into 32 × 24 blocks based on the design curve of FIG. 13. When the distance-dependent weight table is set around the previous detection position in FIG. 14, the weight becomes 0 beyond a certain distance, so no search is effectively needed there. Thus, depending on the design of the weight table, the possibility of erroneous tracking can be reduced, and the effective search area can be narrowed to reduce the load of the matching computation. However, making the search area too narrow causes problems such as the inability to track fast-moving objects, so the table must be designed with care.

Next, in step S404 of FIG. 8, the system control circuit 60 divides the image to be processed into blocks in the same manner as when the template was generated, scans the search area determined in step S403, and searches for a tracking target position having a high similarity to the template. As shown in FIG. 12, let the RGB values of the center block (C) of the template and of the four blocks above (U), below (D), to the left (L), and to the right (R) of it be C(Rc, Gc, Bc), U(Ru, Gu, Bu), D(Rd, Gd, Bd), L(Rl, Gl, Bl), and R(Rr, Gr, Br), respectively. Similarly, let the RGB values of the matching target block and of the blocks above, below, left, and right of it be C'(Rc', Gc', Bc'), U'(Ru', Gu', Bu'), D'(Rd', Gd', Bd'), L'(Rl', Gl', Bl'), and R'(Rr', Gr', Br'). The evaluation value P1 can then be expressed as the sum of the differences in RGB values from the template, as shown in (Equation 3).
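The evaluation value P1 can be illustrated with a weighted sum of absolute differences over the five blocks. This form is an assumption consistent with the description (a difference sum of RGB values, with per-channel weights α, β, γ), not a reproduction of (Equation 3); the function name `evaluation_p1` is hypothetical.

```python
def evaluation_p1(template, target, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum of absolute RGB differences over the C, U, D, L, R blocks.

    Assumed form: sum over blocks of
    alpha*|R - R'| + beta*|G - G'| + gamma*|B - B'|.
    """
    total = 0.0
    for key in ("C", "U", "D", "L", "R"):
        (r, g, b), (r2, g2, b2) = template[key], target[key]
        total += alpha * abs(r - r2) + beta * abs(g - g2) + gamma * abs(b - b2)
    return total


template = {key: (200, 40, 40) for key in "CUDLR"}
target = dict(template)
target["C"] = (190, 50, 40)  # only the centre block differs

p1_same = evaluation_p1(template, template)
p1_diff = evaluation_p1(template, target)
```

A smaller value means a closer match, so an identical neighborhood yields 0.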

Here, α, β, and γ are weights applied to R, G, and B, respectively; setting α = β = γ treats R, G, and B equally. The smaller the evaluation value P1, the smaller the difference between the feature color at the coordinates (x, y) and the template, that is, the higher the similarity. In this embodiment, the evaluation value P1 obtained by Equation 3 is used as the input to a conversion table such as that shown in FIG. 15 to obtain an output evaluation value P2. Next, as shown in Equation 4, the system control circuit 60 obtains the similarity Ps to the template at the coordinates (x, y) by multiplying the output evaluation value P2 by the weight value $w_{(x,y)}$ given by the weight table obtained in step S403, that is, $P_s = P_2 \cdot w_{(x,y)}$.

Because the similarity Ps increases as the degree of match with the template increases, the similarity is obtained for all pixels in the search area, and the coordinates with the highest similarity are determined as the subject position.
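Putting the conversion, the weighting, and the search together gives a sketch like the one below. The conversion from P1 to P2 is a hypothetical linear mapping standing in for the table of FIG. 15, and `best_position` is an illustrative name; only the product of P2 and the weight, and the choice of the maximum, come from the text.

```python
def convert_p1_to_p2(p1, p1_max=255.0):
    """Hypothetical stand-in for the conversion table: 1.0 at P1 = 0,
    falling linearly to 0.0 at P1 >= p1_max."""
    return max(0.0, 1.0 - p1 / p1_max)


def best_position(p1_map, weights):
    """Return ((x, y), Ps) for the block with the highest similarity Ps.

    Returns (None, 0.0) when no block has a positive similarity.
    """
    best_xy, best_ps = None, 0.0
    for y, row in enumerate(p1_map):
        for x, p1 in enumerate(row):
            w = weights[y][x]
            if w == 0.0:
                continue  # zero weight: outside the effective search area
            ps = convert_p1_to_p2(p1) * w  # Ps = P2 * w(x, y)
            if ps > best_ps:
                best_xy, best_ps = (x, y), ps
    return best_xy, best_ps


p1_map = [[0.0, 51.0], [255.0, 102.0]]
weights = [[0.5, 1.0], [1.0, 0.0]]
position, similarity = best_position(p1_map, weights)
```

In this example the block at (0, 0) matches the template perfectly but is down-weighted, so the block at (1, 0) is chosen. This shows how the weight table lets proximity to the previous position outweigh a slightly better color match.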

Finally, in step S405, the system control circuit 60 updates various information such as the exposure and the liquid crystal display based on the tracking result obtained in step S404, and saves the pixels above, below, left, and right of the new subject coordinates as a new template. Thereafter, subject tracking is executed frame by frame in the same manner based on the updated template, so that the subject continues to be captured. The above processing of the tracking unit (steps S403 to S405) is repeated until new subject information is provided by the position information update process (S302), or until tracking becomes impossible, for example because the tracking target has moved out of the frame (step S406). When new subject information is provided by the position information update process, the subject tracking process is executed from the beginning of the process shown in FIG. 8, that is, from the processing of the template setting unit.

As described above, with the configuration of the above embodiment, the main subject is detected automatically, the subject tracking template is reset based on the position information of the main subject, and subject tracking is performed using the template best suited to subject tracking. With this configuration, a subject of high interest to the user can continue to be tracked with high accuracy even while it is moving.

Although preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications and changes are possible within the scope of its gist.

The present invention can also be realized by executing the following processing: software (a program) that realizes the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads out and executes the program.

Claims

An image processing apparatus comprising:
detecting means for detecting a subject area from at least one frame of a moving image;
setting means for setting, using a method different from that of the detecting means, an area for extracting image information used in a tracking process from an area set based on the position of the subject area detected by the detecting means; and
tracking means for executing the tracking process based on the image information of the area set by the setting means,
wherein the setting means divides one of the frames into a plurality of blocks, sets, as a reference range, a predetermined number of blocks consisting of a block including the position of the subject area and its surrounding blocks from among the plurality of blocks, and selects one or more blocks from the blocks included in the reference range as the area for extracting the image information.

The image processing apparatus according to claim 1, wherein the setting means selects, from the blocks included in the reference range, the one or more blocks as the area for extracting the image information based on the specificity of each block within the frame.

The image processing apparatus according to claim 1 or 2, wherein the tracking means executes the tracking process using a template generated based on the image information of the area set by the setting means.

The image processing apparatus according to any one of claims 1 to 3, wherein the tracking means executes the tracking process using a template generated from the block selected as the area for extracting the image information and the blocks adjacent to the selected block.

The image processing apparatus according to claim 3, wherein the tracking means executes the tracking process by pattern matching using the template.

The image processing apparatus according to any one of claims 1 to 5, wherein the detecting means recognizes a continuous region of similar colors as a single subject area and sets one of the recognized subject areas as a main subject area, and the area for extracting the image information is set based on the position of the main subject area.

The image processing apparatus according to any one of claims 1 to 6, wherein, when a plurality of subject areas are detected by the detecting means, the setting means sets the area for extracting the image information based on the position of a main subject area that is one of the plurality of subject areas, and sets the area for extracting the image information when the main subject area is switched to another subject.

An image processing apparatus comprising:
detecting means for detecting a subject area from at least one frame of a moving image;
setting means for setting, using a method different from that of the detecting means, an area for extracting image information used in a tracking process from an area set based on the position of the subject area detected by the detecting means; and
tracking means for executing the tracking process based on the image information of the area set by the setting means,
wherein the detecting means recognizes a continuous region of similar colors as a single subject area and sets one of the recognized subject areas as a main subject area, and the area for extracting the image information is set based on the position of the main subject area.

An image processing apparatus comprising:
detecting means for detecting a subject area from at least one frame of a moving image;
setting means for setting, using a method different from that of the detecting means, an area for extracting image information used in a tracking process from an area set based on the position of the subject area detected by the detecting means; and
tracking means for executing the tracking process based on the image information of the area set by the setting means,
wherein, when a plurality of subject areas are detected by the detecting means, the setting means sets the area for extracting the image information based on the position of a main subject area that is one of the plurality of subject areas, and sets the area for extracting the image information when the main subject area is switched to another subject.

A control method for an image processing apparatus, comprising:
a detecting step of detecting a subject area from at least one frame of a moving image;
a setting step in which setting means sets, using a method different from that of the detecting step, an area for extracting image information used in a tracking process from an area set based on the position of the subject area detected in the detecting step; and
a tracking step in which tracking means executes the tracking process based on the image information of the area set in the setting step,
wherein, in the setting step, one of the frames is divided into a plurality of blocks, a predetermined number of blocks consisting of a block including the position of the subject area and its surrounding blocks are set as a reference range from among the plurality of blocks, and one or more blocks are selected from the blocks included in the reference range as the area for extracting the image information.

A control method for an image processing apparatus, comprising:
a detecting step of detecting a subject area from at least one frame of a moving image;
a setting step in which setting means sets, using a method different from that of the detecting step, an area for extracting image information used in a tracking process from an area set based on the position of the subject area detected in the detecting step; and
a tracking step in which tracking means executes the tracking process based on the image information of the area set in the setting step,
wherein, in the detecting step, a continuous region of similar colors is recognized as a single subject area and one of the recognized subject areas is set as a main subject area, and the area for extracting the image information is set based on the position of the main subject area.

A control method for an image processing apparatus, comprising:
a detecting step of detecting a subject area from at least one frame of a moving image;
a setting step in which setting means sets, using a method different from that of the detecting step, an area for extracting image information used in a tracking process from an area set based on the position of the subject area detected in the detecting step; and
a tracking step in which tracking means executes the tracking process based on the image information of the area set in the setting step,
wherein, in the setting step, when a plurality of subject areas are detected in the detecting step, the area for extracting the image information is set based on the position of a main subject area that is one of the plurality of subject areas, and the area for extracting the image information is set when the main subject area is switched to another subject.

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 9.