JP2011146826A

JP2011146826A - Unit and method for processing image, and program

Info

Publication number: JP2011146826A
Application number: JP2010004541A
Authority: JP
Inventors: Masaya Kinoshita; 雅也木下; Yutaka Yoneda; 豊米田; Takashi Kametani; 敬亀谷; Kazuki Aisaka; 一樹相坂
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2010-01-13
Filing date: 2010-01-13
Publication date: 2011-07-28

Abstract

PROBLEM TO BE SOLVED: To more stably track a subject. SOLUTION: A subject map generation unit 71 generates a subject map indicating region-likelihood of a subject in the current frame, based on a weighting factor for each feature quantity from a feature quantity map indicating the feature quantity in a predetermined region of the current frame of an input image, for each feature of the input image. A candidate rectangular subject region determining unit 72 determines a rectangular region including a candidate subject region in the subject map. The subject region selection unit 73 selects a rectangular subject region including a subject of interest, from the rectangular region, based on region information on the rectangular region. A weighting factor calculation unit 74 calculates a weighting factor for weighting the feature quantity map of the next frame corresponding to a relatively large feature quantity in feature quantities in a region corresponding to a subject region on the feature quantity map for each feature quantity of the current frame. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to an image processing apparatus and method, and a program, and more particularly to an image processing apparatus and method, and a program, that enable a subject to be tracked more stably.

In recent years, imaging apparatuses have tracked a subject selected by the user in a captured image and optimally adjusted imaging parameters (focus position, brightness, and so on) according to the position of that subject.

For example, there is a technique that extracts feature amounts, such as luminance information and color information, from the subject first selected by the user in a predetermined frame of the input image, and then tracks the subject by searching the neighborhood in the next frame for a region whose feature amounts match the extracted ones (see, for example, Patent Document 1).

Patent Document 1: JP 2006-72332 A

However, because the technique described above tracks the subject based on the feature amounts of only a partial region of the subject first selected by the user, it can identify only some coordinate or a partial region within the whole subject region, and cannot stably track the subject as a whole.

In addition, tracking fails when the feature amounts of the partial region of the subject first selected by the user change because the state of the subject in the input image varies, for example the illumination on the subject (color temperature, illuminance, and so on), the posture of the subject, or its size (the distance between the imaging apparatus and the subject). For example, when color information of a partial region of the subject first selected by the user is extracted as the feature amount, the region having that color information is tracked. If the subject rotates or otherwise moves so that the selected part is hidden, no region with that color information remains in the input image, and tracking fails. The same failure can occur under low illuminance, where luminance information and color information are difficult to obtain as feature amounts.

The present invention has been made in view of these circumstances and, in particular, enables a subject to be tracked more stably.

An image processing apparatus according to one aspect of the present invention includes: subject map generation means for generating, from feature amount maps each indicating, for a respective feature of an input image, the feature amount in a predetermined region of the current frame of the input image, a subject map indicating the subject-region likelihood in the current frame, based on a weight coefficient for each feature amount; subject candidate region rectangularization means for obtaining, in the subject map, rectangular regions each including a region that is a candidate for the subject; subject region selection means for selecting, from the rectangular regions and based on region information about the rectangular regions, a subject region, which is the rectangular region that includes the subject of interest; and weight coefficient calculation means for calculating the weight coefficient that weights the feature amount map of a subsequent frame (a frame temporally after the current frame) corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the feature amount map of each feature amount of the current frame.

The subject map generation means may include: feature amount map generation means for generating the feature amount map for each feature amount from the current frame; band feature amount map generation means for generating, from the feature amount map, a plurality of band feature amount maps each indicating the feature amount of a predetermined band; band feature amount map synthesis means for synthesizing the band feature amount maps for each feature amount; and synthesized feature amount map synthesis means for generating the subject map by synthesizing the synthesized feature amount maps, one per feature amount, obtained by synthesizing the band feature amount maps. The band feature amount map synthesis means may weight and synthesize the band feature amount maps based on the weight coefficient for each feature amount and each band, and the weight coefficient calculation means may calculate the weight coefficient that weights the band feature amount map of the subsequent frame corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the band feature amount map, for each feature amount and for the predetermined band, of the current frame.

Alternatively, with the subject map generation means including the same feature amount map generation means, band feature amount map generation means, band feature amount map synthesis means, and synthesized feature amount map synthesis means, the synthesized feature amount map synthesis means may weight and synthesize the synthesized feature amount maps based on the weight coefficient for each feature amount, and the weight coefficient calculation means may calculate the weight coefficient that weights the synthesized feature amount map of the subsequent frame corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the synthesized feature amount map of each feature amount of the current frame.

The region information may be the center coordinates or the size of the rectangular region on the subject map, and the subject region selection means may select, as the subject region, the rectangular region whose center coordinates or size are closest to the center coordinates or size of the subject region selected in a previous frame, which is temporally before the current frame.

The region information may be the integral value or the peak value of the feature amount in the rectangular region on the subject map, and the subject region selection means may select, as the subject region, the rectangular region whose integral value or peak value of the feature amount is closest to the integral value or peak value of the feature amount in the subject region selected in a previous frame, which is temporally before the current frame.

An image processing method according to one aspect of the present invention includes: a subject map generation step of generating, from feature amount maps each indicating, for a respective feature of an input image, the feature amount in a predetermined region of the current frame of the input image, a subject map indicating the subject-region likelihood in the current frame, based on a weight coefficient for each feature amount; a subject candidate region rectangularization step of obtaining, in the subject map, rectangular regions each including a region that is a candidate for the subject; a subject region selection step of selecting, from the rectangular regions and based on region information about the rectangular regions, a subject region, which is the rectangular region that includes the subject of interest; and a weight coefficient calculation step of calculating the weight coefficient that weights the feature amount map of a subsequent frame (a frame temporally after the current frame) corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the feature amount map of each feature amount of the current frame.

A program according to one aspect of the present invention causes a computer to execute processing including: a subject map generation step of generating, from feature amount maps each indicating, for a respective feature of an input image, the feature amount in a predetermined region of the current frame of the input image, a subject map indicating the subject-region likelihood in the current frame, based on a weight coefficient for each feature amount; a subject candidate region rectangularization step of obtaining, in the subject map, rectangular regions each including a region that is a candidate for the subject; a subject region selection step of selecting, from the rectangular regions and based on region information about the rectangular regions, a subject region, which is the rectangular region that includes the subject of interest; and a weight coefficient calculation step of calculating the weight coefficient that weights the feature amount map of a subsequent frame (a frame temporally after the current frame) corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the feature amount map of each feature amount of the current frame.

In one aspect of the present invention, a subject map indicating the subject-region likelihood in the current frame is generated, based on a weight coefficient for each feature amount, from feature amount maps each indicating, for a respective feature of an input image, the feature amount in a predetermined region of the current frame of the input image; rectangular regions each including a region that is a candidate for the subject are obtained in the subject map; a subject region, which is the rectangular region that includes the subject of interest, is selected from the rectangular regions based on region information about the rectangular regions; and the weight coefficient that weights the feature amount map of the next frame corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the feature amount map of each feature amount of the current frame is calculated.

According to one aspect of the present invention, a subject can be tracked more stably.

A block diagram showing a configuration example of an embodiment of an image processing apparatus to which the present invention is applied.
A block diagram showing a configuration example of the subject tracking unit.
A block diagram showing a configuration example of the subject map generation unit.
A block diagram showing a configuration example of the subject candidate region rectangularization unit.
A block diagram showing a configuration example of the subject region selection unit.
A flowchart explaining the subject tracking process.
A flowchart explaining the subject map generation process.
A diagram showing a specific example of the subject map generation process.
A flowchart explaining the subject candidate region rectangularization process.
A diagram showing a specific example of the subject candidate region rectangularization process.
A flowchart explaining the subject region selection process.
A diagram explaining the subject region feature amount sum of a band feature amount map.
A diagram explaining the weight coefficients.
A block diagram showing a configuration example of the hardware of a computer.

Embodiments of the present invention are described below with reference to the drawings.

[Configuration example of image processing apparatus]
FIG. 1 shows a configuration example of an embodiment of an image processing apparatus to which the present invention is applied.

The image processing apparatus 11 in FIG. 1 is provided in, for example, an imaging apparatus such as a digital camera that captures a moving subject.

The image processing apparatus 11 includes an optical system 31, an imager 32, a digital signal processing unit 33, a control unit 34, a lens driving unit 35, an interface control unit 36, and a user interface 37.

The optical system 31 is configured as an optical system including an imaging lens (not shown). Light incident on the optical system 31 is photoelectrically converted by the imager 32, which is an image sensor such as a CCD (Charge Coupled Device). The electrical signal (analog signal) produced by the imager 32 is converted into digital image data by an A/D (Analog to Digital) converter (not shown) and supplied to the digital signal processing unit 33.

The digital signal processing unit 33 applies predetermined signal processing to the digital signal (image data) from the imager 32. The digital signal processing unit 33 includes a preprocessing unit 51, a demosaic processing unit 52, a YC generation unit 53, a resolution conversion unit 54, and a subject tracking unit 55.

The preprocessing unit 51 applies to the image data from the imager 32 processing such as clamp processing, which clamps the R, G, and B black levels to a predetermined level, and correction processing between the R, G, and B color channels. The demosaic processing unit 52 performs demosaic processing so that the image data for each pixel has all of the R, G, and B color components. The YC generation unit 53 generates (separates) a luminance (Y) signal and a color (C) signal from the R, G, and B image data. The resolution conversion unit 54 performs resolution conversion on the image data that has undergone the various kinds of signal processing, and supplies the result to the control unit 34 and to an encoding processing unit (not shown).

The subject tracking unit 55 executes subject tracking processing: based on the image data consisting of the luminance signal and color signal generated by the YC generation unit 53, it detects and tracks the subject in the input image (captured image) corresponding to that image data.

Here, a subject is detected on the assumption that the subject is the object in the input image that the user is estimated to pay attention to when glancing at the input image, that is, the object the user is estimated to look at. The subject is therefore not necessarily limited to a person.

The subject tracking unit 55 supplies the control unit 34 with data on a subject frame, obtained as a result of the subject tracking processing, that represents the region of the input image containing the subject. Details of the subject tracking unit 55 are described later with reference to FIG. 2.

The control unit 34 controls each unit of the image processing apparatus 11 based on control signals supplied from the interface control unit 36.

For example, the control unit 34 supplies the digital signal processing unit 33 with parameters and the like used for the various kinds of signal processing, acquires from the digital signal processing unit 33 the data (including image data) obtained as a result of that signal processing, and supplies the data to the interface control unit 36. The control unit 34 also supplies the lens driving unit 35 with control signals for driving the imaging lens included in the optical system 31 and for adjusting the aperture and the like.

The user interface 37 consists of input devices, such as buttons and a keyboard, that the user operates to enter instructions to the image processing apparatus 11, and output devices, such as an LCD (Liquid Crystal Display) and a microphone, that provide (display) information to the user. When the user interface 37 serving as a button is operated, a control signal representing an instruction to the image processing apparatus 11 is supplied to the control unit 34 via the interface control unit 36. Information corresponding to a control signal (data) supplied from the control unit 34 via the interface control unit 36 is displayed on the user interface 37 serving as the LCD. The LCD displays, for example, the input image corresponding to the image data together with the subject frame that results from the subject tracking processing for the subject in the input image.

[Configuration example of subject tracking unit]
Next, a configuration example of the subject tracking unit 55 in FIG. 1 is described with reference to FIG. 2.

The subject tracking unit 55 in FIG. 2 includes a subject map generation unit 71, a subject candidate region rectangularization unit 72, a subject region selection unit 73, and a weight coefficient calculation unit 74.

The subject map generation unit 71 generates, for each feature of the input image such as luminance and color, a feature amount map indicating the feature amount in a predetermined region of a predetermined frame of the input image, and supplies the maps to the weight coefficient calculation unit 74. Based on the generated feature amount maps and the weight coefficient for each feature amount supplied from the weight coefficient calculation unit 74, the subject map generation unit 71 also generates a subject map indicating the subject-region likelihood in the input image.

More specifically, the subject map generation unit 71 generates the subject map by weighted addition of the information (feature amounts) in each region of the per-feature feature amount maps, region by region at the same position. The subject map generation unit 71 supplies the generated subject map to the subject candidate region rectangularization unit 72.
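The weighted addition described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `weighted_sum_maps`, the list-of-lists map representation, and the sample values are assumptions made here and do not come from the patent text.

```python
def weighted_sum_maps(feature_maps, weights):
    """Add equally sized 2-D maps position by position, scaling each map by its weight."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    subject_map = [[0.0] * cols for _ in range(rows)]
    for fmap, weight in zip(feature_maps, weights):
        for y in range(rows):
            for x in range(cols):
                subject_map[y][x] += weight * fmap[y][x]
    return subject_map


# Hypothetical 2x2 feature amount maps for two features.
luminance = [[0, 2], [4, 6]]
color = [[1, 1], [3, 5]]
subject_map = weighted_sum_maps([luminance, color], [0.5, 1.0])
```

Positions where several weighted maps respond strongly end up with large values in the result, which matches the reading of the subject map as a per-position likelihood.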

In each feature amount map, a region with more information, that is, a region of the input image corresponding to a region with larger feature amounts, is more likely to include the subject. The region of the input image that includes the subject can therefore be identified from the feature amount maps.

The subject candidate region rectangularization unit 72 obtains, in the subject map from the subject map generation unit 71, rectangular regions each including a region that is a candidate for the subject, that is, a region of the subject map with a large amount of information, and supplies coordinate information representing the coordinates of each rectangular region to the subject region selection unit 73. The subject candidate region rectangularization unit 72 also calculates information about the rectangular region represented by the coordinate information on the subject map (hereinafter referred to as region information) and supplies it to the subject region selection unit 73 in association with the coordinate information.

The subject region selection unit 73 selects, from the rectangular regions and based on the region information from the subject candidate region rectangularization unit 72, the subject region, which is the rectangular region that includes the subject of interest to be tracked, and supplies the coordinate information of that subject region to the control unit 34 (FIG. 1) and the weight coefficient calculation unit 74.

The weight coefficient calculation unit 74 calculates the weight coefficient that weights the feature amount map of the next frame corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on each feature amount map of the predetermined frame from the subject map generation unit 71, and supplies the weight coefficient to the subject map generation unit 71.
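One way to picture the weight calculation is the sketch below. The passage above does not give a formula, so the rule used here (sum each map inside the subject region and divide by the largest sum) is an assumption chosen purely for illustration, as are the names `region_sum` and `weights_for_next_frame` and the `(left, top, right, bottom)` rectangle layout with exclusive right and bottom edges.

```python
def region_sum(fmap, rect):
    """Sum the feature amounts of one map inside rect = (left, top, right, bottom)."""
    left, top, right, bottom = rect  # right and bottom are exclusive (assumption)
    return sum(fmap[y][x] for y in range(top, bottom) for x in range(left, right))


def weights_for_next_frame(feature_maps, subject_rect):
    """Assumed rule: weight each map by its region sum relative to the largest sum."""
    sums = [max(region_sum(m, subject_rect), 0.0) for m in feature_maps]
    largest = max(sums)
    if largest == 0:
        return [1.0] * len(sums)  # no map responds in the region: fall back to equal weights
    return [s / largest for s in sums]


# Hypothetical maps: luminance responds more strongly than color inside the subject region.
luminance = [[0, 0, 0], [0, 4, 4], [0, 4, 4]]
color = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]
weights = weights_for_next_frame([luminance, color], (1, 1, 3, 3))
```

Under this assumed rule, the map that responds most strongly inside the current subject region contributes most to the subject map of the next frame.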

With this configuration, the subject tracking unit 55 can obtain a subject frame representing the subject region for each frame of the input image.

[Configuration example of subject map generation unit]
Next, a configuration example of the subject map generation unit 71 in FIG. 2 is described with reference to FIG. 3.

The subject map generation unit 71 in FIG. 3 includes a feature amount map generation unit 111, a band feature amount map generation unit 112, a band feature amount map synthesis unit 113, and a synthesized feature amount map synthesis unit 114.

The feature amount map generation unit 111 generates, for each feature amount, a feature amount map indicating information (a feature amount) about a feature such as luminance or color from a predetermined frame of the input image, and supplies the maps to the band feature amount map generation unit 112.

The band feature amount map generation unit 112 extracts the feature amount of a predetermined band component, a predetermined number of times, from each feature amount map supplied by the feature amount map generation unit 111, generates band feature amount maps indicating the respective extracted feature amounts, and supplies them to the weight coefficient calculation unit 74 and the band feature amount map synthesis unit 113.

The band feature amount map synthesis unit 113 generates synthesized feature amount maps by synthesizing the band feature amount maps from the band feature amount map generation unit 112, for each feature amount, based on the weight coefficients from the weight coefficient calculation unit 74, and supplies them to the weight coefficient calculation unit 74 and the synthesized feature amount map synthesis unit 114.

The synthesized feature amount map synthesis unit 114 generates the subject map by synthesizing the synthesized feature amount maps from the band feature amount map synthesis unit 113 based on the weight coefficients from the weight coefficient calculation unit 74, and supplies the subject map to the subject candidate region rectangularization unit 72 (FIG. 2).
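The two synthesis stages can be sketched as follows. This is a hedged illustration, assuming that "synthesis" means a weighted per-position sum; the dictionary layout keyed by feature name, the function names, and the sample values are hypothetical and not taken from the patent.

```python
def weighted_sum(maps, weights):
    """Per-position weighted sum of equally sized 2-D maps (lists of lists)."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(w * m[y][x] for m, w in zip(maps, weights)) for x in range(cols)]
            for y in range(rows)]


def build_subject_map(band_maps, band_weights, feature_weights):
    """Stage 1: merge the band maps of each feature. Stage 2: merge the per-feature results."""
    synthesized = {feature: weighted_sum(maps, band_weights[feature])
                   for feature, maps in band_maps.items()}
    features = list(synthesized)
    return weighted_sum([synthesized[f] for f in features],
                        [feature_weights[f] for f in features])


# Hypothetical 1x2 band feature amount maps: two bands for luminance, one for color.
band_maps = {"luminance": [[[1, 0]], [[0, 2]]], "color": [[[2, 2]]]}
band_weights = {"luminance": [1.0, 0.5], "color": [1.0]}
feature_weights = {"luminance": 1.0, "color": 0.5}
subject_map = build_subject_map(band_maps, band_weights, feature_weights)
```

Keeping the band weights and the feature weights separate mirrors the description, in which the weight coefficient calculation unit 74 supplies coefficients to both synthesis units.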

Hereinafter, the band feature amount maps and synthesized feature amount maps described above are also referred to simply as feature amount maps.

［被写体候補領域矩形化部の構成例］
次に、図４を参照して、図２の被写体候補領域矩形化部７２の構成例について説明する。 [Configuration Example of Subject Candidate Area Rectification Unit]
Next, a configuration example of the subject candidate area rectangularization unit 72 in FIG. 2 will be described with reference to FIG.

図４の被写体候補領域矩形化部７２は、２値化処理部１３１、ラベリング処理部１３２、矩形領域座標算出部１３３、および領域情報算出部１３４から構成される。 4 includes a binarization processing unit 131, a labeling processing unit 132, a rectangular region coordinate calculation unit 133, and a region information calculation unit 134.

２値化処理部１３１は、被写体マップ生成部７１から供給された被写体マップにおける、入力画像の各画素に対応する情報を、所定の閾値に基づいて０または１のいずれかの値に２値化して、ラベリング処理部１３２に供給する。ここで、以下においては、被写体マップにおいて、入力画像の各画素に対応する情報を、単に、画素ともいう。 The binarization processing unit 131 binarizes information corresponding to each pixel of the input image in the subject map supplied from the subject map generation unit 71 to either 0 or 1 based on a predetermined threshold. To the labeling processing unit 132. Hereinafter, in the subject map, information corresponding to each pixel of the input image is also simply referred to as a pixel.

ラベリング処理部１３２は、２値化処理部１３１からの、２値化された被写体マップにおいて、１の値である画素が隣接する領域（以下、連結領域という）に対してラベリングし、矩形領域座標算出部１３３に供給する。 In the binarized subject map from the binarization processing unit 131, the labeling processing unit 132 performs labeling on a region adjacent to a pixel having a value of 1 (hereinafter referred to as a connected region), and rectangular region coordinates. It supplies to the calculation part 133.

矩形領域座標算出部１３３は、ラベリング処理部１３２からの、連結領域がラベリングされた被写体マップにおいて、連結領域を含む（囲む）矩形領域の座標を算出し、その座標を表す座標情報を、被写体マップとともに領域情報算出部１３４に供給する。 The rectangular area coordinate calculation unit 133 calculates the coordinates of the rectangular area including (surrounding) the connected area in the object map labeled with the connected area from the labeling processing unit 132, and uses the coordinate information representing the coordinates as the object map. At the same time, it is supplied to the area information calculation unit 134.

The area information calculation unit 134 calculates area information, which is information about the rectangular area represented by the coordinate information on the subject map from the rectangular area coordinate calculation unit 133, and supplies it, in association with the coordinate information, to the subject area selection unit 73 (FIG. 1).

[Configuration example of the subject area selection unit]
Next, a configuration example of the subject area selection unit 73 will be described with reference to FIG. 5.

The subject area selection unit 73 in FIG. 5 includes an area information comparison unit 151 and a subject area determination unit 152.

The area information comparison unit 151 compares the area information of each rectangular area from the subject candidate area rectangularization unit 72 with the area information of the subject area of the previous frame stored in the area information storage unit 153, and supplies the comparison result to the subject area determination unit 152.

Based on the comparison result from the area information comparison unit 151, the subject area determination unit 152 sets, as the subject area, the rectangular area represented by the coordinate information associated with the area information closest to the area information of the subject area of the previous frame. The subject area determination unit 152 supplies the coordinate information of the determined subject area to the control unit 34 (FIG. 1) and the weight coefficient calculation unit 74 (FIG. 2), and supplies the area information of the subject area to the area information storage unit 153.

The area information storage unit 153 stores the area information of the subject area from the subject area determination unit 152. The stored area information is read out by the area information comparison unit 151 one frame later.

[Subject tracking process]
The subject tracking process of the image processing apparatus 11 is described below.

FIG. 6 is a flowchart describing the subject tracking process of the image processing apparatus 11. The subject tracking process starts when, for example, the user operates the user interface 37 serving as a button so that the operation mode of the image processing apparatus 11 transitions to a subject tracking mode in which the subject tracking process is executed, and the user then selects a predetermined area of the subject to be tracked in the input image displayed on the user interface 37 serving as an LCD.

In step S11, the subject map generation unit 71 of the subject tracking unit 55 performs the subject map generation process, generates a subject map, and supplies it to the subject candidate area rectangularization unit 72.

[Subject map generation process]
The subject map generation process is described in detail here with reference to FIGS. 7 and 8. FIG. 7 is a flowchart describing the subject map generation process, and FIG. 8 is a diagram showing a specific example of the process.

In step S31 of the flowchart of FIG. 7, the feature amount map generation unit 111 of the subject map generation unit 71 generates, from a predetermined frame of the input image, a feature amount map for each feature (feature amount) such as luminance and color, and supplies the maps to the band feature amount map generation unit 112.

Specifically, as shown in FIG. 8, M types of feature amount maps are generated from the input image 200: a luminance information map F_1 indicating information about luminance, color information maps F_2 to F_K indicating information about color, and edge information maps F_(K+1) to F_M indicating information about edges.

In the luminance information map F_1, the luminance component (luminance signal) Y obtained from each pixel of the input image is the information corresponding to that pixel. In the color information maps F_2 to F_K, the color components (color signals) R, G, and B obtained from each pixel of the input image are the information corresponding to that pixel. In the edge information maps F_(K+1) to F_M, the information corresponding to each pixel is, for example, the edge strength at that pixel in the directions of 0, 45, 90, and 135 degrees.

In the feature amount maps described above, the average of the R, G, and B component values of a pixel may be used as the information (feature amount) of the luminance information map F_1, and the color difference components Cr and Cb, or the a* and b* coordinate components in the Lab color space, may be used as the information of the color information maps F_2 to F_K. Edge strengths in directions other than 0, 45, 90, and 135 degrees may also be used as the information of the edge information maps F_(K+1) to F_M.

In step S32, the band feature amount map generation unit 112 extracts feature amounts of predetermined band components N times from the feature amounts in each feature amount map, generates band feature amount maps indicating the extracted feature amounts, and supplies them to the weight coefficient calculation unit 74 and the band feature amount map synthesis unit 113.

Specifically, as shown in FIG. 8, luminance information of predetermined bands 1 to N is extracted from the luminance information in the luminance information map F_1, and band luminance information maps R_11 to R_1N indicating the luminance information of each band are generated. Likewise, color information of bands 1 to N is extracted from the color information in the color information maps F_2 to F_K, and band color information maps R_21 to R_2N, ..., R_K1 to R_KN are generated. Edge information of bands 1 to N is extracted from the edge information in the edge information maps F_(K+1) to F_M, and band edge information maps R_(K+1)1 to R_(K+1)N, ..., R_M1 to R_MN are generated. In this way, the band feature amount map generation unit 112 generates (M × N) types of band feature amount maps.

An example of the processing performed by the band feature amount map generation unit 112 is described here.

For example, the band feature amount map generation unit 112 uses each feature amount map to generate a plurality of feature amount maps with mutually different resolutions, and treats them as pyramid images of that feature amount. For example, pyramid images are generated in eight resolution layers, level L1 to level L8, where the level L1 pyramid image has the highest resolution and the resolution decreases in order from level L1 to level L8.

In this case, the feature amount map generated by the feature amount map generation unit 111 is the level L1 pyramid image. The average of the pixel values of four mutually adjacent pixels in the pyramid image of level Li (where 1 ≤ i ≤ 7) becomes the pixel value of the one corresponding pixel in the pyramid image of level L(i+1). The level L(i+1) pyramid image therefore has half the width and half the height of the level Li pyramid image (rounded down when not evenly divisible).
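The pyramid construction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that the "four mutually adjacent pixels" form a non-overlapping 2 × 2 block, represents a map as a list of rows, and uses the hypothetical function names `downsample` and `build_pyramid`.

```python
def downsample(img):
    """Average each non-overlapping 2x2 block (assumption); odd trailing
    rows and columns are dropped, matching the round-down rule."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(feature_map, levels=8):
    """Return [L1, L2, ...]; L1 is the feature amount map itself.
    Stops early if the image is too small to be halved again."""
    pyramid = [feature_map]
    while len(pyramid) < levels:
        prev = pyramid[-1]
        if len(prev) < 2 or len(prev[0]) < 2:
            break
        pyramid.append(downsample(prev))
    return pyramid
```

A full eight-level pyramid under this sketch needs an input of at least 128 × 128 pixels, since each level halves both dimensions.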

The band feature amount map generation unit 112 also selects two pyramid images of different layers from among the plurality of pyramid images and takes the difference between them, generating N difference images for each feature amount. Because the pyramid images of the different layers differ in size (number of pixels), the smaller pyramid image is up-converted to match the larger one when a difference image is generated.

For example, among the pyramid images of each feature amount, the band feature amount map generation unit 112 takes the differences for the following layer combinations: level L6 and level L3, level L7 and level L3, level L7 and level L4, level L8 and level L4, and level L8 and level L5. This yields a total of five difference images per feature amount.

Specifically, when the difference image for the combination of level L6 and level L3 is generated, for example, the level L6 pyramid image is up-converted to the size of the level L3 pyramid image. That is, the pixel value of one pixel of the level L6 pyramid image before up-conversion becomes the pixel value of the several mutually adjacent pixels that correspond to it in the up-converted level L6 pyramid image. The difference between the pixel value of each pixel of the up-converted level L6 pyramid image and the pixel value of the pixel at the same position in the level L3 pyramid image is then computed and used as the pixel value of the corresponding pixel of the difference image.
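The up-conversion and differencing can be sketched as below. The sketch assumes pixel replication (nearest-neighbour) for the up-conversion, which matches the description of one source pixel supplying several adjacent pixels, and it assumes the sign convention "coarse minus fine", which the text does not specify. The names `LEVEL_PAIRS`, `upconvert`, `difference_image`, and `difference_images` are hypothetical.

```python
# Layer combinations named in the text, as (coarse level, fine level).
LEVEL_PAIRS = [(6, 3), (7, 3), (7, 4), (8, 4), (8, 5)]


def upconvert(small, height, width):
    """Replicate pixels of `small` so that the result is height x width."""
    sh, sw = len(small), len(small[0])
    return [
        [small[y * sh // height][x * sw // width] for x in range(width)]
        for y in range(height)
    ]


def difference_image(coarse, fine):
    """Up-convert `coarse` to the size of `fine`, then subtract per pixel
    (sign convention is an assumption)."""
    h, w = len(fine), len(fine[0])
    up = upconvert(coarse, h, w)
    return [[up[y][x] - fine[y][x] for x in range(w)] for y in range(h)]


def difference_images(pyramid):
    """`pyramid[0]` is level L1; returns the five difference images."""
    return [difference_image(pyramid[c - 1], pyramid[f - 1])
            for c, f in LEVEL_PAIRS]
```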

Generating difference images in this way extracts the feature amounts of predetermined band components from a feature amount map, as if the feature amount map had been filtered with a bandpass filter.

In the above description, the width of the band extracted from the feature amount map is determined by the combination of pyramid image layers used to compute the difference image; this combination may be chosen arbitrarily.

The extraction of the feature amounts of predetermined band components is not limited to the difference image technique described above; other techniques may be used.

Returning to the flowchart of FIG. 7, in step S33 the band feature amount map synthesis unit 113 combines, for each feature amount, the band feature amount maps from the band feature amount map generation unit 112 based on the weight coefficient group W_R from the weight coefficient calculation unit 74. The band feature amount map synthesis unit 113 supplies the combined band feature amount maps (combined feature amount maps) to the weight coefficient calculation unit 74 and the combined feature amount map synthesis unit 114.

Specifically, as shown in FIG. 8, the band luminance information maps R_11 to R_1N are added with weighting by the weight coefficients w_11 to w_1N, which are the weights for the individual band luminance information maps from the weight coefficient calculation unit 74, to obtain the combined feature amount map C_1. The band color information maps R_21 to R_2N, ..., R_K1 to R_KN are added with weighting by the weight coefficients w_21 to w_2N, ..., w_K1 to w_KN, which are the weights for the individual band color information maps, to obtain the combined feature amount maps C_2 to C_K. The band edge information maps R_(K+1)1 to R_(K+1)N, ..., R_M1 to R_MN are added with weighting by the weight coefficients w_(K+1)1 to w_(K+1)N, ..., w_M1 to w_MN, which are the weights for the individual band edge information maps, to obtain the combined feature amount maps C_(K+1) to C_M. In this way, the band feature amount map synthesis unit 113 generates M types of combined feature amount maps. The weight coefficient group W_R is described in detail later; each of its weight coefficients has a value from 0 to 1. In the first subject map generation process, however, all weight coefficients of the weight coefficient group W_R are set to 1, and the band feature amount maps are added without weighting.

In step S34, the combined feature amount map synthesis unit 114 generates a subject map by combining the combined feature amount maps from the band feature amount map synthesis unit 113 based on the weight coefficient group W_C from the weight coefficient calculation unit 74, and supplies the subject map to the subject candidate area rectangularization unit 72.

Specifically, as shown in FIG. 8, the combined feature amount maps C_1 to C_M are linearly combined using the weight coefficients w_1 to w_M, which are the weights for the individual combined feature amount maps from the weight coefficient calculation unit 74. The pixel values of the map obtained by the linear combination are then multiplied by a subject weight, which is a weight obtained in advance, and normalized to obtain the subject map 201. The weight coefficient group W_C is described in detail later; each of its weight coefficients has a value from 0 to 1. In the first subject map generation process, however, all weight coefficients of the weight coefficient group W_C are set to 1, and the combined feature amount maps are linearly combined without weighting.

In other words, taking a position (pixel) of interest on the subject map to be obtained, the pixel value at the same position (pixel) in each combined feature amount map is multiplied by the weight coefficient of that combined feature amount map, and the sum of the weighted pixel values becomes the pixel value at the position of interest. The pixel value at each position of the subject map obtained in this way is then multiplied by the subject weight obtained in advance for the subject map and normalized to yield the final subject map. For example, the normalization is performed so that the pixel value of each pixel of the subject map falls between 0 and 255.
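The per-pixel weighted sum and the normalization can be sketched as follows. The same `weighted_sum` helper would also cover the weighted addition of band feature amount maps in step S33. Two points are assumptions made only for illustration: the subject weight is treated as a single scalar, and the normalization is done by min-max scaling, since the text only requires that the result lie between 0 and 255. All function names are hypothetical.

```python
def weighted_sum(maps, weights):
    """Per-pixel sum of maps[i] * weights[i]; all maps share one size."""
    h, w = len(maps[0]), len(maps[0][0])
    return [
        [sum(wt * m[y][x] for m, wt in zip(maps, weights)) for x in range(w)]
        for y in range(h)
    ]


def normalize_0_255(img):
    """Min-max scaling to integers in [0, 255] (assumed normalization)."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat map: nothing to scale
        return [[0 for _ in row] for row in img]
    return [[round(255 * (v - lo) / (hi - lo)) for v in row] for row in img]


def subject_map(combined_maps, w_c, subject_weight=1.0):
    """Linear combination of C_1..C_M, subject weight, then normalization."""
    linear = weighted_sum(combined_maps, w_c)
    scaled = [[subject_weight * v for v in row] for row in linear]
    return normalize_0_255(scaled)
```

Passing a list of ones as `w_c` reproduces the unweighted combination used in the first subject map generation process.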

As described above, the subject map generation unit 71 generates the subject map by generating band feature amount maps and combined feature amount maps from the feature amount maps.

Returning to the flowchart of FIG. 6, in step S12 the subject candidate area rectangularization unit 72 performs the subject candidate area rectangularization process and obtains, in the subject map from the subject map generation unit 71, rectangular areas that include areas that are candidates for the subject.

[Subject candidate area rectangularization process]
The subject candidate area rectangularization process is described in detail here with reference to FIGS. 9 and 10. FIG. 9 is a flowchart describing the subject candidate area rectangularization process, and FIG. 10 is a diagram showing a specific example of the process.

In step S51 of the flowchart of FIG. 9, the binarization processing unit 131 of the subject candidate area rectangularization unit 72 binarizes the information in the subject map supplied from the subject map generation unit 71 to a value of either 0 or 1 based on a predetermined threshold, and supplies the result to the labeling processing unit 132.

More specifically, for the pixel value of each pixel of the subject map 201, shown first from the top in FIG. 10 and taking values between 0 and 255, the binarization processing unit 131 sets, for example, pixel values smaller than the threshold 127 to 0 and pixel values larger than 127 to 1. This yields the binarization map 202 shown second from the top in FIG. 10. In the binarization map 202 shown in FIG. 10, the parts (pixels) shown in white have a pixel value of 1, and the parts (pixels) shown in black have a pixel value of 0. Although the threshold is 127 here, another value may be used.
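A minimal sketch of this thresholding step follows. The text does not say how a pixel value exactly equal to the threshold is handled; the sketch assumes it maps to 0. The function name `binarize` is hypothetical.

```python
def binarize(subject_map, threshold=127):
    """Map each pixel to 1 if it exceeds the threshold, else 0.
    Values equal to the threshold become 0 (assumption)."""
    return [[1 if v > threshold else 0 for v in row] for row in subject_map]
```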

In step S52, in the binarization map 202 (the binarized subject map) from the binarization processing unit 131, the labeling processing unit 132 labels the connected areas, obtained by, for example, a morphological operation, in which pixels with a pixel value of 1 adjoin one another, and supplies the result to the rectangular area coordinate calculation unit 133.

More specifically, as shown third from the top in FIG. 10, for example, the connected area 211 in the binarization map 202 is labeled with the label "1", and the connected area 212 is labeled with the label "2".
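One common way to label connected areas is a breadth-first flood fill, sketched below. This is a stand-in technique chosen for illustration; the text only mentions a morphological operation as one example and does not fix the algorithm. The sketch also assumes 4-connectivity (up, down, left, right), which the text does not specify. The function name `label_connected_areas` is hypothetical.

```python
from collections import deque


def label_connected_areas(binary):
    """Return (labels, count): labels[y][x] is 0 for background or the
    1-based label of the connected area containing the pixel."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] != 1 or labels[sy][sx]:
                continue
            count += 1  # start a new connected area
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 1 and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```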

In step S53, the rectangular area coordinate calculation unit 133 calculates, in the binarization map 202 from the labeling processing unit 132, the coordinates of the rectangular area that includes (surrounds) each connected area, and supplies coordinate information representing those coordinates, together with the binarization map 202, to the area information calculation unit 134.

More specifically, as shown fourth from the top in FIG. 10, a rectangular frame (circumscribed frame) 221 that surrounds the connected area 211 labeled "1" from the outside is detected in the binarization map 202, and the coordinates of, for example, the upper-left and lower-right vertices of that frame in the figure are obtained. Likewise, a rectangular frame 222 that surrounds the connected area 212 labeled "2" from the outside is detected, and the coordinates of, for example, its upper-left and lower-right vertices in the figure are obtained.
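Given a label map in which 0 is background and positive integers identify connected areas, the circumscribed frames can be found in one pass, as in the sketch below. The `(left, top, right, bottom)` tuple layout with inclusive pixel coordinates and the function name `bounding_boxes` are assumptions for illustration.

```python
def bounding_boxes(labels):
    """Return {label: (left, top, right, bottom)} with inclusive
    pixel coordinates of the upper-left and lower-right corners."""
    boxes = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab == 0:
                continue
            if lab not in boxes:
                boxes[lab] = [x, y, x, y]
            else:
                b = boxes[lab]
                b[0] = min(b[0], x)
                b[1] = min(b[1], y)
                b[2] = max(b[2], x)
                b[3] = max(b[3], y)
    return {lab: tuple(b) for lab, b in boxes.items()}
```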

In step S54, the area information calculation unit 134 calculates area information for the rectangular area surrounded by each rectangular frame on the subject map, based on the coordinate information from the rectangular area coordinate calculation unit 133 and the subject map from the subject map generation unit 71.

More specifically, based on the coordinate information from the rectangular area coordinate calculation unit 133 representing the rectangular frames 221 and 222 in the binarization map 202, the area information calculation unit 134 calculates the size of each rectangular frame and the coordinates of its center position as the area information for the rectangular area. The area information calculation unit 134 supplies the calculated area information to the subject area selection unit 73 in association with the coordinate information from the rectangular area coordinate calculation unit 133.

As described above, the subject candidate area rectangularization unit 72 obtains, in the subject map, the rectangular frame surrounding each area that is a candidate for the subject of interest, together with area information representing the characteristics of the area surrounded by that rectangular frame on the subject map.

Returning to the flowchart of FIG. 6, in step S13 the subject area selection unit 73 performs the subject area selection process and selects the subject area, which is the rectangular area containing the subject of interest, from among the rectangular areas based on the area information from the subject candidate area rectangularization unit 72.

[Subject area selection process]
The subject area selection process is described in detail here with reference to the flowchart of FIG. 11.

In step S71, the area information comparison unit 151 compares the area information of each rectangular area from the subject candidate area rectangularization unit 72 with the area information of the subject area of the previous frame stored in the area information storage unit 153, and supplies the comparison result to the subject area determination unit 152.

More specifically, the area information comparison unit 151 compares, for example, the size of the rectangular frame surrounding each rectangular area on the subject map, supplied from the subject candidate area rectangularization unit 72, with the size of the rectangular frame (subject frame) surrounding the subject area of the previous frame stored in the area information storage unit 153. As another example, the area information comparison unit 151 compares the coordinates of the center position of the rectangular frame surrounding each rectangular area on the subject map with the coordinates of the center position of the rectangular frame (subject frame) surrounding the subject area of the previous frame stored in the area information storage unit 153.

In step S72, based on the comparison result from the area information comparison unit 151, the subject area determination unit 152 sets, as the subject area, the rectangular area whose rectangular frame has the size or center position closest to the size or center position coordinates of the rectangular frame (subject frame) surrounding the subject area of the previous frame. The subject area determination unit 152 supplies the coordinate information of the determined subject area to the control unit 34 and the weight coefficient calculation unit 74, and supplies the area information of the subject area (the size or center position of the subject frame) to the area information storage unit 153.
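The calculation of area information (frame size and center) and the selection of the closest candidate can be sketched together as follows. The text does not define how "closest" is measured, so the metrics here are assumptions: Euclidean distance between centers, or absolute difference in frame area when comparing sizes. The box layout `(left, top, right, bottom)` with inclusive coordinates and the names `area_info` and `select_subject_area` are also hypothetical.

```python
import math


def area_info(box):
    """Size (width, height) and center of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return {
        "size": (right - left + 1, bottom - top + 1),
        "center": ((left + right) / 2.0, (top + bottom) / 2.0),
    }


def select_subject_area(boxes, previous_info, use="center"):
    """Pick the candidate box closest to the previous frame's subject frame.
    use="center": Euclidean distance between centers (assumed metric).
    use="size":   absolute difference in frame area (assumed metric)."""
    def distance(box):
        info = area_info(box)
        if use == "center":
            (cx, cy), (px, py) = info["center"], previous_info["center"]
            return math.hypot(cx - px, cy - py)
        (w, h), (pw, ph) = info["size"], previous_info["size"]
        return abs(w * h - pw * ph)
    return min(boxes, key=distance)
```

The area information of the returned box would then be stored and passed as `previous_info` when the next frame is processed.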

In the first subject area selection process, however, the area information storage unit 153 does not yet hold area information of a subject area from a previous frame, so the rectangular area that includes the predetermined area of the subject selected by the user at the start of the subject tracking process (hereinafter referred to as the initial selection area) is set as the subject area.

As described above, the subject area selection unit 73 selects the subject area of the subject of interest from among the rectangular areas that are subject candidates.

[Calculation of weight coefficients]
Returning to the flowchart of FIG. 6, in step S14 the weight coefficient calculation unit 74 calculates the weight coefficient groups W_R and W_C shown in FIG. 8, based on the band feature amount maps and combined feature amount maps from the subject map generation unit 71 and the coordinate information representing the subject area from the subject area selection unit 73.

More specifically, as shown in FIG. 12, let the subject area feature amount sum r_mn be the sum of the feature amounts (information amounts) within the rectangular area corresponding to the subject frame 231, which represents the subject area, on a given band feature amount map R_mn (1 ≤ m ≤ M, 1 ≤ n ≤ N). The weight coefficient group W_R shown in the upper part of FIG. 13 is then calculated.

Each coefficient in the weight coefficient group W_R of FIG. 13 corresponds to one of the weight coefficients w_11 to w_MN shown in FIG. 8. In FIG. 13, Max[a, ..., z] denotes the maximum of the values a to z.

For example, the coefficients in the first row from the top of the weight coefficient group W_R in FIG. 13 are the weight coefficients w_11 to w_M1 for the band feature amount maps R_11 to R_M1 of the individual feature amounts for "band 1" shown in FIG. 8. As shown in FIG. 13, each of the weight coefficients w_11 to w_M1 is a coefficient whose denominator is the maximum of the subject area feature amount sums r_11 to r_M1 of the band feature amount maps R_11 to R_M1 and whose numerator is the subject area feature amount sum (r_11 to r_M1) of the corresponding band feature amount map; it takes a value from 0 to 1.

Similarly, the coefficients in the Nth row from the top of the weight coefficient group W_R in FIG. 13 are the weight coefficients w_1N to w_MN for the band feature amount maps R_1N to R_MN of the individual feature amounts for "band N" shown in FIG. 8. As shown in FIG. 13, each of the weight coefficients w_1N to w_MN is a coefficient whose denominator is the maximum of the subject area feature amount sums r_1N to r_MN of the band feature amount maps R_1N to R_MN and whose numerator is the subject area feature amount sum (r_1N to r_MN) of the corresponding band feature amount map; it takes a value from 0 to 1.

That is, with the weighting factors w_1n to w_Mn, among the band feature amount maps R_1n to R_Mn of the respective feature amounts for "band n", the band feature amount map of the feature amount whose subject region feature amount sum is largest receives the maximum weight of 1, and the other band feature amount maps are weighted according to their subject region feature amount sums.
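
The per-band normalization described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the weighting factor calculation unit 74: the function names, the rectangle layout `(left, top, right, bottom)`, the use of nested lists for maps, and the handling of the case where every sum is zero are all assumptions that do not come from the specification.

```python
from typing import List, Sequence, Tuple

# Hypothetical rectangle layout: (left, top, right, bottom), right/bottom exclusive.
Rect = Tuple[int, int, int, int]


def region_sum(feature_map: Sequence[Sequence[float]], rect: Rect) -> float:
    """Sum the feature amounts inside the rectangle on one feature amount map."""
    left, top, right, bottom = rect
    return sum(sum(row[left:right]) for row in feature_map[top:bottom])


def normalize_by_max(sums: Sequence[float]) -> List[float]:
    """Divide each sum by the largest one, so the largest sum gets weight 1."""
    peak = max(sums)
    if peak <= 0:
        # Assumption: the text does not say what to do when every sum is zero.
        return [0.0 for _ in sums]
    return [s / peak for s in sums]


def band_weights(r: Sequence[Sequence[float]]) -> List[List[float]]:
    """r[n][m] holds the subject region feature amount sum for band n+1 and
    feature amount m+1; each band (row) is normalized independently."""
    return [normalize_by_max(row) for row in r]
```

With non-negative feature amounts, every returned weight lies between 0 and 1, and each band has at least one map with weight 1 unless all of its sums are zero.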

Further, when the sum of the feature amounts (information amounts) in the rectangular region corresponding to the rectangular frame 221 representing the subject region on a given composite feature amount map C_m (1 ≤ m ≤ M) is denoted as the subject region feature amount sum c_m, the weighting factor group W_C shown in the lower part of FIG. 13 is calculated.

Each coefficient in the weighting factor group W_C of FIG. 13 corresponds to one of the weighting factors w_1 to w_M shown in FIG. 8.

That is, the coefficients in the weighting factor group W_C of FIG. 13 are the weighting factors w_1 to w_M for the composite feature amount maps C_1 to C_M of the respective feature amounts shown in FIG. 8. As shown in FIG. 13, each of the weighting factors w_1 to w_M has as its denominator the maximum of the subject region feature amount sums c_1 to c_M of the composite feature amount maps C_1 to C_M, and as its numerator the subject region feature amount sum (one of c_1 to c_M) of the corresponding composite feature amount map. Each weighting factor therefore takes a value from 0 to 1.

That is, with the weighting factors w_1 to w_M, among the composite feature amount maps C_1 to C_M of the respective feature amounts, the composite feature amount map of the feature amount whose subject region feature amount sum is largest receives the maximum weight of 1, and the other composite feature amount maps are weighted according to their subject region feature amount sums.
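
Written as a formula, the normalization for the composite feature amount maps takes the form below. This restates the description in the text and does not reproduce FIG. 13. The numerical values in the second line are made up for illustration only.

```latex
\[
  w_m = \frac{c_m}{\max\left[c_1, \ldots, c_M\right]},
  \qquad 1 \le m \le M, \qquad 0 \le w_m \le 1
\]
\[
  (c_1, c_2, c_3) = (2,\ 8,\ 4)
  \;\Longrightarrow\;
  (w_1, w_2, w_3) = (0.25,\ 1,\ 0.5)
\]
```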

The weighting factor calculation unit 74 supplies the calculated weighting factor group W_R to the band feature amount map synthesis unit 113 of the subject map generation unit 71, and supplies the weighting factor group W_C to the composite feature amount map synthesis unit 114 of the subject map generation unit 71. In the flowchart of FIG. 6, after step S14 the process returns to step S11, the subject tracking process is executed for the next frame, and this process is repeated frame by frame.

According to the above processing, the weighting factor for the feature amount map of each feature amount in the next frame is determined according to the relative magnitude of the feature amounts in the region corresponding to the subject region selected in a given frame of the input image, on the feature amount map of each feature amount for that frame. Therefore, even when the feature amounts vary between frames, a subject map is generated in which the feature amount map of the feature amount that best represents the subject among the plurality of feature amounts is weighted most heavily, so the subject can be tracked more stably even in an environment where the state of the subject fluctuates.

In addition, since the subject region is determined so as to include the entire subject, the subject can be tracked more stably even in an environment where the state of part of the subject fluctuates.

In particular, conventional subject tracking methods that identify only some coordinate within the subject region (or a partial region including that coordinate) cannot track the entire subject, and so the detection frames for AF (Auto Focus), AE (Auto Exposure), and ACC (Auto Color Control) could not be set correctly. Methods that identify an identical-feature region, that is, a region within the subject region in which the feature amount is the same, can set the detection frame more accurately than in the above case, but the identical-feature region is often only a small part of the subject region, so sufficient detection accuracy was not obtained.

In contrast, the subject tracking process of the present invention can identify a subject region that includes the entire subject, so detection accuracy can be improved and, consequently, the tracking result can be applied to various applications.

Some conventional subject tracking methods detect and track a person by, for example, registering whole-body images of people in a dictionary through learning, but such methods cannot track subjects other than people, which are not registered in the dictionary. Furthermore, the amount of information (images) registered in the dictionary is enormous, which increases the scale of the apparatus.

In contrast, the subject tracking process of the present invention can detect and track an arbitrary subject and does not require an enormous amount of information to be registered in a dictionary or the like, so the apparatus can be made compact.

In the above description, the luminance component, the color components, and the edge direction are used as the feature amounts. However, the feature amounts are not limited to these, and motion information or the like may be added, for example. The feature amounts used are preferably in a complementary relationship, such as the luminance component and a color component, and may be selected as appropriate.

Also, in the above description, M × (N + 1) weighting factors are calculated for the M × (N + 1) feature amount maps. However, the amount of calculation in the image processing apparatus 11 can be reduced by calculating, as appropriate, only the weighting factors corresponding to some of the feature amount maps. For example, only the weighting factors w_1 to w_M corresponding to the M composite feature amount maps C_1 to C_M may be calculated.

Furthermore, in the above description, the region information calculation unit 134 calculates the size of the rectangular frame and the coordinates of its center position as the region information of a rectangular region. However, the integral value or the peak value (maximum value) of the pixel values in the rectangular region may be calculated instead. In this case, in the subject region selection process (FIG. 11), the rectangular region whose integral value or peak value of the pixel values is closest to the integral value or peak value of the pixel values in the subject region of the previous frame is selected as the subject region.
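
The selection rule based on the integral value or peak value can be sketched as follows. This is an illustrative sketch under stated assumptions, not the processing of FIG. 11 itself: the function names, the `use_peak` flag, the rectangle layout `(left, top, right, bottom)`, and the tie-breaking behavior (the first candidate wins) are hypothetical, and every candidate rectangle is assumed to be non-empty.

```python
from typing import Sequence, Tuple

# Hypothetical rectangle layout: (left, top, right, bottom), right/bottom exclusive.
Rect = Tuple[int, int, int, int]


def integral_and_peak(subject_map: Sequence[Sequence[float]], rect: Rect) -> Tuple[float, float]:
    """Return the sum and the maximum of the pixel values inside the rectangle."""
    left, top, right, bottom = rect
    values = [v for row in subject_map[top:bottom] for v in row[left:right]]
    return sum(values), max(values)


def select_subject_region(
    subject_map: Sequence[Sequence[float]],
    candidates: Sequence[Rect],
    previous_value: float,
    use_peak: bool = False,
) -> Rect:
    """Pick the candidate whose integral (or peak) value is closest to the value
    measured in the subject region of the previous frame."""
    def distance(rect: Rect) -> float:
        integral, peak = integral_and_peak(subject_map, rect)
        return abs((peak if use_peak else integral) - previous_value)

    # min() returns the first candidate in case of a tie (an assumption of this sketch).
    return min(candidates, key=distance)
```

The caller would store the integral or peak value of the selected region and pass it as `previous_value` when processing the next frame.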

The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, a program constituting the software is installed from a program recording medium into a computer built into dedicated hardware, or into, for example, a general-purpose personal computer that can execute various functions when various programs are installed.

FIG. 14 is a block diagram illustrating an example hardware configuration of a computer that executes the above-described series of processes by means of a program.

In the computer, a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, and a RAM (Random Access Memory) 903 are connected to one another by a bus 904.

An input/output interface 905 is further connected to the bus 904. Connected to the input/output interface 905 are an input unit 906 including a keyboard, a mouse, a microphone, and the like; an output unit 907 including a display, a speaker, and the like; a storage unit 908 including a hard disk, a nonvolatile memory, and the like; a communication unit 909 including a network interface and the like; and a drive 910 that drives a removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.

In the computer configured as described above, the series of processes described above is performed when the CPU 901, for example, loads a program stored in the storage unit 908 into the RAM 903 via the input/output interface 905 and the bus 904 and executes it.

The program executed by the computer (CPU 901) is provided by being recorded on the removable medium 911, which is a package medium such as a magnetic disk (including a flexible disk), an optical disk (a CD-ROM (Compact Disc-Read Only Memory), a DVD (Digital Versatile Disc), or the like), a magneto-optical disk, or a semiconductor memory, or is provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.

The program can be installed in the storage unit 908 via the input/output interface 905 by mounting the removable medium 911 on the drive 910. The program can also be received by the communication unit 909 via a wired or wireless transmission medium and installed in the storage unit 908. Alternatively, the program can be installed in advance in the ROM 902 or the storage unit 908.

The program executed by the computer may be a program whose processes are performed in time series in the order described in this specification, or a program whose processes are performed in parallel or at necessary timing, such as when a call is made.

The embodiments of the present invention are not limited to the embodiments described above, and various modifications can be made without departing from the gist of the present invention.

DESCRIPTION OF SYMBOLS: 11 image processing apparatus, 55 subject tracking unit, 71 subject map generation unit, 72 subject candidate region rectangularization unit, 73 subject region selection unit, 74 weighting factor calculation unit, 111 feature amount map generation unit, 112 band feature amount map generation unit, 113 band feature amount map synthesis unit, 114 composite feature amount map synthesis unit, 131 binarization processing unit, 132 labeling processing unit, 133 rectangular region coordinate calculation unit, 134 region information calculation unit, 151 region information comparison unit, 152 subject region determination unit, 200 input image, 201 subject map, 221, 222 rectangular region, 231 subject frame

Claims

An image processing apparatus comprising:
subject map generation means for generating, from feature amount maps each indicating a feature amount in a predetermined region of a current frame of an input image for each feature of the input image, a subject map indicating the region-likelihood of a subject in the current frame, based on a weighting factor for each feature amount;
subject candidate region rectangularization means for obtaining, in the subject map, rectangular regions each including a region that is a candidate for the subject;
subject region selection means for selecting, from the rectangular regions, a subject region that is the rectangular region including the subject of interest, based on region information about the rectangular regions; and
weighting factor calculation means for calculating the weighting factor that weights the feature amount map of a subsequent frame, temporally later than the current frame, corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the feature amount map of each feature amount of the current frame.

The image processing apparatus according to claim 1, wherein
the subject map generation means includes:
feature amount map generation means for generating the feature amount map for each feature amount from the current frame;
band feature amount map generation means for generating, from the feature amount map, a plurality of band feature amount maps indicating the feature amount in predetermined bands;
band feature amount map synthesis means for synthesizing the band feature amount maps for each feature amount; and
composite feature amount map synthesis means for generating the subject map by synthesizing the composite feature amount maps, each obtained by synthesizing the band feature amount maps for one feature amount,
the band feature amount map synthesis means weights and synthesizes the band feature amount maps based on the weighting factor for each feature amount and each band, and
the weighting factor calculation means calculates the weighting factor that weights the band feature amount map of the subsequent frame corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the band feature amount map of each feature amount for the predetermined band of the current frame.

The image processing apparatus according to claim 1, wherein
the subject map generation means includes:
feature amount map generation means for generating the feature amount map for each feature amount from the current frame;
band feature amount map generation means for generating, from the feature amount map, a plurality of band feature amount maps indicating the feature amount in predetermined bands;
band feature amount map synthesis means for synthesizing the band feature amount maps for each feature amount; and
composite feature amount map synthesis means for generating the subject map by synthesizing the composite feature amount maps, each obtained by synthesizing the band feature amount maps for one feature amount,
the composite feature amount map synthesis means weights and synthesizes the composite feature amount maps based on the weighting factor for each feature amount, and
the weighting factor calculation means calculates the weighting factor that weights the composite feature amount map of the subsequent frame corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the composite feature amount map of each feature amount of the current frame.

The image processing apparatus according to claim 1, wherein
the region information is the center coordinates or the size of the rectangular region on the subject map, and
the subject region selection means selects, from the rectangular regions, as the subject region, the rectangular region whose center coordinates or size is closest to the center coordinates or size of the subject region selected in the previous frame, which is temporally one frame before the current frame.

The image processing apparatus according to claim 1, wherein
the region information is an integral value or a peak value of the feature amount in the rectangular region on the subject map, and
the subject region selection means selects, from the rectangular regions, as the subject region, the rectangular region whose integral value or peak value of the feature amount is closest to the integral value or peak value of the feature amount in the subject region selected in the previous frame, which is temporally one frame before the current frame.

An image processing method comprising:
a subject map generation step of generating, from feature amount maps each indicating a feature amount in a predetermined region of a current frame of an input image for each feature of the input image, a subject map indicating the region-likelihood of a subject in the current frame, based on a weighting factor for each feature amount;
a subject candidate region rectangularization step of obtaining, in the subject map, rectangular regions each including a region that is a candidate for the subject;
a subject region selection step of selecting, from the rectangular regions, a subject region that is the rectangular region including the subject of interest, based on region information about the rectangular regions; and
a weighting factor calculation step of calculating the weighting factor that weights the feature amount map of a subsequent frame, temporally later than the current frame, corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the feature amount map of each feature amount of the current frame.

A program that causes a computer to execute processing including:
a subject map generation step of generating, from feature amount maps each indicating a feature amount in a predetermined region of a current frame of an input image for each feature of the input image, a subject map indicating the region-likelihood of a subject in the current frame, based on a weighting factor for each feature amount;
a subject candidate region rectangularization step of obtaining, in the subject map, rectangular regions each including a region that is a candidate for the subject;
a subject region selection step of selecting, from the rectangular regions, a subject region that is the rectangular region including the subject of interest, based on region information about the rectangular regions; and
a weighting factor calculation step of calculating the weighting factor that weights the feature amount map of a subsequent frame, temporally later than the current frame, corresponding to a relatively large feature amount among the feature amounts in the region corresponding to the subject region on the feature amount map of each feature amount of the current frame.