JP6405638B2 - Subject detection device, imaging device, and program

Info

Publication number: JP6405638B2
Application number: JP2014020767A
Authority: JP
Inventors: 鉾井 逸人
Original assignee: Nikon Corp
Current assignee: Nikon Corp
Priority date: 2014-02-05
Filing date: 2014-02-05
Publication date: 2018-10-17
Anticipated expiration: 2034-02-05
Also published as: JP2015148905A

Description

The present invention relates to a subject detection device, an imaging device, and a program.

Various techniques for focus adjustment have been proposed. For example, Patent Document 1 discloses an imaging apparatus that performs focus adjustment using a subject extraction technique. In the invention of Patent Document 1, the AF distance measurement frame is deformed according to the amount of change in the subject shape, which makes it possible to perform optimal automatic focus adjustment on the subject even when the subject moves toward the camera.

Patent Document 1: JP 2009-069748 A

In subject extraction, however, extracting the specific region that the user intends to photograph (for example, a subject region or a region of interest) may also extract a region that is background.

A subject detection device according to one aspect includes: a binarization processing unit that generates a plurality of different binarized images using an image to be processed; an evaluation unit that, for a plurality of pixel regions extracted in each binarized image, obtains a reference point of the plurality of pixel regions included in the binarized image using the values of the pixels corresponding to the plurality of pixel regions, acquired from the image before binarization, and evaluates the pixel regions using the reference point; and an extraction unit that uses the evaluation to extract the pixel region corresponding to a specific region of the image to be processed.

An imaging device according to one aspect includes an imaging unit that captures an image of a subject, and the subject detection device according to the one aspect.

A program according to one aspect causes a computer to execute processing that: generates a plurality of different binarized images using an image to be processed; for a plurality of pixel regions extracted in each binarized image, obtains a reference point of the plurality of pixel regions included in the binarized image using the values of the pixels corresponding to the plurality of pixel regions, acquired from the image before binarization; evaluates the pixel regions using the reference point; and uses the evaluation to extract the pixel region corresponding to a specific region of the image to be processed.

According to the present invention, a specific region can be extracted accurately according to the subject.

Functional block diagram showing a configuration example of the imaging device.
Functional block diagram showing a configuration example of the subject detection unit.
Diagram showing an example of the distribution of pixel values in the Y component difference image.
Flowchart showing processing in the still image shooting mode.
Flowchart showing the subject detection process.
Flowchart showing the process of narrowing down the difference images.
Diagram showing an example of a through image.
Diagram showing an example of the Y, Cb, and Cr component difference images obtained from the through image shown in FIG. 6.
Diagram showing an example of the binarized images obtained from the difference images shown in FIG. 7.
Flowchart showing the process of narrowing down the masks.
Diagram showing an example of a through image.
Diagram showing an example of the binarized images obtained from the through image shown in FIG. 11.
Flowchart showing the process of extracting the mask corresponding to the subject.
Diagram showing an example of a through image.
Diagram showing an example of the binarized images obtained from the through image shown in FIG. 14.
Diagram showing an example of a through image.
Diagram showing an example of the binarized images obtained from the through image shown in FIG. 16.
Diagram showing an example of a through image.
Diagram showing an example of the binarized images obtained from the through image shown in FIG. 18.
Diagram showing an example of calculating the reference point in a given binarized image.
(a) Diagram showing an example of the masks, among those shown in FIG. 20, that are extracted as top-ranked masks in the embodiment. (b) Diagram showing an example of the top-ranked masks extracted from those shown in FIG. 20 when the point O at the center of the screen is used as the reference point.
Diagram showing an example in which the reference point in FIG. 20 is calculated using weighting coefficients.
Functional block diagram showing a configuration example of the image processing apparatus.

<First Embodiment>
Embodiments of the present invention are described below. FIG. 1 is a functional block diagram showing an example of an imaging device according to the present invention. Besides a digital camera, examples of the imaging device 10 include portable terminal devices with a camera function, such as mobile phones and tablet PCs.

The imaging device 10 includes an imaging optical system 15, an optical system drive unit 16, an imaging unit 17, a focus adjustment unit 18, a control unit 19, a first memory 20, a second memory 21, a media I/F 22, a display unit 23, and an operation unit 24. The optical system drive unit 16, the imaging unit 17, the focus adjustment unit 18, the first memory 20, the second memory 21, the media I/F 22, the display unit 23, and the operation unit 24 are each connected to the control unit 19.

The imaging optical system 15 has a plurality of lenses including, for example, a zoom lens and a focus lens. For simplicity, FIG. 1 shows the imaging optical system 15 as a single lens. The positions of the zoom lens and the focus lens in the imaging optical system 15 are each adjusted along the optical axis by the optical system drive unit 16. The imaging optical system 15 may be housed in a lens barrel that is detachable from the main body of the imaging device 10, as in viewfinder cameras such as single-lens reflex cameras, or in a lens barrel that is integral with the main body of the imaging device 10, as in compact cameras.

The imaging unit 17 is a module that captures (photographs) the image of a subject formed by the light beam incident through the imaging optical system 15. The imaging unit 17 may include, for example, an image sensor that converts the optical image formed on its photoelectric conversion surface by the imaging optical system 15 into a digital signal inside the sensor and outputs that signal. Alternatively, the imaging unit 17 may include an image sensor and an A/D conversion unit, in which the image sensor converts the optical image formed on the photoelectric conversion surface by the imaging optical system 15 into an electrical signal and outputs it to the A/D conversion unit, and the A/D conversion unit digitizes that electrical signal and outputs it as a digital signal. The image sensor is composed of photoelectric conversion elements such as CMOS (Complementary Metal Oxide Semiconductor) elements. RGB color filters are arranged over the pixels of the image sensor, for example in the known Bayer array, and color separation by these filters allows the sensor to output an image signal for each color. The image sensor may also support so-called image cropping, in which the subject image is converted into an electrical signal for only part of the photoelectric conversion surface.

In response to an imaging instruction from the operation unit 24, the imaging unit 17 captures a still image for recording, which is recorded to the nonvolatile storage medium 35. The imaging unit 17 can also capture still images for recording in continuous shooting. In addition, while waiting for shooting, the imaging unit 17 captures observation images (through images) at predetermined time intervals. A through image is thinned out and therefore has a lower resolution (image size) than a still image for recording. The through image data acquired in time series is used for various computations by the control unit 19 and for moving image display (live view display) on the display unit 23. An image acquired by the imaging unit 17 may also be referred to as a captured image.

The focus adjustment unit 18 performs automatic focus adjustment (autofocus, AF) of the focus lens using information from a focus detection area set within the imaging range of the imaging unit 17.

For example, the focus adjustment unit 18 may perform AF by known contrast detection using through images. Alternatively, the focus adjustment unit 18 may perform AF by a known phase difference detection method, based on the amount of image shift between pupil-divided subject images. With phase difference detection, the focus adjustment unit 18 may be a module that performs focus detection independently of the imaging unit 17 using part of the light beam incident from the imaging optical system 15 (for example, the focus detection module mounted in a single-lens reflex camera). Alternatively, light receiving elements for focus detection may be arranged on the light receiving surface of the imaging unit 17 so that the focus adjustment unit 18 performs phase difference detection AF on the imaging surface.

The control unit 19 is a processor that controls the overall operation of the imaging device 10. For example, the control unit 19 controls image capture by the imaging unit 17, automatic exposure (AE), image display on the display unit 23, and image recording to the first memory 20 and via the media I/F 22. The control unit 19 has an image processing unit 31. The image processing unit 31 applies image processing such as color interpolation, gradation conversion, white balance correction, edge enhancement, and noise removal to through image data and captured image data. In addition to these image processing functions, the image processing unit 31 provides the function of a subject detection unit 32, which is an example of the subject detection device. In the first embodiment the image processing unit 31 includes the function of the subject detection unit 32, but the subject detection unit 32 may also be provided as a function separate from the image processing unit 31.

The subject detection unit 32 uses an image captured by the imaging unit 17 to identify the position, shape, and size of a subject contained in the image. For example, the subject detection unit 32 uses a color through image to identify the position, shape, and size of a subject contained in that image.

Each functional block of the control unit 19 shown in FIG. 1 can be implemented in hardware by any processor, memory, or other electronic circuit, and in software by a program loaded into memory. The image processing unit 31 and the subject detection unit 32 described above are program modules processed by the control unit 19. However, at least one of the image processing unit 31 and the subject detection unit 32 may instead be an ASIC (Application Specific Integrated Circuit) or the like.

The first memory 20 is a buffer memory that temporarily stores captured image data before and after image processing. For example, the first memory 20 is a volatile memory such as SDRAM. The second memory 21 stores the programs processed by the control unit 19 and the various data used by those programs. For example, the second memory 21 is a nonvolatile memory such as flash memory.

The media I/F 22 has a connector for connecting the nonvolatile storage medium 35. The media I/F 22 writes data (images for recording) to, and reads data from, the storage medium 35 connected to the connector. The storage medium 35 is, for example, a hard disk or a memory card with built-in semiconductor memory. If the media I/F 22 reads or writes data on the storage medium 35 optically, the storage medium 35 is an optical disc.

Next, the configuration of the subject detection unit 32 is described with reference to FIG. 2. The subject detection unit 32 includes a color space conversion unit 41, a resolution conversion unit 42, a difference image generation unit 43, an image determination unit 44, a binarization processing unit 45, a mask narrowing unit 46, an evaluation value calculation unit 47, and a mask extraction unit 48. Each functional block of the subject detection unit 32 may be a program module processed by the control unit 19, or may be an ASIC or the like.

The color space conversion unit 41 performs color space conversion on the input through image data. The image data input to the control unit 19 is, for example, data represented in the RGB color space. The color space conversion unit 41 therefore converts the input image data from the RGB color space to the YCbCr color space.
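The conversion performed by the color space conversion unit 41 might be sketched as follows. The text does not state which conversion matrix is used, so the full-range BT.601 (JFIF) coefficients here are an assumption, and the function names `rgb_to_ycbcr` and `convert_image` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr.

    Assumption: full-range BT.601 (JFIF) coefficients; the source text
    does not specify the matrix.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def convert_image(rgb_image):
    """Split an image given as rows of (R, G, B) tuples into Y, Cb, Cr planes."""
    y_plane, cb_plane, cr_plane = [], [], []
    for row in rgb_image:
        converted = [rgb_to_ycbcr(*pixel) for pixel in row]
        y_plane.append([c[0] for c in converted])
        cb_plane.append([c[1] for c in converted])
        cr_plane.append([c[2] for c in converted])
    return y_plane, cb_plane, cr_plane
```

Keeping the three components as separate planes matches the later steps, which process the Y, Cb, and Cr components one at a time.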

The resolution conversion unit 42 performs resolution conversion on the through image data that has undergone color space conversion. This processing lowers the resolution of the input through image below its original resolution; in other words, it reduces the image size of the through image.
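As a rough illustration of this step, the sketch below shrinks one component plane by averaging non-overlapping blocks. The text does not say how the resolution is reduced, so block averaging, the integer `factor`, and the name `downscale` are all assumptions.

```python
def downscale(plane, factor):
    """Shrink a 2-D plane by averaging non-overlapping factor x factor blocks.

    Assumption: box averaging with an integer factor; rows and columns that
    do not fill a complete block are dropped.
    """
    height = len(plane) // factor
    width = len(plane[0]) // factor
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [
                plane[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```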

In the first embodiment, the subject detection unit 32 includes the color space conversion unit 41 and the resolution conversion unit 42. However, if through image data that has already undergone color space conversion and resolution conversion is assumed to be input to the subject detection unit 32, the color space conversion unit 41 and the resolution conversion unit 42 can be omitted. Likewise, if color space conversion and resolution conversion are assumed to be executed together in a single step as preprocessing for the subject detection process, the color space conversion unit 41 and the resolution conversion unit 42 can be combined into a single component (an image conversion unit).

The difference image generation unit 43 calculates the average pixel value of the through image represented in the YCbCr color space, separately for the Y, Cb, and Cr components. Using the average calculated for each component, the difference image generation unit 43 generates a reference density image for each of the Y, Cb, and Cr components. Each reference density image has the same resolution as the through image after resolution conversion, and all of its pixels share the same pixel value.
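A minimal sketch of the reference density image for one component, assuming the plane is held as a list of rows (the function name `reference_density_image` is hypothetical):

```python
def reference_density_image(plane):
    """Build a uniform image, the same size as `plane`, filled with its mean value."""
    height, width = len(plane), len(plane[0])
    mean = sum(sum(row) for row in plane) / (height * width)
    return [[mean] * width for _ in range(height)]
```

For the plane `[[10, 20], [30, 40]]` the mean is 25, so every pixel of the reference density image is 25.0.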

Among the Y, Cb, and Cr components, the pixel values of the Cb and Cr components are known to be smaller than those of the Y component. Consequently, among the masks of each component obtained by binarization, the Y component masks receive high evaluation values. Y component masks with high evaluation values often correspond to high-luminance regions, such as regions where light is reflected or background regions. As a result, a region that contains no subject may be identified as the subject region. Therefore, so that the evaluation values of the masks generated from the binarized images of each component are calculated appropriately, the pixel values of the Cb and Cr components may be multiplied in advance by a predetermined coefficient (for example, 3).

Using the through image after resolution conversion and the reference density images, the difference image generation unit 43 generates two difference images for each of the Y, Cb, and Cr components: a positive difference image and a negative difference image. The positive difference image represents the distribution of pixels in the resolution-converted through image whose pixel values exceed the pixel value of the reference density image. The negative difference image represents the distribution of pixels in the resolution-converted through image whose pixel values are less than the pixel value of the reference density image.

When generating the positive difference image of the Y component, the difference image generation unit 43 sets the pixel value of each pixel whose difference value exceeds 0 to that difference value, and sets the pixel value of each pixel whose difference value is 0 or less to 0. Alternatively, the difference image generation unit 43 may set the pixel value of each pixel whose difference value is 0 or more to that difference value and assign no difference value to pixels whose difference value is less than 0. In either case, the positive difference image of the Y component preserves the gradation of the pixels whose Y component value exceeds the pixel value of the Y component reference density image. For those pixels, the magnitude of the pixel value in the positive difference image indicates the degree of deviation from the Y component reference density image.

When generating the negative difference image of the Y component, the difference image generation unit 43 sets the pixel value of each pixel whose difference value is 0 or more to 0, and sets the pixel value of each pixel whose difference value is less than 0 to the absolute value of that difference value. Alternatively, the difference image generation unit 43 sets the pixel value of each pixel whose difference value exceeds 0 to 0, and sets each pixel whose difference value is less than 0 to the absolute value of that difference value. In either case, the negative difference image of the Y component preserves the gradation of the pixels whose Y component value is less than the pixel value of the Y component reference density image. For those pixels, the magnitude of the pixel value in the negative difference image indicates the degree of deviation from the reference density image.

While generating the positive and negative difference images of the Y component by the method described above, the difference image generation unit 43 also generates the positive and negative difference images of the Cb component and the positive and negative difference images of the Cr component. The difference image generation unit 43 therefore generates six difference images in total.
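The generation of the six difference images can be sketched as below. This follows the first variant described in the text (pixels on the wrong side of the mean are set to 0) and takes the difference value as the pixel value minus the component mean. The function names and the `"Y+"` / `"Y-"` style keys are hypothetical.

```python
def difference_images(plane):
    """Return (positive, negative) difference images for one component plane.

    Difference value = pixel value - mean of the plane (the uniform value of
    the reference density image). Pixels on the other side of the mean are 0.
    """
    height, width = len(plane), len(plane[0])
    mean = sum(sum(row) for row in plane) / (height * width)
    positive = [[max(v - mean, 0.0) for v in row] for row in plane]
    negative = [[max(mean - v, 0.0) for v in row] for row in plane]
    return positive, negative


def all_difference_images(planes):
    """planes: {"Y": ..., "Cb": ..., "Cr": ...} -> dict holding six difference images."""
    result = {}
    for name, plane in planes.items():
        positive, negative = difference_images(plane)
        result[name + "+"] = positive
        result[name + "-"] = negative
    return result
```

The negative image stores the absolute value of each negative difference, so both images hold non-negative deviations from the reference density.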

In the description above, the difference image generation unit 43 generates the negative difference images of the Y, Cb, and Cr components from the result of subtracting the reference density image of the corresponding component from the image of each component. The method is not limited to this. The negative difference image may instead be generated by subtracting the image of the corresponding component from the reference density image, then setting the pixel value of each pixel whose difference value exceeds 0 to that difference value and setting the pixel value of each pixel whose difference value is 0 or less to 0.

The image determination unit 44 generates a histogram for each of the six difference images. For each histogram, the image determination unit 44 then obtains the standard deviation σ and the peak-to-peak value pp. The standard deviation σ and the peak-to-peak value pp serve as indices of the variation (or spread) of the pixel value distribution in the histogram. For a positive difference image in which pixels whose difference value exceeds 0 take that difference value and pixels whose difference value is 0 or less are set to 0, as described above, the peak-to-peak value pp is, for example, the width from the pixel value with the highest frequency to the pixel value with the second highest frequency. For a positive difference image in which pixels whose difference value is 0 or more take that difference value and pixels whose difference value is less than 0 have no difference value, the peak-to-peak value is, for example, the width from 0 to the maximum value of the histogram.
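The two indices might be computed as in the following sketch. It uses the first definition of pp given in the text (the distance between the most frequent and second most frequent pixel values). Treating σ as the population standard deviation of the pixel values, and rounding pixel values to integer histogram bins, are assumptions; `histogram_stats` is a hypothetical name.

```python
from collections import Counter
from math import sqrt


def histogram_stats(diff_image):
    """Return (sigma, pp) for one difference image.

    sigma: population standard deviation of the pixel values (assumption).
    pp:    distance between the most frequent and second most frequent
           pixel values, or 0 if only one value occurs.
    """
    values = [round(v) for row in diff_image for v in row]
    histogram = Counter(values)
    mean = sum(values) / len(values)
    sigma = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    ranked = histogram.most_common(2)
    pp = abs(ranked[0][0] - ranked[1][0]) if len(ranked) > 1 else 0
    return sigma, pp
```

When two pixel values tie in frequency, `Counter.most_common` keeps them in first-seen order, so a real implementation would need an explicit tie-breaking rule.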

The peak-to-peak value pp described above is only an example; any index that indicates the variation in the distribution of pixel values and frequencies in the histogram may be used. For example, instead of using the difference images, a histogram may be generated for each component of the through image represented in the YCbCr color space, and the peak-to-peak value pp may be obtained relative to the value with the highest frequency in each component's histogram.

The image determination unit 44 compares the standard deviation σ obtained for each histogram with a threshold Th1. The image determination unit 44 also compares the peak-to-peak value pp of each histogram with a threshold Th2. The thresholds Th1 and Th2 are used to judge the variation in the distribution of pixel values in a difference image. Th1 and Th2 are different thresholds, and each is set independently.

Consider, for example, a through image whose shooting range is filled with many flowers. In the captured through image, the flower region occupies most of the image. In the difference images of the color difference components, therefore, the color difference values corresponding to the flower color have a high frequency, and the variation in the color difference values is small.

For example, if the flowers are yellow, the standard deviation σ and the peak-to-peak value pp of the histogram generated from the negative difference image of the Cb component satisfy σ < Th1 and pp < Th2. In contrast, the histograms of the positive difference image of the Cb component and of the positive and negative difference images of the Cr component do not satisfy σ < Th1 and pp < Th2. In this case, the image determination unit 44 determines, for the negative difference image of the Cb component only, that no object that can be identified as a subject is present, and excludes the negative difference image of the Cb component from the difference images used to identify the subject.

このように、花の色が他の単一色の場合には、Ｃｂ成分及びＣｒ成分の差分画像のいずれかにおいて、標準偏差σ＜閾値Ｔｈ１、且つピーク・ピーク値ｐｐ＜Ｔｈ２となる。したがって、画像判定部４４は、標準偏差σ＜閾値Ｔｈ１、且つピーク・ピーク値ｐｐ＜Ｔｈ２となる色差成分の差分画像を、被写体を特定する際に用いる差分画像から除外する。ここでは、複数の花の画像を例に挙げて説明しているが、森林の画像など、単一の色成分がスルー画像の大部分を占めている画像の場合には、いずれかの色差成分の差分画像において、上述した判定となりやすい。したがって、画像判定部４４は、標準偏差σ＜閾値Ｔｈ１、且つピーク・ピーク値ｐｐ＜閾値Ｔｈ２となるＣｂ成分又はＣｒ成分の差分画像を、被写体を特定する際に用いる差分画像から除外する。 Thus, when the color of the flower is another single color, the standard deviation σ <threshold Th1 and the peak-peak value pp <Th2 are satisfied in either of the Cb component and Cr component difference images. Therefore, the image determination unit 44 excludes the difference image of the color difference component satisfying the standard deviation σ <threshold Th1 and the peak / peak value pp <Th2 from the difference image used when specifying the subject. Here, a plurality of flower images are described as examples. However, in the case of an image in which a single color component occupies most of the through image, such as a forest image, any color difference component is used. In the difference image, it is easy to make the above-described determination. Therefore, the image determination unit 44 excludes the Cb component or Cr component difference image satisfying the standard deviation σ <threshold Th1 and the peak-peak value pp <threshold Th2 from the difference image used when the subject is specified.

一方、複数の花の他に、異なる被写体や背景部分を含めた撮像範囲のスルー画像では、複数の花が占める領域が、異なる被写体の領域や背景部分の領域と同一の大きさ又はそれ以下の大きさとなる、つまり、撮像範囲に占める割合が小さくなる。この場合、色差成分の差分画像のヒストグラムでは、特定色の画素値の度数は高くなるが、画素値のばらつきが大きい。このような画像では、標準偏差σの値は小さいが、ピーク・ピーク値ｐｐの値は大きくなりやすい。このような差分画像は、標準偏差σ＜閾値Ｔｈ１、且つピーク・ピーク値ｐｐ≧閾値Ｔｈ２、又は、標準偏差σ≧閾値Ｔｈ１、且つピーク・ピーク値ｐｐ＜閾値Ｔｈ２であると判定される。したがって、画像判定部４４は、該当する差分画像を、被写体を特定する際に用いる差分画像から除外することはせずに保持する。 On the other hand, in the through image of the imaging range including different subjects and background parts in addition to a plurality of flowers, the area occupied by the plurality of flowers is the same size or smaller than the area of the different subjects and the background part area. It becomes the size, that is, the proportion of the imaging range is reduced. In this case, in the histogram of the difference image of the color difference component, the frequency of the pixel value of the specific color is high, but the variation of the pixel value is large. In such an image, the value of the standard deviation σ is small, but the peak / peak value pp tends to be large. Such a difference image is determined as standard deviation σ <threshold Th1, and peak-peak value pp ≧ threshold Th2, or standard deviation σ ≧ threshold Th1, and peak-peak value pp <threshold Th2. Therefore, the image determination unit 44 retains the corresponding difference image without excluding it from the difference image used when specifying the subject.

さらに、突出した色差成分がないスルー画像の場合、差分画像のヒストグラムは、標準偏差σ≧閾値Ｔｈ１、且つピーク・ピーク値ｐｐ≧閾値Ｔｈ２となりやすい。この場合も、画像判定部４４は、該当する差分画像を、被写体を特定する際に用いる差分画像から除外することはせずに保持する。 Further, in the case of a through image without a protruding color difference component, the histogram of the difference image tends to satisfy standard deviation σ ≧ threshold Th1 and peak-peak value pp ≧ threshold Th2. Also in this case, the image determination unit 44 holds the corresponding difference image without excluding it from the difference image used when specifying the subject.

なお、色差成分の差分画像を対象にして説明しているが、上記判定は、Ｃｂ成分及びＣｒ成分の差分画像の他、Ｙ成分の差分画像に対しても実行される。 Although the description above uses the difference images of the color difference components as an example, the determination is performed not only on the difference images of the Cb and Cr components but also on the difference images of the Y component.

ここで、上記判定を行ったときに、生成した６個の差分画像の全ての差分画像が、被写体を特定する際に用いる差分画像から除外されてしまう場合もある。この場合、画像判定部４４は、一旦除外された６個の差分画像を、被写体を特定する際に用いる差分画像として再度保持する。 When the above determination is performed, all six generated difference images may end up excluded from the difference images used to specify the subject. In this case, the image determination unit 44 restores the six excluded difference images and holds them again as the difference images used to specify the subject.
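The determination by the image determination unit 44 can be sketched as follows. This is a minimal illustration only: the function name `select_difference_images`, the dictionary layout, and the labels such as `"Cb-"` are hypothetical, and σ and pp are assumed to have been computed from each histogram beforehand.

```python
def select_difference_images(stats, th1, th2):
    """Return the names of the difference images kept for subject detection.

    stats maps a difference-image name to (sigma, pp), where sigma is the
    standard deviation and pp the peak-to-peak value of its histogram.
    """
    # An image is excluded only when BOTH sigma < Th1 and pp < Th2 hold.
    kept = [name for name, (sigma, pp) in stats.items()
            if not (sigma < th1 and pp < th2)]
    # If every image was excluded, all of them are held again.
    return kept if kept else list(stats)
```

For a scene dominated by yellow flowers, only the entry for the negative Cb difference image would be expected to fall below both thresholds and be dropped.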

二値化処理部４５は、保持された差分画像を所定の閾値で二値化する二値化処理を行う。以下、保持される差分画像の種類に応じた二値化処理について説明する。 The binarization processing unit 45 performs binarization processing that binarizes the held difference image with a predetermined threshold. Hereinafter, the binarization process according to the type of difference image to be held will be described.

Ｙ成分の正の差分画像に対する二値化処理を行う場合、二値化処理部４５は、Ｙ成分の正の差分画像のヒストグラムから求まる標準偏差σに、係数Ｋ_１及びＫ_２（Ｋ_１＜Ｋ_２）を乗算した値を閾値Ｋ_１σ、Ｋ_２σとして設定する。これら閾値Ｋ_１σ、Ｋ_２σの設定の後、二値化処理部４５は、Ｙ成分の正の差分画像に対して、閾値Ｋ_１σを用いて二値化処理を行う。次に、二値化処理部４５は、Ｙ成分の正の差分画像と、閾値Ｋ_２σとを用いて二値化処理を行う。つまり、二値化処理部４５は、Ｙ成分の正の差分画像から２個の二値化画像を生成する。 When binarizing the positive difference image of the Y component, the binarization processing unit 45 multiplies the standard deviation σ obtained from the histogram of the positive difference image of the Y component by the coefficients K₁ and K₂ (K₁ < K₂) and sets the products as the thresholds K₁σ and K₂σ. After setting the thresholds K₁σ and K₂σ, the binarization processing unit 45 binarizes the positive difference image of the Y component using the threshold K₁σ. Next, the binarization processing unit 45 binarizes the positive difference image of the Y component using the threshold K₂σ. That is, the binarization processing unit 45 generates two binarized images from the positive difference image of the Y component.

また、Ｙ成分の負の差分画像に対する二値化処理を行う場合、二値化処理部４５は、Ｙ成分の負の差分画像から求まる標準偏差σに係数Ｋ_３を乗算した値を閾値Ｋ_３σとして設定する。閾値Ｋ_３σの設定の後、二値化処理部４５は、Ｙ成分の負の差分画像と、閾値Ｋ_３σとを用いて二値化処理を行う。つまり、二値化処理部４５は、Ｙ成分の負の差分画像から、１個の二値化画像を生成する。ここで、上述した係数Ｋ_１，係数Ｋ_２，係数Ｋ_３は、撮影シーンなどの撮影条件に応じて設定される値である。 When binarizing the negative difference image of the Y component, the binarization processing unit 45 multiplies the standard deviation σ obtained from the negative difference image of the Y component by the coefficient K₃ and sets the product as the threshold K₃σ. After setting the threshold K₃σ, the binarization processing unit 45 binarizes the negative difference image of the Y component using the threshold K₃σ. That is, the binarization processing unit 45 generates one binarized image from the negative difference image of the Y component. Here, the coefficients K₁, K₂, and K₃ described above are values set according to shooting conditions such as the shooting scene.

Ｃｂ成分の正の差分画像に対する二値化処理を行う場合、二値化処理部４５は、閾値として、閾値Ｌ_１及び閾値Ｌ_２（Ｌ_１＜Ｌ_２）を設定する。二値化処理部４５は、設定された閾値Ｌ_１及び閾値Ｌ_２と、Ｃｂ成分の正の差分画像とを用いて二値化処理を行う。つまり、二値化処理部４５は、Ｃｂ成分の正の差分画像から、２個の二値化画像を生成する。 When binarizing the positive difference image of the Cb component, the binarization processing unit 45 sets the thresholds L₁ and L₂ (L₁ < L₂). The binarization processing unit 45 binarizes the positive difference image of the Cb component with each of the thresholds L₁ and L₂. That is, the binarization processing unit 45 generates two binarized images from the positive difference image of the Cb component.

また、Ｃｂ成分の負の差分画像に対する二値化処理を行う場合、二値化処理部４５は、閾値として、閾値Ｌ_３を設定する。二値化処理部４５は、設定された閾値Ｌ_３と、Ｃｂ成分の負の差分画像とを用いて二値化処理を行う。つまり、二値化処理部４５は、Ｃｂ成分の負の差分画像から、１個の二値化画像を生成する。ここで、上述した閾値Ｌ_１，閾値Ｌ_２，閾値Ｌ_３は、撮影シーンなどの撮影条件に応じて設定される値である。 When binarizing the negative difference image of the Cb component, the binarization processing unit 45 sets the threshold L₃. The binarization processing unit 45 binarizes the negative difference image of the Cb component using the threshold L₃. That is, the binarization processing unit 45 generates one binarized image from the negative difference image of the Cb component. Here, the thresholds L₁, L₂, and L₃ described above are values set according to shooting conditions such as the shooting scene.

Ｃｒ成分の正の差分画像に対する二値化処理を行う場合、二値化処理部４５は、Ｃｒ成分の正の差分画像に対して設定された閾値として、閾値Ｍ_１及び閾値Ｍ_２（Ｍ_１＜Ｍ_２）を設定する。二値化処理部４５は、設定された閾値Ｍ_１及び閾値Ｍ_２と、Ｃｒ成分の正の差分画像とを用いて二値化処理を行う。つまり、二値化処理部４５は、Ｃｒ成分の正の差分画像から、２個の二値化画像を生成する。 When binarizing the positive difference image of the Cr component, the binarization processing unit 45 sets the thresholds M₁ and M₂ (M₁ < M₂) for that image. The binarization processing unit 45 binarizes the positive difference image of the Cr component with each of the thresholds M₁ and M₂. That is, the binarization processing unit 45 generates two binarized images from the positive difference image of the Cr component.

また、Ｃｒ成分の負の差分画像に対する二値化処理を行う場合、二値化処理部４５は、閾値として閾値Ｍ_３を設定する。二値化処理部４５は、設定された閾値Ｍ_３を用いて二値化処理を行う。つまり、二値化処理部４５は、Ｃｒ成分の負の差分画像から、１個の二値化画像を生成する。ここで、上述した閾値Ｍ_１，閾値Ｍ_２，閾値Ｍ_３は、撮影シーンなどの撮影条件に応じて設定される値である。 When binarizing the negative difference image of the Cr component, the binarization processing unit 45 sets the threshold M₃. The binarization processing unit 45 performs the binarization using the threshold M₃. That is, the binarization processing unit 45 generates one binarized image from the negative difference image of the Cr component. Here, the thresholds M₁, M₂, and M₃ described above are values set according to shooting conditions such as the shooting scene.

第１実施形態では、各成分の正の差分画像に対しては、２つの閾値をそれぞれ用いて２つの二値化画像を、負の差分画像に対しては、１つの閾値を用いて１つの二値化画像を生成しているが、用いる閾値の数や、生成する二値化画像の数は、上記に限定されるものではなく、適宜設定してよい。 In the first embodiment, for a positive difference image of each component, two binary images are respectively used using two threshold values, and for a negative difference image, one threshold value is used. Although a binarized image is generated, the number of threshold values to be used and the number of binarized images to be generated are not limited to the above, and may be set as appropriate.

以下、各差分画像に対する二値化処理を行ったときに閾値以上となる画素を白画素とし、閾値未満となる画素を黒画素とする。二値化処理部４５は、各差分画像に対する二値化処理を行って生成された二値化画像に対して、画素のまとまりを求めるラベリング処理を実行する。このラベリング処理を行った後、二値化処理部４５は、白画素のかたまりをマスク（島領域）として抽出する。 Hereinafter, pixels that are equal to or higher than the threshold when binarization processing is performed on each difference image are white pixels, and pixels that are less than the threshold are black pixels. The binarization processing unit 45 executes a labeling process for obtaining a group of pixels on the binarized image generated by performing the binarization process on each difference image. After performing the labeling process, the binarization processing unit 45 extracts a group of white pixels as a mask (island area).
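As a rough illustration of the binarization and labeling described above, the following sketch marks pixels at or above the threshold as white (1) and groups white pixels into masks. The function names are hypothetical, and the use of 4-connectivity for the labeling is an assumption; the description does not state which connectivity is used.

```python
from collections import deque

def binarize(image, threshold):
    """Pixels >= threshold become white (1); pixels below become black (0)."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]

def label_masks(binary):
    """Group white pixels into masks (islands), assuming 4-connectivity.

    Returns a list of masks, each a set of (y, x) pixel coordinates.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    masks = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited white pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            mask = set()
            while queue:
                cy, cx = queue.popleft()
                mask.add((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            masks.append(mask)
    return masks
```

Calling `binarize` twice on the same positive difference image, once with the lower and once with the higher threshold, yields the two binarized images described for the positive difference images.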

マスク絞込部４６は、二値化画像から抽出されたマスクのうち、被写体候補として用いるマスクを絞り込む。以下、被写体候補として用いるマスクを被写体候補のマスクと称する。 The mask narrowing unit 46 narrows down the masks used as subject candidates among the masks extracted from the binarized image. Hereinafter, a mask used as a subject candidate is referred to as a subject candidate mask.

二値化画像から抽出されたマスクの中には、二値化画像に対するマスクの面積比が１％以下となるマスクが存在する。マスク絞込部４６は、二値化画像から抽出されたマスクに対して、二値化画像に対するマスクの面積比が１％以下となるか否かを判定する。マスク絞込部４６は、二値化画像に対するマスクの面積比が１％以下となるマスクをノイズであると判定し、被写体候補のマスクから除外する。 Among the masks extracted from the binarized image, there is a mask in which the area ratio of the mask to the binarized image is 1% or less. The mask narrowing unit 46 determines whether the area ratio of the mask to the binarized image is 1% or less with respect to the mask extracted from the binarized image. The mask narrowing unit 46 determines that a mask whose area ratio of the mask to the binarized image is 1% or less is noise, and excludes it from the subject candidate mask.

また、二値化画像から抽出されたマスクの中には、二値化画像に対するマスクの面積比が６０％以上となるマスクが存在する。マスク絞込部４６は、二値化画像から抽出されたマスクに対して、二値化画像に対するマスクの面積比が６０％以上となるか否かを判定する。マスク絞込部４６は、二値化画像に対するマスクの面積比が６０％以上となるマスクを背景であると判定し、被写体候補のマスクから除外する。 Among the masks extracted from the binarized image, there is a mask having an area ratio of the mask to the binarized image of 60% or more. The mask narrowing unit 46 determines whether the area ratio of the mask to the binarized image is 60% or more with respect to the mask extracted from the binarized image. The mask narrowing unit 46 determines that the mask whose ratio of the mask area to the binarized image is 60% or more is the background, and excludes it from the subject candidate mask.

また、二値化画像から抽出されたマスクの中には、マスクを含む矩形領域に対するマスクの面積比が所定の比率（例えば０．２）以下となるマスクもある。マスク絞込部４６は、二値化画像から抽出されたマスクに対して、マスクを含む矩形領域に対するマスクの面積比が所定の比率以下となるか否かを判定する。マスク絞込部４６は、マスクを含む矩形領域に対するマスクの面積比が所定の比率以下となるマスクを充填率が低いマスクであると判定し、被写体候補のマスクから除外する。 Among the masks extracted from the binarized image, there is a mask in which the area ratio of the mask to the rectangular area including the mask is a predetermined ratio (for example, 0.2) or less. The mask narrowing unit 46 determines whether the area ratio of the mask to the rectangular area including the mask is equal to or less than a predetermined ratio with respect to the mask extracted from the binarized image. The mask narrowing unit 46 determines that the mask whose area ratio of the mask to the rectangular area including the mask is equal to or less than a predetermined ratio is a mask with a low filling rate, and excludes the mask from the subject candidate masks.

なお、二値化画像から抽出されたマスクの中には、二値化画像の外周に相当する４辺のうち、隣り合う２辺にかかるマスクや、マスクを含む矩形領域の縦横比が所定の範囲（例えば０．２以上５未満）に含まれないマスクもある。したがって、抽出されるマスクの中に、これらのマスクが存在している場合、マスク絞込部４６は、これらマスクを被写体としては適していないマスクとして、被写体候補のマスクから除外してもよい。 Among the masks extracted from the binarized image, among the four sides corresponding to the outer periphery of the binarized image, the masks on adjacent two sides and the aspect ratio of the rectangular area including the mask are predetermined. Some masks are not included in the range (for example, 0.2 or more and less than 5). Therefore, when these masks are present in the extracted masks, the mask narrowing unit 46 may exclude these masks from the subject candidate masks as masks that are not suitable as subjects.
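The three mandatory narrowing checks can be sketched as follows. The function name and the representation of a mask as a set of (y, x) coordinates are assumptions made for illustration; the optional checks on adjacent image edges and on the aspect ratio of the rectangle are omitted.

```python
def is_subject_candidate(mask, image_height, image_width, min_fill=0.2):
    """Apply the noise, background, and filling-rate checks to one mask."""
    area = len(mask)
    image_ratio = area / (image_height * image_width)
    if image_ratio <= 0.01:
        return False  # 1% or less of the image: treated as noise
    if image_ratio >= 0.60:
        return False  # 60% or more of the image: treated as background
    ys = [y for y, _ in mask]
    xs = [x for _, x in mask]
    # Area of the rectangle that contains the mask.
    box_area = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
    if area / box_area <= min_fill:
        return False  # low filling rate
    return True
```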

また、マスク絞込部４６は、二値化画像から抽出されるマスクの数を二値化画像毎に計数する。マスク絞込部４６は、１個の二値化画像から抽出されるマスクの数が閾値Ｔｈ３以上であるか否かを判定する。そして、同一の二値化画像から抽出されるマスクの数が閾値Ｔｈ３以上であると判定された場合に、マスク絞込部４６は、対象となるマスク全体の平均強度を求める。ここで、マスク全体の平均強度とは、差分画像中の全てのマスクに該当する画素の画素値の平均値（平均画素値）が挙げられる。マスク絞込部４６は、求めたマスク全体の平均強度が閾値Ｔｈ４以下となる場合には、元になる差分画像には被写体が含まれていないと判定し、該当する差分画像から抽出される全てのマスクを被写体候補のマスクから除外する。一方、マスク全体の平均強度が閾値Ｔｈ４を超過する場合には、マスク絞込部４６は、元になる差分画像には被写体が含まれていると判定する。この場合、対象となるマスクは、被写体候補のマスクとして保持される。 In addition, the mask narrowing unit 46 counts the number of masks extracted from the binarized image for each binarized image. The mask narrowing unit 46 determines whether or not the number of masks extracted from one binarized image is equal to or greater than the threshold Th3. When it is determined that the number of masks extracted from the same binarized image is equal to or greater than the threshold Th3, the mask narrowing unit 46 obtains the average intensity of the entire target mask. Here, the average intensity of the entire mask includes an average value (average pixel value) of pixel values of pixels corresponding to all masks in the difference image. The mask narrowing unit 46 determines that the subject is not included in the original difference image when the obtained average intensity of the entire mask is equal to or less than the threshold Th4, and all the extracted from the corresponding difference image Are excluded from the subject candidate masks. On the other hand, when the average intensity of the entire mask exceeds the threshold Th4, the mask narrowing unit 46 determines that the subject is included in the original difference image. In this case, the target mask is held as a mask for the subject candidate.

なお、マスク絞込部４６は、同一の二値化画像から抽出されるマスクの数が閾値Ｔｈ３以上で、且つマスク全体の平均強度が閾値Ｔｈ４以下となるか否かを判定している。しかしながら、二値化画像から抽出されるマスクの数が閾値Ｔｈ３以上となるか否かを判定し、二値化画像から抽出されるマスクの数が閾値Ｔｈ３以上となる場合に、該当する二値化画像から抽出されるマスクを被写体候補のマスクから除外することも可能である。 The mask narrowing unit 46 determines whether or not the number of masks extracted from the same binarized image is equal to or greater than the threshold value Th3 and the average intensity of the entire mask is equal to or less than the threshold value Th4. However, when it is determined whether or not the number of masks extracted from the binarized image is equal to or greater than the threshold value Th3, and the number of masks extracted from the binarized image is equal to or greater than the threshold value Th3, the corresponding binary value is determined. It is also possible to exclude the mask extracted from the converted image from the subject candidate mask.
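The count-and-intensity check performed per binarized image might look like the sketch below. The names are hypothetical, masks are assumed to be sets of (y, x) coordinates, and `diff_image` is assumed to be the difference image from which the binarized image was generated.

```python
def keep_masks_of_image(masks, diff_image, th3, th4):
    """Return the masks to keep for one binarized image.

    When the number of masks is Th3 or more, the average pixel value over
    all mask pixels is compared with Th4; if it is Th4 or less, every mask
    of this image is dropped.
    """
    if not masks or len(masks) < th3:
        return masks
    values = [diff_image[y][x] for mask in masks for (y, x) in mask]
    mean_intensity = sum(values) / len(values)
    return masks if mean_intensity > th4 else []
```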

評価値算出部４７は、マスク絞込部４６により絞り込まれた被写体候補のマスクのそれぞれに対して評価値を算出する。評価値は、一例として、マスクの面積、マスクに対する慣性モーメント、マスクの平均強度から求まる値である。ここで、マスクの平均強度は、マスク抽出の元になる差分画像のうちマスクに対応する各画素の平均画素値から求めることができる。 The evaluation value calculation unit 47 calculates an evaluation value for each of the subject candidate masks narrowed down by the mask narrowing unit 46. For example, the evaluation value is a value obtained from the mask area, the moment of inertia with respect to the mask, and the average strength of the mask. Here, the average intensity of the mask can be obtained from the average pixel value of each pixel corresponding to the mask in the difference image from which the mask is extracted.

まず、評価値算出部４７は、マスクの慣性モーメントを算出する。このとき、評価値算出部４７は、被写体候補のマスクを有している二値化画像ごとに、慣性モーメントの算出のための基準点をそれぞれ設定する。例えば、評価値算出部４７は、二値化画像に残されている被写体候補のマスクの重心を求め、この重心の位置をその二値化画像での基準点とする。 First, the evaluation value calculation unit 47 calculates the moment of inertia of the mask. At this time, the evaluation value calculation unit 47 sets a reference point for calculating the moment of inertia for each binarized image having a subject candidate mask. For example, the evaluation value calculating unit 47 obtains the center of gravity of the mask of the subject candidate remaining in the binarized image, and uses the position of the center of gravity as the reference point in the binarized image.

例えば、評価値算出部４７は、二値化画像内に複数の被写体候補のマスクがあるときに、被写体候補のマスクの重心の平均をとって基準点を設定する。このとき、評価値算出部４７は、基準点Ｐの座標（ｘ，ｙ）を以下の式（１）、式（２）により求めればよい。 For example, when there are a plurality of subject candidate masks in the binarized image, the evaluation value calculation unit 47 sets the reference point by taking the average of the centroids of the subject candidate masks. At this time, the evaluation value calculation unit 47 may obtain the coordinates (x, y) of the reference point P by the following equations (1) and (2).

Ｐｘ＝Σ（Ｇｘ_ｎ）／ｎ・・・（１） Px = Σ(Gx_n) / n ... (1)
Ｐｙ＝Σ（Ｇｙ_ｎ）／ｎ・・・（２） Py = Σ(Gy_n) / n ... (2)
ここで、「Ｐｘ」は二値化画像での基準点Ｐのｘ座標を示し、「Ｐｙ」は二値化画像での基準点Ｐのｙ座標を示す。「ｎ」は、二値化画像に含まれる被写体候補のマスクを示す変数であって、１以上の整数の値をとる。「Ｇｘ_ｎ」は、被写体候補のマスクｎでの重心のｘ座標を示し、「Ｇｙ_ｎ」は、被写体候補のマスクｎでの重心のｙ座標を示す。 Here, “Px” is the x coordinate of the reference point P in the binarized image, and “Py” is the y coordinate of the reference point P in the binarized image. “n” is a variable that identifies a subject candidate mask included in the binarized image and takes an integer value of 1 or more. “Gx_n” is the x coordinate of the centroid of subject candidate mask n, and “Gy_n” is the y coordinate of the centroid of subject candidate mask n.

図２０は、或る二値化画像での基準点の算出例を示す図である。図７は、二値化画像内に被写体候補のマスク１〜３が存在する例を示している。式（１）および式（２）によれば、被写体候補のマスク１〜３の重心の平均が基準点Ｐの位置となるので、図７での基準点Ｐは、被写体候補のマスク１〜３の重心を頂点とする三角形の内側に位置することとなる。また、図７での被写体候補のマスク１〜３は画面の左側に偏って存在するため、図７での基準点Ｐの位置は画面中央（点Ｏ）よりも左側にずれた位置となる。 FIG. 20 is a diagram illustrating a calculation example of a reference point in a certain binarized image. FIG. 7 shows an example where subject candidate masks 1 to 3 exist in the binarized image. According to the equations (1) and (2), the average of the centers of gravity of the subject candidate masks 1 to 3 is the position of the reference point P. Therefore, the reference point P in FIG. It will be located inside the triangle with its center of gravity as the vertex. Since the subject candidate masks 1 to 3 in FIG. 7 are biased to the left side of the screen, the position of the reference point P in FIG. 7 is shifted to the left side of the screen center (point O).

そして、評価値算出部４７は、基準点Ｐからの画素距離の２乗×（０または１）の和により、被写体候補のマスクの慣性モーメントをそれぞれ算出する。なお、評価値算出部４７での基準点Ｐの設定と、マスクの慣性モーメントの算出は、二値化画像ごとに行われる。つまり、異なる二値化画像の間では、各画像に含まれる被写体候補のマスクがそれぞれ異なるので、各々の二値化画像での基準点Ｐの位置も相違することとなる。 Then, the evaluation value calculation unit 47 calculates the moment of inertia of the subject candidate mask based on the sum of the square of the pixel distance from the reference point P × (0 or 1). Note that the setting of the reference point P and the calculation of the inertia moment of the mask in the evaluation value calculation unit 47 are performed for each binarized image. That is, since the masks of subject candidates included in each image are different between different binarized images, the position of the reference point P in each binarized image is also different.
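The reference point of equations (1) and (2) and the moment of inertia can be sketched as follows. Function names are hypothetical and masks are assumed to be sets of (y, x) coordinates. Summing the squared distance only over the pixels of a mask is equivalent to weighting every pixel by its binary value (0 or 1).

```python
def centroid(mask):
    """Centroid (x, y) of one mask."""
    count = len(mask)
    return (sum(x for _, x in mask) / count, sum(y for y, _ in mask) / count)

def reference_point(masks):
    """Average of the mask centroids in one binarized image (eqs. (1), (2))."""
    centroids = [centroid(mask) for mask in masks]
    px = sum(cx for cx, _ in centroids) / len(centroids)
    py = sum(cy for _, cy in centroids) / len(centroids)
    return (px, py)

def moment_of_inertia(mask, reference):
    """Sum of squared pixel distances from the reference point P."""
    px, py = reference
    return sum((x - px) ** 2 + (y - py) ** 2 for y, x in mask)
```

Because `reference_point` is evaluated separately for each binarized image, the same mask shape can receive a different moment of inertia depending on which other candidate masks remain in its image.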

次に、評価値算出部４７は、マスクの面積と、マスクの平均強度とをそれぞれ求める。 Next, the evaluation value calculation unit 47 obtains the mask area and the average mask intensity, respectively.

最後に、評価値算出部４７は、以下の（３）式を用いて、各々のマスクに対する評価値を求める。以下、マスクに対する評価値をＥｖと称する。 Finally, the evaluation value calculation unit 47 obtains an evaluation value for each mask using the following equation (3). Hereinafter, the evaluation value for the mask is referred to as Ev.

Ｅｖ＝Ａｒ＾α／ＭＯＩ＋Ａｖ／β・・・（３） Ev = Ar^α / MOI + Av / β ... (3)
ここで、「Ａｒ」はマスクの面積、「ＭＯＩ」はマスクの慣性モーメント、「Ａｖ」はマスクの平均強度を示す。また、「α」及び「β」は、チューニングパラメータとしての係数である。係数αは１．５〜２程度に、係数βは１００程度に設定されるが、これら係数α、係数βは上記の例に限定されるものではない。 Here, “Ar” is the area of the mask, “MOI” is the moment of inertia of the mask, and “Av” is the average intensity of the mask. “α” and “β” are coefficients that serve as tuning parameters. The coefficient α is set to about 1.5 to 2 and the coefficient β is set to about 100, but the coefficients α and β are not limited to these examples.
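Equation (3) translates directly into code. In the sketch below the function name is hypothetical and the default values of `alpha` and `beta` are merely picked from the ranges mentioned above. A moment of inertia of zero (for example, a single-pixel mask lying exactly on the reference point) is not handled and would raise a division error.

```python
def evaluation_value(area, moi, avg_intensity, alpha=1.5, beta=100.0):
    """Ev = Ar^alpha / MOI + Av / beta, following equation (3)."""
    return area ** alpha / moi + avg_intensity / beta
```

A large mask concentrated near the reference point (large Ar, small MOI) with a high average intensity therefore receives a high evaluation value.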

マスク抽出部４８は、被写体候補のマスクから、被写体とするマスクを抽出する。まず、マスク抽出部４８は、被写体候補のマスクのそれぞれに対して求めた評価値Ｅｖを用いて、被写体候補のマスクの順位付けを行う。マスク抽出部４８は、順位付けされた被写体候補のマスクのうち、上位５位のマスクを最終候補のマスクとして選択する。マスク抽出部４８は、最終候補として選択したマスクに対して、以下の処理を実行する。 The mask extraction unit 48 extracts a mask as a subject from the subject candidate masks. First, the mask extraction unit 48 ranks the subject candidate masks using the evaluation value Ev obtained for each of the subject candidate masks. The mask extraction unit 48 selects the top five masks from the ranked subject candidate masks as the final candidate masks. The mask extraction unit 48 performs the following processing on the mask selected as the final candidate.

マスク抽出部４８は、最終候補となる上位５位のマスクの中に、Ｃｂ成分又はＣｒ成分の差分画像を元にして抽出されたマスクがあるか否かを判定する。Ｃｂ成分又はＣｒ成分の差分画像を元にして抽出されたマスクがあれば、マスク抽出部４８は、該当するマスクの平均強度を求める。マスク抽出部４８は、求めたマスクの平均強度が閾値Ｔｈ５以下であるか否かを判定する。マスクの平均強度が閾値Ｔｈ５以下の場合、マスク抽出部４８は、該当するマスクを最終候補のマスクから除外する。例えば、スルー画像に含まれる背景の領域の色差成分には、黄色成分やシアンの成分が多く含まれる。したがって、この判定における閾値Ｔｈ５は、黄色成分やシアン成分が多く含まれる背景部分のマスクを、被写体候補のマスクから除外するために設定される値となる。ここで、Ｃｂ成分又はＣｒ成分の差分画像を元に抽出されたマスクで、且つマスクの平均強度が閾値Ｔｈ５以下であるか否かの判定を、第１の抽出判定とする。 The mask extraction unit 48 determines whether there is a mask extracted based on the difference image of the Cb component or the Cr component among the top five masks that are final candidates. If there is a mask extracted based on the difference image of the Cb component or Cr component, the mask extracting unit 48 obtains the average intensity of the corresponding mask. The mask extraction unit 48 determines whether or not the obtained average mask intensity is equal to or less than the threshold Th5. When the average intensity of the mask is equal to or less than the threshold Th5, the mask extraction unit 48 excludes the corresponding mask from the final candidate mask. For example, the color difference component of the background region included in the through image includes a lot of yellow components and cyan components. Therefore, the threshold value Th5 in this determination is a value that is set to exclude the mask of the background portion that contains a lot of yellow and cyan components from the subject candidate mask. Here, it is assumed that the first extraction determination is to determine whether the mask is extracted based on the difference image of the Cb component or the Cr component and the average intensity of the mask is equal to or less than the threshold Th5.

次に、マスク抽出部４８は、最終候補となるマスクのうち、Ｃｂ成分又はＣｒ成分の差分画像を元に抽出されたマスクの下位に、Ｙ成分の差分画像を元に抽出されたマスクがあるか否かを判定する。そして、Ｃｂ成分及びＣｒ成分の差分画像を元に抽出されたマスクの下位に、Ｙ成分の差分画像を元に抽出されたマスクがあれば、マスク抽出部４８は、該当するマスクの平均強度を求める。マスク抽出部４８は、求めたマスクの平均強度が閾値Ｔｈ６以下であるか否かを判定する。この閾値Ｔｈ６は、例えば陰などの黒い領域や、色が抜けている白い領域などのマスクを、被写体候補のマスクから除外するために設定される値となる。ここで、Ｃｂ成分及びＣｒ成分の差分画像を元に抽出されたマスクの下位に、Ｙ成分の差分画像を元に抽出されたマスクがあるか否かの判定を第２の抽出判定とする。ここで、閾値Ｔｈ６の値は、差分画像の種類に応じて異なる値を用いてもよいし、差分画像の種類に関係なく固定値であってもよい。 Next, the mask extraction unit 48 has a mask extracted based on the difference image of the Y component below the mask extracted based on the difference image of the Cb component or the Cr component among the masks as final candidates. It is determined whether or not. If there is a mask extracted based on the difference image of the Y component below the mask extracted based on the difference image of the Cb component and the Cr component, the mask extraction unit 48 calculates the average intensity of the corresponding mask. Ask. The mask extraction unit 48 determines whether or not the obtained average mask intensity is equal to or less than the threshold Th6. This threshold Th6 is a value set to exclude a mask such as a black area such as a shadow or a white area where a color is missing from the subject candidate mask. Here, the determination as to whether or not there is a mask extracted based on the difference image of the Y component below the mask extracted based on the difference image of the Cb component and the Cr component is a second extraction determination. Here, the value of the threshold Th6 may be different depending on the type of the difference image, or may be a fixed value regardless of the type of the difference image.

次に、マスク抽出部４８は、最終候補となるマスクのうち、Ｙ成分の正の差分画像を元に抽出されたマスクと、Ｙ成分の負の差分画像を元に抽出されたマスクとがあるか否かを判定する。この判定でＹ成分の正の差分画像を元に抽出されたマスクとＹ成分の負の差分画像を元に抽出されたマスクがある場合には、マスク抽出部４８は、これらマスクのうち、上位のマスクを保持し、下位のマスクを被写体候補のマスクから除外する。画像中に輝度の高い（ハイライト）領域と、画像中に輝度の低い（シャドー）領域とが混在している場合、具体的には、晴天時に撮影をした場合、被写体の輝度は高く、被写体の陰の輝度は低い。また、これらに相当するマスクの評価値Ｅｖは、被写体に相当するマスクの評価値Ｅｖは被写体の陰に相当するマスクの評価値Ｅｖよりも高い。したがって、これら領域に相当するマスクが最終候補のマスクとなる場合には、被写体に相当するマスクを残し、被写体の陰に相当するマスクを除外する。ここで、Ｙ成分の正の差分画像及びＹ成分の負の差分画像を元に抽出されたマスクがそれぞれ最終候補のマスクにあるか否かの判定を第３の抽出判定とする。 Next, the mask extraction unit 48 includes a mask extracted based on a positive difference image of the Y component and a mask extracted based on a negative difference image of the Y component among the masks as final candidates. It is determined whether or not. In this determination, when there is a mask extracted based on the positive difference image of the Y component and a mask extracted based on the negative difference image of the Y component, the mask extraction unit 48 selects the higher rank of these masks. And the lower mask is excluded from the subject candidate masks. When a high brightness (highlight) area and a low brightness (shadow) area are mixed in the image, specifically, when shooting in fine weather, the brightness of the subject is high. The shade brightness of is low. Further, the evaluation value Ev of the mask corresponding to these is higher than the evaluation value Ev of the mask corresponding to the shadow of the subject. Therefore, when the masks corresponding to these regions are the final candidate masks, the mask corresponding to the subject is left and the mask corresponding to the shadow of the subject is excluded. Here, the determination as to whether or not the mask extracted based on the positive difference image of the Y component and the negative difference image of the Y component is in the final candidate mask is a third extraction determination.

上述した第１から第３の抽出判定の後、マスク抽出部４８は、最終候補のマスクの中に、包含関係にあるマスクがあるか否かを判定する。包含関係にあるマスクがあれば、マスク抽出部４８は、包含関係にあるマスクを統合する処理を行う。包含関係にあるマスクがあるか否かの判定を第４の抽出判定とする。 After the first to third extraction determinations described above, the mask extraction unit 48 determines whether there is a mask in the inclusion relationship among the final candidate masks. If there is a mask having an inclusion relationship, the mask extraction unit 48 performs processing for integrating the masks having the inclusion relationship. The determination as to whether there is a mask in an inclusive relationship is the fourth extraction determination.

上述したように、二値化処理部４５は、Ｙ成分、Ｃｂ成分及びＣｒ成分の正の差分画像のそれぞれに対して、異なる２つの閾値を用いた二値化処理を行っている。図３は、閾値Ｙ成分の正の差分画像の一例を示す。図３に示すＹ成分の正の差分画像Ｐ１において、図３に示す領域Ａ１は、含まれる画素の画素値が閾値Ｋ_１σよりも高い領域を示す。閾値Ｋ_１σを用いてＹ成分の正の差分画像Ｐ１に対する二値化処理を行うと、領域Ａ１は白画素の領域、つまりマスクとなる。また、図３に示す領域Ａ２は、領域Ａ１に含まれる領域で、かつ領域Ａ１に含まれる他の画素の画素値よりも高い画素値を有する画素の領域である。閾値Ｋ_２σを用いてＹ成分の正の差分画像Ｐ１に対する二値化処理を行うと、領域Ａ２が白画素の領域、つまりマスクとなる。ここで、領域Ａ２は、閾値Ｋ_１σを用いてＹ成分の正の差分画像Ｐ１に対する二値化処理においてもマスクとして抽出される領域である。したがって、領域Ａ１と領域Ａ２とは包含関係にあると言える。つまり、包含関係にあるとは、同一の差分画像を用いた異なる閾値を用いた二値化処理を実行したときに、一方の二値化画像から抽出されるマスクが他方の二値化画像から抽出されるマスクに含まれることを指している。このような包含関係にあるマスクが最終候補のマスクの中にある場合、マスク抽出部４８は、これらマスクを統合する。ここで、マスクを統合するとは、包含関係にあるマスクのうち、一方のマスクが含まれる他方のマスクを保持し、該一方のマスクを除外することである。図３においては、マスク抽出部４８は、領域Ａ１及び領域Ａ２に示すマスクのうち、領域Ａ２に示すマスクを除外し、領域Ａ１に示すマスクを保持する。なお、Ｙ成分の正の差分画像から生成されるマスクについて説明しているが、Ｃｂ成分及びＣｒ成分の正の差分画像から生成されるマスクについても同様である。 As described above, the binarization processing unit 45 performs binarization processing using two different thresholds for each of the positive difference images of the Y component, the Cb component, and the Cr component. FIG. 3 shows an example of a positive difference image of the threshold Y component. In the positive difference image P1 of the Y component shown in FIG. 3, a region A1 shown in FIG. 3 shows a region where the pixel value of the included pixel is higher than the threshold value K ₁ σ. When the binarization process is performed on the positive difference image P1 of the Y component using the threshold value K ₁ σ, the area A1 becomes a white pixel area, that is, a mask. In addition, a region A2 illustrated in FIG. 3 is a region of pixels that are included in the region A1 and have pixel values higher than the pixel values of other pixels included in the region A1. When the binarization process is performed on the positive difference image P1 of the Y component using the threshold value K ₂ σ, the area A2 becomes a white pixel area, that is, a mask. 
Here, the region A2 is also extracted as a mask when the positive difference image P1 of the Y component is binarized using the threshold K₁σ. Therefore, the region A1 and the region A2 are in an inclusion relationship. In other words, an inclusion relationship means that, when the same difference image is binarized with different thresholds, a mask extracted from one binarized image is contained in a mask extracted from the other binarized image. When masks in such an inclusion relationship are among the final candidate masks, the mask extraction unit 48 integrates them. Here, integrating masks means that, of the masks in an inclusion relationship, the containing mask is retained and the contained mask is excluded. In FIG. 3, the mask extraction unit 48 excludes the mask of the region A2 and retains the mask of the region A1. Although the masks generated from the positive difference image of the Y component are described here, the same applies to masks generated from the positive difference images of the Cb component and the Cr component.
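The integration of masks in an inclusion relationship can be sketched as below. The function name is hypothetical, masks are assumed to be sets of (y, x) coordinates, and the input list is assumed to contain only final candidate masks derived from the same difference image.

```python
def integrate_contained_masks(masks):
    """Keep the containing masks and drop every mask contained in another.

    `m < other` on Python sets tests whether m is a proper subset of other.
    """
    return [m for m in masks if not any(m < other for other in masks)]
```

With the regions of FIG. 3, the mask for region A2 would be dropped because it is a proper subset of the mask for region A1.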

マスク抽出部４８は、上述した第１から第４の抽出判定を行った後、保持されるマスクが３個を超過する場合には、保持されるマスクのうち、上位３位のマスクを被写体に相当するマスクとして抽出する。一方、保持されるマスクが３個以下の場合には、保持される全てのマスクを被写体に相当するマスクとして抽出する。マスク抽出部４８は、スルー画像に対して、抽出したマスクに相当する領域を被写体の領域として特定する。そして、マスク抽出部４８は、特定した被写体の領域を示す被写体領域データとして出力する。一方、上述した複数の判定で、全てのマスクが除外される場合、言い換えれば、保持されるマスクがない場合には、マスク抽出部４８は、被写体に相当するマスクがないとする。 After performing the first to fourth extraction determinations described above, if more than three masks are retained, the mask extraction unit 48 extracts the top three of the retained masks as the masks corresponding to the subject. If three or fewer masks are retained, all of the retained masks are extracted as masks corresponding to the subject. The mask extraction unit 48 identifies the region of the through image corresponding to each extracted mask as a subject region, and outputs subject region data indicating the identified subject region. If all masks are excluded by the determinations described above, in other words, if no mask is retained, the mask extraction unit 48 determines that there is no mask corresponding to the subject.
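The final selection can be sketched as follows, assuming each retained mask is represented by a hypothetical `(mask_id, ev)` pair, where `ev` is its evaluation value Ev.

```python
def extract_subject_masks(scored_masks, limit=3):
    """Return the ids of up to `limit` masks with the highest Ev.

    An empty result means that no mask corresponds to the subject.
    """
    ranked = sorted(scored_masks, key=lambda item: item[1], reverse=True)
    return [mask_id for mask_id, _ in ranked[:limit]]
```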

The imaging apparatus 10 of the first embodiment has, as operation modes, a still image shooting mode for shooting still images, a moving image shooting mode for shooting moving images, and a playback mode for playing back acquired still images and moving images on the display unit 23. The flow of processing in the still image shooting mode is described below with reference to the flowchart of FIG. 4.

Step S101 is processing for acquiring a through image. The control unit 19 drives the imaging unit 17 to capture a through image. The through image data acquired in this way is input from the imaging unit 17 to the control unit 19, which performs image processing on it. The control unit 19 can also output the processed through image data to the display unit 23 and have the display unit 23 display the acquired through image.

Step S102 is subject detection processing. The control unit 19 executes subject detection processing using the through image acquired in step S101. As a result, the region of the subject within the imaging range of the imaging unit 17 is detected.

Step S103 is AF control. The control unit 19 executes AF control with the subject region detected in step S102 set as the focus detection area. If no subject region is obtained by the subject detection processing in step S102, the control unit 19 executes AF control with the central region of the imaging range set as the focus detection area. An example of the central region of the imaging range is the region set for landscape photography, which has 60% of the area of the imaging range. When executing step S103, the control unit 19 may also perform an AE calculation based on the set focus detection area.
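The choice of focus detection area in step S103 can be sketched as follows. The text states only that the landscape region has 60% of the area of the imaging range; the assumptions here are that this region is a centered rectangle with the same aspect ratio as the frame, and that areas are given as `(left, top, width, height)` tuples. Both function names are hypothetical.

```python
import math


def central_area(width, height, area_fraction=0.6):
    """Centered rectangle covering `area_fraction` of the frame area.

    Assumes the rectangle keeps the frame's aspect ratio, so each side is
    scaled by sqrt(area_fraction).
    """
    scale = math.sqrt(area_fraction)
    w, h = width * scale, height * scale
    return ((width - w) / 2, (height - h) / 2, w, h)


def focus_detection_area(subject_area, width, height):
    """Use the detected subject area if there is one, else the central area."""
    if subject_area is not None:
        return subject_area
    return central_area(width, height)


left, top, w, h = focus_detection_area(None, 640, 480)
print(round(w * h / (640 * 480), 3))  # fraction of the frame that is covered
```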

Step S104 is processing for determining whether there is a shooting instruction. When an instruction to shoot a still image for recording is given by operating the operation unit 24, the control unit 19 sets the result of the determination in step S104 to Yes, and the process proceeds to step S105. When no such instruction is given by operating the operation unit 24, the control unit 19 sets the result of the determination in step S104 to No, and the process proceeds to step S106.

Step S105 is imaging processing. The control unit 19 drives the imaging unit 17 to capture a still image for recording, applies predetermined image processing to the still image data, and writes the processed data to the storage medium 35 via the media I/F 22. When this processing ends, the process proceeds to step S106.

Step S106 is processing for determining whether to end shooting. When an instruction to end shooting of still images for recording is given by operating the operation unit 24, the control unit 19 sets the result of the determination in step S106 to Yes, and the processing of the flowchart in FIG. 4 ends.

When no instruction to end shooting of still images for recording is given by operating the operation unit 24, the control unit 19 sets the result of the determination in step S106 to No, and the process returns to step S101. Thus, while the determination in step S106 is No, steps S101 to S104 are executed repeatedly, and the imaging processing of step S105 is executed as needed during the repetition.
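The control flow of FIG. 4 (steps S101 to S106) can be sketched as a simple loop. `FakeCamera` and all of its method names are hypothetical stand-ins introduced only so that the loop can run; they do not correspond to any interface described in this document.

```python
class FakeCamera:
    """Hypothetical stand-in that only logs which step was executed."""

    def __init__(self, shoot_flags):
        self.shoot_flags = list(shoot_flags)  # one shooting decision per loop pass
        self.log = []

    def acquire_through_image(self):
        self.log.append("S101")
        return "through-image"

    def detect_subject(self, image):
        self.log.append("S102")
        return None  # pretend no subject region was found

    def autofocus(self, subject_area):
        self.log.append("S103")

    def shoot_requested(self):
        self.log.append("S104")
        return self.shoot_flags.pop(0)

    def capture_and_record(self):
        self.log.append("S105")

    def end_requested(self):
        self.log.append("S106")
        return not self.shoot_flags  # end once every decision has been used


def still_image_mode(camera):
    """Repeat S101 to S104, run S105 on request, and stop when S106 is Yes."""
    while True:
        image = camera.acquire_through_image()   # S101
        subject = camera.detect_subject(image)   # S102
        camera.autofocus(subject)                # S103
        if camera.shoot_requested():             # S104
            camera.capture_and_record()          # S105
        if camera.end_requested():               # S106
            break


camera = FakeCamera([False, True])
still_image_mode(camera)
print(camera.log)
```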

Next, the subject detection processing shown in step S102 of the flowchart of FIG. 4 is described with reference to the flowchart of FIG. 5.

Step S201 is color space conversion processing. The control unit 19 executes color space conversion processing on the through image that has undergone image processing. As a result, the through image represented in the RGB color space is converted into a through image represented in the YCbCr color space.
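The conversion in step S201 can be illustrated for a single pixel as below. The text does not state which conversion matrix is used; this sketch assumes the full-range ITU-R BT.601 coefficients used by JFIF, with 8-bit values and a chroma offset of 128. The function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (assumed full-range BT.601)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# A neutral gray pixel has no color difference, so Cb and Cr stay at the offset.
print([round(v, 3) for v in rgb_to_ycbcr(200, 200, 200)])
```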

Step S202 is resolution conversion processing. The control unit 19 executes resolution conversion processing on the through image that has undergone the color space conversion processing of step S201. As a result, the through image is converted to a resolution lower than its original resolution. That is, a through image with a reduced image size is generated.
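One simple way to reduce the resolution of a single component is block averaging, sketched below. The text does not specify the reduction method, so block averaging, the integer `factor`, and the function name `downscale` are all assumptions made for illustration.

```python
def downscale(channel, factor):
    """Shrink a 2D list of pixel values by averaging factor x factor blocks.

    Rows and columns that do not fill a complete block are dropped.
    """
    out_h = len(channel) // factor
    out_w = len(channel[0]) // factor
    result = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            block = [
                channel[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


print(downscale([[1, 3, 10, 10], [5, 7, 10, 10]], 2))
```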

Step S203 is processing for generating difference images. Using the through image that has undergone the resolution conversion processing of step S202, the control unit 19 generates a reference density image for each of the Y, Cb, and Cr components. Then, using the resolution-converted through image and the reference density images, the control unit 19 generates a positive difference image and a negative difference image for each of the Y, Cb, and Cr components.

Step S204 is processing for narrowing down the difference images. The control unit 19 generates a histogram from each of the six difference images (the positive and negative difference images of each component) generated in step S203. Using the generated histograms, the control unit 19 obtains the standard deviation σ and the peak-to-peak value. Using these values, the control unit 19 narrows down the difference images used for identifying the subject. The processing of step S204 is described later.

Step S205 is binarization processing. The control unit 19 performs binarization processing on each of the difference images narrowed down in step S204. After generating the binarized images, the control unit 19 performs labeling processing on them. Through this labeling processing, the control unit 19 extracts masks from the binarized images.
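The labeling processing that turns a binarized image into masks can be sketched as a connected-component search. The text does not state the connectivity rule; 4-connectivity is assumed here, each mask is represented as a set of (row, column) pixels, and the function name `label_masks` is hypothetical.

```python
from collections import deque


def label_masks(binary):
    """Return the connected regions of nonzero pixels in a 2D list.

    Uses breadth-first search with 4-connectivity (an assumption).
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    masks = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            pixels = set()
            while queue:
                cy, cx = queue.popleft()
                pixels.add((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            masks.append(pixels)
    return masks


image = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
print(len(label_masks(image)))  # number of masks found
```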

Step S206 is processing for narrowing down the masks. From the masks extracted from the binarized images, the control unit 19 narrows down the subject candidate masks. The processing of step S206 is described later.

Step S207 is processing for calculating evaluation values. For each binarized image whose masks have been narrowed down in step S206, the control unit 19 sets a reference point P for calculating the moment of inertia. Then, for each binarized image, the control unit 19 obtains the moment of inertia of each mask about the reference point P.

The control unit 19 then calculates the evaluation value of each mask by equation (3) described above, using the moment of inertia, the area, and the average intensity of the mask.

Step S208 is processing for extracting masks corresponding to the subject. Based on the evaluation value Ev of each mask obtained in step S207, the control unit 19 ranks the subject candidate masks. From the ranked subject candidate masks, the control unit 19 selects the five highest-ranked masks as final candidate masks. It then performs the first to fourth extraction determinations described above on the final candidate masks, and extracts the masks retained after these determinations as masks corresponding to the subject. Details of step S208 are described later. When step S208 ends, the subject detection processing shown in FIG. 5 ends.

When a mask has been extracted by the subject detection processing described above, the control unit 19 sets the region corresponding to the extracted mask as the subject region. As a result, when the through image is displayed on the display unit 23, a rectangular frame containing the subject region (subject frame) is superimposed on the through image. Thus, when a mask has been extracted in step S208, the photographer can recognize the position and size of the subject in the imaging range from the subject frame superimposed on the through image.

Next, the processing for narrowing down the difference images shown in step S204 of FIG. 5 is described with reference to the flowchart of FIG. 6.

Step S301 is processing for generating histograms. By the processing of step S203 in FIG. 5 described above, the control unit 19 has generated a total of six difference images: a positive and a negative difference image for each of the Y, Cb, and Cr components. The control unit 19 generates a histogram for each of these six difference images.

Step S302 is processing for calculating the standard deviation σ and the peak-to-peak value pp. For each histogram of the difference images generated in step S301, the control unit 19 calculates the standard deviation σ and the peak-to-peak value pp. This processing yields indices of the spread of the pixel value distribution in each histogram.

Step S303 is processing for comparing the standard deviation σ with the threshold Th1. The control unit 19 compares the standard deviation σ of each histogram obtained in step S302 with the threshold Th1.

Step S304 is processing for comparing the peak-to-peak value pp with the threshold Th2. The control unit 19 compares the peak-to-peak value pp of each histogram obtained in step S302 with the threshold Th2.

Step S305 is processing for determining whether there is a difference image to be excluded. Using the results of the determinations in steps S303 and S304, the control unit 19 determines whether any of the six generated difference images should be excluded. If, for example, any of the six difference images satisfies standard deviation σ < threshold Th1 and peak-to-peak value pp < threshold Th2, the control unit 19 treats that difference image as one to be excluded. In this case, the control unit 19 sets the result of the determination in step S305 to Yes, and the process proceeds to step S306. If none of the six difference images satisfies σ < Th1 and pp < Th2, the control unit 19 sets the result of the determination in step S305 to No; that is, it is determined that there is no difference image to exclude. In that case, the control unit 19 decides to use all six difference images as they are and ends the processing of the flowchart shown in FIG. 6.

Step S306 is processing for excluding difference images. Because it has been determined in step S305 that there is a difference image to be excluded, the control unit 19 excludes that difference image. The difference images are narrowed down by this processing. After executing step S306, the control unit 19 ends the processing of the flowchart shown in FIG. 6.
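The narrowing of FIG. 6 (steps S301 to S306) can be sketched as follows. The sketch computes the statistics directly from the pixel values rather than from an explicit histogram, and it assumes that the peak-to-peak value is the difference between the largest and smallest pixel value. The function name, the dictionary keys, and the threshold values are hypothetical.

```python
import statistics


def narrow_difference_images(images, th1, th2):
    """Drop difference images whose pixel values are nearly flat.

    `images` maps a label to a flat list of pixel values. An image is
    excluded when sigma < th1 AND peak-to-peak < th2 (S303 to S306).
    """
    kept = {}
    for label, pixels in images.items():
        sigma = statistics.pstdev(pixels)   # S302: standard deviation
        pp = max(pixels) - min(pixels)      # S302: assumed peak-to-peak value
        if sigma < th1 and pp < th2:
            continue                        # S306: exclude this image
        kept[label] = pixels
    return kept


images = {
    "Y+": [0, 0, 1, 0],     # almost flat, so no subject is expected
    "Cb-": [0, 0, 50, 0],   # one strong response
}
print(list(narrow_difference_images(images, th1=5, th2=10)))
```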

FIG. 7 shows an example of a through image obtained when a bird is within the imaging range, and FIG. 8 shows examples of the positive and negative difference images of the Y, Cb, and Cr components obtained from the through image shown in FIG. 7. FIG. 8(a) is the positive difference image of the Y component, and FIG. 8(b) is the negative difference image of the Y component. FIG. 8(c) is the positive difference image of the Cb component, and FIG. 8(d) is the negative difference image of the Cb component. FIG. 8(e) is the positive difference image of the Cr component, and FIG. 8(f) is the negative difference image of the Cr component.

When steps S301 and S302 described above are executed, the control unit 19 generates a histogram for each of the six generated difference images and calculates the standard deviation σ and the peak-to-peak value pp for each histogram. Here, for the histogram based on the difference image shown in FIG. 8(a), standard deviation σ < threshold Th1 and peak-to-peak value pp < threshold Th2. The control unit 19 therefore determines that the difference image of FIG. 8(a) contains no object corresponding to the subject, and excludes it from the difference images used for identifying the subject. The difference images used for identifying the subject are thus narrowed down to the five difference images shown in FIGS. 8(b) to 8(f).

When step S205 is then executed, the control unit 19 performs binarization processing on each of the five difference images of FIGS. 8(b) to 8(f). This binarization processing generates a total of seven binarized images, shown in FIGS. 9(c) to 9(i). FIG. 9 shows the case where the thresholds for binarizing the positive difference image of each component are set to 2σ and 3σ, and the threshold for binarizing the negative difference image of each component is set to 2σ.
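The count of seven binarized images follows from applying two thresholds (2σ and 3σ) to each remaining positive difference image and one threshold (2σ) to each negative one. The sketch below reproduces that count. It assumes that a pixel is set to 1 when its value exceeds k·σ, where σ is the standard deviation of that difference image; the key layout `(component, sign)` and all names are hypothetical.

```python
import statistics

POSITIVE_FACTORS = (2, 3)  # thresholds 2σ and 3σ for positive difference images
NEGATIVE_FACTORS = (2,)    # threshold 2σ for negative difference images


def binarize_all(diff_images):
    """Binarize every difference image with its k·σ thresholds.

    `diff_images` maps (component, sign) to a 2D list of pixel values.
    Returns a dict keyed by (component, sign, k).
    """
    result = {}
    for (component, sign), image in diff_images.items():
        flat = [value for row in image for value in row]
        sigma = statistics.pstdev(flat)
        factors = POSITIVE_FACTORS if sign == "+" else NEGATIVE_FACTORS
        for k in factors:
            result[(component, sign, k)] = [
                [1 if value > k * sigma else 0 for value in row] for row in image
            ]
    return result


sample = [[0, 0], [0, 10]]
# The positive Y difference image is omitted, as in the example of FIG. 8(a).
remaining = {key: sample for key in
             [("Y", "-"), ("Cb", "+"), ("Cb", "-"), ("Cr", "+"), ("Cr", "-")]}
print(len(binarize_all(remaining)))  # number of binarized images
```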

In FIG. 9, the two binarized images obtained from the positive difference image of the Y component are included as FIGS. 9(a) and 9(b) for convenience, but when the difference image of FIG. 8(a) is excluded, the binarized images shown in FIGS. 9(a) and 9(b) are not generated. The gray and white regions in the binarized images of FIGS. 9(c) to 9(i) indicate the masks extracted from each binarized image.

When step S206 is executed, the control unit 19 narrows down the masks in the binarized images of FIGS. 9(c) to 9(i). It then calculates the evaluation value Ev using the narrowed-down masks and extracts masks corresponding to the subject. FIG. 9 shows a case in which no mask corresponding to the subject is extracted. When there is no mask corresponding to the subject, the frame 51 used for landscape photography is superimposed on the through image P2 and displayed on the display unit 23.

In the processing for narrowing down the difference images described above, a through image obtained in landscape photography has pixel values that are close to one another; that is, the contrast within the image is low. As a result, for one or more of the histograms based on the difference images, the standard deviation σ and the peak-to-peak value pp are small. In contrast, in a through image obtained when photographing a person, the contrast within the background region is low, but the contrast between the background and the person region is high. As a result, the standard deviation σ and the peak-to-peak value pp of the histograms based on the difference images are large. Therefore, comparing the standard deviation σ with the threshold Th1 and the peak-to-peak value pp with the threshold Th2 makes it possible to exclude difference images in which no subject is presumed to exist. This reduces the load of the processing that follows and shortens the processing time. In addition, excluding such difference images before the binarization processing improves the accuracy of extracting masks corresponding to the subject, so that a specific region (for example, a subject region or a region of interest) can be extracted accurately.

The processing for narrowing down the difference images described above is particularly useful when shooting in close-up mode (macro mode). The processing may therefore be executed in conjunction with the close-up mode (macro mode) setting, or in conjunction with AF information, AE information, subject analysis information, and the like.

Next, the processing for narrowing down the masks in step S206 of the flowchart of FIG. 5 is described with reference to the flowchart of FIG. 10.

Step S401 is processing for excluding masks recognized as noise. After obtaining the area of each mask extracted from a binarized image, the control unit 19 obtains the ratio of the mask area to the area of the binarized image. The control unit 19 then determines whether any mask has an area ratio of 1% or less. If there is such a mask, the control unit 19 treats it as noise and excludes it from the subject candidate masks.

Step S402 is processing for excluding masks recognized as background. The control unit 19 has already obtained, in step S401, the ratio of the mask area to the area of the binarized image for each mask. Referring to this area ratio, the control unit 19 determines whether any mask has an area ratio of 60% or more. If there is such a mask, the control unit 19 treats it as background and excludes it from the subject candidate masks.

Step S403 is processing for excluding masks with a low filling rate. The control unit 19 determines whether the ratio of the mask area to the area of the rectangular region containing the mask is equal to or less than a predetermined threshold (for example, 0.2). If there is a mask for which this ratio is equal to or less than the threshold, the control unit 19 excludes that mask from the subject candidate masks.

Step S404 is processing for counting the number of masks in a binarized image. The control unit 19 counts the number of masks extracted from the binarized image.

Step S405 is processing for determining whether the number of masks is equal to or greater than the threshold Th3. When the counted number of masks is equal to or greater than the threshold Th3, the control unit 19 sets the result of the determination in step S405 to Yes, and the process proceeds to step S406. When the counted number of masks is less than the threshold Th3, the process proceeds to step S409.

Step S406 is processing for calculating the average intensity of the masks as a whole. From the difference image on which the binarized image is based, the control unit 19 reads the pixel values of the pixels belonging to each mask and calculates the average intensity of the masks as a whole.

Step S407 is processing for determining whether the average intensity of the masks as a whole is equal to or less than the threshold Th4. When the average intensity obtained in step S406 is equal to or less than the threshold Th4, the control unit 19 sets the result of the determination in step S407 to Yes, and the process proceeds to step S408. When the average intensity obtained in step S406 exceeds the threshold Th4, the control unit 19 sets the result of the determination in step S407 to No, and the process proceeds to step S409.

Step S408 is processing for excluding the masks concerned. When the determination in step S407 is Yes, the control unit 19 judges that the masks extracted from the binarized image do not include a mask corresponding to the subject, and excludes all masks extracted from this binarized image from the subject candidate masks.

Step S409 is processing for determining whether all binarized images have been processed. When steps S404 to S408 have been performed on all binarized images, the control unit 19 sets the result of the determination in step S409 to Yes, and the processing of the flowchart of FIG. 10 ends. When steps S404 to S408 have not yet been performed on all binarized images, the control unit 19 sets the result of the determination in step S409 to No, and the process returns to step S404. In this way, steps S404 to S408 are executed for all binarized images.
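The narrowing of FIG. 10 for a single binarized image can be sketched as follows. Several points are assumptions: masks are sets of (row, column) pixels, the rectangular region is the axis-aligned bounding box, the count in step S404 is taken over the masks that survive steps S401 to S403, and the average intensity is taken over all pixels of those masks. The function name and the threshold values in the example are hypothetical.

```python
def narrow_masks(masks, image_area, intensity, th3, th4):
    """Apply the checks of S401 to S408 to the masks of one binarized image.

    `intensity` maps a (row, col) pixel to its difference-image value.
    """
    kept = []
    for mask in masks:
        ratio = len(mask) / image_area
        if ratio <= 0.01 or ratio >= 0.60:       # S401 noise, S402 background
            continue
        rows = [p[0] for p in mask]
        cols = [p[1] for p in mask]
        box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
        if len(mask) / box_area <= 0.2:          # S403 low filling rate
            continue
        kept.append(mask)
    if kept and len(kept) >= th3:                # S404, S405
        pixels = [p for mask in kept for p in mask]
        average = sum(intensity[p] for p in pixels) / len(pixels)  # S406
        if average <= th4:                       # S407
            return []                            # S408: exclude every mask
    return kept


block = {(0, 0), (0, 1), (1, 0), (1, 1)}        # compact 2 x 2 mask
speck = {(9, 9)}                                 # 1% of a 10 x 10 image
diagonal = {(i, i) for i in range(3, 9)}         # thin line, filling rate 6/36
values = {p: 100 for p in block | speck | diagonal}
print(narrow_masks([block, speck, diagonal], 100, values, th3=5, th4=50) == [block])
```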

FIG. 11 shows a through image P3 obtained when a giraffe is within the imaging range. In the binarized images of FIGS. 12(a) to 12(i), the regions shown in gray and white are extracted as masks. FIG. 12 shows the case where the thresholds for binarizing the positive difference image of each component are set to 2σ and 3σ, and the threshold for binarizing the negative difference image of each component is set to 2σ.

From the masks extracted from these binarized images, the control unit 19 narrows down the subject candidate masks. By performing steps S402 and S403, for example, all the masks of FIGS. 12(b), 12(d), and 12(e) are excluded from the subject candidate masks.

Here, a large number of masks are extracted from the binarized images of FIGS. 12(c), 12(f), and 12(i). For these binarized images, the result of the determination in step S405 is therefore Yes. The result of the determination in step S407 on the average intensity of the masks as a whole extracted from each of these binarized images is also Yes. Consequently, all masks extracted from each of the binarized images of FIGS. 12(c), 12(f), and 12(i) are excluded. After the mask narrowing processing, the mask shown as the white region in FIG. 12(a), the two masks shown as white regions in FIG. 12(g), and the mask shown as the white region in FIG. 12(h) are retained as subject candidate masks. The control unit 19 obtains the evaluation value Ev for these subject candidate masks and uses it to rank them. After this ranking, the first to fourth extraction determinations are performed, and, for example, the two masks shown as white regions in FIG. 12(g) are extracted as masks corresponding to the subject. Accordingly, when the through image of FIG. 11 is displayed on the display unit 23, the subject frames 52 and 53 are superimposed on it.

In this way, when a large number of masks are extracted from a binarized image, evaluating the average intensity of the masks as a whole makes it possible to distinguish whether the region from which many islands are extracted arises from the texture of the background or from the texture of the subject itself. If it is background texture, the average intensity of the masks as a whole is low, so the extracted masks can be judged to be regions corresponding to the background, with no region corresponding to the subject. If it is the texture of the subject itself, the average intensity of the masks as a whole is high, so the extracted masks can be judged to include a mask corresponding to the subject. Making this determination allows unnecessary masks to be excluded from the subject candidate masks, so that the subject candidate masks are narrowed down appropriately.

Next, the flow of the process of step S208 in the flowchart of FIG. 5, which extracts the masks corresponding to the subject, will be described with reference to the flowchart of FIG. 13.

Step S501 is a process of acquiring the final candidate masks. The control unit 19 ranks the subject candidate masks using the evaluation values Ev obtained for them in step S207. The control unit 19 then selects the top five of the ranked masks as the final candidate masks.
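As a minimal sketch of step S501, the ranking and selection may be written in Python as follows. The `Mask` class and its field names are hypothetical, and the sketch assumes that a larger evaluation value Ev means a higher rank, which this passage does not state.

```python
from dataclasses import dataclass


@dataclass
class Mask:
    name: str   # hypothetical label, e.g. the figure panel the mask comes from
    ev: float   # evaluation value Ev obtained in step S207


def select_final_candidates(candidates, limit=5):
    """Rank subject candidate masks by Ev and keep the top `limit` (step S501)."""
    # Assumption: a larger Ev ranks higher.
    ranked = sorted(candidates, key=lambda m: m.ev, reverse=True)
    return ranked[:limit]
```

When fewer than five subject candidates exist, the slice simply returns all of them.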

Step S502 is a process of determining whether there is a mask extracted from a difference image of the Cr component or a difference image of the Cb component. If the final candidate masks include a mask extracted from a difference image of the Cr component or the Cb component, the control unit 19 sets the result of the determination process of step S502 to Yes, and the process proceeds to step S503. If, on the other hand, none of the top five masks was extracted from a difference image of the Cr component or the Cb component, the control unit 19 sets the result of the determination process of step S502 to No, and the process proceeds to step S505.

Step S503 is a process of determining whether there is a mask whose average intensity is equal to or less than a threshold Th5. The control unit 19 obtains, for each mask extracted from a difference image of the Cr component or the Cb component, the average intensity of that mask, and compares it with the threshold Th5. If the average intensity of any of these masks is equal to or less than the threshold Th5, the control unit 19 sets the result of the determination process of step S503 to Yes, and the process proceeds to step S504. If the average intensities of all of these masks exceed the threshold Th5, the control unit 19 sets the result of the determination process of step S503 to No, and the process proceeds to step S505. The processes of steps S502 and S503 constitute the first extraction determination.

Step S504 is a process of excluding the masks in question. In the determination process of step S503, the control unit 19 has determined that there is a mask whose average intensity is equal to or less than the threshold Th5. The control unit 19 therefore regards each mask whose average intensity is equal to or less than the threshold Th5 as a mask obtained from the background, and excludes it from the final candidate masks.
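Steps S502 to S504 may be sketched in Python as follows, purely as an illustration. The `Mask` class, its field names, and the function name are hypothetical; only the threshold Th5 and the comparison "equal to or less than" come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Mask:
    name: str              # hypothetical label
    component: str         # "Y", "Cb", or "Cr": component of the source difference image
    mean_intensity: float  # average intensity of the mask


def first_extraction(final_candidates, th5):
    """First extraction determination (steps S502 to S504)."""
    chroma = ("Cb", "Cr")
    # S502: is any final candidate derived from a Cb or Cr difference image?
    if not any(m.component in chroma for m in final_candidates):
        return list(final_candidates)
    # S503/S504: chroma masks with average intensity <= Th5 are treated as
    # background and excluded; Y-component masks are not examined here.
    return [m for m in final_candidates
            if not (m.component in chroma and m.mean_intensity <= th5)]
```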

Step S505 is a process of determining whether a mask extracted from a difference image of the Y component is ranked below a mask extracted from a difference image of the Cb component or the Cr component. If no mask has been excluded by the process of step S504, the target masks are all of the final candidate masks. If a mask has been excluded by the process of step S504, the target masks are the final candidate masks other than the excluded masks.

The control unit 19 reads, from the first memory 20, information indicating the processing history of the subject detection process. From the read information, the control unit 19 identifies whether each target mask was extracted from a difference image of the Cb component, the Cr component, or the Y component. The control unit 19 then determines whether, among the target masks, a mask extracted from a difference image of the Y component is ranked below a mask extracted from a difference image of the Cb component or the Cr component. If there is such a mask, the control unit 19 sets the result of the determination process of step S505 to Yes, and the process proceeds to step S506. If there is no such mask, the control unit 19 sets the result of the determination process of step S505 to No, and the process proceeds to step S508.

Step S506 is a process of determining whether there is a mask whose average intensity is equal to or less than a threshold Th6. The control unit 19 obtains the average intensity of each mask that was extracted from a difference image of the Y component and is ranked below a mask extracted from a difference image of the Cb component or the Cr component. The control unit 19 then compares the obtained average intensity with the threshold Th6. If any of these masks has an average intensity equal to or less than the threshold Th6, the control unit 19 sets the result of the determination process of step S506 to Yes, and the process proceeds to step S507. If the average intensities of all of these masks exceed the threshold Th6, the control unit 19 sets the result of the determination process of step S506 to No, and the process proceeds to step S508. The processes of steps S505 and S506 constitute the second extraction determination.

Step S507 is a process of excluding the masks in question. The control unit 19 excludes each mask whose average intensity is equal to or less than the threshold Th6 from the final candidate masks.
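The second extraction determination and the exclusion of step S507 may be sketched in Python as follows. The `Mask` class and the function name are hypothetical, and the sketch assumes that the input list is already ordered by rank (highest first) and that "ranked below" means ranked below at least one Cb or Cr mask.

```python
from dataclasses import dataclass


@dataclass
class Mask:
    name: str              # hypothetical label
    component: str         # "Y", "Cb", or "Cr"
    mean_intensity: float  # average intensity of the mask


def second_extraction(ranked, th6):
    """Second extraction determination (steps S505 to S507).

    ranked: target masks ordered from highest to lowest rank.
    """
    kept = []
    chroma_seen = False  # becomes True once a Cb or Cr mask has been passed
    for m in ranked:
        if m.component in ("Cb", "Cr"):
            chroma_seen = True
            kept.append(m)
        elif chroma_seen and m.mean_intensity <= th6:
            # Y mask ranked below a chroma mask with low average intensity: exclude.
            continue
        else:
            kept.append(m)
    return kept
```

A Y-component mask that outranks every chroma mask is never examined, which matches the condition of step S505.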

Step S508 is a process of determining whether there are both a mask extracted from the positive difference image of the Y component and a mask extracted from the negative difference image of the Y component. The target masks are the final candidate masks other than those excluded by the process of step S504 or step S507. If neither step S504 nor step S507 has been performed, all of the final candidate masks are the target masks.

If the target masks include both a mask extracted from the positive difference image of the Y component and a mask extracted from the negative difference image of the Y component, the control unit 19 sets the result of the determination process of step S508 to Yes, and the process proceeds to step S509. If the target masks include only one of these two kinds of mask, or neither of them, the control unit 19 sets the result of the determination process of step S508 to No, and the process proceeds to step S510. The process of step S508 constitutes the third extraction determination.

Step S509 is a process of excluding the lower-ranked mask. In step S508, it has been determined that the target masks include both a mask extracted from the positive difference image of the Y component and a mask extracted from the negative difference image of the Y component. Of these masks, the control unit 19 therefore retains the higher-ranked mask as a final candidate mask and excludes the lower-ranked mask from the final candidate masks.
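As a hedged illustration of steps S508 and S509, the following Python sketch keeps the Y-component masks whose polarity matches that of the highest-ranked Y-component mask. That tie-breaking rule for more than two Y-component masks is an assumption of this sketch, as are the `Mask` class and the function name; the description above states only that the higher-ranked mask is retained and the lower-ranked mask is excluded.

```python
from dataclasses import dataclass


@dataclass
class Mask:
    name: str       # hypothetical label
    component: str  # "Y", "Cb", or "Cr"
    sign: str       # "+" for a positive difference image, "-" for a negative one


def third_extraction(ranked):
    """Third extraction determination (steps S508 and S509).

    ranked: target masks ordered from highest to lowest rank.
    """
    y_masks = [m for m in ranked if m.component == "Y"]
    # S508: both polarities of the Y component must be present.
    if {m.sign for m in y_masks} != {"+", "-"}:
        return list(ranked)
    # S509 (assumed rule): the polarity of the highest-ranked Y mask wins.
    winning_sign = y_masks[0].sign
    return [m for m in ranked if m.component != "Y" or m.sign == winning_sign]
```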

Step S510 is a process of determining whether, among the masks generated from the positive difference image of the same component, there are masks in an inclusion relationship. The target masks are the final candidate masks other than those excluded by the process of step S504, step S507, or step S509. If none of steps S504, S507, and S509 has been performed, all of the final candidate masks are the target masks. As described above, two different thresholds are used in the binarization of the positive difference images of the Y component, the Cb component, and the Cr component. An inclusion relationship therefore refers to the case in which, for the positive difference image of the same component, the mask extracted by binarization using one of the thresholds is contained in the mask extracted by binarization using the other threshold. If the target masks include masks in an inclusion relationship, the control unit 19 sets the result of the determination process of step S510 to Yes, and the process proceeds to step S511. If the target masks include no masks in an inclusion relationship, the control unit 19 sets the result of the determination process of step S510 to No, and the process proceeds to step S512.

Step S511 is a process of merging masks that are in an inclusion relationship. For example, when two masks are in an inclusion relationship, the control unit 19 merges them. By executing the process of step S511, the control unit 19 can narrow down the final candidate masks.
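The inclusion test and merge of steps S510 and S511 may be sketched in Python as follows. The sketch represents each mask as a set of pixel coordinates and assumes that merging means keeping the containing mask and dropping the contained one; the `Mask` class, its fields, and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Mask:
    name: str          # hypothetical label
    component: str     # "Y", "Cb", or "Cr"
    positive: bool     # True if extracted from a positive difference image
    pixels: frozenset  # set of (row, col) coordinates belonging to the mask


def merge_contained(masks):
    """Fourth extraction determination (steps S510 and S511)."""
    kept = []
    for m in masks:
        # A mask is absorbed when another mask from the positive difference
        # image of the same component strictly contains it.
        absorbed = any(
            other is not m
            and other.positive and m.positive
            and other.component == m.component
            and m.pixels < other.pixels
            for other in masks
        )
        if not absorbed:
            kept.append(m)
    return kept
```

The strict-subset comparison `<` keeps both masks when two regions are identical, so the sketch never discards every copy of a region.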

Step S512 is a process of determining whether the number of retained masks exceeds three. By performing the processes of steps S502 through S511, the control unit 19 has applied the first to fourth extraction determinations to the final candidate masks. For example, if none of the final candidate masks has been excluded, or if only one has been excluded, more than three masks are retained as final candidates. In that case, the control unit 19 sets the result of the determination process of step S512 to Yes, and the process proceeds to step S513.

On the other hand, if two or more masks have been excluded from the final candidate masks, the number of retained masks is three or fewer. In that case, the control unit 19 sets the result of the determination process of step S512 to No, and the process proceeds to step S514.

Step S513 is a process of extracting the top three masks. Of the masks retained as final candidates, the control unit 19 extracts the three masks ranked highest by evaluation value Ev as the masks corresponding to the subject region. When the process of step S513 has been executed, the control unit 19 ends the processing of the flowchart of FIG. 13.

Step S514 is a process of determining whether no mask has been retained. The first to fourth extraction determinations described above may exclude all of the final candidate masks. In that case, the control unit 19 sets the result of the determination process of step S514 to Yes. That is, the control unit 19 determines that there is no mask corresponding to the subject region and ends the processing of the flowchart of FIG. 13.

On the other hand, if two to four masks have been excluded from the final candidate masks by the first to fourth extraction determinations, the control unit 19 retains between one and three masks. In that case, the control unit 19 sets the result of the determination process of step S514 to No, and the process proceeds to step S515.

Step S515 is a process of extracting the retained masks. The control unit 19 extracts all of the masks retained as final candidate masks as the masks corresponding to the subject region. When the process of step S515 has been executed, the control unit 19 ends the processing of the flowchart of FIG. 13.
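The final selection of steps S512 to S515 reduces to a short rule, sketched below in Python. The function name and the `max_subjects` parameter are hypothetical, and the input list is assumed to be ordered by evaluation value Ev, highest rank first.

```python
def extract_subject_masks(retained, max_subjects=3):
    """Final selection (steps S512 to S515).

    retained: masks still held after the four extraction determinations,
              ordered from highest to lowest rank.
    """
    # S512 Yes -> S513: more than three masks remain, keep only the top three.
    if len(retained) > max_subjects:
        return list(retained[:max_subjects])
    # S514 Yes: nothing remains, so no subject region is reported (empty list).
    # S514 No -> S515: one to three masks remain and all of them are extracted.
    return list(retained)
```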

The following describes the case in which the thresholds used to binarize the positive difference image of each component are set to 2σ and 3σ, and the threshold used to binarize the negative difference image of each component is set to 2σ.
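To make the count of nine binarized images concrete, the threshold settings above may be sketched in Python as follows. This is a sketch under stated assumptions: σ is taken to be the standard deviation of the difference image, the negative difference image is thresholded on the magnitude of its negative values, and the difference image is given as a flat list of signed values. None of these details is stated in this passage, and the function names are hypothetical.

```python
from statistics import pstdev


def binarize(diff, k, positive=True):
    """Binarize a signed difference image at k * sigma (assumed definition of sigma)."""
    threshold = k * pstdev(diff)
    if positive:
        return [1 if v > threshold else 0 for v in diff]
    # Negative difference image: compare the magnitude of negative values.
    return [1 if -v > threshold else 0 for v in diff]


def nine_binarized_images(diff_by_component):
    """Return one binarized image per (component, threshold, polarity) setting."""
    images = {}
    for component, diff in diff_by_component.items():
        images[(component, "2sigma", "+")] = binarize(diff, 2, positive=True)
        images[(component, "3sigma", "+")] = binarize(diff, 3, positive=True)
        images[(component, "2sigma", "-")] = binarize(diff, 2, positive=False)
    return images
```

With the three components Y, Cb, and Cr, the three settings per component give the nine binarized images referred to in the description.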

FIG. 14 shows a through image in which a red flower is within the imaging range. As described above, when the through image P4 is acquired, the control unit 19 generates a total of nine binarized images of the Y component, the Cb component, and the Cr component. FIG. 15(a) to FIG. 15(i) show an example of the binarized images of the Y component, the Cb component, and the Cr component obtained from the through image P4. In FIG. 15, "(positive)" below a threshold indicates a binarized image generated from a positive difference image, and "(negative)" below a threshold indicates a binarized image generated from a negative difference image.

FIG. 15(a) to FIG. 15(i) are binarized images of the Y component, the Cb component, and the Cr component, and the regions shown in gray and white in each binarized image are extracted as masks. When the masks extracted from the binarized images are narrowed down to subject candidates, the masks shown in gray are excluded, and the masks shown as white regions in the binarized images of FIG. 15(a) and FIG. 15(d) to FIG. 15(h) remain as the subject candidate masks. As described above, once the subject candidate masks have been narrowed down, the control unit 19 obtains an evaluation value Ev for each of them. The "×" marks in FIG. 15(a) and FIG. 15(d) to FIG. 15(h) indicate the center of gravity of each mask, which serves as the reference when the moment of inertia of the mask is obtained. After obtaining the evaluation value Ev for each mask, the control unit 19 ranks the masks using the obtained evaluation values Ev. Here, the masks shown as the white regions in FIG. 15(a), FIG. 15(d), FIG. 15(f), FIG. 15(g), and FIG. 15(h) are the final candidate masks.

The control unit 19 performs the first extraction determination on the final candidate masks. In the first extraction determination, if there is a mask extracted from a difference image of the Cb component or the Cr component, it is determined whether the average intensity of that mask is equal to or less than the threshold Th5. In FIG. 15, the masks shown as the white regions in FIG. 15(d) and FIG. 15(f) are masks extracted from difference images of the Cb component, and the masks shown as the white regions in FIG. 15(g) and FIG. 15(h) are masks extracted from difference images of the Cr component. In the first extraction determination, for example, the masks shown as the white regions in FIG. 15(d) and FIG. 15(f) are excluded from the final candidate masks.

Next, the control unit 19 performs the second extraction determination on the target masks. In the second extraction determination, if the target masks include a Y-component mask, it is first determined whether that Y-component mask is ranked below a Cb-component or Cr-component mask. For example, if the mask shown as the white region in FIG. 15(a) is ranked below the masks shown as the white regions in FIG. 15(g) and FIG. 15(h), the control unit 19 determines whether the average intensity of the mask shown as the white region in FIG. 15(a) is equal to or less than the threshold Th6. If it is, the control unit 19 excludes the mask shown as the white region in FIG. 15(a) from the final candidate masks. As a result, the masks shown as the white regions in FIG. 15(g) and FIG. 15(h) are retained as the final candidate masks.

Next, the control unit 19 performs the third extraction determination. At this point, the masks retained as final candidate masks are the masks shown as the white regions in FIG. 15(g) and FIG. 15(h). These masks were extracted from difference images of the Cr component, not from the positive or negative difference image of the Y component. Therefore, no mask is excluded from the final candidate masks by the third extraction determination.

Finally, the control unit 19 performs the fourth extraction determination. In the example of FIG. 15, the masks retained after the third extraction determination are the masks shown as the white regions in FIG. 15(g) and FIG. 15(h). These masks were extracted from binarized images generated by applying different thresholds to the positive difference image of the Cr component. In addition, the mask shown as the white region in FIG. 15(h) is contained in the mask shown as the white region in FIG. 15(g). The control unit 19 therefore determines that these two masks are in an inclusion relationship and merges them into the mask shown as the white region in FIG. 15(g). As a result of the first to fourth extraction determinations, the only mask retained as a final candidate mask is the mask shown as the white region in FIG. 15(g). The control unit 19 extracts this mask as the mask corresponding to the subject.

The mask shown as the white region in FIG. 15(g) corresponds to the region of the flower in the through image. Therefore, as shown in FIG. 14(a), when the through image is displayed on the display unit 23, a subject frame 55 is superimposed on the region of the flower.

In the conventional method, by contrast, if the masks shown as the white regions in FIG. 15(a) and FIG. 15(g) are among the top three masks, the control unit 19 identifies both the region corresponding to the mask of FIG. 15(a) and the region corresponding to the mask of FIG. 15(g) as subject regions. The region corresponding to the mask shown as the white region in FIG. 15(a) is a region where light is reflected, and the region corresponding to the mask shown as the white region in FIG. 15(g) is the region of the flower. Therefore, as shown in FIG. 14(b), when the through image is displayed on the display unit 23, subject frames 55 and 56 are superimposed on the region of the flower and on the region where light is reflected, respectively. However, the region where light is reflected is merely a region of high luminance, not a region where the subject is located. In other words, in the conventional method, even a region containing no subject is extracted as a mask corresponding to the subject region if its luminance is high.

In the first embodiment, however, when the ranking by evaluation value Ev described above places a mask extracted from a difference image of the luminance component below a mask extracted from a difference image of the Cb component or the Cr component, that luminance-component mask is excluded on the basis of its average intensity. Thus, even if a region of high luminance is extracted as a mask, the mask is readily excluded once it is determined to rank below a chromaticity-component mask. This prevents a mask that contains no subject and is merely a region of high luminance from being identified as a region corresponding to the subject.

FIG. 16 shows a through image P5 in which an amusement park ride is within the imaging range. The binarized images of the Y component, the Cb component, and the Cr component in this case are shown in FIG. 17(a) to FIG. 17(i). In the binarized images of FIG. 17(a) to FIG. 17(i), the regions shown in gray and white are extracted as masks. The subject candidate masks are then narrowed down from the masks extracted from the binarized images. The two masks shown as the white regions in FIG. 17(f), the mask shown as the white region in FIG. 17(g), and the mask shown as the white region in FIG. 17(h) are the subject candidate masks.

The control unit 19 calculates an evaluation value Ev for each of these subject candidate masks and ranks the masks using the calculated evaluation values Ev. Because the subject candidate masks are the four masks described above, these four masks become the final candidate masks.

The control unit 19 performs the first extraction determination on the four final candidate masks. The two masks shown as the white regions in FIG. 17(f) were extracted from the negative difference image of the Cb component, and the masks shown as the white regions in FIG. 17(g) and FIG. 17(h) were extracted from the positive difference image of the Cr component. In this example, the average intensities of the two masks shown as the white regions in FIG. 17(f) are equal to or less than the threshold Th5, so these masks are excluded from the final candidate masks. The masks shown as the white regions in FIG. 17(g) and FIG. 17(h) are therefore retained as the final candidate masks. The second and third extraction determinations are then performed, but no mask meets their conditions, so the masks narrowed down by the first extraction determination are retained unchanged as the final candidate masks.

Finally, the control unit 19 performs the fourth extraction determination on the final candidate masks. The masks shown as the white regions in FIG. 17(g) and FIG. 17(h) were each extracted by binarization using a different threshold on the difference image of the Cr component. These two masks are therefore in an inclusion relationship, and the control unit 19 merges them. As a result, the mask shown as the white region in FIG. 17(g) is retained as the final candidate mask, and the control unit 19 extracts it as the mask corresponding to the subject region. The mask shown as the white region in FIG. 17(g) corresponds to the region of the ride. Therefore, when the through image P5 is displayed on the display unit 23, a frame 58 is superimposed on the region of the ride in the through image P5 (see FIG. 16(a)).

In the conventional method, the subject candidate masks ranked in the top three by evaluation value Ev are judged to be masks corresponding to the subject. That is, the two masks shown as the white regions in FIG. 17(f) are determined to be masks corresponding to the subject region. These two masks are each regions of trees. Therefore, when the through image P5 is displayed on the display unit 23, not only is the subject frame 58 superimposed on the region of the ride, but subject frames 59 and 60 are also superimposed on the regions of the trees (see FIG. 16(b)).

In the first embodiment, however, if the first extraction determination finds a mask with a low average intensity, that mask is judged to be part of the background and is excluded from the final candidate masks. Therefore, even when a region containing no subject is extracted as a mask, that mask can be reliably excluded, and only the masks corresponding to the subject are extracted.

FIG. 18 shows a through image P6 in which a cat is within the imaging range as the subject. In the binarized images of FIG. 19(a) to FIG. 19(i), the regions shown in gray and white are extracted as masks. In this example, the subject candidate masks are narrowed down from the masks extracted from the binarized images. The masks shown as the white regions in FIG. 19(a), FIG. 19(b), FIG. 19(c), and FIG. 19(d) are the subject candidate masks.

In this example, ranking the subject candidate masks yields the following order: the mask shown as the white region in FIG. 19(a), then the masks of FIG. 19(b), FIG. 19(c), and FIG. 19(d). In this case, the subject candidate masks are the four masks described above, so these four masks are selected as the final candidate masks. The control unit 19 performs the first extraction determination on these masks. The mask shown as the white region in FIG. 19(d) is a mask extracted from the difference image of the Cb component. If the first extraction determination finds that the average intensity of this mask is equal to or less than the threshold Th3, the control unit 19 excludes it from the final candidate masks. Next, the control unit 19 performs the second extraction determination. In this case, among the final candidate masks, no mask extracted from the Y-component difference image ranks below a mask extracted from the Cb-component or Cr-component difference image, so no mask is excluded from the final candidate masks.

Here, the masks shown as the white regions in FIGS. 19(a) and 19(b) are masks extracted from the positive difference image of the Y component, and the mask shown as the white region in FIG. 19(c) is a mask extracted from the negative difference image of the Y component. Therefore, when the control unit 19 performs the third extraction determination, one of these masks is excluded from the final candidate masks. The masks of FIGS. 19(a) and 19(b) each rank higher in evaluation value Ev than the mask of FIG. 19(c). Therefore, the control unit 19 excludes the mask shown as the white region in FIG. 19(c) from the final candidate masks.

For example, when a high-luminance mask ranks above a low-luminance mask, the low-luminance mask often corresponds to a black portion of the subject and is unnecessary. Therefore, when both a mask extracted from the positive binarized image of the Y component and a mask extracted from the negative binarized image of the Y component rank within the top five, the control unit 19 keeps whichever of the two ranks higher and excludes the one that ranks lower.

Conversely, when the mask extracted from the negative binarized image of the Y component ranks above the mask extracted from the positive binarized image of the Y component, the subject corresponding to the higher-ranked mask can be assumed to be dark in color in the through image, and the region corresponding to the mask extracted from the positive binarized image is often background. In this case as well, when both masks rank within the top five, the control unit 19 keeps the higher-ranked mask and excludes the lower-ranked mask.
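
The third extraction determination can be sketched roughly as follows. This is a minimal sketch under stated assumptions: each candidate is represented by a hypothetical `(name, source)` pair, the source tags `"Y+"` and `"Y-"` are invented labels for masks taken from the positive and negative Y-component images, and the list is assumed to be sorted by descending evaluation value Ev already.

```python
def third_extraction(ranked, top_n=5):
    """Drop Y masks of the lower-ranked polarity when both polarities are in the top N.

    ranked: list of (name, source) pairs sorted by descending Ev.
    source: hypothetical tag such as "Y+", "Y-", "Cb+", "Cr-".
    """
    top = ranked[:top_n]
    y_sources = [src for _, src in top if src in ("Y+", "Y-")]
    if len(set(y_sources)) < 2:
        return list(ranked)          # only one Y polarity present: nothing to exclude
    keep = y_sources[0]              # polarity of the highest-ranked Y mask
    drop = "Y-" if keep == "Y+" else "Y+"
    return [(name, src) for name, src in ranked if src != drop]

# Mirrors the FIG. 19 example: (a) and (b) are Y-positive, (c) is Y-negative.
result = third_extraction([("a", "Y+"), ("b", "Y+"), ("c", "Y-")])
```

The same function covers the dark-subject case: if a `"Y-"` mask ranks first, the `"Y+"` masks are the ones removed.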

Finally, the control unit 19 performs the fourth extraction determination. Here, the mask shown as the white region in FIG. 19(a) and the mask shown as the white region in FIG. 19(b) are in an inclusion relationship. Therefore, the control unit 19 integrates these masks. As a result, the mask shown as the white region in FIG. 19(a) is extracted as the mask corresponding to the subject region.
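
One way to picture the integration of masks in an inclusion relationship is shown below. It is a sketch only: masks are assumed to be boolean NumPy arrays of equal shape, and "integration" is assumed to mean keeping the containing mask and dropping the mask that lies entirely inside it. The function names are hypothetical.

```python
import numpy as np

def contains(outer, inner):
    """True if every pixel of `inner` also belongs to `outer`."""
    return bool(np.all(outer[inner]))

def merge_contained(masks):
    """Drop any mask that is fully contained in a strictly larger mask of the list."""
    kept = []
    for i, m in enumerate(masks):
        inside_other = any(
            j != i and contains(other, m) and other.sum() > m.sum()
            for j, other in enumerate(masks)
        )
        if not inside_other:
            kept.append(m)
    return kept

# Toy example: mask b lies entirely inside mask a.
a = np.zeros((6, 6), dtype=bool)
a[1:5, 1:5] = True
b = np.zeros((6, 6), dtype=bool)
b[2:4, 2:4] = True
merged = merge_contained([a, b])
```

Identical masks are both kept by this sketch, because neither is strictly larger than the other; how such a tie should be handled is not stated in the text.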

The mask shown as the white region in FIG. 19(a) covers the cat's torso and legs. Accordingly, when the through image P6 is displayed on the display unit 23, the subject frame 61 is superimposed on the region of the cat's torso and legs. In this way, when both a high-luminance region and a low-luminance region are extracted as masks, selecting only the higher-rated mask makes it possible to appropriately identify the region corresponding to the subject.

The operational effects of the above embodiment are described below. When a binarized image contains a plurality of subject candidate masks, the control unit 19 of the above embodiment sets the reference point P by averaging the centroids of the subject candidate masks. The control unit 19 then calculates the evaluation value Ev of each mask in the binarized image using the moment of inertia MOI of the mask about the reference point P (S207). The control unit 19 then ranks the subject candidate masks by evaluation value Ev and thereby extracts the mask corresponding to the subject (S208).

FIG. 21(a) shows an example of the mask that is extracted as the top-ranked mask in the above embodiment from among the masks shown in FIG. 20. FIG. 21(b) shows, as a comparative example, the top-ranked mask extracted from the masks shown in FIG. 20 when the point O at the center of the screen is used as the reference point. The description of FIG. 21 assumes that the average intensities of the subject candidate masks shown in FIG. 20 are approximately equal.

In the comparative example shown in FIG. 21(b), the reference point for calculating the moment of inertia is located at the center of the screen. Consequently, among the subject candidate masks 1 to 3 contained in the same binarized image, mask 3, which is closest to the center of the screen, has a small moment of inertia, so its evaluation value Ev is high and mask 3 is extracted as the top-ranked mask. However, when the photographer has deliberately placed the subject to one side of the composition, or when a moving subject is being photographed, giving priority to the mask nearest the center of the screen can yield a mask that does not match the subject the photographer is interested in.

In contrast, in the above embodiment shown in FIG. 21(a), the reference point P is set by averaging the centroids of the subject candidate masks, and the moment of inertia of each mask is calculated using distances from the reference point P. As a result, mask 2, which has a large area and a compact shape, has a smaller moment of inertia than mask 3, which is close to the center of the screen. Therefore, if the average intensities of the subject candidate masks are approximately equal, the evaluation value Ev of mask 2 is higher than that of mask 3, and mask 2 is extracted as the top-ranked mask.
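
The reference point calculation can be illustrated with a short sketch. The averaging of mask centroids follows the description; the moment-of-inertia function, however, uses the ordinary sum of squared pixel distances as a stand-in and is an assumption, since the document's own moment-of-inertia and Ev formulas are defined elsewhere and are not reproduced here. Masks are assumed to be boolean NumPy arrays.

```python
import numpy as np

def centroid(mask):
    """Centroid (x, y) of the True pixels of a boolean mask."""
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean()

def reference_point(masks):
    """Reference point P: the average of the centroids of the candidate masks."""
    cs = np.array([centroid(m) for m in masks])
    return cs.mean(axis=0)  # array [Px, Py]

def moment_of_inertia(mask, point):
    """Assumed definition: sum of squared distances of mask pixels from `point`."""
    ys, xs = np.nonzero(mask)
    px, py = point
    return float(np.sum((xs - px) ** 2 + (ys - py) ** 2))

# Toy example with two single-pixel masks at (x=2, y=2) and (x=6, y=4).
m1 = np.zeros((10, 10), dtype=bool)
m1[2, 2] = True
m2 = np.zeros((10, 10), dtype=bool)
m2[4, 6] = True
P = reference_point([m1, m2])
moi_m1 = moment_of_inertia(m1, P)
```

Because P follows the distribution of the candidate masks rather than the screen center, a composition with the subject placed to one side moves P toward that side.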

In this way, in the above embodiment, the reference point P for calculating the moment of inertia is set according to the distribution of the subject candidate masks, so masks can be extracted under conditions closer to human perception across a variety of shooting scenes.

<Modification 1>
In the above embodiment, in a binarized image containing only one subject candidate mask, equations (1) and (2) place the reference point P at the centroid of that mask. The moment of inertia of the mask then becomes excessively small, and the evaluation value Ev of the subject candidate mask can become larger than warranted compared with masks from other binarized images.

For this reason, for a binarized image containing only one subject candidate mask, the evaluation value calculation unit 47 may shift the reference point P used for calculating the moment of inertia away from the centroid of the subject candidate mask by moving it toward the center of the screen. For example, for such a binarized image, the evaluation value calculation unit 47 may set the reference point P at the midpoint between the centroid of the subject candidate mask and the center of the screen. This allows the evaluation value calculated for a binarized image containing only one subject candidate mask to be adjusted appropriately relative to the evaluation values calculated for other binarized images.
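
The midpoint rule for the single-mask case is simple enough to write down directly. In this sketch the function name and the convention that the screen center is `(width / 2, height / 2)` are assumptions made for illustration.

```python
def single_mask_reference_point(centroid, image_size):
    """Midpoint between the mask centroid and the screen center."""
    gx, gy = centroid
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0  # assumed screen-center convention
    return (gx + cx) / 2.0, (gy + cy) / 2.0

# Hypothetical mask centroid in a 640 x 480 image.
p = single_mask_reference_point((100.0, 40.0), (640, 480))
```

Since P no longer coincides with the centroid, the moment of inertia of the lone mask stays comparable with values obtained for multi-mask images.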

<Modification 2>
In the above embodiment, when setting the reference point P for calculating the moment of inertia, the evaluation value calculation unit 47 may weight the centroids of the masks using the intensities of the subject candidate masks.

For example, the evaluation value calculation unit 47 may obtain the coordinates (x, y) of the reference point P using equations (4) and (5) below.

Px = Σ(Gx_n · k_n) / n ... (4)
Py = Σ(Gy_n · k_n) / n ... (5)
Here, "k_n" is the weighting coefficient corresponding to subject candidate mask n. The weighting coefficient k_n is set to a higher value as the intensity of mask n increases. For example, the evaluation value calculation unit 47 sets the weighting coefficient of each mask according to the ratio of the average intensities of the masks. Alternatively, the evaluation value calculation unit 47 may rank the masks by average intensity and assign higher weighting coefficients in order from the top-ranked mask.
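
Equations (4) and (5) can be tried out with the sketch below. The text only says that k_n follows the ratio of average intensities, so the specific choice k_n = I_n / mean(I) used here is an assumption. It has the convenient property that the coefficients average to 1, which makes the division by n in equations (4) and (5) produce an intensity-weighted average of the centroids.

```python
def weighted_reference_point(centroids, mean_intensities):
    """Reference point P' from equations (4) and (5) with assumed weights.

    centroids: list of (Gx_n, Gy_n) for each candidate mask.
    mean_intensities: average intensity I_n of each mask.
    """
    n = len(centroids)
    avg = sum(mean_intensities) / n
    k = [i / avg for i in mean_intensities]  # assumption: k_n = I_n / mean(I)
    px = sum(gx * kn for (gx, _), kn in zip(centroids, k)) / n  # equation (4)
    py = sum(gy * kn for (_, gy), kn in zip(centroids, k)) / n  # equation (5)
    return px, py

# Hypothetical values: mask 1 at (0, 0) is much stronger than masks 2 and 3.
px, py = weighted_reference_point([(0, 0), (10, 0), (20, 30)], [60.0, 20.0, 20.0])
```

The unweighted average of these centroids is (10, 10); with the weights above the point moves toward the high-intensity mask at (0, 0).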

FIG. 22 shows an example in which the reference point P' is calculated for the masks of FIG. 20 using weighting coefficients. In the example of FIG. 22, the average intensity of mask 1 is assumed to be sufficiently higher than the average intensities of masks 2 and 3. In this case, the weighting coefficients increase the influence of the high-intensity mask 1 when the reference point P' is calculated, so the reference point P' lies closer to mask 1 than the reference point P does. Thus, in Modification 2, the reference point for the moment of inertia can be determined in consideration of the saliency of the masks themselves.

When determining the weighting coefficients in Modification 2, the intensity of the mask at its centroid may be used instead of the average intensity of the mask.

The evaluation value calculation unit 47 may also calculate the reference point P using equations (1) and (2) without weighting coefficients when the intensity difference between the subject candidate masks is less than a threshold, and calculate the reference point P' using equations (4) and (5) with weighting coefficients when the intensity difference between the subject candidate masks is equal to or greater than the threshold.

<Modification 3>
In the above embodiment, when setting the reference point P for calculating the moment of inertia, the evaluation value calculation unit 47 may calculate the reference point P while excluding the centroids of masks whose intensity is less than a threshold.
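
Modification 3 amounts to filtering the centroids before averaging. The sketch below assumes the mask intensity is a single number per mask and returns `None` when every mask falls below the threshold, since the text does not say what should happen in that case.

```python
def reference_point_excluding_weak(centroids, intensities, threshold):
    """Average only the centroids of masks whose intensity reaches the threshold."""
    strong = [c for c, i in zip(centroids, intensities) if i >= threshold]
    if not strong:
        return None  # fallback is not specified in the description
    n = len(strong)
    return (sum(x for x, _ in strong) / n, sum(y for _, y in strong) / n)

# Hypothetical values: the third mask is too weak to influence P.
p3 = reference_point_excluding_weak([(0, 0), (10, 0), (20, 30)], [60.0, 20.0, 5.0], 10.0)
```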

<Supplementary Notes on the First Embodiment>
In the first embodiment, the average of the pixel values of the through image that has undergone resolution conversion is calculated, and the reference density image is generated using that average. However, this is not limiting; the reference density image may instead be generated using the median of the pixel values of the through image.

In the first embodiment, the binarized images of each component are generated from the difference images of the Y, Cb, and Cr components, but the binarized images of each component may instead be generated from the Y, Cb, and Cr component images themselves. In that case, three or more binarized images may be generated for each component image using three or more different thresholds. Also in that case, a histogram may be generated for each of the Y, Cb, and Cr component images, the standard deviation σ and the peak-to-peak value pp of the histogram obtained, and these values used to determine whether the image is to be used when narrowing down the subject candidates.

In the first embodiment, a histogram is generated from each generated difference image, and the standard deviation σ and the peak-to-peak value pp of the histogram are used to determine whether the difference image is to be used when narrowing down the subject candidates. However, this is not limiting; the determination may instead be performed after the binarized images have been generated from the difference images.

In the first embodiment, it is determined whether the number of masks extracted from a binarized image is equal to or greater than the threshold Th3 and the average intensity of all the masks is equal to or less than the threshold Th4. When these conditions are satisfied, all masks extracted from that binarized image are excluded from the subject candidate masks. With this determination, however, every mask extracted from every binarized image may end up being excluded from the subject candidate masks. In such a case, the binarization processing may be performed with an increased number of thresholds.
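
The per-image rejection test described above can be sketched as follows. The function name and the example numbers are hypothetical, and the "average intensity of all the masks" is computed here as the mean of the per-mask averages, which is an assumption (a pixel-weighted average would be another reading).

```python
def reject_all_masks(mask_intensities, th3, th4):
    """Return True when every mask of one binarized image should be dropped.

    mask_intensities: average intensity of each mask extracted from the image.
    th3: threshold on the number of masks; th4: threshold on overall average intensity.
    """
    if len(mask_intensities) < th3:
        return False
    overall = sum(mask_intensities) / len(mask_intensities)
    return overall <= th4

many_weak = reject_all_masks([5.0, 6.0, 4.0, 7.0], th3=4, th4=10.0)
few_strong = reject_all_masks([50.0, 60.0], th3=4, th4=10.0)
```

An image with many faint masks is treated as clutter, while an image with a few strong masks is left untouched.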

In the first embodiment, the number of masks is counted in every binarized image, and when the number of masks extracted from a binarized image is equal to or greater than the threshold Th3, it is determined whether the average intensity of all the masks is equal to or less than the threshold Th4. However, it is not necessary to use all the binarized images. For example, the determination may be performed only on binarized images based on difference images whose histograms have a small standard deviation σ. When the standard deviation σ of a difference image is small, the pixel values of the difference image are close to the average, which means that the difference image can be judged to contain no conspicuous object. It is therefore effective to perform the determination on binarized images obtained from difference images with a small standard deviation σ.

Although this determination is performed on the binarized image as a whole, this is not limiting; the binarized image may be divided into a plurality of regions and the determination performed for each of the divided regions.

In the first embodiment, the standard deviation σ is obtained when the positive and negative difference images of the Y, Cb, and Cr components are generated. It is therefore also possible to select the difference images to be used for generating the binarized images based on the value of the standard deviation σ obtained when each difference image is generated.
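
A rough sketch of this selection is given below. Several points are assumptions rather than statements from the description: the reference density is taken as a single mean value for the whole component image, the positive and negative difference images are formed by clipping, and the selection rule (keep images whose σ reaches a hypothetical `min_sigma`) is only one plausible criterion.

```python
import numpy as np

def difference_images(component):
    """Positive and negative difference images against a flat reference density."""
    comp = component.astype(np.float64)
    reference = comp.mean()                        # the median is an alternative
    positive = np.clip(comp - reference, 0, None)  # where pixels exceed the reference
    negative = np.clip(reference - comp, 0, None)  # where pixels fall below it
    return positive, negative

def select_by_sigma(images, min_sigma):
    """Keep difference images whose standard deviation reaches min_sigma."""
    return [img for img in images if img.std() >= min_sigma]

# Toy 2 x 2 component image with one bright pixel.
comp = np.array([[0, 0], [0, 100]])
pos, neg = difference_images(comp)
selected = select_by_sigma([pos, neg], min_sigma=20.0)
```

In this toy case only the positive difference image has enough spread to be selected.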

In the first embodiment, the threshold for the number of extracted masks is the threshold Th3 and the threshold for the average intensity of all the masks is the threshold Th4, and both values are fixed. However, these thresholds may be set to different values for each of the Y, Cb, and Cr components. In that case, the thresholds Th3 and Th4 for each of the Y, Cb, and Cr components may be set individually based on, for example, the shooting scene or the composition of the through image.

In the first embodiment, masks in an inclusion relationship among the final candidate masks are integrated. In this integration, when two masks are in an inclusion relationship, the mask contained in the other is excluded from the final candidate masks and the containing mask is retained as a final candidate. However, both of the two masks in the inclusion relationship may instead be kept as final candidate masks, and when the subject frame is displayed, the subject frame may be displayed only for the subject region corresponding to one of the masks.

In the first embodiment, masks in an inclusion relationship among the final candidate masks are integrated, but this is not limiting. For example, when masks are extracted from the binarized images, masks in an inclusion relationship may be integrated first, and the subject candidate masks may then be narrowed down.

In the first embodiment, the process of narrowing down the difference images (step S204 of the flowchart shown in FIG. 5 and the processing of the flowchart shown in FIG. 6) and the process of narrowing down the masks (step S206 of the flowchart shown in FIG. 5 and the processing of the flowchart shown in FIG. 10) are both performed. However, only one of these two narrowing processes may be performed, or both may be omitted.

In the first embodiment, the first to fourth extraction determinations are performed as the process of extracting the mask corresponding to the subject (step S208 of the flowchart shown in FIG. 5 and the processing of the flowchart shown in FIG. 13). However, it is also possible to perform only one or some of the first to fourth extraction determinations. Alternatively, without performing these extraction determinations, the mask with the highest evaluation value Ev among the final candidate masks may be extracted as the mask corresponding to the subject.

In the first embodiment, the mask narrowing unit 46 may also narrow down the subject candidate masks using the following criteria.

For example, the mask narrowing unit 46 may set a rectangular cutoff region whose area is 60% of the entire image and whose centroid coincides with the center of the image, and exclude from the subject candidates any mask of which less than 50% lies within this cutoff region.
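
The cutoff-region criterion can be sketched as below. The 60% area and 50% membership figures come from the text; the assumption that the rectangle keeps the image's aspect ratio, the treatment of pixel indices as coordinates, and all function names are choices made for this example only.

```python
import numpy as np

def cutoff_rect(width, height, area_ratio=0.6):
    """Centered rectangle covering `area_ratio` of the image (same aspect ratio assumed)."""
    s = area_ratio ** 0.5
    w, h = width * s, height * s
    x0, y0 = (width - w) / 2.0, (height - h) / 2.0
    return x0, y0, x0 + w, y0 + h

def inside_fraction(mask, rect):
    """Fraction of the mask's pixels that fall inside the rectangle."""
    x0, y0, x1, y1 = rect
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return 0.0
    inside = (xs >= x0) & (xs < x1) & (ys >= y0) & (ys < y1)
    return float(inside.mean())

def passes_cutoff(mask, rect, min_fraction=0.5):
    """Keep a mask only if at least `min_fraction` of it lies in the cutoff region."""
    return inside_fraction(mask, rect) >= min_fraction

rect = cutoff_rect(100, 100)
center_mask = np.zeros((100, 100), dtype=bool)
center_mask[40:60, 40:60] = True   # near the image center
corner_mask = np.zeros((100, 100), dtype=bool)
corner_mask[0:10, 0:10] = True     # in the top-left corner
```

A mask near the center passes the check, while a mask confined to a corner is removed from the subject candidates.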

Also, for example, the mask narrowing unit 46 may obtain motion vectors of the masks between consecutively acquired through images and, among the masks that would be excluded from the subject candidates under other conditions, keep masks that have a motion vector as subject candidates while excluding masks without a motion vector as before.

<Second Embodiment>
FIG. 23 shows a configuration example of the image processing device of the second embodiment. The image processing device 90 of the second embodiment functions as the subject detection device of the present invention. A specific example of the image processing device 90 is a computer.

The image processing device 90 shown in FIG. 23 includes a data reading unit 91, a storage device 92, a CPU 93, a memory 94, an input/output I/F 95, and a bus 96. The data reading unit 91, the storage device 92, the CPU 93, the memory 94, and the input/output I/F 95 are connected to one another via the bus 96. An input device 97, such as a keyboard or a mouse, and a monitor 98 are connected to the image processing device 90 via the input/output I/F 95. The input/output I/F 95 receives various inputs from the input device 97 and outputs display data to the monitor 98.

The data reading unit 91 is used to read image data and programs from outside. The data reading unit 91 is, for example, a reading device that acquires data from a removable storage medium 99 (such as a reader for optical disks, magnetic disks, or magneto-optical disks), or a communication device that communicates with external devices in accordance with a known communication standard (such as a USB interface or a wired or wireless LAN module). FIG. 23 shows the case in which the data reading unit 91 is a reading device that acquires data from the removable storage medium 99.

The storage device 92 consists of a storage medium such as a hard disk or a nonvolatile semiconductor memory. The storage device 92 stores the above program and the various data needed to execute it. The storage device 92 can also store image data and other data read by the data reading unit 91.

The CPU 93 is a processor that performs overall control of each part of the image processing device 90. When it executes the program, the CPU 93 provides the functions of the image processing unit 31. The subject detection unit 32 is provided as one of the functions of the image processing unit 31. Because the image processing unit 31 and the subject detection unit 32 have the same functions as in the first embodiment, they are given the same reference numerals as in the first embodiment. That is, the subject detection unit 32 has the same configuration as in the first embodiment (color space conversion unit 41, resolution conversion unit 42, difference image generation unit 43, image determination unit 44, binarization processing unit 45, mask narrowing unit 46, evaluation value calculation unit 47, and mask extraction unit 48). In the second embodiment as well, instead of providing the subject detection unit 32 as one function of the image processing unit 31, the CPU 93 may provide the functions of the subject detection unit 32 by executing a program separate from the program that implements the functions of the image processing unit 31.

The memory 94 temporarily stores the various calculation results produced when the CPU 93 executes the program. The memory 94 is, for example, a volatile SDRAM.

In the image processing device 90 of the second embodiment, when image data serving as the input image is acquired from the data reading unit 91 or the storage device 92, the CPU 93 executes the subject detection processing shown in FIGS. 4, 5, 10, and 13. The image processing device 90 of the second embodiment can also achieve the same effects as the subject detection processing of the first embodiment.

DESCRIPTION OF SYMBOLS: 10 ... Imaging device, 19 ... Control unit, 31 ... Image processing unit, 32 ... Subject detection unit, 41 ... Color space conversion unit, 42 ... Resolution conversion unit, 43 ... Difference image generation unit, 44 ... Image determination unit, 45 ... Binarization processing unit, 46 ... Mask narrowing unit, 47 ... Evaluation value calculation unit, 48 ... Mask extraction unit, 90 ... Image processing device, 93 ... CPU

Claims

A subject detection device comprising:
a binarization processing unit that generates a plurality of different binarized images using an image to be processed;
an evaluation unit that, for a plurality of pixel regions extracted in each binarized image, obtains a reference point of the plurality of pixel regions included in the binarized image using values of pixels that correspond to the plurality of pixel regions and are acquired from the image before binarization processing, and evaluates the pixel regions using the reference point; and
an extraction unit that uses the evaluation to extract the pixel region corresponding to a specific region of the image to be processed.

The subject detection device according to claim 1,
further comprising a narrowing unit that narrows down the pixel regions of subject candidates based on the area ratio between the binarized image and the pixel region or on the position of the pixel region in the binarized image,
wherein the evaluation unit obtains the reference point from the centroids of the pixel regions of the subject candidates narrowed down by the narrowing unit.

The subject detection device according to claim 1 or 2,
wherein the evaluation unit obtains the reference point by weighting the centroids of the pixel regions using the intensities of the pixels that correspond to the pixel regions and are acquired from the image before binarization processing.

The subject detection apparatus according to any one of claims 1 to 3,
Further comprising a first generation unit that generates a reference density image based on pixel values in the image to be processed; and
A second generation unit that uses the difference between the image to be processed and the reference density image to generate a first difference image, indicating the degree of deviation of the pixel value from the reference density image at locations where the pixel value exceeds the reference density image, and a second difference image, indicating the degree of deviation of the pixel value from the reference density image at locations where the pixel value is lower than the reference density image;
Wherein the binarization processing unit binarizes the first difference image and the second difference image to generate the plurality of binarized images.
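
The generation of the two difference images and their binarization might look like the sketch below. It assumes, purely for illustration, that the reference density image is a constant equal to the mean pixel value and that several fixed thresholds are applied to each difference image. The claim specifies neither choice, and all names here are hypothetical.

```python
def reference_density(image):
    """Hypothetical reference density: the mean of all pixel values."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def difference_images(image, ref):
    """Return (first, second) difference images relative to ref.

    first:  deviation where the pixel value exceeds the reference, else 0.
    second: deviation where the pixel value is below the reference, else 0.
    """
    first = [[max(p - ref, 0) for p in row] for row in image]
    second = [[max(ref - p, 0) for p in row] for row in image]
    return first, second


def binarize(diff, threshold):
    """Set pixels whose deviation exceeds the threshold to 1, others to 0."""
    return [[1 if v > threshold else 0 for v in row] for row in diff]


def binarized_images(image, thresholds):
    """Binarize both difference images at every threshold."""
    first, second = difference_images(image, reference_density(image))
    return [binarize(d, t) for d in (first, second) for t in thresholds]
```

Splitting the deviation by sign means that regions brighter than the reference and regions darker than it end up in separate binarized images, so each can be extracted as its own candidate pixel region.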

The subject detection apparatus according to any one of claims 1 to 4,
Further comprising a display unit that indicates the position corresponding to the pixel region extracted by the extraction unit.

An imaging device comprising:
An imaging unit that captures an image of a subject; and
The subject detection apparatus according to any one of claims 1 to 5.

A program that causes a computer to execute processing of:
Generating a plurality of different binarized images using an image to be processed;
For a plurality of pixel regions extracted in each of the binarized images, obtaining reference points of the plurality of pixel regions included in the binarized image by using the values of the pixels corresponding to the plurality of pixel regions acquired from the image before binarization processing, and evaluating the pixel regions using the reference points; and
Using the evaluation to extract the pixel region corresponding to a specific region of the image to be processed.

The program according to claim 7,
Causing the computer to further execute processing of narrowing down the pixel regions of subject candidates based on the area ratio between the binarized image and the pixel region, or on the position of the pixel region in the binarized image; and
Obtaining the reference point from the center of gravity of the pixel regions of the narrowed-down subject candidates.

The program according to claim 7 or 8,
Causing the computer to execute processing of obtaining the reference point by weighting the center of gravity of the pixel region using the intensity of the pixels corresponding to the pixel region, acquired from the image before binarization processing.

The program according to any one of claims 7 to 9,
Causing the computer to further execute processing of generating a reference density image based on pixel values in the image to be processed;
Using the difference between the image to be processed and the reference density image to generate a first difference image, indicating the degree of deviation of the pixel value from the reference density image at locations where the pixel value exceeds the reference density image, and a second difference image, indicating the degree of deviation of the pixel value from the reference density image at locations where the pixel value is lower than the reference density image; and
Binarizing the first difference image and the second difference image to generate the plurality of binarized images.

The program according to any one of claims 7 to 10,
Causing the computer to further execute processing of displaying, on a display device, the position corresponding to the extracted pixel region.