JP6121768B2

JP6121768B2 - Image detection apparatus, control program, and image detection method

Info

Publication number: JP6121768B2
Application number: JP2013063363A
Authority: JP
Inventors: 健太西行
Original assignee: MegaChips Corp
Current assignee: MegaChips Corp
Priority date: 2013-03-26
Filing date: 2013-03-26
Publication date: 2017-04-26
Anticipated expiration: 2033-03-26
Also published as: JP2014191369A

Description

The present invention relates to a technique for detecting a detection target image in a processing target image.

Patent Documents 1 to 3 disclose techniques for detecting a detection target image in a processing target image.

Patent Document 1: JP 2008-27058 A
Patent Document 2: JP 2009-175821 A
Patent Document 3: JP 2011-221791 A

When a detection target image is detected in a processing target image, higher detection accuracy is desired.

The present invention has been made in view of the above, and its object is to provide a technique capable of improving the detection accuracy for a detection target image.

To solve the above problem, one aspect of the image detection device according to the present invention is an image detection device that detects a detection target image in a processing target image, the device comprising: a map generation unit that generates a map indicating the distribution, over the processing target image, of an accuracy value indicating the likelihood of being the detection target image; a binarization processing unit that binarizes the map using a threshold value to generate a binarized map; an extraction unit that extracts circular regions from a first region of the binarized map generated using the threshold value, the first region corresponding to the region of the map in which the accuracy value is equal to or greater than the threshold value, or greater than the threshold value; a threshold adjustment unit that adjusts the threshold value based on a first evaluation value indicating the overlapping area between a circular region extracted by the extraction unit from the first region of the binarized map and that first region, and a second evaluation value indicating the overlapping area between that circular region and another circular region extracted by the extraction unit from the first region; and a specifying unit that specifies the detection target image in the processing target image based on the first region of the binarized map generated using the threshold value adjusted by the threshold adjustment unit.

Here, the term "circular region" covers not only perfectly circular regions but also elliptical regions and the like.

In one aspect of the image detection device according to the present invention, the threshold adjustment unit adjusts the threshold value based on: a first evaluation value indicating the overlapping area between a circular region extracted by the extraction unit from the first region of the binarized map and that first region; a second evaluation value indicating the overlapping area between that circular region and another circular region extracted by the extraction unit from the first region; and a third evaluation value indicating the overlapping area between that circular region and a second region of the binarized map, the second region corresponding to the region of the map in which the accuracy value is less than the threshold value used to generate the binarized map, or equal to or less than that threshold value.
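The three evaluation values can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the text: overlap areas are counted in pixels, circular regions are perfect circles given as `(cx, cy, r)`, and the second evaluation value is taken against the union of all other circles. The function names `circle_mask` and `evaluation_values` are hypothetical, and the rule that turns these values into a threshold adjustment is not shown here.

```python
import numpy as np

def circle_mask(shape, cx, cy, r):
    """Boolean mask of the pixels inside a circle (perfect circle assumed)."""
    ys, xs = np.ogrid[:shape[0], :shape[1]]
    return (xs - cx) ** 2 + (ys - cy) ** 2 <= r ** 2

def evaluation_values(binary_map, circles, index):
    """Return (first, second, third) overlap areas, in pixels, for circles[index].

    binary_map: 2-D array, True = first region, False = second region.
    circles: list of (cx, cy, r) tuples extracted from the first region.
    """
    binary_map = np.asarray(binary_map, dtype=bool)
    target = circle_mask(binary_map.shape, *circles[index])

    # Assumption: "another circular region" is treated as the union of all others.
    others = np.zeros_like(binary_map, dtype=bool)
    for i, circle in enumerate(circles):
        if i != index:
            others |= circle_mask(binary_map.shape, *circle)

    first = int(np.count_nonzero(target & binary_map))    # circle vs. first region
    second = int(np.count_nonzero(target & others))       # circle vs. other circles
    third = int(np.count_nonzero(target & ~binary_map))   # circle vs. second region
    return first, second, third
```

Because every pixel of the binarized map belongs to either the first or the second region, the first and third values in this sketch always sum to the pixel area of the circle.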

In one aspect of the image detection device according to the present invention, the extraction unit detects edges of the binarized map and identifies circular regions in the first region of the binarized map by a Hough transform using the coordinates of those edges.
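As a rough illustration of edge detection followed by a Hough transform, the following Python sketch finds a single circle in a binarized map. It is a simplified stand-in under stated assumptions, not the procedure of the embodiment: edges are taken as first-region pixels with a 4-neighbour outside the region, only perfect circles with integer centres and radii are searched, and an edge pixel votes for a centre when its distance to that centre lies in (r − 1, r], the band in which the inner boundary pixels of a digitised disc of radius r fall. The names `edge_coordinates` and `hough_circle` are hypothetical.

```python
import numpy as np

def edge_coordinates(binary_map):
    """Return (y, x) rows for True pixels that touch a False pixel (4-neighbourhood)."""
    binary_map = np.asarray(binary_map, dtype=bool)
    padded = np.pad(binary_map, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return np.argwhere(binary_map & ~interior)

def hough_circle(binary_map, radii):
    """Return the (cx, cy, r) with the most edge-pixel votes, or None if no edges."""
    binary_map = np.asarray(binary_map, dtype=bool)
    h, w = binary_map.shape
    edges = edge_coordinates(binary_map)
    ys, xs = np.mgrid[:h, :w]
    # Distance from every candidate centre to every edge pixel: shape (E, h, w).
    dist = np.hypot(xs[None] - edges[:, 1, None, None],
                    ys[None] - edges[:, 0, None, None])
    best_votes, best_circle = 0, None
    for r in radii:
        # Accumulator: number of edge pixels voting for each centre at radius r.
        acc = np.count_nonzero((dist > r - 1) & (dist <= r), axis=0)
        votes = int(acc.max())
        if votes > best_votes:
            y, x = np.unravel_index(int(acc.argmax()), acc.shape)
            best_votes, best_circle = votes, (int(x), int(y), int(r))
    return best_circle
```

The full distance array costs memory proportional to the number of edge pixels times the map size, which is acceptable for small maps but would need a sparser accumulator for large ones. Extracting several circles, or ellipses, would require extending the search accordingly.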

One aspect of the control program according to the present invention is a control program for controlling an image detection device that detects a detection target image in a processing target image, the control program causing the image detection device to execute: (a) a step of generating a map indicating the distribution, over the processing target image, of an accuracy value indicating the likelihood of being the detection target image; (b) a step of binarizing the map using a threshold value to generate a binarized map; (c) a step of extracting circular regions from a partial region of the binarized map generated using the threshold value, the partial region corresponding to the region of the map in which the accuracy value is equal to or greater than the threshold value, or greater than the threshold value; (d) a step of adjusting the threshold value based on a first evaluation value indicating the overlapping area between a circular region extracted from the partial region of the binarized map in step (c) and that partial region, and a second evaluation value indicating the overlapping area between that circular region and another circular region extracted from the partial region in step (c); and (e) a step of specifying the detection target image in the processing target image based on the partial region of the binarized map generated using the threshold value adjusted in step (d).

One aspect of the image detection method according to the present invention is an image detection method for detecting a detection target image in a processing target image, the method comprising: (a) a step of generating a map indicating the distribution, over the processing target image, of an accuracy value indicating the likelihood of being the detection target image; (b) a step of binarizing the map using a threshold value to generate a binarized map; (c) a step of extracting circular regions from a partial region of the binarized map generated using the threshold value, the partial region corresponding to the region of the map in which the accuracy value is equal to or greater than the threshold value, or greater than the threshold value; (d) a step of adjusting the threshold value based on a first evaluation value indicating the overlapping area between a circular region extracted from the partial region of the binarized map in step (c) and that partial region, and a second evaluation value indicating the overlapping area between that circular region and another circular region extracted from the partial region in step (c); and (e) a step of specifying the detection target image in the processing target image based on the partial region of the binarized map generated using the threshold value adjusted in step (d).

According to the present invention, the detection accuracy for a detection target image can be improved.

A diagram showing the configuration of the image detection apparatus.
A diagram showing the configuration of the functional blocks of the image detection apparatus.
A diagram showing the configuration of the detection unit.
A diagram for explaining the operation of the detection unit.
A diagram for explaining the operation of the detection unit.
A diagram for explaining the operation of the detection unit.
A diagram for explaining the operation of the detection unit.
A diagram for explaining the operation of the detection unit.
A diagram for explaining the operation of the detection unit.
A diagram showing detection result frames superimposed on the processing target image.
A diagram for explaining the method of generating the output value map.
A diagram for explaining the method of generating the output value map.
A diagram showing an example of the output value map.
A diagram schematically showing an example of the processing target image.
A diagram showing an example of the output value map.
A diagram showing an example of the binarized map.
A diagram showing the circumscribed rectangle set on the high accuracy region of the binarized map.
A diagram showing the circumscribed rectangle of the binarized map set on the processing target image.
A diagram showing an example of the binarized map.
A diagram showing the circumscribed rectangle set on the high accuracy region of the binarized map.
A diagram showing the circumscribed rectangle of the binarized map set on the processing target image.
A diagram showing an example of the binarized map.
A diagram showing the circumscribed rectangle set on the high accuracy region of the binarized map.
A diagram showing the circumscribed rectangle of the binarized map set on the processing target image.
A flowchart showing the operation of the image detection apparatus.
A diagram showing an example of the edge map.
A diagram showing circular regions extracted from the binarized map set on that binarized map.
A diagram showing circular regions extracted from the binarized map set on that binarized map.

FIG. 1 is a diagram showing the configuration of an image detection apparatus 1 according to an embodiment. The image detection apparatus 1 according to the present embodiment detects a detection target image in the image represented by input image data. The image detection apparatus 1 is used in, for example, a surveillance camera system or a digital camera system. In the present embodiment, the detection target image is, for example, an image of a human face. Hereinafter, "face image" by itself means an image of a human face. The image on which detection of the detection target image is performed is called the "processing target image".

The image detection apparatus 1 is a kind of computer and, as shown in FIG. 1, includes a CPU (Central Processing Unit) 10 and a storage unit 11. The storage unit 11 is composed of non-transitory recording media readable by the image detection apparatus 1 (CPU 10), such as a ROM (Read Only Memory) and a RAM (Random Access Memory). The storage unit 11 stores, among other things, a control program 12 for controlling the operation of the image detection apparatus 1. The storage unit 11 may include computer-readable non-transitory recording media other than the ROM and the RAM, for example a hard disk drive, an SSD (Solid State Drive), or a USB (Universal Serial Bus) memory. The various functions of the image detection apparatus 1 are realized by the CPU 10 executing the control program 12 in the storage unit 11. When the control program 12 is executed, a plurality of functional blocks as shown in FIG. 2 are formed in the image detection apparatus 1.

As shown in FIG. 2, the image detection apparatus 1 includes, as functional blocks, an image input unit 2, a detection unit 3, a map generation unit 4, a binarization processing unit 5, a detection target image specifying unit 6, a threshold adjustment unit 7, and a circular region extraction unit 8. The various functions of the image detection apparatus 1 may be realized by hardware circuits instead of functional blocks.

Image data representing each of a plurality of images captured in sequence by an imaging unit (camera) of a surveillance camera system or the like is input in sequence to the image input unit 2. The image input unit 2 outputs image data representing the processing target image. The image input unit 2 may treat every image obtained by the imaging unit as a processing target image, or may treat only an image obtained every few seconds as a processing target image. The imaging unit captures, for example, L images (L ≥ 2) per second; that is, the imaging frame rate of the imaging unit is L fps (frames per second). In an image captured by the imaging unit, M pixels (M ≥ 2) are arranged in the row direction and N pixels (N ≥ 2) are arranged in the column direction. The resolution of the captured image is, for example, VGA (Video Graphics Array), with M = 640 and N = 480.

Hereinafter, in an image captured by the imaging unit, the size of a region in which m pixels (m ≥ 1) are arranged in the row direction and n pixels (n ≥ 1) are arranged in the column direction is written mp × np (p stands for pixel). In addition, among a plurality of values arranged in a matrix, the value located in the m-th row and the n-th column, counted from the upper left, may be called the m × n-th value.

The detection unit 3 uses the image data output from the image input unit 2 to detect face images in the processing target image. Based on the detection results of the detection unit 3, the map generation unit 4 generates an output value map indicating the distribution, over the processing target image, of a detection accuracy value indicating the likelihood of being a face image.

The binarization processing unit 5 binarizes the output value map generated by the map generation unit 4 using a threshold value to generate a binarized map. The circular region extraction unit 8 extracts circular regions from the region of the binarized map, generated using that threshold value, which corresponds to the region of the output value map where the detection accuracy value is equal to or greater than the threshold value, or greater than the threshold value.

Here, "circle" covers not only a perfect circle but also an ellipse and the like. Likewise, "circular region" covers not only perfectly circular regions but also elliptical regions and the like.

The region of the binarized map, generated using a given threshold value, that corresponds to the region of the output value map where the detection accuracy value is equal to or greater than the threshold value, or greater than the threshold value, is called the "high accuracy region". The circular region extraction unit 8 extracts circular regions from the high accuracy region of the binarized map. The region of the binarized map that corresponds to the region of the output value map where the detection accuracy value is less than the threshold value, or equal to or less than the threshold value, is called the "low accuracy region". When the high accuracy region corresponds to the region of the output value map where the detection accuracy value is equal to or greater than the threshold value, the low accuracy region corresponds to the region where the detection accuracy value is less than the threshold value. Conversely, when the high accuracy region corresponds to the region where the detection accuracy value is greater than the threshold value, the low accuracy region corresponds to the region where the detection accuracy value is equal to or less than the threshold value.
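The two complementary conventions for the high and low accuracy regions can be sketched in Python as follows. This is only an illustration: the function name `binarize` and the `inclusive` flag are hypothetical, and the output value map is assumed to be a 2-D array of real-valued detection accuracy values.

```python
import numpy as np

def binarize(output_value_map, threshold, inclusive=True):
    """Return a boolean map: True = high accuracy region, False = low accuracy region.

    inclusive=True  : high accuracy means value >= threshold (low: value < threshold)
    inclusive=False : high accuracy means value >  threshold (low: value <= threshold)
    """
    values = np.asarray(output_value_map, dtype=float)
    return values >= threshold if inclusive else values > threshold
```

Under either convention the low accuracy region is simply the complement of the high accuracy region, so a single boolean map represents both.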

The threshold adjustment unit 7 adjusts the threshold value used for binarization in the binarization processing unit 5, based on the binarized map generated by the binarization processing unit 5 and the circular regions extracted from the high accuracy region of that binarized map by the circular region extraction unit 8. The detection target image specifying unit 6 specifies the detection target image in the processing target image based on the binarized map that the binarization processing unit 5 generates using the threshold value adjusted by the threshold adjustment unit 7. In this way, the image detection apparatus 1 detects face images in the processing target image.

Next, the operation of each block of the image detection apparatus 1 is described in detail.

<Detection process>
FIG. 3 is a diagram showing the configuration of the detection unit 3. As shown in FIG. 3, the detection unit 3 includes a feature amount extraction unit 30 and a discriminator 31. Using a detection frame, the detection unit 3 performs a detection process that detects, as a detection result region, a region of the processing target image that is highly likely to be a face image of the same size as the detection frame. Hereinafter, "detection process" by itself means this detection process in the detection unit 3. To detect face images of various sizes in the processing target image, the detection unit 3 uses a plurality of types of detection frames of different sizes. The detection unit 3 uses, for example, 30 types of detection frames. Each detection frame is, for example, a square.

In the present embodiment, as described later, the feature amount extraction unit 30 extracts feature amounts from an image. The image from which the feature amount extraction unit 30 extracts feature amounts must be an image of a reference size (normalized size).

In the present embodiment, the plurality of types of detection frames of different sizes include a detection frame of the same size as the reference size and detection frames of sizes different from the reference size. Hereinafter, the detection frame of the same size as the reference size is called the "reference detection frame", and a detection frame of a size different from the reference size is called a "non-reference detection frame". In the present embodiment, the smallest of the detection frames is the reference detection frame, so every non-reference detection frame is larger than the reference size. The size of the reference detection frame is, for example, 16p × 16p. The detection frames also include, for example, a non-reference detection frame of size 18p × 18p and a non-reference detection frame of size 20p × 20p.

In the present embodiment, when the detection unit 3 performs the detection process on the processing target image using the reference detection frame, it moves the reference detection frame over the processing target image, performs face image detection on the image within the reference detection frame, and determines whether that image is highly likely to be a face image. The detection unit 3 then takes each region of the processing target image determined to be highly likely to be a face image (the image within the reference detection frame) as a detection result region.

When the detection unit 3 performs the detection process on the processing target image using a non-reference detection frame, it resizes the non-reference detection frame so that its size matches the reference size, and resizes the processing target image in accordance with the resizing of the non-reference detection frame. The detection unit 3 then moves the resized non-reference detection frame over the resized processing target image, performs face image detection on the image within the frame, and determines whether that image is highly likely to be a face image. Based on each region of the resized processing target image determined to be highly likely to be a face image (the image within the resized non-reference detection frame), the detection unit 3 identifies the region that is highly likely to be a face image in the processing target image of the original, unresized size, and takes that region as a detection result region.
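The resizing step can be illustrated with a short Python sketch. It assumes, beyond what the text states, that the frame and the image are scaled by the same factor (reference size divided by frame size) and that resized dimensions are rounded to the nearest pixel; `resize_plan` is a hypothetical name, and 16 is the example reference size given in the text.

```python
REFERENCE_SIZE = 16  # side length of the reference detection frame, in pixels (example value)

def resize_plan(image_width, image_height, frame_size, reference_size=REFERENCE_SIZE):
    """Return (scale, resized_width, resized_height) for a non-reference detection frame.

    The frame shrinks from frame_size to reference_size, and the processing
    target image is assumed to be scaled by the same factor.
    """
    scale = reference_size / frame_size
    # Rounding to the nearest pixel is an assumption of this sketch.
    return scale, round(image_width * scale), round(image_height * scale)
```

For a VGA image and a 20p × 20p non-reference detection frame, `resize_plan(640, 480, 20)` gives a scale of 0.8 and a 512 × 384 resized image, over which the 16p × 16p resized frame is then moved.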

Hereinafter, the resized processing target image used when the detection process is performed with a non-reference detection frame is called the "resized image", and the resized non-reference detection frame used in that detection process is called the "resized detection frame".

Thus, in the present embodiment, the operation of the detection unit 3 when it performs the detection process on the processing target image using the reference detection frame differs from its operation when it uses a non-reference detection frame. The operation of the detection unit 3 is described in detail below.

In the detection unit 3, when the reference detection frame is used for the detection process, the feature amount extraction unit 30 sets the reference detection frame on the processing target image and extracts a plurality of feature amounts from the image within the reference detection frame. When a non-reference detection frame is used for the detection process, the feature amount extraction unit 30 sets the resized detection frame, obtained by resizing the non-reference detection frame, on the resized image, obtained by resizing the processing target image, and extracts a plurality of feature amounts from the image within the resized detection frame. Hereinafter, the image within the reference detection frame and the image within the resized detection frame, from which feature amounts are extracted, may be collectively called the "in-frame image".

Since the size of the reference detection frame matches the reference size, the image within the reference detection frame in the processing target image has the reference size. Likewise, since the size of the resized detection frame matches the reference size, the image within the resized detection frame in the resized image has the reference size. The feature amount extraction unit 30 can therefore always extract feature amounts from an image of the reference size. The feature amount extraction unit 30 extracts from the in-frame image feature amounts such as Haar-like feature amounts or LBP (Local Binary Pattern) feature amounts.

The discriminator 31 calculates a detection accuracy value indicating the likelihood that the in-frame image is a face image, based on a feature vector composed of the plurality of feature amounts extracted from the in-frame image by the feature amount extraction unit 30 and a weight vector composed of a plurality of weight coefficients generated from learning samples (sample images for learning). Specifically, the discriminator 31 computes the inner product of the feature vector for the in-frame image and the weight vector, and takes the real value obtained by adding a predetermined bias value to that inner product as the detection accuracy value. The detection accuracy value calculated by the discriminator 31 indicates how face-like the image within the reference detection frame or the resized detection frame is. The discriminator 31 uses, for example, an SVM (Support Vector Machine) or AdaBoost.

If the calculated detection accuracy value is equal to or greater than a threshold value, the discriminator 31 determines that the in-frame image is highly likely to be a face image. That is, when the reference detection frame is used, the discriminator 31 determines that the image within the reference detection frame in the processing target image is a region highly likely to be a face image of the same size as the reference detection frame. When a non-reference detection frame is used, the discriminator 31 determines that the image within the resized detection frame in the resized image is a region highly likely to be a face image of the same size as the resized detection frame.

If the calculated detection accuracy value is less than the threshold value, the discriminator 31 determines that the in-frame image is highly likely not to be a face image. That is, when the reference detection frame is used, the discriminator 31 determines that the image within the reference detection frame in the processing target image is not a region highly likely to be a face image of the same size as the reference detection frame. When a non-reference detection frame is used, the discriminator 31 determines that the image within the resized detection frame in the resized image is not a region highly likely to be a face image of the same size as the resized detection frame.
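The score computation and the threshold test described above amount to a linear decision function, which can be sketched in Python as follows. The function names `detection_accuracy` and `is_likely_face` are hypothetical, and the weight vector, bias, and threshold would in practice come from training, which this sketch does not cover.

```python
import numpy as np

def detection_accuracy(feature_vector, weight_vector, bias):
    """Detection accuracy value: inner product of weights and features, plus a bias."""
    return float(np.dot(weight_vector, feature_vector) + bias)

def is_likely_face(accuracy_value, threshold):
    """True when the in-frame image is judged highly likely to be a face image."""
    return accuracy_value >= threshold
```

For example, with features `[1.0, 2.0, 3.0]`, weights `[0.5, -1.0, 2.0]`, and a bias of `0.25`, the detection accuracy value is 0.5 − 2.0 + 6.0 + 0.25 = 4.75.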

When the discriminator 31 determines that the image within the reference detection frame in the processing target image is a region highly likely to be a face image of the same size as the reference detection frame, it sets that image as a detection result region and sets the reference detection frame as a detection result frame.

When the discriminator 31 determines that the image within the size change detection frame in the size-changed image is a region highly likely to be a face image of the same size as the size change detection frame, it sets the outer frame of that region as a temporary detection result frame. Then, based on the temporary detection result frame, the discriminator 31 specifies, in the processing target image from which the size-changed image was generated, a region highly likely to be a face image of the same size as the non-reference detection frame. It sets that region as a detection result region and sets the outer frame of the detection result region as the final detection result frame.

<Detection process using the reference detection frame>
Next, a description is given of the series of operations that the detection unit 3 performs when it moves the reference detection frame over the processing target image and determines whether the image within the reference detection frame is highly likely to be a face image. FIGS. 4 to 7 are diagrams for explaining this operation of the detection unit 3. The detection unit 3 detects face images in the image within the reference detection frame while raster-scanning the reference detection frame.

As shown in FIG. 4, the feature amount extraction unit 30 first sets the reference detection frame 100 at the upper left of the processing target image 20 and extracts a plurality of feature amounts from the image within the reference detection frame 100. The discriminator 31 obtains a detection accuracy value for the image within the reference detection frame 100 based on a feature vector composed of the plurality of feature amounts extracted by the feature amount extraction unit 30 and a weight vector composed of a plurality of weight coefficients. If the calculated detection accuracy value is equal to or greater than the threshold value, the discriminator 31 determines that the region within the upper-left reference detection frame 100 in the processing target image 20 is highly likely to be a face image, sets that region as a detection result region, and sets the reference detection frame 100, which is the outer frame of that region, as a detection result frame.

Next, the feature amount extraction unit 30 moves the reference detection frame 100 slightly to the right in the processing target image 20, for example by one pixel or by several pixels. The feature amount extraction unit 30 then extracts a plurality of feature amounts from the image within the moved reference detection frame 100 in the processing target image 20.

After that, the discriminator 31 obtains a detection accuracy value for the image within the moved reference detection frame 100 based on a feature vector composed of the plurality of feature amounts extracted by the feature amount extraction unit 30 and a weight vector composed of a plurality of weight coefficients. If the calculated detection accuracy value is equal to or greater than the threshold value, the discriminator 31 determines that the image within the moved reference detection frame 100 is highly likely to be a face image, sets that image as a detection result region, and sets the moved reference detection frame 100, which is the outer frame of that image, as a detection result frame.

The detection unit 3 then operates in the same manner. When the reference detection frame 100 has moved to the right end of the processing target image 20, as shown in FIG. 5, the detection unit 3 obtains a detection accuracy value for the image within the reference detection frame 100 at the right end. If the obtained detection accuracy value is equal to or greater than the threshold value, the detection unit 3 sets the image within the right-end reference detection frame 100 as a detection result region and sets the right-end reference detection frame 100 as a detection result frame.

Next, as shown in FIG. 6, the feature amount extraction unit 30 moves the reference detection frame 100 to the left end of the processing target image 20 while lowering it slightly, and then extracts a plurality of feature amounts from the image within the reference detection frame 100. In the vertical direction (column direction), the feature amount extraction unit 30 moves the reference detection frame 100 downward by, for example, one pixel or several pixels. After that, the discriminator 31 obtains and outputs a detection accuracy value for the image within the current reference detection frame 100 based on a feature vector composed of the plurality of feature amounts extracted by the feature amount extraction unit 30 and a weight vector composed of a plurality of weight coefficients. If the calculated detection accuracy value is equal to or greater than the threshold value, the discriminator 31 determines that the image within the current reference detection frame 100 is highly likely to be a face image, sets that image as a detection result region, and sets the reference detection frame 100 as a detection result frame.

The detection unit 3 then operates in the same manner. When the reference detection frame 100 has moved to the lower right of the processing target image 20, as shown in FIG. 7, the detection unit 3 obtains a detection accuracy value for the image within the lower-right reference detection frame 100. If the obtained detection accuracy value is equal to or greater than the threshold value, the detection unit 3 sets the image within the lower-right reference detection frame 100 as a detection result region and sets the lower-right reference detection frame 100 as a detection result frame.

In this manner, the detection unit 3 uses the reference detection frame to detect, as a detection result region, a region in the processing target image that is highly likely to be a face image of the same size as the reference detection frame. In other words, the detection unit 3 uses the reference detection frame to specify, in the processing target image, a face image of the same size as the reference detection frame.
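
The raster scan described above can be sketched as a pair of nested loops. This is an illustrative sketch under stated assumptions: the image is a list of pixel rows, `score_fn` stands in for feature extraction followed by the classifier score, and the names `raster_scan`, `frame_size`, and `step` are hypothetical.

```python
def raster_scan(image, frame_size, step, score_fn, threshold):
    """Slide a square frame left to right, then top to bottom.

    Returns (left, top, size, score) for every frame position whose score
    is equal to or greater than the threshold. `score_fn` receives the
    pixels inside the frame and is a placeholder for the real classifier.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    results = []
    # Note: with step > 1 the last row/column of positions may stop short
    # of the image edge; this sketch does not add an extra edge position.
    for top in range(0, height - frame_size + 1, step):
        for left in range(0, width - frame_size + 1, step):
            window = [row[left:left + frame_size]
                      for row in image[top:top + frame_size]]
            score = score_fn(window)
            if score >= threshold:
                results.append((left, top, frame_size, score))
    return results


# Toy example: the "classifier" is just the sum of the pixels in the frame.
toy_image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
hits = raster_scan(toy_image, 2, 1, lambda w: sum(map(sum, w)), 36)
```

Each returned tuple plays the role of a detection result frame together with its detection accuracy value.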

<Detection process using a non-reference detection frame>
When the detection unit 3 performs the detection process using a non-reference detection frame, the feature amount extraction unit 30 resizes the non-reference detection frame so that its size matches the reference size (the size of the reference detection frame). The feature amount extraction unit 30 then resizes the processing target image by the same ratio as that used to resize the non-reference detection frame.

In the present embodiment, the reference size is 16p × 16p. Therefore, when a non-reference detection frame of size Rp × Rp (R > 16) is used, for example, the feature amount extraction unit 30 reduces the non-reference detection frame by multiplying its vertical width and horizontal width each by (16/R), thereby generating a size change detection frame. The feature amount extraction unit 30 also reduces the processing target image by multiplying its vertical width (number of pixels) and horizontal width (number of pixels) each by (16/R), thereby generating a size-changed image. After that, in the same manner as the process described with reference to FIGS. 4 to 7, the detection unit 3 moves the size change detection frame over the size-changed image, extracts feature amounts from the image within the size change detection frame, and determines, based on the feature amounts, whether the image within the size change detection frame is highly likely to be a face image of the same size as the size change detection frame. That is, the detection unit 3 uses the size change detection frame to detect, in the size-changed image, regions highly likely to be face images of the same size as the size change detection frame. Hereinafter, this process is referred to as the "size-changed version detection process".
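
As a worked illustration of the (16/R) scaling described above, the following sketch computes the scale factor and the dimensions of the size-changed image. The function name `resize_for_frame` and the use of rounding to whole pixels are assumptions; the description does not state how fractional pixel counts are handled.

```python
REFERENCE_SIZE = 16  # reference detection frame is 16p x 16p


def resize_for_frame(image_width, image_height, frame_size,
                     reference_size=REFERENCE_SIZE):
    """Return (scale, new_width, new_height) for an R x R detection frame.

    The frame is scaled to the reference size, and the image is scaled by
    the same ratio. Rounding to whole pixels is an assumption.
    """
    scale = reference_size / frame_size
    return scale, round(image_width * scale), round(image_height * scale)


# A 32p x 32p non-reference frame on a 640 x 480 image halves the image.
scale, new_w, new_h = resize_for_frame(640, 480, 32)
```

With R = 32 the scale is 16/32 = 0.5, so a 32p × 32p face in the original image occupies 16p × 16p in the size-changed image and can be examined with the reference-size frame.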

In the size-changed version detection process, the detection unit 3 sets the size change detection frame on the size-changed image. When it determines that the image within the size change detection frame is highly likely to be a face image of the same size as the size change detection frame, it sets the size change detection frame, which is the outer frame of that image, as a temporary detection result frame.

In the detection unit 3, when at least one temporary detection result frame has been obtained for the size-changed image, the discriminator 31 converts the at least one temporary detection result frame into a detection result frame corresponding to the processing target image from which the size-changed image was generated.

Specifically, the discriminator 31 first sets the obtained at least one temporary detection result frame on the size-changed image. FIG. 8 is a diagram showing a state in which temporary detection result frames 130 are set on the size-changed image 120. In the example of FIG. 8, a plurality of temporary detection result frames 130 are set on the size-changed image 120.

Next, as shown in FIG. 9, the discriminator 31 enlarges (resizes) the size-changed image 120 on which the temporary detection result frames 130 are set and returns it to its original size, thereby converting the size-changed image 120 into the processing target image 20. As a result, the temporary detection result frames 130 set on the size-changed image 120 are also enlarged, and each temporary detection result frame 130 is converted into a detection result frame 150 corresponding to the processing target image 20, as shown in FIG. 9. The region within a detection result frame 150 in the processing target image 20 is a detection result region that is highly likely to be a face image of the same size as the non-reference detection frame. In this way, based on the temporary detection result frames 130 obtained by the size-changed version detection process, the detection unit 3 specifies detection result regions in the processing target image that are highly likely to be face images of the same size as the non-reference detection frame.
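
Converting a temporary detection result frame back to the processing target image amounts to multiplying its coordinates by the inverse of the reduction ratio, that is, by (R/16). A minimal sketch, assuming frames are stored as `(left, top, size)` tuples and that coordinates are rounded to whole pixels (both assumptions; `to_original_frame` is a hypothetical name):

```python
def to_original_frame(temp_frame, frame_size, reference_size=16):
    """Map a frame found in the size-changed image back to the original.

    `temp_frame` is (left, top, size) in the size-changed image, and
    `frame_size` is R, the size of the non-reference detection frame.
    """
    scale = frame_size / reference_size  # inverse of the (16 / R) reduction
    left, top, size = temp_frame
    return (round(left * scale), round(top * scale), round(size * scale))


# A 16p frame at (10, 5) in the half-size image maps to a 32p frame at (20, 10).
original = to_original_frame((10, 5, 16), 32)
```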

As described above, when performing the detection process on the processing target image using a non-reference detection frame, the detection unit 3 performs the size-changed version detection process using the non-reference detection frame resized to match the reference size and the processing target image resized in accordance with the resizing of the non-reference detection frame. Thus, even when a detection frame of a size different from the reference size is used, the feature amount extraction unit 30 can extract feature amounts from an image of the reference size. Based on the result of the size-changed version detection process, the detection unit 3 then specifies detection result regions in the processing target image that are highly likely to be face images of the same size as the non-reference detection frame. In this way, the detection unit 3 performs the detection process using the non-reference detection frame.

The detection unit 3 performs the detection process described above using each of a plurality of types of detection frames. As a result, when the processing target image contains a face image, detection result regions (regions highly likely to be face images) and detection result frames (outer frames of regions highly likely to be face images) are obtained, together with the detection accuracy value corresponding to each detection result frame. The detection accuracy value corresponding to a detection result frame obtained for the processing target image indicates the likelihood that the image within that detection result frame in the processing target image is a face image.

FIG. 10 is a diagram showing the detection result frames 150 obtained for the processing target image 20 superimposed on the processing target image 20. As shown in FIG. 10, performing the detection process with a plurality of types of detection frames of mutually different sizes yields detection result frames 150 of various sizes. This means that face images of various sizes contained in the processing target image 20 have been detected.

<Output value map generation process>
Based on the detection results of the detection unit 3, the map generation unit 4 generates an output value map indicating the distribution, over the processing target image, of the detection accuracy value indicating the likelihood of being a face image (face-image likelihood).

Specifically, the map generation unit 4 considers a map 200 that, like the processing target image, consists of a total of (M × N) values, with M values arranged in the row direction and N values arranged in the column direction. The map generation unit 4 then takes one detection result frame for the processing target image as the target detection result frame and sets, on the map 200, a frame 210 of the same size as the target detection result frame at the same position as the target detection result frame. FIG. 11 is a diagram showing a state in which the frame 210 is set on the map 200.

Next, the map generation unit 4 sets each value outside the frame 210 in the map 200 to "0" and determines each value inside the frame 210 using the detection accuracy value corresponding to the target detection result frame (the detection accuracy value obtained as a result of performing face image detection on the image within the detection frame that became the target detection result frame). If the size of the target detection result frame is, for example, 16p × 16p, the frame 210 contains 16 values in the row direction and 16 values in the column direction, for a total of 256 values. If the size of the target detection result frame is, for example, 20p × 20p, the frame 210 contains 20 values in the row direction and 20 values in the column direction, for a total of 400 values. FIG. 12 is a diagram for explaining a method of determining each value in the frame 210.

The map generation unit 4 sets the value at the center 211 of the frame 210 to the detection accuracy value, obtained by the detection unit 3, that corresponds to the target detection result frame. The map generation unit 4 then sets the other values in the frame 210 so that they decrease gradually from the center 211 toward the outside of the frame 210, following a normal distribution curve whose maximum is the value at the center 211. Each of the values constituting the map 200 is thereby determined, and the map 200 corresponding to the target detection result frame is completed.
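
The construction of one map 200 can be sketched as follows: zeros outside the frame, and inside the frame a bell-shaped falloff whose peak equals the detection accuracy value. The spread of the normal distribution is not given in the description, so `sigma_ratio` is a purely hypothetical parameter, as are the function name `frame_map` and the `(left, top, size)` frame representation. The sketch also assumes the frame lies inside the map.

```python
import math


def frame_map(width, height, frame, score, sigma_ratio=0.25):
    """Build one per-frame map: 0 outside the frame, Gaussian falloff inside.

    `frame` is (left, top, size). The value at the geometric center of the
    frame equals `score`; `sigma_ratio` (spread relative to the frame size)
    is an assumed parameter, not taken from the description.
    """
    left, top, size = frame
    sigma = size * sigma_ratio
    cx = left + (size - 1) / 2.0
    cy = top + (size - 1) / 2.0
    grid = [[0.0] * width for _ in range(height)]
    for y in range(top, min(top + size, height)):
        for x in range(left, min(left + size, width)):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            grid[y][x] = score * math.exp(-d2 / (2.0 * sigma ** 2))
    return grid


# A 3 x 3 frame at (1, 1) in a 5 x 5 map; its center pixel is (2, 2).
single_map = frame_map(5, 5, (1, 1, 3), 2.0)
```

For a frame with an even side length (such as 16p × 16p) the geometric center falls between pixels, so in this sketch no single pixel takes exactly the peak value; how the description's center 211 is chosen in that case is not specified.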

In this manner, the map generation unit 4 generates a plurality of maps 200 respectively corresponding to the plurality of detection result frames for the processing target image. The map generation unit 4 then combines the generated maps 200 to generate the output value map.

Specifically, the map generation unit 4 adds together the (m × n)-th values of the generated maps 200 and uses the resulting sum as the (m × n)-th detection accuracy value of the output value map. The map generation unit 4 obtains each detection accuracy value constituting the output value map in this way, which completes the output value map indicating the distribution of detection accuracy values over the processing target image. In the output value map, as in the processing target image, M detection accuracy values are arranged in the row direction and N detection accuracy values are arranged in the column direction, so that the output value map consists of (M × N) detection accuracy values. By referring to the output value map, a region with a high face-image likelihood in the processing target image can be specified. That is, a face image in the processing target image can be specified by referring to the output value map.
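
The combination step described above is an element-wise sum of the per-frame maps. A minimal sketch, assuming each map is a list of rows of equal dimensions (`combine_maps` is a hypothetical name):

```python
def combine_maps(maps):
    """Add same-position values of all maps to form the output value map."""
    if not maps:
        raise ValueError("at least one map is required")
    height, width = len(maps[0]), len(maps[0][0])
    return [[sum(m[y][x] for m in maps) for x in range(width)]
            for y in range(height)]


# Two 2 x 2 maps; overlapping detections reinforce each other.
output_map = combine_maps([
    [[1, 0], [0, 2]],
    [[3, 1], [0, 0]],
])
```

Because the values are summed, positions covered by many overlapping detection result frames accumulate larger values than positions covered by a single frame.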

FIG. 13 is a diagram showing the output value map for the processing target image 20 superimposed on the processing target image 20. For ease of understanding, FIG. 13 shows the output value map with the magnitude of the detection accuracy value divided into, for example, five levels from a first level to a fifth level. In the output value maps shown in FIG. 13 and in FIG. 15 described later, regions whose detection accuracy value belongs to the fifth level, the largest, are shown with vertical-line hatching, and regions belonging to the fourth level, the second largest, are shown with stippled hatching. Regions belonging to the third level, the third largest, are shown with hatching rising to the right, and regions belonging to the second level, the fourth largest, are shown with hatching rising to the left. Regions belonging to the first level, the smallest, are shown without hatching.

In the output value map shown in FIG. 13, the detection accuracy values are high in the regions corresponding to the face images in the processing target image 20 (the regions at the same positions as the face images). This means that the face images contained in the processing target image 20 have been detected appropriately. In each region of the output value map corresponding to a face image in the processing target image 20, the detection accuracy value is largest at the position corresponding to the vicinity of the center of the face image and decreases toward the outside.

<Binarization process>
The binarization processing unit 5 binarizes the output value map generated by the map generation unit 4 using a threshold value to generate a binarized map. Specifically, in the output value map, the binarization processing unit 5 changes each value in a region where the detection accuracy value is equal to or greater than the threshold value (or greater than the threshold value) to, for example, "1", and changes each value in a region where the detection accuracy value is less than the threshold value (or equal to or less than the threshold value) to, for example, "0". This generates a binarized map composed of a high accuracy region, in which each value is "1" and which corresponds to the region of the output value map where the detection accuracy value is equal to or greater than (or greater than) the threshold value, and a low accuracy region, in which each value is "0" and which corresponds to the region of the output value map where the detection accuracy value is less than (or equal to or less than) the threshold value.
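
The binarization described above can be sketched in a few lines. The `inclusive` flag is a hypothetical parameter that selects between the two comparison variants mentioned in the text ("equal to or greater than" versus "greater than"); the function name `binarize` is likewise an assumption.

```python
def binarize(output_map, threshold, inclusive=True):
    """Return a map of 1s (high accuracy region) and 0s (low accuracy region).

    inclusive=True  -> 1 where value >= threshold
    inclusive=False -> 1 where value >  threshold
    """
    if inclusive:
        return [[1 if v >= threshold else 0 for v in row] for row in output_map]
    return [[1 if v > threshold else 0 for v in row] for row in output_map]


values = [[0.2, 0.5], [0.9, 0.5]]
binary_inclusive = binarize(values, 0.5)
binary_strict = binarize(values, 0.5, inclusive=False)
```

The two variants differ only for values exactly equal to the threshold value.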

FIG. 14 is a diagram schematically showing an example of the processing target image 20. FIG. 15 is a diagram showing the output value map 40 for the processing target image 20 shown in FIG. 14. FIG. 16 is a diagram showing a binarized map 50 generated by binarizing the output value map 40 shown in FIG. 15 using a predetermined threshold value.

As shown in FIG. 15, in the output value map 40, the detection accuracy values are large in the region 40a corresponding to the face image 20a contained in the processing target image 20 and in the region 40b corresponding to the face image 20b contained in the processing target image 20. On the other hand, the detection accuracy values are small in the region 40c corresponding to the face image 20c contained in the processing target image 20.

When the output value map 40 shown in FIG. 15 is binarized using, as the threshold value, for example the boundary value between the second level (hatching rising to the left) and the third level (hatching rising to the right) of the detection accuracy value, the binarized map 50 shown in FIG. 16 is obtained. In FIG. 16, the high accuracy region 51 is shown with diagonal hatching, and the low accuracy region 52 is shown without hatching. In the output value map 40, the detection accuracy values in the region 40c corresponding to the face image 20c are smaller overall than those in the regions 40a and 40b corresponding to the face images 20a and 20b. Consequently, in the high accuracy region 51 of the binarized map 50, the region 51c corresponding to the face image 20c is smaller than the regions 51a and 51b corresponding to the face images 20a and 20b, respectively.

When the threshold value used to generate the binarized map 50 is adjusted appropriately, the high accuracy region 51 of the binarized map 50 contains, as shown in FIG. 16, a plurality of mutually independent (separated) regions 51a to 51c respectively corresponding to the plurality of face images 20a to 20c contained in the processing target image 20. Each of the face images 20a to 20c contained in the processing target image 20 can therefore be specified individually from the regions 51a to 51c. As described later, the threshold value used to generate the binarized map 50 is adjusted appropriately by the threshold value adjustment unit 7.

<Detection target image specifying process>
The detection target image specifying unit 6 specifies face images in the processing target image based on the high accuracy region of the binarized map that the binarization processing unit 5 generates using the threshold value adjusted by the threshold value adjustment unit 7. Hereinafter, the binarized map generated using the threshold value adjusted by the threshold value adjustment unit 7 is specifically referred to as the "specifying binarized map".

In the present embodiment, the detection target image specifying unit 6 specifies each independent region (island region) contained in the high accuracy region of the specifying binarized map. In the example of FIG. 16, each of the regions 51a to 51c is specified as an independent region. For each specified independent region, the detection target image specifying unit 6 then obtains a circumscribed rectangle that circumscribes the independent region. Each independent region contained in the high accuracy region of the specifying binarized map can be specified by labeling the specifying binarized map using, for example, 4-connectivity.
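
The labeling and circumscribed-rectangle step can be sketched with a breadth-first flood fill over 4-connected neighbors. This is one common way to label connected regions and is offered as an illustration; the description only states that labeling with 4-connectivity or the like is used. The function name `bounding_rectangles` and the `(left, top, right, bottom)` tuple layout are assumptions.

```python
from collections import deque


def bounding_rectangles(binary_map):
    """Return (left, top, right, bottom) for each 4-connected region of 1s.

    Coordinates are inclusive pixel indices. Regions that touch only
    diagonally are treated as separate independent regions.
    """
    height = len(binary_map)
    width = len(binary_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    rectangles = []
    for y0 in range(height):
        for x0 in range(width):
            if binary_map[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Flood-fill one independent region starting at (x0, y0).
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left = right = x0
            top = bottom = y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary_map[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            rectangles.append((left, top, right, bottom))
    return rectangles


rects = bounding_rectangles([
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
])
```

Each returned rectangle corresponds to one circumscribed rectangle that would be set on the processing target image.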

FIG. 17 is a diagram showing the circumscribed rectangles obtained when the detection target image specifying unit 6 uses, for example, the binarized map 50 shown in FIG. 16 as the specifying binarized map and obtains circumscribed rectangles for the independent regions 51a to 51c contained in the high accuracy region 51 of the binarized map 50. The circumscribed rectangles 300a to 300c shown in FIG. 17 are the circumscribed rectangles of the independent regions 51a to 51c, respectively, contained in the high accuracy region 51 of the binarized map 50 shown in FIG. 16.

After obtaining the circumscribed rectangle for each independent region of the high accuracy region of the specifying binarized map, the detection target image specifying unit 6 sets the circumscribed rectangles on the processing target image. FIG. 18 is a diagram showing a state in which the circumscribed rectangles 300a to 300c shown in FIG. 17 are set on the processing target image 20 shown in FIG. 14. For each circumscribed rectangle set on the processing target image, the detection target image specifying unit 6 determines that the image within the circumscribed rectangle is one face image. Face images are thereby specified in the processing target image 20.

画像検出装置１は、処理対象画像を表示装置に表示する際には、図１８に示されるように、検出対象画像特定部６で求められた外接矩形を処理対象画像に重ねて表示する。 When the processing target image is displayed on the display device, the image detection device 1 displays the circumscribed rectangle obtained by the detection target image specifying unit 6 so as to overlap the processing target image, as shown in FIG.

The image detection apparatus 1 may also compare a face image registered in advance with a face image specified in the processing target image (the image inside a circumscribed rectangle) and determine whether the two match. If they do not match, the image detection apparatus 1 may apply mosaic processing to that face image in the processing target image and then display the processing target image on the display device. Thus, when the image detection apparatus 1 according to the present embodiment is used in a surveillance camera system, even if the surveillance camera captures the face image of a neighbor, that face image can be made unrecognizable. In other words, a privacy mask can be realized.

<Threshold adjustment processing>
If the threshold that the binarization processing unit 5 uses to binarize the output value map is not set appropriately, the image detection apparatus 1 may fail to detect face images correctly from the processing target image. This point is described below.

FIG. 19 shows the binarized map 50 obtained by binarizing the output value map 40 shown in FIG. 15 with a threshold smaller than the threshold used to generate the binarized map 50 shown in FIG. 16.

When the threshold used to binarize the output value map 40 is small, regions of the output value map 40 whose detection accuracy values are not very large also become part of the high-accuracy region 51. Consequently, as shown in FIG. 19, the regions 51a and 51b corresponding to the face images 20a and 20b, which are close to each other, may join into a single independent region in the high-accuracy region 51. In this case, when circumscribed rectangles are obtained for the independent regions included in the high-accuracy region 51 of the binarized map 50 shown in FIG. 19, a circumscribed rectangle 300d circumscribing the independent region made up of the regions 51a and 51b and a circumscribed rectangle 300c circumscribing the region 51c are generated, as shown in FIG. 20.

When the circumscribed rectangles 300c and 300d are set on the processing target image 20, a single circumscribed rectangle 300d is set for the two face images 20a and 20b, and a single circumscribed rectangle 300c is set for the face image 20c, as shown in FIG. 21. Because the detection target image specifying unit 6 treats the image inside one circumscribed rectangle in the processing target image as one face image, it can detect the face image 20c from the processing target image 20 appropriately, but the face images 20a and 20b are specified as a single face image, and it becomes difficult to detect each of the face images 20a and 20b individually.

FIG. 22 shows the binarized map 50 obtained by binarizing the output value map 40 shown in FIG. 15 with a threshold larger than the threshold used to generate the binarized map 50 shown in FIG. 16.

When the threshold used to binarize the output value map 40 is large, regions of the output value map 40 whose detection accuracy values are not very large do not become part of the high-accuracy region 51. Consequently, as shown in FIG. 22, for the face image 20c, whose corresponding region in the output value map 40 has small detection accuracy values, the region corresponding to the face image 20c may not be included in the high-accuracy region 51. In this case, when circumscribed rectangles are obtained for the independent regions included in the high-accuracy region 51 of the binarized map 50 shown in FIG. 22, a circumscribed rectangle 300a circumscribing the region 51a and a circumscribed rectangle 300b circumscribing the region 51b are generated, as shown in FIG. 23.

When the circumscribed rectangles 300a and 300b are set on the processing target image 20, the circumscribed rectangles 300a and 300b are set for the face images 20a and 20b, respectively, but no circumscribed rectangle is set for the face image 20c, as shown in FIG. 24. Therefore, the face images 20a and 20b can be detected, but it becomes difficult to detect the face image 20c.

このように、２値化マップの生成で使用されるしきい値が小さい場合には、近い距離にある複数の顔画像を適切に検出することが困難となる。 As described above, when the threshold value used for generating the binarized map is small, it is difficult to appropriately detect a plurality of face images at close distances.

一方で、２値化マップの生成で使用されるしきい値が大きい場合には、出力値マップでの対応する領域の検出確度値が小さい顔画像を適切に検出することが困難となる。 On the other hand, when the threshold value used in generating the binarized map is large, it is difficult to appropriately detect a face image having a small detection accuracy value of the corresponding region in the output value map.
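The two failure modes above can be reproduced on a toy example. The following sketch assumes a single row of detection accuracy values (a one-row map is used only for brevity) and counts the separate high-accuracy runs at three thresholds; the values, the names `binarize_row` and `count_runs`, and the use of "greater than or equal to" for the comparison are all illustrative assumptions.

```python
def binarize_row(values, threshold):
    """Return 1 where the value is at or above the threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]

def count_runs(bits):
    """Count maximal runs of consecutive 1s (independent regions in one row)."""
    runs, previous = 0, 0
    for bit in bits:
        if bit == 1 and previous == 0:
            runs += 1
        previous = bit
    return runs

# Hypothetical accuracy values: two strong, nearby peaks and one weak peak.
accuracy_row = [0, 6, 9, 6, 4, 6, 9, 6, 0, 3, 5, 3, 0]

regions_by_threshold = {
    t: count_runs(binarize_row(accuracy_row, t)) for t in (4, 5, 7)
}
# Threshold 4: the two nearby peaks merge into one region (2 regions in total).
# Threshold 5: all three peaks are separate (3 regions).
# Threshold 7: the weak peak disappears (2 regions).
```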

Therefore, in the present embodiment, the threshold adjustment unit 7 appropriately adjusts the threshold used to generate the binarized map so that the detection target image specifying unit 6 can specify, in the processing target image, a face image whose corresponding region in the output value map has small detection accuracy values, and can also specify each of a plurality of closely spaced face images individually. The operation of the image detection apparatus 1 when the threshold adjustment unit 7 adjusts the threshold is described in detail below.

FIG. 25 is a flowchart showing the threshold adjustment process in the image detection apparatus 1. The threshold adjustment process shown in FIG. 25 is executed, using the output value map, once the map generation unit 4 has generated the output value map for the processing target image. In the threshold adjustment process according to the present embodiment, the image detection apparatus 1 varies the threshold over a plurality of levels and generates a binarized map at each threshold. The image detection apparatus 1 then determines the appropriate threshold to be used finally, based on the plurality of generated binarized maps. In the present embodiment, the threshold is varied over, for example, 5 to 10 levels. The threshold adjustment process therefore generates 5 to 10 binarized maps, each produced with a different threshold.

In the threshold adjustment process, as shown in FIG. 25, first in step s1 the threshold adjustment unit 7 provisionally sets, in the binarization processing unit 5, the threshold to be used to generate the binarized map. Here, for example, the threshold is set to the minimum value of its range of variation.

次にステップｓ２において、２値化処理部５は、ステップｓ１で仮設定されたしきい値を用いてマップ生成部４で生成された出力値マップを２値化し、２値化マップを生成する。 Next, in step s2, the binarization processing unit 5 binarizes the output value map generated by the map generation unit 4 using the threshold value temporarily set in step s1, and generates a binarized map. .

次にステップｓ３において、円形領域抽出部８は、ステップｓ２で生成された２値化マップの高確度領域から円形領域を抽出する。円形領域の抽出方法については後で詳細に説明する。 Next, in step s3, the circular area extraction unit 8 extracts a circular area from the high-accuracy area of the binarized map generated in step s2. A method for extracting the circular area will be described in detail later.

次にステップｓ４において、しきい値調整部７は、ステップｓ２で生成された２値化マップと、ステップｓ３で抽出された円形領域とに基づいて、ステップｓ１で仮設定されたしきい値についての判定用評価値を算出する。判定用評価値とは、しきい値の適切さを示す値である。判定用評価値の算出方法については後で詳細に説明する。 Next, in step s4, the threshold adjustment unit 7 determines the threshold temporarily set in step s1 based on the binarized map generated in step s2 and the circular area extracted in step s3. The evaluation value for determination is calculated. The evaluation value for determination is a value indicating the appropriateness of the threshold value. A method for calculating the evaluation value for determination will be described in detail later.

次にステップｓ５において、しきい値調整部７は、しきい値を所定範囲（定められた複数の段階）変化させたか判断する。ステップｓ５において、しきい値が所定範囲変化させられていないと判断されると、上述のステップｓ１が実行されて、新たなしきい値が２値化処理部５に仮設定される。ここでは、１段階だけ増加したしきい値が仮設定される。その後、ステップｓ２〜ステップｓ４が実行されて、１段階だけ増加したしきい値についての判定用評価値が算出される。以後、画像検出装置１は同様に動作する。 Next, in step s5, the threshold adjustment unit 7 determines whether the threshold has been changed by a predetermined range (a plurality of predetermined stages). If it is determined in step s5 that the threshold value has not been changed within the predetermined range, the above-described step s1 is executed, and a new threshold value is temporarily set in the binarization processing unit 5. Here, a threshold value increased by one step is temporarily set. Thereafter, Steps s2 to s4 are executed, and an evaluation value for determination is calculated for the threshold value increased by one step. Thereafter, the image detection apparatus 1 operates in the same manner.

If it is determined in step s5 that the threshold has been varied over the predetermined range, then in step s6 the threshold adjustment unit 7 determines an appropriate threshold from among the multiple threshold levels, based on the evaluation values for determination calculated in the threshold adjustment process, one for each threshold level. Specifically, the threshold adjustment unit 7 selects, as the appropriate threshold, the threshold level that corresponds to the maximum of the evaluation values for determination. The threshold adjustment process then ends.

Once the appropriate threshold has been determined in step s6, the binarization processing unit 5 binarizes the output value map using that appropriate threshold, that is, the threshold adjusted by the threshold adjustment unit 7, and generates the specifying binarization map. The detection target image specifying unit 6 then specifies face images in the processing target image, as described above, based on the high-accuracy region of the specifying binarization map generated by the binarization processing unit 5.
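The flow of steps s1 to s6 can be sketched as a simple sweep. In the sketch below, `score_fn` stands in for steps s3 and s4 (circular region extraction and calculation of the evaluation value for determination) and is a hypothetical callable; the toy score used in the demonstration merely counts separate runs of 1s and is not the evaluation value defined in this description. Keeping the lowest threshold when scores tie is also an assumption, since the text does not address ties.

```python
def binarize(output_map, threshold):
    """Step s2: 1 where the accuracy value is at or above the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in output_map]

def adjust_threshold(output_map, candidate_thresholds, score_fn):
    """Try each candidate threshold in turn and keep the best-scoring one."""
    best_threshold, best_score = None, None
    for threshold in candidate_thresholds:            # steps s1 and s5
        binary_map = binarize(output_map, threshold)  # step s2
        score = score_fn(binary_map)                  # steps s3 and s4 (stand-in)
        if best_score is None or score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold                             # step s6

def toy_score(binary_map):
    """Hypothetical stand-in score: number of separate runs of 1s per row."""
    runs = 0
    for row in binary_map:
        previous = 0
        for bit in row:
            if bit == 1 and previous == 0:
                runs += 1
            previous = bit
    return runs

# Hypothetical one-row output value map and three threshold levels.
toy_output_map = [[0, 6, 9, 6, 4, 6, 9, 6, 0, 3, 5, 3, 0]]
chosen_threshold = adjust_threshold(toy_output_map, [4, 5, 7], toy_score)
```

Passing the scoring step in as a function keeps the sweep itself independent of how the evaluation value is computed.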

<Circular region extraction processing>
Next, the circular region extraction process in step s3 is described in detail. In the present embodiment, the circular region extraction unit 8 detects the edges of the binarized map generated in step s2 (the boundary between the high-accuracy region and the low-accuracy region) and, by a Hough transform using the coordinates of the detected edges, extracts a circular region, for example a regular (perfectly round) circular region, from the high-accuracy region of that binarized map.

In step s3, the circular region extraction unit 8 first detects the edges of the binarized map and generates an edge map indicating those edges. FIG. 26 shows an edge map 60 indicating the edges 510 of the binarized map 50 shown in FIG. 19. The edges of the binarized map can be detected using, for example, the Canny method. Hereinafter, "edge" by itself means an edge of the binarized map.

In the edge map, as in the processing target image, the output value map, and the binarized map, M values are arranged in the row direction and N values are arranged in the column direction. The edge map therefore consists of a total of (M × N) values. In the edge map, each value indicating an edge, that is, each value at the same position as an edge in the binarized map, is for example "1", and every other value is for example "0".

Having generated the edge map, the circular region extraction unit 8 obtains the coordinates of each value in the edge map that indicates an edge. Because the coordinates of each edge-indicating value in the edge map are the coordinates of the corresponding value constituting an edge in the binarized map, the circular region extraction unit 8 in effect obtains, from the edge map, the coordinates of each value constituting an edge in the binarized map.

In the present embodiment, an xy plane is defined on the edge map and the binarized map, with the upper-left corner of each map as the origin, the row direction as the x-axis direction, and the column direction as the y-axis direction. The circular region extraction unit 8 obtains the xy coordinates on this xy plane of each value in the edge map that indicates an edge. The xy coordinates on the xy plane are thereby obtained for each value constituting an edge in the binarized map. Hereinafter, these xy coordinates are referred to as "edge coordinates".
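The construction of the edge map and the edge coordinates can be sketched as follows. Note that this sketch does not implement the Canny method mentioned above; as a simpler stand-in it marks a high-accuracy value as an edge when one of its 4-connected neighbours is low-accuracy or lies outside the map. The function names are hypothetical.

```python
def make_edge_map(binary_map):
    """Return a map of the same size with 1 at boundary values, 0 elsewhere.

    Stand-in for an edge detector: a value of 1 is treated as an edge when
    any 4-connected neighbour is 0 or falls outside the map.
    """
    rows, cols = len(binary_map), len(binary_map[0])
    edges = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if binary_map[y][x] != 1:
                continue
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                outside = not (0 <= nx < cols and 0 <= ny < rows)
                if outside or binary_map[ny][nx] == 0:
                    edges[y][x] = 1
                    break
    return edges

def edge_coordinates(edge_map):
    """List (x, y) for every edge value; origin at the upper-left corner."""
    return [(x, y)
            for y, row in enumerate(edge_map)
            for x, value in enumerate(row)
            if value == 1]

# Hypothetical 5 x 5 binarized map with a 3 x 3 high-accuracy block.
demo_binary = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
demo_edges = make_edge_map(demo_binary)
demo_coords = edge_coordinates(demo_edges)
```

Only the centre value of the block is surrounded by high-accuracy values on all four sides, so it is the one block value not reported as an edge.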

次に円形領域抽出部８は、エッジについて求めた複数のエッジ座標を用いたハフ変換により２値化マップの高確度領域での円形領域、例えば正円形領域を特定する。以下にハフ変換を用いた正円形領域の特定方法について説明する。説明の対象となるエッジ座標を対象エッジ座標と呼ぶ。 Next, the circular area extraction unit 8 specifies a circular area, for example, a regular circular area, in the high-accuracy area of the binarized map by Hough transform using a plurality of edge coordinates obtained for the edge. Hereinafter, a method for identifying a regular circular area using the Hough transform will be described. The edge coordinates to be described are called target edge coordinates.

正円形領域は、中心のｘ座標Ｃｘ、中心のｙ座標Ｃｙ及び半径ｒの３つのパラメータで表現することができる。ハフ変換では、この３つのパラメータをそれぞれ示す３次元の軸で表現されるハフ空間が使用される。以後、正円形領域を表現する３つのパラメータをまとめて「円表現パラメータ群」と呼ぶことがある。 A regular circular region can be expressed by three parameters: a center x coordinate Cx, a center y coordinate Cy, and a radius r. In the Hough transform, a Hough space represented by a three-dimensional axis indicating each of these three parameters is used. Hereinafter, the three parameters expressing the regular circular area may be collectively referred to as a “circle expression parameter group”.

The circular region extraction unit 8 performs voting for each of the obtained edge coordinates. In the voting for the target edge coordinates, the circular region extraction unit 8 first considers, on the xy plane defined on the binarized map, a plurality of mutually different regular circular regions on whose circumference the value at the target edge coordinates lies. Then, for each of these regular circular regions, the circular region extraction unit 8 casts a vote for the three-dimensional coordinates in the Hough space that represent the three parameters (the circle expression parameter group) expressing that regular circular region.

For each of the mutually different regular circular regions on the xy plane defined on the binarized map on whose circumference the value at the target edge coordinates lies, the circular region extraction unit 8 obtains the circle expression parameter group expressing that regular circular region using the following equation (1).

r^2 = (x - Cx)^2 + (y - Cy)^2 ... (1)
Here, x and y in equation (1) are the x coordinate and the y coordinate of the target edge coordinates, respectively. The circular region extraction unit 8 varies each of Cx and Cy in equation (1) over a plurality of values and obtains the r corresponding to each pair of Cx and Cy. A plurality of sets of Cx, Cy, and r is thereby obtained. Since one set of Cx, Cy, and r is a circle expression parameter group expressing one regular circular region on whose circumference the value at the target edge coordinates lies, obtaining a plurality of such sets yields a circle expression parameter group for each of a plurality of mutually different regular circular regions on whose circumference that value lies. For example, if each of Cx and Cy is varied over 100 values, 10,000 sets of Cx, Cy, and r are obtained, giving circle expression parameter groups for 10,000 mutually different regular circular regions on whose circumference the value at the target edge coordinates lies. For each of the obtained circle expression parameter groups (for example, 10,000 of them), the circular region extraction unit 8 casts a vote for the three-dimensional coordinates in the Hough space that represent the three parameters constituting that group.
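As a small illustration of equation (1), the sketch below enumerates the circle expression parameter groups for a single target edge coordinate. The function name and the choice of integer candidate centers 0 to 99 are assumptions made for the example; the description only says that Cx and Cy are each varied over a plurality of values.

```python
import math

def circle_parameter_groups(x, y, cx_values, cy_values):
    """Return (Cx, Cy, r) for every candidate center, with r from equation (1)."""
    return [(cx, cy, math.hypot(x - cx, y - cy))
            for cx in cx_values
            for cy in cy_values]

# Target edge coordinate (3, 4); Cx and Cy are each varied over 100 values.
groups = circle_parameter_groups(3, 4, range(100), range(100))
# 100 x 100 candidate centers give 10,000 parameter groups.
# The first group has center (0, 0), so its radius is sqrt(3^2 + 4^2) = 5.
```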

The circular region extraction unit 8 performs this voting for each of the obtained edge coordinates (the coordinates of the values constituting the edges in the binarized map). The circular region extraction unit 8 then takes the regular circular region expressed by the three parameters of the three-dimensional coordinates that received the most votes in the Hough space as a regular circular region included in the high-accuracy region of the binarized map. In this way, one regular circular region is extracted from the high-accuracy region of the binarized map.

Having specified one regular circular region in the high-accuracy region of the binarized map, the circular region extraction unit 8 deletes, from the edge coordinates obtained for the edges, the edge coordinates of the values located on the circumference of that regular circular region, and performs the voting again for each of the remaining edge coordinates. The circular region extraction unit 8 then takes the regular circular region expressed by the three parameters of the three-dimensional coordinates that received the most votes in the Hough space as another regular circular region included in the high-accuracy region of the binarized map.

以後、円形領域抽出部８は、同様に動作して、残ったエッジ座標の数が所定のしきい値以下となると、円形領域抽出処理を終了する。 Thereafter, the circular area extraction unit 8 operates in the same manner, and ends the circular area extraction process when the number of remaining edge coordinates is equal to or less than a predetermined threshold value.
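The voting and the repeated extraction described above can be sketched as follows. Several details are assumptions made for the sketch rather than statements of the patent: radii are quantized by rounding to the nearest integer, radii below 1 are ignored, a point counts as lying "on the circumference" when its rounded distance from the center equals the radius, and the names `best_circle` and `extract_circles` are hypothetical.

```python
import math
from collections import Counter

def best_circle(edge_points, cx_values, cy_values):
    """Vote in (Cx, Cy, r) Hough space and return the cell with the most votes."""
    votes = Counter()
    for x, y in edge_points:
        for cx in cx_values:
            for cy in cy_values:
                r = round(math.hypot(x - cx, y - cy))  # equation (1), quantized
                if r >= 1:
                    votes[(cx, cy, r)] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]

def on_circumference(point, circle):
    """True when the point's rounded distance from the center equals r."""
    (x, y), (cx, cy, r) = point, circle
    return round(math.hypot(x - cx, y - cy)) == r

def extract_circles(edge_points, cx_values, cy_values, min_remaining):
    """Extract circles until at most min_remaining edge coordinates are left."""
    remaining = list(edge_points)
    circles = []
    while len(remaining) > min_remaining:
        circle = best_circle(remaining, cx_values, cy_values)
        if circle is None:
            break
        circles.append(circle)
        # Delete the edge coordinates on the extracted circumference, then revote.
        remaining = [p for p in remaining if not on_circumference(p, circle)]
    return circles

# Hypothetical edge coordinates: integer points on two radius-5 circles.
offsets = [(5, 0), (-5, 0), (0, 5), (0, -5), (3, 4), (3, -4),
           (-3, 4), (-3, -4), (4, 3), (4, -3), (-4, 3), (-4, -3)]
demo_points = ([(10 + dx, 10 + dy) for dx, dy in offsets]
               + [(30 + dx, 10 + dy) for dx, dy in offsets])
demo_circles = extract_circles(demo_points, range(41), range(21), min_remaining=0)
```

Every voter for the winning cell is removed by the same rounding rule that produced its vote, so each pass deletes at least one edge coordinate and the loop terminates.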

In this way, in step s3, the circular region extraction unit 8 extracts circular regions from the high-accuracy region of the binarized map generated using the threshold provisionally set in step s1. FIG. 27 shows the circular regions 400 extracted from the binarized map 50 shown in FIG. 19, superimposed on that binarized map 50. In the example of FIG. 27, three circular regions 400 are extracted from the binarized map 50.

なお、円形領域抽出部８は、２値化マップから、正円形領域ではなく、楕円形領域等の他の円形領域を抽出しても良い。楕円形領域については、ハフ変換を用いて抽出することができる。 The circular area extraction unit 8 may extract other circular areas such as an elliptical area instead of a regular circular area from the binarized map. An elliptical region can be extracted using the Hough transform.

<Method of calculating the evaluation value for determination>
Since a human face is circular in shape, the outline of a face image included in the processing target image is also circular. Therefore, by appropriately adjusting the threshold used to generate the binarized map, it becomes more likely that the high-accuracy region of the binarized map contains an independent circular region corresponding to each face image included in the processing target image, as shown in FIG. 16 described above. In this case, the detection target image specifying unit 6 can appropriately and individually specify each face image included in the processing target image based on the high-accuracy region of the binarized map.

On the other hand, if the threshold used to generate the binarized map is too small, then in the high-accuracy region of the binarized map the circular regions corresponding to face images become large, as shown in FIG. 19 described above, and the circular regions corresponding to closely spaced face images become connected, making it more likely that the high-accuracy region of the binarized map contains a single independent region corresponding to a plurality of face images. In this case, it becomes difficult for the detection target image specifying unit 6 to appropriately and individually specify each face image included in the processing target image based on the high-accuracy region of the binarized map.

また、２値化マップの生成で使用されるしきい値が大きすぎると、２値化マップの高確度領域では、上述の図２２に示されるように、顔画像に対応する円形領域が小さくなり、出力値マップでの対応する領域の検出確度値が小さい顔画像（図１４の顔画像２０ｃ）に対応する円形領域が消えてしまう可能性が高くなる。したがって、この場合にも、検出対象画像特定部６は、２値化マップの高確度領域に基づいて、処理対象画像に含まれる各顔画像を個別に適切に特定することが困難となる。 Also, if the threshold value used in the generation of the binarized map is too large, the circular area corresponding to the face image becomes small in the high accuracy area of the binarized map as shown in FIG. There is a high possibility that a circular area corresponding to a face image (face image 20c in FIG. 14) having a small detection accuracy value of the corresponding area in the output value map will disappear. Therefore, in this case as well, it is difficult for the detection target image specifying unit 6 to appropriately specify each face image included in the processing target image based on the high accuracy region of the binarized map.

As described above, if the threshold used to generate the binarized map is too large, the circular region corresponding to a face image is more likely to shrink and disappear from the high-accuracy region of the binarized map. To reduce this likelihood, it is desirable to adjust the threshold so that the high-accuracy region of the binarized map contains circular regions that are as large as possible.

一方で、２値化マップの生成で使用されるしきい値が小さすぎると、２値化マップの高確度領域では、複数の顔画像にそれぞれ対応する複数の円形領域が接続されて、当該複数の顔画像に対応する一つの独立領域が含まれる可能性が高くなる。したがって、この可能性を低減するためには、２値化マップの高確度領域には独立した円形領域ができるだけ含まれるようにしきい値が調整されることが望まれる。 On the other hand, if the threshold value used in the generation of the binarized map is too small, a plurality of circular areas respectively corresponding to a plurality of face images are connected in the high-accuracy area of the binarized map. There is a high possibility that one independent area corresponding to the face image is included. Therefore, in order to reduce this possibility, it is desirable to adjust the threshold value so that the high-accuracy region of the binarization map includes as many independent circular regions as possible.

そこで、本実施の形態では、しきい値調整部７は、２値化マップの生成で使用されるしきい値を判定用評価値に基づいて調整することによって、当該２値化マップの高確度領域には、できるだけ大きな円形領域が含まれつつ、独立した円形領域ができるだけ含まれるようにする。これにより、処理対象画像に含まれる各顔画像を個別に適切に特定することが可能となる。以下に判定用評価値の算出方法について詳細に説明する。 Therefore, in the present embodiment, the threshold adjustment unit 7 adjusts the threshold used in generating the binarized map based on the evaluation value for determination, thereby increasing the accuracy of the binarized map. The region includes as large a circular region as possible, but includes as many independent circular regions as possible. As a result, each face image included in the processing target image can be appropriately specified individually. Hereinafter, a method for calculating the evaluation value for determination will be described in detail.

本実施の形態では、しきい値調整部７は、ステップｓ３で抽出された各円形領域について、以下の式（２）を用いて統合評価値Ｂを求める。 In the present embodiment, the threshold adjustment unit 7 obtains the integrated evaluation value B using the following formula (2) for each circular region extracted in step s3.

B = A1 - A2 - A3 ... (2)
Here, in equation (2), A1 is a first evaluation value indicating the area of overlap between the circular region for which the integrated evaluation value B is being obtained with equation (2) (hereinafter referred to as the "target circular region") and the high-accuracy region of the binarized map generated in step s2. A2 is a second evaluation value indicating the area of overlap between the target circular region and the other circular regions extracted in step s3. A3 is a third evaluation value indicating the area of overlap between the target circular region and the low-accuracy region of the binarized map generated in step s2.

The threshold adjustment unit 7 obtains the first evaluation value A1, the second evaluation value A2, and the third evaluation value A3 using, for example, a binarized map on which all of the circular regions extracted in step s3 are placed, as shown in FIG. 27 described above (hereinafter referred to as the "evaluation value calculation binarized map").

Specifically, the threshold adjustment unit 7 counts the values that make up the portion of the high-accuracy region of the evaluation value calculation binarized map that overlaps the target circular region, and uses that count as the first evaluation value A1.

The threshold adjustment unit 7 also counts, in the evaluation value calculation binarized map, the values that make up the portion of the target circular region that overlaps other circular regions, and uses that count as the second evaluation value A2. For example, if four circular regions were extracted in step s3, the threshold adjustment unit 7 counts, in the evaluation value calculation binarized map, the values that make up the portion of the target circular region that overlaps the other three circular regions, and uses that count as the second evaluation value A2.

The threshold adjustment unit 7 then counts the values that make up the portion of the low-accuracy region of the evaluation value calculation binarized map that overlaps the target circular region, and uses that count as the third evaluation value A3.

As can be understood from equation (2) above, through the first evaluation value A1, the integrated evaluation value B for the target circular region becomes larger as the area of overlap between the target circular region and the high-accuracy region of the binarized map becomes larger. The larger the circular regions included in the high-accuracy region of the binarized map, the larger the area of overlap between the circular regions extracted from the binarized map and its high-accuracy region. The integrated evaluation value B therefore becomes larger as the circular regions included in the high-accuracy region of the binarized map become larger.

Through the second evaluation value A2, the integrated evaluation value B for the target circular region also increases as the overlap area between the target circular region and the other circular regions extracted in step s3 decreases. The integrated evaluation value B is therefore large when the high-accuracy region of the binarized map contains independent circular regions.

Through the third evaluation value A3, the integrated evaluation value B for the target circular region likewise increases as the overlap area between the target circular region and the low-accuracy region of the binarized map decreases.

If the threshold value used to generate the binarized map is too small, then, as shown in FIG. 28, a plurality of regions 51a and 51b in the high-accuracy region 51 of the binarized map 50, each corresponding to one of several face images located close together, may touch and form a single, nearly circular independent region 511. In other words, the high-accuracy region 51 of the binarized map 50 may contain one nearly circular independent region 511 that corresponds to a plurality of face images. When circular regions are extracted from such a high-accuracy region 51, a single circular region 400 may be extracted for the independent region 511, as shown in FIG. 28.

When the high-accuracy region of the binarized map contains a nearly circular independent region 511 that corresponds to a plurality of face images, the overlap area between the circular region extracted from the high-accuracy region and the low-accuracy region of the binarized map may become large, as can be seen by comparing FIG. 28 with FIG. 27 described above. In this case, the third evaluation value A3 increases and the integrated evaluation value B decreases. Conversely, when the high-accuracy region of the binarized map contains many independent circular regions, each corresponding to a single face image, the overlap area between the circular regions extracted from the high-accuracy region and the low-accuracy region tends to be small. As a result, the integrated evaluation value B tends to increase as more independent circular regions are present in the high-accuracy region of the binarized map.

As described above, the integrated evaluation value B increases as the circular regions contained in the high-accuracy region of the binarized map become larger, and also increases as more independent circular regions are present in the high-accuracy region.

Once the threshold adjustment unit 7 has obtained the integrated evaluation value B for each circular region extracted in step s3, it sums those integrated evaluation values B and takes the sum as the determination evaluation value. This yields a determination evaluation value that indicates whether the threshold value provisionally set in step s1 is appropriate. Like the integrated evaluation value B, the determination evaluation value increases as the circular regions contained in the high-accuracy region of the binarized map become larger, and increases as more independent circular regions are present in the high-accuracy region.

When the determination evaluation values corresponding to the respective threshold levels have been obtained, the threshold adjustment unit 7 determines, in step s6, the threshold value corresponding to the largest of those determination evaluation values to be the appropriate threshold value. Once the appropriate threshold value has been determined in this way, the binarization processing unit 5 binarizes the output value map using that threshold value to generate a binarized map for specification. The binarized map for specification is thus generated so that its high-accuracy region contains circular regions that are as large as possible and as many independent circular regions as possible. Because the detection target image specifying unit 6 specifies face images in the processing target image based on this binarized map for specification, it can specify a face image whose corresponding region in the output value map has small detection accuracy values, and can individually specify each of a plurality of face images located close together.

When several of the determination evaluation values corresponding to the respective threshold levels share the maximum value, it is preferable to determine the smallest of the threshold values corresponding to those maximal determination evaluation values to be the appropriate threshold value. This makes it easier to specify, in the processing target image, a face image whose corresponding region in the output value map has small detection accuracy values.
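The selection in step s6, including the preferred handling of ties, can be sketched in Python as follows. This is a minimal sketch, assuming that the integrated evaluation values B have already been computed for every circular region at every threshold level and are passed in as a dictionary; the names `determination_value` and `choose_threshold` and the data layout are illustrative assumptions, not part of the specification.

```python
def determination_value(integrated_values):
    """Sum the integrated evaluation values B of all circular regions at one threshold."""
    return sum(integrated_values)


def choose_threshold(integrated_by_threshold):
    """Pick the threshold with the largest determination evaluation value.

    integrated_by_threshold: dict mapping each candidate threshold to the list
    of integrated evaluation values B obtained with that threshold.
    When several thresholds share the maximum, the smallest one is returned.
    """
    scores = {
        threshold: determination_value(values)
        for threshold, values in integrated_by_threshold.items()
    }
    best = max(scores.values())
    return min(threshold for threshold, score in scores.items() if score == best)
```

With candidate thresholds 100, 150, 200 and 250 whose determination evaluation values are 12, 30, 30 and 7, the sketch returns 150, the smaller of the two tied thresholds.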

In the above example, the integrated evaluation value B is calculated taking into account the third evaluation value A3, which indicates the overlap area between the target circular region and the low-accuracy region of the binarized map. However, the third evaluation value A3 need not be taken into account. That is, the integrated evaluation value B may be expressed by the following equation (3).

B = A1 - A2 ... (3)

Even when the threshold adjustment unit 7 obtains the integrated evaluation value B using equation (3), it can determine the threshold value used to generate the binarized map so that the high-accuracy region of the binarized map contains circular regions that are as large as possible and as many independent circular regions as possible.
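The two ways of forming the integrated evaluation value B can be sketched in Python as follows. Equation (3) is taken directly from the text. The three-term form is only an assumption: it is the simplest unweighted expression consistent with the described behavior (B increases with A1 and decreases with A2 and A3), and equation (2) itself may differ. The function name `integrated_value` is illustrative.

```python
def integrated_value(a1, a2, a3=None):
    """Integrated evaluation value B for one circular region.

    Without a3, this is equation (3): B = A1 - A2.
    With a3, it uses an ASSUMED unweighted form, B = A1 - A2 - A3,
    chosen only to match the monotonic behavior described in the text.
    """
    if a3 is None:
        return a1 - a2
    return a1 - a2 - a3
```

Under this assumption, a circular region with A1 = 4, A2 = 2 and A3 = 1 gives B = 2 by equation (3) and B = 1 when A3 is also taken into account.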

As described above, in the present embodiment the threshold adjustment unit 7 adjusts the threshold value used to generate the binarized map based on the first evaluation value A1, which indicates the overlap area between a circular region extracted from the high-accuracy region of the binarized map and that high-accuracy region, and the second evaluation value A2, which indicates the overlap area between that circular region and the other circular regions extracted from the high-accuracy region. The threshold value can therefore be adjusted so that the high-accuracy region of the binarized map contains circular regions that are as large as possible and as many independent circular regions as possible. By specifying detection target images in the processing target image based on the binarized map generated using the adjusted threshold value, the detection target image specifying unit 6 can specify a detection target image whose corresponding region in the output value map has small detection accuracy values, and can individually specify each of a plurality of detection target images located close together. The detection accuracy for detection target images is thus improved.

Furthermore, as described above, when the threshold adjustment unit 7 adjusts the threshold value based not only on the first evaluation value A1 and the second evaluation value A2 but also on the third evaluation value A3, which indicates the overlap area between a circular region and the low-accuracy region of the binarized map, more independent circular regions can be included in the high-accuracy region of the binarized map. Each of a plurality of detection target images located close together in the processing target image can therefore be specified individually with high accuracy.

Although the image detection apparatus 1 has been described in detail above, the description is illustrative in all aspects, and the invention is not limited to it. For example, the detection target image may be an image other than a human face image. The various examples described above may be applied in combination as long as they do not contradict one another. It is understood that countless modifications not illustrated here can be devised without departing from the scope of the invention.

Description of Symbols
1 Image detection apparatus
4 Map generation unit
5 Binarization processing unit
6 Detection target image specifying unit
7 Threshold adjustment unit
8 Circular region extraction unit

Claims

An image detection apparatus that detects a detection target image from a processing target image, comprising:
a map generation unit that generates a map indicating a distribution, in the processing target image, of an accuracy value indicating the likelihood of being the detection target image;
a binarization processing unit that binarizes the map using a threshold value to generate a binarized map;
an extraction unit that extracts a circular region from a first region in the binarized map generated using the threshold value, the first region corresponding to a region in the map where the accuracy value is equal to or greater than the threshold value, or is greater than the threshold value;
a threshold adjustment unit that adjusts the threshold value based on a first evaluation value indicating an overlap area between the circular region extracted by the extraction unit from the first region of the binarized map and the first region, and a second evaluation value indicating an overlap area between the circular region and another circular region extracted by the extraction unit from the first region; and
a specifying unit that specifies the detection target image in the processing target image based on the first region of the binarized map generated using the threshold value adjusted by the threshold adjustment unit.

The image detection apparatus according to claim 1,
wherein the threshold adjustment unit adjusts the threshold value based on:
the first evaluation value indicating the overlap area between the circular region extracted by the extraction unit from the first region of the binarized map and the first region;
the second evaluation value indicating the overlap area between the circular region and another circular region extracted by the extraction unit from the first region; and
a third evaluation value indicating an overlap area between the circular region and a second region in the binarized map, the second region corresponding to a region in the map where the accuracy value is less than the threshold value used to generate the binarized map, or is equal to or less than that threshold value.

The image detection apparatus according to claim 1 or 2,
wherein the extraction unit detects edges in the binarized map and identifies a circular region in the first region of the binarized map by a Hough transform using the coordinates of the edges.

The image detection apparatus according to any one of claims 1 to 3,
wherein the detection target image is a human face image.

A control program for controlling an image detection apparatus that detects a detection target image from a processing target image,
the control program causing the image detection apparatus to execute the steps of:
(a) generating a map indicating a distribution, in the processing target image, of an accuracy value indicating the likelihood of being the detection target image;
(b) binarizing the map using a threshold value to generate a binarized map;
(c) extracting a circular region from a partial region in the binarized map generated using the threshold value, the partial region corresponding to a region in the map where the accuracy value is equal to or greater than the threshold value, or is greater than the threshold value;
(d) adjusting the threshold value based on a first evaluation value indicating an overlap area between the circular region extracted in step (c) from the partial region of the binarized map and the partial region, and a second evaluation value indicating an overlap area between the circular region and another circular region extracted in step (c) from the partial region; and
(e) specifying the detection target image in the processing target image based on the partial region of the binarized map generated using the threshold value adjusted in step (d).

An image detection method for detecting a detection target image from a processing target image, comprising the steps of:
(a) generating a map indicating a distribution, in the processing target image, of an accuracy value indicating the likelihood of being the detection target image;
(b) binarizing the map using a threshold value to generate a binarized map;
(c) extracting a circular region from a partial region in the binarized map generated using the threshold value, the partial region corresponding to a region in the map where the accuracy value is equal to or greater than the threshold value, or is greater than the threshold value;
(d) adjusting the threshold value based on a first evaluation value indicating an overlap area between the circular region extracted in step (c) from the partial region of the binarized map and the partial region, and a second evaluation value indicating an overlap area between the circular region and another circular region extracted in step (c) from the partial region; and
(e) specifying the detection target image in the processing target image based on the partial region of the binarized map generated using the threshold value adjusted in step (d).