JP2011002882A

JP2011002882A - Imaging apparatus, image processing program, and imaging method

Info

Publication number: JP2011002882A
Application number: JP2009143186A
Authority: JP
Inventors: Shinichi Fukue; 信一福榮
Original assignee: Olympus Corp
Current assignee: Olympus Corp
Priority date: 2009-06-16
Filing date: 2009-06-16
Publication date: 2011-01-06

Abstract

PROBLEM TO BE SOLVED: To provide an imaging apparatus that detects a subject quickly and accurately. SOLUTION: The imaging apparatus acquires an image, acquires a low-resolution image based on that image, and detects a subject candidate in the low-resolution image. When a subject candidate is detected in the low-resolution image, the imaging apparatus detects the subject in a high-resolution image whose resolution is higher than that of the low-resolution image.

Description

The present invention relates to an imaging apparatus, an image processing program, and an imaging method.

Patent Document 1 discloses a method that prepares low-resolution and high-resolution master image patterns and uses them to detect an inspection object. In Patent Document 1, the low-resolution master image pattern is first applied to a low-resolution whole image to narrow down the region. After the region has been narrowed down, a high-resolution whole image is acquired, and the high-resolution master image pattern is applied to it to detect the inspection object.

Patent Document 2 discloses a technique that acquires a low-resolution whole image to detect a face region and, once a face region has been detected, acquires a high-resolution image of that region to identify the individual subject.

Patent Document 3 discloses a technique for performing character recognition using higher-order local autocorrelation.

Non-Patent Document 1 discloses a technique in which AdaBoost combines a weighted sum of multiple weak classifiers based on rectangle information into a strong classifier, and the strong classifiers are connected in a cascade to detect a face as the target of interest in an image.

Non-Patent Document 2 discloses a technique that applies AdaBoost to gesture recognition.

Patent Document 1: JP 05-288520 A
Patent Document 2: JP 2007-334802 A
Patent Document 3: Japanese Examined Patent Publication No. 58-047064
Non-Patent Document 1: P. Viola and M. Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features," in Proc. of CVPR, vol. 1, pp. 511-518, December 2001
Non-Patent Document 2: Takeshi Mita, Toshimitsu Kaneko, Björn Stenger, and Osamu Hori, "Discriminative Feature Co-Occurrence Selection for Object Detection," in JOURNAL OF LATEX CLASS FILES, VOL. 1, NO. 8, AUGUST 2002

However, the invention of Patent Document 1 must acquire both a low-resolution whole image and a high-resolution whole image. Even when no inspection object is present, it acquires both images and runs the detection processing on both of them. As a result, detecting the inspection object takes a long time. The invention of Patent Document 2 cannot perform personal identification unless a face region can be detected in the low-resolution whole image. The invention of Patent Document 3 cannot recognize characters when the image is a low-resolution image. The techniques of Non-Patent Document 1 and Non-Patent Document 2 cannot recognize the subject when it appears small in the image.

The present invention was made to solve these problems, and its object is to detect a subject in an image quickly and accurately.

An imaging apparatus according to one aspect of the present invention is an imaging apparatus that images a subject and includes: an image acquisition unit that acquires an image; a low-resolution image acquisition unit that acquires a low-resolution image based on the image acquired by the image acquisition unit; a first detection unit that detects a subject candidate in the low-resolution image; a high-resolution image acquisition unit that acquires, from the image acquired by the image acquisition unit, a high-resolution image having a higher resolution than the low-resolution image; and a second detection unit that detects the subject in the high-resolution image when the first detection unit has detected a subject candidate in the low-resolution image.

An image processing program according to another aspect of the present invention is a program for processing a captured image on a computer. It causes the computer to execute: an image acquisition procedure that acquires an image; a low-resolution image acquisition procedure that acquires a low-resolution image based on the image acquired by the image acquisition procedure; a first detection procedure that detects a subject candidate in the low-resolution image; and a second detection procedure that, when the first detection procedure has detected a subject candidate in the low-resolution image, detects the subject in a high-resolution image that is obtained from the image acquired by the image acquisition procedure and has a higher resolution than the low-resolution image.

An imaging method according to still another aspect of the present invention is a method for imaging a subject. The method acquires an image, acquires a low-resolution image based on the acquired image, detects a subject candidate in the low-resolution image, and, when a subject candidate has been detected in the low-resolution image, detects the subject in a high-resolution image that is obtained from the acquired image and has a higher resolution than the low-resolution image.

According to these aspects, a low-resolution image, which can be processed relatively quickly, is used to determine whether a subject candidate exists in the image. Only when a subject candidate is determined to exist is the subject detected using the high-resolution image. Therefore, when no subject is present in the image, for example, the subject detection processing can be ended quickly. Even when the subject appears small, a subject candidate can be detected by using the low-resolution image. When a subject candidate has been detected, subject detection is performed on the high-resolution image, so the subject can be detected accurately.
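The control flow described here can be sketched in a few lines of Python. This is a minimal illustration, not the apparatus itself: the names `downscale`, `detect_two_stage`, `find_candidates`, and `find_subjects` are hypothetical, the two detectors are passed in as plain callables, and block averaging with an integer factor is an assumed way of producing the low-resolution image.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]            # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]      # (top, left, height, width)


def downscale(image: Image, factor: int) -> Image:
    """Reduce an image by an integer factor using block averaging (assumed method)."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(image[r * factor + dr][c * factor + dc]
                for dr in range(factor) for dc in range(factor)) / factor ** 2
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def detect_two_stage(
    image: Image,
    find_candidates: Callable[[Image], List[Box]],   # stands in for the first detector
    find_subjects: Callable[[Image], List[Box]],     # stands in for the second detector
    factor: int = 2,
) -> List[Box]:
    """Run the cheap detector first and the precise one only when it finds a candidate."""
    low_res = downscale(image, factor)
    if not find_candidates(low_res):
        return []                    # no candidate: stop without touching high resolution
    return find_subjects(image)      # candidate found: confirm at high resolution
```

When `find_candidates` returns an empty list, `find_subjects` is never called, which is where the time saving for subject-free images comes from.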

According to the present invention, a subject in an image can be detected quickly and accurately.

FIG. 1 is a schematic block diagram showing the imaging apparatus of the first embodiment.
FIG. 2 is a schematic block diagram showing the first detection unit of the first embodiment.
FIG. 3 is a diagram illustrating rectangular features.
FIG. 4 is a schematic block diagram showing the second detection unit of the first embodiment.
FIG. 5 is a diagram showing the relationship between the detection rate and the overdetection rate in the first detection unit and the second detection unit.
FIG. 6 is a diagram illustrating a teacher image.
FIG. 7 is a diagram illustrating rectangular features of the teacher image in FIG. 6.
FIG. 8 is a flowchart showing the face detection control of the first embodiment.
FIG. 9 is a diagram illustrating local patterns of higher-order local autocorrelation.
FIG. 10 is a diagram explaining the rectangular features of the first extraction filter and the second extraction filter.
FIG. 11 is a schematic block diagram showing the first detection unit of the fourth embodiment.
FIG. 12 is a diagram explaining the first extraction filter of the fourth embodiment.
FIG. 13 is a schematic block diagram showing the imaging apparatus of the fifth embodiment.
FIG. 14 is a schematic block diagram showing the detection unit of the fifth embodiment.
FIG. 15 is a flowchart showing the face detection control of the fifth embodiment.
FIG. 16 is a schematic block diagram showing the imaging apparatus of the sixth embodiment.

An imaging apparatus according to a first embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a schematic block diagram of the imaging apparatus according to the first embodiment.

The imaging apparatus according to the first embodiment includes an image acquisition unit 1, a first reduction unit 2, a low-resolution image acquisition unit 3, a first detection unit 4, a second reduction unit 5, a high-resolution image acquisition unit 6, and a second detection unit 7. Each of these units (or all of them together) may be built from logic circuits. Alternatively, each unit (or all of them together) may be built from a memory that stores data, a memory that stores an arithmetic program, a CPU (central processing unit) that executes the arithmetic program, an input/output interface, and the like.

The image acquisition unit 1 acquires an image from outside the apparatus.

The first reduction unit 2 reduces the image acquired by the image acquisition unit 1 at a set reduction ratio to obtain a first image. The size of the subject in the input image is unknown. For that reason, the first reduction unit 2 can reduce the image at a plurality of reduction ratios to obtain first images.

The subject is the object the photographer pays attention to when shooting, for example a human face, a human hand, or an animal face, but it is not limited to these. The imaging apparatus can also detect a human hand, which tends to be captured relatively small, and can therefore recognize human gestures. This embodiment describes the case where the subject is a human face.

The low-resolution image acquisition unit 3 acquires a low-resolution image from the first image. The low-resolution image is the image of a region of predetermined size cut out of the first image. In this embodiment, the low-resolution image is the image of a 10 × 10 pixel region. In this embodiment, a low-resolution image is an image of 15 × 15 pixels or smaller, and a high-resolution image is an image of 16 × 16 pixels or larger. However, the sizes are not limited to these.

When the first detection unit 4, described later, has performed face candidate determination on the acquired low-resolution image and determined that it contains no face candidate, the low-resolution image acquisition unit 3 acquires a new low-resolution image from the first image. Low-resolution images are acquired sequentially over the entire area of the first image: the low-resolution image acquisition unit 3 shifts the acquisition region vertically and horizontally by a predetermined amount each time. For example, it first places the region at the upper left of the first image and then shifts it to the right by a predetermined number of pixels, for example 1 pixel, to acquire the next low-resolution image. When the region reaches the right edge of the first image, the low-resolution image acquisition unit 3 shifts it downward by a predetermined number of pixels, for example 1 pixel, and acquires the next low-resolution image.
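The scan order can be illustrated with a small generator. This is a sketch under assumptions: the function name `sliding_windows` is hypothetical, the window is assumed to start at the upper left and travel toward the right edge before moving down one row, and each new row is assumed to restart at the left edge.

```python
from typing import Iterator, List, Tuple

Image = List[List[float]]


def sliding_windows(
    image: Image, size: int = 10, step: int = 1
) -> Iterator[Tuple[int, int, Image]]:
    """Yield (top, left, window) in raster order: across each row, then down."""
    height, width = len(image), len(image[0])
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            # Cut a size x size region out of the first image.
            window = [row[left:left + size] for row in image[top:top + size]]
            yield top, left, window
```

With the defaults, a first image of W × H pixels yields (W − 9) × (H − 9) windows, so the per-window cost of the first detection unit dominates the running time.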

The first detection unit 4 will be described with reference to FIG. 2. FIG. 2 is a schematic block diagram of the first detection unit 4. The first detection unit 4 includes first feature extraction units (first feature amount calculation units) T1i-j (i = 1 to L, j = 1 to M) and first determination units H1i (i = 1 to L). The first detection unit 4 detects subject candidates in the low-resolution image acquired by the low-resolution image acquisition unit 3.

Each first feature extraction unit T1i-j (i = 1 to L, j = 1 to M) is a classifier that judges whether the low-resolution image acquired by the low-resolution image acquisition unit 3 contains a feature of a face candidate. Each first feature extraction unit T1i-j has its own first extraction filter (first feature), which consists of, for example, one of the rectangular features shown in FIG. 3 or a combination of them. The first extraction filter is a filter for determining whether the low-resolution image has a feature of a face candidate; in other words, the first extraction filter represents a feature of a face candidate. The first extraction filter has the same size as the low-resolution image, which is 10 × 10 pixels in this embodiment. The rectangular features are not limited to those shown in FIG. 3; any feature that can discriminate a face candidate may be used.

Each first feature extraction unit T1i-j (i = 1 to L, j = 1 to M) calculates a feature amount of the low-resolution image. It superimposes the first extraction filter on the low-resolution image and calculates the feature amount from the pixel values of the low-resolution image that correspond to the rectangular feature of the filter. For example, when the rectangular feature of the first extraction filter is that of FIG. 3(a), 3(b), or 3(d), the feature amount is calculated by
Feature amount = (sum of the pixel values of the low-resolution image in the white rectangles) − (sum of the pixel values of the low-resolution image in the black rectangles)
When the rectangular feature of the first extraction filter is that of FIG. 3(c), the feature amount is calculated by
Feature amount = (sum of the pixel values of the low-resolution image in the white rectangles) − (sum of the pixel values of the low-resolution image in the black rectangles) × 2
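The two formulas can be computed as follows. This is a minimal sketch: the rectangle layout is passed in explicitly because FIG. 3 is not reproduced here, the names `integral_image`, `rect_sum`, and `feature_value` are hypothetical, and the summed-area table is an assumed speed-up (it is the usual companion of rectangular features but is not stated in this passage).

```python
from typing import List, Tuple

Image = List[List[float]]
Rect = Tuple[int, int, int, int]     # (top, left, height, width)


def integral_image(image: Image) -> List[List[float]]:
    """Summed-area table with an extra zero row and column."""
    height, width = len(image), len(image[0])
    table = [[0.0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0.0
        for c in range(width):
            row_sum += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_sum(table: List[List[float]], rect: Rect) -> float:
    """Sum of the pixels inside rect, using four table lookups."""
    top, left, h, w = rect
    return (table[top + h][left + w] - table[top][left + w]
            - table[top + h][left] + table[top][left])


def feature_value(
    image: Image, white: List[Rect], black: List[Rect], black_weight: float = 1.0
) -> float:
    """White sum minus weighted black sum; use black_weight=2.0 for the FIG. 3(c) case."""
    table = integral_image(image)
    white_sum = sum(rect_sum(table, r) for r in white)
    black_sum = sum(rect_sum(table, r) for r in black)
    return white_sum - black_weight * black_sum
```

In practice the table would be built once per first image and shared by all windows and filters; it is rebuilt inside `feature_value` here only to keep the example short.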

Each first feature extraction unit T1i-j (i = 1 to L, j = 1 to M) compares the calculated feature amount with a first threshold. When the feature amount is larger than the first threshold, it outputs "1", indicating that the low-resolution image contains the feature of a face candidate. When the feature amount is equal to or smaller than the first threshold, it outputs "0", indicating that the low-resolution image does not contain the feature of a face candidate. The first threshold is a value set individually for each first feature extraction unit T1i-j.

Each first determination unit H1i (i = 1 to L) is a classifier made up of one or more first feature extraction units T1i-j (j = 1 to M). The first determination unit H1i multiplies the value ("1" or "0") output by each of its first feature extraction units T1i-j by the weight (reliability) of that unit's first extraction filter to obtain an evaluation value for each unit. It then adds up these evaluation values to obtain a total evaluation value. The first determination unit H1i compares the total evaluation value with a second threshold, and when the total evaluation value is larger than the second threshold, it passes the low-resolution image on to the next first determination unit H1(i+1). For example, the first determination unit H11 calculates the total of the evaluation values of the first feature extraction units T11-1 to T11-M, and when that total is larger than the second threshold of H11, the low-resolution image is input to the next first determination unit H12. When the total evaluation value is equal to or smaller than the second threshold, the first determination unit H1i determines that the low-resolution image contains no face candidate and ends the face candidate determination for the low-resolution image currently being processed. The second threshold is a value set individually for each first determination unit H1i.

The first detection unit 4 is formed by connecting the first determination units H1i (i = 1 to L) in a cascade. When the last first determination unit H1L determines that the low-resolution image contains a face candidate, the low-resolution image is finally determined to be an image containing a face candidate. That is, a face candidate exists in the acquired image. In this way, the first detection unit 4 detects a face candidate from the low-resolution image. Conversely, if every low-resolution image taken over the entire range of the first images acquired at the different reduction ratios is rejected by one of the first determination units H1i because its total evaluation value is equal to or smaller than the second threshold, no face candidate has been detected in the first images. That is, no face candidate exists in the acquired image.
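The weak classifier, stage, and cascade logic can be sketched as follows. The class and function names are hypothetical, and the feature functions, thresholds, and weights are placeholders that would come from training.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Window = List[List[float]]


@dataclass
class WeakClassifier:
    """Plays the role of one feature extraction unit T1i-j."""
    feature: Callable[[Window], float]   # returns the feature amount of a window
    threshold: float                     # the "first threshold" of this unit
    weight: float                        # reliability of the extraction filter

    def vote(self, window: Window) -> int:
        return 1 if self.feature(window) > self.threshold else 0


@dataclass
class Stage:
    """Plays the role of one determination unit H1i."""
    weak: Sequence[WeakClassifier]
    threshold: float                     # the "second threshold" of this unit

    def passes(self, window: Window) -> bool:
        total = sum(w.weight * w.vote(window) for w in self.weak)
        return total > self.threshold


def cascade_accepts(stages: Sequence[Stage], window: Window) -> bool:
    """True only if every stage passes; all() stops at the first rejection."""
    return all(stage.passes(window) for stage in stages)
```

Because `all()` short-circuits, a window rejected by an early stage never reaches the later, typically larger stages, which is what keeps the average cost per window low.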

The second reduction unit 5 acquires a second image from the image acquired by the image acquisition unit 1. The second image has a higher resolution than the first image acquired by the first reduction unit 2. For example, the second reduction unit 5 takes the input image as it is, or reduces it less than the first reduction unit 2 does, to obtain the second image. The second reduction unit 5 acquires the second image only when the first detection unit 4 has detected a face candidate in the first image.

The high-resolution image acquisition unit 6 acquires a high-resolution image from the second image. The high-resolution image is the image of a region of predetermined size cut out of the second image. In this embodiment, the high-resolution image is the image of a 24 × 24 pixel region. When the second detection unit 7, described later, has performed face detection determination on the acquired high-resolution image, the high-resolution image acquisition unit 6 acquires a new high-resolution image from the second image. New high-resolution images are acquired in the same way as the low-resolution image acquisition unit 3 acquires low-resolution images.

The second detection unit 7 will be described with reference to FIG. 4. FIG. 4 is a schematic block diagram of the second detection unit 7. The second detection unit 7 includes second feature extraction units (second feature amount calculation units) T2i-j (i = 1 to L, j = 1 to M) and second determination units H2i (i = 1 to L). The second detection unit 7 detects a face in the second image.

Each second feature extraction unit T2i-j (i = 1 to L, j = 1 to M) is a classifier that judges whether the high-resolution image acquired by the high-resolution image acquisition unit 6 contains a facial feature. Each second feature extraction unit T2i-j has a second extraction filter (second feature), which consists of, for example, one of the rectangular features shown in FIG. 3 or a combination of them. The second extraction filter is a filter for determining whether the high-resolution image has a facial feature. The second extraction filter has the same size as the high-resolution image, which is 24 × 24 pixels in this embodiment. The rectangular features are not limited to those shown in FIG. 3; any feature that can discriminate a face may be used.

Each second feature extraction unit T2i-j (i = 1 to L, j = 1 to M) calculates a feature amount of the high-resolution image. The feature amount is calculated in the same way as in the first feature extraction units T1i-j.

Each second feature extraction unit T2i-j (i = 1 to L, j = 1 to M) compares the calculated feature amount with a third threshold. When the feature amount is larger than the third threshold, it outputs "1", indicating that the high-resolution image contains a facial feature. When the feature amount is equal to or smaller than the third threshold, it outputs "0", indicating that the high-resolution image does not contain a facial feature. The third threshold is a value set individually for each second feature extraction unit T2i-j.

Each second determination unit H2i (i = 1 to L) is a classifier made up of one or more second feature extraction units T2i-j (j = 1 to M). The second determination unit H2i multiplies the value ("1" or "0") output by each of its second feature extraction units T2i-j by the weight (reliability) of that unit's second extraction filter to obtain an evaluation value for each unit. It then adds up these evaluation values to obtain a total evaluation value. The second determination unit H2i compares the total evaluation value with a fourth threshold, and when the total evaluation value is larger than the fourth threshold, it passes the high-resolution image on to the next second determination unit H2(i+1). When the total evaluation value is equal to or smaller than the fourth threshold, the second determination unit H2i determines that the high-resolution image contains no face and ends the face detection determination for the high-resolution image currently being processed. The fourth threshold is a value set individually for each second determination unit H2i.

The second detection unit 7 is formed by connecting the second determination units H2i (i = 1 to L) in a cascade. When the last second determination unit H2L determines that the high-resolution image contains facial features, the high-resolution image is finally determined to be an image containing a face. In this way, the second detection unit 7 detects a face from the high-resolution image. Conversely, if every high-resolution image taken over the entire range of the second image is rejected by one of the second determination units H2i because its total evaluation value is equal to or smaller than the fourth threshold, no face has been detected in the second image. That is, the acquired image contains no face.

The relationship between the detection rate and the overdetection rate in the first detection unit 4 and the second detection unit 7 will now be described with reference to FIG. 5. FIG. 5 shows ROC (Receiver Operating Characteristic) curves of the detection rate against the overdetection rate for the first detection unit 4 and the second detection unit 7. The overdetection rate indicates the rate of correct answers, that is, actual face candidates or faces, relative to the number of items that the first detection unit 4 or the second detection unit 7 determined to be face candidates or faces.

In both the first detection unit 4 and the second detection unit 7, the overdetection rate rises as the detection rate rises. In this embodiment, the first detection unit 4 and the second detection unit 7 are set so that the overdetection rate for face candidates in the first detection unit 4 is larger than the overdetection rate for faces in the second detection unit 7. The face detection rate and the face candidate detection rate are desirably about the same. However, because the determination conditions differ between the two units, for example in the resolution of the teacher images used for learning, the detection rates will not necessarily be the same.

In this embodiment, making the overdetection rate of face candidates higher than that of faces lets the first detection unit 4 detect face candidates while missing as few as possible. When a face candidate is detected, the second detection unit 7 then determines accurately whether the image contains a face.

Next, a method of constructing the first determination units H1i (i = 1 to L) and the second determination units H2i (i = 1 to L) will be described. Both are constructed by a boosting learning method that uses a large number of teacher images (learning images) containing faces and a large number of teacher images that do not contain faces. In this embodiment, they are constructed specifically by the AdaBoost learning method. The AdaBoost learning method is disclosed in detail in Non-Patent Document 1, so a detailed description is omitted here, but learning with AdaBoost proceeds as follows. (1) Prepare the teacher images. (2) Initialize the learning weights and normalize the weight of each teacher image. (3) Repeat the following: (a) normalize the learning weights; (b) calculate the identification error for each feature; (c) select the feature that minimizes the identification error; (d) update the learning weights. (4) Construct the final extraction filter.
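The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the source defers the details to Non-Patent Document 1, so the weight initialization, the update factor `beta`, and the voting rule below follow the commonly used Viola-Jones form of AdaBoost and are assumptions. The names `adaboost`, `candidates`, and `rounds` are hypothetical, and each candidate is assumed to be a weak classifier returning 1 (face) or 0 (non-face).

```python
import math


def adaboost(samples, labels, candidates, rounds):
    """Build a strong classifier from candidate weak classifiers.

    samples    : teacher images (or their feature values)        -- step (1)
    labels     : 1 for face, 0 for non-face (both must be present)
    candidates : hypothetical weak classifiers, each h(x) -> 0 or 1
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # (2) Initialize the learning weights, balanced between the two classes.
    w = [1.0 / (2 * n_pos) if y == 1 else 1.0 / (2 * n_neg) for y in labels]
    chosen = []
    for _ in range(rounds):
        # (3a) Normalize the learning weights.
        total = sum(w)
        w = [wi / total for wi in w]
        # (3b) Weighted identification error of every candidate feature.
        errors = [sum(wi for wi, x, y in zip(w, samples, labels) if h(x) != y)
                  for h in candidates]
        # (3c) Select the candidate with the smallest error.
        best = min(range(len(candidates)), key=errors.__getitem__)
        h = candidates[best]
        err = min(max(errors[best], 1e-10), 1 - 1e-10)  # avoid log(0)
        beta = err / (1 - err)
        # (3d) Update: correctly classified samples lose weight.
        w = [wi * (beta if h(x) == y else 1.0)
             for wi, x, y in zip(w, samples, labels)]
        chosen.append((math.log(1 / beta), h))

    # (4) Final classifier: weighted vote of the selected weak classifiers.
    def strong(x):
        score = sum(alpha * h(x) for alpha, h in chosen)
        return 1 if score >= 0.5 * sum(alpha for alpha, _ in chosen) else 0

    return strong
```

For example, with one-dimensional samples `[1, 2, 3, 6, 7, 8]`, labels `[0, 0, 0, 1, 1, 1]`, and threshold candidates such as `lambda x: int(x > 4)`, the returned classifier separates the two groups.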

The second determination units H2i (i = 1 to L) are constructed using high-resolution teacher images (second learning images). In this embodiment, 24 × 24 pixel images are used as the high-resolution teacher images. The first determination units H1i (i = 1 to L) are constructed using low-resolution teacher images (first learning images). In this embodiment, 10 × 10 pixel images are used as the low-resolution teacher images. The low-resolution teacher images are created by reducing the high-resolution teacher images. Alternatively, the high-resolution and low-resolution teacher images may each be created by capturing them separately.

The first extraction filter and the second extraction filter are also constructed by the learning described above. These two filters are now described in detail. Suppose that, during learning, the image shown in FIG. 6A is used as a high-resolution teacher image containing a face, and the image shown in FIG. 7A, obtained by reducing the image of FIG. 6A, is used as a low-resolution teacher image containing a face. When the images of FIG. 6A and FIG. 7A are expressed using rectangular features, the rectangular features of the teacher images are, for example, those shown in FIG. 6B and FIG. 7B. In such a case, the first extraction filter and the second extraction filter are constructed based on, for example, the rectangular features of FIG. 7B and FIG. 6B, respectively.

低解像度の教師画像は、顔の特徴となる例えば目、鼻、口を示す境界が高解像度の教師画像よりもぼやけるので、高解像度の教師画像と比較して情報量が少なくなる。そのため低解像度の教師画像から得られる矩形特徴は、高解像度の教師画像から得られる矩形特徴よりも少なくなる。つまり、低解像度の教師画像を用いて学習して構築される第１抽出フィルタは、高解像度の教師画像を用いて学習して構築される第２抽出フィルタよりも矩形特徴のバリエーションが少なくなる。 The low-resolution teacher image has a smaller amount of information than the high-resolution teacher image because the boundaries indicating the facial features such as eyes, nose, and mouth are blurred compared to the high-resolution teacher image. For this reason, the rectangular features obtained from the low-resolution teacher image are fewer than the rectangular features obtained from the high-resolution teacher image. That is, the first extraction filter that is constructed by learning using the low-resolution teacher image has fewer variations of the rectangular features than the second extraction filter that is constructed by learning using the high-resolution teacher image.

As described above, for a given face the rectangular features of a low-resolution image have fewer variations than those of a high-resolution image, so the first detection unit 4 cannot determine accurately whether the low-resolution image acquired by the low-resolution image acquisition unit 3 includes face candidate features. However, the first determination units H1i (i = 1 to L) can detect as a face candidate even a face that appears small in the image acquired by the image acquisition unit 1 and that becomes difficult to identify precisely as a face once reduced to the first image. This is because the first determination units H1i (i = 1 to L) are constructed by learning with low-resolution teacher images. When the first image is determined to contain a face candidate, the second detection unit 7 performs face determination using the high-resolution image, so the face can be detected accurately.

次に本実施形態の顔検出制御について図８のフローチャートを用いて説明する。 Next, the face detection control of this embodiment will be described with reference to the flowchart of FIG.

ステップＳ１００では、入力された画像から第１画像を取得する。後述するステップＳ１０４からステップが戻って来た場合には、新たな縮小率で縮小した第１画像を取得する。 In step S100, a first image is acquired from the input image. When the step returns from step S104 described later, a first image reduced at a new reduction rate is acquired.

ステップＳ１０１では、第１画像から低解像度画像を取得する。後述するステップＳ１０３からステップが戻って来た場合には、新たに低解像度画像を取得する。 In step S101, a low resolution image is acquired from the first image. When the step returns from step S103 described later, a new low resolution image is acquired.

ステップＳ１０２では、取得した低解像度画像が顔候補を含むかどうか判定する。つまり、低解像度画像から顔候補が検出されたかどうか判定する。そして、低解像度画像が顔候補を含むと判定されると、ステップＳ１０５へ進む。一方、低解像度画像が顔候補を含んでいないと判定されると、ステップＳ１０３へ進む。 In step S102, it is determined whether the acquired low-resolution image includes a face candidate. That is, it is determined whether a face candidate is detected from the low resolution image. If it is determined that the low-resolution image includes a face candidate, the process proceeds to step S105. On the other hand, if it is determined that the low-resolution image does not include a face candidate, the process proceeds to step S103.

ステップＳ１０３では、第１画像の中で低解像度画像を取得していない箇所があるかどうか判定する。そして、第１画像の中で低解像度画像を取得していない箇所がある場合には、ステップＳ１０１へ戻り、第１画像の中から新たに低解像度画像を取得し、上記制御を繰り返す。一方、第１画像の全範囲において低解像度画像を取得した場合にはステップＳ１０４へ進む。 In step S103, it is determined whether or not there is a portion in the first image from which a low resolution image has not been acquired. If there is a portion in the first image where the low-resolution image is not acquired, the process returns to step S101, a new low-resolution image is acquired from the first image, and the above control is repeated. On the other hand, if a low resolution image is acquired in the entire range of the first image, the process proceeds to step S104.

In step S104, it is determined whether the first image has been acquired at all of the set reduction ratios. If it has, it is determined that the input image contains no face candidate, and this control ends. If any of the set reduction ratios has not yet been selected, the process returns to step S100, a first image reduced at a new reduction ratio is acquired, and the above control is repeated.

In step S105, a second image is acquired from the input image. When the process has returned from step S109 described later, a second image reduced at a new reduction ratio is acquired. Although the case where the second image is obtained by reducing the input image is described here, the input image may be used as the second image as it is.

ステップＳ１０６では、第２画像から高解像度画像を取得する。後述するステップＳ１０８からステップが戻って来た場合には、新たに高解像度画像を取得する。 In step S106, a high resolution image is acquired from the second image. When the step returns from step S108 described later, a high-resolution image is newly acquired.

ステップＳ１０７では、取得した高解像度画像が顔を含んでいるかどうか判定する。つまり、高解像度画像から顔が検出されたかどうか判定する。そして、高解像度画像が顔を含んでいると判定されると、取得した高解像度画像が顔であることを示す信号を出力する。 In step S107, it is determined whether the acquired high resolution image includes a face. That is, it is determined whether a face is detected from the high resolution image. When it is determined that the high resolution image includes a face, a signal indicating that the acquired high resolution image is a face is output.

ステップＳ１０８では、第２画像の中で高解像度画像を取得していない箇所があるかどうか判定する。そして、第２画像の中で高解像度画像を取得していない箇所がある場合には、ステップＳ１０６へ戻り、新たに高解像度画像を取得し、上記制御を繰り返す。一方、第２画像の全範囲において高解像度画像を取得した場合にはステップＳ１０９へ進む。 In step S108, it is determined whether or not there is a portion in the second image from which a high resolution image has not been acquired. If there is a portion in the second image where the high-resolution image is not acquired, the process returns to step S106, a new high-resolution image is acquired, and the above control is repeated. On the other hand, if a high-resolution image is acquired in the entire range of the second image, the process proceeds to step S109.

ステップＳ１０９では、設定された全ての縮小率で第２画像を取得したかどうか判定する。そして、設定された全ての縮小率で第２画像を取得した場合には、本制御を終了する。一方、設定された縮小率の中で選択されていない縮小率がある場合には、ステップＳ１０５へ戻り、新たな縮小率で縮小した第２画像を取得し、上記制御を繰り返す。 In step S109, it is determined whether the second image has been acquired with all the set reduction ratios. When the second image is acquired with all the set reduction ratios, this control is terminated. On the other hand, if there is a reduction ratio that is not selected among the set reduction ratios, the process returns to step S105, a second image reduced at the new reduction ratio is acquired, and the above control is repeated.
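The control flow of steps S100 to S109 can be summarized in the following sketch. It is an illustration under stated assumptions, not the embodiment's implementation: images are plain lists of pixel rows, `resize`, `is_candidate`, and `is_face` are hypothetical callbacks standing in for the reduction processing and the two detection units, and windows are scanned with a step of one pixel.

```python
def sliding_windows(img, size, step=1):
    """Yield (x, y, window) for every size x size window inside img."""
    height, width = len(img), len(img[0])
    for y in range(0, height - size + 1, step):
        for x in range(0, width - size + 1, step):
            yield x, y, [row[x:x + size] for row in img[y:y + size]]


def has_candidate(image, scales, resize, is_candidate, low_size):
    """Steps S100 to S104: search the first images for any face candidate."""
    for scale in scales:                                        # S100 / S104
        first = resize(image, scale)
        for _, _, window in sliding_windows(first, low_size):   # S101 / S103
            if is_candidate(window):                            # S102
                return True
    return False


def detect_faces(image, low_scales, high_scales, resize,
                 is_candidate, is_face, low_size=10, high_size=24):
    """Return (scale, x, y) for each high-resolution window judged to be a face."""
    if not has_candidate(image, low_scales, resize, is_candidate, low_size):
        return []  # no candidate: the high-resolution pass is skipped entirely
    faces = []
    for scale in high_scales:                                   # S105 / S109
        second = resize(image, scale)
        for x, y, window in sliding_windows(second, high_size): # S106 / S108
            if is_face(window):                                 # S107
                faces.append((scale, x, y))
    return faces
```

The early `return []` reflects the point that the more expensive high-resolution determination never runs when the low-resolution pass finds no candidate.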

In this embodiment, the second image is acquired as soon as a face candidate is detected in the first image, but face candidates may instead be detected over the entire range of the first image. Face candidates may also be detected after acquiring the first image at all of the set reduction ratios. Then, when acquiring high-resolution images from the second image, high-resolution images may be acquired and face detection performed only in the vicinity of the locations corresponding to where face candidates were detected in the first image. Restricting high-resolution image acquisition and face detection to the vicinity of detected face candidates allows face detection control to be performed quickly.

The first determination units H1i (i = 1 to L) and the second determination units H2i (i = 1 to L) may also use a support vector machine as the learning method. A support vector machine constructs a separating plane that divides the learning data into two classes. The separating plane is placed, with reference to the learning data located near the class boundary, at the position that maximizes the distance between the learning data and the plane. Learning with support vector machines is described in, for example, David Meyer, Friedrich Leisch, and Kurt Hornik, "The support vector machine under test," Neurocomputing, Volume 55, Issues 1-2, September 2003, pages 169-186.

In this embodiment, the first image and the second image are obtained by reducing the image acquired by the image acquisition unit 1, but an image acquisition unit that acquires low-resolution images and an image acquisition unit that acquires high-resolution images may be provided separately.

The effects of the first embodiment of the present invention will now be described.

画像取得部１によって取得した画像に基づいて、低解像度画像を取得し、低解像度画像に対して顔候補を検出する。そして、低解像度画像に対して顔候補が検出されると、画像取得部１によって取得された画像から取得された高解像度画像に対して顔検出を行う。これによって、取得した画像中に顔が含まれている場合に顔検出を正確に行うことができる。また、画像取得部１によって取得された画像に顔候補がない場合には、高解像度画像による顔検出が行われないので、顔検出制御を素早く終了させることができる。 Based on the image acquired by the image acquisition unit 1, a low resolution image is acquired, and face candidates are detected for the low resolution image. When a face candidate is detected for the low-resolution image, face detection is performed on the high-resolution image acquired from the image acquired by the image acquisition unit 1. This makes it possible to accurately detect a face when the acquired image includes a face. Further, when there is no face candidate in the image acquired by the image acquisition unit 1, the face detection control can be quickly terminated because the face detection by the high resolution image is not performed.

第１検出部４における顔候補の過検出率を第２検出部７における顔の過検出率よりも大きくすることで、第１画像中の顔候補を漏れなく検出することができる。これにより、画像中の顔があるにもかかわらず、第１検出部４によって第１画像中に顔候補がないと判定されることを防止し、顔検出を正確に行うことができる。 By making the overdetection rate of face candidates in the first detection unit 4 greater than the overdetection rate of faces in the second detection unit 7, face candidates in the first image can be detected without omission. Accordingly, it is possible to prevent the first detection unit 4 from determining that there is no face candidate in the first image even when there is a face in the image, and to accurately detect the face.

低解像度画像の特徴量をそれぞれ算出し、算出した特徴量に基づいて低解像度画像が顔候補であるかどうか判定する。また高解像度画像についても同様に、高解像度画像が顔であるかどうか判定する。これらにより、低解像度画像に対する顔候補検出および高解像度画像に対する顔検出を正確に行うことができる。 Each feature amount of the low resolution image is calculated, and it is determined whether the low resolution image is a face candidate based on the calculated feature amount. Similarly, for a high-resolution image, it is determined whether the high-resolution image is a face. As a result, face candidate detection for a low resolution image and face detection for a high resolution image can be accurately performed.

低解像度の教師画像によって顔候補を学習することで、第１判定部H1i(i=1〜L)における顔候補検出を正確に行うことができる。また、高解像度の教師画像によって顔を学習することで、第２判定部H2i(i=1〜L)における顔検出を正確に行うことができる。 By learning a face candidate from a low-resolution teacher image, face candidate detection in the first determination unit H1i (i = 1 to L) can be accurately performed. In addition, by learning a face with a high-resolution teacher image, face detection in the second determination unit H2i (i = 1 to L) can be accurately performed.

第１抽出フィルタを低解像度の教師画像によって学習して作成することで、顔候補検出を正確に行うことができる。また、同様に第２抽出フィルタを高解像度の教師画像によって学習して作成することで、顔検出を正確に行うことができる。 Face candidates can be detected accurately by learning and creating the first extraction filter with a low-resolution teacher image. Similarly, the face detection can be accurately performed by learning and creating the second extraction filter with a high-resolution teacher image.

By creating the first extraction filter from low-resolution teacher images obtained by reducing the high-resolution teacher images, the first detection unit 4 can accurately detect as a face candidate even a face that appears small in the image. In addition, by creating the first extraction filter and the second extraction filter from the same teacher images, the results of face candidate detection in the first detection unit 4 and of face detection in the second detection unit 7 can be brought closer together, so face detection can be performed accurately.

第１抽出フィルタおよび第２抽出フィルタを矩形特徴で構成することで、顔検出制御を素早く行うことができる。 By configuring the first extraction filter and the second extraction filter with rectangular features, face detection control can be performed quickly.

By training the first determination units H1i (i = 1 to L) and the second determination units H2i (i = 1 to L) with a boosting learning method, face candidate detection and face detection can be performed quickly and accurately.

次に本発明の第２実施形態について説明する。 Next, a second embodiment of the present invention will be described.

In this embodiment, the method by which the first feature extraction units T1i-j (i = 1 to L, j = 1 to M) and the second feature extraction units T2i-j (i = 1 to L, j = 1 to M) calculate feature amounts differs from that of the first embodiment. The first feature extraction units T1i-j and the second feature extraction units T2i-j calculate feature amounts using a higher-order local autocorrelation function. The higher-order autocorrelation function extends autocorrelation to higher orders: with reference point r and target image f(r), the N-th order autocorrelation function is defined for displacements (a1, a2, ..., aN) by the following expression.
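The expression itself does not appear in this text. As an assumption about what is intended, the form in which the N-th order autocorrelation function is commonly written, using the same symbols (reference point r, target image f(r), displacements a1 to aN), is:

```latex
x_N(a_1, a_2, \ldots, a_N) = \int f(r)\, f(r + a_1)\, f(r + a_2) \cdots f(r + a_N)\, dr
```

For N = 0 this reduces to the integral of f(r) alone, and for N = 1 to the ordinary autocorrelation with a single displacement a1.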

When the displacements are restricted to a local neighborhood of r, the result is called higher-order local autocorrelation. FIG. 9 shows examples of local patterns when the order N is limited to at most 2 and the region is limited to 3 × 3 pixels. FIG. 9A is the local pattern for order N = 0, FIG. 9B is an example of a local pattern for N = 1, and FIG. 9C is an example of a local pattern for N = 2. The feature amount at the reference point r is calculated by multiplying together the pixel values at the black-circle and white-circle positions in FIG. 9.

The first feature extraction units T1i-j (i = 1 to L, j = 1 to M) configure the first extraction filter as a filter having local patterns such as those shown in FIG. 9. Each unit overlays the first extraction filter on the low-resolution image and calculates the feature amount from the pixel values of the low-resolution image that correspond to the local patterns of the first extraction filter. When the first extraction filter consists of a plurality of local patterns, the feature amount is calculated by summing the products of the pixel values corresponding to each local pattern.
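The calculation described above (a product of pixel values per local pattern, summed over the patterns in the filter) can be sketched as follows. The function names and the representation of a filter as a list of pixel-coordinate lists are hypothetical choices made for illustration; the image is assumed to be a list of pixel rows indexed as `image[y][x]`.

```python
from math import prod


def local_pattern(ref, displacements=()):
    """Pixel coordinates of one local pattern: the reference point plus displaced points.

    An empty `displacements` gives order 0, one displacement order 1, two order 2.
    Displacements are restricted to the 3 x 3 neighbourhood of the reference point.
    """
    rx, ry = ref
    for dx, dy in displacements:
        if abs(dx) > 1 or abs(dy) > 1:
            raise ValueError("displacement outside the 3 x 3 neighbourhood")
    return [(rx, ry)] + [(rx + dx, ry + dy) for dx, dy in displacements]


def filter_feature(image, patterns):
    """Sum, over all local patterns in the filter, of the product of their pixel values."""
    return sum(prod(image[y][x] for x, y in points) for points in patterns)
```

With `image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]` and the reference point at the centre, an order-1 pattern with displacement `(1, 0)` multiplies the pixel values 5 and 6.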

第２特徴抽出部T2i-j(i=1〜L,j=1〜M)は、第１特徴抽出部T1i-j(i=1〜L,j=1〜M)と同様の方法によって特徴量を算出する。 The second feature extraction unit T2i-j (i = 1 to L, j = 1 to M) is characterized by the same method as the first feature extraction unit T1i-j (i = 1 to L, j = 1 to M). Calculate the amount.

The effects of the second embodiment of the present invention will now be described.

第１抽出フィルタおよび第２抽出フィルタを高次局所自己相関によって構成することで、顔検出制御を素早く行うことができる。 By configuring the first extraction filter and the second extraction filter with higher-order local autocorrelation, face detection control can be performed quickly.

次に本発明の第３実施形態について説明する。 Next, a third embodiment of the present invention will be described.

本実施形態は、第１実施形態と比較して、第２特徴抽出部T2i-j(i=1〜L,j=1〜M)における特徴量の算出を少なくするものである。 In the present embodiment, the feature amount calculation in the second feature extraction unit T2i-j (i = 1 to L, j = 1 to M) is reduced as compared with the first embodiment.

本実施形態では、低解像度画像および第１抽出フィルタの大きさを１０×１０pixelとし、高解像度画像および第２抽出フィルタの大きさを２０×２０pixelとする。つまり、高解像度画像および第２抽出フィルタの大きさは、低解像度画像および第１抽出フィルタの大きさの２倍である。 In the present embodiment, the size of the low resolution image and the first extraction filter is 10 × 10 pixels, and the size of the high resolution image and the second extraction filter is 20 × 20 pixels. That is, the size of the high resolution image and the second extraction filter is twice the size of the low resolution image and the first extraction filter.

本実施形態の第２抽出フィルタについて説明する。第１抽出フィルタおよび第２抽出フィルタの矩形特徴が図１０に示す特徴であるとする。なお、説明のため第１抽出フィルタおよび第２抽出フィルタは、３つの矩形特徴を備えているものとする。図１０（ａ）は第１抽出フィルタを示し、図１０（ｂ）は第２抽出フィルタを示す。 The second extraction filter of this embodiment will be described. Assume that the rectangular features of the first extraction filter and the second extraction filter are the features shown in FIG. For the sake of explanation, it is assumed that the first extraction filter and the second extraction filter have three rectangular features. FIG. 10A shows the first extraction filter, and FIG. 10B shows the second extraction filter.

図１０（ａ）の矩形特徴をα、β、γとし、図１０（ｂ）の矩形特徴をα’、β’、γ’とする。矩形特徴α、β、γの位置と矩形特徴α’、β’、γ’の位置を比較すると、矩形特徴αと矩形特徴α’に関しては各フィルタに対する相対的な位置は一致している。しかし、矩形特徴β、γと矩形特徴β’、γ’に関しては各フィルタに対する相対的な位置は一致していない。 The rectangular features in FIG. 10A are α, β, and γ, and the rectangular features in FIG. 10B are α ′, β ′, and γ ′. Comparing the positions of the rectangular features α, β, and γ with the positions of the rectangular features α ′, β ′, and γ ′, the relative positions of the rectangular feature α and the rectangular feature α ′ with respect to each filter are the same. However, the relative positions of the rectangular features β and γ and the rectangular features β ′ and γ ′ with respect to the filters do not match.

また、矩形特徴α’の大きさは、矩形特徴αの大きさの２倍であり、第１抽出フィルタと第２抽出フィルタとの大きさの比率と一致する。しかし、矩形特徴γと矩形特徴γ’については大きさの比率は第１抽出フィルタと第２抽出フィルタとの大きさの比率と一致していない。 Further, the size of the rectangular feature α ′ is twice the size of the rectangular feature α, and matches the size ratio of the first extraction filter and the second extraction filter. However, the size ratio of the rectangular feature γ and the rectangular feature γ ′ does not match the size ratio of the first extraction filter and the second extraction filter.

In other words, the position and size of the rectangular feature α relative to the first extraction filter coincide with the position and size of the rectangular feature α′ relative to the second extraction filter.

In such a case, the first extraction filter having the rectangular feature α and the second extraction filter having the rectangular feature α′ yield the same result in face candidate detection and in face detection. For example, if a first feature extraction unit T1i-j (i = 1 to L, j = 1 to M) equipped with the first extraction filter having the rectangular feature α determines that the low-resolution image includes a face candidate feature, the second feature extraction unit T2i-j (i = 1 to L, j = 1 to M) equipped with the second extraction filter having the rectangular feature α′ likewise determines that the high-resolution image includes a facial feature. Therefore, omitting the determination in that second feature extraction unit T2i-j does not affect the face detection result.

In this embodiment, a second feature extraction unit T2i-j (i = 1 to L, j = 1 to M) whose second extraction filter has a rectangular feature matching, in relative position and size, a rectangular feature of the first extraction filter omits the calculation of the feature amount. Such a second feature extraction unit T2i-j does not calculate a feature amount and instead reuses the determination result of the corresponding first feature extraction unit T1i-j (i = 1 to L, j = 1 to M). For example, when the first feature extraction unit T1i-j outputs "1", indicating that the low-resolution image includes a face candidate feature, the second feature extraction unit T2i-j whose second extraction filter has the matching rectangular feature also outputs "1".
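The reuse of the first-stage result can be sketched as follows. This is a simplified illustration with hypothetical names: each rectangular feature is represented only by its bounding box (`Rect`), the size ratio between the two filters is passed as `factor`, and `compute_high` stands in for the feature calculation that the second feature extraction unit would otherwise perform.

```python
from typing import Callable, NamedTuple


class Rect(NamedTuple):
    """Bounding box of a rectangular feature inside its extraction filter (in pixels)."""
    x: int
    y: int
    w: int
    h: int


def matches_relatively(low: Rect, high: Rect, factor: int = 2) -> bool:
    """True if `high` is exactly `low` scaled by the filter size ratio."""
    scaled = Rect(low.x * factor, low.y * factor, low.w * factor, low.h * factor)
    return high == scaled


def second_stage_output(low: Rect, high: Rect, first_output: int,
                        compute_high: Callable[[], int], factor: int = 2) -> int:
    """Reuse the first-stage output when the features match; otherwise compute it."""
    if matches_relatively(low, high, factor):
        return first_output      # the feature amount calculation is skipped
    return compute_high()        # no match: calculate as usual
```

For a 10 × 10 first extraction filter and a 20 × 20 second extraction filter, `factor` is 2, so a feature at `Rect(2, 5, 4, 3)` matches one at `Rect(4, 10, 8, 6)`.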

なお、第１抽出フィルタの矩形特徴と相対的な位置、大きさが一致する矩形特徴を削除して第２抽出フィルタを作成しても良い。 Note that the second extraction filter may be created by deleting a rectangular feature whose relative position and size match the rectangular feature of the first extraction filter.

In this embodiment, the case where the high-resolution image and the second extraction filter are twice the size of the low-resolution image and the first extraction filter has been described, but the invention is not limited to this. It suffices that the size of the high-resolution image and the second extraction filter is an integer multiple of the size of the low-resolution image and the first extraction filter, and that the relative positions and sizes coincide.

本発明の第３実施形態の効果について説明する。 The effect of the third embodiment of the present invention will be described.

In a second feature extraction unit whose second extraction filter has a rectangular feature matching, in relative position and size, a rectangular feature of the first extraction filter, the calculation of the feature amount is omitted, so face detection control can be performed quickly.

Next, a fourth embodiment of the present invention will be described with reference to FIG. 11. FIG. 11 is a schematic block diagram of the first detection unit 10 of the imaging apparatus according to the fourth embodiment.

The imaging apparatus of this embodiment differs from that of the first embodiment in its first detection unit 10. The rest of the configuration is the same as in the first embodiment, so its description is omitted here. In this embodiment, the high-resolution image is an image of a region delimited to 20 × 20 pixels.

The first feature extraction units T11i-j (i = 1 to L, j = 1 to M) use a reduced version of the second extraction filter as the first extraction filter. That is, in this embodiment the second extraction filter is constructed at learning time from high-resolution teacher images, and the first feature extraction units T11i-j use this second extraction filter to determine whether the low-resolution image includes face candidate features. The first extraction filter is 10 × 10 pixels.

第２抽出フィルタを第１抽出フィルタとして用いる方法について図１２を用いて説明する。 A method of using the second extraction filter as the first extraction filter will be described with reference to FIG.

ここでは、第２抽出フィルタの矩形特徴が例えば図１２（ａ）に示す特徴であるとする。なお、図１２（ａ）には説明のため矩形特徴α、β、γ、θを記載している。 Here, it is assumed that the rectangular feature of the second extraction filter is, for example, the feature shown in FIG. In FIG. 12A, rectangular features α, β, γ, and θ are shown for explanation.

The rectangular features α, β, and γ of FIG. 12A, reduced to 1/2 to match the size of the low-resolution image, are denoted α′, β′, and γ′ and are shown in FIGS. 12B and 12C. In the following, coordinates (x, y) are set with the lower-left corner of each drawing in FIG. 12 as the origin. The coordinate unit is 1 pixel, so the pixel grid lines that form the boundaries between adjacent pixels correspond to the coordinates.

The vertices α1 to α6 of the rectangular feature α have coordinates α1 (4, 10), α2 (8, 10), α3 (12, 10), α4 (12, 16), α5 (8, 16), and α6 (4, 16). When α is reduced, the vertices α1′ to α6′ of the rectangular feature α′ in FIG. 12B therefore become α1′ (2, 5), α2′ (4, 5), α3′ (6, 5), α4′ (6, 8), α5′ (4, 8), and α6′ (2, 8), and each of them falls on an intersection of the pixel grid.

The vertices β1 to β6 of the rectangular feature β have coordinates β1 (7, 1), β2 (11, 1), β3 (15, 1), β4 (15, 7), β5 (11, 7), and β6 (7, 7). When β is reduced, the vertices corresponding to β1 to β6 do not fall on intersections of the pixel grid. In this case, the vertices are moved according to a fixed rule. In this embodiment, a vertex that does not fall on a grid intersection is moved to the grid intersection to its right, below it, or to its lower right. Under this rule, the rectangular feature β′ in FIG. 12B has vertices β1′ (4, 0), β2′ (6, 0), β3′ (8, 0), β4′ (8, 3), β5′ (6, 3), and β6′ (4, 3), and each of them falls on an intersection of the pixel grid.
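The vertex-moving rule can be written as a short function. This is a sketch with a hypothetical name, `shrink_vertex`. It assumes, as in the text, that the origin is at the lower left, so "right" means rounding x up and "below" means rounding y down, and it uses integer arithmetic so that vertices already on a grid intersection are left unchanged.

```python
def shrink_vertex(x: int, y: int, factor: int = 2) -> tuple[int, int]:
    """Scale a vertex by 1/factor and snap it onto the pixel grid.

    With the origin at the lower left, an off-grid x is rounded up (moved right)
    and an off-grid y is rounded down (moved down).
    """
    snapped_x = (x + factor - 1) // factor  # ceiling division
    snapped_y = y // factor                 # floor division
    return snapped_x, snapped_y


# Vertices of the rectangular feature beta from the text, and their reduced positions.
beta = [(7, 1), (11, 1), (15, 1), (15, 7), (11, 7), (7, 7)]
beta_reduced = [shrink_vertex(x, y) for x, y in beta]
# beta_reduced == [(4, 0), (6, 0), (8, 0), (8, 3), (6, 3), (4, 3)]
```

The same function reproduces the reduced vertices of α unchanged, since every coordinate of α is even.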

矩形特徴γの頂点γ１〜γ６の座標は、γ１（１４、１５）、γ２（１７、１５）、γ３（２０、１５）、γ４（２０、１８）、γ５（１７、１８）、γ６（１４、１８）となる。この場合、矩形特徴γの頂点γ１〜γ６に対応する頂点の一部は画素格子の交点に重ならない。これに対して矩形特徴βと同様のルールを適用すると、図１２（ｂ）において、γ１'（７、７）、γ２'（９、７）、γ３'（１０、７）、γ４'（１０、９）、γ５'（９、９）、γ６'（７、９）となる。上記ルールを適用すると、矩形特徴γ’の頂点は、画素格子の交点に重なるが、黒の矩形（第１の特徴領域）との面積と白の矩形（第２の特徴領域）の面積との比率が、縮小する前の矩形特徴γの比率と比較すると異なる比率となる。このような場合には、面積の多い部分を１pixel減らし、γ１''（８、７）、γ６''（８、９）とすることで、図１２（ｃ）に示すように黒の矩形の面積と白の矩形の面積との比率を縮小の前後で同じ比率となるようにする。 The coordinates of the vertices γ1 to γ6 of the rectangular feature γ are γ1 (14, 15), γ2 (17, 15), γ3 (20, 15), γ4 (20, 18), γ5 (17, 18), and γ6 (14, 18). In this case, some of the vertices corresponding to γ1 to γ6 do not overlap intersections of the pixel grid. Applying the same rule as for the rectangular feature β gives, in FIG. 12(b), γ1' (7, 7), γ2' (9, 7), γ3' (10, 7), γ4' (10, 9), γ5' (9, 9), and γ6' (7, 9). With this rule the vertices of the rectangular feature γ' overlap intersections of the pixel grid, but the ratio between the area of the black rectangle (first feature region) and the area of the white rectangle (second feature region) differs from the ratio in the rectangular feature γ before the reduction. In such a case, the larger region is reduced by 1 pixel, giving γ1'' (8, 7) and γ6'' (8, 9), so that the ratio between the area of the black rectangle and the area of the white rectangle is the same before and after the reduction, as shown in FIG. 12(c).
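
The area correction applied to γ' can be sketched in Python as follows. This is a sketch under stated assumptions: the feature is taken to consist of two side-by-side rectangles sharing a vertical edge, and "reducing by 1 pixel" is taken to mean moving the outer vertical edge of the oversized rectangle inward by one pixel, as in the γ example. The name `balance_two_rect_feature` and its parameters are hypothetical.

```python
from fractions import Fraction


def balance_two_rect_feature(x_left, x_mid, x_right, y_bottom, y_top, target_ratio):
    """Return (x_left, x_mid, x_right) after at most one 1-pixel correction.

    target_ratio is (left area) / (right area) before the reduction.
    Only a single correction step is applied, as in the example in the text.
    """
    height = y_top - y_bottom
    left_area = (x_mid - x_left) * height
    right_area = (x_right - x_mid) * height
    ratio = Fraction(left_area, right_area)
    if ratio == target_ratio:
        return x_left, x_mid, x_right      # ratio already preserved
    if ratio > target_ratio:
        x_left += 1                        # left rectangle is too large
    else:
        x_right -= 1                       # right rectangle is too large
    return x_left, x_mid, x_right


# gamma' after snapping: x = 7, 9, 10 and y = 7..9; the original ratio was 3:3.
print(balance_two_rect_feature(7, 9, 10, 7, 9, Fraction(1, 1)))
```

For γ' the left rectangle is twice as wide as the right one after snapping, so its outer edge moves from x = 7 to x = 8, which corresponds to γ1'' (8, 7) and γ6'' (8, 9) in the text.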

また、黒の矩形の面積と白の矩形の面積との比率を縮小の前後で同じ比率とするために、黒の矩形と白の矩形との重みを変えてもよい。これによっても第２抽出フィルタを縮小して第１抽出フィルタとして用いることができる。 Alternatively, to keep the ratio between the area of the black rectangle and the area of the white rectangle the same before and after the reduction, the weights of the black rectangle and the white rectangle may be changed. This also allows the second extraction filter to be reduced and used as the first extraction filter.

なお、図１２（ａ）の矩形特徴θは、縮小すると最小単位である１pixelの黒の矩形と１pixel白の矩形とで構成することができないので削除される。 Note that the rectangular feature θ in FIG. 12(a) is deleted because, once reduced, it can no longer be composed of a 1-pixel black rectangle and a 1-pixel white rectangle, 1 pixel being the smallest unit.

本実施形態では、以上の方法によって第２抽出フィルタを縮小した第１抽出フィルタを用いて第１特徴抽出部T11i-j(i=1〜L,j=1〜M)によって低解像度画像の特徴量を算出する。 In the present embodiment, the first feature extraction units T11i-j (i = 1 to L, j = 1 to M) calculate the feature amounts of the low-resolution image using the first extraction filter obtained by reducing the second extraction filter by the above method.

第１判定部H11i(i=1〜L)は、各第１特徴抽出部T11i-j(i=1〜L,j=1〜M)から出力される値に基づいて低解像度画像が顔候補を含んでいるかどうか判定する。 Each first determination unit H11i (i = 1 to L) determines whether the low-resolution image includes a face candidate based on the values output from the first feature extraction units T11i-j (i = 1 to L, j = 1 to M).

なお、第１抽出フィルタを拡大して第２抽出フィルタとして用いても良い。 Note that the first extraction filter may be enlarged and used as the second extraction filter.

本発明の第４実施形態の効果について説明する。 The effects of the fourth embodiment of the present invention will now be described.

第２抽出フィルタを低解像度画像と高解像度画像との大きさの比率に応じて縮小し、第１抽出フィルタとして使用することで、撮像装置に記憶させるデータ量を少なくすることができる。 By reducing the second extraction filter according to the size ratio between the low-resolution image and the high-resolution image and using the reduced filter as the first extraction filter, the amount of data that must be stored in the imaging apparatus can be reduced.

第２抽出フィルタを縮小する場合に、縮小した矩形特徴の頂点が画素格子の交点に重なるように縮小することで、第２抽出フィルタを第１抽出フィルタとして用いることができる。 When the second extraction filter is reduced, reducing it so that the vertices of the reduced rectangular features overlap intersections of the pixel grid allows the second extraction filter to be used as the first extraction filter.

第２抽出フィルタを縮小する場合に、矩形特徴の黒の矩形の面積と白の矩形の面積との比率が、縮小前の比率と同じ比率となるように、縮小した矩形特徴の頂点を移動させることで、第２抽出フィルタを第１抽出フィルタとして用いることができる。 When reducing the second extraction filter, the vertex of the reduced rectangular feature is moved so that the ratio of the black rectangular area and the white rectangular area of the rectangular feature is the same as the ratio before the reduction. Thus, the second extraction filter can be used as the first extraction filter.

次に本発明の第５実施形態の撮像装置について図１３を用いて説明する。図１３は、第５実施形態の撮像装置の概略ブロック図である。 Next, an imaging device according to a fifth embodiment of the present invention will be described with reference to FIG. FIG. 13 is a schematic block diagram of an imaging apparatus according to the fifth embodiment.

本実施形態の撮像装置は、画像取得部２０と、縮小部２１と、高解像度画像取得部２２と、低解像度画像取得部２３と、検出部２４とを備える。 The imaging apparatus of the present embodiment includes an image acquisition unit 20, a reduction unit 21, a high-resolution image acquisition unit 22, a low-resolution image acquisition unit 23, and a detection unit 24.

縮小部２１は、第１実施形態の第２縮小部５と同様の構成であり、画像取得部２０が取得した画像から第２画像を取得する。 The reduction unit 21 has the same configuration as the second reduction unit 5 of the first embodiment, and acquires the second image from the image acquired by the image acquisition unit 20.

高解像度画像取得部２２は、縮小部２１で取得した第２画像から高解像度画像を取得する。高解像度画像は、第２画像の中で２０×２０pixelで区切られた領域の画像である。 The high resolution image acquisition unit 22 acquires a high resolution image from the second image acquired by the reduction unit 21. The high resolution image is an image of an area divided by 20 × 20 pixels in the second image.

低解像度画像取得部２３は、高解像度画像を所定の縮小率で縮小して低解像度画像を取得する。本実施形態では、低解像度画像は、高解像度画像を１／２に縮小した１０×１０pixelの画像である。 The low resolution image acquisition unit 23 acquires the low resolution image by reducing the high resolution image at a predetermined reduction rate. In the present embodiment, the low resolution image is a 10 × 10 pixel image obtained by reducing the high resolution image to ½.

検出部２４について図１４を用いて説明する。図１４は、検出部２４の概略ブロック図である。検出部２４は、複数の判別部Hk（k=1〜n、n:２以上の整数）をカスケード接続することで構成される。また、検出部２４は、第１検出部２５と、第２検出部２６とを備える。 The detection unit 24 will be described with reference to FIG. FIG. 14 is a schematic block diagram of the detection unit 24. The detection unit 24 is configured by cascading a plurality of determination units Hk (k = 1 to n, n: integer of 2 or more). The detection unit 24 includes a first detection unit 25 and a second detection unit 26.

第１検出部２５は、判別部Hkの中でｍ番目までの判別部Hm（ｍ：１以上の整数、ｍ＜ｎ、以下、この判別部Hkを第１判別部Hmとして説明する）によって構成される。 The first detection unit 25 is composed of the first through m-th determination units among the determination units Hk (m: an integer of 1 or more, m < n; hereinafter, these determination units are referred to as the first determination units Hm).

第１判別部Hmは、低解像度画像取得部２３によって取得された低解像度画像が顔候補の特徴を含んでいるか判断する識別器である。第１判別部Hmは、第４実施形態の第１特徴抽出部T11i-j(i=1〜L,j=1〜M)と同様に、第２抽出フィルタを縮小したフィルタを第１抽出フィルタとして用いて低解像度画像の特徴量を算出する。そして、第１判別部Hmは、特徴量が第１閾値よりも大きい場合に、低解像度画像が顔候補の特徴を含んでいると判定する。 The first determination unit Hm is a discriminator that determines whether the low-resolution image acquired by the low-resolution image acquisition unit 23 includes the feature of the face candidate. Similar to the first feature extraction unit T11i-j (i = 1 to L, j = 1 to M) of the fourth embodiment, the first determination unit Hm uses a filter obtained by reducing the second extraction filter as the first extraction filter. Is used to calculate the feature amount of the low-resolution image. Then, the first determination unit Hm determines that the low-resolution image includes the feature of the face candidate when the feature amount is larger than the first threshold value.

第１検出部２５は、ｍ番目の第１判別部Hmによって低解像度画像が顔候補の特徴を含んでいると判定すると、低解像度画像が顔候補を含んだ画像であると判定する。 If the m-th first determination unit Hm determines that the low resolution image includes the feature of the face candidate, the first detection unit 25 determines that the low resolution image is an image including the face candidate.

第２検出部２６は、判別部Hkの中でｍ＋１からｎ番目の判別部Hn（以下、この判別部を第２判別部Hnとして説明する）によって構成される。 The second detection unit 26 is composed of the (m+1)-th through n-th determination units among the determination units Hk (hereinafter, these determination units are referred to as the second determination units Hn).

第２判別部Hnは、高解像度画像取得部２２によって取得された高解像度画像が顔の特徴を含んでいるか判断する識別器である。第２判別部Hnは、第１検出部２５によって低解像度画像が顔候補を含んでいると判定されると、低解像度画像を作成した高解像度画像が有する特徴量を算出する。そして、第２判別部Hnは、特徴量が第２閾値よりも大きい場合に、高解像度画像が顔の特徴を含んでいると判定する。 Each second determination unit Hn is a discriminator that determines whether the high-resolution image acquired by the high-resolution image acquisition unit 22 includes facial features. When the first detection unit 25 determines that the low-resolution image includes a face candidate, the second determination unit Hn calculates the feature amount of the high-resolution image from which that low-resolution image was created. The second determination unit Hn then determines that the high-resolution image includes facial features when the feature amount is larger than a second threshold.

第２検出部２６は、ｎ番目の第２判別部Hnによって高解像度画像が顔の特徴を含んでいると判定すると、高解像度画像が顔を含んでいると判定する。 When the n-th second determination unit Hn determines that the high-resolution image includes facial features, the second detection unit 26 determines that the high-resolution image includes a face.
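
The cascade formed by the first and second detection units can be sketched in Python as follows. This is a minimal sketch, assuming each determination unit can be modelled as a pair of a feature function and a threshold; the name `cascade_detect` and the pair representation are hypothetical, and the actual feature computation (reduced rectangular filters and so on) is left abstract.

```python
def cascade_detect(low_res, high_res, first_stages, second_stages):
    """Run stages 1..m on the low-resolution image, then m+1..n on the
    high-resolution image. Each stage is a (feature_fn, threshold) pair.

    A window is rejected as soon as one stage's feature amount is not
    larger than its threshold, so later stages are skipped entirely.
    """
    for feature_fn, threshold in first_stages:
        if feature_fn(low_res) <= threshold:
            return False   # no face candidate: high-resolution stages are skipped
    for feature_fn, threshold in second_stages:
        if feature_fn(high_res) <= threshold:
            return False   # candidate rejected at high resolution
    return True            # the n-th stage passed: the window contains a face
```

Because rejection returns immediately, the high-resolution feature amounts are computed only for windows that passed every low-resolution stage.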

次に本実施形態の顔検出制御について図１５のフローチャートを用いて説明する。 Next, face detection control of this embodiment will be described with reference to the flowchart of FIG.

ステップＳ２００では、入力された画像から第２画像を取得する。後述するステップＳ２０６からステップが戻って来た場合には、新たな縮小率で縮小した第２画像を取得する。 In step S200, a second image is acquired from the input image. When the step returns from step S206 described later, a second image reduced at a new reduction rate is acquired.

ステップＳ２０１では、入力された第２画像から高解像度画像を取得する。後述するステップＳ２０５からステップが戻って来た場合には、新たに高解像度画像を取得する。 In step S201, a high resolution image is acquired from the input second image. When the step returns from step S205 described later, a new high-resolution image is acquired.

ステップＳ２０２では、取得した高解像度画像を所定の縮小率で縮小し、低解像度画像を取得する。 In step S202, the acquired high resolution image is reduced at a predetermined reduction rate to acquire a low resolution image.

ステップＳ２０３では、低解像度画像が顔候補を含んでいるかどうか判定する。そして、低解像度画像が顔候補を含んでいると判定するとステップＳ２０４へ進む。一方、低解像度画像が顔候補を含んでいないと判定するとステップＳ２０５へ進む。 In step S203, it is determined whether the low resolution image includes a face candidate. If it is determined that the low-resolution image includes a face candidate, the process proceeds to step S204. On the other hand, if it is determined that the low-resolution image does not include a face candidate, the process proceeds to step S205.

ステップＳ２０４では、ステップＳ２０１によって取得した高解像度画像が顔を含んでいるかどうか判定する。そして、高解像度画像が顔を含んでいると判定すると、入力された第２画像に顔が含まれていることを示す信号を出力する。 In step S204, it is determined whether or not the high-resolution image acquired in step S201 includes a face. If it is determined that the high-resolution image includes a face, a signal indicating that the input second image includes a face is output.

ステップＳ２０５では、第２画像の中で高解像度画像を取得していない箇所があるかどうか判定する。そして、第２画像の中で高解像度画像を取得していない箇所がある場合には、ステップＳ２０１へ戻り、第２画像の中から新たに高解像度画像を取得し、上記制御を繰り返す。一方、第２画像の全範囲において高解像度画像を取得した場合にはステップＳ２０６へ進む。 In step S205, it is determined whether or not there is a portion in the second image from which a high resolution image has not been acquired. If there is a portion in the second image where the high-resolution image is not acquired, the process returns to step S201, a new high-resolution image is acquired from the second image, and the above control is repeated. On the other hand, when a high-resolution image is acquired in the entire range of the second image, the process proceeds to step S206.

ステップＳ２０６では、設定された全ての縮小率で第２画像を取得したかどうか判定する。そして、設定された全ての縮小率で第２画像を取得した場合には、本制御を終了する。一方、設定された縮小率の中で選択されていない縮小率がある場合には、ステップＳ２００へ戻り、新たな縮小率で縮小した第２画像を取得する。 In step S206, it is determined whether the second image has been acquired at all of the set reduction ratios. If the second image has been acquired at all of the set reduction ratios, this control ends. If, on the other hand, any of the set reduction ratios has not yet been selected, the process returns to step S200, and a second image reduced at a new reduction ratio is acquired.
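
The control flow of steps S200 to S206 can be sketched in Python as follows. This is a sketch under several assumptions that the source does not state: images are lists of rows of numbers, reduction is done by simple subsampling with an integer factor, the 20 × 20 windows are taken on a non-overlapping grid, and the two detectors are passed in as callables. All names (`shrink`, `scan`, `is_candidate`, `is_face`) are hypothetical.

```python
def shrink(img, factor):
    """Reduce an image by keeping every factor-th pixel (an assumption;
    the text does not specify the reduction method)."""
    return [row[::factor] for row in img[::factor]]


def scan(image, factors, is_candidate, is_face, win=20, stride=20):
    """Return (factor, top, left) for every window judged to contain a face."""
    hits = []
    for factor in factors:                                   # S200, looped by S206
        second = shrink(image, factor)                       # second image
        height, width = len(second), len(second[0])
        for top in range(0, height - win + 1, stride):       # S201, looped by S205
            for left in range(0, width - win + 1, stride):
                high = [row[left:left + win] for row in second[top:top + win]]
                low = shrink(high, 2)                        # S202: 20x20 -> 10x10
                if not is_candidate(low):                    # S203: no candidate
                    continue                                 # skip S204 for this window
                if is_face(high):                            # S204
                    hits.append((factor, top, left))
    return hits
```

The expensive `is_face` check runs only on windows whose 10 × 10 version passed `is_candidate`, which is the source of the speed-up the embodiment describes.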

なお、本実施形態では、高解像度画像を縮小することで、低解像度画像を作成したが、低解像度画像を取得し、低解像度画像を拡大することで、高解像度画像を作成しても良い。 In this embodiment, the low resolution image is created by reducing the high resolution image. However, the high resolution image may be created by acquiring the low resolution image and enlarging the low resolution image.

また、第１実施形態と同様に低解像度画像と高解像度画像とを用意し、第１検出部においては低解像度画像を用いて顔候補判定を行い、第２検出部においては高解像度画像を用いて顔判定を行っても良い。 Similarly to the first embodiment, a low-resolution image and a high-resolution image are prepared, the first detection unit performs face candidate determination using the low-resolution image, and the second detection unit uses the high-resolution image. Face determination may be performed.

本発明の第５実施形態の効果について説明する。 The effects of the fifth embodiment of the present invention will now be described.

高解像度画像を縮小して低解像度画像を取得し、低解像度画像で顔候補ではないと判定された場合には、高解像度画像による顔検出を行わず、低解像度画像が顔候補であると判定された場合にのみ、高解像度画像による顔検出判定を行う。これにより、処理に時間がかかる高解像度画像による顔検出の頻度を少なくし、顔検出を素早く行うことができる。 A low-resolution image is obtained by reducing the high-resolution image. If the low-resolution image is determined not to be a face candidate, face detection on the high-resolution image is skipped; face detection on the high-resolution image is performed only when the low-resolution image is determined to be a face candidate. This reduces how often the time-consuming face detection on high-resolution images is run, so that face detection can be performed quickly.

本発明の第６実施形態について図１６を用いて説明する。図１６は第６実施形態の概略ブロック図である。本実施形態では、撮像装置としてデジタルスチルカメラ（以下、カメラとする）を用いたものである。本実施形態では、第１実施形態で説明した顔検出制御を行う。 A sixth embodiment of the present invention will be described with reference to FIG. FIG. 16 is a schematic block diagram of the sixth embodiment. In the present embodiment, a digital still camera (hereinafter referred to as a camera) is used as the imaging apparatus. In the present embodiment, the face detection control described in the first embodiment is performed.

カメラは、撮像素子３０と、Ａ／Ｄ変換器３１と、第１縮小部３２と、低解像度画像取得部３３と、第１検出部３４と、第２縮小部３５と、高解像度画像取得部３６と、第２検出部３７と、画像表示部（表示部）３８と、メモリ（保存部）３９と、ＣＰＵ（制御部）４０とを備える。 The camera includes an image sensor 30, an A / D converter 31, a first reduction unit 32, a low resolution image acquisition unit 33, a first detection unit 34, a second reduction unit 35, and a high resolution image acquisition unit. 36, a second detection unit 37, an image display unit (display unit) 38, a memory (storage unit) 39, and a CPU (control unit) 40.

撮像素子３０は、光学系４１を通って受光面に入射される光に応じてアナログの信号を所定のタイミングで出力する。撮像素子３０は、例えばＣＣＤ（電荷結合素子）やＣＭＯＳ（相補型金属酸化膜半導体）センサと称される形式、あるいはその他の各種の形式の撮像素子である。 The image sensor 30 outputs an analog signal at a predetermined timing in accordance with light incident on the light receiving surface through the optical system 41. The imaging device 30 is, for example, a format called a CCD (charge coupled device) or a CMOS (complementary metal oxide semiconductor) sensor, or other various types of imaging devices.

Ａ／Ｄ変換器３１は、撮像素子によって出力されたアナログの信号を画像データとしてのデジタルの信号に変換する。 The A / D converter 31 converts an analog signal output by the image sensor into a digital signal as image data.

第１縮小部３２は、撮像素子によって得られた画像を縮小して第１画像を取得する。なお、第１画像はスルー画として画像表示部３８に表示される。画像表示部３８に表示されるスルー画は、決められた縮小率で縮小した画像である。低解像度画像取得部３３は、第１画像の中から低解像度画像を取得する。第１検出部３４は、低解像度画像から顔候補を検出する。 The first reduction unit 32 reduces the image obtained by the image sensor and acquires the first image. The first image is displayed on the image display unit 38 as a through image. The through image displayed on the image display unit 38 is an image reduced at a predetermined reduction rate. The low resolution image acquisition unit 33 acquires a low resolution image from the first image. The first detection unit 34 detects face candidates from the low resolution image.

第１検出部３４によって第１画像中に顔候補があると判定されると、実際に撮影を行い、第２縮小部３５は、第２画像を取得する。 When the first detection unit 34 determines that there is a face candidate in the first image, actual shooting is performed, and the second reduction unit 35 acquires the second image.

高解像度画像取得部３６は、第２画像から高解像度画像を取得する。取得された高解像度画像は、メモリ３９に保存される。なお、連写によって高解像度画像を取得しても良い。第２検出部３７は、高解像度画像から顔を検出する。第２検出部３７によって顔が検出されなかった場合には、高解像度画像はメモリ３９から削除される。 The high resolution image acquisition unit 36 acquires a high resolution image from the second image. The acquired high resolution image is stored in the memory 39. Note that a high-resolution image may be acquired by continuous shooting. The second detection unit 37 detects a face from the high resolution image. If no face is detected by the second detection unit 37, the high resolution image is deleted from the memory 39.

なお、高解像度画像をメモリ３９から削除する際には、例えばゴミ箱のような削除候補画像を入れるフォルダを設け、その中に削除する高解像度画像を一旦入れてもよい。また、ユーザーに削除することの確認を求め、ユーザーにより削除することを指示された場合に削除するようにしても良い。これによって、必要な画像が削除されることを防ぐことができる。 When deleting a high-resolution image from the memory 39, a folder for storing a deletion candidate image such as a trash can may be provided, and the high-resolution image to be deleted may be temporarily stored therein. Alternatively, the user may be asked to confirm the deletion, and the deletion may be performed when the user instructs the deletion. This can prevent a necessary image from being deleted.
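
The deletion handling described above (a trash-like folder plus user confirmation) can be sketched in Python as follows. This is only an illustration: the class `ImageStore`, its method names, and the choice to restore an image when the user declines deletion are assumptions and are not taken from the source.

```python
class ImageStore:
    """Toy model of the memory 39 with a folder for deletion candidates."""

    def __init__(self):
        self.saved = {}   # name -> image kept in memory
        self.trash = {}   # name -> image awaiting deletion

    def save(self, name, image):
        self.saved[name] = image

    def discard(self, name):
        """Move an image in which no face was detected to the trash folder."""
        self.trash[name] = self.saved.pop(name)

    def empty_trash(self, confirm):
        """Delete each trashed image only if confirm(name) returns True.

        Assumption: an image the user declines to delete is restored.
        """
        for name in list(self.trash):
            image = self.trash.pop(name)
            if not confirm(name):
                self.saved[name] = image
```

Here `confirm` stands in for the user prompt, so an image is removed for good only after the user explicitly agrees.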

画像表示部３８は、スルー画を表示させる。画像表示部３８は、スルー画の中で、第２検出部３７によって検出された顔に対応する箇所を例えば矩形状の枠で囲んで表示させても良い。これによってユーザーに顔が検出されたことを示すことができる。 The image display unit 38 displays the through image. The image display unit 38 may display the portion of the through image corresponding to the face detected by the second detection unit 37 surrounded by, for example, a rectangular frame. This shows the user that a face has been detected.

ＣＰＵ４０は、カメラ全体を制御する。ＣＰＵ４０は、メモリに格納されたプログラムを実行することで、カメラが有する様々な機能を発揮させる。 The CPU 40 controls the entire camera. By executing a program stored in the memory, the CPU 40 causes the camera to perform its various functions.

本発明の第６実施形態の効果について説明する。 The effect of the sixth embodiment of the present invention will be described.

低解像度画像によって顔候補が検出された場合に、高解像度画像を取得し、取得した高解像度画像をメモリ３９に保存することで、顔検出を素早く行うことができる。 When a face candidate is detected from a low-resolution image, a high-resolution image is acquired, and the acquired high-resolution image is stored in the memory 39, so that face detection can be performed quickly.

高解像度画像によって顔が検出されなかった場合には、メモリ３９に保存した高解像度画像を削除することで、メモリ３９の容量を小さくすることができる。 When a face is not detected by a high resolution image, the capacity of the memory 39 can be reduced by deleting the high resolution image stored in the memory 39.

なお、上記実施形態に限られる事はなく、上記実施形態の構成を組み合わせることが可能である。 Note that the present invention is not limited to the above embodiments, and the configurations of the above embodiments can be combined.

また、上記撮像装置は、デジタルカメラ、デジタルビデオカメラ、電子内視鏡など、正しく作動するために電流または電磁界に依存する機器である電子機器に搭載することが可能である。 In addition, the imaging apparatus can be mounted on an electronic device such as a digital camera, a digital video camera, or an electronic endoscope, which is a device that depends on an electric current or an electromagnetic field in order to operate correctly.

また、上述した実施形態の説明では、撮像装置が行う処理としてハードウェアによる処理を前提としていたが、このような構成に限定される必要はない。例えば、別途ソフトウェアにて処理する構成も可能である。 In the above description of the embodiment, the processing performed by the imaging apparatus is premised on processing by hardware, but is not necessarily limited to such a configuration. For example, a configuration in which processing is performed separately by software is also possible.

この場合、撮像装置は、ＣＰＵ、ＲＡＭ等の主記憶装置、上記処理の全て或いは一部を実現させるためのプログラムが記憶されたコンピュータ読取り可能な記憶媒体を備える。ここでは、このプログラムを画像処理プログラムと呼ぶ。そして、ＣＰＵが上記記憶媒体に記憶されている画像処理プログラムを読み出して、情報の加工・演算処理を実行することにより、上記撮像装置と同様の処理を実現させる。 In this case, the imaging apparatus includes a CPU, a main storage device such as a RAM, and a computer-readable storage medium storing a program for realizing all or part of the above processing. Here, this program is called an image processing program. The CPU reads out the image processing program stored in the storage medium and executes information processing and calculation, thereby realizing the same processing as the imaging apparatus described above.

ここで、コンピュータ読み取り可能な記録媒体とは、磁気ディスク、光磁気ディスク、ＣＤ−ＲＯＭ、ＤＶＤ−ＲＯＭ、半導体メモリ等をいう。また、この画像処理プログラムを通信回線によってコンピュータに配信し、この配信を受けたコンピュータが当該画像処理プログラムを実行するようにしても良い。 Here, the computer-readable recording medium refers to a magnetic disk, a magneto-optical disk, a CD-ROM, a DVD-ROM, a semiconductor memory, and the like. Alternatively, the image processing program may be distributed to a computer via a communication line, and the computer that has received the distribution may execute the image processing program.

本発明は上記した実施形態に限定されるものではなく、その技術的思想の範囲内でなしうるさまざまな変更、改良が含まれることは言うまでもない。 It goes without saying that the present invention is not limited to the above-described embodiments, and includes various modifications and improvements that can be made within the scope of the technical idea.

１、２０画像取得部
２、３２第１縮小部
３、２３、３３低解像度画像取得部
４、１０、２５、３４第１検出部
５、３５第２縮小部
６、２２、３６高解像度画像取得部
７、２６、３７第２検出部
２１縮小部
２４検出部
３０撮像素子
３１Ａ／Ｄ変換器
３８画像表示部（表示部）
３９メモリ（保存部）
４０ＣＰＵ（制御部）
T1i-j,T11i-j 第１特徴抽出部（第１特徴量算出部）
H1i,H11i 第１判定部
Hk 判別部
T2i-j 第２特徴抽出部（第２特徴量算出部）
H2i 第２判定部

1, 20 Image acquisition unit
2, 32 First reduction unit
3, 23, 33 Low-resolution image acquisition unit
4, 10, 25, 34 First detection unit
5, 35 Second reduction unit
6, 22, 36 High-resolution image acquisition unit
7, 26, 37 Second detection unit
21 Reduction unit
24 Detection unit
30 Image sensor
31 A/D converter
38 Image display unit (display unit)
39 Memory (storage unit)
40 CPU (control unit)
T1i-j, T11i-j First feature extraction unit (first feature amount calculation unit)
H1i, H11i First determination unit
Hk Determination unit
T2i-j Second feature extraction unit (second feature amount calculation unit)
H2i Second determination unit

Claims

An imaging apparatus for imaging a subject, comprising:
An image acquisition unit that acquires an image;
A low-resolution image acquisition unit that acquires a low-resolution image based on the image acquired by the image acquisition unit;
A first detection unit that detects a subject candidate in the low-resolution image;
A high-resolution image acquisition unit that acquires, from the image acquired by the image acquisition unit, a high-resolution image having a higher resolution than the low-resolution image; and
A second detection unit that detects the subject in the high-resolution image when the subject candidate is detected in the low-resolution image by the first detection unit.

The imaging apparatus according to claim 1, wherein the high-resolution image acquisition unit acquires the high-resolution image from the image acquired by the image acquisition unit when the subject candidate is detected in the low-resolution image by the first detection unit.

The imaging apparatus according to claim 1, wherein the low-resolution image acquisition unit acquires the low-resolution image by reducing the high-resolution image.

The imaging apparatus according to claim 2, wherein an over-detection rate of the subject candidates, which is the correct rate of the subject candidates with respect to the number of subject candidates detected by the first detection unit, is larger than an over-detection rate of the subject, which is the correct rate of the subject with respect to the number of subjects detected by the second detection unit.

The first detection unit includes:
A first feature amount calculation unit for calculating a feature amount of the low-resolution image;
A first determination unit that determines whether or not the subject candidate is included in the low-resolution image based on the feature amount calculated by the first feature amount calculation unit;
The imaging apparatus according to claim 2, wherein the first detection unit detects the subject candidate for the low-resolution image based on a determination result by the first determination unit.

The imaging apparatus according to claim 5, wherein the first determination unit determines whether or not the subject candidate is included in the low-resolution image by learning using a plurality of first learning images, which are images including the subject candidate and images not including the subject candidate.

The imaging apparatus according to claim 5, wherein the first feature amount calculation unit calculates the feature amount of the low-resolution image based on a first feature learned using a plurality of first learning images, which are images including the subject candidate and images not including the subject candidate.

The first learning image is an image obtained by reducing a plurality of second learning images that are a high-resolution image including the subject and a high-resolution image not including the subject. The imaging device described.

The imaging apparatus according to claim 7, wherein the first feature is configured by a rectangular feature or a higher-order local autocorrelation.

The imaging apparatus according to claim 6, wherein the first determination unit learns by a learning method using boosting.

The imaging apparatus according to claim 6, wherein the first determination unit learns by a learning method using a support vector machine.

The second detector is
A second feature amount calculation unit for calculating a feature amount of the high-resolution image;
A second determination unit that determines whether or not the high-resolution image includes the subject based on the feature amount calculated by the second feature amount calculation unit;
The imaging apparatus according to claim 2, wherein the second detection unit detects the subject from the high-resolution image based on a determination result by the second determination unit.

The imaging apparatus according to claim 12, wherein the second determination unit determines whether or not the subject is included in the high-resolution image by learning using a plurality of second learning images, which are high-resolution images including the subject and high-resolution images not including the subject.

The imaging apparatus according to claim 12, wherein the second feature amount calculation unit calculates the feature amount of the high-resolution image based on a second feature learned using a plurality of second learning images, which are high-resolution images including the subject and high-resolution images not including the subject.

The imaging apparatus according to claim 14, wherein the second feature includes a rectangular feature or a higher-order local autocorrelation.

The imaging apparatus according to claim 13, wherein the second determination unit learns by a learning method using boosting.

The imaging apparatus according to claim 13, wherein the second determination unit learns by a learning method using a support vector machine.

The imaging apparatus according to claim 2, wherein the high-resolution image acquisition unit acquires the high-resolution image from the image acquired by the image acquisition unit only near a position corresponding to the low-resolution image in which the subject candidate is detected by the first detection unit.

The first detection unit includes:
A first feature amount calculation unit that calculates a feature amount of the low-resolution image based on a first feature learned by a plurality of first learning images that are an image including the subject candidate and an image not including the subject candidate image; ,
A first determination unit that determines whether or not the subject candidate is included in the low-resolution image based on the feature amount calculated by the first feature amount calculation unit;
The second detection unit includes:
A second feature amount calculation unit that calculates a feature amount of the high-resolution image based on a second feature learned from a plurality of second learning images, which have a higher resolution than the first learning images and are images including the subject candidate and images not including the subject candidate;
A second determination unit that determines whether or not the high-resolution image includes the subject based on the feature amount calculated by the second feature amount calculation unit;
The imaging apparatus according to claim 2, wherein the second feature amount calculation unit does not calculate the feature amount of the high-resolution image for any second feature whose relative position and size are identical to those of the first feature.

The first detection unit includes:
A first feature amount calculation unit that calculates a feature amount of the low-resolution image based on a first feature;
A first determination unit that determines whether or not the subject candidate is included in the low-resolution image based on the feature amount calculated by the first feature amount calculation unit;
The second detector is
A second feature amount calculation unit that calculates a feature amount of the high-resolution image based on a second feature;
A second determination unit that determines whether or not the high-resolution image includes the subject based on the feature amount calculated by the second feature amount calculation unit;
The second feature is learned by a plurality of learning images that are a high-resolution image including the subject and a high-resolution image not including the subject,
The imaging apparatus according to claim 2, wherein the first feature amount calculation unit calculates the feature amount of the low-resolution image based on the first feature, which is obtained by reducing the second feature learned from the learning images according to the ratio between the sizes of the high-resolution image and the low-resolution image.

The first feature and the second feature include a feature composed of a first feature region and a second feature region,
The imaging apparatus according to claim 20, wherein, when the reduction is performed, if a vertex of the first feature region or the second feature region after the reduction does not overlap an intersection of the pixel lattice that forms the boundaries between adjacent pixels, the vertex is moved so as to overlap an intersection of the pixel lattice.

The imaging apparatus according to claim 21, wherein the areas of the first feature region and the second feature region after the reduction are adjusted so that the ratio between the first feature region and the second feature region after the reduction equals the ratio between the first feature region and the second feature region before the reduction.

The imaging apparatus according to claim 21, wherein the weights of the first feature region and the second feature region after the reduction are adjusted so that the ratio between the first feature region and the second feature region after the reduction equals the ratio between the first feature region and the second feature region before the reduction.

The first detection unit includes m (m is an integer of 1 or more) first determination units,
The second detection unit includes n (n is an integer of m + 1 or more) second determination units,
The imaging device according to claim 3, wherein the first detection unit and the second detection unit are connected in series.

The storage unit that stores the high-resolution image acquired by the high-resolution image acquisition unit when the subject candidate is detected for the low-resolution image by the first detection unit. The imaging device described in 1.

The imaging apparatus according to claim 25, further comprising a control unit that deletes the high-resolution image stored in the storage unit when the second detection unit does not detect the subject in the high-resolution image stored in the storage unit.

The imaging apparatus according to claim 25, further comprising a control unit that moves the high-resolution image stored in the storage unit to an image deletion folder when the second detection unit does not detect the subject in the high-resolution image stored in the storage unit.

28. The imaging apparatus according to claim 27, wherein the control unit obtains permission to delete the high-resolution image after moving the high-resolution image to the image deletion folder.

26. The imaging apparatus according to claim 25, further comprising a display unit that displays the subject surrounded by a frame when the second detection unit detects the subject with respect to the high-resolution image.

An electronic apparatus comprising the imaging apparatus according to any one of claims 1 to 29.

An image processing program for processing a captured image by a computer,
An image acquisition procedure for acquiring images;
A low-resolution image acquisition procedure for acquiring a low-resolution image based on the image acquired by the image acquisition procedure;
A first detection procedure for detecting subject candidates for the low-resolution image;
A second detection procedure for detecting, when the subject candidate is detected in the low-resolution image by the first detection procedure, the subject in a high-resolution image that is acquired from the image acquired by the image acquisition procedure and has a higher resolution than the low-resolution image.

An imaging method for imaging a subject,
Get an image,
Acquiring a low-resolution image based on the acquired image;
Detect subject candidates for the low-resolution image,
When the subject candidate is detected in the low-resolution image, detect the subject in a high-resolution image that is acquired from the acquired image and has a higher resolution than the low-resolution image.