JP2009230557A

JP2009230557A - Object detection device, object detection method, object detection program, and printer

Info

Publication number: JP2009230557A
Application number: JP2008076476A
Authority: JP
Inventors: Hiroyuki Tsuji; 宏幸辻
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2008-03-24
Filing date: 2008-03-24
Publication date: 2009-10-08

Abstract

PROBLEM TO BE SOLVED: To reduce the time required to detect an object, lowering the processing load and increasing the speed compared with conventional techniques. SOLUTION: An object detection device for detecting a prescribed object from an input image is provided with: an edge acquisition part that sets a detection window on the input image and acquires the edge quantity of each of a plurality of regions in the set detection window; and an object decision part that compares the acquired edge quantities between prescribed regions and, when the comparison result satisfies prescribed conditions, decides the presence/absence of the object using the image in the set detection window. COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an object detection device, an object detection method, an object detection program, and a printing apparatus.

Techniques for detecting a target image (object) in an input image are known.
An image processing apparatus is also known that sets a peripheral region around a red-eye candidate region, calculates the average edge value of the pixels in that region (the output obtained when a Sobel filter is applied to each pixel), and determines whether the red-eye candidate region is a red-eye region according to whether this average exceeds a threshold value (see Patent Document 1).
Patent Document 1: JP 2007-4455 A

Conventionally, when detection processing for an object such as a face image is executed on an input image, detection of the object has been attempted at every location in the input image in order to obtain a detection result without omissions. However, such detection processing is required to be lightweight and fast as well as accurate, and these goals cannot be sufficiently achieved if every location in the input image is treated equally as a detection target, as in the conventional approach. Patent Document 1 uses the average edge value of the pixels in the peripheral region to determine whether a red-eye candidate region is a red-eye region, but it does not present any use of edges other than identifying red-eye regions.

The present invention has been made in view of the above problems, and an object thereof is to provide an object detection device, an object detection method, an object detection program, and a printing apparatus that, when detecting an object from an input image, can reduce the processing load and increase the speed compared with conventional techniques while ensuring highly accurate detection.

In order to achieve the above object, the present invention provides an object detection device that detects a predetermined object from an input image, comprising: an edge acquisition unit that sets a detection window on the input image and acquires an edge amount for each of a plurality of regions in the set detection window; and an object determination unit that compares the acquired edge amounts between predetermined regions and, when the result of the comparison satisfies a predetermined condition, determines the presence or absence of the object for the image in the set detection window. According to the present invention, depending on the result of comparing the edge amounts between the predetermined regions in a detection window set on the input image, the presence/absence determination of the object is not executed for that detection window. That is, the presence/absence determination is skipped for any detection window in which, judging from the edge amount comparison, no object-like image is estimated to exist, so the processing amount and processing time are reduced without lowering the accuracy of object detection from the input image.

The edge acquisition unit may acquire an edge amount for each of a first region and a second region set in the detection window, the first region being set as a region that corresponds to a predetermined organ of a face image when the detection window includes the face image, and the second region being set as a region that corresponds to a predetermined skin portion of the face image other than the organ when the detection window includes the face image. The object determination unit may then determine the presence or absence of a face image as the object for the image in the set detection window when the edge amount of the first region is larger than the edge amount of the second region. With this configuration, the presence/absence determination of a face image is performed on the image in the detection window only when the comparison of the edge amounts of the first and second regions suggests that a face-like image exists in the detection window. This prevents the presence/absence determination of a face image, which generally involves a large amount of processing, from being performed needlessly.

The first region may include a region estimated in advance to correspond to the eyes of a face image when the detection window includes the face image, and a region estimated in advance to correspond to the mouth of the face image when the detection window includes the face image. The eyes and mouth in a face image have large edge amounts. With this configuration, therefore, whether the presence/absence determination of a face image should be executed can be judged appropriately on the basis of the comparison of the edge amounts of the first and second regions.

When the edge amount of the first region is at or below a certain value, the image in the detection window as a whole is considered to have small luminance differences, and in that case the relation (edge amount of the first region) > (edge amount of the second region) may not hold even if the detection window actually includes a face image. Therefore, when the edge amount of the first region is equal to or smaller than a predetermined threshold value, the object determination unit may determine the presence or absence of a face image for the image in the set detection window regardless of the edge amount of the second region. This configuration prevents objects from being missed because the judgment based on the edge amount comparison does not function effectively.
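The two conditions described so far (the region comparison and the low-edge exception) can be combined into a single gate. The following Python sketch is illustrative only: the function name `needs_face_check` and the parameter `low_edge_threshold` are hypothetical names that do not appear in the publication, and the edge amounts are assumed to have been computed already.

```python
def needs_face_check(edge_a1: float, edge_a2: float,
                     low_edge_threshold: float) -> bool:
    """Return True if the costly face presence determination should run.

    edge_a1: edge amount of the first region (eyes and mouth)
    edge_a2: edge amount of the second region (skin, e.g. cheeks)
    low_edge_threshold: hypothetical name for the predetermined threshold
    """
    # Low-contrast window: the comparison is unreliable, so the window
    # is never skipped on the basis of the comparison alone.
    if edge_a1 <= low_edge_threshold:
        return True
    # Normal case: a face-like window shows more edges in the organ
    # regions than in the skin regions.
    return edge_a1 > edge_a2
```

A window is therefore skipped only when the first region has a meaningful edge amount that nevertheless does not exceed that of the second region.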

The edge acquisition unit may rotate the detection window a plurality of times by a predetermined angle at the position on the input image where the detection window is set, while maintaining the position and size of each region relative to the detection window, and acquire the edge amount of each region in each rotated state. The object determination unit may perform the comparison for each rotated state of the detection window and decide, on the basis of the results of these comparisons, whether to execute the presence/absence determination of the object for the image in the set detection window. With this configuration, the possible existence of object-like images, which can appear at various angles in the input image, is estimated, and the presence/absence determination of the object is executed in detection windows in which an object-like image is estimated to exist.

The edge acquisition unit may repeatedly set the detection window on the input image while changing at least one of the position and the size of the detection window in the input image, maintaining the position and size of each region relative to the detection window, and acquire the edge amount of each region every time a detection window is set. The object determination unit may perform the comparison for each set detection window. With this configuration, the possible existence of object-like images, which can appear at various positions and in various sizes in the input image, is estimated, and the presence/absence determination of the object is executed on the basis of the position and size of each detection window in which an object-like image is estimated to exist.

In addition to the object detection device described above, the technical idea of the present invention can also be understood as an object detection method comprising processing steps performed by the respective units of the object detection device, or as an object detection program that causes a computer to execute functions corresponding to those units. It can further be understood as a printing apparatus that detects a predetermined object from an input image and executes printing based on the input image, comprising: an edge acquisition unit that sets a detection window on the input image and acquires an edge amount for each of a plurality of regions in the set detection window; an object determination unit that compares the acquired edge amounts between predetermined regions and, when the result of the comparison satisfies a predetermined condition, determines the presence or absence of the object for the image in the set detection window; and a print control unit that corrects at least part of the input image according to correction information determined on the basis of the image in a detection window in which the object determination unit has determined that the object is present, and performs printing based on the corrected input image.

Embodiments of the present invention will be described in the following order.
1. General configuration of the printer:
2. Processing by the printer:
2-1. Determining whether object presence/absence determination is necessary:
2-2. From object presence/absence determination to printing:
3. Modifications:

1. General configuration of the printer:
FIG. 1 schematically shows the configuration of a printer 10, which is an example of the object detection device and the printing apparatus of the present invention. The printer 10 is a color inkjet printer that supports so-called direct printing, in which an image is printed on the basis of image data acquired from a recording medium (for example, a memory card MC). The printer 10 includes a CPU 11 that controls each unit of the printer 10, an internal memory 12 composed of, for example, ROM and RAM, an operation unit 14 composed of buttons and a touch panel, a display unit 15 composed of a liquid crystal display, a printer engine 16, a card interface (card I/F) 17, and an I/F unit 13 for exchanging information with external devices such as a PC, a server, or a digital still camera. The components of the printer 10 are connected to one another via a bus.

The printer engine 16 is a printing mechanism that performs printing on the basis of print data. The card I/F 17 is an interface for exchanging data with the memory card MC inserted into a card slot 172. Image data is stored on the memory card MC, and the printer 10 can acquire the image data stored on the memory card MC via the card I/F 17. Various media other than the memory card MC can be used as the recording medium that provides image data. The printer 10 can of course also receive image data from the external devices connected via the I/F unit 13, rather than from a recording medium. The printer 10 may be a consumer printing apparatus or a commercial printing apparatus for DPE (a so-called minilab machine). The operation unit 14 and the display unit 15 may be an input device (such as a mouse or keyboard) and a display that are separate from the main body of the printer 10. The printer 10 can also receive print data from a PC, server, or the like connected via the I/F unit 13.

The internal memory 12 stores an object detection unit 20, an image correction unit 30, a display processing unit 40, and a print processing unit 50. The object detection unit 20 and the image correction unit 30 are computer programs that run under a predetermined operating system and execute the object detection processing, image correction processing, and other processing described later. The display processing unit 40 is a display driver that controls the display unit 15 so that processing menus and messages are displayed on it. The print processing unit 50 is a computer program that generates print data from image data and controls the printer engine 16 to print an image based on the print data. The CPU 11 realizes the functions of these units by reading these programs from the internal memory 12 and executing them.

The object detection unit 20 includes, as program modules, a detection window setting unit 21, an edge amount calculation unit 22, a necessity determination unit 23, and a detection execution unit 24. The image correction unit 30 includes, as program modules, a correction information determination unit 31 and a correction execution unit 32. The detection window setting unit 21 and the edge amount calculation unit 22 correspond to the edge acquisition unit recited in the claims. The necessity determination unit 23 and the detection execution unit 24 correspond to the object determination unit recited in the claims. The image correction unit 30 and the print processing unit 50 correspond to the print control unit recited in the claims. The functions of these units will be described later. The internal memory 12 also stores various data and programs such as an edge amount calculation region definition filter 14b, edge detection filters 14c and 14d, and a neural network NN. The printer 10 may be a so-called multifunction machine that has various functions besides printing, such as copying and scanning.

2. Processing by the printer:
2-1. Determining whether object presence/absence determination is necessary:
FIG. 2 is a flowchart showing the processing executed by the printer 10 in this embodiment. In step S100 (hereinafter the word "step" is omitted), the object detection unit 20 acquires image data D representing the image to be processed (the input image) from a predetermined recording medium such as the memory card MC. That is, the object detection unit 20 acquires an input image. If the printer 10 has a hard disk drive (HDD), the object detection unit 20 can of course acquire image data D stored on the HDD, and, as described above, it can also acquire image data D from the external devices connected via the I/F unit 13. Specifically, when the user operates the operation unit 14 while referring to a user interface (UI) screen displayed on the display unit 15, selects image data D as the input image, and instructs printing of the selected image data D, the object detection unit 20 acquires the selected image data D from the recording medium or other source.

The image data D is bitmap data composed of a plurality of pixels, and each pixel is expressed by a combination of gradation values of the R, G, and B channels (for example, 256 gradations from 0 to 255). The image data D may be compressed when it is recorded on the recording medium or the like, and the color of each pixel may be expressed in another color space. In these cases, the object detection unit 20 decompresses the image data D or converts its color space to obtain the image data D as RGB bitmap data.

In S200, the object detection unit 20 reduces the image data D. Performing the object detection processing described later on the image data D at its original image size imposes a heavy processing load. The object detection unit 20 therefore reduces the image size of the image data D, for example by reducing the number of pixels, and acquires the reduced image data. For example, the object detection unit 20 acquires image data DR obtained by reducing the image data D to QVGA (Quarter Video Graphics Array) size (320 × 240 pixels). In this embodiment, the image data DR is also referred to as the input image where appropriate.
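As a rough illustration of S200, the sketch below reduces an image by nearest-neighbor sampling. The publication only states that the number of pixels is reduced, so the sampling method, the function name `shrink_nearest`, and the list-of-rows image representation are all assumptions made for this example.

```python
def shrink_nearest(pixels, out_w, out_h):
    """Reduce an image (a list of rows) to out_w x out_h pixels.

    Nearest-neighbor sampling is an assumption; the source does not
    specify how the pixel count is reduced.
    """
    in_h = len(pixels)
    in_w = len(pixels[0])
    return [
        [pixels[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Tiny example: a 4 x 4 image reduced to 2 x 2.
src = [[row * 4 + col for col in range(4)] for row in range(4)]
small = shrink_nearest(src, 2, 2)
```

For the QVGA case mentioned above, the call would be `shrink_nearest(image, 320, 240)`.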

In S300, the object detection unit 20 converts the image data DR into a gray image. That is, the object detection unit 20 converts the RGB data of each pixel of the image data DR into a luminance value Y (0 to 255) and generates image data DR as a monochrome image having one luminance value Y per pixel. The luminance value Y can generally be obtained by adding R, G, and B with predetermined weights.
In this embodiment, S200 is not essential. When S200 is not executed, the object detection unit 20 executes S300, and then S400 and S500 described later, on the image data D. S300 (conversion of the image data DR or the image data D into a gray image) is performed in advance for the convenience of the object detection processing described later, but performing it beforehand is not essential either, and it may be skipped.
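The luminance conversion of S300 can be sketched as a weighted sum per pixel. The publication says only that R, G, and B are added with predetermined weights; the ITU-R BT.601 weights (0.299, 0.587, 0.114) used below are a common choice and are an assumption here, as is the function name `to_luminance`.

```python
# Assumed weights (ITU-R BT.601); the source does not state the values.
W_R, W_G, W_B = 0.299, 0.587, 0.114


def to_luminance(rgb_pixels):
    """Convert rows of (R, G, B) tuples (0-255) into rows of luminance Y (0-255)."""
    return [
        [min(255, round(W_R * r + W_G * g + W_B * b)) for (r, g, b) in row]
        for row in rgb_pixels
    ]


gray = to_luminance([[(255, 255, 255), (0, 0, 0)],
                     [(255, 0, 0), (0, 255, 0)]])
```

Each output pixel carries a single value Y, which is the form the edge filters described later operate on.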

In S400, the object detection unit 20 executes the object detection processing. In outline, the object detection unit 20 sets a detection window SW in the image data DR (or the image data D), acquires the edge amount in each of a plurality of regions in the detection window SW, and, when the result of comparing the edge amounts between the regions satisfies a predetermined condition, determines the presence or absence of an object for the image in the detection window SW. This processing is repeated for each detection window SW. In this embodiment, the object is described as being a human face image by way of example. However, the objects that can be detected using the configuration of the present invention are not limited to human face images, and various targets such as artifacts, living things, natural objects, and landscapes can be detected as objects.

FIG. 3 is a flowchart showing the details of S400.
In S410, the detection window setting unit 21 of the object detection unit 20 sets one detection window SW in the image data DR. The method of setting the detection window SW is not particularly limited, but as one example the detection window setting unit 21 sets it as follows.
FIG. 4 shows how the detection window SW is set in the image data DR. In the first execution of S410, the detection window setting unit 21 sets a rectangular detection window SW of a predetermined size containing a plurality of pixels (shown by a two-dot chain line) at the head position in the image (for example, the upper left corner of the image). In each subsequent execution of S410, the detection window setting unit 21 moves the detection window SW from its previous position by a predetermined distance (a predetermined number of pixels) in the horizontal and/or vertical direction of the image and sets one new detection window SW at the destination. When the detection window setting unit 21 has repeatedly set the detection window SW while moving it, at a constant size, up to the final position of the image data DR (for example, the lower right corner of the image), it returns to the head position and sets the detection window SW there.

When the detection window SW is returned to the head position, the detection window setting unit 21 sets a detection window SW whose rectangle is smaller than before. The detection window setting unit 21 then sets the detection window SW at each position in the same manner as above, moving it at a constant size up to the final position of the image data DR. The detection window setting unit 21 repeats this movement and setting of the detection window SW while reducing its size stepwise a predetermined number of times. Each time one detection window SW is set in S410, the processing from S420 onward is performed.
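The scan order described above (raster movement at a fixed size, then a return to the head position with a smaller window) can be sketched as a generator. This is a minimal sketch under stated assumptions: the window is taken to be square, and the names `iter_windows`, `step`, `shrink`, and `n_scales` are hypothetical parameters standing in for the "predetermined" distance, reduction ratio, and number of reductions.

```python
def iter_windows(img_w, img_h, start_size, step, shrink, n_scales):
    """Yield (left, top, size) for every detection window position.

    Assumptions: square window, fixed stride `step` in both directions,
    and a constant reduction ratio `shrink` applied at each scale.
    """
    size = start_size
    for _ in range(n_scales):
        if size < 1:
            break
        # Raster scan from the upper-left corner to the lower-right corner.
        for top in range(0, img_h - size + 1, step):
            for left in range(0, img_w - size + 1, step):
                yield left, top, size
        # Return to the head position with a smaller window.
        size = int(size * shrink)


windows = list(iter_windows(4, 4, start_size=4, step=2, shrink=0.5, n_scales=2))
```

In this toy 4 × 4 example, one window of size 4 is produced first, followed by four windows of size 2. The processing from S420 onward would run once per yielded window.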

In S420, the edge amount calculation unit 22 calculates the edge amount in each region (the first region and the second region) in the detection window SW set in the most recent S410. First, the edge amount calculation unit 22 reads the edge amount calculation region definition filter 14b from the internal memory 12 and applies it to the image data in the set detection window SW. The edge amount calculation region definition filter 14b is a rectangular filter geometrically similar to the detection window SW, and it defines the first region and the second region for which edge amounts are calculated.

FIG. 5 shows an example of the edge amount calculation region definition filter 14b. As shown in FIG. 5, the edge amount calculation region definition filter 14b defines a first region A1 and a second region A2 within the filter. The first region A1 is a region whose position and size are set in advance so that, when the edge amount calculation region definition filter 14b is applied to the detection window SW and the detection window SW contains a face image, the region is estimated to contain a predetermined organ of the face image. In this embodiment, the first region A1 consists of an eye region, which is estimated to contain the left and right eyes (or the left and right eyes and eyebrows) of the face image when the detection window SW to which the filter 14b is applied contains a face image, and a mouth region, which is estimated to contain the mouth of the face image under the same condition. The second region A2, on the other hand, is a region whose position and size are set in advance so that, when the filter 14b is applied to the detection window SW and the detection window SW contains a face image, the region is estimated to contain a predetermined skin portion of the face image other than the above organs. In this embodiment, the second region A2 consists of left and right cheek regions, which are estimated to correspond to the left and right cheeks of the face image when the detection window SW to which the filter 14b is applied contains a face image.

The way the first region A1 and the second region A2 are defined is not limited to that shown in FIG. 5. The eye region in the first region A1 may be divided into two separate regions, and the first region A1 may consist of only one of the eye region and the mouth region. The second region A2 need not be the left and right cheek regions and may instead be, for example, a region estimated to correspond to the forehead of the face image. The edge amount calculation region definition filter 14b defines these regions so that the total area of the first region A1 is equal to the total area of the second region A2. In addition, the position and size of each of the first region A1 and the second region A2 relative to the rectangle of the edge amount calculation region definition filter 14b are fixed, so when the filter 14b is reduced or enlarged, the first region A1 and the second region A2 are reduced or enlarged in the same proportion.

The edge amount calculation unit 22 enlarges or reduces the edge amount calculation region definition filter 14b as necessary so that it matches the size of the detection window SW set in the most recent S410, and then superimposes the filter 14b on that detection window SW. As a result, the first region A1 and the second region A2 are set in the detection window SW. The position and size of each of the first region A1 and the second region A2 relative to the detection window SW therefore remain constant even when the detection window SW is moved or reduced as described above. Next, the edge amount calculation unit 22 reads the edge detection filter 14c from the internal memory 12, applies it to each pixel of the image data DR belonging to the first region A1, and detects the edge amount of each pixel in the first region A1.
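One way to keep the regions fixed relative to the window is to store them as fractions of the window size and scale them when a window is set. The sketch below follows that idea. The rectangle coordinates in `REGIONS` are hypothetical values chosen for illustration, not the layout of FIG. 5; they are chosen only so that the total areas of A1 and A2 are equal, as the description requires. The function name `place_regions` is likewise an assumption.

```python
# Hypothetical layout as (x, y, width, height) fractions of a square window.
# These values are NOT taken from FIG. 5.
REGIONS = {
    "A1": [(0.15, 0.25, 0.70, 0.15),   # eye region
           (0.35, 0.70, 0.30, 0.20)],  # mouth region
    "A2": [(0.05, 0.42, 0.25, 0.33),   # left cheek region
           (0.70, 0.42, 0.25, 0.33)],  # right cheek region
}


def place_regions(left, top, size):
    """Scale the region definitions to a window and return pixel rectangles."""
    placed = {}
    for name, rects in REGIONS.items():
        placed[name] = [
            (left + round(x * size), top + round(y * size),
             round(w * size), round(h * size))
            for (x, y, w, h) in rects
        ]
    return placed


def total_area(rects):
    """Sum of the pixel areas of a list of (left, top, width, height) rectangles."""
    return sum(w * h for (_, _, w, h) in rects)


placed = place_regions(left=10, top=20, size=100)
```

Because only fractions are stored, moving or shrinking the window changes the pixel rectangles but not their placement relative to the window. Rounding to whole pixels can make the two total areas differ slightly at some window sizes.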

FIG. 6 shows the edge detection filter 14c applied to a partial region of the image data DR included in the first region A1. The edge detection filter 14c is, for example, a 3 × 3 matrix filter. The edge amount calculation unit 22 aligns the center value of the edge detection filter 14c with the pixel of interest, multiplies each value of the filter 14c by the corresponding pixel value (luminance value Y) of the image data DR, and sums the products to obtain the edge amount of the pixel of interest. By taking every pixel in the first region A1 in turn as the pixel of interest and applying the edge detection filter 14c, the edge amount of each pixel belonging to the first region A1 is detected. FIG. 6 shows edge amounts only for the central nine pixels of the partial region of the image data DR. The edge amount calculation unit 22 calculates the sum of the magnitudes (absolute values) of the edge amounts of the pixels in the first region A1 and acquires this sum as the edge amount of the first region A1.
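The calculation described above may be sketched in Python as follows. The coefficients of the edge detection filter 14c appear only in FIG. 6, so the kernel below is an assumed stand-in that responds to luminance change in the vertical direction; the function names are likewise hypothetical.

```python
# Assumed 3x3 kernel (NOT the coefficients of FIG. 6): responds to luminance
# change between the row above and the row below the pixel of interest.
VERTICAL_EDGE_KERNEL = [
    [-1, -1, -1],
    [0, 0, 0],
    [1, 1, 1],
]


def pixel_edge_amount(lum, x, y, kernel=VERTICAL_EDGE_KERNEL):
    """Sum of (kernel value * luminance) over the 3x3 neighbourhood of (x, y)."""
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            total += kernel[dy + 1][dx + 1] * lum[y + dy][x + dx]
    return total


def region_edge_amount(lum, region, kernel=VERTICAL_EDGE_KERNEL):
    """Sum of absolute per-pixel edge amounts over a rectangular region.

    region = (left, top, width, height). The region is assumed to lie at least
    one pixel inside the image so that every neighbourhood is fully defined.
    """
    left, top, width, height = region
    return sum(
        abs(pixel_edge_amount(lum, x, y, kernel))
        for y in range(top, top + height)
        for x in range(left, left + width)
    )
```

With this assumed kernel, a region containing a horizontal boundary between dark and bright rows yields a large edge amount, whereas a region of uniform luminance yields zero.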

The edge amount calculation unit 22 likewise applies the edge detection filter 14c to each pixel of the image data DR belonging to the second region A2 to detect the edge amount of each pixel in the second region A2, and acquires the sum of the magnitudes (absolute values) of those edge amounts as the edge amount of the second region A2. As is apparent from FIG. 6, the edge detection filter 14c is a filter capable of detecting an edge amount corresponding to luminance change in the vertical direction of the image. In a human face image, the eye, eyebrow, and mouth regions are formed mainly by lines running in the horizontal (left-right) direction of the face and are therefore rich in luminance change in the vertical direction of the face, whereas skin portions such as the cheeks show little luminance change by comparison. Accordingly, if a face image exists in the detection window SW set on the image data DR, a large edge amount is expected to be detected from the first region A1 of that detection window SW, while only a small edge amount is expected to be detected from the second region A2.

Accordingly, in S430 (FIG. 3), the necessity determination unit 23 compares the edge amount of the first region A1 and the edge amount of the second region A2 acquired in the most recent S420, and proceeds to S440 when (edge amount of first region A1) > (edge amount of second region A2) holds. That is, if the edge amount of the first region A1 is greater than that of the second region A2, the image in the detection window SW is highly face-like and the presence or absence of a face image needs to be determined, so the necessity determination unit 23 proceeds to S440. On the other hand, when (edge amount of first region A1) ≦ (edge amount of second region A2), the necessity determination unit 23 skips S440 and S450 and proceeds to S460. That is, when the condition that the edge amount of the first region A1 is greater than that of the second region A2 is not satisfied, the image in the detection window SW is not face-like and there is no need to determine the presence or absence of a face image, so the necessity determination unit 23 proceeds to S460. FIG. 6 also illustrates an edge detection filter 14d; its use is described later.
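The branch at S430 and its effect on the loop of FIG. 3 may be sketched in Python as follows. The callables `edge_amounts` and `classify` are hypothetical placeholders for S420 and S440, and the sketch is one possible arrangement rather than the implementation of the embodiment.

```python
def needs_face_determination(edge_a1, edge_a2):
    """S430: the full face determination is needed only when A1 > A2."""
    return edge_a1 > edge_a2


def scan(windows, edge_amounts, classify):
    """Run the costly classifier only on windows that pass the pre-screen.

    windows:      iterable of detection windows (S410).
    edge_amounts: callable returning (edge_a1, edge_a2) for a window (S420).
    classify:     callable returning True when a face is judged present (S440).
    Returns the list of windows recorded as containing a face (S450).
    """
    detected = []
    for window in windows:
        edge_a1, edge_a2 = edge_amounts(window)
        if not needs_face_determination(edge_a1, edge_a2):
            continue  # skip S440 and S450, go on to the next window (S460)
        if classify(window):
            detected.append(window)
    return detected
```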

2-2. From object presence determination to printing:
In S440, the detection execution unit 24 determines the presence or absence of a face image (face determination) for the image in the detection window SW set in the most recent S410. When it determines that a face image exists, the process proceeds to S450; when it determines that no face image exists, S450 is skipped and the process proceeds to S460. In S440, the detection execution unit 24 may employ any technique capable of determining whether a face image exists. In this embodiment, as one example, the determination is made using a neural network NN.

FIG. 7 is a flowchart showing the details of S440 executed by the detection execution unit 24. In S441, the detection execution unit 24 acquires image data (window image data) XD consisting of the pixels in the detection window SW set in the most recent S410, and in S442 it calculates a plurality of feature amounts from the window image data XD. These feature amounts are obtained by applying various filters to the window image data XD and calculating, within each filter, values that represent image characteristics such as average luminance and contrast (for example, the average, maximum, minimum, and standard deviation).

FIG. 8 shows how the feature amounts are calculated from the window image data XD. As shown in the figure, a large number of filters FT that differ in size and position relative to the window image data XD are prepared. Each filter FT is applied in turn to the window image data XD, and a plurality of feature amounts CA, CA, CA… are calculated on the basis of the image characteristics within each filter FT. In FIG. 8, each rectangle in the window image data XD is called a filter FT. Once the feature amounts CA, CA, CA… have been calculated, the detection execution unit 24, in S443, inputs them to a neural network NN prepared in advance and obtains as its output a determination result indicating whether or not a face image exists.
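The feature calculation of S442 may be sketched in Python as follows. The sketch assumes that each filter FT is an axis-aligned rectangle and that four statistics are taken per filter; the actual filters and statistics of the embodiment are those of FIG. 8, and the function names are hypothetical.

```python
import statistics


def rect_features(window, rect):
    """Return (mean, max, min, standard deviation) of luminance inside rect.

    window: 2-D list of luminance values (the window image data XD).
    rect:   (left, top, width, height) of one filter FT in window pixels.
    """
    left, top, width, height = rect
    values = [
        window[y][x]
        for y in range(top, top + height)
        for x in range(left, left + width)
    ]
    return (
        statistics.fmean(values),
        max(values),
        min(values),
        statistics.pstdev(values),  # population standard deviation
    )


def feature_vector(window, rects):
    """Concatenate the statistics of every filter FT into one feature list."""
    features = []
    for rect in rects:
        features.extend(rect_features(window, rect))
    return features
```

The resulting list plays the role of the feature amounts CA, CA, CA… that are supplied to the neural network NN in S443.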

FIG. 9 shows an example of the structure of the neural network NN. The neural network NN has a basic structure in which the value of each unit U in a subsequent layer is determined by a linear combination of the values of the units U in the preceding layer (the subscript i being the identification number of a unit U in the preceding layer). The value obtained by the linear combination may be used directly as the value of the unit U in the next layer; alternatively, nonlinear characteristics may be introduced by transforming that value with a nonlinear function, such as the hyperbolic tangent function, to determine the value of the unit U in the next layer. The neural network NN consists of an input layer and an output layer at the two ends and intermediate layers sandwiched between them. The feature amounts CA, CA, CA… can be input to the input layer of the neural network NN, and the output layer can output an output value K (a value normalized to the range 0 to 1). In S444, if, for example, the output value K of the neural network NN is 0.5 or more, the detection execution unit 24 determines that the value indicates that a face image exists in the window image data XD, and proceeds to S450. If the output value K is less than 0.5, the detection execution unit 24 determines that the value indicates that no face image exists in the window image data XD, and proceeds to S460.
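The forward calculation of S443 and the thresholding of S444 may be sketched in Python as follows. The sketch assumes a single intermediate layer with hyperbolic tangent units and a logistic (sigmoid) output unit to keep K within 0 to 1; the embodiment does not state how K is normalized, so the sigmoid and all function names are assumptions.

```python
import math


def layer(inputs, weights, biases, activation):
    """Each unit is activation(sum_i weights[j][i] * inputs[i] + biases[j])."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def sigmoid(z):
    """Logistic function; maps any real value into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def network_output(features, hidden_w, hidden_b, out_w, out_b):
    """Output value K for one feature vector (one intermediate layer assumed)."""
    hidden = layer(features, hidden_w, hidden_b, math.tanh)
    return layer(hidden, [out_w], [out_b], sigmoid)[0]


def face_present(k, threshold=0.5):
    """S444: a face is judged present when K is at or above the threshold."""
    return k >= threshold
```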

FIG. 10 schematically shows how the neural network NN is constructed by learning. In this embodiment, the neural network NN is trained by the error back propagation method, whereby the number of units U, the magnitudes of the weights w used in the linear combinations between units U, and the values of the biases b are optimized. In learning by the error back propagation method, the weights w and the biases b used in the linear combinations between units U are first initialized to appropriate values. Then, for learning image data for which it is known whether a face image is present, the feature amounts CA, CA, CA… are calculated by the same procedure as in S442 and S443, the feature amounts are input to the initialized neural network NN, and its output value K is obtained. In this embodiment, it is desirable that 1 be output as the output value K for learning image data in which a face image is present and that 0 be output for learning image data in which no face image is present. However, because the weights w and the biases b have merely been initialized to appropriate values, an error arises between the actual output value K and the ideal value. The weights w and biases b of each unit U that minimize this error are calculated using a numerical optimization technique such as a gradient method. The error is propagated from later layers to earlier layers, and the weights w and biases b are optimized in order, beginning with the units U in the later layers.
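For illustration only, learning by error back propagation may be sketched in Python as follows. The sketch assumes one intermediate layer of hyperbolic tangent units, a logistic output unit, a squared-error measure, and plain gradient descent with a fixed number of units; the learning rate, epoch count, and function names are hypothetical, and the optimization of the number of units U mentioned above is not reproduced.

```python
import math
import random


def forward(x, w1, b1, w2, b2):
    """Return (intermediate-layer values, output K) for one feature vector x."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    k = 1.0 / (1.0 + math.exp(-(sum(w * hi for w, hi in zip(w2, h)) + b2)))
    return h, k


def mean_squared_error(samples, params):
    """Average of (K - target)^2 over the learning samples."""
    return sum((forward(x, *params)[1] - t) ** 2 for x, t in samples) / len(samples)


def train(samples, hidden=2, rate=0.5, epochs=2000, seed=0):
    """samples: list of (feature list, target), target 1 for face and 0 otherwise."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    # Initialize weights to small random values and biases to zero.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, target in samples:
            h, k = forward(x, w1, b1, w2, b2)
            # Error term at the output unit (derivative of 0.5 * (K - target)^2).
            d_out = (k - target) * k * (1.0 - k)
            # Propagate the error back to the intermediate layer.
            d_hid = [d_out * w2[j] * (1.0 - h[j] ** 2) for j in range(hidden)]
            for j in range(hidden):
                w2[j] -= rate * d_out * h[j]
                b1[j] -= rate * d_hid[j]
                for i in range(n):
                    w1[j][i] -= rate * d_hid[j] * x[i]
            b2 -= rate * d_out
    return w1, b1, w2, b2
```

Calling `train` with `epochs=0` returns the initialized parameters, which allows the error before and after learning to be compared with `mean_squared_error`.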

By preparing in advance in the internal memory 12 a neural network NN that has been optimized by performing such learning with a plurality of the learning image data, it becomes possible to determine, on the basis of the feature amounts CA, CA, CA…, whether a face image exists in the window image data XD.
In S450 (FIG. 3), the detection execution unit 24 records, in a predetermined area of the internal memory 12, the position of the detection window SW in which a face image was determined to exist in the most recent S440 (for example, the center position of the detection window SW on the image data DR) and the size of the rectangle of that detection window SW. Recording the position and size of the detection window SW in this way corresponds to one example of detecting a face image.

In S460, if, following the approach to setting the detection window SW described with reference to FIG. 4, there is still room to set a detection window SW by moving it or further reducing its size, the detection window setting unit 21 returns to S410 and sets one new detection window SW on the image data DR. On the other hand, when the detection window SW has been reduced the predetermined number of times and every possible detection window SW has been set, the object detection unit 20 ends the processing of S400.

In S500 (FIG. 2), the correction information determination unit 31 of the image correction unit 30 determines correction information to be used for correcting the input image. Correction of the input image includes, for example, brightness correction, contrast correction, saturation correction, and correction of specific memory colors. In this embodiment, when a face image has been detected in S400, the correction information determination unit 31 determines the correction information on the basis of at least that face image. Specifically, when information on the position and size of a detection window SW detected as a face image is recorded in the internal memory 12, the correction information determination unit 31 extracts from the image data DR the image data in the range indicated by that position and size information (referred to as face image data). The image data DR from which the face image data is extracted may be the image data DR after conversion to a gray image, or the image data DR immediately after S200, before conversion to a gray image. The correction information determination unit 31 calculates correction information (correction parameters) on the basis of the face image data. For example, the correction information determination unit 31 calculates the average luminance of the face image data, calculates the difference between that average and a predetermined target value, and uses the calculated difference as the correction information.

In S600, the correction execution unit 32 corrects at least part of the image data D acquired in S100 on the basis of the correction information determined in S500. For example, if the correction information is the difference between the average luminance of the face image data and the target value as described above, a luminance corresponding to that difference is added to each pixel of the region on the image data D that corresponds to the face image data (the region whose position and size relative to the image data D are the same as the position and size of the face image data relative to the image data DR). As a result, the brightness of the face image in the image data D can be improved. Alternatively, the degree of curvature of a tone curve may be determined on the basis of the magnitude of the difference, and each pixel value of the image data D may be corrected using that tone curve. Needless to say, the types of correction information determined in S500 and the types of correction performed in S600 are not limited to those described above.
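The brightness correction given as an example above may be sketched in Python as follows. The sketch assumes that the correction information is (target value − average luminance), that the face region has the same pixel coordinates in the image being corrected, and that results are clipped to 0 to 255; these conventions and the function names are assumptions made for illustration.

```python
def brightness_offset(face_lum, target):
    """S500 example: difference between the target and the mean face luminance."""
    values = [v for row in face_lum for v in row]
    return target - sum(values) / len(values)


def apply_offset(image_lum, region, offset, max_value=255):
    """S600 example: add the offset to every pixel inside the face region.

    image_lum: 2-D list of luminance values.
    region:    (left, top, width, height) of the face region in image pixels.
    Returns a corrected copy; values are clipped to 0..max_value (assumed).
    """
    left, top, width, height = region
    corrected = [row[:] for row in image_lum]
    for y in range(top, top + height):
        for x in range(left, left + width):
            value = round(corrected[y][x] + offset)
            corrected[y][x] = min(max_value, max(0, value))
    return corrected
```

When the image data D and the image data DR differ in resolution, the region would first have to be scaled so that its relative position and size are preserved, which this sketch omits.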

In S700, the print processing unit 50 controls the printer engine 16 to print the input image. That is, the print processing unit 50 applies the necessary processing, such as resolution conversion, color conversion, and halftoning, to the corrected image data D to generate print data. The generated print data is supplied from the print processing unit 50 to the printer engine 16, and the printer engine 16 executes printing based on the print data. This completes the printing of the input image.

As described above, according to this embodiment, when a detection window SW is set on the input image and the presence or absence of an object (face image) is determined for each detection window SW, a region estimated to correspond to the positions of facial organs such as the eyes and mouth, assuming a face image exists in the detection window SW, is set within the detection window SW as the first region A1, and a region estimated to correspond to a predetermined skin portion other than organs such as the eyes and mouth, again assuming a face image exists, is set within the detection window SW as the second region A2. The edge amount in the first region A1 is then compared with the edge amount in the second region A2. Face determination is performed on the image in the detection window SW on condition that the edge amount of the first region A1 is greater than that of the second region A2. When this condition is not satisfied, no face determination is performed for that detection window SW, and a new detection window SW is set at another position on the input image. In other words, locations on the input image for which there is no need to determine the presence or absence of an object are excluded from the determination. Consequently, the total amount of processing involved in repeatedly setting the detection window SW and determining the presence or absence of an object in the input image can be greatly reduced without lowering the object detection accuracy, and as a result the object detection processing becomes much faster.

3. Modifications:
FIG. 11 is a flowchart showing the details of S400 (FIG. 2) executed by the object detection unit 20, and shows an example different from FIG. 3. The object detection unit 20 may perform the processing of the flowchart of FIG. 11 in place of the flowchart of FIG. 3. S810 and S850 to S880 in FIG. 11 are the same as S410 and S430 to S460 in FIG. 3, and their description is therefore omitted. In S820, the edge amount calculation unit 22 acquires the edge amount of the first region A1 in the detection window SW set in the most recent S810. In S830, the necessity determination unit 23 determines whether the edge amount of the first region A1 acquired in the most recent S820 is greater than a predetermined threshold Th. When the edge amount is greater than the threshold Th, the process proceeds to S840; when the edge amount is equal to or less than the threshold Th, the process proceeds to S860. In S840, the edge amount calculation unit 22 acquires the edge amount of the second region A2 in the detection window SW set in the most recent S810.

Thus, in the example of FIG. 11, before the edge amount of the first region A1 is compared with that of the second region A2, it is determined whether the edge amount of the first region A1 is greater than the threshold Th, and when the edge amount of the first region A1 is equal to or less than the threshold Th, face determination (S860) is performed regardless of the edge amount of the second region A2. This is because, when the edge amount of the first region A1 does not reach a certain level, the image is considered to have little luminance change over the entire detection window SW (or the entire input image), as in a backlit image. In an image with little luminance change overall, the edge amounts at the eyes, mouth, and so on are small even if a face image is present, which makes it difficult to judge from a comparison of the edge amounts of the first region A1 and the second region A2 whether the image in the detection window SW is face-like. By performing face determination when the edge amount of the first region A1 is equal to or less than the threshold Th, as described above, even an input image with little luminance change, such as a backlit image, is subjected to face determination, so that the face image can still be detected. The threshold Th used for comparison with the edge amount of the first region A1 is, for example, recorded in advance in the internal memory 12. The threshold Th is determined in advance on the basis of, for example, the edge amounts detected from the eye and mouth portions of a plurality of sample face images photographed under backlit conditions.
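The decision sequence of S830 to S850 may be sketched in Python as follows. The edge amount of the second region A2 is supplied as a callable so that it is computed only when S840 is reached; the function and parameter names are hypothetical.

```python
def needs_face_determination_fig11(edge_a1, get_edge_a2, th):
    """Return True when the face determination of S860 should be performed.

    edge_a1:     edge amount of the first region A1 (S820).
    get_edge_a2: callable returning the edge amount of the second region A2;
                 it is called only when S840 is reached.
    th:          the threshold Th.
    """
    if edge_a1 <= th:
        # S830 "No": low-contrast window (for example, backlit); go to S860
        # without looking at the second region A2.
        return True
    edge_a2 = get_edge_a2()   # S840
    return edge_a1 > edge_a2  # S850
```

A window whose first region A1 yields an edge amount at or below Th is thus always passed to the face determination, while other windows are screened by the comparison in the same way as in FIG. 3.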

In the above description, the detection window SW is set in S410 (FIG. 3) and S810 (FIG. 11) a plurality of times while the detection window SW is repeatedly moved and resized (reduced) on the image data DR.
In addition, in S420 and S820, the edge amount calculation unit 22 may, with the edge amount calculation region definition filter 14b applied to (superimposed on) the detection window SW set in the most recent S410 or S810, rotate that detection window SW on the image data DR a plurality of times by a predetermined angle each time, and acquire the edge amount of the first region A1 and the edge amount of the second region A2 for each rotated state (including the state with a rotation angle of 0 degrees). When the edge amount of the second region A2 is also acquired in S820, the processing of S840 becomes substantially unnecessary.

FIG. 12 shows the edge amount calculation unit 22 rotating one detection window SW set on the image data DR a plurality of times. FIG. 12 shows each state obtained when the detection window SW is rotated in 90-degree steps with its center position maintained. Because the edge amount calculation region definition filter 14b rotates together with the window, the position and size of the first region A1 and the second region A2 relative to the detection window SW are always maintained. The edge amount calculation unit 22 acquires the edge amount of the first region A1 and the edge amount of the second region A2 for each rotation angle of the detection window SW (0, 90, 180, and 270 degrees). When edge amounts are acquired from the first region A1 and the second region A2 of the detection window SW rotated by 90 degrees, the image data consisting of the pixels contained in each of the first region A1 and the second region A2 is first rotated by 90 degrees in the direction that cancels the 90-degree rotation, and the edge detection filter 14c is then applied to the image data of the first region A1 and of the second region A2 to detect the edge amount of each pixel. Similarly, when edge amounts are acquired from the regions of the detection window SW rotated by 180 degrees, the image data contained in each region is rotated so as to cancel the 180-degree rotation before the edge detection filter 14c is applied, and when edge amounts are acquired from the regions of the detection window SW rotated by 270 degrees, the image data contained in each region is rotated so as to cancel the 270-degree rotation before the edge detection filter 14c is applied, and the edge amount of each pixel is detected in each case.

Once the edge amount of the first region A1 and the edge amount of the second region A2 have been obtained for each rotation angle of the detection window SW as described above, the necessity determination unit 23, in S430 and S850, compares the edge amount of the first region A1 with that of the second region A2 for each rotation angle. If (edge amount of first region A1) > (edge amount of second region A2) holds in any of these per-angle comparisons, the process proceeds to S440 or S860. That is, if the edge amount of the first region A1 is greater than that of the second region A2 in any of the rotation states of the detection window SW (0, 90, 180, or 270 degrees), something resembling a face image can be said to be present in the detection window SW; in that case it is judged that face determination is needed, and the face determination of S440 or S860 is performed. The angle of each rotation step of the detection window SW is not limited to the 90 degrees described above; various angles such as 30, 45, and 60 degrees are conceivable.
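The per-angle comparison may be sketched in Python as follows for 90-degree steps. Rotating the whole window image while keeping the regions fixed is used here as a simplified equivalent of rotating the window and filter and then rotating the region data back; the edge measure is a simplified stand-in for the edge detection filter 14c, and all names are hypothetical.

```python
def rotate90(grid):
    """Rotate a 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def region_amount(window, region):
    """Sum of |luminance difference| between vertically adjacent pixels.

    A simplified stand-in for applying filter 14c and summing absolute values.
    region = (left, top, width, height) in window pixels.
    """
    left, top, width, height = region
    return sum(
        abs(window[y + 1][x] - window[y][x])
        for y in range(top, top + height - 1)
        for x in range(left, left + width)
    )


def passes_at_any_angle(window, region_a1, region_a2):
    """True if A1 > A2 holds at 0, 90, 180, or 270 degrees (square window assumed)."""
    current = window
    for _ in range(4):
        if region_amount(current, region_a1) > region_amount(current, region_a2):
            return True
        current = rotate90(current)
    return False
```

With this arrangement, a window that fails the comparison in its upright orientation may still be passed to the face determination when the comparison holds at one of the other angles.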

By adopting this configuration, in which the edge amount of the first region A1 is compared with that of the second region A2 for each rotation angle of the detection window SW to judge whether face determination is needed, the possible presence of a face image can be estimated and face determination can be triggered even when, as shown in FIG. 13, face images appear at various angles in the image data DR, for example with the vertical axis of the face aligned with the horizontal direction of the image data DR, or with the face upside down relative to the image data DR. In S830, the necessity determination unit 23 makes a "No" determination and proceeds to S860 when the edge amounts of the first region A1 obtained for the respective rotation angles are all equal to or less than the threshold Th; if any of the edge amounts of the first region A1 obtained for the respective rotation angles is greater than the threshold Th, it makes a "Yes" determination and proceeds to S850 (S840 being effectively skipped).

When face determination is performed according to the result of comparing the edge amounts for each rotation angle of the detection window SW, the detection execution unit 24 also rotates each of the filters FT used for calculating the feature amounts CA from the window image data XD, as appropriate, about the center position of the window image data XD. The detection execution unit 24 then determines, for each rotation state, the presence or absence of a face image using the neural network NN on the basis of the feature amounts CA obtained from the filters FT in that rotation state. As a result, face images that may exist at various angles in the detection window SW can be detected accurately.

FIG. 14 shows an example of the edge amount calculation region definition filter 14b that differs from the configuration shown in FIG. 5. The edge amount calculation region definition filter 14b of FIG. 14 defines a nose region AN in addition to the eye region, the mouth region, and the left and right cheek regions. The nose region AN is a region whose position and size are set in advance so that, when the edge amount calculation region definition filter 14b is applied to the detection window SW, it would be expected to contain the nose of a face image if the detection window SW contained one. The edge amount calculation unit 22 can read the edge amount calculation region definition filter 14b defining the nose region AN from the internal memory 12 in place of the edge amount calculation region definition filter 14b shown in FIG. 5 and apply it to the detection window SW on the image data DR. In this case, the edge amount calculation unit 22 applies the edge detection filter 14c to the image data of the eye region, the mouth region, and the left and right cheek regions as described above to acquire their edge amounts, and also acquires an edge amount from the image data in the nose region AN.

The edge amount calculation unit 22 detects an edge amount for each pixel belonging to the nose region AN by applying the edge detection filter 14d (see FIG. 6), and acquires the sum of the magnitudes (absolute values) of the edge amounts of the pixels in the nose region AN as the edge amount of the nose region AN. As is apparent from FIG. 6, the edge detection filter 14d is a filter capable of detecting an edge amount corresponding to luminance changes in the left-right direction of the image. Near the nose of a face image there are features such as the edges of the bridge of the nose, which extends in the vertical direction of the face, and the lines of the left and right nasal wings, so luminance changes in the left-right direction of the face are also abundant. Therefore, if a face image is present in the detection window SW set on the image data DR, a large edge amount is expected to be detected from the nose region AN of that detection window SW.
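As a minimal sketch of this step, the edge amount of the nose region can be computed by summing the absolute horizontal luminance differences over the region. The coefficients of the edge detection filter 14d are given only in FIG. 6, so the central difference used here is an assumed stand-in, and the function name and the `(top, left, height, width)` region format are hypothetical.

```python
def nose_edge_amount(image, region):
    """Sum of absolute left-right luminance differences inside a region.

    image: list of rows, each a list of luminance values.
    region: (top, left, height, width) in pixels (hypothetical format).
    """
    top, left, height, width = region
    total = 0
    for y in range(top, top + height):
        row = image[y]
        for x in range(left, left + width):
            # Assumed stand-in for filter 14d: horizontal central difference,
            # clamped at the image border.
            xl = max(x - 1, 0)
            xr = min(x + 1, len(row) - 1)
            total += abs(row[xr] - row[xl])
    return total
```

A region containing a vertical luminance step, such as the side of the bridge of the nose, yields a large value, while a flat region yields zero.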

When edge amounts have been acquired from the eye region, the mouth region, the left and right cheek regions, and the nose region AN in this way, the necessity determination unit 23 decides whether or not to perform face determination with the edge amount of the nose region AN also taken into account. For example, the necessity determination unit 23 decides to perform face determination (proceeding to S440 or S860) when both of the following hold: the edge amount of the first region A1 (eye region + mouth region) > the edge amount of the second region A2 (left and right cheek regions), and the edge amount of the nose region AN > a predetermined threshold value. The threshold value data used for comparison with the edge amount of the nose region AN is also assumed to be recorded in advance in the internal memory 12 or the like. Alternatively, the nose region AN may be treated as part of the first region A1. That is, the sum of the edge amounts acquired by the edge amount calculation unit 22 from the eye region, the mouth region, and the nose region is taken as the edge amount of the first region A1, and the necessity determination unit 23 determines whether the edge amount of the first region A1 > the edge amount of the second region A2 holds. Alternatively, if including the nose region AN means that the area of the first region A1 = the area of the second region A2 no longer holds, the necessity determination unit 23 may compare the average edge amount of the first region A1 including the nose region AN with the average edge amount of the second region A2, and decide to perform face determination when the average for the first region A1 is larger. By evaluating the face-likeness of the image in the detection window SW with the edge amount of the nose region AN considered in addition to the edge amounts of the eye region, the mouth region, and the skin regions other than the facial organs, only detection windows SW whose images have a higher face-likeness become targets of face determination.
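The three alternatives for the necessity decision can be sketched as follows. The function name, the dictionary keys, and the `mode` strings are hypothetical, and the "average" is assumed here to mean the edge amount divided by the pixel count of the region, which the text does not state explicitly.

```python
def needs_face_determination(edges, mode, nose_threshold=None, areas=None):
    """Decide whether the detection window should go on to face determination.

    edges: edge amounts keyed by "eyes", "mouth", "cheeks", "nose".
    areas: pixel counts with the same keys (used only when mode == "mean").
    """
    eyes_mouth = edges["eyes"] + edges["mouth"]
    cheeks = edges["cheeks"]
    nose = edges["nose"]
    if mode == "threshold":
        # A1 (eyes + mouth) > A2 (cheeks), and nose > predetermined threshold.
        return eyes_mouth > cheeks and nose > nose_threshold
    if mode == "merged":
        # The nose region is treated as part of the first region A1.
        return eyes_mouth + nose > cheeks
    if mode == "mean":
        # Areas differ, so compare per-pixel averages instead of sums.
        a1_area = areas["eyes"] + areas["mouth"] + areas["nose"]
        return (eyes_mouth + nose) / a1_area > cheeks / areas["cheeks"]
    raise ValueError("unknown mode: " + mode)
```

The "mean" variant matters when adding the nose region enlarges A1: a larger region collects more edge amount simply because it has more pixels, so comparing sums would bias the decision toward performing face determination.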

A face determination method that the detection execution unit 24 can execute in S440 and S860, other than the method using the neural network NN, is described below.
FIG. 15 schematically shows an example of a face determination method performed by the detection execution unit 24. The example shown in FIG. 15 uses determination means in which a plurality of determiners J, J, ... are connected in a multi-stage cascade. The determination means made up of the plurality of determiners J may be a physical device, or a program having the determination functions described below that correspond to the plurality of determiners J. Each determiner J receives one or more feature amounts CA, CA, CA, ... of a different type (for example, obtained with different filters FT) from the window image data XD subjected to face determination, and outputs a positive or negative determination. Each determiner J has a determination algorithm, such as a magnitude comparison between feature amounts CA or a threshold test, and makes its own determination of whether the window image data XD is face-like (positive) or not face-like (negative). Each determiner J in the next stage is connected to the positive output of the determiner J in the preceding stage, and executes its determination only when the output of the preceding-stage determiner J is positive. As soon as a negative output is produced at any stage, face determination ends and a determination that no face image exists is output ("No" in S440 and S860). If, on the other hand, the determiners J at all stages produce positive outputs, face determination ends and a determination that a face image exists is output ("Yes" in S440 and S860).
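The cascade behaviour described above, in which a later determiner runs only if every earlier one was positive, can be sketched as follows. The function name `run_cascade` and the representation of each determiner as a callable on a dictionary of feature amounts are assumptions made for illustration.

```python
def run_cascade(stages, window_features):
    """Evaluate determiners in order and stop at the first negative output.

    stages: list of callables, each taking the feature amounts of a window
            and returning True (face-like) or False (not face-like).
    Returns (face_present, number_of_stages_evaluated).
    """
    evaluated = 0
    for stage in stages:
        evaluated += 1
        if not stage(window_features):
            # A negative output at any stage ends face determination ("No").
            return False, evaluated
    # Every stage was positive: a face image is judged to exist ("Yes").
    return True, evaluated


# Hypothetical determiners: a threshold test and a magnitude comparison.
example_stages = [
    lambda f: f["ca1"] > 0.5,
    lambda f: f["ca2"] > f["ca3"],
    lambda f: f["ca4"] < 0.2,
]
```

Because most windows are rejected by an early stage, later stages run only for a small fraction of windows, which is where the processing-time saving of a cascade comes from.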

FIG. 16 shows the determination characteristics of the determination means described above. The figure shows a feature amount space defined by axes for the feature amounts CA, CA, CA, ... used by the determiners J, J, ..., and plots the coordinates in that space given by the combinations of feature amounts CA, CA, CA, ... obtained from window image data XD that is ultimately determined to contain a face image. Because window image data XD determined to contain a face image has certain characteristics, its distribution can be expected to fall within a certain region of the feature amount space. Each determiner J generates a boundary plane in this feature amount space and outputs a positive determination when the coordinates of the feature amounts CA, CA, CA, ... under test lie in the space, among those partitioned by the boundary plane, to which the distribution belongs. Connecting the determiners J, J, ... in a cascade therefore gradually narrows the space for which a positive output is produced. With a plurality of boundary planes, determination can be performed accurately even for a distribution with a complicated shape.
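One way to picture the boundary planes is to model each determiner as a half-space test in the feature amount space, so that the cascade accepts only points lying in the intersection of all half-spaces. The sketch below assumes planes of the form normal · point + offset ≥ 0; the function names and this parameterisation are illustrative and are not taken from the specification.

```python
def on_positive_side(normal, offset, point):
    """True if the point lies on the side of the plane where the face distribution is assumed to be."""
    return sum(n * p for n, p in zip(normal, point)) + offset >= 0


def in_face_region(planes, point):
    """True only if the point is on the positive side of every boundary plane.

    planes: list of (normal, offset) pairs, one per determiner.
    all() stops at the first failing plane, mirroring the cascade's early exit.
    """
    return all(on_positive_side(n, b, point) for n, b in planes)


# Four planes in a 2-D feature space enclosing the box 1 <= x <= 3, 1 <= y <= 3.
example_planes = [((1, 0), -1), ((-1, 0), 3), ((0, 1), -1), ((0, -1), 3)]
```

Each added plane can only shrink the accepted region, which corresponds to the gradual narrowing of the positive space described for FIG. 16.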

The above describes an example in which the object detection device and the object detection method of the present invention are embodied by the printer 10, but the object detection device and the object detection method may also be realized in, for example, a computer or in imaging equipment such as a digital still camera or a scanner. Furthermore, the present invention can be realized not only in devices that output image processing results on printing paper, such as printers, but also in devices that output image processing results on a display, such as photo viewers. The present invention can also be applied to ATMs (Automated Teller Machines) and similar machines that perform person authentication. In addition, the face determination executed by the detection execution unit 24 may use various discrimination methods in the feature amount space of the feature amounts described above. For example, a support vector machine may be used.

FIG. 1 is a block diagram showing the schematic configuration of the printer.
FIG. 2 is a flowchart showing the processing executed by the printer.
FIG. 3 is a flowchart showing the details of the object detection processing.
FIG. 4 is a diagram showing how the detection window is set.
FIG. 5 is a diagram showing an example of the edge amount calculation region definition filter.
FIG. 6 is a diagram showing how edge detection is performed.
FIG. 7 is a flowchart showing face determination using a neural network.
FIG. 8 is a diagram showing how feature amounts are calculated from the window image data.
FIG. 9 is a diagram showing an example of the structure of the neural network.
FIG. 10 is a diagram schematically showing how the neural network is trained.
FIG. 11 is a flowchart showing another example of the details of the object detection processing.
FIG. 12 is a diagram showing the detection window in rotated states.
FIG. 13 is a diagram showing an example of an input image.
FIG. 14 is a diagram showing another example of the edge amount calculation region definition filter.
FIG. 15 is a diagram schematically showing face determination processing according to a modification.
FIG. 16 is a diagram showing the determination characteristics of the face determination processing according to the modification.

Explanation of symbols

10 ... printer, 11 ... CPU, 12 ... internal memory, 14b ... edge amount calculation region definition filter, 14c, 14d ... edge detection filter, 16 ... printer engine, 17 ... card I/F, 20 ... object detection unit, 21 ... detection window setting unit, 22 ... edge amount calculation unit, 23 ... necessity determination unit, 24 ... detection execution unit, 30 ... image correction unit, 31 ... correction information determination unit, 32 ... correction execution unit, 50 ... print processing unit, 172 ... card slot

Claims

An object detection device for detecting a predetermined object from an input image, comprising:
an edge acquisition unit that sets a detection window on the input image and acquires an edge amount for each of a plurality of regions in the set detection window; and
an object determination unit that compares the acquired edge amounts of the regions between predetermined regions and, when the result of the comparison satisfies a predetermined condition, determines the presence or absence of the object for the image in the set detection window.

The object detection device according to claim 1, wherein the edge acquisition unit acquires an edge amount for each of a first region, which is set in the detection window as a region corresponding to a predetermined organ of a face image in the case where the detection window contains a face image, and a second region, which is set in the detection window as a region corresponding to a predetermined skin portion other than the organs of the face image in the case where the detection window contains a face image, and the object determination unit determines the presence or absence of a face image as the object for the image in the set detection window when the edge amount of the first region is larger than the edge amount of the second region.

The object detection device according to claim 2, wherein the first region includes a region estimated in advance to correspond to the eyes of the face image in the case where the detection window contains a face image, and a region estimated in advance to correspond to the mouth of the face image in the case where the detection window contains a face image.

The object detection device according to claim 2, wherein, when the edge amount of the first region is equal to or less than a predetermined threshold value, the object determination unit determines the presence or absence of a face image for the image in the set detection window regardless of the edge amount of the second region.

The object detection device according to claim 1, wherein the edge acquisition unit rotates the detection window a plurality of times by a predetermined angle at the position on the input image where the detection window is set, while maintaining the position and size of each region relative to the detection window, and acquires the edge amount of each region for each rotated state, and the object determination unit performs the comparison for each rotated state of the detection window and, based on the result of each comparison, decides whether or not to determine the presence or absence of the object for the image in the detection window.

The object detection device according to claim 1, wherein the edge acquisition unit repeatedly sets the detection window on the input image while changing at least one of the position and the size of the detection window in the input image and maintaining the position and size of each region relative to the detection window, and acquires the edge amount of each region each time the detection window is set, and the object determination unit performs the comparison for each set detection window.

An object detection method for detecting a predetermined object from an input image, comprising:
an edge acquisition step of setting a detection window on the input image and acquiring an edge amount for each of a plurality of regions in the set detection window; and
an object determination step of comparing the acquired edge amounts of the regions between predetermined regions and, when the result of the comparison satisfies a predetermined condition, determining the presence or absence of the object for the image in the set detection window.

An object detection program for causing a computer to execute processing for detecting a predetermined object from an input image, the program causing the computer to execute:
an edge acquisition function of setting a detection window on the input image and acquiring an edge amount for each of a plurality of regions in the set detection window; and
an object determination function of comparing the acquired edge amounts of the regions between predetermined regions and, when the result of the comparison satisfies a predetermined condition, determining the presence or absence of the object for the image in the set detection window.

A printing apparatus that detects a predetermined object from an input image and executes printing based on the input image, comprising:
an edge acquisition unit that sets a detection window on the input image and acquires an edge amount for each of a plurality of regions in the set detection window;
an object determination unit that compares the acquired edge amounts of the regions between predetermined regions and, when the result of the comparison satisfies a predetermined condition, determines the presence or absence of the object for the image in the set detection window; and
a printing control unit that corrects at least a part of the input image according to correction information determined based on the image in the detection window that the object determination unit has determined to contain the object, and performs printing based on the corrected input image.