JP5109518B2

JP5109518B2 - Image processing device

Info

Publication number: JP5109518B2
Application number: JP2007195049A
Authority: JP
Inventors: 宏幸辻
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2007-07-26
Filing date: 2007-07-26
Publication date: 2012-12-26
Anticipated expiration: 2027-07-26
Also published as: JP2009032019A

Description

The present invention relates to a technique for identifying the region of an image in which a specific object appears.

Techniques for determining whether a specific object is present in an image have long existed. With such techniques, the detection rate of the object drops when the image under examination is dark (for example, a lightness of 70 or less when lightness is expressed as a gradation value from 0 to 255). For this reason, one prior-art technique sets the number of detection attempts and the detection threshold of the object detection unit according to the illuminance of the environment in which the image was captured (Patent Document 1).

Patent Document 1: JP 2006-209227 A

However, the above prior art requires an enormous amount of processing to obtain a combination of detection attempt count and detection threshold for each illuminance.

In the technical field of identifying the region of an image in which a specific object appears, there is therefore a demand for a technique that solves this problem: one that, with simple processing, can detect a specific object in an image with at least a certain level of reliability regardless of the brightness of the image being examined.

An object of the present invention is to detect a specific object in an image with at least a certain level of reliability, regardless of the brightness of the image being examined.

To achieve this object, in one aspect of the present invention the following processing is performed when identifying the portion of an image in which a predetermined object exists.
(a) When a predetermined first condition is satisfied, second image data is generated by performing a first conversion that converts reference gradation values falling within at least a partial numerical range of the first image data into reference gradation values corresponding to brighter colors. Here, a reference gradation value is a gradation value relating to the color of a pixel of the image data.
(b) Based on the reference gradation values of the pixel colors of the first image data, a first portion in which the predetermined object exists is identified in the image of the first image data.
(c) When the second image data has been generated, a second portion in which the predetermined object exists is identified in the image of the second image data, based on the reference gradation values of the pixel colors of the second image data.
(d) Based on the first portion of the image of the first image data and the second portion of the image of the second image data, the set of portions in which the predetermined object exists is determined in the image of the first image data.
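Steps (a) through (d) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `find_object_parts`, `brighten`, and `mean_value`, the threshold of 70, the factor of 2.0, and the use of a duplicate-free union for step (d) are all assumptions made for the example, and the detector is passed in as an arbitrary function.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]            # rows of reference gradation values in 0..255
Box = Tuple[int, int, int, int]    # (left, top, width, height)
Detector = Callable[[Image], List[Box]]


def mean_value(image: Image) -> float:
    """Average reference gradation value over all pixels."""
    values = [v for row in image for v in row]
    return sum(values) / len(values)


def brighten(image: Image, factor: float = 2.0) -> Image:
    """First conversion (assumed here: multiply by a constant, clip at 255)."""
    return [[min(255, round(v * factor)) for v in row] for row in image]


def find_object_parts(image: Image, detector: Detector,
                      dark_threshold: float = 70.0) -> List[Box]:
    first_parts = detector(image)                  # step (b)
    second_parts: List[Box] = []
    if mean_value(image) < dark_threshold:         # first condition (assumed form)
        second_image = brighten(image)             # step (a)
        second_parts = detector(second_image)      # step (c)
    # Step (d): merged here as a duplicate-free union (an assumption).
    merged = list(first_parts)
    for box in second_parts:
        if box not in merged:
            merged.append(box)
    return merged
```

When the first condition is not met, `second_parts` stays empty and the result depends on the first portion alone.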

Note that determining the set "based on the first portion of the image of the first image data and the second portion of the image of the second image data" means, when no second image data exists, determining it "based on the first portion of the image of the first image data".

With this aspect, a specific object can be detected in an image with at least a certain level of reliability, regardless of the brightness of the image being examined.

The reference gradation value can be a gradation value representing the lightness of a pixel. In many cases this allows the object to be identified with higher accuracy than when processing is based on a gradation value representing a single color component.

The reference gradation value can also be a gradation value representing the intensity of the green color component of a pixel. With this aspect too, an object such as a human face can be identified with higher accuracy than in aspects that use the gradation values of other color components.

The numerical range (at least a partial range) over which the reference gradation value is converted preferably includes the range extending from the reference gradation value corresponding to the darkest color up to 25% of the width of the range that the reference gradation value can take.

With this aspect, for portions of the image whose reference gradation values lie within 25% of the range from the darkest value, the values are converted to those of brighter colors before the second portion is identified. Consequently, even when the object is rendered in dark colors in the image, the portion containing it is likely to be identified as the second portion.

When converting the reference gradation value, the conversion is preferably performed by multiplying the reference gradation value by a constant.

When the reference gradation value is one for which, other conditions being equal, a smaller value represents a darker color, the constant is preferably greater than 1. Multiplying the gradation values by such a constant enlarges the differences between the gradation values of the parts of the image, so that the object is more easily identified when the second portion is identified. Conversely, when the reference gradation value is one for which, other conditions being equal, a smaller value represents a brighter color, the constant can be less than 1.
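A small sketch of the constant-multiplication conversion, assuming 8-bit gradation values where smaller means darker. The function name `scale_gradation`, the factor of 2.0, and clipping at the maximum are assumptions made for illustration.

```python
def scale_gradation(values, factor=2.0, max_value=255):
    """Multiply each gradation value by a constant and clip to the valid range."""
    return [min(max_value, round(v * factor)) for v in values]


dark_pixels = [10, 20, 30]
print(scale_gradation(dark_pixels))   # differences of 10 become differences of 20
print(scale_gradation([200, 250]))    # both values clip to the maximum
```

The second call shows the drawback of this conversion: distinct bright values can collapse to the same value after clipping.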

The conversion can also be performed by applying a gamma conversion to the reference gradation value. Compared with multiplying the reference gradation value by a constant, this reduces the likelihood that the differences between gradation values near the minimum or the maximum of the reference gradation value are lost. The second portion can therefore be identified with high accuracy over the entire image of the second image data.

When the reference gradation value is one for which, other conditions being equal, a smaller value represents a darker color, the gamma curve is preferably convex upward. Conversely, when the reference gradation value is one for which, other conditions being equal, a smaller value represents a brighter color, the gamma curve is preferably convex downward.
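As a rough illustration of a convex-upward gamma curve for values where smaller means darker, the sketch below uses the common power-law form with exponent 1/gamma. The exact curve is not given in the text, so the formula, the name `gamma_convert`, and the default gamma of 2.0 are assumptions.

```python
def gamma_convert(value, gamma=2.0, max_value=255):
    """Map a gradation value through a power-law curve.

    With gamma > 1 the exponent 1/gamma is below 1, so the curve is
    convex upward and every value strictly between 0 and max_value is raised.
    """
    normalized = value / max_value
    return round(max_value * normalized ** (1.0 / gamma))
```

Unlike multiplication by a constant, this mapping keeps 0 at 0 and the maximum at the maximum, so bright values such as 250 and 255 remain distinguishable after conversion.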

The conversion can also be performed by changing the reference gradation value by a predetermined amount. With this aspect, the reference gradation values of portions of the image whose values lie between the minimum reference gradation value and a predetermined value are raised before the second portion, in which the object exists, is identified. Thus, for example, even when the object is rendered with gradation values near the maximum or minimum, where identification tends to fail, the object can be represented with gradation values outside that failure-prone range and identified as the second portion.

The predetermined amount of this change is preferably 10% to 40% of the width of the range that the reference gradation value can take. With this aspect, gradation values near the ends of the range, where the object is easily overlooked when identifying the portion in which it exists, are converted as a whole to values closer to the median, where oversights are less likely.
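A minimal sketch of the fixed-amount conversion, assuming 8-bit values where smaller means darker and taking the range width as 255. The name `shift_gradation`, the default of 25% (inside the stated 10% to 40% band), and applying the shift to every value with clipping are assumptions for illustration.

```python
def shift_gradation(values, fraction=0.25, max_value=255):
    """Raise each gradation value by a fixed fraction of the range width."""
    offset = round(max_value * fraction)   # 25% of 255 rounds to 64
    return [min(max_value, v + offset) for v in values]
```

For example, `shift_gradation([0, 30, 250])` moves the two dark values to 64 and 94, away from the lower end of the range, while 250 clips to 255.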

When the reference gradation value is one for which, other conditions being equal, a smaller value represents a darker color, the following aspect is preferable. Namely, the first condition preferably includes that the average of the reference gradation values of the pixels of the first image data is smaller than a predetermined threshold. With this aspect, the second image data is generated and the second portion is identified when the image of the first image data is dark overall. When the image of the first image data is bright overall, neither the generation of the second image data nor the identification of the second portion is performed. The overall processing load can therefore be reduced while limiting any loss of accuracy in identifying the object.

It is also preferable that the first condition includes the following: when the image area of the first image data is divided into a first area and a second area surrounding the first area, the average of the reference gradation values of the pixels in the first area is smaller than a predetermined threshold.

With this aspect, when the part of the image of the first image data near the center, where the object is most likely to exist, is rendered in bright colors, neither the generation of the second image data nor the identification of the second portion is performed. The overall processing load can therefore be reduced without lowering the accuracy with which the object is identified.
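The central-region form of the first condition might be checked as in the sketch below. The names `region_mean` and `needs_second_image`, the threshold of 70, and the choice of a centered rectangle covering half the width and height as the "first area" are assumptions, since the text does not fix the size of that area.

```python
def region_mean(image, left, top, width, height):
    """Average gradation value inside a rectangular region of the image."""
    values = [v for row in image[top:top + height] for v in row[left:left + width]]
    return sum(values) / len(values)


def needs_second_image(image, threshold=70.0, central_fraction=0.5):
    """True when the central area is darker on average than the threshold."""
    height, width = len(image), len(image[0])
    central_w = max(1, int(width * central_fraction))
    central_h = max(1, int(height * central_fraction))
    left = (width - central_w) // 2
    top = (height - central_h) // 2
    return region_mean(image, left, top, central_w, central_h) < threshold
```

Passing `central_fraction=1.0` reduces this to the whole-image average described for the simpler form of the first condition.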

The first object identification unit may include a first detection module that identifies the first portion and has been trained in advance using sample image data. Likewise, the second object identification unit may include a second detection module that identifies the second portion and has been trained in advance using sample image data.

The first and second object identification units may be a single component or separate components. Similarly, the first and second detection modules may be a single component or separate components.

The predetermined object can be a human face.

When identifying the first portion, in which the predetermined object exists, in the image of the first image data, it is preferable to use a detection module that has been trained in advance using sample image data. The same detection module can then also be used to identify the second portion, in which the predetermined object exists, in the image of the second image data.

Furthermore, the following processing is preferably performed for the detection module.
(e) First sample image data is prepared.
(f) When a predetermined second condition is satisfied, second sample image data is generated by performing a second conversion that converts reference gradation values, which are gradation values relating to the pixel colors of the first sample image data and fall within at least a partial numerical range, into reference gradation values corresponding to brighter colors.
(g) Prior to step (b), the detection module is trained using the first sample image data.
(h) When the second sample image data has been generated, the detection module is trained using the second sample image data prior to step (c).
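Steps (e) through (h) amount to extending the training set with brightened copies of the dark samples. The sketch below only assembles that set and leaves the actual training to whatever learner is used. The name `build_training_set`, the whole-image mean as the second condition, the threshold of 70, and the factor of 2.0 are assumptions for illustration.

```python
def build_training_set(samples, threshold=70.0, factor=2.0):
    """Return the first samples plus converted copies of the dark ones."""
    training = []
    for image in samples:                            # step (e): prepared samples
        training.append(image)                       # used for training in step (g)
        values = [v for row in image for v in row]
        if sum(values) / len(values) < threshold:    # second condition (assumed form)
            converted = [[min(255, round(v * factor)) for v in row] for row in image]
            training.append(converted)               # step (f), used in step (h)
    return training
```

Bright samples pass through unchanged and produce no second sample, which matches the option of generating second sample image data for only some of the first samples.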

With this aspect, the detection module can be trained on data whose gradation values have been modified in the same way as in the processing performed when the detection module identifies the first and second portions. Compared with training on sample image data that has not been processed at all, the detection module can thus be trained efficiently so that accuracy is higher when actual objects are identified.

When multiple sets of first sample image data are prepared, second sample image data may be generated for some of them and not for the others.

When the reference gradation value is one for which, other conditions being equal, a smaller value represents a darker color, the following aspect is also preferable. Namely, the second condition preferably includes that the average of the reference gradation values of the pixels of the first sample image data is smaller than a predetermined threshold. With this aspect, when the image of the first sample image data is represented by gradation values of bright colors overall, neither the generation of the second sample image data nor the training on it is performed. The load of the training process as a whole can therefore be reduced.

It is also preferable that the second condition includes the following: when the image area of the first sample image data is divided into a third area and a fourth area surrounding the third area, the average of the reference gradation values of the set of pixels in the third area is smaller than a predetermined threshold.

With this aspect, when the part of the image of the first sample image data near the center, where the object is most likely to exist, is represented by gradation values of bright colors, neither the generation of the second sample image data nor the training on it is performed. The load of the overall training process can therefore be reduced without lowering the accuracy of the training for identifying the object.

The numerical range (at least a partial range) subject to gradation value conversion when generating the second sample image data is preferably equal to the numerical range subject to gradation value conversion when generating the second image data. With this aspect, the detection module can be trained, using the second sample image data, so that the second portion is identified with higher accuracy. The two numerical ranges may, however, also be different.

The numerical range (at least a partial range) subject to gradation value conversion when generating the second sample image data preferably includes the range extending from the reference gradation value corresponding to the darkest color up to 25% of the width of the range that the reference gradation value can take. With this aspect, training can be performed after raising the reference gradation values of portions of the sample image whose values lie within 25% of the range from the minimum reference gradation value. Consequently, even when the object is rendered in dark colors in the image of the sample image data, effective training can be performed based on that object.

When generating the second sample image data, the conversion is preferably performed by multiplying the reference gradation value by a constant.

It is also preferable to perform the conversion by applying a gamma conversion to the reference gradation value when generating the second sample image data.

It is further preferable to perform the conversion by changing the reference gradation value by a predetermined amount when generating the second sample image data. With this aspect, training can be performed after raising the reference gradation values of portions of the sample image whose values lie between the minimum reference gradation value and a predetermined value. Thus, for example, even when the object in the image of the sample image data is rendered with gradation values near the maximum or minimum, where identification tends to fail, the conversion represents the object with gradation values outside that failure-prone range, and effective training can be performed on that object.

The present invention can be realized in various forms, for example: an image processing method and an image processing apparatus; a print control method and a print control apparatus; a computer program for realizing the functions of those methods or apparatuses; a recording medium on which that computer program is recorded; and a data signal that includes the computer program and is embodied in a carrier wave.

A. First embodiment:
A1. Device configuration:
FIG. 1 is an explanatory diagram showing the schematic configuration of an image processing apparatus according to an embodiment of the present invention. The image processing apparatus includes a personal computer 100 that performs predetermined image processing on image data; a keyboard 120, a mouse 130, and a CD-R/RW drive 140 as devices for inputting information to the personal computer 100; and a display 110, a printer 22, and a projector 32 as devices for outputting information. On the computer 100, an application program 95 runs under a predetermined operating system. By executing this application program 95, the CPU 102 of the computer 100 realizes various functions.

When the application program 95, which performs image retouching and similar tasks, is executed and a user instruction is input from the keyboard 120 or the mouse 130, the CPU 102 reads image data into memory from the CD-RW in the CD-R/RW drive 140. The CPU 102 performs predetermined image processing on the image data and displays the image on the display 110 via the video driver 91. The CPU 102 can also have the printer 22 print the processed image data via the printer driver 96. In addition, the CPU 102 can have the projector 32 project the image data via the driver 98 of the projector 32.

In this embodiment, the printer driver 96 identifies regions of the image of the image data in which a human face is thought to exist. If such a region exists, the image data is corrected prior to printing so that the human face looks more attractive. If no such region exists, either a different correction is applied to the image data or no correction is applied. Printing is then executed based on the image data.

A2. Principle of face region detection:
In this embodiment, the processing that identifies the region of an image in which a human face exists is executed by a module of the printer driver 96. This module is called the "face detection unit 962". The face detection unit 962 is trained in advance using sample image data before it is implemented in the printer driver 96.

FIG. 2 is a diagram showing a method for identifying a region of the image of the image data in which a human face is considered likely to exist. The processing described below is executed by the face detection unit 962 of the printer driver 96. In this specification, a region in which a human face is considered likely to exist is referred to as a "face region".

In FIG. 2, the image PI1 to be examined for the presence of face regions is, for example, an image of 320 pixels × 240 pixels. The image data of the image PI1 holds only lightness information for the color of each pixel. To identify a face region, a detection window DW no larger than the image is used to select an image region IDW to be examined within the image PI1, and the lightness data of each pixel in that image region IDW is extracted. Based on this lightness data, it is then examined whether a pattern that appears to be a human face exists in the image region IDW.

When the examination of one image region is finished, the detection window DW is moved within the image PI1. The window moves from left to right (see arrow Ah in FIG. 2). When the detection window DW reaches the right edge of the image PI1, it is next placed at the left edge of the image PI1, at a position below its previous position. The detection window DW continues to be moved in the same manner. Here, in each horizontal move, the detection window DW is moved by a distance dh that is larger than the width Wh of the detection window DW (see arrow Ah in FIG. 2). In vertical moves as well, the detection window DW is moved by a distance larger than the height Wv of the detection window DW.
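The coarse scan order can be sketched as a generator of window positions. The function name `coarse_positions` and the example step of 24 pixels for a 20-pixel window are assumptions; the text only states that each step is larger than the window's width or height.

```python
def coarse_positions(image_w, image_h, win_w, win_h, step_x, step_y):
    """List (left, top) window positions, left to right, then top to bottom."""
    positions = []
    top = 0
    while top + win_h <= image_h:
        left = 0
        while left + win_w <= image_w:
            positions.append((left, top))
            left += step_x        # horizontal move (arrow Ah), step_x > win_w
        top += step_y             # next row of positions, step_y > win_h
    return positions
```

For a 320 × 240 image, `coarse_positions(320, 240, 20, 20, 24, 24)` yields 13 positions per row and 10 rows, far fewer than a pixel-by-pixel scan would visit.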

If, as a result of analyzing the data extracted through the detection window DW, the image region IDW cannot be judged to be "a face region" but is judged to be "possibly a face region", the detection window DW is moved by one pixel in each of the up, down, left, and right directions (see arrow As in FIG. 2). The data of the image region within the detection window DW is analyzed at each of these positions. If the image region within the detection window DW at any of the moved positions is judged to be "a face region", movement by a distance larger than the width (or height) of the detection window DW (see arrow Ah in FIG. 2) is resumed.

On the other hand, if none of the images within the detection window DW after the up, down, left, and right moves can be judged to be "a face region" but all are judged to be "possibly a face region", the following processing is performed. The detection window DW is again moved by one pixel in each of the up, down, left, and right directions, starting from the position judged most likely to be a face region. The same processing is repeated over a predetermined range in the image PI1, and with the detection window at the position most likely to be a face region, a final judgment is made as to whether a face region exists around that image region. Movement by a distance larger than the width (or height) of the detection window DW (see arrow Ah in FIG. 2) is then resumed.
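The one-pixel refinement behaves like a small hill-climbing search. The sketch below is a simplified reading, not the exact procedure: `evaluate` is a hypothetical callback returning a verdict string and a likelihood score, `max_steps` stands in for the "predetermined range", and stopping when no neighbor is even a possible face is an assumption.

```python
def refine_position(start, evaluate, max_steps=8):
    """Search around `start` one pixel at a time; return a face position or None.

    evaluate(pos) must return (verdict, likelihood), where verdict is
    "face", "maybe", or "not_face".
    """
    current = start
    for _ in range(max_steps):
        x, y = current
        neighbours = [(x, y - 1), (x, y + 1), (x - 1, y), (x + 1, y)]
        results = [(evaluate(pos), pos) for pos in neighbours]
        for (verdict, _), pos in results:
            if verdict == "face":
                return pos                  # judged a face region: stop refining
        candidates = [(likelihood, pos)
                      for (verdict, likelihood), pos in results
                      if verdict == "maybe"]
        if not candidates:
            return None                     # assumed: give up when nothing is plausible
        # Continue from the neighbour judged most likely to be a face region.
        current = max(candidates, key=lambda item: item[0])[1]
    return None
```

After `refine_position` returns, the caller would resume the coarse scan from the next window position.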

FIG. 3 is a diagram showing the processing that judges whether the image region IDW of the data extracted through the detection window DW is a face region. This judgment is made through 24 stages (shown as St1 to St24 in FIG. 3). The judgment of the second stage St2 is performed only if the data of the image region satisfies a predetermined condition in the judgment of the first stage St1. The same applies to the third through twenty-third stages St3 to St23. An image region IDW that also satisfies the condition in the judgment of the final, twenty-fourth stage is judged to be "a face region".

An image that satisfies the conditions up to the ninth stage is judged to be one that "cannot be judged to be a face region but may be a face region". In that case, as described above, the detection window DW is moved by one pixel in the up, down, left, and right directions (see arrow As in FIG. 2), and data analysis and window movement are repeated.
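The staged judgment can be sketched as an early-exit cascade. Each stage is modeled here as a hypothetical function that takes the window data and returns True when its condition is met; the names `count_passed_stages` and `classify_window` and the verdict strings are assumptions.

```python
def count_passed_stages(window, stages):
    """Run the stages in order and stop at the first one that fails."""
    passed = 0
    for stage in stages:
        if not stage(window):
            break
        passed += 1
    return passed


def classify_window(window, stages, maybe_after=9):
    """Map the number of passed stages to a verdict."""
    passed = count_passed_stages(window, stages)
    if passed == len(stages):
        return "face"        # all stages (24 in the embodiment) passed
    if passed >= maybe_after:
        return "maybe"       # passed through the ninth stage but not all
    return "not_face"
```

Because most windows fail in an early stage, later and presumably more expensive stages run only for the few windows that remain plausible.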

FIG. 4 is a diagram showing the judgment processing in one stage. In each stage (see FIG. 3), a judgment is made using rectangular filters. FIG. 4 shows rectangular filters F11 and F12 as examples. The image of the image region IDW to which the rectangular filter is applied is shown overlaid on the filter. In the example of FIG. 4, the image of the image region IDW is assumed to be a face image, and the eyes, nose, and mouth of a human face are shown.

A rectangular filter has a size of 20 pixels × 20 pixels. To simplify the explanation, the image region IDW of the data extracted through the detection window DW is assumed here to be the same size as the rectangular filter. That is, the image region IDW is also 20 pixels × 20 pixels. The rectangular filter is used to extract the lightness information of the pixels contained in partial areas of the image region IDW (shown hatched in FIG. 4).

矩形フィルタＦ１１を使用した判定においては、まず、矩形フィルタＦ１１を使用して、画像領域ＩＤＷのデータのうち、領域Ａ１１ａの各画素の明度のデータＹ１１ａが取り出される。領域Ａ１１ａは、画像領域ＩＤＷにおいて高さ方向の中央よりも上側にあって画像領域ＩＤＷの左右幅と等しい幅を有する長方形の領域である。また、同様に、矩形フィルタＦ１１を使用して、領域Ａ１１ｂの各画素の明度のデータＹ１１ｂも取り出される。領域Ａ１１ｂは、画像領域ＩＤＷにおいて高さ方向の中央よりも下側にあって画像領域ＩＤＷの左右幅と等しい幅を有する長方形の領域である。 In the determination using the rectangular filter F11, first, the brightness data Y11a of each pixel of the area A11a is extracted from the data of the image area IDW using the rectangular filter F11. The area A11a is a rectangular area that is above the center in the height direction in the image area IDW and has a width equal to the horizontal width of the image area IDW. Similarly, the brightness data Y11b of each pixel in the area A11b is also extracted using the rectangular filter F11. The area A11b is a rectangular area that is below the center in the height direction in the image area IDW and has a width equal to the horizontal width of the image area IDW.

なお、領域Ａ１１ａは、画像領域ＩＤＷが顔領域である場合に、人間の目があると推定される領域である。また、領域Ａ１１ｂは、画像領域ＩＤＷが顔領域である場合に、人間の口があると推定される領域である（図４の上段左側参照）。 Note that the area A11a is an area that is estimated to have human eyes when the image area IDW is a face area. The area A11b is an area that is estimated to have a human mouth when the image area IDW is a face area (see the upper left side of FIG. 4).

Then, it is determined whether the sum of two values is larger than a predetermined reference value θ11: the sum of the brightness Y11a of the pixels in the area A11a multiplied by α11a (α11a is a constant), and the sum of the brightness Y11b of the pixels in the area A11b multiplied by α11b (α11b is a constant). When the total is larger than θ11, the value D11y is returned as the value A11 of the determination result using the rectangular filter F11. When the total is θ11 or less, the value D11n is returned as the value A11 of the determination result using the rectangular filter F11 (see the upper right of FIG. 4). The values D11y and D11n are predetermined constants.

同様に、矩形フィルタＦ１２を使用した判定においては、矩形フィルタＦ１２を使用して、画像領域ＩＤＷのデータのうち、領域Ａ１２ａの各画素の明度のデータＹ１２ａが取り出される。領域Ａ１２ａは、領域Ａ１１ａの左半分の領域Ａ１１ａｌの一部であって、領域Ａ１１ａｌの中心を含む領域である。また、矩形フィルタＦ１２を使用して、画像領域ＩＤＷのデータのうち、領域Ａ１２ｂの各画素の明度のデータＹ１２ｂが取り出される。領域Ａ１２ｂは、領域Ａ１１ａの右半分の領域Ａ１１ａｒの一部であって、領域Ａ１１ａｒの中心を含む領域である。 Similarly, in the determination using the rectangular filter F12, the brightness data Y12a of each pixel of the area A12a is extracted from the data of the image area IDW using the rectangular filter F12. The area A12a is a part of the left half area A11al of the area A11a and includes the center of the area A11al. Also, using the rectangular filter F12, brightness data Y12b of each pixel in the area A12b is extracted from the data in the image area IDW. The area A12b is a part of the right half area A11ar of the area A11a and includes the center of the area A11ar.

なお、領域Ａ１２ａは、画像領域ＩＤＷが顔領域である場合に、人間の右目があると推定される領域である。また、領域Ａ１２ｂは、画像領域ＩＤＷが顔領域である場合に、人間の左目があると推定される領域である（図４の中段左側参照）。 The area A12a is an area that is estimated to have a human right eye when the image area IDW is a face area. Further, the area A12b is an area that is estimated to have a human left eye when the image area IDW is a face area (see the left side in the middle of FIG. 4).

Then, it is determined whether the sum of two values is larger than a predetermined reference value θ12: the sum of the brightness Y12a of the pixels in the area A12a multiplied by α12a (α12a is a constant), and the sum of the brightness Y12b of the pixels in the area A12b multiplied by α12b (α12b is a constant). When the total is larger than θ12, the value D12y is returned as the value A12 of the determination result using the rectangular filter F12. When the total is θ12 or less, the value D12n is returned as the value A12 of the determination result using the rectangular filter F12 (see the middle right of FIG. 4). The values D12y and D12n are predetermined constants.

As described above, one or more rectangular filters are used for determination in the processing of a single stage. Then, it is determined whether the total of the determination result values A11, A12, ... of the rectangular filters is larger than a predetermined reference value Θ1 (see the lower part of FIG. 4). When the total is larger than Θ1, the image area IDW is determined to satisfy the determination condition of this stage. When the total is Θ1 or less, the image area IDW is determined not to satisfy the determination condition of this stage. When the image area IDW does not satisfy the determination condition of this stage, the image area IDW is determined not to be a face area, and the process ends. When the image area IDW satisfies the determination condition of this stage, the determination of the next stage is performed (see FIG. 3).
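The stage logic described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the `(top, left, height, width)` region layout, and every numeric value (weights, θ, Θ, D values) are hypothetical placeholders, since the real filters and reference values come from learning.

```python
def region_sum(img, top, left, height, width):
    """Sum of brightness values inside a rectangular region of a 2D list."""
    return sum(sum(row[left:left + width]) for row in img[top:top + height])


def filter_vote(img, regions, theta, d_yes, d_no):
    """One rectangular-filter decision: weighted region sums compared with theta."""
    total = sum(alpha * region_sum(img, *rect) for alpha, rect in regions)
    return d_yes if total > theta else d_no


def stage_passes(img, filters, big_theta):
    """A stage passes when the sum of its filter results exceeds big_theta."""
    return sum(filter_vote(img, **f) for f in filters) > big_theta


def is_face(img, stages):
    """Run the stages in order; all() stops at the first stage that fails."""
    return all(stage_passes(img, filters, big_theta)
               for filters, big_theta in stages)


# Hypothetical filter in the spirit of F11: upper half versus lower half.
f11 = {"regions": [(-1.0, (0, 0, 10, 20)), (1.0, (10, 0, 10, 20))],
       "theta": 1000.0, "d_yes": 1.0, "d_no": -1.0}
stages = [([f11], 0.0)]  # a single stage with a made-up reference value

dark_top = [[50] * 20 for _ in range(10)] + [[200] * 20 for _ in range(10)]
flat = [[128] * 20 for _ in range(20)]
```

With these made-up values, `is_face(dark_top, stages)` is true and `is_face(flat, stages)` is false, because the flat image has no brightness difference between the two regions.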

The above process is performed using detection windows of multiple sizes. For example, detection windows of multiple sizes ranging from 20 × 20 pixels to 200 × 200 pixels are used to detect face areas. Faces drawn at various sizes in the image can therefore be detected as face areas. A detection window with an area (number of pixels) of 50% or more of the detection target image PI1 is effective for recognizing the face area when a single face fills most of the image, as in an ID photo.

In the first embodiment, the size of the rectangular filter is 20 × 20 pixels. When a detection window DW of any other size is used, the image of the data extracted by the detection window DW is therefore resolution-converted to 20 × 20 pixels before being subjected to the above determination.
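As a rough illustration of the resolution conversion, the following Python sketch crops a square window and shrinks it to 20 × 20 pixels. The text does not state which interpolation is used; nearest-neighbor sampling is assumed here purely for brevity, and the function names are hypothetical.

```python
def crop(img, top, left, size):
    """Extract a size x size window from a 2D list of brightness values."""
    return [row[left:left + size] for row in img[top:top + size]]


def resize_nearest(window, out_size=20):
    """Shrink (or enlarge) a square window to out_size x out_size
    by nearest-neighbor sampling (an assumed method)."""
    in_size = len(window)
    return [[window[(y * in_size) // out_size][(x * in_size) // out_size]
             for x in range(out_size)]
            for y in range(out_size)]


# A 40 x 40 test image whose pixel value encodes its position.
image = [[y * 40 + x for x in range(40)] for y in range(40)]
small = resize_nearest(crop(image, 0, 0, 40))
```

The resulting 20 × 20 data can then be passed to the same stage determination regardless of the original window size.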

The above determination is made based on the brightness of the pixels. In an image represented by the brightness of each pixel, an object is represented by the brightness differences between pixels. That is, the above determination is made based on the difference in pixel brightness between one partial area of the image (see areas A11a, A11b, A12a, and A12b in FIG. 4) and another partial area. For this reason, the face area of a person who appears dark overall may not be determined to be a “face area”. For example, as shown in FIG. 2, when a photograph is taken with a flash at night, the flash light may reach the person P1 in the foreground but not the person P2 in the back. In such a case, as a result of the above process, the face area Af1 of the person P1 is determined to be a “face area”, whereas the face area Af2 of the person P2 may not be determined to be a “face area” (see FIG. 2).

Conversely, the face area of a person who is lit too strongly, so that the entire face appears white, may also not be determined to be a “face area”. For example, when a photograph is taken with a flash at night, the person P2 in the back may be moderately lit by the flash, while the person P1 in the foreground is hit by the flash so strongly that the entire face turns white. In such a case, the face area Af2 of the person P2 is determined to be a “face area”, whereas the face area Af1 of the person P1 may not be determined to be a “face area”.

なお、上記の処理において、どのような矩形フィルタが使用されるかは、サンプルの画像データを用いて行われる顔検出部９６２の学習において、顔領域の検出に先立って、あらかじめ決定される。また、θ１１，θ１２，Θ１などの基準値も、顔検出部９６２の学習において決定される。一方、α１１ａ，α１１ｂなどの係数は、矩形フィルタと対応づけられて予め決定されている。すなわち、学習において、各ステージで使用される矩形フィルタが決定されると、使用される係数も同時に決定される。 It should be noted that what kind of rectangular filter is used in the above processing is determined in advance prior to the detection of the face region in the learning of the face detection unit 962 performed using the sample image data. Further, reference values such as θ11, θ12, and Θ1 are also determined in learning by the face detection unit 962. On the other hand, coefficients such as α11a and α11b are determined in advance in association with the rectangular filter. That is, in learning, when the rectangular filter used in each stage is determined, the coefficient used is also determined simultaneously.

A3. Face area detection process:
FIG. 5 is a flowchart showing the process performed when the face detection unit 962 learns. In step S110, a first learning sample image data group is prepared. The first learning sample image data constituting the first learning sample image data group is, for example, image data of 20 × 20 pixels. The first learning sample image data holds, for each pixel, only brightness information expressed as a gradation value from 0 to 255. Gradation value 0 represents the darkest brightness, and gradation value 255 represents the brightest. The first learning sample image data group includes, for example, 10,000 pieces of image data in which a face actually exists and, for example, 20,000 pieces of image data in which no face exists.

In step S120, it is determined whether the first learning sample image data group contains data whose average brightness over all pixels is equal to or less than Ths. If such data (hereinafter called “dark sample data”) exists, the process proceeds to step S130. If the first learning sample image data group contains no dark sample data, the process proceeds to step S150. Ths can be set to 70, for example. The first learning sample image data group is preferably prepared in advance so that part of it consists of dark sample data whose average brightness over all pixels is equal to or less than Ths.

ステップＳ１３０では、第１の学習サンプル画像データグループ中の暗サンプルデータに基づいて、第２の学習サンプル画像データグループを生成する。第２の学習サンプル画像データグループを構成する第２の学習サンプル画像データは、第１の学習サンプル画像データグループの各暗サンプルデータの画素の明度をそれぞれより明るい明度に置き換えることによって生成される。 In step S130, a second learning sample image data group is generated based on the dark sample data in the first learning sample image data group. The second learning sample image data constituting the second learning sample image data group is generated by replacing the lightness of each dark sample data pixel of the first learning sample image data group with a lighter brightness.

ステップＳ１４０では、第２の学習サンプル画像データグループを使用して、顔検出部９６２の学習が行われる。 In step S140, the face detection unit 962 learns using the second learning sample image data group.

ステップＳ１５０では、第１の学習サンプル画像データグループの第１の学習サンプル画像データのうち暗サンプルデータ以外の画像データを使用して、顔検出部９６２の学習が行われる。 In step S150, learning by the face detection unit 962 is performed using image data other than dark sample data among the first learning sample image data of the first learning sample image data group.

このように、全画素の平均明度がＴｈｓ（７０）以下の画像データと（ステップＳ１２０〜Ｓ１４０参照）、全画素の平均明度がＴｈｓより大きい画像データと（ステップＳ１５０参照）、を使用して学習を行うことで、プリンタドライバ９６における顔領域の検出の際の正答率を、明るい画像と暗い画像の両方について、高くすることができる。 In this way, learning is performed using image data in which the average brightness of all pixels is equal to or less than Ths (70) (see steps S120 to S140) and image data in which the average brightness of all pixels is greater than Ths (see step S150). By performing the above, the correct answer rate when detecting the face area in the printer driver 96 can be increased for both the bright image and the dark image.
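The split of learning samples in steps S120 to S150 can be sketched as follows. This is a minimal sketch: the function names are hypothetical, labels (face or non-face) are omitted, and the brightening uses the doubling of the first embodiment as one possible conversion.

```python
THS = 70  # example threshold given in the text


def mean_brightness(img):
    """Average brightness over all pixels of a 2D list."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def split_samples(samples, ths=THS):
    """Step S120: separate dark sample data from the remaining samples."""
    dark = [s for s in samples if mean_brightness(s) <= ths]
    bright = [s for s in samples if mean_brightness(s) > ths]
    return dark, bright


def brighten(img, factor=2):
    """Replace each brightness with a brighter one, capped at 255."""
    return [[min(255, p * factor) for p in row] for row in img]


def build_training_groups(samples):
    """Return (second group built from dark samples, remaining samples)."""
    dark, bright = split_samples(samples)
    second_group = [brighten(s) for s in dark]  # step S130
    return second_group, bright                 # used in S140 and S150


samples = [[[10, 20], [30, 40]], [[200, 210], [220, 230]]]
second_group, bright_group = build_training_groups(samples)
```

Here the first toy sample (mean 25) lands in the second group after brightening, while the second (mean 215) is used unchanged.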

図６は、画像データの画像中において顔が存在する領域を決定する処理を示すフローチャートである。各ステップは、プリンタドライバ９６によって実行される。 FIG. 6 is a flowchart showing processing for determining a region where a face exists in an image of image data. Each step is executed by the printer driver 96.

First, in step S210, the image data in which face areas are to be identified is read from a CD-R in the CD-R/RW drive 140 (see FIG. 1), and first image data is generated based on that image data. The image data in which face areas are to be identified (hereinafter also called “original image data”) is, for example, 24-bit color image data of 2560 × 1920 pixels. The first image data has an image size of 320 × 240 pixels and holds only brightness information for each pixel.

The first image data is generated by extracting the brightness information of each pixel from the original image data and then performing resolution conversion. The brightness Y of each pixel is obtained, for example, by the following equation, where R, G, and B are the red, green, and blue gradation values (R, G, B = 0 to 255).

Y = 0.299R + 0.587G + 0.114B ... (1)

Reducing the image resolution and limiting the color information of each pixel to brightness in this way lightens the processing load compared with a mode in which face areas are identified using the target image data as is. The process of step S210 is executed by the first image data generation unit 963, a functional unit of the printer driver 96 (see FIG. 1).
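Step S210 can be illustrated with a short Python sketch that applies equation (1) and reduces the resolution by a factor of 8 (2560 × 1920 to 320 × 240). The text does not say how the resolution conversion is done; block averaging is assumed here, and the function names are hypothetical.

```python
def luma(r, g, b):
    """Brightness Y from equation (1)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_first_image(rgb, factor=8):
    """Convert an RGB image (2D list of (R, G, B) tuples) to a
    brightness-only image reduced by `factor` in each direction,
    averaging each factor x factor block (an assumed method)."""
    height, width = len(rgb) // factor, len(rgb[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            block = [luma(*rgb[y * factor + dy][x * factor + dx])
                     for dy in range(factor) for dx in range(factor)]
            row.append(round(sum(block) / len(block)))
        out.append(row)
    return out


# A tiny 16 x 16 all-white image becomes a 2 x 2 brightness image.
white = [[(255, 255, 255)] * 16 for _ in range(16)]
first = to_first_image(white)
```

The same function applied to a 2560 × 1920 input would yield the 320 × 240 brightness-only data described above.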

なお、顔領域を特定する対象となる画像データの画像ＰＩ０は、夜間にフラッシュをたいて撮影された写真画像であるものとする。第１の画像データの画像ＰＩ１は、解像度が異なる点および明度のみの画像である点以外は、オリジナル画像データの画像ＰＩ０と同じ画像である（図２参照）。画像ＰＩ１中において、手前にいる人物Ｐ１にはフラッシュの光が届いている。このため、人物Ｐ１は、画像ＰＩ１中において、ほぼ７０以上の明度の濃淡で表現されている。これに対して、奥にいる人物Ｐ２にはフラッシュの光が届いていない。このため、人物Ｐ２は、画像ＰＩ１中において、ほぼ７０以下の明度の濃淡で表現されている。 It is assumed that the image PI0 of the image data that is the target for specifying the face area is a photographic image that was shot with a flash at night. The image PI1 of the first image data is the same image as the image PI0 of the original image data except that the images have different resolutions and are only lightness images (see FIG. 2). In the image PI1, flash light reaches the person P1 in front. For this reason, the person P1 is expressed with lightness and shade of about 70 or more in the image PI1. On the other hand, the flash light does not reach the person P2 in the back. For this reason, the person P2 is expressed in light and shade of lightness of about 70 or less in the image PI1.

In step S220 of FIG. 6, it is determined whether the average brightness AY of all pixels of the first image data is equal to or less than a predetermined threshold Thp. When AY is equal to or less than Thp, the process proceeds to step S230. When AY is greater than Thp, the process proceeds to step S250. Thp can be set, for example, to 70, the same value as Ths. Setting Thp equal to Ths makes it possible to use the settings learned from the second learning sample image data (values such as D11y and Θ11 in FIG. 4 and the selected rectangular filters) to identify face areas in the image of the first image data with a high correct answer rate.

Because of step S220, the processes of steps S230 and S240 are performed only for images whose average brightness is 70 or less and for which face area determination is likely to fail. The overall processing load is therefore lighter than in a mode in which steps S230 and S240 are performed for all images.

ステップＳ２３０では、第１の画像データに基づいて、第２の画像データを生成する。第２の画像データは、第１の画像データの画素の明度を、それぞれより明るい明度に置き換えることによって生成される。なお、ステップＳ２２０，Ｓ２３０の処理は、プリンタドライバ９６の機能部としての第２の画像データ生成部９６４が実行する（図１参照）。 In step S230, second image data is generated based on the first image data. The second image data is generated by replacing the brightness of the pixels of the first image data with a lighter brightness. Note that the processing of steps S220 and S230 is executed by the second image data generation unit 964 as a functional unit of the printer driver 96 (see FIG. 1).

ステップＳ２４０では、第２の画像データに基づいて、画像中で人間の顔が存在する可能性が高いと思われる１以上の顔領域が特定される。顔領域を特定する処理の内容は、図２〜図４を使用して説明したとおりである。なお、ステップＳ２４０の処理は、プリンタドライバ９６の顔検出部９６２の機能部としての第２の顔特定部９６６が実行する（図１参照）。 In step S240, based on the second image data, one or more face regions that are likely to have a human face in the image are identified. The contents of the process for specifying the face area are as described with reference to FIGS. Note that the process of step S240 is executed by the second face specifying unit 966 as a functional unit of the face detecting unit 962 of the printer driver 96 (see FIG. 1).

ステップＳ２５０では、第１の画像データに基づいて、１以上の顔領域が特定される。顔領域を特定する処理の内容は、図２〜図４を使用して説明したとおりである。なお、ステップＳ２５０の処理は、プリンタドライバ９６の顔検出部９６２の機能部としての第１の顔特定部９６５が実行する（図１参照）。 In step S250, one or more face regions are specified based on the first image data. The contents of the process for specifying the face area are as described with reference to FIGS. Note that the processing in step S250 is executed by the first face identification unit 965 as a functional unit of the face detection unit 962 of the printer driver 96 (see FIG. 1).

ステップＳ２６０では、ステップＳ２４０で特定された顔領域と、ステップＳ２５０で特定された顔領域と、に基づいて、第１の画像データが表す画像ＰＩ１中の顔領域の集合が決定される。そして、決定された第１の画像データが表す画像ＰＩ１中の顔領域に基づいて、ＣＤ−Ｒ／ＲＷドライブ１４０のＣＤ−Ｒから読み込まれた画像データの画像中の顔領域が決定される。 In step S260, a set of face areas in the image PI1 represented by the first image data is determined based on the face area specified in step S240 and the face area specified in step S250. Then, based on the face area in the image PI1 represented by the determined first image data, the face area in the image of the image data read from the CD-R of the CD-R / RW drive 140 is determined.

なお、画像ＰＩ１中の顔領域の集合を決定する際には、たとえば、互いに７５％以上の画素を共有するＮ個（Ｎは２以上の整数）の顔領域については、その中から一つの顔領域が選択され、他の顔領域は廃棄される。このような態様とすることで、互いに大部分が重複する顔領域が画像ＰＩ１中の複数の顔領域として決定されることを防止できる。選択される顔領域は、たとえば、その中心が、それらＮ個の顔領域によって共有される領域の中心と、最も近い顔領域である。なお、本実施例においては、顔領域は長方形の領域である。長方形の領域の中心は、その長方形の対角線の交点である。 When determining a set of face areas in the image PI1, for example, for N face areas (N is an integer of 2 or more) that share 75% or more of pixels, one face is selected from the face areas. The area is selected and the other face areas are discarded. By adopting such an aspect, it is possible to prevent the face regions that are mostly overlapped with each other from being determined as a plurality of face regions in the image PI1. The selected face area is, for example, the face area whose center is closest to the center of the area shared by the N face areas. In the present embodiment, the face area is a rectangular area. The center of the rectangular area is the intersection of the diagonal lines of the rectangle.

ステップＳ２６０では、そのようにして選択された顔領域によって構成される顔領域の集合を、画像ＰＩ１中の顔領域の集合とする。そして、それらの顔領域に対応する、ＣＤ−Ｒから読み込まれた画像データの画像中の領域を、ＣＤ−Ｒから読み込まれた画像データの画像中の顔領域とする。なお、ステップＳ２６０の処理は、プリンタドライバ９６の機能部としての合成部９６７が実行する（図１参照）。 In step S260, a set of face areas constituted by the face areas selected in this way is set as a set of face areas in the image PI1. Then, an area in the image of the image data read from the CD-R corresponding to the face area is set as a face area in the image of the image data read from the CD-R. Note that the processing in step S260 is executed by the synthesis unit 967 as a functional unit of the printer driver 96 (see FIG. 1).

With this process, face areas that can be identified by normal processing are identified in step S250. Face areas in dark images, for which normal processing is likely to fail, are identified in steps S230 and S240. By taking the union (OR set) of these sets of face areas as the face areas in the image of the image data under examination (see step S260), faces in bright image areas and faces in dark image areas can both be identified as face areas with high probability.
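The overall flow of FIG. 6 might be sketched as below. This sketch assumes, as the description of step S260 and the union of face area sets suggest, that step S250 is also applied to dark images; `detect` and `brighten` are hypothetical callables standing in for the face identification and brightness conversion, and the pruning of overlapping areas in step S260 is left out.

```python
def mean_brightness(img):
    """Average brightness AY over all pixels of a 2D list."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def find_faces(first_image, detect, brighten, thp=70):
    """Union of face areas found in the first image and, for dark
    images only, in the brightened second image."""
    faces = set(detect(first_image))            # step S250
    if mean_brightness(first_image) <= thp:     # step S220
        second_image = brighten(first_image)    # step S230
        faces |= set(detect(second_image))      # step S240
    return faces                                # input to step S260


# Toy stand-ins: a "face" is any pixel of brightness 100 or more.
def toy_detect(img):
    return {(x, y) for y, row in enumerate(img)
            for x, p in enumerate(row) if p >= 100}


def toy_double(img):
    return [[min(255, 2 * p) for p in row] for row in img]


dark_image = [[40, 60], [120, 20]]
bright_image = [[200, 90]]
```

With the toy detector, `find_faces(dark_image, toy_detect, toy_double)` returns both the position found directly and the one that only becomes detectable after brightening.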

A4. Image data conversion:
The following describes the method of generating the second image data based on the first image data in step S230 of FIG. 6. The method of generating the second learning sample image data based on the dark sample data in step S130 of FIG. 5 is the same.

図７は、図６のステップＳ２３０において、第１の画像データに基づいて第２の画像データを生成する方法を示すヒストグラムである。横軸は、０〜２５５の明度である。縦軸は、画像データにおける各明度の頻度である。図７においては、第１の画像データにおける画素の明度の分布Ｄ１と、第２の画像データにおける画素の明度の分布Ｄ２ａを示す。なお、分布Ｄ２ａは離散的となるが、理解を容易にするため、ここでは曲線として示す。 FIG. 7 is a histogram showing a method of generating the second image data based on the first image data in step S230 of FIG. The horizontal axis is the brightness from 0 to 255. The vertical axis represents the frequency of each brightness in the image data. FIG. 7 shows a pixel brightness distribution D1 in the first image data and a pixel brightness distribution D2a in the second image data. The distribution D2a is discrete, but is shown here as a curve for easy understanding.

The second image data is generated by replacing the brightness of each pixel of the first image data with twice that brightness (see D1 and D2a in FIG. 7). As a result, a pixel that had a brightness Y1 within, for example, the brightness range R1 of 0 to 70 in the first image data is converted to a brightness Y2 (Y2 = Y1 × 2) within the brightness range R2a of 0 to 140.

なお、２倍した結果、２５５を越えることとなる明度については、すべて２５５に置き換えられる。すなわち、第１の画像データの画像中において１２６以上の明度で表現されている部分の明度については、すべて２５５に置き換えられる。 Note that the brightness that exceeds 255 as a result of doubling is all replaced with 255. That is, the brightness of the portion expressed by the brightness of 126 or more in the image of the first image data is all replaced with 255.

This process converts an image represented by dark colors in the image of the first image data into an image that is brighter and has larger brightness differences (see ranges R1 and R2a in FIG. 7). As a result, in step S240 of FIG. 6, face areas can be identified with high accuracy by the process described with FIGS. 2 to 4.
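The doubling conversion of the first embodiment is simple enough to show directly. The following is a minimal Python sketch with a hypothetical function name; it doubles each brightness and caps the result at 255.

```python
def double_brightness(img):
    """Replace each brightness Y1 with Y2 = min(255, 2 * Y1)."""
    return [[min(255, 2 * y1) for y1 in row] for row in img]


row_in = [[0, 35, 70, 127, 200]]
row_out = double_brightness(row_in)
```

For this sample row the result is `[[0, 70, 140, 254, 255]]`: values in the range 0 to 70 spread over 0 to 140, and the brightest input is capped at 255.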

また、たとえば、第１の画像データの明度２０の画素と明度３０の画素との明度の差１０と、第１の画像データの明度１２０の画素と明度１３０の画素との明度の差１０と、は、変換後の第１の画像データにおいてもそれぞれ２倍の２０となる。すなわち、等しい量の明度差は、変換後も等しい量の明度差となる。よって、第２の画像データにおいても、第１の画像データと同様に、正確に顔領域の特定を行うことができる。 Further, for example, the brightness difference 10 between the lightness 20 pixel and the lightness 30 pixel of the first image data, and the lightness difference 10 between the lightness 120 pixel and the lightness 130 pixel of the first image data, Are doubled to 20 in the first image data after conversion. That is, an equal amount of brightness difference becomes an equal amount of brightness difference after conversion. Therefore, the face area can be accurately identified in the second image data as in the case of the first image data.

Note that, for the portion of the image of the first image data expressed with brightness of 126 or more, the above process of doubling the brightness eliminates the brightness differences. For that portion, the process described with FIGS. 2 to 4 may therefore be unable to identify a face area. However, the processes of steps S230 and S240 in FIG. 6, which include the process of FIG. 7, are performed only when the average brightness of the image data is 70 or less. The portion of the target image in which brightness differences are eliminated by the process of FIG. 7 is therefore not large. Furthermore, for the portion with brightness of 126 or more, face areas are identified by the face area identification process applied to the first image data rather than the second image data (see step S250 in FIG. 6). Performing the process of FIG. 7 therefore does not increase the likelihood of failing to identify, as a face area, an area in which a face is present.

第１実施例においては、図５のステップＳ１３０において、暗サンプルデータに基づいて第２の学習サンプル画像データを生成する際にも、同様の処理が行われる。すなわち、画像データにおける顔領域の特定の際の処理（図６のステップＳ２３０参照）と同じ処理で、暗サンプルデータが改変され、学習が行われる。このような態様とすることで、実際に行われる処理の精度が高くなるように、顔検出部９６２の学習を行わせることができる。 In the first embodiment, the same processing is performed when generating the second learning sample image data based on the dark sample data in step S130 of FIG. That is, the dark sample data is modified and learning is performed in the same process as the process for specifying the face area in the image data (see step S230 in FIG. 6). By setting it as such an aspect, the face detection part 962 can be made to learn so that the precision of the process actually performed may become high.

B. Second embodiment:
The image processing apparatus of the second embodiment differs from that of the first embodiment in the method of generating the second image data based on the first image data in step S230 of FIG. 6 and in the method of generating the second learning sample image data based on the dark sample data in step S130 of FIG. 5. In all other respects, the image processing apparatus of the second embodiment is the same as that of the first embodiment.

図８は、図６のステップＳ２３０において、第１の画像データに基づいて第２の画像データを生成する際のガンマ変換の内容を示す図である。横軸は変換前の画素の明度Ｙｉであり、縦軸は同じ画素の変換後の明度Ｙｏである。図８のガンマ曲線は、たとえば、明度７０を明度１６０に置き換えるガンマ曲線である。第２実施例においては、第２の画像データを生成するために、第１の画像データの各画素の明度は、図８のガンマ曲線に従って、変換される。 FIG. 8 is a diagram showing the content of gamma conversion when generating the second image data based on the first image data in step S230 of FIG. The horizontal axis is the brightness Yi of the pixel before conversion, and the vertical axis is the brightness Yo after conversion of the same pixel. The gamma curve in FIG. 8 is, for example, a gamma curve that replaces lightness 70 with lightness 160. In the second embodiment, in order to generate the second image data, the brightness of each pixel of the first image data is converted according to the gamma curve of FIG.

このような態様としても、第１の画像データの画像中において暗い色で表されている画像を、より明るく、かつ明度差の大きい画像に変換することができる（図８の範囲Ｒ１とＲ２ｂ参照）。その結果、図６のステップＳ２４０において、図２〜４で説明した処理によって、高い正確さで顔領域を特定することができる。 Even in such an aspect, an image represented by a dark color in the image of the first image data can be converted into an image that is brighter and has a large brightness difference (see ranges R1 and R2b in FIG. 8). ). As a result, in step S240 of FIG. 6, the face area can be specified with high accuracy by the processing described with reference to FIGS.

また、第２実施例においては、図５のステップＳ１３０において、暗サンプルデータに基づいて第２の学習サンプル画像データを生成する際にも、同様の処理が行われる。すなわち、画像データにおける顔領域の特定の際の処理（図６のステップＳ２３０参照）と同じ処理で、暗サンプルデータが改変され、学習が行われる。このような態様とすることで、実際に行われる処理の精度が高くなるように、顔検出部９６２の学習を行わせることができる。 In the second embodiment, the same processing is performed when generating the second learning sample image data based on the dark sample data in step S130 of FIG. That is, the dark sample data is modified and learning is performed in the same process as the process for specifying the face area in the image data (see step S230 in FIG. 6). By setting it as such an aspect, the face detection part 962 can be made to learn so that the precision of the process actually performed may become high.

さらに、ガンマ曲線を使用した変換においては、第１実施例に比べて、変換前の画像データにおいて互いに異なる値を有する明度が、同じ値（たとえば最大値２５５）に変換されてしまう可能性が低い。このため、広い範囲の明度について、顔領域を特定することができる。 Further, in the conversion using the gamma curve, it is less likely that the brightness having different values in the image data before the conversion is converted to the same value (for example, the maximum value 255) as compared with the first embodiment. . For this reason, the face region can be specified for a wide range of brightness.
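A gamma conversion that maps brightness 70 to brightness 160 can be sketched as a lookup table. The text specifies only that a gamma curve is used; the power-law form Yo = 255 × (Yi / 255)^g and the function names below are assumptions made for illustration.

```python
import math


def gamma_exponent(y_in=70, y_out=160, y_max=255):
    """Exponent g such that y_max * (y_in / y_max) ** g equals y_out."""
    return math.log(y_out / y_max) / math.log(y_in / y_max)


def gamma_lut(exponent, y_max=255):
    """Lookup table giving the converted brightness for each input 0..y_max."""
    return [round(y_max * (y / y_max) ** exponent) for y in range(y_max + 1)]


def apply_lut(img, lut):
    """Convert every pixel of a 2D list through the lookup table."""
    return [[lut[y] for y in row] for row in img]


lut = gamma_lut(gamma_exponent())
converted = apply_lut([[0, 70, 255]], lut)
```

Under this assumed curve the table is non-decreasing and reaches 255 only at the top of the input range, which matches the observation that distinct bright values are less likely to collapse to the same value than with doubling.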

C. Third embodiment:
The image processing apparatus of the third embodiment differs from that of the first embodiment in the method of generating the second image data from the first image data in step S230 of FIG. 6, and in the method of generating the second learning sample image data from the dark sample data in step S130 of FIG. 5. In all other respects, the image processing apparatus of the third embodiment is the same as that of the first embodiment.

FIG. 9 is a histogram illustrating the method of generating the second image data from the first image data in step S230 of FIG. 6. The horizontal axis is the lightness, from 0 to 255. The vertical axis is the frequency of each lightness value in the image data. FIG. 9 shows the distribution D1 of pixel lightness in the first image data and the distribution D2b of pixel lightness in the second image data. In the third embodiment, the lightness of each pixel of the first image data is increased by Δ to generate the second image data (Δ is a positive integer, 1 ≤ Δ < 255). Δ can be set to 70, for example. Any lightness that would exceed 255 after being increased by Δ is replaced with 255.
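The offset-and-clip operation described here can be sketched as follows. The function name `brighten_by_offset` and the representation of an image as a flat list of lightness values are assumptions made for illustration only.

```python
def brighten_by_offset(lightness, delta=70, max_level=255):
    """Add delta to every lightness value and clip the result at max_level."""
    if not 1 <= delta < max_level:
        raise ValueError("delta must satisfy 1 <= delta < max_level")
    return [min(y + delta, max_level) for y in lightness]


# Dark values move up by delta; values that would exceed 255 saturate.
print(brighten_by_offset([0, 70, 200, 255]))
```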

With this configuration as well, an image rendered in dark colors in the first image data can be converted into a brighter image (see ranges R1 and R2c in FIG. 9). As a result, in step S240 of FIG. 6, the face area can be identified with high accuracy by the processing described with reference to FIGS. 2 to 4.

An image drawn with pixels whose lightness is 70 or less is more likely to cause face area identification to fail than an image drawn with pixels in other lightness ranges. In the third embodiment, the modification amount Δ is set so that lightness values of 70 or less become 70 or more. Processing the second image data therefore allows the face area to be identified with high accuracy by the processing of FIGS. 2 to 4.

In the third embodiment, the same processing is also performed in step S130 of FIG. 5 when the second learning sample image data is generated from the dark sample data. That is, the dark sample data is modified by the same processing that is used when identifying the face area in image data (see step S230 of FIG. 6), and learning is then performed. In this way, the face detection unit 962 can be trained so that the processing actually performed at detection time becomes more accurate.

In the first and second embodiments, the lightness difference between pixels changes when the lightness is modified. For example, in the first embodiment, the lightness difference between a pixel of lightness 20 and a pixel of lightness 30 in the first image data is 10, whereas the lightness difference between the corresponding pixels in the second image data is 20. Consequently, the lightness difference between the "eyes" and the "skin of the face" in an image whose lightness has not been modified may differ from that in an image whose lightness has been modified. In other words, the criterion for judging lightness differences in the face area may differ between bright images and dark images. The accuracy of learning may therefore be reduced when the face detection unit 962 is trained in advance using the first sample images and the second sample images. The same applies to the second embodiment, which uses a gamma curve.

In the third embodiment, by contrast, the lightness of each pixel is modified by adding the integer Δ. The lightness difference between pixels in dark areas is therefore preserved across the modification, and the criterion for judging lightness differences in the face area is unlikely to differ between bright images and dark images. As a result, the accuracy of learning is high when the face detection unit 962 is trained in advance using the first sample images and the second sample images.
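The contrast between the two kinds of modification can be checked with the pixel pair used in the text (lightness 20 and 30). This is a small worked example; the helper names `double_lightness` and `offset_lightness` are hypothetical.

```python
def double_lightness(y, max_level=255):
    """First-embodiment style modification: multiply by 2 and clip."""
    return min(2 * y, max_level)


def offset_lightness(y, delta=70, max_level=255):
    """Third-embodiment style modification: add delta and clip."""
    return min(y + delta, max_level)


dark, less_dark = 20, 30
diff_before = less_dark - dark                                            # 10
diff_doubled = double_lightness(less_dark) - double_lightness(dark)       # 60 - 40
diff_offset = offset_lightness(less_dark) - offset_lightness(dark)        # 100 - 90
print(diff_before, diff_doubled, diff_offset)
```

The difference is preserved only while neither pixel saturates, which matches the statement that differences are kept in dark areas.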

D. Fourth embodiment:
FIG. 10 is a flowchart showing the processing for determining the face area in the fourth embodiment. In the first embodiment, the second image data is generated and the face area is identified (steps S230 and S240 of FIG. 6) when the average lightness AY of all pixels of the image data is equal to or less than a predetermined threshold Thp (see step S220 of FIG. 6). In the fourth embodiment, as shown in FIG. 10, the second image data is generated without any condition on the average lightness of the image data (see step S230 of FIG. 10). In addition, identification of the face area in the second image data and identification of the face area in the first image data are performed in parallel. Then, in step S260, a set is determined that contains as elements both the face areas identified in step S240 and the face areas identified in step S250. In all other respects, the fourth embodiment is the same as the first embodiment.

Steps S240 and S250 of the fourth embodiment can be executed independently of each other. Therefore, when the personal computer 100 has a CPU capable of multithreading, steps S240 and S250 are preferably executed in separate threads. When the personal computer 100 has a plurality of CPUs, steps S240 and S250 are preferably executed on separate CPUs. Either arrangement shortens the overall processing time.
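The flow of FIG. 10 can be sketched as below, assuming a hypothetical detector callable `detect` that returns hashable face rectangles and a hypothetical `brighten` callable for step S230; neither is defined in the description. Note that in standard CPython builds, CPU-bound pure-Python code running in threads is serialized by the global interpreter lock, so an actual speed-up would require a detector that releases the lock or a process pool.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_faces_both(first_image, brighten, detect):
    """Run detection on the original and the brightened image, then merge.

    `brighten` and `detect` are caller-supplied stand-ins (assumptions).
    """
    second_image = brighten(first_image)                   # step S230
    with ThreadPoolExecutor(max_workers=2) as pool:
        from_second = pool.submit(detect, second_image)    # step S240
        from_first = pool.submit(detect, first_image)      # step S250
        # Step S260: the result contains the areas found by either pass.
        return set(from_second.result()) | set(from_first.result())
```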

E. Modifications:
The present invention is not limited to the above embodiments and can be implemented in various forms without departing from its gist. For example, the following modifications are possible.

E1. Modification 1:
In the above embodiments, the image data is converted into image data of 320 × 240 pixels before the face area is identified (see step S210 of FIG. 6). However, the image data subjected to face area identification is not limited to 320 × 240 pixels and may have any size and number of pixels. It is nevertheless preferable that the image data subjected to face area identification have a fixed size. In that case, face area identification can be performed by preparing only one type of face detection module for image data of that size.

It is also preferable that the image data be reduced to a smaller number of pixels before being subjected to face area identification. This reduces the processing load.

E2. Modification 2:
In the first embodiment, the second image data is generated and the face area is identified (steps S230 and S240 of FIG. 6) when the average lightness AY of all pixels of the image data is equal to or less than the predetermined threshold Thp (see step S220 of FIG. 6). However, other conditions may be used to decide whether to generate image data with modified gradation values (lightness) and identify the face area.

For example, the image may be divided into a first area that contains the center point of the image (for a rectangular image, the intersection of its diagonals) and a second area that surrounds the first area. Whether to generate image data with modified gradation values and identify the face area may then be decided according to the average gradation value, for a predetermined color or for lightness, of the pixels in the first area. Here, "the second area surrounds the first area" means that every point in the first area can be expressed as a point that internally divides two points in the second area.
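A minimal sketch of the center-area condition follows. The first area is assumed here to be a centered rectangle covering a given fraction of each image dimension; the description fixes neither the shape nor the size of the first area, and the names `center_region_mean` and `fraction` are hypothetical.

```python
def center_region_mean(image, fraction=0.5):
    """Average lightness of a centered rectangle.

    image: list of rows, each a list of lightness values (0 to 255).
    fraction: assumed size of the first area relative to each dimension.
    """
    height, width = len(image), len(image[0])
    region_h = max(1, int(height * fraction))
    region_w = max(1, int(width * fraction))
    top = (height - region_h) // 2
    left = (width - region_w) // 2
    values = [image[r][c]
              for r in range(top, top + region_h)
              for c in range(left, left + region_w)]
    return sum(values) / len(values)


# A dark center inside a bright border: the border does not affect the result.
image = [[250, 250, 250, 250],
         [250,  40,  60, 250],
         [250,  50,  30, 250],
         [250, 250, 250, 250]]
print(center_region_mean(image))
```

The returned average could then be compared with a threshold in the same way that the average lightness AY is compared with Thp in the first embodiment.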

As another example, in the lightness distribution of the image data, with gradation value on the horizontal axis and frequency on the vertical axis, whether to generate image data with modified gradation values and identify the face area may be decided according to the gradation value at which the distribution peaks.
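The peak of the lightness distribution can be found by counting occurrences of each gradation value, as in this sketch. The function name `peak_lightness` and the rule of preferring the darker value on ties are assumptions made for the example.

```python
from collections import Counter


def peak_lightness(lightness):
    """Return the gradation value with the highest frequency.

    Ties are broken in favor of the darker value (an assumed convention).
    """
    counts = Counter(lightness)
    return min(counts, key=lambda y: (-counts[y], y))


pixels = [10, 10, 200, 30, 10, 200]
print(peak_lightness(pixels))
```

Modified image data might then be generated only when the returned peak is at or below a chosen threshold.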

In other words, whether to generate image data with modified gradation values and identify the face area can be decided based on the color gradation values of at least some of the pixels of the target image.

Furthermore, whether to generate image data with modified gradation values and identify the face area can be decided based on whether the user has performed a predetermined operation. That is, the image data with modified gradation values may be generated and the face area identified when the user has performed the predetermined operation.

The same applies to the condition for generating sample image data with modified gradation values (lightness) and performing learning (step S120 of FIG. 5).

E3. Modification 3:
In the above embodiments, all lightness values of the first image data are converted to generate the second image data (see step S230 of FIG. 6). However, the second image data can also be generated by modifying only the gradation values (lightness) in part of the range. In that case, it is preferable to convert at least the gradation values in the range from the gradation value corresponding to the darkest color up to 10% of the width of the range that gradation values can take. It is more preferable to convert at least the gradation values from the darkest value up to 20% of that width, and still more preferable to convert at least those up to 30% of that width.

E4. Modification 4:
In the first embodiment, the lightness of the first image data is doubled when the second image data is generated. Likewise, the lightness of the dark sample data (first learning sample image data) is doubled when the second learning sample image data is generated. However, the constant by which the gradation value (lightness) is multiplied can be a value other than 2. The constant is preferably 1.5 to 2.5, and more preferably 1.8 to 2.2.
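Multiplication by a constant can be sketched as follows. Clipping at 255 and rounding to the nearest integer are assumptions made for this example (the name `brighten_by_factor` is also hypothetical); this section does not spell out how non-integer or out-of-range products are handled.

```python
def brighten_by_factor(lightness, factor=2.0, max_level=255):
    """Multiply each lightness value by a constant, round, and clip."""
    return [min(round(y * factor), max_level) for y in lightness]


print(brighten_by_factor([20, 30, 200]))          # factor 2.0
print(brighten_by_factor([20, 30, 200], 1.5))     # a smaller constant
```

With a factor of 2, every input of 128 or more saturates at 255, which is the loss of distinct values that the gamma-curve embodiment is said to reduce.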

E5. Modification 5:
In the second embodiment, on the condition that lightness is expressed from 0 to 255, a gamma conversion that maps lightness 70 to lightness 160 is applied when the second image data and the second learning sample image data are generated. However, the gamma curve used to generate the second image data and the second learning sample image data may have another shape. On the condition that lightness is expressed from 0 (black) to 255 (white), the gamma curve preferably maps lightness 70 to a lightness in the range 100 to 180, more preferably to a lightness in the range 130 to 170, and still more preferably to a lightness in the range 140 to 160.

E6. Modification 6:
In the third embodiment, the lightness modification amount Δ is set to 70, on the condition that lightness is expressed from 0 to 255. However, Δ can be set to other values, such as 60 or 50. The modification amount Δ of the gradation value (lightness) is preferably 10% to 40% of the width of the range that gradation values can take, more preferably 20% to 30%, and still more preferably 23% to 27%. When the presence of an object is judged from the gradation values of pixel colors, the range of gradation values in which accuracy is low is usually 10% to 40% of the whole range that gradation values can take. With the above settings, gradation values inside the range where judgment accuracy is low can be replaced with gradation values outside that range before the judgment is made.

E7. Modification 7:
In the above embodiments and modifications, the gradation values of the image are modified in various ways both when the object is detected and when learning is performed. For both detection and learning, it is preferable that the decision whether to modify the gradation values be made under the same condition (for example, a condition on the average lightness of the image) and that the modification be carried out with the same conversion (for example, multiplying the lightness by a constant). This allows efficient learning that increases the accuracy of object detection.

E8. Modification 8:
In the above embodiments, the area of the image in which a human face exists is identified. However, the object identified in the image is not limited to a human face and may be something else. The processing described in this specification can be applied to identifying various objects, such as the faces of animals such as dogs and cats, vehicles such as trains and steam locomotives, automobiles, buildings, and plants such as flowers and autumn leaves.

E9. Modification 9:
In the above embodiments, when the set of face areas in the image PI1 is determined, one face area is selected from among N face areas (N is an integer of 2 or more) that share 75% or more of their pixels with one another, and the other face areas are discarded. However, the plural face areas from which one is selected (hereinafter "candidate face areas") can be defined in other ways. For example, the candidate face areas may be face areas that share 70% or more of their pixels with one another. More preferably, they are face areas that share 80% or more of their pixels, and still more preferably 90% or more.
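One way to realize this kind of selection is sketched below. Several points are assumptions rather than statements from the description: rectangles are given as (left, top, width, height), the shared fraction is measured against the smaller rectangle, each candidate carries a hypothetical score, and selection uses a greedy highest-score-first pass, which is a swapped-in technique and not necessarily the selection rule of the embodiments.

```python
def shared_fraction(a, b):
    """Fraction of the smaller rectangle's pixels that both rectangles share."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = min(ax + aw, bx + bw) - max(ax, bx)
    overlap_h = min(ay + ah, by + bh) - max(ay, by)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0
    return (overlap_w * overlap_h) / min(aw * ah, bw * bh)


def select_face_areas(candidates, threshold=0.75):
    """Keep one area from each group of heavily overlapping candidates.

    candidates: list of (rect, score) pairs; score is a hypothetical
    detector confidence used only to choose which area survives.
    """
    kept = []
    for rect, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if all(shared_fraction(rect, other) < threshold for other, _ in kept):
            kept.append((rect, score))
    return kept


a, b, c = (0, 0, 20, 20), (2, 0, 20, 20), (100, 100, 20, 20)
print(select_face_areas([(a, 0.9), (b, 0.8), (c, 0.5)]))
```

Raising `threshold` toward 0.9 discards fewer candidates, which corresponds to the stricter sharing ratios described as preferable.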

The candidate face areas can also be determined by a criterion other than the number of shared pixels. In any case, for a plurality of candidate face areas that share at least a predetermined amount of area with one another, it is preferable to select one of them according to a predetermined criterion and determine it to be a face area of the image.

E10. Modification 10:
In the above embodiments, learning is performed using the first learning sample image data (see step S150 of FIG. 5) and using the second learning sample image data, whose lightness has been raised (see step S140 of FIG. 5). The face area is then identified using the first image data (see step S250 of FIG. 6) and using the second image data, whose lightness has been raised (see step S240 of FIG. 6).

However, the second learning sample image data can also be generated by lowering the color gradation values (lightness) of the first learning sample image data. Likewise, the second image data can be generated by lowering the color gradation values (lightness) of the first image data. In that case, the object can be identified even in an area where, as captured, the whole object appears washed out to white and the differences in gradation values are small.

E11. Modification 11:
In the above embodiments, a single face detection unit 962 is trained using both the first learning sample image data (see step S150 of FIG. 5) and the second learning sample image data (see step S140 of FIG. 5). The face area in image data is also identified using that single face detection unit 962.

However, a plurality of detection modules may be prepared, with some of them trained using the first learning sample image data (see step S150 of FIG. 5) and the others trained using the second learning sample image data, whose color gradation values (lightness) have been modified (see step S140 of FIG. 5).

In that case, a detection module trained using the first learning sample image data (see step S150 of FIG. 5) is preferably used for face detection on the first image data (see step S250 of FIG. 6). A detection module trained using the second learning sample image data (see step S140 of FIG. 5) is preferably used for face detection on the second image data, whose gradation values have been modified (see step S240 of FIG. 6).

It is also possible to provide several groups of sample image data that differ in the direction and degree (amount) of lightness modification, and to train a different detection module with each group. When an object is detected in image data whose lightness has been modified, it is preferable to use the detection module that was trained on sample image data whose lightness was modified in the corresponding direction and by the corresponding degree (amount).

In this configuration, the number of detection modules is more preferably the same as the number of image groups used for learning (in the above embodiments, two groups: the first learning sample image data group and the second learning sample image data group).

E12. Modification 12:
In the above embodiments, the face area is identified by the method shown in FIGS. 2 to 4. However, the face area can be identified using various methods, such as boosting (for example, AdaBoost), support vector machines, and neural networks. A method that identifies the face area based on differences in the color gradation values of the pixels in the image is preferable.

E13. Modification 13:
In the above embodiments and modifications, the following are all performed based on the lightness of the image pixels: identification of the face area (FIGS. 2 to 4), generation of the second image data (step S230 of FIGS. 6 and 10), generation of the second learning sample image data (step S130 of FIG. 5), and the decisions on whether to generate the second learning sample image data and the second image data (step S120 of FIG. 5 and step S220 of FIG. 6). However, these processes can also be performed based on gradation values other than lightness.

For example, each of the above processes can be performed based on the green gradation values of the image pixels instead of their lightness. They can also be performed based on the red or blue gradation values of the image pixels, or based on values derived from the red, green, and blue gradation values. In short, each of the above processes can be performed based on gradation values relating to the color of the image pixels.

The color gradation value used for processing may be a lightness gradation value that the image data already holds for each pixel. For example, the luminance or lightness gradation values held by JPEG image data or by image data in the YCrCb color system can be used. Alternatively, the color gradation value used for processing may be a gradation value derived from the gradation values of color components, such as RGB, that the image data holds for each pixel.
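As one example of deriving a gradation value from RGB components, the sketch below uses the widely used ITU-R BT.601 luma weights. The description does not say which formula is used, so the weights and the function name `luma_bt601` are assumptions.

```python
def luma_bt601(r, g, b):
    """Weighted sum of 8-bit R, G, B values using BT.601 luma coefficients.

    The choice of BT.601 is an assumption; the description only says the
    value is obtained from the color component gradation values.
    """
    return round(0.299 * r + 0.587 * g + 0.114 * b)


print(luma_bt601(0, 255, 0))   # pure green contributes most of the luma
```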

E14. Modification 14:
The image processing apparatus may be configured to display the portion in which a predetermined object exists after identifying that portion in the image of the original image data. In that case, it is more preferable to display the image of the original image data and to indicate, on that image, the portion in which the object exists.

E15. Modification 15:
The image processing apparatus may be configured so that, after identifying the portion of the image of the original image data in which a predetermined object exists, it performs image processing on the original image data based on the size of that portion. When the image of the original image data contains a plurality of portions in which the object exists, the "size of the portion in which the object exists" considered in the image processing can be the size of the largest of those portions. Alternatively, it can be the total size of those portions.

E16. Modification 16:
In the above embodiments, the printer driver 96 identifies the face area and performs image processing based on the result. However, the processing that identifies the object can be performed by other components. For example, it can be executed by the application software 95 running on the OS of the personal computer 100, or by the CPUs 104 and 106 provided in output devices such as the printer 22 and the projector 32. The processing that identifies the face area can also be executed by a hardware circuit provided in an output device such as the printer 22 or the projector 32. It can further be executed by a CPU or hardware circuit provided in a digital still camera equipped with an output device such as a liquid crystal display.

In other words, part of the configuration implemented by hardware in the above embodiments may be replaced with software, and conversely, part of the configuration implemented by software may be replaced with hardware.

A computer program that implements these functions is provided in a form recorded on a computer-readable recording medium such as a floppy disk, CD-ROM, or DVD. The host computer reads the computer program from the recording medium and transfers it to an internal or external storage device. Alternatively, the computer program may be supplied to the host computer from a program supply device via a communication path. When the functions of the computer program are carried out, the computer program stored in the internal storage device is executed by the microprocessor of the host computer. The host computer may also execute the computer program directly from the recording medium.

In this specification, the host computer is a concept that includes a hardware device and an operating system, and means a hardware device that operates under the control of the operating system. The computer program causes such a host computer to realize the functions of the units described above. Note that some of the functions described above may be realized by the operating system instead of an application program.

In the present invention, the "computer-readable recording medium" is not limited to portable recording media such as flexible disks and CD-ROMs. It also includes internal storage devices in a computer, such as various RAMs and ROMs, and external storage devices fixed to a computer, such as hard disks.

An explanatory diagram showing the schematic configuration of an image processing apparatus that is an embodiment of the present invention.
A diagram showing a method of specifying an area in the image of the image data where a human face is likely to exist.
A diagram showing the processing for determining whether the image area IDW of the data extracted by the detection window DW is a face area.
A diagram showing the determination processing in a given stage.
A flowchart showing the processing performed when the face detection unit 962 is trained.
A flowchart showing the processing for determining the area where a face exists in the image of the image data.
A histogram showing a method of generating second image data based on first image data.
A diagram showing the content of the gamma conversion used when generating the second image data based on the first image data.
A histogram showing a method of generating second image data based on first image data.
A flowchart showing the processing for determining the face area in the fourth embodiment.

Explanation of symbols

22 ... Printer
30 ... Lightness
32 ... Projector
88 ... Host computer
91 ... Video driver
95 ... Application program
96 ... Printer driver
98 ... Projector driver
100 ... Personal computer
102 ... CPU
110 ... Display
120 ... Keyboard
130 ... Mouse
140 ... R/RW drive
962 ... Face detection unit
A11a, b ... Areas from which data is extracted using the rectangular filter F11
A11al ... Left half of area A11a
A11ar ... Right half of area A11b
A12a, b ... Areas from which data is extracted using the rectangular filter F12
Af1 ... Area where a face exists
Af2 ... Area where a face exists
Ah ... Arrow indicating the movement of the detection window DW
As ... Arrow indicating the up, down, left, and right movement of the detection window DW
D1 ... Distribution of pixel lightness in the first image data
D2a, b ... Distribution of pixel lightness in the second image data
D11y, n ... Constants representing the result of determination using the rectangular filter F11
D12y, n ... Constants representing the result of determination using the rectangular filter F11
DW ... Detection window
F11, F12 ... Rectangular filters
IDW ... Image area from which data is extracted by the detection window DW
P1 ... Person
P2 ... Person
PI1 ... Image of the first image data
R1 ... Lightness range
R2a to c ... Ranges of the lightness values that were within range R1 in the image of the first image data
St1 to St24 ... First to 24th stages
Wh ... Width of the detection window DW
Y11a ... Lightness of each pixel in area A11a
Y11b ... Lightness of each pixel in area A11b
Y12a ... Lightness of each pixel in area A12a
Y12b ... Lightness of each pixel in area A12b
Yi ... Lightness before conversion
Yo ... Lightness after conversion
dh ... Moving distance of the detection window
α11a, b ... Constants associated with the rectangular filter
α12a, b ... Constants associated with the rectangular filter

Claims

An image processing device for identifying a portion where a human face exists in an image, comprising:
an image data generation unit that, when a predetermined first condition is satisfied, generates second image data by changing, by a predetermined amount, the reference gradation values included in at least a part of the numerical range of first image data, thereby converting them into reference gradation values corresponding to brighter colors, wherein the reference gradation value is a gradation value related to the color of a pixel of the image data;
a first object specifying unit that specifies a first portion where a human face is present in the image of the first image data, based on the reference gradation values of the colors of the pixels of the first image data;
a second object specifying unit that specifies a second portion where a human face is present in the image of the second image data, based on the reference gradation values of the colors of the pixels of the second image data; and
a synthesis unit that determines a portion where the human face is present in the image of the first image data, based on the first portion of the image of the first image data and the second portion of the image of the second image data,
wherein the at least a part of the numerical range includes a range that extends, within the range that the reference gradation value can take, from the reference gradation value corresponding to the darkest color up to 25% of the width of that range, and
wherein, in determining the portion where the human face exists, the synthesis unit excludes one of the first portion and the second portion from the candidate portions where the human face exists when the first portion and the second portion share a predetermined amount or more of area.
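The exclusion rule applied by the synthesis unit can be illustrated with a minimal sketch. The claim does not fix how face portions are represented, how the shared area is measured, or which of the two overlapping portions is dropped, so everything here is an assumption made for illustration: the `(x, y, width, height)` rectangles, the names `shared_area` and `merge_candidates`, a threshold expressed in pixels, and the choice to keep the portion from the first image data.

```python
from typing import List, Tuple

# Hypothetical representation of a face portion: (x, y, width, height).
Rect = Tuple[int, int, int, int]


def shared_area(a: Rect, b: Rect) -> int:
    """Area of the intersection of two axis-aligned rectangles (0 if disjoint)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return w * h if w > 0 and h > 0 else 0


def merge_candidates(first: List[Rect], second: List[Rect],
                     min_shared: int) -> List[Rect]:
    """Combine portions found in the first and second image data.

    A second portion that shares at least `min_shared` pixels with any first
    portion is excluded (assumption: the first portion is the one kept).
    """
    merged = list(first)
    for s in second:
        if all(shared_area(s, f) < min_shared for f in first):
            merged.append(s)
    return merged


first_portions = [(0, 0, 10, 10)]
second_portions = [(5, 5, 10, 10), (100, 100, 10, 10)]
print(merge_candidates(first_portions, second_portions, min_shared=20))
```

In this example the portion at `(5, 5)` shares 25 pixels with the first portion and is dropped, while the distant portion found only in the brightened image is kept as an additional candidate.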

The apparatus according to claim 1, wherein the predetermined amount is 10% to 40% of the width of the range that the reference gradation value can take.
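As a rough illustration of how the second image data might be generated, the sketch below shifts dark lightness values upward by a fixed amount. It assumes 8-bit lightness values (0 to 255), treats 255 as the width of the range, converts only values in the darkest 25% of the range, leaves other values unchanged, and clips at the maximum. The function name `brighten_dark_range` and its parameters are hypothetical, and the embodiments may use a different conversion (the drawings mention a gamma conversion).

```python
from typing import List


def brighten_dark_range(lightness: List[int],
                        shift_ratio: float = 0.2,
                        dark_ratio: float = 0.25,
                        max_value: int = 255) -> List[int]:
    """Shift lightness values in the dark part of the range by a fixed amount.

    shift_ratio: the predetermined amount as a fraction of the range width
                 (10% to 40% according to the claim).
    dark_ratio:  fraction of the range, starting at the darkest value, that is
                 converted (assumption: only this part is converted).
    """
    if not 0.10 <= shift_ratio <= 0.40:
        raise ValueError("shift_ratio must be between 0.10 and 0.40")
    shift = round(shift_ratio * max_value)
    dark_limit = dark_ratio * max_value
    return [min(v + shift, max_value) if v <= dark_limit else v
            for v in lightness]


print(brighten_dark_range([0, 60, 64, 200]))
```

With the default 20% shift, the values 0 and 60 fall inside the dark range and are raised by 51, while 64 and 200 lie above the 25% limit and pass through unchanged.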

The apparatus according to claim 1 or 2, wherein
the reference gradation value is a gradation value that, other conditions being equal, represents a darker color the smaller its value is, and
the first condition includes that the average value of the reference gradation values of the pixels of the first image data is smaller than a predetermined threshold value.
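A minimal sketch of this whole-image check follows. The function name `first_condition_met` is hypothetical, and the default threshold of 70 on a 0 to 255 scale is borrowed from the example of a dark image in the background section; the claim itself only requires some predetermined threshold.

```python
from typing import Sequence


def first_condition_met(lightness: Sequence[int], threshold: float = 70) -> bool:
    """Return True when the mean lightness of all pixels is below the threshold.

    Assumption: `lightness` holds one 0-255 value per pixel, where a smaller
    value means a darker color.
    """
    if not lightness:
        raise ValueError("image has no pixels")
    return sum(lightness) / len(lightness) < threshold


dark_image = [20, 40, 60, 80]        # mean 50
bright_image = [120, 140, 160, 180]  # mean 150
print(first_condition_met(dark_image), first_condition_met(bright_image))
```

Only when the check returns True would the brightened second image data be generated and searched in addition to the first image data.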

The apparatus according to claim 1 or 2, wherein
the reference gradation value is a gradation value that, other conditions being equal, represents a darker color the smaller its value is, and
the first condition includes that, when the area of the image of the first image data is divided into a first area and a second area surrounding the first area, the average value of the reference gradation values of the pixels included in the first area is smaller than a predetermined threshold value.
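The region-based variant of the condition can be sketched as follows. The claim only states that the second area surrounds the first, so the centered rectangle defined by `margin_ratio`, the row-major list-of-lists image, the names `central_mean` and `central_region_is_dark`, and the threshold of 70 are all assumptions made for illustration.

```python
from typing import List


def central_mean(image: List[List[int]], margin_ratio: float = 0.25) -> float:
    """Mean lightness of a centered rectangle (the hypothetical first area).

    The surrounding border, `margin_ratio` of the height and width on each
    side, plays the role of the second area and is ignored.
    """
    if not 0 <= margin_ratio < 0.5:
        raise ValueError("margin_ratio must be in [0, 0.5)")
    height, width = len(image), len(image[0])
    top, left = int(height * margin_ratio), int(width * margin_ratio)
    values = [v for row in image[top:height - top]
              for v in row[left:width - left]]
    return sum(values) / len(values)


def central_region_is_dark(image: List[List[int]], threshold: float = 70,
                           margin_ratio: float = 0.25) -> bool:
    """True when the mean lightness of the central area is below the threshold."""
    return central_mean(image, margin_ratio) < threshold


# Bright border, dark center: a backlit subject in the middle of the frame.
backlit = [
    [200, 200, 200, 200],
    [200, 10, 10, 200],
    [200, 10, 10, 200],
    [200, 200, 200, 200],
]
print(central_mean(backlit), central_region_is_dark(backlit))
```

This image has a whole-image mean of 152.5, so an average over all pixels would not flag it as dark, whereas the central area alone has a mean of 10 and satisfies the condition.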