JP2006163614A

JP2006163614A - Image processing device and method

Info

Publication number: JP2006163614A
Application number: JP2004351649A
Authority: JP
Inventors: Yasunori Ishii; 育規石井; Kazuyuki Imagawa; 和幸今川; Eiji Fukumiya; 英二福宮; Katsuhiro Iwasa; 克博岩佐
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2004-12-03
Filing date: 2004-12-03
Publication date: 2006-06-22

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device capable of detecting a target object with a small processing amount while maintaining identification accuracy.

SOLUTION: This image processing device comprises a reference database 11 having a plurality of reference parameters 12, each a combination of reference vectors (a target object reference vector, which is an image vector of a target object, and a non-target object reference vector, which is an image vector of a non-target object) and a weight coefficient of the reference vectors; a partial image cutting out part 3 for cutting out a partial image from an input image 2; a feature quantity vector calculating part 4 for calculating a feature quantity vector of the cut-out partial image; and a feature quantity vector evaluating part 5 for detecting the target object from the partial image using the feature quantity vector and the reference parameters. The feature quantity vector evaluating part includes an image identification part 6 for determining whether the partial image includes the target object, based on the result of a calculation using the feature quantity vector and the reference parameters, and detection processing ends when the image identification part determines that the target object is not included.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

本発明は、画像から目的とする物体を検出する画像処理装置、および画像処理方法に関するものである。 The present invention relates to an image processing apparatus and an image processing method for detecting a target object from an image.

従来、セキュリティなどの監視カメラの活用において、画像の中から所望の物体を検出することが求められている。例えば、画像中の物体として「顔」を検出することが行われている。このような、画像中から特定の物体を検出する装置や方法として、顔と顔以外（以下、非顔）のベクトルを有するパラメータと入力画像との比較により識別する装置や方法が提案されている（例えば、特許文献１、２参照）。 Conventionally, in applications of surveillance cameras such as security, there has been a demand for detecting a desired object in an image. For example, a “face” is detected as an object in an image. As apparatuses and methods for detecting a specific object in an image, ones that perform identification by comparing an input image with parameters having vectors of faces and of things other than faces (hereinafter, non-faces) have been proposed (see, for example, Patent Documents 1 and 2).

特許文献１の記載では、まず、ブースティング（ｂｏｏｓｔｉｎｇ）手法で学習された参照パラメータに入力される。次いで、画像識別部の出力値の重み付き線形和が評価値として計算される。さらに、計算された評価値の大きさによって対象物体とそれ以外との識別が行われて、最終的な対象物体の検出結果が出力される。このブースティング手法は高い識別性能を有する参照パラメータを生成できることが実験的に確かめられている。 In the description of Patent Document 1, first, a reference parameter learned by a boosting technique is input. Next, a weighted linear sum of output values of the image identification unit is calculated as an evaluation value. Further, the target object is distinguished from the other objects according to the calculated evaluation value, and the final detection result of the target object is output. It has been experimentally confirmed that this boosting method can generate a reference parameter having high discrimination performance.
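
The weighted linear sum described above for Patent Document 1 can be illustrated with a minimal Python sketch. The names `boosted_score` and `is_target`, the +1/-1 classifier outputs, and the zero threshold are assumptions made for illustration only; they are not taken from Patent Document 1.

```python
from typing import Callable, Sequence

# A weak classifier maps a feature vector to a score (assumed here to be +1 or -1).
Classifier = Callable[[Sequence[float]], float]


def boosted_score(feature: Sequence[float],
                  classifiers: Sequence[Classifier],
                  weights: Sequence[float]) -> float:
    """Evaluation value: weighted linear sum of all classifier outputs."""
    return sum(w * h(feature) for h, w in zip(classifiers, weights))


def is_target(feature: Sequence[float],
              classifiers: Sequence[Classifier],
              weights: Sequence[float],
              threshold: float = 0.0) -> bool:
    """Classify as target when the evaluation value reaches the (assumed) threshold."""
    return boosted_score(feature, classifiers, weights) >= threshold
```

Note that every classifier is evaluated for every partial image in this scheme, which is the source of the processing cost discussed later in this description.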

図１７は、従来の技術における画像処理装置のブロック図であり、特許文献１に記載の画像処理装置を表している。 FIG. 17 is a block diagram of an image processing apparatus in the prior art, and represents the image processing apparatus described in Patent Document 1.

まず入力画像の任意の部分が、部分画像切り出し部１８０１で部分画像として切り出される。部分画像は、種々のサイズのウィンドウを画像の所定の位置を起点として、順次適当な画素（例えば１画素）分を右側または下側にずらしながら走査することで切り出される。次に、特徴量ベクトル算出部１８０２は、部分画像から特徴量ベクトルを算出する。特徴量ベクトル評価部１８０３は、参照パラメータと部分画像の特徴量ベクトルを用いて評価値を算出する。このとき参照パラメータは、あらかじめブースティング手法で学習された対象物体と、非対象物体のベクトルをもったパラメータであり、これらを用いて評価値が算出されることで、対象物体もしくは非対象物体のいずれに類似するかが判定される。 First, an arbitrary part of the input image is cut out as a partial image by the partial image cutout unit 1801. Partial images are cut out by scanning windows of various sizes across the image, starting from a predetermined position and shifting successively by a suitable number of pixels (for example, one pixel) to the right or downward. Next, the feature amount vector calculation unit 1802 calculates a feature amount vector from the partial image. The feature vector evaluation unit 1803 calculates an evaluation value using the reference parameters and the feature vector of the partial image. Here, the reference parameters are parameters that have vectors of the target object and of non-target objects learned in advance by the boosting technique; by calculating the evaluation value with them, it is determined whether the partial image resembles the target object or a non-target object.

この処理により、例えば顔を対象物体とする場合には、顔と非顔とが識別される。なお、参照用パラメータを生成するブースティング手法は、高い識別性能を有することが実験的に確かめられている。 With this process, for example, when a face is a target object, a face and a non-face are identified. Note that it has been experimentally confirmed that the boosting method for generating the reference parameters has high discrimination performance.

次に、図１８は、従来の技術における画像処理装置のブロック図であり、特許文献２の画像処理装置を表している。 Next, FIG. 18 is a block diagram of an image processing apparatus in the prior art, and represents the image processing apparatus disclosed in Patent Document 2.

まず、部分画像切り出し部１９０１が、図１７の場合と同様にして入力画像から部分画像を切り出す。ついで特徴量ベクトル評価部１９０２が、切り出された部分画像の特徴量ベクトルを算出する。さらに、特徴量ベクトル評価部１９０２が、参照パラメータと部分画像の特徴量ベクトルを用いて評価値を算出する。算出された評価値により、部分画像が対象物体を包含するか非包含であるかを判定する。ここで、特徴量ベクトル評価部１９０２は、複数の画像識別部１９０３を有しており、画像識別部１９０３が評価値の算出を行う。ここで、複数の画像識別部１９０３は、おのおの異なる参照パラメータを用いて評価値算出を行う。図１８に示されるとおり、画像識別部１９０３は直列に接続され、順順に処理を実行する。図１８の装置では、直列に接続された画像識別部１９０３のいずれかで部分画像が対象物体を非包含と判定した時点で処理が終了される。処理が終了された以降は他の画像識別部は用いられない。このため、高速な処理が実現されている。 First, the partial image cutout unit 1901 cuts out a partial image from the input image in the same manner as in FIG. 17. Next, the feature quantity vector evaluation unit 1902 calculates a feature quantity vector of the cut-out partial image. The feature quantity vector evaluation unit 1902 then calculates an evaluation value using the reference parameters and the feature vector of the partial image. Based on the calculated evaluation value, it is determined whether or not the partial image includes the target object. The feature quantity vector evaluation unit 1902 has a plurality of image identification units 1903, and the image identification units 1903 calculate the evaluation values, each using a different reference parameter. As shown in FIG. 18, the image identification units 1903 are connected in series and execute their processing in order. In the apparatus of FIG. 18, processing ends at the point when any one of the serially connected image identification units 1903 determines that the partial image does not include the target object. Once processing has ended, the remaining image identification units are not used. High-speed processing is achieved in this way.
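
The serial arrangement with early termination described above can be sketched as follows. This is a minimal illustration, not the implementation of Patent Document 2: the function name `run_cascade` and the representation of each image identification unit as a Boolean-returning callable are assumptions.

```python
from typing import Callable, Sequence, Tuple

# Each stage stands in for one image identification unit: it returns True when
# the partial image may still contain the target object, False otherwise.
Stage = Callable[[Sequence[float]], bool]


def run_cascade(feature: Sequence[float], stages: Sequence[Stage]) -> Tuple[bool, int]:
    """Run the stages in order and stop at the first rejection.

    Returns (contains_target, number_of_stages_evaluated).
    """
    for used, stage in enumerate(stages, start=1):
        if not stage(feature):
            # Non-inclusion determined: the remaining stages are never evaluated.
            return False, used
    return True, len(stages)
```

Because most windows of a typical image do not contain the target, the second element of the result is usually much smaller than the number of stages, which is where the speed-up comes from.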

しかしながら、特許文献１の方法では、特徴量ベクトルは、ブースティング手法で生成した全ての学習パラメータを用いて評価されるため、評価に要する処理量が莫大になり、実時間処理ができないという問題を有していた。 However, in the method of Patent Document 1, the feature vector is evaluated using all of the learning parameters generated by the boosting technique, so the amount of processing required for evaluation becomes enormous and real-time processing is not possible.

また、特許文献２の方法では、異なる参照パラメータが与えられた複数の画像識別部が、特徴量ベクトルを評価していた。ここで、異なる参照パラメータは、対象物体を識別する性能などが異なる。このように相違する識別性能を有する参照パラメータが与えられた複数の画像識別部での処理を用いた処理では、識別性能の低い参照パラメータから順に識別を行うこともある。このため、高速処理が可能となっても、識別精度の高い検出処理ができないという問題も有していた。 Further, in the method of Patent Document 2, a plurality of image identification units given different reference parameters evaluate the feature quantity vector. The different reference parameters differ in, for example, their performance in identifying the target object. When processing relies on a plurality of image identification units given reference parameters of differing identification performance, identification may be performed starting from the reference parameters with low identification performance. Consequently, even though high-speed processing is possible, detection processing with high identification accuracy cannot be achieved.
米国特許出願公開第２００３／０１０８２４４号公報 米国特許出願公開第２００２／０１０２０２４号公報 US Patent Application Publication No. 2003/0108244; US Patent Application Publication No. 2002/0102024

そこで本発明は、画像から所望の対象物体を検出する識別精度を維持し、従来よりも処理量の少ない対象物体検出を行なうことができる画像処理装置、および画像処理方法を提供することを目的とする。 Accordingly, an object of the present invention is to provide an image processing apparatus and an image processing method that maintain the identification accuracy with which a desired target object is detected from an image while performing target object detection with a smaller processing amount than before.

第１の発明に係る画像処理装置は、入力画像から対象物体を検出する画像処理装置であって、対象物体の画像ベクトルである対象物体参照ベクトルと非対象物体の画像ベクトルである非対象物体参照ベクトルを含む参照ベクトルと参照ベクトルの重み係数との組み合わせである複数の参照パラメータを有する参照用データベースと、入力画像から部分画像を切り出す部分画像切り出し部と、画像切り出し部が切り出した部分画像の特徴量ベクトルを算出する特徴量ベクトル算出部と、特徴量ベクトルと参照パラメータを用いて、部分画像から対象物体を検出する検出処理を行う特徴量ベクトル評価部を備え、特徴量ベクトル評価部は、特徴量ベクトルと参照パラメータを用いて演算した演算結果に基づいて、部分画像の対象物体の包含／非包含を判定する画像識別部を有し、画像識別部が、対象物体を非包含と判定した時点で検出処理を終了する。 An image processing apparatus according to a first invention is an image processing apparatus that detects a target object from an input image, and comprises: a reference database having a plurality of reference parameters, each being a combination of reference vectors, which include a target object reference vector that is an image vector of the target object and a non-target object reference vector that is an image vector of a non-target object, and a weighting coefficient of the reference vectors; a partial image cutout unit that cuts out a partial image from the input image; a feature vector calculation unit that calculates a feature vector of the partial image cut out by the partial image cutout unit; and a feature vector evaluation unit that performs detection processing for detecting the target object from the partial image using the feature vector and the reference parameters. The feature vector evaluation unit has an image identification unit that determines whether the partial image includes or does not include the target object, based on the result of a calculation using the feature vector and the reference parameters, and the detection processing ends at the point when the image identification unit determines that the target object is not included.

この構成により、入力画像から切り出された部分画像が所望の対象物体を非包含である場合には、検出処理を終了することができるので、処理量を削減でき、高速処理を実現することが可能となる。 With this configuration, the detection processing can be ended when the partial image cut out from the input image does not include the desired target object, so the processing amount can be reduced and high-speed processing can be realized.

第２の発明に係る画像処理装置は、画像識別部が複数であって、複数の画像識別部の少なくとも一つが対象物体を非包含と判定した時点で、検出処理を終了する。 The image processing apparatus according to the second invention ends the detection process when there are a plurality of image identification units and at least one of the plurality of image identification units determines that the target object is not included.

これにより、画像識別部における画像処理を異なる参照パラメータで順次実行し、実行中にいずれかの画像識別部で対象物体が非包含と判断された時点で検出処理を終了することができるので、高速処理を実現することができる。 As a result, the image processing in the image identification units is executed sequentially with different reference parameters, and the detection processing can be ended at the point when any of the image identification units determines during execution that the target object is not included, so high-speed processing can be realized.

第３の発明に係る画像処理装置は、画像識別部が、特徴量ベクトルと対象物体参照データとの対象物体相関値と特徴量ベクトルと非対象物体参照データとの相関値を算出する相関値算出部と、相関値算出部で算出された対象物体相関値と非対象物体相関値と重み係数との積である評価値を算出する評価値算出部と、評価値に従って部分画像の対象物体の包含／非包含判定を行う判定部を備える。 In the image processing apparatus according to the third invention, the image identification unit comprises: a correlation value calculation unit that calculates a target object correlation value between the feature quantity vector and target object reference data, and a non-target object correlation value between the feature quantity vector and non-target object reference data; an evaluation value calculation unit that calculates an evaluation value that is the product of the weighting coefficient and the target object correlation value and non-target object correlation value calculated by the correlation value calculation unit; and a determination unit that determines, according to the evaluation value, whether the partial image includes or does not include the target object.

この構成により、対象物体と非対象物体の参照パラメータとの近似性を用いて切り出された部分画像の対象物体の包含／非包含判定を行うことができるので、精度の高い識別を行うことができる。 With this configuration, whether the cut-out partial image includes the target object can be determined from its closeness to the reference parameters of the target object and of the non-target object, so highly accurate identification can be performed.

第４の発明に係る画像処理装置は、判定部が、評価値算出部での算出処理が所定の回数以下の場合に異なる参照パラメータを用いて包含／非包含判定を繰り返して行う中間判定部と、評価算出部での算出処理が所定の回数以上の場合に包含／非包含判定を行う最終判定部をさらに備える。 According to a fourth aspect of the present invention, there is provided the image processing apparatus, wherein the determination unit includes an intermediate determination unit that repeatedly performs inclusion / non-inclusion determination using different reference parameters when the calculation processing in the evaluation value calculation unit is a predetermined number of times or less And a final determination unit that performs inclusion / non-inclusion determination when the calculation processing in the evaluation calculation unit is a predetermined number of times or more.

この構成により、任意の回数に基づく識別処理が実行された場合には、その時点で強制的に処理を終了させるので、仕様に応じた所定回数以下の処理量での検出処理とすることができる。これにより、高速処理が可能となる。 With this configuration, once the identification processing has been executed the set number of times, processing is forcibly ended at that point, so the detection processing can be kept within a processing amount of no more than a predetermined number of iterations chosen according to the specification. High-speed processing thereby becomes possible.

第５の発明に係る画像処理装置は、特徴量ベクトル評価部が、複数の画像判定部の少なくとも一つが対象物体を包含と判断した場合には、異なる参照パラメータを用いて検出処理を続行する。 In the image processing apparatus according to the fifth invention, when the feature vector evaluation unit determines that at least one of the plurality of image determination units includes the target object, the detection processing is continued using different reference parameters.

この構成により、対象物体を包含すると判断した場合でも、さらに別の参照パラメータを用いて識別処理を続行することができるので、対象物体の検出の精度をさらに向上させることができる。 With this configuration, even when it is determined that the target object is included, the identification process can be continued using still another reference parameter, so that the accuracy of detection of the target object can be further improved.

第６の発明に係る画像処理装置は、特徴量ベクトル評価部が、異なる複数の参照パラメータを重み係数の降順に用いる。 In the image processing device according to the sixth aspect, the feature vector evaluation unit uses a plurality of different reference parameters in descending order of the weighting coefficient.

重み係数の降順に参照パラメータを用いて部分画像の対象物体の包含／非包含の識別処理を実行することができるので、対象物体を識別しやすい参照パラメータから順次識別処理を実行できる。これにより、検出処理を効率的に行うことができるとともに、早期に検出の続行の必要性を判定できるので高速処理を実現できる。 Since it is possible to execute the inclusion / non-inclusion identification process of the target object of the partial image using the reference parameters in descending order of the weighting factors, it is possible to execute the identification process sequentially from the reference parameter that easily identifies the target object. As a result, the detection process can be performed efficiently and the necessity of continuing the detection can be determined at an early stage, so that high-speed processing can be realized.
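
The ordering of reference parameters by descending weighting coefficient can be sketched as below. The class `ReferenceParameter`, its field names, and the function `order_by_weight` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReferenceParameter:
    """One reference parameter: two reference vectors plus a weighting coefficient."""
    target_vector: List[float]
    non_target_vector: List[float]
    weight: float


def order_by_weight(parameters: List[ReferenceParameter]) -> List[ReferenceParameter]:
    """Return the parameters sorted so that the largest weight is used first."""
    return sorted(parameters, key=lambda p: p.weight, reverse=True)
```

Sorting once, ahead of detection, means the most discriminative parameters are applied to every partial image first, so a rejection tends to occur after few evaluations.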

第７の発明に係る画像処理装置は、特徴量ベクトル評価部が、対象物体参照データと、非対象物体に関する非対象物体参照データの次元数の昇順に用いる。 In the image processing apparatus according to the seventh invention, the feature vector evaluation unit uses the target object reference data and the non-target object reference data relating to non-target objects in ascending order of their number of dimensions.

この構成により、多次元の特徴量ベクトルを用いて評価を実行する場合でも、より識別力の高い順に検出処理を実行することができるので、効率的な検出処理を実現することができる。 With this configuration, even when evaluation is performed using a multidimensional feature quantity vector, detection processing can be executed in order of higher discriminating power, so that efficient detection processing can be realized.

第８の発明に係る画像処理装置は、参照パラメータは、対象物体参照ベクトルと非対象物体参照ベクトルからアンサンブル学習で生成される。 In the image processing apparatus according to the eighth invention, the reference parameter is generated by ensemble learning from the target object reference vector and the non-target object reference vector.

この構成により、識別処理に用いられる参照パラメータが学習的に生成されるため、識別能力の高い参照パラメータを用いることができる。 With this configuration, the reference parameter used for the identification process is generated in a learning manner, so that it is possible to use a reference parameter with high identification ability.

第９の発明に係る画像処理装置は、アンサンブル学習は、ブースティング手法を用いる。 In the image processing apparatus according to the ninth aspect of the invention, the ensemble learning uses a boosting technique.

この構成により、参照パラメータを生成するアンサンブル学習を容易かつ適切に実行することができる。 With this configuration, ensemble learning for generating reference parameters can be easily and appropriately executed.

第１０の発明に係る画像処理装置は、特徴量ベクトルは、エッジ特徴、およびハールウェーブレット変換による多重解像度特徴（以下、「ウェーブレット特徴」という）の少なくとも一つを算出するフィルタ処理により算出される。 In the image processing apparatus according to the tenth aspect, the feature vector is calculated by a filter process for calculating at least one of an edge feature and a multi-resolution feature (hereinafter referred to as “wavelet feature”) by Haar wavelet transform.

この構成により、特徴量ベクトルを周波数成分のレベル毎に生成することができ、その順序で識別処理を実行することができるので、効率の高い検出を実行することができる。さらに、あらかじめウェーブレット変換された入力画像の場合にも対応した検出が容易にできる。 With this configuration, a feature vector can be generated for each level of frequency components, and identification processing can be executed in that order, so that highly efficient detection can be executed. Furthermore, detection corresponding to the case of an input image that has been wavelet transformed in advance can be easily performed.

第１１の発明に係る画像処理装置は、ウェーブレット特徴は、オクターブ分割により得られる。 In the image processing apparatus according to the eleventh aspect, the wavelet feature is obtained by octave division.

この構成により、ウェーブレット変換のレベルに応じた画像から効率的に識別処理を実行することが可能となる。 With this configuration, it is possible to efficiently execute identification processing from an image corresponding to the wavelet transform level.

第１２の発明に係る画像処理装置は、特徴量ベクトル評価部は、オクターブ分割による高レベル成分から順次用いる。 In the image processing device according to the twelfth aspect, the feature vector evaluation unit sequentially uses high-level components by octave division.

ウェーブレット変換で発生するレベルに応じた成分から識別処理が実行されるので、効率的に識別処理を実行できる。さらに、処理量を減らすことができるので、処理の高速化を実現することができる。 Since the identification process is executed from the component corresponding to the level generated by the wavelet transform, the identification process can be executed efficiently. Furthermore, since the amount of processing can be reduced, the processing speed can be increased.
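
The octave division of a Haar wavelet transform, and the use of components starting from the high level, can be illustrated with the following sketch. It is written under several assumptions: an averaging (divide-by-four) variant of the Haar step is used, image width and height are divisible by two at every level, a "high level" is taken to mean a coarser decomposition level, and the names `haar_step` and `octave_decompose` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]
Bands = Tuple[Image, Image, Image, Image]


def haar_step(image: Image) -> Bands:
    """One 2D Haar step: returns (average, horizontal, vertical, diagonal) bands,
    each half the width and height of the input."""
    avg, hor, ver, dia = [], [], [], []
    for r in range(0, len(image), 2):
        rows = ([], [], [], [])
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 4.0)  # low-pass in both directions
            rows[1].append((a - b + d - e) / 4.0)  # left-minus-right difference
            rows[2].append((a + b - d - e) / 4.0)  # top-minus-bottom difference
            rows[3].append((a - b - d + e) / 4.0)  # diagonal difference
        avg.append(rows[0])
        hor.append(rows[1])
        ver.append(rows[2])
        dia.append(rows[3])
    return avg, hor, ver, dia


def octave_decompose(image: Image, levels: int) -> List[Bands]:
    """Octave division: only the average band is split again at each level.
    The result is ordered from the coarsest (highest) level to the finest."""
    bands = []
    current = image
    for _ in range(levels):
        step = haar_step(current)
        bands.append(step)
        current = step[0]
    bands.reverse()
    return bands
```

The coarsest level contains the fewest coefficients, so evaluating it first keeps the cost of an early rejection low.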

第１３の発明に係る画像処理装置は、入力画像が、ウェーブレット対応のフォーマットで符号化、および復号化の少なくとも一方がなされている。 In the image processing apparatus according to the thirteenth invention, the input image is at least one of encoded and decoded in a wavelet compatible format.

この構成により、入力画像を改めてウェーブレット変換しなくとも、ウェーブレット変換に基づく参照パラメータを用いた識別処理を実行することができる。 With this configuration, it is possible to execute identification processing using reference parameters based on wavelet transformation without performing wavelet transformation on the input image again.

第１４の発明に係る画像処理装置は、ウェーブレット対応のフォーマットがＪＰＥＧ２０００フォーマットに対応している。 In the image processing apparatus according to the fourteenth aspect, the wavelet compatible format corresponds to the JPEG2000 format.

この構成により、デジタルカメラなどで撮像されたＪＰＥＧ２０００フォーマットに対応する入力画像に対して識別処理を実行する場合に容易に対応することができるので、さまざまな撮像機能を有する電子機器を用いた対象物件の検出処理が可能となる。 With this configuration, identification processing can readily be applied to input images in the JPEG2000 format captured by a digital camera or the like, so target object detection becomes possible with electronic devices having a variety of imaging functions.

第１５の発明に係る画像処理方法は、入力画像から対象物体を検出する画像処理方法であって、対象物体の画像ベクトルである対象物体参照ベクトルと非対象物体の画像ベクトルである非対象物体参照ベクトルを含む参照ベクトルと参照ベクトルの重み係数との組み合わせである複数のパラメータを設定するパラメータ設定ステップと、入力画像から部分画像を切り出す部分画像切り出しステップと、切り出された部分画像の特徴量ベクトルを算出する特徴量ベクトル算出ステップと、特徴量ベクトルとパラメータを用いて、部分画像から対象物体を検出する検出処理を行う特徴量ベクトル評価ステップを備え、特徴量ベクトル評価ステップは、特徴量ベクトルとパラメータを用いて演算した演算結果に基づいて、部分画像の対象物体の包含／非包含を判定する画像判定ステップを有し、複数の画像判定ステップが、対象物体を非包含と判定した時点で検出処理を終了する。 An image processing method according to a fifteenth invention is an image processing method for detecting a target object from an input image, and comprises: a parameter setting step of setting a plurality of parameters, each being a combination of reference vectors, which include a target object reference vector that is an image vector of the target object and a non-target object reference vector that is an image vector of a non-target object, and a weighting coefficient of the reference vectors; a partial image cutout step of cutting out a partial image from the input image; a feature vector calculation step of calculating a feature vector of the cut-out partial image; and a feature vector evaluation step of performing detection processing for detecting the target object from the partial image using the feature vector and the parameters. The feature vector evaluation step has an image determination step of determining whether the partial image includes or does not include the target object, based on the result of a calculation using the feature vector and the parameters, and the detection processing ends at the point when the plurality of image determination steps determine that the target object is not included.

第１６の発明に係る画像処理方法は、画像判定ステップが、特徴量ベクトルと対象物体参照データとの対象物体相関値と特徴量ベクトルと非対象物体参照データとの相関値を算出する相関値算出ステップと、相関値と重み係数との積である評価値を算出する評価値算出ステップを備える。 In the image processing method according to the sixteenth invention, the image determination step comprises: a correlation value calculation step of calculating a target object correlation value between the feature vector and target object reference data, and a correlation value between the feature vector and non-target object reference data; and an evaluation value calculation step of calculating an evaluation value that is the product of the correlation values and the weighting coefficient.

この構成により、画像識別部における画像処理を異なる参照パラメータで順次実行し、実行中にいずれかの画像識別部で対象物体が非包含と判断された時点で検出処理を終了することができるので、高速処理を実現することができる。 With this configuration, the image processing in the image identification unit is sequentially executed with different reference parameters, and the detection process can be ended when it is determined that the target object is not included in any of the image identification units during execution. High-speed processing can be realized.

本発明によれば、識別精度の高い参照パラメータから順に検出処理を行うことで、識別精度を劣化させることがない。さらに、非対象物体と判断した時点で評価を終了するため、高速化も同時に実現できる。 According to the present invention, the detection accuracy is not deteriorated by performing the detection process in order from the reference parameter having the high identification accuracy. Furthermore, since the evaluation is terminated when it is determined as a non-target object, speeding up can be realized at the same time.

（実施の形態１） (Embodiment 1)
本発明の実施の形態１では、物体検出の一例として、画像から顔を検出する画像処理装置、および画像処理方法について説明する。なお、対象物体は顔に限られず、他の所望のものであってもよい。 In the first embodiment of the present invention, an image processing apparatus and an image processing method for detecting a face from an image will be described as an example of object detection. Note that the target object is not limited to a face and may be any other desired object.

図１は本発明の実施の形態１における画像処理装置のブロック図である。 FIG. 1 is a block diagram of an image processing apparatus according to Embodiment 1 of the present invention.

まず、画像処理装置１の構成を説明する。 First, the configuration of the image processing apparatus 1 will be described.

画像処理装置１は、入力画像２から部分画像を切り出す部分画像切り出し部３と、切り出された部分画像の特徴量ベクトルを算出する特徴量ベクトル算出部４と、特徴量ベクトルと参照パラメータ１２から対象物体の包含／非包含を判定する特徴量ベクトル評価部５を備えている。さらに、参照パラメータ１２を含む参照用データベース１１を備えており、参照パラメータ１２は、対象物体参照ベクトル１３と、非対象物体参照ベクトル１４と、重み係数１５を含んでいる。特徴量ベクトル評価部５は、複数の画像識別部６を備えており、いずれか一つの画像識別部６において、部分画像が対象物体を非包含であると判定した場合には検出処理を終了する。さらに、画像識別部６は、特徴量ベクトルと対象物体参照ベクトル１３との相関値、および特徴量ベクトルと非対象物体参照ベクトル１４との相関値を算出する相関値算出部７を有する。さらに、相関値算出部７で算出された相関値から特徴量ベクトルと対象物体との類似度を算出する類似度算出部８と、算出された類似度と重み係数１５の積である評価値を算出する評価値算出部と、評価値から部分画像の対象物体の包含／非包含を判定する判定部を備えている。 The image processing apparatus 1 includes a partial image cutout unit 3 that cuts out a partial image from the input image 2, a feature quantity vector calculation unit 4 that calculates a feature quantity vector of the cut-out partial image, and a feature vector evaluation unit 5 that determines inclusion or non-inclusion of the target object from the feature quantity vector and the reference parameter 12. It further includes a reference database 11 containing the reference parameters 12; each reference parameter 12 includes a target object reference vector 13, a non-target object reference vector 14, and a weighting coefficient 15. The feature vector evaluation unit 5 includes a plurality of image identification units 6, and ends the detection processing when any one of the image identification units 6 determines that the partial image does not include the target object. Each image identification unit 6 has a correlation value calculation unit 7 that calculates a correlation value between the feature quantity vector and the target object reference vector 13 and a correlation value between the feature quantity vector and the non-target object reference vector 14. It further has a similarity calculation unit 8 that calculates the similarity between the feature quantity vector and the target object from the correlation values calculated by the correlation value calculation unit 7, an evaluation value calculation unit that calculates an evaluation value that is the product of the calculated similarity and the weighting coefficient 15, and a determination unit that determines from the evaluation value whether the partial image includes or does not include the target object.

最終的に検出結果が出力される。 Finally, the detection result is output.

次に、各部の詳細な構成と動作について説明する。 Next, the detailed configuration and operation of each unit will be described.

まず、部分画像切り出し部３について説明する。 First, the partial image cutout unit 3 will be described.

部分画像切り出し部３は、入力画像２を任意のサイズに切り出す。任意のサイズは例えば画素単位で所定の大きさであってもよく、その大きさも任意である。また、入力画像２の左上の頂点を開始点として、順次右方向に移動し、右端に達すると左下方に移動して、最終的には右下を終了点とする順序で切り出してもよい。部分画像切り出し部３は、例えば、所定の大きさを有するウィンドウを入力画像２に当てることで、画像切り出しを実現する。 The partial image cutout unit 3 cuts out the input image 2 to an arbitrary size. The arbitrary size may be a predetermined size in units of pixels, for example, and the size is also arbitrary. Alternatively, the upper left vertex of the input image 2 may be sequentially moved to the right as the start point, moved to the lower left when reaching the right end, and finally cut out in the order of the lower right as the end point. The partial image cutout unit 3 realizes image cutout by, for example, applying a window having a predetermined size to the input image 2.
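
The scan order described above (from the upper left, moving right, then down to the next row, ending at the lower right) can be sketched as follows. The function names `iter_windows` and `crop`, the square window, and the `step` parameter are assumptions for illustration; the image is represented as a list of pixel rows.

```python
from typing import Iterator, List, Tuple

Image = List[List[float]]


def iter_windows(width: int, height: int, window: int, step: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield the (left, top) corner of each window, row by row from the upper left."""
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            yield left, top


def crop(image: Image, left: int, top: int, window: int) -> Image:
    """Cut out the window-sized partial image whose upper-left corner is (left, top)."""
    return [row[left:left + window] for row in image[top:top + window]]
```

Because `iter_windows` is a generator, it supports either mode mentioned in this description: collecting all partial images first, or cutting out and evaluating one partial image at a time.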

なお、部分画像切り出し部３は入力画像２のすべてからあらかじめ部分画像を切り出して、対象物体の検出処理を連続して行う切り出し処理でもよく、あるいは一つの部分画像を切り出し、この部分画像から対象物体の検出処理を行って、次の部分画像を切り出す処理でもよい。 Note that the partial image cutout unit 3 may cut out all partial images from the input image 2 in advance, after which the target object detection processing is performed on them in succession; alternatively, it may cut out one partial image, have the detection processing performed on that partial image, and then cut out the next partial image.

次に、特徴量ベクトル算出部４について説明する。 Next, the feature quantity vector calculation unit 4 will be described.

特徴量ベクトル算出部４は、部分画像をから特徴量ベクトルを算出する。特徴量ベクトルは、部分画像から抽出された特徴量を一次元に並べたものである。画像処理で用いられる特徴量は、画像のテクスチャ情報そのものや、複数枚の検出対象物体の画像の画素値の平均などがある。これらは、画像中の色（グレースケール画像の場合は輝度）情報と物体の位置情報とを特徴量とした高次元特徴であり、光源の強弱などの影響を受けやすいことがある。 The feature vector calculation unit 4 calculates a feature vector from the partial image. The feature amount vector is a one-dimensional arrangement of feature amounts extracted from the partial images. The feature amount used in image processing includes image texture information itself, an average of pixel values of images of a plurality of detection target objects, and the like. These are high-dimensional features that use color (luminance in the case of a grayscale image) information in an image and position information of an object as feature amounts, and may be easily affected by the strength of a light source.

これに対して、色情報や輝度情報ではなく、特徴量としてエッジ情報が用いられることもある。エッジ情報は、ｘ方向、ｙ方向に微分フィルタをかけたものであり、物体の輪郭、形状情報などを表す。さらに、エッジ情報を強度で正規化した値を特徴量としてもよい。この値は、ｘ方向、ｙ方向のエッジ情報を強度で割ったものであり、エッジ特徴の方向成分を表すベクトルであることから、本明細書では、エッジ法線方向ベクトルと呼ぶ。エッジ法線方向ベクトルは、エッジを明るさ情報であるエッジ強度で正規化しているため、明るさ変化に対してロバストな特徴を有する。 On the other hand, edge information may be used as a feature amount instead of color information and luminance information. The edge information is obtained by applying a differential filter in the x direction and the y direction, and represents the contour of the object, shape information, and the like. Furthermore, a value obtained by normalizing edge information with intensity may be used as the feature amount. This value is obtained by dividing the edge information in the x direction and the y direction by the intensity, and is a vector representing the direction component of the edge feature. Therefore, this value is referred to as an edge normal direction vector in this specification. Since the edge normal direction vector normalizes the edge with the edge intensity that is the brightness information, the edge normal direction vector has a characteristic that is robust against the brightness change.

ここで、エッジ法線方向ベクトルの算出について述べる。まず、部分画像にソーベルフィルタを施す。ソーベルフィルタは、ｘ方向について、 The calculation of the edge normal direction vector is now described. First, a Sobel filter is applied to the partial image. The Sobel filter has a kernel for the x direction

ｙ方向について、 and a kernel for the y direction.

とを有するフィルタである。これらのソーベルフィルタにより、ｘ方向の強度とｙ方向の強度であるｘｙ成分が得られる。このｘｙ成分は強度で正規化される。 These Sobel filters yield the xy components, namely the intensity in the x direction and the intensity in the y direction. The xy components are then normalized by the edge intensity.

ついで、部分画像の各画素について、エッジ法線方向ベクトルを算出したエッジ画像が得られる。このエッジ画像が所定のウィンドウサイズ（本実施の形態では３６×３６）に拡縮する。その後、拡縮したエッジ画像の画素値が１列に並べられ、特徴量ベクトルとされる。このとき、ウィンドウサイズは、参照用データベース１１の有する参照パラメータと同一サイズであることが好ましい。特徴量ベクトル評価部５での処理が容易となるからである。 Next, an edge image in which an edge normal direction vector is calculated for each pixel of the partial image is obtained. This edge image is enlarged or reduced to a predetermined window size (36 × 36 in this embodiment). Thereafter, the pixel values of the scaled edge image are arranged in one column to be a feature vector. At this time, the window size is preferably the same size as the reference parameter of the reference database 11. This is because the feature vector evaluation unit 5 can be easily processed.
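
The edge normal direction vector computation can be illustrated with the sketch below. It assumes the commonly used 3×3 Sobel kernels (the kernels of this publication are not reproduced in the text above, and sign conventions vary), processes interior pixels only, and omits the scaling to the 36 × 36 window. The names `SOBEL_X`, `SOBEL_Y`, and `edge_normal_feature` are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]

# Commonly used 3x3 Sobel kernels (an assumption; see the note above).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def edge_normal_feature(image: Image, eps: float = 1e-9) -> List[float]:
    """Return the per-pixel (x, y) edge components normalized by edge intensity,
    flattened into a single one-dimensional feature vector."""
    feature: List[float] = []
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            strength = math.hypot(gx, gy)  # edge intensity
            if strength < eps:
                feature.extend((0.0, 0.0))  # flat region: no edge direction
            else:
                feature.extend((gx / strength, gy / strength))
    return feature
```

Dividing by the edge intensity removes the dependence on overall contrast: multiplying every pixel by a constant leaves the resulting vector unchanged, which reflects the robustness to brightness changes noted above.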

以上のように、エッジ情報を基礎とした特徴量ベクトルが算出される。 As described above, a feature vector based on edge information is calculated.

次に、特徴量ベクトル評価部５について説明する。 Next, the feature vector evaluation unit 5 will be described.

特徴量ベクトル評価部５は、複数の画像識別部６を備えている。また、画像識別部６は、相関値算出部７、類似度算出部８、評価値算出部９、判定部１０を備えている。画像識別部６は複数備えられ、各々の画像識別部６は参照用データベース１１から異なる参照パラメータ１２を読み込む。すなわち、各画像識別部６は、異なる参照パラメータにしたがって、部分画像の対象物体の包含／非包含を検出する。参照パラメータ１２は、検出対象の対象物体（本実施の形態では顔）参照ベクトル１３と、非対象物体参照ベクトル１４と、重み係数１５を含んでいる。対象物体参照ベクトル１３と、非対象物体参照ベクトル１４のそれぞれが含まれていることで、部分画像が対象物体に類似するか、非対象物体に類似するかのいずれかが容易に判定される。 The feature vector evaluation unit 5 includes a plurality of image identification units 6. Each image identification unit 6 includes a correlation value calculation unit 7, a similarity calculation unit 8, an evaluation value calculation unit 9, and a determination unit 10. Each of the image identification units 6 reads a different reference parameter 12 from the reference database 11. That is, each image identification unit 6 detects whether the partial image includes or does not include the target object according to a different reference parameter. The reference parameter 12 includes a reference vector 13 of the target object to be detected (a face in the present embodiment), a non-target object reference vector 14, and a weighting coefficient 15. Because both the target object reference vector 13 and the non-target object reference vector 14 are included, it is easy to determine whether the partial image resembles the target object or a non-target object.

相関値算出部７は、読み込まれた参照パラメータに含まれる対象物体参照ベクトル１３と、非対象物体参照ベクトル１４を用いて、特徴量ベクトルとの相関値を算出する。相関値算出は、双方のベクトルの畳み込み演算などで実現される。相関値は、対象物体参照ベクトル１３との相関値と、非対象物体参照ベクトル１４との相関値の両方が算出される。相関値算出部７の算出結果は類似度算出部８に出力される。 The correlation value calculation unit 7 calculates a correlation value between the feature quantity vector using the target object reference vector 13 and the non-target object reference vector 14 included in the read reference parameter. The correlation value calculation is realized by convolution calculation of both vectors. As the correlation value, both the correlation value with the target object reference vector 13 and the correlation value with the non-target object reference vector 14 are calculated. The calculation result of the correlation value calculation unit 7 is output to the similarity calculation unit 8.

類似度算出部８は、入力された相関値を元に対象物体もしくは非対象物体への類似度を算出する。相関値は、対象物体と非対象物体の両方に対して算出されるので、相関値の度合いがそのまま類似度として算出される。例えば、対象物体との相関値が高ければ、対象物体への類似度が高く、非対象物体との相関値が高ければ非対象物体への類似度が高いと算出される。算出された類似度は、評価値算出部９に出力される。 The similarity calculation unit 8 calculates the similarity to the target object or non-target object based on the input correlation value. Since the correlation value is calculated for both the target object and the non-target object, the degree of the correlation value is directly calculated as the similarity. For example, if the correlation value with the target object is high, the similarity to the target object is high, and if the correlation value with the non-target object is high, the similarity to the non-target object is high. The calculated similarity is output to the evaluation value calculation unit 9.

なお、類似度は相関値の置き換えに値するから、類似度を算出せず、相関値から直接評価値算出を行ってもよい。 Since the similarity is worthy of replacement of the correlation value, the evaluation value may be calculated directly from the correlation value without calculating the similarity.

The evaluation value calculation unit 9 calculates an evaluation value that is the product of the weight coefficient 15 and the similarity. The weight coefficient 15 represents the importance of the reference parameter 12 used by the image identification unit 6, that is, the identification accuracy of that reference parameter. The higher the weight coefficient, the higher the identification accuracy and the better the reference parameter separates the target object from non-target objects. In determining whether the target object is included, the image identification unit 6 multiplies the correlation value with the reference vector by the weight coefficient, which makes the determination more accurate. Conversely, when a reference parameter with a low weight coefficient (low identification accuracy) is used, the multiplication reduces the evaluation value, which helps to prevent inaccurate identification.

The evaluation value calculated by the evaluation value calculation unit 9 is output to the determination unit 10. The determination unit 10 determines from the input evaluation value whether the partial image includes the target object. For example, if the evaluation value is large in the direction of similarity to the target object, the partial image under detection is determined to include the target object; if it is large in the direction of similarity to the non-target object, the partial image is determined not to include the target object.
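The flow through one image identification unit 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: a zero-mean normalized correlation stands in for the correlation operation, the similarity is taken as the signed difference of the two correlation values, and the decision boundary is zero. All function names are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length vectors, in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant vector carries no correlation information
    return sum(x * y for x, y in zip(da, db)) / denom


def identify(feature, target_ref, non_target_ref, weight):
    """Return (evaluation value, includes_target) for one reference parameter."""
    # Correlation value calculation unit 7: correlate with both reference vectors.
    corr_target = normalized_correlation(feature, target_ref)
    corr_non_target = normalized_correlation(feature, non_target_ref)
    # Similarity calculation unit 8 (assumption: signed difference, so that
    # positive means "closer to the target object").
    similarity = corr_target - corr_non_target
    # Evaluation value calculation unit 9: weight coefficient times similarity.
    evaluation = weight * similarity
    # Determination unit 10 (assumption: zero is the decision boundary).
    return evaluation, evaluation > 0.0
```

A feature vector that correlates strongly with the target reference and negatively with the non-target reference yields a positive evaluation value, and the weight scales how much that single reference parameter is trusted.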

The image identification units 6, each given a different reference parameter 12, evaluate the feature vector of the cut-out partial image one after another. Detection for that partial image ends as soon as any one of the image identification units 6 determines that the target object is not included. When the target object is a face, there is no need to continue once the partial image has been determined to be a non-face. Conversely, when the partial image is determined to be a face, processing continues because it may still turn out to be a non-face. Because processing is executed in order from the reference parameter with the highest discrimination ability, a partial image determined to be a non-face will not contain a face, so processing can end at that point. After detection ends, the partial image cutout unit 3 cuts out the partial image at the next position and detection continues on the new partial image. When the entire input image 2 has been processed, the target object has been detected.

Because the plurality of image identification units 6 perform detection in order from the reference parameter 12 with the highest weight coefficient, high-speed processing is achieved without lowering the identification accuracy of detection.

Although a configuration with a plurality of image identification units 6 has been described, a single image identification unit 6 may be used instead, with different reference parameters 12 loaded into it one after another and processed sequentially.

Next, another form of the feature vector evaluation unit 5 will be described.

FIG. 2 is an internal configuration diagram of the feature vector evaluation unit 5 in Embodiment 1 of the present invention.

The feature vector evaluation unit 5 comprises an evaluation value calculation unit 21, an intermediate determination unit 22, a final determination unit 23, a reference parameter holding unit 24, and a control parameter holding unit 25. The evaluation value calculation unit 21 calculates the evaluation value of the input feature vector. The intermediate determination unit 22 and the final determination unit 23 determine from the evaluation value whether the partial image is a face. The control parameter holding unit 25 holds the determination parameter that decides whether the evaluation value is passed to the intermediate determination unit 22 or to the final determination unit 23. The reference parameter holding unit 24 holds the parameters that the intermediate determination unit 22 and the final determination unit 23 need for the face/non-face determination, together with the thresholds used in each determination unit.

FIG. 3 is a processing flowchart of the feature vector evaluation unit 5 in Embodiment 1 of the present invention.

The operation of the feature vector evaluation unit 5 shown in FIG. 2 is described below.

First, in step 701a, the evaluation value calculation unit 21 reads a reference parameter from the reference parameter holding unit 24 and calculates the evaluation value. The reference parameter holding unit stores M reference parameters (1 ≤ j ≤ M), each consisting of a face (target object) reference vector Fj, a non-face reference vector NFj, and the weight coefficient αj corresponding to Fj and NFj. The weight coefficient αj represents the discrimination performance of the learning results Fj and NFj; the larger αj is, the higher the discrimination performance. The reference parameters are stored in the reference parameter holding unit in the order in which they were generated, regardless of the values of Fj, NFj, and αj.

Next, in step 701b, the evaluation value calculation unit 21 reads the determination parameter p from the control parameter holding unit 25 and selects either the intermediate determination unit 22 or the final determination unit 23. In this embodiment, the determination parameter p is given an initial value in advance (0 ≤ p ≤ M), where M is the number of generated reference parameters. Then, in step 701c, p is decremented by 1 each time the intermediate determination unit 22 performs its processing. While p is 1 or more, the intermediate determination unit 22 performs the determination; when p is 0, the final determination unit 23 performs it. In other words, the determination parameter tracks how far detection has progressed through the reference parameters: once the determination has been performed as many times as there are reference parameters, processing ends regardless of whether a face has been determined.

In step 701c, the intermediate determination unit 22 performs the determination. In step 701d, the calculated evaluation value is compared with a threshold. If the evaluation value is equal to or less than the threshold, the partial image being processed is determined to be a non-face; if it is greater than the threshold, the partial image is determined to be a face. When the partial image is determined to be a face, the evaluation value calculation unit 21 runs again: it reads from the reference parameter holding unit 24 a reference parameter that has not yet been applied to the feature vector under evaluation and recalculates the evaluation value. When the partial image is determined to be a non-face, the determination result is output and processing ends.

Finally, when the determination parameter p reaches 0, the final determination unit 23 outputs, in step 701e, the final result of whether the partial image is a face.

Separating the determination into the intermediate determination unit 22 and the final determination unit 23 in this way allows the end of processing to be decided quickly.
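The control flow of steps 701a to 701e can be sketched as below. This is an illustrative sketch only: `compute_evaluation` and `run_determination` are hypothetical names, the thresholds are supplied as a plain dictionary keyed by p, and the assumption that the final determination applies one further reference parameter is a reading of the text, not something it states explicitly.

```python
from typing import Callable, Dict, Tuple


def run_determination(compute_evaluation: Callable[[int], float],
                      th: Dict[int, float],
                      p_initial: int) -> Tuple[bool, int]:
    """Return (is_face, number_of_reference_parameters_applied).

    compute_evaluation(k) is assumed to return the evaluation value obtained
    after k reference parameters have been applied; th[p] is the threshold
    used while the determination parameter equals p.
    """
    p = p_initial
    applied = 0
    # Intermediate determination: runs while p >= 1.
    while p >= 1:
        applied += 1
        value = compute_evaluation(applied)  # step 701a
        if value <= th[p]:                   # step 701d
            return False, applied            # non-face: stop immediately
        p -= 1                               # step 701c
    # Final determination: p has reached 0 (step 701e).
    applied += 1
    value = compute_evaluation(applied)
    return value > th[0], applied
```

The early return is what saves work: a window rejected at the first stage costs one evaluation instead of `p_initial + 1`.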

Next, the processing of the evaluation value calculation unit 21, the intermediate determination unit 22, and the final determination unit 23 is described together with the formulas used.

First, the evaluation value calculation unit 21 calculates the evaluation value by (Equation 3).

Here, Xn is the feature vector and h(Fj, NFj, Xn) is the evaluation function of the evaluation value calculation unit 21. Its arguments are the feature vector Xn and the reference parameter (face reference vector Fj and non-face reference vector NFj) supplied by the reference parameter holding unit 24. αj is the weight coefficient corresponding to the reference parameter Fj, NFj. The evaluation value calculation unit 21 reads m reference parameters and their weight coefficients from the reference parameter holding unit 24 and calculates the evaluation value as their weighted linear sum. The number m of reference parameters to read and the order in which they are read are set in advance and are chosen to suit the target object and the images.

The evaluation function h(Fj, NFj, Xn) is defined by (Equation 4).

(Equation 4) calculates the normalized correlation value between the feature vector Xn and the face reference vector Fj and the normalized correlation value between Xn and the non-face reference vector NFj, and compares the two. If the correlation with Fj is larger, the function returns 1; if the correlation with NFj is larger, it returns −1. Thus it returns 1 when the partial image is judged to include the target object and −1 when it is judged not to include it.

Xn(−), Fj(−), and NFj(−) (the symbols written with an overbar in Equation 4) are mean vectors whose elements are all equal to the mean of the elements of Xn, Fj, and NFj, respectively.
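The equation images for (Equation 3) and (Equation 4) do not survive in this text. The following is a reconstruction from the surrounding prose only, so the exact notation in the original drawings may differ. The symbol C(·,·) for the normalized correlation is introduced here for illustration, and assigning a tie to −1 is an assumption.

```latex
% Reconstruction from the description; C(.,.) is a hypothetical helper symbol.
\begin{align}
H(X_n) &= \sum_{j=1}^{m} \alpha_j \, h(F_j, NF_j, X_n)
  && \text{(Equation 3)} \\
h(F_j, NF_j, X_n) &=
  \begin{cases}
    +1 & \text{if } C(X_n, F_j) > C(X_n, NF_j) \\
    -1 & \text{otherwise}
  \end{cases}
  && \text{(Equation 4)} \\
C(X, R) &= \frac{(X - \bar{X}) \cdot (R - \bar{R})}
                {\lVert X - \bar{X} \rVert \, \lVert R - \bar{R} \rVert}
\end{align}
```

As a worked example under this reading, three reference parameters with weights 0.5, 0.3, and 0.2 whose evaluation functions return +1, +1, and −1 give H(Xn) = 0.5 + 0.3 − 0.2 = 0.6.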

Next, the intermediate determination unit 22 and the final determination unit 23 classify the partial image as a face (the target object) or a non-face by comparing the evaluation value H(Xn) calculated by (Equation 3) with the determination threshold th(p), which is the threshold used when the determination parameter is p. If H(Xn) is greater than th(p), the partial image is determined to be a face; otherwise it is determined to be a non-face. The determination threshold th(p) is supplied by the reference parameter holding unit 24.

The determination threshold th(p) is set so that face reference vectors do not fall at or below th(p). When the intermediate determination unit 22 uses this threshold and finds that the evaluation value is at or below it, the partial image can be judged to be clearly a non-face. Once a partial image is judged to be a non-face, no further identification is needed and processing can end immediately. Conversely, if the evaluation value exceeds the threshold, the partial image is determined to be a face and the determination continues with a new reference parameter.

The final determination unit 23 performs the determination using the last of the held reference parameters and outputs the result. Once the final determination unit 23 has been used, identification ends whether the result is face or non-face, and processing moves to the next partial image. The value of the determination parameter p at which the final determination starts may also be set larger than 0 to give priority to processing speed.

Furthermore, the evaluation value calculation unit 21 reads the reference parameters in descending order of weight coefficient. The magnitude of the weight coefficient represents the level of identification accuracy of the reference parameter. Reading and applying the reference parameters in descending order of weight therefore means that the determination starts with the reference parameters that carry the most discriminative information about faces and non-faces. This ordering allows efficient identification while maintaining identification accuracy. As a result, the processing amount is reduced, and the target object can be detected from the input image in real time.

The evaluation value calculation unit 21 and the intermediate determination unit 22 may instead operate as shown in FIG. 4.

FIG. 4 is a processing flowchart of the feature vector evaluation unit in Embodiment 1 of the present invention, and FIG. 5 shows examples of face images and non-face images in Embodiment 1 of the present invention.

First, in step 801a, the correlation value between the feature vector Xn and the face reference vector Fj is obtained using (Equation 4). The evaluation value calculation unit 21 outputs the calculated correlation value to the intermediate determination unit 22 and the final determination unit 23. The intermediate determination unit 22 and the final determination unit 23 then make the face/non-face determination from the magnitude of the correlation value rather than from the evaluation value.

As in the case of FIG. 3, in step 801d the correlation value is compared with the correlation threshold ThCor(p). The partial image is determined to be a non-face if the correlation value is below the threshold and a face if it is at or above the threshold. The processing of the intermediate determination unit 22 and the final determination unit 23 is otherwise the same as described for FIG. 3.

Making the face/non-face determination from the correlation with the face reference vector alone has the advantage that the result is not affected by the non-face reference vector. For example, as shown in the left column of FIG. 5, frontal faces have the eyes, nose, and mouth in nearly the same positions and shapes even for different people, so the face reference vector can be made to converge to a fixed value or level. The non-face background, on the other hand, covers everything other than frontal faces and takes many shapes, such as houses and mountains, so it is difficult to make the non-face reference vector converge to a fixed value or level. Using the non-face reference vector in the determination may therefore place a heavy load on the determination processing. In such cases, a determination that uses only the correlation with the face reference vector is advantageous, for example for high-speed processing.

Next, the reference database 11 will be described.

The reference database holds the reference parameters 12, each of which includes a target object reference vector, a non-target object reference vector, and a weight coefficient. The reference parameters are generated by learning from face (target object) images and non-face images.

The learning process that produces these parameters is described below.

FIG. 6 is a flowchart showing the flow of the learning process. First, an image database is prepared. The image database is a collection of the face images and non-face images used in the learning process, covering a variety of forms, shapes, and sizes. For example, the face images are images from which only the face has been cropped in advance, and the non-face images are background images that contain no face, such as forests or towns. The reference parameters are generated from this image database by the learning process.

First, in step 1001a, feature vectors are calculated for the face images and non-face images stored in the image database. As in the calculation of the feature vector of a partial image, the feature vector is expressed in terms of edge normal direction vectors. Edges in the x direction and y direction are extracted from the face and non-face images with the Sobel filters shown in (Equation 1) and (Equation 2) and divided by the edge strength to give the edge normal direction vectors. An edge image is then obtained from the edge normal direction vectors. This image is scaled to a fixed size (36 × 36 pixels in this embodiment), and the pixel values of the edge image are arranged in a single column to give the feature vector.
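The feature vector computation can be sketched as follows. This is a minimal pure-Python illustration under several assumptions: the standard 3 × 3 Sobel kernels stand in for (Equation 1) and (Equation 2), which are not reproduced in this section; nearest-neighbour sampling is used for the scaling; and the x and y components of each pixel are interleaved when the image is flattened. The function names are hypothetical.

```python
import math

# Standard Sobel kernels (assumed to correspond to Equations 1 and 2).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def edge_normal_field(gray):
    """Per-pixel unit edge normal (nx, ny) for a 2-D list of intensities."""
    h, w = len(gray), len(gray[0])
    field = [[(0.0, 0.0)] * w for _ in range(h)]
    for y in range(1, h - 1):          # border pixels keep (0, 0)
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    v = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * v
                    gy += SOBEL_Y[dy][dx] * v
            strength = math.hypot(gx, gy)
            if strength > 0.0:
                # Dividing by the edge strength leaves only the direction.
                field[y][x] = (gx / strength, gy / strength)
    return field


def resize_nearest(field, size):
    """Scale a 2-D field to size x size by nearest-neighbour sampling."""
    h, w = len(field), len(field[0])
    return [[field[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def feature_vector(gray, size=36):
    """Flatten the scaled edge-normal image into one vector."""
    scaled = resize_nearest(edge_normal_field(gray), size)
    return [comp for row in scaled for pixel in row for comp in pixel]
```

With the default window size the resulting vector has 36 × 36 × 2 = 2592 elements under the interleaving assumption; a single-channel encoding of the edge direction would give 1296.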

Next, the reference parameters are calculated from the face and non-face feature vectors.

First, the number of feature vectors is defined as Vnum. Then, in step 1001b, teacher data Ti (1 ≤ i ≤ Vnum) is assigned to each feature vector Xi. Ti is set to +1 when Xi is a face feature vector and to −1 when Xi is a non-face feature vector.
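The labelling in step 1001b amounts to pairing each feature vector with +1 or −1. A small sketch follows; the function name and the list-of-pairs representation are illustrative choices, not taken from the source.

```python
def assign_teacher_data(face_features, non_face_features):
    """Pair each feature vector Xi with its teacher value Ti (+1 face, -1 non-face)."""
    samples = [(x, +1) for x in face_features]
    samples += [(x, -1) for x in non_face_features]
    return samples  # len(samples) corresponds to Vnum
```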

Then, in step 1001c, the reference parameters are generated by ensemble learning using the feature vectors Xi and the teacher data Ti. Generating the reference parameters by learning yields a plurality of reference parameters suited to detecting the target object.

Ensemble learning is a learning method that generates a plurality of reference parameters with slightly different discriminative power in order to improve generalization. Examples of ensemble learning include bagging and boosting. FIG. 7 shows the configuration of a learning unit that performs ensemble learning.

The learning unit has a feature vector storage unit 1101 that stores the face and non-face feature vectors, a filter unit 1102 that samples part of the data in the feature vector storage unit, a learning processing unit 1103 that calculates reference parameters by ensemble learning from the data output by the filter unit, an evaluation data storage unit 1104 that stores feature vectors for evaluation, and an evaluation unit 1105 that performs evaluation using the evaluation data.

First, the filter unit 1102 samples the feature vectors stored in the feature vector storage unit 1101. The data sampled by the filter unit 1102 is output to the learning processing unit 1103. The learning processing unit 1103 generates a reference parameter from the input data, using an ensemble learning method such as boosting.

The generated reference parameter is evaluated by the evaluation unit 1105 against the evaluation data stored in the evaluation data storage unit 1104. The evaluation data consists of parameters that have general-purpose values set in advance, and the next reference parameter is generated according to the result of the comparison with them.

This procedure generates a plurality of different reference parameters.

Learning by ensemble methods is described in detail in T. G. Dietterich, "Ensemble methods in machine learning," in Multiple Classifier Systems, First International Workshop, MCS 2000, Cagliari, pp. 1-15, 2000, so further explanation is omitted here.

In this way, M reference parameters for use in (Equation 3) and (Equation 4) are generated. The larger M is, the more accurate the identification, but the more computation is needed to evaluate each feature vector at detection time. The number M of reference parameters should therefore be chosen to suit the purpose.

Finally, the overall operation of the image processing apparatus and image processing method in Embodiment 1 is described with reference to FIGS. 8 and 9.

FIG. 8 is a flowchart of the detection process in Embodiment 1 of the present invention, and FIG. 9 is a diagram illustrating the detection process in Embodiment 1 of the present invention.

In FIG. 8, (X, Y) is the upper-left coordinate of the detection position, S × S is the detection size, WIDTH and HEIGHT are the numbers of pixels of the image in the horizontal and vertical directions, and SIZEMAX is the larger of WIDTH and HEIGHT.

First, in step 401a, the upper-left coordinate (X, Y) of the detection position and the detection size S × S are initialized. In this embodiment, the upper-left corner of the image is the origin, the origin (0, 0) is the initial detection position, and the initial detection size is 20 × 20 pixels. After initialization, in step 401b, the partial image cutout unit 3 cuts out the partial image spanning (X, Y) to (X + S, Y + S) from the input image 2. Then, in step 401c, the feature vector calculation unit 4 calculates the feature vector from the cut-out partial image. Next, in step 401d, the feature vector evaluation unit 5 uses the feature vector and the reference parameters to determine whether the partial image includes the target object.

This processing is repeated while the window is shifted to the right by a suitable number of pixels (one pixel in this embodiment) (steps 401e and 401f) or downward (steps 401g and 401h) until it reaches the lower right of the input image, so that partial images of S × S pixels are cut out one after another and each is checked for the target object. In addition, the partial image size S may be enlarged by a suitable factor (1.2 in this embodiment) at a time (steps 401i and 401j) until it reaches the image size (320 × 240 in this embodiment).
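The scan described above can be sketched as a generator of window positions. The defaults (20-pixel initial size, 1-pixel step, scale factor 1.2) come from the text; the stopping rule (stop once the window no longer fits inside the image), the rounding of the scaled size, and the name `scan_windows` are assumptions made for illustration.

```python
def scan_windows(width, height, initial_size=20, step=1, scale=1.2):
    """Yield (x, y, size) for every S x S window, row by row, then at larger sizes."""
    size = initial_size
    # Assumption: stop when the square window no longer fits in the image.
    while size <= min(width, height):
        for y in range(0, height - size + 1, step):      # shift downward
            for x in range(0, width - size + 1, step):   # shift right
                yield x, y, size
        # Enlarge the window; max() guards against a size that fails to grow.
        size = max(size + 1, int(round(size * scale)))
```

Each yielded triple corresponds to one pass through steps 401b to 401d: cut out the window, compute its feature vector, and evaluate it.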

The image processing apparatus and image processing method for detecting a target object described above may be implemented on the general-purpose apparatus shown in FIG. 10.

FIG. 10 is a configuration diagram of an electronic apparatus that includes the image processing apparatus in Embodiment 1 of the present invention.

A CPU (central processing unit) 101 executes the image processing program described with reference to FIG. 9, which is stored in a ROM (read-only memory) 102.

A RAM (random access memory) 104 and a hard disk 105 store the input image and the reference database containing the reference parameters. They may also store intermediate data produced during image processing. A camera 108 is connected to a bus 103 via an interface 107, and an image captured by the camera 108 is supplied as the input image. The target object detection process is executed on this input image.

The camera 108 may be either a still camera or a video camera, and a camera built into a mobile phone may also be used.

The apparatus may also be implemented as a dedicated apparatus rather than such a general-purpose apparatus.

As described above, according to this embodiment, identification can be performed in order, starting from the reference parameter that discriminates best between the target object and non-target objects, and identification of a partial image can be terminated as soon as that partial image is judged to be a non-target object. This reduces the processing amount for non-target objects, which occupy most of the image. As a result, the detection process is sped up without lowering identification accuracy.

(Embodiment 2)
Embodiment 2 of the present invention describes a technique for detecting a target object at high speed and with high accuracy by using, as the feature quantity, wavelet features obtained by the Haar wavelet transform.

For convenience of explanation, the target object is described as a face (of a person, an animal, or the like); however, the image processing apparatus and image processing method described in Embodiment 2 are not limited to face detection.

FIG. 11 is a block diagram of an image processing apparatus according to Embodiment 2 of the present invention, FIG. 12 is an internal block diagram of the feature vector evaluation unit according to Embodiment 2, and FIG. 13 is an operation flowchart of the feature vector evaluation unit according to Embodiment 2.

実施の形態２における画像処理装置は、実施の形態１で説明した画像処理装置と全体構成は同様である。ここで、図１１に表される構成において、入力画像１２０１と部分画像切り出し部１２０２は実施の形態１で説明したものと相違ないので説明を省略する。 The image processing apparatus according to the second embodiment has the same overall configuration as the image processing apparatus described in the first embodiment. Here, in the configuration shown in FIG. 11, the input image 1201 and the partial image cutout unit 1202 are not different from those described in Embodiment 1, and thus the description thereof is omitted.

まず、特徴量ベクトル算出部１２０３について説明する。 First, the feature vector calculation unit 1203 will be described.

特徴量ベクトル算出部１２０３は、切り出された部分画像から特徴量ベクトルを算出する。実施の形態２では、特徴量として、実施の形態１のエッジ情報に代えて、光源の強弱の影響が少ない周波数特徴を用いる方法について説明する。 The feature vector calculation unit 1203 calculates a feature vector from the clipped partial image. In the second embodiment, a method of using a frequency feature that is less affected by the intensity of the light source instead of the edge information of the first embodiment will be described as the feature amount.

The wavelet transform, one such frequency feature, analyzes and extracts the original signal by dilating and contracting the wavelet used in the transform. Using the wavelet transform therefore has the advantage that both the shape information of the image carried by the original signal and the frequency information of the image obtained by the frequency transform are available.

特徴量ベクトル算出部１２０３は、部分画像に対してハールウェーブレット変換による多重解像度解析を行う。ついで、ハールウェーブレット変換による多重解像度解析により得られるウェーブレット展開係数とスケーリング係数（以下、これらをウェーブレット特徴と呼ぶ）を一列に並べる。この一列に並べられたウェーブレット特徴が、特徴量ベクトルとして用いられる。 The feature vector calculation unit 1203 performs multi-resolution analysis by Haar wavelet transform on the partial image. Next, wavelet expansion coefficients and scaling coefficients (hereinafter referred to as wavelet features) obtained by multi-resolution analysis by Haar wavelet transform are arranged in a line. The wavelet features arranged in a line are used as a feature vector.

ここで、このウェーブレット変換と多重解像度解析について説明する。 Here, the wavelet transform and multiresolution analysis will be described.

まず、ウェーブレットについて説明する。ウェーブレットにおける基本ウェーブレットはマザーウェーブレットψ（ｘ）と呼ばれる。このマザーウェーブレットにより、画像上の周波数成分が検出される。すなわち、画像上の各部分における周波数成分の情報が得られる。さらに、マザーウェーブレットを引き伸ばして、それを適当な位置に平行移動させて合成することにより、オリジナル信号が表現される。ウェーブレット変換は、（数５）で定義される。 First, the wavelet will be described. The basic wavelet in the wavelet is called mother wavelet ψ (x). By this mother wavelet, a frequency component on the image is detected. That is, information on frequency components in each part on the image is obtained. Further, the mother wavelet is stretched and translated to an appropriate position to synthesize the original signal. The wavelet transform is defined by (Equation 5).

（数５）で、ｂはシフト、ａ＞０は拡大縮小のパラメータであり、スケールと呼ばれる。１／√ａは正規化のための係数である。このψ（）と信号ｆ（ｘ）との内積がウェーブレット変換の定義式である。 In (Equation 5), b is a shift parameter, and a> 0 is an enlargement / reduction parameter, which is called a scale. 1 / √a is a coefficient for normalization. The inner product of ψ () and the signal f (x) is the defining formula for wavelet transform.
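The equation image for (Equation 5) is not reproduced in this text. Based only on the description above (shift b, scale a > 0, normalization 1/√a, inner product with f(x)), it presumably has the standard form of the continuous wavelet transform shown below. The symbol W(a, b) is an assumed name, and the complex conjugate is omitted on the assumption of a real-valued wavelet.

```latex
% Hedged reconstruction of (Equation 5); W(a,b) is an assumed symbol.
W(a, b) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{a}}\,
          \psi\!\left(\frac{x - b}{a}\right) f(x)\, dx ,
\qquad a > 0 .
```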

Mother wavelets include the Haar wavelet, the Mexican hat wavelet, and the Gabor wavelet. Of these, the Haar wavelet has a simple formula and therefore a lower computational cost than the other wavelets. Embodiment 2 therefore uses the Haar wavelet.

ハールウェーブレット関数は（数６）で定義される。 The Haar wavelet function is defined by (Equation 6).

次に、多重解像度解析について説明する。 Next, multiresolution analysis will be described.

Multi-resolution analysis approximates a signal by a linear combination of functions called scaling functions. The accuracy with which the original signal is approximated is expressed as a level. Through multi-resolution analysis, the signal is represented as a linear combination of wavelet features computed at several resolutions. Using the result of multi-resolution analysis of an image as the feature quantity allows detection and evaluation to use features that span several frequency bands. Consequently, if the face region of the input image has high resolution, the correlation is high in the high-resolution components, and if it has low resolution, the correlation is high in the low-resolution components, so detection is robust to differences in resolution.

ハールウェーブレットのスケーリング関数は（数７）と定義される。 The scaling function of the Haar wavelet is defined as (Equation 7).
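The equation images for (Equation 6) and (Equation 7) are likewise not reproduced here. The textbook definitions of the Haar wavelet ψ and the Haar scaling function φ on the unit interval are given below for reference; whether the specification uses exactly this support and amplitude is an assumption.

```latex
% Standard Haar wavelet (cf. Equation 6) and Haar scaling function (cf. Equation 7).
\psi(x) =
\begin{cases}
 \phantom{-}1 & 0 \le x < \tfrac{1}{2} \\
 -1 & \tfrac{1}{2} \le x < 1 \\
 \phantom{-}0 & \text{otherwise}
\end{cases}
\qquad
\phi(x) =
\begin{cases}
 1 & 0 \le x < 1 \\
 0 & \text{otherwise}
\end{cases}
```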

このスケーリング関数を用いて、任意の関数ｆ（ｘ）が（数８）として定義される。 Using this scaling function, an arbitrary function f (x) is defined as (Equation 8).

次に、ハールのスケーリング関数の平行移動、拡大／縮小したものが（数９）で定義される。 Next, the translation and enlargement / reduction of the Haar scaling function are defined by (Equation 9).

Using this φj,k, the level-j approximation is defined by (Equation 10).

Here, sk(j) is called the scaling coefficient. The approximation is most accurate at level 0 and becomes coarser as the level increases. In other words, f1(x) has lost information relative to f0(x). Denoting the lost part by g1(x), (Equation 11) holds.

Here g1(x) is called the level-1 wavelet component. From the wavelet ψ, translated and dilated functions ψj,k are obtained as in (Equation 12), and expressing g1(x) in terms of ψj,k gives (Equation 13). Here ωk(1) is called the level-1 wavelet expansion coefficient.

以上により、レベル値「０」の近似関数ｆ０（ｘ）は、レベル値「１」の近似関数ｆ１（ｘ）とレベル値「１」のウェーブレットによって表現される関数ｇ１（ｘ）に分解され、（数１４）のように表される。これを一般化すると、（数１５）となる。 As described above, the approximate function f0 (x) having the level value “0” is decomposed into the approximate function f1 (x) having the level value “1” and the function g1 (x) represented by the wavelet having the level value “1”. It is expressed as (Equation 14). When this is generalized, (Expression 15) is obtained.

以上のように、関数ｆ０（ｘ）を幅の異なる、つまり解像度の異なるウェーブレット成分の和で表現することが可能である。これを再帰的に利用すると、レベル０の近似関数ｆ０（ｘ）は、レベル１、２、・・・、ｊの近似関数を用いて、（数１６）のように表される。 As described above, the function f0 (x) can be expressed by the sum of wavelet components having different widths, that is, different resolutions. When this is used recursively, the approximation function f0 (x) at level 0 is expressed as (Equation 16) using approximation functions at levels 1, 2,..., J.
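The chain of relations described in this passage can be summarized as below. This is a hedged sketch in standard notation, not the specification's own equation images: the scaling function is written φ and the wavelet ψ, the factor 2^{-j/2} is an assumed normalization, and the level convention (level 0 finest, larger j coarser) follows the text.

```latex
% Dilated/translated basis functions (cf. Equations 9 and 12), assumed normalization.
\phi_{j,k}(x) = 2^{-j/2}\,\phi\!\left(2^{-j}x - k\right), \qquad
\psi_{j,k}(x) = 2^{-j/2}\,\psi\!\left(2^{-j}x - k\right)

% Level-j approximation and level-j wavelet component (cf. Equations 10 and 13).
f_j(x) = \sum_k s_k^{(j)}\,\phi_{j,k}(x), \qquad
g_j(x) = \sum_k \omega_k^{(j)}\,\psi_{j,k}(x)

% One-level and general decomposition (cf. Equations 11, 14 and 15).
f_0(x) = f_1(x) + g_1(x), \qquad f_{j-1}(x) = f_j(x) + g_j(x)

% Applying the decomposition recursively (cf. Equation 16).
f_0(x) = g_1(x) + g_2(x) + \cdots + g_j(x) + f_j(x)
```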

次に、ウェーブレット変換による多重解像度解析を画像に適用する場合について説明する。 Next, a case where multiresolution analysis by wavelet transform is applied to an image will be described.

画像という２次元信号に対し、１次元の２分割フィルタバンクを繰り返し使用して、２次元ウェーブレット変換を実行する。具体的には、まず画像の水平方向にウェーブレット変換を施し、２：１のダウンサンプリングを行い、低周波成分と高周波成分に分離する。次に、その結果に対して、垂直方向にウェーブレット変換を施すことにより、低周波成分と３つの高周波成分が得られる。これらの操作を、低周波成分に対して繰り返し行うことにより、多重解像度解析を行った画像データが得られる。このように水平と垂直の処理を分離した形式で、二次元フィルタリングが可能になる。これには、２次元フィルタの設計が１次元フィルタの設計問題として実行でき容易であるという利点がある。図１４（ａ）は本発明の実施の形態２における元画像のイメージであり、（ｂ）は、（ａ）に対してレベル３の多重解像度解析を行った画像イメージである。このように、画像の左上が低周波成分となり、それ以外のところが高周波成分となる。 A two-dimensional wavelet transform is executed on a two-dimensional signal called an image by repeatedly using a one-dimensional two-part filter bank. Specifically, first, wavelet transformation is performed in the horizontal direction of the image, 2: 1 down-sampling is performed, and a low-frequency component and a high-frequency component are separated. Next, a low-frequency component and three high-frequency components are obtained by subjecting the result to wavelet transform in the vertical direction. By repeating these operations for low frequency components, image data subjected to multi-resolution analysis can be obtained. In this way, two-dimensional filtering can be performed in a format in which horizontal and vertical processing are separated. This has the advantage that the design of the two-dimensional filter can be easily performed as a design problem of the one-dimensional filter. FIG. 14A is an image of an original image according to the second embodiment of the present invention, and FIG. 14B is an image image obtained by performing level 3 multiresolution analysis on FIG. Thus, the upper left of the image is a low frequency component, and the rest of the image is a high frequency component.
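The separable procedure described above (horizontal pass, vertical pass, then repetition on the low-frequency quadrant) can be sketched with NumPy as follows. The function names are hypothetical, and the filter normalization (pairwise average and half-difference) is an assumption chosen for readability; the specification does not state which normalization it uses.

```python
import numpy as np


def haar_step_1d(a: np.ndarray, axis: int) -> np.ndarray:
    """One Haar analysis step along `axis` with 2:1 downsampling.

    The first half of the result holds the low-frequency part (pairwise
    averages), the second half the high-frequency part (half-differences).
    """
    a = np.moveaxis(a, axis, 0)
    even, odd = a[0::2], a[1::2]
    low = (even + odd) / 2.0
    high = (even - odd) / 2.0
    return np.moveaxis(np.concatenate([low, high], axis=0), 0, axis)


def haar_multiresolution(image: np.ndarray, levels: int) -> np.ndarray:
    """Multi-level 2D Haar decomposition; low frequencies end up top-left."""
    out = np.asarray(image, dtype=float).copy()
    h, w = out.shape
    if h % (2 ** levels) or w % (2 ** levels):
        raise ValueError("image sides must be divisible by 2**levels")
    for _ in range(levels):
        block = haar_step_1d(out[:h, :w], axis=1)  # horizontal pass
        block = haar_step_1d(block, axis=0)        # vertical pass
        out[:h, :w] = block
        h, w = h // 2, w // 2                      # recurse on the low-frequency quadrant
    return out


if __name__ == "__main__":
    img = np.random.default_rng(0).random((48, 48))
    coeffs = haar_multiresolution(img, levels=3)
    print(coeffs.shape)  # same shape as the input image
```

With this normalization the top-left block after three levels simply holds the means of 8 × 8 pixel blocks, which makes the "low-frequency component in the upper left" layout easy to check.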

The wavelet expansion coefficients and scaling coefficients obtained by multi-resolution analysis are called wavelet features. The feature vector is formed by arranging the wavelet features in a single row, starting from the highest-level components.

Wavelet features also have the property that encoded data can be decoded progressively, raising the resolution step by step from the lowest-band, low-resolution image up to the high-resolution image. This is known as spatial resolution scalability in the encoding and decoding of JPEG2000 (ISO/IEC FCD 15444-1), a compression coding scheme for natural images. Furthermore, when an image is decoded using only the highest-level components, the color and shape information of the original image at that resolution is adequately represented.

次に、特徴量ベクトル評価部１２０４の処理を説明する。 Next, processing of the feature vector evaluation unit 1204 will be described.

特徴量ベクトル評価部１２０４の処理の流れが図１２に示されている。評価値算出部１３０１は、参照パラメータ保持部１３０４から、参照パラメータと重みを読み込み、評価値を算出する。ここで、本実施の形態では、参照パラメータと重みは、ウェーブレット特徴の性質を利用して次のように読み込む。まず、特徴量ベクトルのうち、レベルの高い成分のみを利用して評価を行う。先述したように、レベルの高い成分のみでも、元画像の形状情報は保持されているため、精度の高い識別は可能である。レベルの高い成分のみを利用すると、評価に用いる特徴量ベクトルの次元数が少なくなるため、処理量を削減できる。これにより算出された評価値によって、中間判定部１３０２で評価を行うことで、処理量の少ない評価で非顔であるものの判定が可能になる。 A processing flow of the feature vector evaluation unit 1204 is shown in FIG. The evaluation value calculation unit 1301 reads the reference parameter and the weight from the reference parameter holding unit 1304 and calculates an evaluation value. Here, in the present embodiment, reference parameters and weights are read as follows using the properties of wavelet features. First, evaluation is performed using only high-level components of the feature vector. As described above, the shape information of the original image is retained even with only a high-level component, so that highly accurate identification is possible. When only high-level components are used, the number of dimensions of the feature vector used for evaluation is reduced, so that the processing amount can be reduced. By performing evaluation by the intermediate determination unit 1302 based on the evaluation value thus calculated, it is possible to determine a non-face with an evaluation with a small processing amount.

ついで、、順にレベルの低い成分も利用して評価を行う。レベルの低い成分も利用することにより、次元数が増加し、情報量が増えるため、より精度の高い識別は可能になり、顔と非顔をより精密に識別することが可能になる。以上により、顔検出に要する全体的な処理量を削減できる。 Next, evaluation is performed using components having lower levels in order. By using a component having a low level, the number of dimensions increases and the amount of information increases. Therefore, discrimination with higher accuracy is possible, and a face and a non-face can be discriminated more precisely. Thus, the overall processing amount required for face detection can be reduced.

実施の形態２では、（数１７）により評価値が算出される。 In the second embodiment, the evaluation value is calculated by (Equation 17).

Here, Xn is the feature vector, and h(Fj, NFj, Xn, L) is the evaluation function of the evaluation value calculation unit 1301. Its arguments are the feature vector Xn together with the face reference vector Fj, the non-face reference vector NFj, and the number of dimensions L of the feature vector to be used, which are input from the reference parameter holding unit 1304. αj is the weight for the j-th reference parameter, and m is the number of sets of face and non-face reference vectors and corresponding weight coefficients α that are input from the reference parameter holding unit. For simplicity, the features stored in the feature vector Xn and in the learning results Fj and NFj are assumed to be stored in order, starting from the highest-level components of the wavelet features. Changing the number of dimensions L therefore specifies up to which level the components are used. In this embodiment, level-3 multi-resolution analysis is performed on a 48 × 48 image, so that L is 12 × 12 when only the highest-level components are used, 24 × 24 when the components of the next level are also used, and 48 × 48 when all components are used.
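The three values of L quoted for a 48 × 48 image follow from simple arithmetic: after level-3 analysis, all level-3 coefficients occupy a 12 × 12 block, and each lower level doubles the side of the block. A small sketch of that calculation is shown below; the function name `prefix_lengths` is hypothetical.

```python
from typing import List


def prefix_lengths(size: int, levels: int) -> List[int]:
    """Number of leading feature-vector dimensions L usable at each stage.

    Assumes a square size x size image and a feature vector ordered from the
    highest-level (coarsest) components to the lowest-level ones.
    """
    if levels < 1 or size % (2 ** levels) != 0:
        raise ValueError("size must be divisible by 2**levels and levels >= 1")
    lengths = []
    side = size >> (levels - 1)  # side of the block holding all highest-level coefficients
    while side <= size:
        lengths.append(side * side)
        side *= 2                # include the next lower level
    return lengths


if __name__ == "__main__":
    print(prefix_lengths(48, 3))  # [144, 576, 2304], i.e. 12x12, 24x24, 48x48
```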

なお、以上のことを実現するため方法には以下の２通りがある。 There are the following two methods for realizing the above.

一つは、特徴量ベクトルの全てを用いてＦｊ、ＮＦｊを生成し、検出時には、そのうちＬ次元を用いて評価を行う方法である。 One is a method in which Fj and NFj are generated using all feature quantity vectors, and evaluation is performed using the L dimension during detection.

さらに一つは、特徴量ベクトルのうち、ウェーブレット特徴の高いレベルの成分からＬ次元の成分を用いてＦｊ、ＮＦｊを生成し、検出時に評価を行う方法である。 Further, one is a method of generating Fj and NFj using L-dimensional components from high-level components of wavelet features in the feature quantity vector, and evaluating at the time of detection.

The first method generates Fj and NFj using all of the features obtained from the multi-resolution analysis, and so has the advantage that the number of dimensions L can be changed flexibly at detection time. The second method generates Fj and NFj from only the L-dimensional part of the feature vector obtained from the multi-resolution analysis, so that, compared with the first method, all of the information obtained by learning is considered to be contained within the L dimensions.

Several methods of generating Fj and NFj are conceivable, and the most suitable one may be chosen according to the learning environment and the target images. Embodiment 2 uses reference parameters learned by the second method. Accordingly, the m reference parameters read from the reference parameter holding unit 1304 are all L-dimensional feature vectors.

By (Equation 17), the evaluation value is calculated as a weighted linear sum over the m reference parameters input from the reference parameter holding unit 1304.

また、この評価関数ｈ（）は、（数１８）で定義される。 The evaluation function h () is defined by (Equation 18).

(Equation 18) calculates the normalized correlation values between the feature vector Xn and the face reference vector Fj and between Xn and the non-face reference vector NFj; it is a function that returns 1 if the correlation with Fj is the larger and -1 if the correlation with NFj is the larger. Si(−), Fj(−), and NFj(−) are mean vectors holding the mean of Si, Fj, and NFj, respectively, over the first L dimensions.
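Since the equation images are not reproduced in this text, the following is a hedged reconstruction of (Equation 17) and (Equation 18) from the surrounding description only. The helper symbol C_L for the normalized correlation over the first L dimensions is introduced here for illustration, the feature vector is written X_n throughout, and the handling of an exact tie is an assumption.

```latex
% Evaluation value: weighted linear sum over m reference parameters (cf. Equation 17).
H(X_n) = \sum_{j=1}^{m} \alpha_j \, h(F_j, NF_j, X_n, L)

% Normalized correlation over the first L dimensions (helper symbol, assumed).
C_L(u, v) = \frac{\sum_{i=1}^{L} (u_i - \bar{u})(v_i - \bar{v})}
                 {\sqrt{\sum_{i=1}^{L} (u_i - \bar{u})^2}\,
                  \sqrt{\sum_{i=1}^{L} (v_i - \bar{v})^2}}

% Evaluation function (cf. Equation 18); tie handling is an assumption.
h(F_j, NF_j, X_n, L) =
\begin{cases}
 \phantom{-}1 & C_L(X_n, F_j) > C_L(X_n, NF_j) \\
 -1 & \text{otherwise}
\end{cases}
```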

The intermediate determination unit 1302 and the final determination unit 1303 compare the evaluation value H(Xn) calculated by (Equation 17) with a determination threshold to decide whether the partial image is a face or a non-face. If H(Xn) is larger than the threshold, the partial image is judged to be a face; otherwise it is judged to be a non-face. The determination threshold is input from the reference parameter holding unit.
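The coarse-to-fine evaluation described in this embodiment can be sketched as follows. This is an illustrative sketch under stated assumptions, not the specification's implementation: the function names, the `stages` data layout (one entry per value of L, each with its own reference parameters and threshold), and the treatment of ties and of zero-variance vectors are all hypothetical.

```python
from typing import List, Sequence, Tuple

import numpy as np

# One reference parameter: (weight alpha_j, face reference F_j, non-face reference NF_j).
RefParam = Tuple[float, np.ndarray, np.ndarray]


def normalized_correlation(u: np.ndarray, v: np.ndarray) -> float:
    """Mean-subtracted, length-normalized correlation of two equal-length vectors."""
    u = u - u.mean()
    v = v - v.mean()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return 0.0 if denom == 0.0 else float(u @ v / denom)


def h(face_ref: np.ndarray, nonface_ref: np.ndarray, x: np.ndarray, L: int) -> int:
    """Return 1 if x correlates better with the face reference, else -1."""
    c_face = normalized_correlation(x[:L], face_ref[:L])
    c_nonface = normalized_correlation(x[:L], nonface_ref[:L])
    return 1 if c_face > c_nonface else -1


def is_face(x: np.ndarray,
            stages: Sequence[Tuple[int, List[RefParam], float]]) -> bool:
    """Evaluate stages from few dimensions to many; reject as soon as one fails."""
    for L, params, threshold in stages:
        H = sum(alpha * h(F, NF, x, L) for alpha, F, NF in params)
        if H <= threshold:
            return False  # judged non-face: stop without using more dimensions
    return True           # passed the final determination


if __name__ == "__main__":
    x = np.arange(16.0)
    stages = [(4, [(1.0, x.copy(), -x)], 0.0), (16, [(1.0, x.copy(), -x)], 0.0)]
    print(is_face(x, stages), is_face(-x, stages))
```

Because most windows in an image are non-faces, most calls return at the first stage, where only the highest-level components are used.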

以上までの方法で、検出処理が行われる。検出を行うためにはあらかじめ、参照パラメータ保持部１３０４に参照パラメータが格納されていなければならない。 The detection process is performed by the method described above. In order to perform detection, the reference parameter must be stored in the reference parameter holding unit 1304 in advance.

FIG. 15 is a flowchart of reference parameter generation in Embodiment 2 of the present invention. Reference parameter generation is described below with reference to FIG. 15.

参照パラメータの生成は、実施の形態１で説明したのと同様であるが、検出処理を行うに際してウェーブレット変換を用いた検出を行うので参照パラメータもウェーブレット変換を用いて生成することが必要となる。 The generation of the reference parameter is the same as that described in the first embodiment. However, since the detection is performed using the wavelet transform when performing the detection process, it is necessary to generate the reference parameter using the wavelet transform.

検出時の説明の際に述べたように、学習画像に対するハールウェーブレット変換により多重解像度解析を行った結果であるウェーブレット特徴を特徴量として用いる。まず、顔と非顔の画像を一定のサイズ（ここでは３６画素×３６画素）に拡縮処理し、それに対し多重解像度解析を行う。多重解像度解析より得られる結果のうち、学習に必要なレベルの成分を１列に並べることにより特徴量ベクトルを得る。この特徴量ベクトルに対して学習処理を行うことで、Ｌ次元の特徴量ベクトルから生成した参照パラメータが生成される。多重解像度解析で生成される複数の次元数（例えば、１２×１２、２４×２４、４８×４８）の特徴量ベクトルに対して学習処理を行うことで、複数次元の参照フォーマットを生成できる。 As described in the description at the time of detection, a wavelet feature that is a result of performing multi-resolution analysis by Haar wavelet transform on a learning image is used as a feature amount. First, the face and non-face images are scaled to a certain size (36 pixels × 36 pixels in this case), and multi-resolution analysis is performed on them. Of the results obtained from the multi-resolution analysis, the feature quantity vector is obtained by arranging the components of the level necessary for learning in one column. By performing learning processing on the feature quantity vector, a reference parameter generated from the L-dimensional feature quantity vector is generated. By performing learning processing on feature quantity vectors of a plurality of dimensions (for example, 12 × 12, 24 × 24, and 48 × 48) generated by multi-resolution analysis, a multi-dimensional reference format can be generated.

When multi-resolution analysis is performed, and wavelet features are thereby obtained, in the course of encoding or decoding in a wavelet-based format, those wavelet features can be used directly as the feature vector, so that the target object can be detected without any separate transform processing. JPEG2000 in particular not only performs encoding and decoding by wavelet multi-resolution analysis but also offers a function for combining multiple resolution levels progressively. With this function, data is transferred from a server to a mobile terminal, PC, or the like in order, starting from the highest-level components of the multi-resolution analysis result. Because this embodiment can calculate evaluation values in order starting from the highest-level components, processing can begin without waiting for all of the data to be transferred, which makes the processing efficient.

また、実施の形態１、２に示す画像処理装置は、図１６のように構成してもよい。 Further, the image processing apparatuses shown in Embodiments 1 and 2 may be configured as shown in FIG.

FIG. 16 is a block diagram of an image processing apparatus according to Embodiment 2 of the present invention. The image processing apparatus shown in FIG. 16 detects a face, the target object, using three feature vector evaluation units: a feature vector evaluation unit (left-facing) 1704, a feature vector evaluation unit (front) 1705, and a feature vector evaluation unit (right-facing) 1706. Even for the same face, the ease of detection differs with its orientation, so it is preferable to perform detection with reference parameters suited to each orientation.

顔の向きに応じた特徴量ベクトルは、各評価部で独立に評価される。このとき、あらかじめ検出に用いられる参照パラメータを、顔の正面と左右向きにあわせたパラメータとして保持しておくことで、顔の正面と左右向きに対応した顔検出を行うことができる。これにより複数の方向を向いた顔検出が可能になり、例えばセキュリティなどにおいてよりフレキシブルな検出処理を行うことができるようになる。 The feature amount vector corresponding to the face orientation is independently evaluated by each evaluation unit. At this time, it is possible to perform face detection corresponding to the front and left / right orientation of the face by holding the reference parameters used for detection in advance as parameters that match the front and left / right orientation of the face. As a result, face detection in a plurality of directions is possible, and for example, more flexible detection processing can be performed in security or the like.

以上のように、多重解像度解析によって得られる特徴のうち、まず高いレベルの成分で検出処理を行い、その後、低いレベルの成分で検出処理に移行することで、検出完了を早期に実現できるため、処理量の削減が可能となる。さらに、高いレベルの成分での検出処理から行われるので、検出精度も高いまま維持できる。 As described above, among the features obtained by multi-resolution analysis, first, detection processing is performed with a high-level component, and then the detection completion can be realized early by shifting to detection processing with a low-level component. The amount of processing can be reduced. Furthermore, since the detection process is performed with a high-level component, the detection accuracy can be kept high.

本発明は、例えば、入力画像から所望の対象物体を検出する画像処理分野等において好適に利用できる。 The present invention can be suitably used, for example, in the field of image processing for detecting a desired target object from an input image.

FIG. 1: Block diagram of the image processing apparatus in Embodiment 1 of the present invention
FIG. 2: Internal configuration diagram of the feature vector evaluation unit in Embodiment 1 of the present invention
FIG. 3: Processing flowchart of the feature vector evaluation unit 5 in Embodiment 1 of the present invention
FIG. 4: Processing flowchart of the feature vector evaluation unit in Embodiment 1 of the present invention
FIG. 5: (a) Face image in Embodiment 1 of the present invention; (b) non-face image in Embodiment 1 of the present invention
FIG. 6: Flowchart showing the flow of the learning process
FIG. 7: Configuration diagram of a learning unit that performs ensemble learning
FIG. 8: Flowchart of detection processing in Embodiment 1 of the present invention
FIG. 9: Processing diagram of detection processing in Embodiment 1 of the present invention
FIG. 10: Configuration diagram of an electronic apparatus including the image processing apparatus in Embodiment 1 of the present invention
FIG. 11: Block diagram of the image processing apparatus in Embodiment 2 of the present invention
FIG. 12: Internal block diagram of the feature vector evaluation unit in Embodiment 2 of the present invention
FIG. 13: Operation flowchart of the feature vector evaluation unit in Embodiment 2 of the present invention
FIG. 14: (a) Original image in Embodiment 2 of the present invention; (b) image after multi-resolution analysis in Embodiment 2 of the present invention
FIG. 15: Flowchart of reference parameter generation in Embodiment 2 of the present invention
FIG. 16: Block diagram of the image processing apparatus in Embodiment 2 of the present invention
FIG. 17: Block diagram of a conventional image processing apparatus
FIG. 18: Block diagram of a conventional image processing apparatus

Explanation of symbols

1 Image processing apparatus
2 Input image
3 Partial image cutout unit
4 Feature vector calculation unit
5 Feature vector evaluation unit
6 Image identification unit
7 Correlation value calculation unit
8 Similarity calculation unit
9, 21 Evaluation value calculation unit
10 Determination unit
11 Reference database
12 Reference parameter
13 Target object reference vector
14 Non-target object reference vector
15 Weight coefficient
16 Detection result
20 Feature vector
22 Intermediate determination unit
23 Final determination unit
24 Reference parameter holding unit
25 Control parameter holding unit
101 CPU
102 ROM
103 Bus
104 RAM
105 HD
107 I/F
108 Camera
601, 1301 Evaluation value calculation unit
602, 1302 Intermediate determination unit
603, 1303 Final determination unit
604, 1304 Reference parameter holding unit
605, 1305 Control parameter holding unit
1101 Reference database storage unit
1102 Filter unit
1103 Learning processing unit
1104 Evaluation data unit
1104 Evaluation unit
1703 Feature vector evaluation unit (left-facing)
1704 Feature vector evaluation unit (front)
1705 Feature vector evaluation unit (right-facing)

Claims

An image processing apparatus for detecting a target object from an input image,
A reference database having a plurality of reference parameters, each being a combination of a reference vector, which includes a target object reference vector that is an image vector of the target object and a non-target object reference vector that is an image vector of a non-target object, and a weight coefficient of the reference vector;
A partial image cutout unit for cutting out a partial image from the input image;
A feature amount vector calculation unit that calculates a feature amount vector of the partial image cut out by the partial image cutout unit;
A feature quantity vector evaluation unit that performs a detection process of detecting a target object from the partial image using the feature quantity vector and the reference parameter;
The feature vector evaluation unit includes an image identification unit that determines inclusion / non-inclusion of the target object in the partial image based on a calculation result calculated using the feature vector and the reference parameter;
An image processing apparatus that ends the detection process when the image identification unit determines that the target object is not included.

The image processing apparatus according to claim 1, wherein the detection process is terminated when there are a plurality of image identification units and at least one of the plurality of image identification units determines that the target object is not included.

The image processing apparatus according to claim 1, further comprising: a correlation value calculation unit that calculates a target object correlation value, which is the correlation value between the feature quantity vector and the target object reference data, and a non-target object correlation value, which is the correlation value between the feature quantity vector and the non-target object reference data; an evaluation value calculation unit that calculates an evaluation value that is the product of the weighting factor and the target object correlation value or the non-target object correlation value calculated by the correlation value calculation unit; and a determination unit that determines inclusion/non-inclusion of the target object in the partial image according to the evaluation value.

The image processing apparatus according to claim 3, wherein the determination unit comprises: an intermediate determination unit that repeatedly performs the inclusion/non-inclusion determination using different reference parameters while the number of calculation processes performed in the evaluation value calculation unit is no more than a predetermined number; and a final determination unit that performs the inclusion/non-inclusion determination when the number of calculation processes performed in the evaluation value calculation unit is at least the predetermined number.

The feature quantity vector evaluation unit continues detection processing using the different reference parameters when at least one of the plurality of image determination units determines that the target object is included. The image processing apparatus described.

The image processing apparatus according to claim 1, wherein the feature vector evaluation unit uses a plurality of different reference parameters in descending order of weighting factors.
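
The ordering in this claim can be illustrated with a short sketch. The `ReferenceParameter` record and its field names are hypothetical; the claim only states that each reference parameter pairs a reference vector with a weighting factor.

```python
from typing import NamedTuple, Sequence

class ReferenceParameter(NamedTuple):
    reference_vector: Sequence[float]  # target or non-target object reference vector
    weight: float                      # weighting factor of the reference vector
    is_target: bool                    # True for a target object reference vector

def order_by_weight(parameters):
    """Return the reference parameters sorted by descending weighting factor."""
    return sorted(parameters, key=lambda p: p.weight, reverse=True)
```

A plausible motivation, not stated in the claim itself, is that parameters with large weighting factors influence the result most, so using them first lets most non-target partial images be rejected after few calculations.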

The image processing apparatus according to claim 1, wherein the feature quantity vector evaluation unit uses the target object reference data in ascending order of the number of dimensions of the non-target object reference data.

The image processing apparatus according to claim 1, wherein the reference parameter is generated by ensemble learning from the target object reference vector and the non-target object reference vector.

The image processing apparatus according to claim 8, wherein the ensemble learning uses a boosting technique.
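
The claims name boosting only in general terms. As one concrete instance, the sketch below shows a single round of discrete AdaBoost, in which the learner weight `alpha` could play the role of the weighting factor of a reference parameter. The choice of AdaBoost, the function name `adaboost_round`, and the use of +1/-1 labels for target and non-target samples are all assumptions.

```python
import math

def adaboost_round(sample_weights, labels, predictions):
    """One round of discrete AdaBoost.

    sample_weights: non-negative weights summing to 1
    labels, predictions: sequences of +1 (target) or -1 (non-target)
    Returns (alpha, updated_weights).
    """
    # Weighted error of the weak learner on the training samples.
    error = sum(w for w, y, h in zip(sample_weights, labels, predictions) if y != h)
    # Clamp to avoid division by zero or log of zero for a perfect or useless learner.
    error = min(max(error, 1e-10), 1.0 - 1e-10)
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified samples gain weight, correctly classified ones lose weight.
    updated = [w * math.exp(-alpha * y * h)
               for w, y, h in zip(sample_weights, labels, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]
```

With four equally weighted samples and one misclassification, the weighted error is 0.25, `alpha` is 0.5 * ln 3, and the misclassified sample ends up holding half of the total weight.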

The image processing apparatus according to claim 1, wherein the feature vector is calculated by a filter process that calculates at least one of an edge feature and a multi-resolution feature obtained by a Haar wavelet transform (hereinafter referred to as “wavelet feature”).

The image processing apparatus according to claim 10, wherein the wavelet feature is obtained by octave division.
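
Octave division of a Haar wavelet transform can be sketched as repeated splitting of the low-frequency band. The sketch below uses the averaging form of the 2D Haar transform on 2 x 2 blocks and assumes image sides divisible by 2 at every level; the normalization and the names `haar_level` and `octave_decompose` are illustrative choices, not taken from the patent.

```python
def haar_level(image):
    """One level of a 2D Haar transform on a list-of-rows image with even sides.

    Returns (approx, (dx, dy, dxy)): the low-frequency band and three detail
    bands (left-right, top-bottom, and diagonal differences).
    """
    approx, dx, dy, dxy = [], [], [], []
    for r in range(0, len(image), 2):
        row_a, row_x, row_y, row_xy = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            e, d = image[r + 1][c], image[r + 1][c + 1]
            row_a.append((a + b + e + d) / 4)   # block average
            row_x.append((a - b + e - d) / 4)   # left minus right
            row_y.append((a + b - e - d) / 4)   # top minus bottom
            row_xy.append((a - b - e + d) / 4)  # diagonal difference
        approx.append(row_a)
        dx.append(row_x)
        dy.append(row_y)
        dxy.append(row_xy)
    return approx, (dx, dy, dxy)

def octave_decompose(image, levels):
    """Apply haar_level repeatedly to the low-frequency band only."""
    details = []
    current = image
    for _ in range(levels):
        current, bands = haar_level(current)
        details.append(bands)
    return current, details
```

Each level halves the resolution, so `details[0]` holds the finest detail bands and the last entry the coarsest.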

The image processing apparatus according to claim 11, wherein the feature vector evaluation unit sequentially uses high-level components obtained by the octave division.

The image processing apparatus according to claim 1, wherein the input image is at least one of encoded and decoded in a wavelet-compatible format.

The image processing apparatus according to claim 13, wherein the wavelet-compatible format corresponds to a JPEG2000 format.

An image processing method for detecting a target object from an input image,
A parameter setting step of setting a plurality of parameters, each of which is a combination of a reference vector and a weighting factor of the reference vector, the reference vector including a target object reference vector that is an image vector of the target object and a non-target object reference vector that is an image vector of a non-target object;
A partial image cutout step of cutting out a partial image from the input image;
A feature amount vector calculating step of calculating a feature amount vector of the clipped partial image;
A feature vector evaluation step of performing, using the feature vector and the parameters, a detection process for detecting the target object from the partial image,
The feature vector evaluation step includes an image determination step for determining inclusion / non-inclusion of the target object in the partial image based on a calculation result calculated using the feature vector and the parameter;
An image processing method for ending the detection process when the plurality of image determination steps determine that the target object is not included.
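
The steps of this method can be strung together as in the sketch below. It is a simplified illustration under several assumptions: partial images are cut out with a sliding square window, the feature vector is just the flattened pixel values (claim 10 names edge and wavelet features instead), and each parameter carries its own hypothetical rejection threshold. All function names are illustrative.

```python
def cut_out(image, size, stride=1):
    """Partial image cutout step: yield (row, col, partial_image) for each window."""
    for r in range(0, len(image) - size + 1, stride):
        for c in range(0, len(image[0]) - size + 1, stride):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]

def feature_vector(partial):
    """Feature vector calculation step (placeholder: flattened pixel values)."""
    return [p for row in partial for p in row]

def evaluate(feature, parameters):
    """Feature vector evaluation step with early termination.

    parameters: iterable of (reference_vector, weight, threshold) tuples.
    """
    for reference, weight, threshold in parameters:
        score = weight * sum(f * r for f, r in zip(feature, reference))
        if score < threshold:
            return False  # non-inclusion: end detection for this partial image
    return True

def detect_targets(image, parameters, size, stride=1):
    """Return the top-left positions of partial images judged to include the target."""
    return [(r, c) for r, c, partial in cut_out(image, size, stride)
            if evaluate(feature_vector(partial), parameters)]
```

Because `evaluate` returns at the first failing parameter, most non-target windows cost only one inner product.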

The image processing method according to claim 15, wherein the image determination step includes: a correlation value calculating step of calculating a target object correlation value between the feature quantity vector and the target object reference data, or a non-target object correlation value between the feature quantity vector and the non-target object reference data; and an evaluation value calculating step of calculating an evaluation value that is a product of the correlation value and the weighting factor.