JP2010108262A

JP2010108262A - Object detection apparatus and control method

Info

Publication number: JP2010108262A
Application number: JP2008279871A
Authority: JP
Inventors: Katsuhiko Kawasaki; 勝彦川崎
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2008-10-30
Filing date: 2008-10-30
Publication date: 2010-05-13

Abstract

PROBLEM TO BE SOLVED: To detect an object at high speed by using a distance image to narrow the possible positions and sizes of a partial image used for object detection.

SOLUTION: To set the range in which the partial image can be scanned, points at which a distance gap is present in the distance image are extracted, and the center of the partial image is prevented from moving in the vicinity of those points. The partial image is then scanned by moving it while changing its position and size within the input image corresponding to the distance image, and it is determined whether the scanned partial image is an object.

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to an object detection apparatus and method for detecting an object such as a face.

In the face detection method described in Patent Document 1, a distance image is created, a skin color region image and an edge image are extracted, and a contour model corresponding to the depth information of the distance image is created. The face region is then extracted based on the correlation between the contour model and the skin color image and edge image.

In the object detection method described in Patent Document 2, a partial image is moved within the input image while its position and size are changed. Whether the partial image is a face is determined by comparison with rectangular features (weak hypotheses); this is weak identification. Whether each weak hypothesis holds is computed by the integral image method. The weighting coefficient of each weak hypothesis is acquired in advance by learning. Whether the partial image is a face is finally decided by a weighted majority vote of the weak hypotheses (strong identification).

In the image processing method described in Patent Document 3, the distance from a vehicle to a traffic light is calculated by looking up the position of the traffic light from the position of the vehicle and map data. The traffic light is then detected with a template whose size is adjusted based on the position of, and distance to, the traffic light ahead of the vehicle.
Patent Document 1: JP 2002-216129 A
Patent Document 2: US 7099510
Patent Document 3: JP 2007-004256 A

In Patent Document 1, the face region is extracted using the distance image and skin color; however, because actual skin color is ambiguous, it is difficult to extract the face region accurately. This method is therefore not suited to detecting an object composed of features whose positions and sizes are fixed to some extent, such as eyes and a mouth.

In Patent Document 2, the partial image is moved over the entire area of the input image while its position and size are changed, so the amount of calculation is large.

The method of Patent Document 3 cannot detect an object, such as a human face, that can appear at various positions and exhibits varying patterns.

The present invention has been made to solve the problems described above, and its object is to enable high-speed object detection by using a distance image to narrow the range that a partial image for object detection can take.

An object detection apparatus according to the present invention includes: imaging means for inputting an input image; distance measuring means for measuring the distance to each object shown in the input image; distance image creating means for creating a distance image corresponding to the input image; object detection frame setting means for setting an object detection frame according to the distance to an object using the distance image; scan area setting means for setting, within the set object detection frame, an area that can be scanned as a partial image; a dictionary storing features of an object; and object detection means for determining, using the dictionary, whether a partial image scanned using the set area is an object. The scan area setting means determines the range that can be scanned as the partial image by extracting points in the distance image where a distance gap exists and preventing the center of the partial image from moving in the vicinity of those points. The object detection means performs scanning by moving the partial image while changing its position and size within the input image corresponding to the distance image.

In the present invention, the distance image is used to narrow the positions and sizes that the partial image for object detection can take. As a result, an object composed of features whose positions and sizes are fixed, such as eyes and a mouth, can be detected at high speed.

<First Embodiment>
Preferred embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a configuration diagram of the object detection apparatus according to the present invention. This embodiment describes a camera (which may be a video camera) with a face detection function. Although face detection is described here, objects other than faces can be detected in the same manner. As shown in FIG. 1, the object detection apparatus 10 comprises an imaging unit 11, a distance measuring unit 12, a distance image creation unit 13, an object detection frame setting unit 14, a scan area setting unit 15, an object detection unit 16, a face dictionary 17, a CPU 18, a ROM 19, and a RAM 20.

FIG. 2 is a flowchart of the face detection processing in the object detection apparatus 10.

First, in step S101, an image is input from the imaging unit 11. The input image is shown in FIG. 3(a).

In step S102, the distance measuring unit 12 measures the distance to each object shown in the input image.

In step S103, the distance image creation unit 13 creates a distance image (FIG. 3(b)) corresponding to the input image (FIG. 3(a)) based on the distance to each object. The distance image may be created by any method, such as the lens focus method, the stereo method, or the optical radar method. In the distance image of FIG. 3(b), nearer objects are displayed in lighter shades. As shown in FIG. 3(b), regions regarded as being at substantially the same distance from the imaging unit are labeled A1, A2, A3, ..., A10 in order of increasing distance from the imaging unit.

In step S104, the object detection frame setting unit 14 sets a detection frame for detecting a face. Here, a detection frame Frame1 containing the nearest region A1 is set first, as shown in FIG. 3(c).

In step S105, the scan area setting unit 15 extracts points in the distance image where a distance gap exists. Within the detection frame Frame1 set as shown in FIG. 4(a), the distance image is scanned at fixed intervals, and points where the distance differs from that of region A1 by at least a fixed value (for example, 20 cm or more in depth) are extracted (FIG. 4(b)). The distance gap points GP for region A1 are as shown in FIG. 4(c), and the non-scan area NG is as shown in FIG. 4(d).
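As an illustration only, the extraction in step S105 might be sketched in Python as follows. The function name, the list-of-rows depth layout, the use of metres, and the use of an absolute difference are assumptions made for this sketch; the specification states only that points differing from region A1 by a fixed value or more (for example, 20 cm) are extracted at fixed scanning intervals.

```python
from typing import List, Tuple

Depth = List[List[float]]          # depth[y][x]: distance in metres (assumed unit)
Frame = Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 and y1 exclusive


def extract_gap_points(depth: Depth, frame: Frame, region_distance: float,
                       min_gap: float = 0.20, step: int = 4) -> List[Tuple[int, int]]:
    """Return the sampled (x, y) points inside the frame whose depth differs
    from the region's distance by at least min_gap (hypothetical helper)."""
    x0, y0, x1, y1 = frame
    points = []
    # Visit every `step`-th row and column, mirroring the fixed scanning interval.
    for y in range(y0, y1, step):
        for x in range(x0, x1, step):
            if abs(depth[y][x] - region_distance) >= min_gap:
                points.append((x, y))
    return points


# A face-like region at about 1.0 m next to a background at 3.0 m.
depth_map = [[1.0, 1.0, 3.0],
             [1.0, 1.05, 3.0]]
gap_points = extract_gap_points(depth_map, (0, 0, 3, 2), region_distance=1.0, step=1)
```

Setting `step` to 1 corresponds to the one-pixel scanning interval; larger values trade accuracy of the gap outline for speed.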

Next, in step S106, the scan area setting unit 15 sets a non-scan area NG for the partial image SW (FIG. 5(d)) in the vicinity of each distance gap point GP, as shown in FIG. 5(a). The size of this non-scan area is set to the minimum size of the partial image SW (FIG. 5(d)) that is scanned for face detection at varying sizes within the target region AX (X = 1, 2, ...). Setting a non-scan area NG for every distance gap point GP extracted for region A1 gives FIG. 5(b). As a result, within the detection frame Frame1 set for region A1, the scan area OK (the scannable area) over which the center of the partial image SW used for face detection moves (FIG. 5(d)) is as shown in FIG. 5(c).
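The construction of the scan area in step S106 can be sketched as a boolean mask over the detection frame. This is a minimal sketch under stated assumptions: the name `build_scan_mask` is hypothetical, and the non-scan area is taken to be a square of roughly the minimum partial-image size centred on each distance gap point, which is one reading of the text rather than a shape the specification fixes.

```python
from typing import List, Tuple

Frame = Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 and y1 exclusive


def build_scan_mask(frame: Frame, gap_points: List[Tuple[int, int]],
                    min_window: int) -> List[List[bool]]:
    """Return mask[y - y0][x - x0], True where the centre of the partial
    image may be placed (scan area OK) and False in the non-scan area NG."""
    x0, y0, x1, y1 = frame
    mask = [[True] * (x1 - x0) for _ in range(y1 - y0)]
    half = min_window // 2  # assumed half-width of the square NG block
    for gx, gy in gap_points:
        # Block a square around the gap point, clipped to the frame.
        for y in range(max(y0, gy - half), min(y1, gy + half + 1)):
            for x in range(max(x0, gx - half), min(x1, gx + half + 1)):
                mask[y - y0][x - x0] = False
    return mask


mask = build_scan_mask((0, 0, 5, 5), [(0, 0)], min_window=2)
scannable = sum(cell for row in mask for cell in row)
```

The proportion of True cells in the mask corresponds to the fraction of the frame that remains scannable.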

In step S107, the object detection unit 16 scans the partial image SW (a rectangle) within the input image corresponding to the distance image (FIG. 6(b)), using the scan area obtained from the distance image (FIG. 6(a)), and detects faces. Face detection is described with reference to FIGS. 6 to 10. As shown in FIG. 6(b), the partial image SW moves within the detection frame Frame1 set in the input image while its position and size are changed, on the condition that its center lies within the scan area OK. Whether the partial image SW (FIG. 6(b)) is a face is determined by the same method as in US 7099510, as described below. First, a face dictionary DIC (FIG. 9) is prepared that stores rectangular features Feature(i) of differing positions and sizes (examples are shown in FIGS. 8(a) to 8(d)) together with their corresponding weighting coefficients C(i) (FIG. 9). In the face dictionary DIC of FIG. 9, the rectangular features Feature(i) are arranged in order of importance as determined by learning with a training image set. Whether the partial image SW (FIG. 6(b)) satisfies the weak identification conditions for a face is then determined using the integral image method. In the integral image method, each point (x, y) of the input image is assigned the integral image value
$$II(x, y) = \int_0^x \int_0^y I(x', y') \, dy' \, dx'$$
where I(x, y) is the luminance value on the image plane. The integral image value II(x, y) can be calculated with only a single reference to the luminance value I(x, y) of each pixel, using the following recurrence:
$$S(x, y) = S(x, y-1) + I(x, y)$$
$$II(x, y) = II(x-1, y) + S(x, y)$$
$$S(x, -1) = 0$$
$$II(-1, y) = 0$$
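The recurrence above can be followed step by step in a short Python sketch. The function name and the list-of-rows image layout are assumptions for illustration; the loop body applies S(x, y) = S(x, y-1) + I(x, y) and II(x, y) = II(x-1, y) + S(x, y) with the two zero boundary conditions, touching each pixel once.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Compute II(x, y) for img[y][x] using the column-sum recurrence."""
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * width for _ in range(height)]
    for x in range(width):
        s = 0  # S(x, -1) = 0
        for y in range(height):
            s += img[y][x]                       # S(x, y) = S(x, y-1) + I(x, y)
            left = ii[y][x - 1] if x > 0 else 0  # II(-1, y) = 0
            ii[y][x] = left + s                  # II(x, y) = II(x-1, y) + S(x, y)
    return ii


table = integral_image([[1, 2],
                        [3, 4]])
# table is [[1, 3], [4, 10]]
```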

Once II(x, y) has been calculated for the whole image, the mean luminance of any rectangular region in the image can be calculated from the integral image values at its vertices alone. For example, the mean luminance M(ACDB) of the rectangular region ACDB in FIG. 7 is
$$M(ACDB) = \frac{II(D) - II(B) - II(C) + II(A)}{|ACDB|}$$
where |ACDB| denotes the area of the rectangle.
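The four-corner lookup can be sketched as follows. To keep the corner indexing simple, this sketch assumes a zero-padded table `p` in which `p[y][x]` holds the sum of all pixels above and to the left of (x, y); that padding convention and both function names are illustrative choices, not something taken from the specification.

```python
from typing import List


def padded_integral(img: List[List[int]]) -> List[List[int]]:
    """p[y][x] = sum of img over rows < y and columns < x (zero border)."""
    height, width = len(img), len(img[0])
    p = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            p[y + 1][x + 1] = p[y][x + 1] + row_sum
    return p


def rect_mean(p: List[List[int]], x0: int, y0: int, x1: int, y1: int) -> float:
    """Mean luminance over columns x0..x1-1 and rows y0..y1-1.

    Corresponds to (II(D) - II(B) - II(C) + II(A)) / |ACDB| with A the
    top-left corner and D the bottom-right corner of the rectangle.
    """
    total = p[y1][x1] - p[y0][x1] - p[y1][x0] + p[y0][x0]
    return total / ((x1 - x0) * (y1 - y0))


p = padded_integral([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9]])
lower_right_mean = rect_mean(p, 1, 1, 3, 3)  # pixels 5, 6, 8, 9
```

Whatever the rectangle size, the lookup costs four table reads, which is what makes evaluating many rectangular features per partial image affordable.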

Next, for the partial image SW of the input image in FIG. 10, the value f(i), obtained by subtracting the mean luminance of the black sub-rectangle from the mean luminance of the white sub-rectangle corresponding to the rectangular feature Feature(i), is computed by the integral image method. If f(i) satisfies a given threshold condition, the partial image SW is judged to satisfy the weak identification condition for a face and h(i) = 1 is set; otherwise h(i) = 0. Next, whether the partial image SW satisfies the strong identification condition for a face is determined. The face dictionary of FIG. 9 stores the rectangular features that make up a face and their corresponding coefficients C(t) (t = 1 to T, where T is a value determined during learning). If the following strong identification condition holds for the weak identification results h(t) (t = 1 to T), the partial image SW of FIG. 10 is finally determined to be a face:
$$\sum_{t=1}^{T} C(t) \, h(t) \ge \frac{1}{2} \, \theta \sum_{t=1}^{T} C(t)$$
where θ is a threshold.
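The weak and strong identification steps can be illustrated with a small Python sketch. The text does not specify the form of the threshold condition on f(i), so the comparison `f_value >= threshold` below is an assumption, as are the function names; `strong_identify` evaluates the weighted-vote inequality as written.

```python
from typing import Sequence


def weak_identify(f_value: float, threshold: float) -> int:
    """Return h(i): 1 if f(i) meets the threshold condition, else 0.
    The '>=' form of the condition is assumed for illustration."""
    return 1 if f_value >= threshold else 0


def strong_identify(coeffs: Sequence[float], h: Sequence[int],
                    theta: float = 1.0) -> bool:
    """True if sum C(t) h(t) >= (1/2) * theta * sum C(t)."""
    weighted_votes = sum(c * vote for c, vote in zip(coeffs, h))
    required = 0.5 * theta * sum(coeffs)
    return weighted_votes >= required


coeffs = [2.0, 1.0, 1.0]  # C(t), hypothetical learned weights
votes = [weak_identify(f, 0.5) for f in (0.9, 0.1, 0.2)]  # -> [1, 0, 0]
is_face = strong_identify(coeffs, votes)  # 2.0 >= 0.5 * 1.0 * 4.0
```

In this example a single heavily weighted feature is enough to reach the required half of the total weight; raising `theta` makes the decision stricter.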

Frame1 in FIG. 3(c) was set for region A1 using the distance image; in addition, the approximate distance to region A1 and the size of a real face narrow the range of sizes of a face appearing in region A1. Let FSIZE(A1) be the standard size of a face that can exist in region A1. The partial image SW within Frame1 in FIG. 6(b) then detects faces by moving while its size is changed in steps of a factor of 1.2 over the range from 1/√2 times to √2 times FSIZE(A1). Suppose the input image is 240 pixels high by 360 pixels wide and the minimum size of the partial image is 20 by 20 pixels. With the conventional method of US 7099510, the range of sizes taken by the partial image is approximately log(240/20)/log(1.2). In this embodiment, the face size is estimated from the distance from the imaging unit, and the size of the partial image is varied from 1/√2 times to √2 times FSIZE(A1), so the range of sizes is approximately log(2)/log(1.2). That is, compared with the conventional example, the range of sizes taken by the partial image is reduced to approximately log(2)/log(240/20), or about 0.29 times. Furthermore, if the proportion of Frame1 occupied by the scan area OK in FIG. 5(c) is P1 (< 1), the amount of calculation for detecting faces within Frame1 becomes 0.29 × P1 times (< 0.29).
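The arithmetic of this comparison can be checked with a few lines of Python. The helper names and the example value FSIZE = 40 pixels are assumptions for illustration. The unrounded ratio log(2)/log(12) evaluates to roughly 0.28, in line with the approximate figure of 0.29 given in the text.

```python
import math
from typing import List


def scale_steps(size_ratio: float, factor: float = 1.2) -> float:
    """Number of multiplicative steps of `factor` needed to span size_ratio."""
    return math.log(size_ratio) / math.log(factor)


def window_sizes(fsize: float, factor: float = 1.2) -> List[float]:
    """Partial-image sizes from fsize/sqrt(2) up to fsize*sqrt(2)."""
    low, high = fsize / math.sqrt(2), fsize * math.sqrt(2)
    sizes = []
    size = low
    while size <= high:
        sizes.append(size)
        size *= factor
    return sizes


conventional = scale_steps(240 / 20)  # whole-image search, about 13.6 steps
proposed = scale_steps(2.0)           # sqrt(2) / (1 / sqrt(2)) = 2, about 3.8 steps
reduction = proposed / conventional   # log(2) / log(12)
sizes_for_40px_face = window_sizes(40.0)  # hypothetical FSIZE of 40 pixels
```

For the hypothetical 40-pixel face, only four window sizes (about 28, 34, 41, and 49 pixels) fall inside the permitted range.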

In step S108, it is determined whether there is a next detection frame. If there is, the processing of steps S104 to S107 is performed on the next detection frame to detect faces. In this embodiment, a detection frame Frame2 is set for region A2 as shown in FIG. 11(a), and the processing is performed. Similarly, detection frames are set for regions A3, A4, and A5 as shown in FIGS. 11(b), 11(c), and 11(d), and faces are detected. The result of face detection over all regions A1 to A10 is shown in FIG. 12.
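The loop over detection frames, with the partial image restricted to centres inside the scan area, might be sketched as follows. Everything named here (`scan_windows`, `detect_in_frames`, the `stride` parameter, and the `is_face` callback standing in for the dictionary-based identification) is hypothetical; the sketch shows only the control flow, not the detector itself.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Frame = Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 and y1 exclusive
Window = Tuple[int, int, int]      # (centre x, centre y, size)


def scan_windows(frame: Frame, ok: List[List[bool]], sizes: Sequence[int],
                 stride: int = 2) -> Iterator[Window]:
    """Yield windows whose centre lies in the scan area of this frame."""
    x0, y0, x1, y1 = frame
    for size in sizes:
        for cy in range(y0, y1, stride):
            for cx in range(x0, x1, stride):
                if ok[cy - y0][cx - x0]:  # skip centres in the non-scan area
                    yield cx, cy, size


def detect_in_frames(frames: Sequence[Frame], masks: Sequence[List[List[bool]]],
                     sizes_per_frame: Sequence[Sequence[int]],
                     is_face: Callable[[int, int, int], bool]) -> List[Window]:
    """Process each detection frame in turn and collect accepted windows."""
    hits = []
    for frame, ok, sizes in zip(frames, masks, sizes_per_frame):
        for cx, cy, size in scan_windows(frame, ok, sizes):
            if is_face(cx, cy, size):
                hits.append((cx, cy, size))
    return hits


frame1 = (0, 0, 4, 4)
mask1 = [[True] * 4 for _ in range(4)]
mask1[0][0] = False  # one blocked centre
candidates = list(scan_windows(frame1, mask1, [20]))
hits = detect_in_frames([frame1], [mask1], [[20]],
                        lambda cx, cy, size: (cx, cy) == (2, 2))
```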

<Modification 1>
In step S105 of FIG. 2 in the first embodiment, the interval between scanning lines used when extracting distance gap points (FIG. 4(a)) may be narrower; the distance image may be scanned at intervals as small as one pixel. Conversely, the interval between scanning lines (FIG. 4(a)) may be wider.

<Modification 2>
The distance measurement in step S102 and the distance image creation in step S103 of FIG. 2 need not be performed every time; they may be performed at fixed time intervals (for example, every 10, 30, or 60 seconds).

<Other Embodiments>
The first embodiment described above is merely a concrete example of carrying out the present invention, and the technical scope of the present invention must not be construed restrictively on its basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

The present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a storage medium (recording medium). Specifically, it may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, an imaging device, and a web application), or to an apparatus consisting of a single device.

The present invention is also achieved by supplying a software program that implements the functions of the embodiment described above to a system or apparatus, directly or remotely, and having a computer of that system or apparatus read and execute the supplied program code. The program in this case is a computer-readable program corresponding to the flowchart shown in the drawings of the embodiment.

Accordingly, the program code itself, installed on a computer in order to implement the functional processing of the present invention on that computer, also implements the present invention. In other words, the present invention includes the computer program itself for implementing the functional processing of the present invention.

In that case, as long as it has the functions of the program, it may take the form of object code, a program executed by an interpreter, script data supplied to an OS, or the like.

Recording media for supplying the program include, for example, a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card, ROM, and DVD (DVD-ROM, DVD-R).

The program may also be supplied as follows. A browser on a client computer connects to a web page on the Internet, and the computer program of the present invention itself (or a compressed file including an automatic installation function) is downloaded from there to a recording medium such as a hard disk. Alternatively, the program code constituting the program of the present invention may be divided into a plurality of files, with each file downloaded from a different web page. In other words, a WWW server that allows a plurality of users to download the program files for implementing the functional processing of the present invention on a computer is also included in the present invention.

It is also possible to encrypt the program of the present invention, store it on a storage medium such as a CD-ROM, and distribute it to users, and to let users who satisfy predetermined conditions download key information for decryption from a web page via the Internet. Such a user can then use the key information to execute the encrypted program and install it on a computer.

The functions of the embodiment described above are implemented when a computer executes the program it has read. In addition, an OS or the like running on the computer may perform part or all of the actual processing based on the instructions of the program, and the functions of the embodiment described above can also be implemented by that processing.

Furthermore, the functions of the embodiment described above are also implemented when the program read from the recording medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, and is then executed. That is, a CPU or the like provided on the function expansion board or in the function expansion unit can perform part or all of the actual processing based on the instructions of the program.

FIG. 1 is a configuration diagram of the object detection apparatus according to the present invention.
FIG. 2 is a flowchart of the face detection processing.
FIG. 3: (a) shows an input image; (b) shows the distance image corresponding to the input image; (c) shows the detection frame set for region A1.
FIG. 4: (a) shows the scanning performed to extract points where a distance gap exists within the detection frame; (b) shows the distance image being scanned at fixed intervals to extract points where a distance gap exists within the detection frame; (c) shows the extracted points where a distance gap exists within the detection frame; (d) shows the non-scan area.
FIG. 5: (a) shows the non-scan area in the vicinity of a point where a distance gap exists; (b) shows the non-scan areas set for all points where a distance gap exists; (c) shows the scan area finally obtained; (d) shows the movement of the center of the partial image used to detect a face.
FIG. 6: (a) shows the partial image for detecting a face, as obtained from the distance image; (b) shows the partial image for detecting a face within the input image.
FIG. 7 illustrates the integral image method.
FIG. 8 shows rectangular features.
FIG. 9 shows the face dictionary.
FIG. 10 illustrates weak identification.
FIG. 11: (a) to (d) show the detection frames set for regions A2 to A5, respectively.
FIG. 12 shows the face detection result.

Explanation of symbols

11 Imaging unit
12 Distance measuring unit
13 Distance image creation unit
14 Object detection frame setting unit
15 Scan area setting unit
16 Object detection unit
17 Face dictionary
18 CPU
19 ROM
20 RAM

Claims

An object detection apparatus comprising:
imaging means for inputting an input image;
distance measuring means for measuring the distance to each object shown in the input image;
distance image creating means for creating a distance image corresponding to the input image;
object detection frame setting means for setting an object detection frame according to the distance to an object using the distance image;
scan area setting means for setting, within the set object detection frame, an area that can be scanned as a partial image;
a dictionary storing features of an object; and
object detection means for determining, using the dictionary, whether the partial image scanned using the set area is an object,
wherein the scan area setting means determines the range that can be scanned as the partial image by extracting points in the distance image where a distance gap exists and preventing the center of the partial image from moving in the vicinity of those points, and
the object detection means performs scanning by moving the partial image while changing its position and size within the input image corresponding to the distance image.

The object detection apparatus according to claim 1, wherein the object is a face.

The object detection apparatus according to claim 1, wherein the distance image is created by any one of a lens focus method, a stereo method, and an optical radar method.

The object detection apparatus according to claim 3, wherein the creation of the distance image is performed at regular time intervals.

A method for controlling an object detection apparatus, comprising:
an imaging step of inputting an input image by imaging means;
a distance measuring step of measuring, by distance measuring means, the distance to each object shown in the input image;
a distance image creating step of creating, by distance image creating means, a distance image corresponding to the input image;
an object detection frame setting step of setting, by object detection frame setting means, an object detection frame according to the distance to an object using the distance image;
a scan area setting step of setting, by scan area setting means, an area within the set object detection frame that can be scanned as a partial image; and
an object detection step of determining, by object detection means, whether the partial image scanned using the set area is an object, using a dictionary storing features of an object,
wherein, in the scan area setting step, the range that can be scanned as the partial image is determined by extracting points in the distance image where a distance gap exists and preventing the center of the partial image from moving in the vicinity of those points, and
in the object detection step, scanning is performed by moving the partial image while changing its position and size within the input image corresponding to the distance image.

A program for causing a computer to function as the means constituting the object detection apparatus according to any one of claims 1 to 4.