JP2008146106A

JP2008146106A - Equipment for detecting three-dimensional object

Info

Publication number: JP2008146106A
Application number: JP2006328816A
Authority: JP
Inventors: Akira Iiboshi; 明飯星; Satoko Kojo; 聡子古城; Zhencheng Hu; 振程胡; Keiichi Uchimura; 圭一内村; Tetsuya Kawamura; 哲也川村
Original assignee: Honda Motor Co Ltd; Kumamoto University NUC
Current assignee: Honda Motor Co Ltd; Kumamoto University NUC
Priority date: 2006-12-05
Filing date: 2006-12-05
Publication date: 2008-06-26
Anticipated expiration: 2026-12-05
Also published as: JP4873711B2

Abstract

PROBLEM TO BE SOLVED: To detect a three-dimensional object more efficiently.

SOLUTION: A device for detecting a three-dimensional object comprises: foreground acquisition means for removing the background from an image that includes the three-dimensional object as a target and acquiring a foreground image region; inclination calculating means for calculating the inclination of the center line of the foreground image region with respect to a predetermined reference axis; region extraction means for extracting a detection region from the foreground image region based on the inclination; and object detection means for detecting the object from the detection region. Because the detection region is extracted based on the inclination of the center line, the detection region can be set without excluding the object even if the object is inclined. In one embodiment, when the inclination is equal to or greater than a predetermined value, the detection region is extracted by lines parallel to the center line.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an apparatus for detecting a three-dimensional object.

Various methods for detecting three-dimensional objects have been proposed. For example, Patent Document 1 below describes a method that uses first and second threshold values to extract an occupant's body and face position from the temperature data of a thermal image obtained from the output of a matrix-type infrared sensor. Patent Document 2 below describes a method in which an image obtained by an imaging device is grouped according to distance data, thinning is applied to each group, and an image portion with a high degree of linearity is judged to be a human head if it can be approximated by a circle or an ellipse.

According to the technique of Non-Patent Document 1 below, the center line of the foreground part of the image is obtained, the point one third of the way up from the bottom is taken as the center of gravity Cfg, and the image is cut out vertically at equal distances α to the left and right. The center line is then obtained again in the cut-out region. The inclination of this center line is obtained, and the shoulder position is found by applying a T-shaped edge filter along the inclination. The intersection of the shoulder shape, approximated by a quadratic curve, with the center line is taken as the neck position. The part above the neck is detected as the head.

Patent Document 1: JP-A-10-315908
Patent Document 2: JP 2003-281551 A
Non-Patent Document 1: Daniel B. Russakoff, Martin Herman, “Head tracking using stereo”, Machine Vision and Applications, 2002

To detect a desired object with better accuracy and to reduce the computational load, it is desirable to identify the region of the image in which the object should be detected. If that region can be confined within the image, the object can be detected from it with good accuracy, and the entire image need not be processed. In a method such as that of Non-Patent Document 1, the image is cut out vertically at equal distances to the left and right of the center line, and the human head is detected in the cut-out image. Because such a method always cuts the image out vertically, part of the object may be excluded from the cut-out image when the object is strongly inclined, and the object may therefore not be detected correctly.

Therefore, there is a need for a technique that can detect an object more accurately while reducing the computational load, even when the object is strongly inclined.

According to one aspect of the invention, an apparatus for detecting a three-dimensional object comprises: means for removing the background from an image that includes the three-dimensional object as a target and acquiring a foreground image region; means for calculating the inclination (k) of the center line (CL) of the foreground image region with respect to a predetermined reference axis; means for extracting a detection region from the foreground image region based on the inclination; and means for detecting the object from the detection region.

According to the invention, even when the object to be detected is inclined, a detection region matching the inclination is extracted, so the detection region can be extracted optimally without excluding the object. The image that includes the three-dimensional object as a target can be obtained by an imaging device. Alternatively, it may be an image obtained by a distance measuring device that measures the distance to the target.

According to one embodiment of the invention, first and second center positions (Ma, Mb) are calculated within the foreground image region, each representing the position that divides the number of pixels present in the pixel row serving as the first or second reference line (La, Lb) into two equal parts, and the line connecting the first and second center positions is taken as the center line. By using the inclination of a center line obtained so as to bisect the foreground image region, the detection region can be extracted along the inclination of the object.

According to one embodiment of the invention, when the inclination is equal to or greater than a predetermined value, the detection region is extracted by lines (PL1, PL2) parallel to the center line. Because the detection region is extracted parallel to the center line of the foreground region, it can be extracted without excluding the object even when the object is strongly inclined.

According to one embodiment of the invention, the parallel lines are set so as to have predetermined widths to the left and right of the center line. In one embodiment, when the inclination is larger than a predetermined value, the relative sizes of the leftward width (A) and the rightward width (B) can be set based on the relative sizes of the distance of the target contained in the image region to the left of the center line and the distance of the target in the image region to the right of the center line. In one embodiment, when the inclination is smaller than the predetermined value, the detection region is extracted by cutting out the foreground image region with vertical lines (VL1, VL2) set to have the same width to the left and right of one of the first and second center positions. Further, in one embodiment, the leftward and rightward widths are set so that their sum is a predetermined proportion of the maximum width of the foreground image region.
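The choice between parallel-line and vertical-line extraction can be sketched in Python as follows. This is only an illustration under stated assumptions: the reference axis is taken to be the vertical image axis, the widths A and B are applied as horizontal offsets, the vertical case uses the mean of A and B as its half-width, and the names `detection_mask` and `angle_threshold_deg` are hypothetical rather than taken from the patent.

```python
import math
import numpy as np

def detection_mask(shape, ma, mb, width_left, width_right, angle_threshold_deg):
    """Boolean mask of the detection region.

    ma, mb: (x, y) center positions on the two reference rows (y must differ).
    width_left, width_right: widths A and B, applied as horizontal offsets
    (an assumption; the patent text does not fix how the width is measured).
    """
    (xa, ya), (xb, yb) = ma, mb
    # Tilt of the center line measured from the vertical axis (assumed reference axis).
    tilt = math.degrees(math.atan2(abs(xb - xa), abs(yb - ya)))
    ys = np.arange(shape[0], dtype=float)[:, None]
    xs = np.arange(shape[1], dtype=float)[None, :]
    if tilt >= angle_threshold_deg:
        # x position of the center line in every row, then two lines parallel to it.
        center_x = xa + (xb - xa) * (ys - ya) / (yb - ya)
        return (xs >= center_x - width_left) & (xs <= center_x + width_right)
    # Small tilt: vertical lines at equal distance on both sides of Ma.
    half = (width_left + width_right) / 2  # illustrative choice of equal width
    band = (xs >= xa - half) & (xs <= xa + half)
    return np.broadcast_to(band, shape).copy()
```

The mask can then be combined with the foreground mask (logical AND) to obtain the pixels that are passed on to detection.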

In this way, the detection region can be extracted so as to better suit the size and posture of the object. For example, an adult's head and a child's head differ in size, but the detection region can be extracted with a width suited to each. The detection region can also be extracted with a size suited to the person's orientation relative to the imaging device.

According to one embodiment of the invention, the apparatus further comprises: means for acquiring, for each pixel, a distance value to the target in the image; means for calculating, based on the distance values of the pixels on the center line, a second center line (CD) representing a reference distance value (D) for each pixel row; and first removing means for obtaining, for each pixel row in the detection region, the reference distance value from the second center line and removing from that pixel row any pixel whose distance value lies outside a first predetermined range (D ± z1) around the reference distance value. The removal is performed over a predetermined range in the horizontal direction.

According to the invention, a region in which the object might be erroneously detected can be removed from the detection region. For example, when the object is a head, restricting the distance direction in this way makes it possible to remove arm regions.
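The distance-based removal can be illustrated with a short Python sketch. It assumes the per-row reference distances D have already been read from the second center line; the function name `remove_by_distance` and its `columns` argument (standing in for the predetermined horizontal range) are hypothetical.

```python
import numpy as np

def remove_by_distance(distance, valid, row_reference, z1, columns=None):
    """Return the valid-pixel mask after dropping pixels outside D +/- z1.

    distance: 2-D array of per-pixel distance values.
    valid: 2-D boolean mask of pixels currently in the detection region.
    row_reference: 1-D array holding the reference distance D of every pixel row.
    columns: optional slice or index array limiting removal to a horizontal range.
    """
    ref = np.asarray(row_reference, dtype=float)[:, None]
    outside = np.abs(distance - ref) > z1
    if columns is not None:
        in_range = np.zeros(distance.shape[1], dtype=bool)
        in_range[columns] = True
        outside &= in_range[None, :]  # removal applies only inside the range
    return valid & ~outside
```

Passing `columns=None` applies the test to every column; restricting it to the outer columns leaves the central part of the region untouched.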

In one embodiment of the invention, the second center line is calculated based on the average of the distance values of the pixels in the first reference row and the average of the distance values of the pixels in the second reference row. Alternatively, the second center line is calculated based on the distance value of the pixel at the first center position and the distance value of the pixel at the second center position. In this way, the second center line can be calculated so as to represent the center of the object in the distance direction.

According to one embodiment of the invention, the first predetermined range is set so as to become smaller with increasing distance from the center line within the predetermined horizontal range (FIG. 10(d)). In this way, the first predetermined range can be set to follow the shape of the object.

According to one embodiment of the invention, the first predetermined range is set so as to have a first width (z3) in the direction of increasing distance value and a second width (z4) in the direction of decreasing distance value relative to the second center line, the first width being larger than the second width.

According to one embodiment of the invention, the predetermined horizontal range is set outside a predetermined central region of the detection region. In this way, the central part of the object can be preserved without being subjected to removal by the first removing means.

According to one embodiment of the invention, second removing means is provided which, in the predetermined central region, removes from each pixel row in the detection region any pixel whose distance value lies outside a second predetermined range (D ± z2) around the reference distance value of that pixel row. In one example, the second predetermined range is set to be larger than the first predetermined range.

According to the invention, in the second predetermined range, removal is performed with a threshold different from that of the first removing means, so that erroneous detection of the object can be avoided. For example, when the object to be detected is a head, the second removing means can remove the region corresponding to luggage that the occupant is holding in front of the body, which prevents the head from being erroneously detected in that luggage region.

According to one embodiment of the invention, the apparatus further comprises: recess detection means for detecting whether a recess is formed in a predetermined lower region of the detection region; means for closing the recess, if one is detected, with a line segment made up of pixels; means for detecting whether there is a hole in the detection region; and means for filling the hole, if one is detected, with pixels having a predetermined distance value.

If there is a recess, the object may be erroneously detected in the area surrounding the recess. According to the invention, the recess is closed to form a hole and the hole is then filled, so such erroneous detection can be avoided.

In one embodiment, the lowest point in the vertical direction (Ta, Tb) is obtained in each of the two regions into which the center line divides the detection region, and a recess is detected by determining whether pixels exist in the region between the midpoint (Tm) of the line segment connecting the two lowest points and a point located at a predetermined height from that midpoint.

According to one embodiment of the invention, the apparatus further comprises: means for acquiring distance values to the target in the image; and means for calculating, for each of the sub-regions into which the foreground image region is divided, a grayscale value representing the distance value and generating a grayscale image in which each sub-region has its corresponding grayscale value. The region extraction means extracts the detection region from this grayscale image. Also provided are storage means for storing a model of the three-dimensional object and means for calculating a correlation value representing the similarity between the model and an image region within the detection region. The three-dimensional object is detected by finding, within the detection region, the image region with the highest correlation with the model.

According to the invention, the model is correlated only against the restricted detection region, so the object can be detected more accurately while the amount of computation is reduced. Because the three-dimensional object is detected by two-dimensional image processing, the amount of computation can be reduced further.

Embodiments of the invention are described next with reference to the drawings. FIG. 1 shows the configuration of an object detection apparatus 1 according to one embodiment of the invention.

This embodiment describes a form in which the object detection apparatus 1 is mounted on a vehicle and detects the head of an occupant. Note, however, that the object detection apparatus is not limited to this form, as described later.

The light source 10 is arranged inside the vehicle so that it can illuminate the head of an occupant sitting on a seat. In one example, it is arranged above and in front of the seat.

The emitted light is preferably near-infrared (IR) light. In general, brightness at the time of image capture varies with environmental changes such as day and night. In addition, when strong visible light strikes a face from one direction, shading gradations appear on the face. Near-infrared light is robust against such illumination changes and shading gradations.

In this example, a pattern mask (filter) 11 is provided in front of the light source 10 so that grid-patterned light falls on the occupant. The patterned light makes the parallax calculation more accurate, as described later.

A pair of imaging devices 12 and 13 is arranged near the occupant's head so as to capture two-dimensional images that include the occupant's head. The imaging devices 12 and 13 are arranged a predetermined distance apart horizontally, vertically, or diagonally. Each of the imaging devices 12 and 13 has an optical bandpass filter so that it receives only the near-infrared light from the light source 10.

The processing device 15 is implemented, for example, by a microcomputer comprising a CPU that executes various operations, a memory that stores programs and data, and an interface for data input and output. With this in mind, FIG. 1 represents the processing device 15 as functional blocks. Some or all of these functional blocks can be implemented by any combination of software, firmware, and hardware. In one example, the processing device 15 is implemented by a vehicle control unit (ECU) mounted on the vehicle. The ECU is a computer that includes a CPU and a memory storing the computer programs and data used to implement various kinds of vehicle control.

The parallax image generation unit 21 generates a parallax image based on the two images captured by the imaging devices 12 and 13.

An example of a method for calculating parallax is described briefly here with reference to FIG. 2. The imaging devices 12 and 13 comprise two-dimensionally arranged image sensor arrays 31 and 32 and lenses 33 and 34, respectively. The image sensors are, for example, CCD or CMOS elements.

The distance between the imaging devices 12 and 13 (the baseline length) is denoted by B. The image sensor arrays 31 and 32 are placed at the focal length f of the lenses 33 and 34, respectively. The image of a target at distance L from the plane containing the lenses 33 and 34 is formed on the image sensor array 31 at a position offset by d1 from the optical axis of the lens 33, and on the image sensor array 32 at a position offset by d2 from the optical axis of the lens 34. By the principle of triangulation, the distance L is given by L = B·f/d, where d is the parallax and d = d2 − d1.
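The triangulation relation can be written as a small Python helper. The function name `distance_from_parallax` is hypothetical, and all lengths are assumed to be in the same unit.

```python
def distance_from_parallax(baseline: float, focal_length: float,
                           d1: float, d2: float) -> float:
    """Return L = B * f / d, where d = d2 - d1 is the parallax."""
    d = d2 - d1
    if d <= 0:
        # Zero parallax corresponds to a target at infinity; negative is invalid.
        raise ValueError("parallax must be positive to give a finite distance")
    return baseline * focal_length / d
```

Because L is inversely proportional to d, a larger parallax always means a target closer to the imaging devices.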

To obtain the parallax d, for a given block of the image sensor array 31, the image sensor array 32 is searched for the corresponding block in which the same part of the target is imaged. The block size can be set arbitrarily; for example, a block may be a single pixel or a group of pixels (for example, 7 × 7 pixels).

In one method, a block of the image obtained by one of the imaging devices 12 and 13 is scanned across the image obtained by the other to search for the corresponding block (block matching). The absolute value of the difference between the luminance value of one block (for example, the average luminance of the pixels in the block) and the luminance value of the other block is calculated and used as the correlation value. The block with the smallest correlation value is found, and the distance between the two blocks at that point gives the parallax d. Alternatively, the search may be implemented by other methods.
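A minimal Python sketch of this block-matching search is given below. It assumes the two images are aligned so that the corresponding block lies on the same rows and is shifted only horizontally, and it uses the block-mean luminance difference described above; `find_parallax`, `size`, and `max_shift` are illustrative names and parameters.

```python
import numpy as np

def block_mean(img, top, left, size):
    """Average luminance of the size x size block with the given top-left corner."""
    return float(img[top:top + size, left:left + size].mean())

def find_parallax(ref_img, other_img, top, left, size=7, max_shift=32):
    """Return the horizontal shift (in pixels) minimising |mean luminance difference|."""
    ref_value = block_mean(ref_img, top, left, size)
    best_shift, best_score = 0, float("inf")
    width = other_img.shape[1]
    for shift in range(max_shift + 1):
        x = left + shift
        if x + size > width:
            break  # block would leave the image
        score = abs(ref_value - block_mean(other_img, top, x, size))
        if score < best_score:
            best_shift, best_score = shift, score
    return best_shift
```

Comparing only block means is cheap but ambiguous on uniform surfaces; a per-pixel sum of absolute differences is a common alternative.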

Blocks with no variation in luminance are difficult to match against each other, but the patterned light described above introduces luminance variation into such blocks, so block matching can be performed more accurately.

In the captured image obtained from one of the imaging devices 12 and 13, the parallax value calculated for each block is associated with the pixels contained in that block; the result is the parallax image. The parallax value represents the distance from the imaging devices 12 and 13 to the target imaged at that pixel. The larger the parallax value, the closer the target is to the imaging devices 12 and 13.

The background removal unit 22 removes the background region from the parallax image thus obtained and extracts the foreground region. Any suitable technique can be used to separate the background and foreground regions.

In this example, a parallax image generated from an image of the background is stored in advance in the memory 23, as shown in FIG. 3(a). Here, an image captured with no person present (that is, an empty seat) is used as the background image. Image region 42 shows the seat, and image region 43 shows the background behind the seat. The empty-seat parallax image can be generated by the same method as described above. When this empty-seat parallax image is compared with a parallax image generated while an occupant is sitting on the seat, as shown in FIG. 3(b), the parallax values differ in region 41, where the occupant is imaged. Therefore, by extracting the pixels whose parallax values differ by a predetermined value or more, the occupant region 41 can be extracted as the foreground region, as shown in FIG. 3(c).
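The comparison against the stored empty-seat parallax image can be sketched in Python as follows. The function name `extract_foreground` is hypothetical, and marking removed background pixels with 0 is an assumption made for illustration.

```python
import numpy as np

def extract_foreground(parallax, empty_seat_parallax, threshold):
    """Keep pixels whose parallax differs from the empty-seat image by >= threshold.

    Returns the foreground parallax image (background set to 0) and the
    boolean foreground mask.
    """
    diff = np.abs(parallax.astype(float) - empty_seat_parallax.astype(float))
    mask = diff >= threshold
    foreground = np.where(mask, parallax, 0)
    return foreground, mask
```

The returned mask is convenient for later steps that need to know which pixels belong to the occupant region.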

The empty-seat parallax image changes with the position and inclination of the seat. A plurality of empty-seat parallax images corresponding to different seat positions and inclinations can be stored in the memory 23. The vehicle is provided with a sensor (not shown) that detects the position and inclination of the seat. The background removal unit 22 can read the corresponding empty-seat parallax image from the memory 23 according to the seat position and inclination detected by the sensor and use it to remove the background. In this way, the background removal unit 22 can extract the foreground region more accurately even when the position and/or inclination of the seat is changed.

The normalization unit 24 performs normalization on the occupant region 41. This normalization assigns a grayscale value to each pixel of the occupant region according to its distance from the imaging devices 12 and 13. Normalization is performed according to equation (1).
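Equation (1) itself is not reproduced in this text. A linear form consistent with the accompanying description (the largest parallax maps to the highest grayscale value 2^N − 1, and the grayscale value falls as the parallax decreases) would be the following; it should be read as a reconstruction under that assumption, not as the published formula.

```latex
d'(x, y) = \left(2^{N} - 1\right) \cdot \frac{d(x, y) - d_{\min}}{d_{\max} - d_{\min}} \tag{1}
```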

Here, with the horizontal direction of the captured image taken as the x coordinate and the vertical direction as the y coordinate, d(x, y) denotes the parallax value at position (x, y) of the occupant region 41. N is the number of grayscale bits. In one example, N = 9, giving 512 gray levels with values from 0 to 511, where 0 is black and 511 is white. d'(x, y) denotes the grayscale value calculated for position (x, y) of the occupant region 41. The maximum parallax dmax corresponds to the minimum distance from the imaging devices, and the minimum parallax dmin corresponds to the maximum distance from the imaging devices.

As equation (1) shows, normalization assigns the highest grayscale value (white in this example) to the pixel with the maximum parallax value dmax and lowers the grayscale value progressively as the parallax value d decreases. In other words, the pixel at the minimum distance receives the highest grayscale value, and the grayscale value falls progressively as the distance from the imaging devices increases. In this way, a grayscale image in which each pixel has a grayscale value representing its distance from the imaging devices is generated from the occupant region 41.
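A minimal Python sketch of this normalization, assuming the linear mapping implied by the description (largest parallax to 2^N − 1, smallest to 0). The function name `normalize_parallax`, the use of a boolean occupant mask, and the rounding to integers are illustrative choices, not details from the patent.

```python
import numpy as np

def normalize_parallax(parallax, mask, n_bits=9):
    """Map parallax values inside `mask` linearly onto 0 .. 2**n_bits - 1."""
    out = np.zeros(parallax.shape, dtype=np.int32)
    values = parallax[mask].astype(float)
    if values.size == 0:
        return out  # no occupant pixels
    d_min, d_max = values.min(), values.max()
    top = 2 ** n_bits - 1
    if d_max == d_min:
        out[mask] = top  # flat region: every pixel is at the nearest distance
        return out
    scaled = (values - d_min) * (top / (d_max - d_min))
    out[mask] = np.round(scaled).astype(np.int32)
    return out
```

With N = 9 the nearest occupant pixel becomes 511 (white) and the farthest becomes 0 (black), matching the example in the text.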

The detection region extraction unit 25 extracts, from the grayscale image, the region in which the head should be detected. Because this limits the region that must be processed to detect the head, the head can be detected more accurately and the computational load can be reduced. The processing of the detection region extraction unit 25 is described in detail later.

The memory 26 (which may be the same memory as the memory 23) stores a head model that models a human head. The correlation unit 27 scans the head model over the detection region extracted by the detection region extraction unit 25 and calculates the correlation value between each scanned region and the head model. The scanned region with the largest correlation value is judged to be the head region. The physical position and size of the head can then be obtained from the position and size of the detected head region.
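The scan-and-correlate step can be sketched in Python as follows. The text does not specify the correlation measure, so zero-mean normalized cross-correlation is used here purely as an assumption; `best_match` and its arguments are hypothetical names, and the head model is treated as a small 2-D grayscale template.

```python
import numpy as np

def best_match(detection_region, head_model):
    """Slide head_model over detection_region; return ((top, left), score) of the best window."""
    region = detection_region.astype(float)
    model = head_model.astype(float)
    mh, mw = model.shape
    model_zm = model - model.mean()
    model_norm = np.sqrt((model_zm ** 2).sum())
    best_score, best_pos = -np.inf, None
    for top in range(region.shape[0] - mh + 1):
        for left in range(region.shape[1] - mw + 1):
            window = region[top:top + mh, left:left + mw]
            window_zm = window - window.mean()
            denom = model_norm * np.sqrt((window_zm ** 2).sum())
            if denom == 0:
                continue  # flat window or flat model: correlation undefined
            score = float((window_zm * model_zm).sum() / denom)
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score
```

Restricting `detection_region` to the extracted detection region, rather than the full image, is what keeps the number of scanned windows small.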

FIG. 4 is a flowchart of the processing executed by the detection region extraction unit 25 according to one embodiment of the invention. This processing is described with reference to FIGS. 5 to 7.

FIG. 5(a) shows a grayscale image generated from an image of an occupant seated normally, and FIG. 5(b) shows a grayscale image generated from an image of an occupant seated with the body leaning. For ease of explanation, the frame of the captured image is shown with x and y coordinates, but note that the background region indicated by reference numeral 45 has already been removed by the background removal unit 22. Note also that the occupant region indicated by reference numeral 46 is drawn in white in the figure but is in fact a grayscale image. The y coordinate of the bottom of the grayscale image 46 is y1, and the y coordinate of its top is y2.

In step S1 of FIG. 4, a reference pixel row (called a reference row) is set in each of the upper and lower parts of the grayscale image 46. In this example, the pixel row 1/5 of the way down from the top (that is, at a height of (y2 − y1) × 4/5 above the bottom) is set as the first reference row La, and the pixel row 4/5 of the way down from the top (that is, at a height of (y2 − y1) × 1/5 above the bottom) is set as the second reference row Lb. Because these reference rows are used to determine a center line representing the center of the occupant (head to torso) in the xy plane, they are preferably set, as in this example, so that one passes through the upper part and the other through the lower part of the grayscale image.

Next, a center position Ma is calculated as the position that divides the pixels contained in the first reference row La into two equal halves. FIG. 6 shows, as an example, the arrangement of pixels in the reference row La. The shaded portions are where pixels exist. Here, a pixel "exists" preferably means that a pixel for which a valid parallax value was obtained (which may be called an effective pixel) is present. xa1 and xa2 are the coordinates at which the first reference row La crosses the edges of the grayscale image 46. In example (a), the range from xa1 to xa2 is completely filled with pixels, so the center position Ma lies (xa2 − xa1)/2 from xa1, that is, at the x coordinate (xa1 + xa2)/2. In example (b), there are portions without effective pixels, as indicated by reference numerals 48 and 49. In this case, the x coordinate of the center position Ma differs from that in (a). By finding the position that divides the total number of effective pixels in a pixel row into two equal halves, the center position can be placed at the center of gravity of that pixel row.

Similarly, a center position Mb is calculated as the position that divides the total number of effective pixels in the second reference row Lb into two equal halves. With an arrangement like that of FIG. 6(a), the center position Mb lies (xb2 − xb1)/2 from xb1, that is, at the x coordinate (xb1 + xb2)/2. FIGS. 5(a) and 5(b) show the center positions Ma and Mb identified in this way.
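The center position of a row can be sketched as the median x position of its effective pixels. This is a minimal illustration, assuming a row is a list in which `None` marks a pixel without a valid parallax value; the function name `row_center` is hypothetical.

```python
def row_center(row):
    """Return the x position that splits the row's effective pixels into two equal halves.

    `row` is a list of distance values; None marks a pixel with no valid parallax
    (an assumed encoding). Returns None if the row has no effective pixels.
    """
    xs = [x for x, v in enumerate(row) if v is not None]
    if not xs:
        return None
    n = len(xs)
    if n % 2:
        return float(xs[n // 2])
    # Even count: take the point midway between the two middle effective pixels.
    return (xs[n // 2 - 1] + xs[n // 2]) / 2.0
```

For a fully filled run of pixels this coincides with the geometric midpoint, as in FIG. 6(a); when gaps exist, as in FIG. 6(b), it shifts toward the side with more effective pixels.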

In step S2, the straight line connecting the center positions Ma and Mb is set as the center line CL, and the inclination k of the center line CL with respect to the y axis is calculated. The inclination k can be calculated as (x coordinate of Ma − x coordinate of Mb) / (y coordinate of Ma − y coordinate of Mb). The inclination k may also be expressed as an angle.
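The inclination formula of step S2 can be written out directly. In this sketch the function names and the (x, y) tuple layout are assumptions made for illustration.

```python
import math

def center_line_inclination(ma, mb):
    """Inclination k of the line through Ma and Mb, measured against the y axis."""
    (xa, ya), (xb, yb) = ma, mb
    if ya == yb:
        raise ValueError("Ma and Mb must lie on different pixel rows")
    return (xa - xb) / (ya - yb)

def inclination_in_degrees(k):
    """Express the inclination as an angle from the y axis."""
    return math.degrees(math.atan(k))
```

Because k is defined against the y axis, an upright occupant gives k close to 0, and a vertical center line never causes a division by zero.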

Alternatively, the center positions of a plurality of pixel rows (which may or may not include the reference rows described above) may be obtained by the method described above, the center line CL may be determined by a linear approximation, and its inclination may then be calculated. FIG. 7 plots the center positions obtained for such a plurality of pixel rows. The center line CL can be obtained by fitting a straight line to these center positions using any appropriate method.
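The text leaves the fitting method open ("any appropriate method"). As one possibility, the following sketch uses ordinary least squares, regressing x on y so that the fitted coefficient is directly the inclination against the y axis. The function name and the point layout are assumptions.

```python
def fit_center_line(points):
    """Least-squares fit of x = k * y + c to a list of (x, y) center positions.

    Returns (k, c). Least squares is an assumed choice; the source only
    requires some linear approximation.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two center positions")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points)
    if syy == 0:
        raise ValueError("center positions must come from different rows")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    k = sxy / syy
    return k, mean_x - k * mean_y
```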

In step S3, it is determined whether the absolute value of the inclination k of the center line is smaller than a predetermined value T. FIG. 5(a) shows a case where the absolute value of the inclination k is smaller than the predetermined value T, and FIG. 5(b) shows a case where it is equal to or greater than the predetermined value T. As is clear from the figures, when the person is seated normally, the absolute value of the calculated inclination k is smaller than the predetermined value T, and when the person is seated leaning, it is equal to or greater than the predetermined value T.

Alternatively, the inclination of the center line CL with respect to the x axis may be used. In this case, the person is determined to be seated normally when the inclination is greater than a predetermined value, and to be seated leaning when the inclination is equal to or less than the predetermined value.

If the absolute value of the inclination k is smaller than the predetermined value T, the process proceeds to step S4. As shown in FIG. 5(c), first and second vertical lines VL1 and VL2, perpendicular to the x axis, are set at the same width A to the left and right of the center position Mb. The grayscale image 46 is cut along the vertical lines VL1 and VL2 to extract the detection region 47.

In this example, the width A is set so that 2 × A equals a predetermined proportion (for example, 80%) of the maximum width of the grayscale image 46. The maximum width of the grayscale image 46 is indicated, for example, by reference numeral 51 in FIG. 5(a). The maximum width is obtained by scanning each pixel row and finding the largest distance between the left edge and the right edge of the grayscale image 46.
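Step S4 and the width rule can be sketched as follows. The image is assumed to be a list of rows with `None` marking removed or invalid pixels, and the edge-to-edge width is counted in pixels; both choices, and the function names, are assumptions for illustration.

```python
def max_width(image):
    """Largest left-edge-to-right-edge extent (in pixels) over all rows of the image."""
    widest = 0
    for row in image:
        xs = [x for x, v in enumerate(row) if v is not None]
        if xs:
            widest = max(widest, xs[-1] - xs[0] + 1)
    return widest

def vertical_cut(image, center_x, ratio=0.8):
    """Keep only pixels within width A on either side of center_x, where 2 * A = ratio * max width."""
    a = ratio * max_width(image) / 2.0
    left, right = center_x - a, center_x + a
    return [[v if left <= x <= right else None for x, v in enumerate(row)]
            for row in image]
```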

Thus, when the inclination k is small, the occupant can be judged to be seated normally with almost no lean. The detection region 47 can therefore be extracted so as to include the head by cutting out, vertically, a region of equal width on either side of the lower center position Mb (that is, the one on the torso side).

On the other hand, if the absolute value of the inclination k is equal to or greater than the predetermined value T in step S3, the process proceeds to step S5. As shown in FIG. 5(d), the pixel row located halfway down from the top of the grayscale image 46 is identified as a third reference row Lc. Then, as with the first and second reference rows, a center position C is obtained for the third reference row Lc as the position that divides the total number of effective pixels in that row into two equal halves.

In step S6, first and second lines PL1 and PL2, parallel to the center line CL, are set at widths A and B to the left and right, respectively, of the center position C. The grayscale image 46 is cut along the first parallel line PL1 and the second parallel line PL2 to extract the detection region 47.
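A minimal sketch of the cut in step S6 follows. It assumes that the widths A and B are measured horizontally along each pixel row (the text does not say whether they are horizontal or perpendicular to CL), that each row carries its own y coordinate, and that `None` marks removed pixels. The function name is hypothetical.

```python
def parallel_cut(rows, k, center, a, b):
    """Cut an image along two lines parallel to the center line.

    rows   : list of (y, row) pairs, row being a list of distance values
    k      : inclination of the center line against the y axis
    center : (x, y) of the center position C
    a, b   : cut widths to the left and right of the center line,
             measured along each pixel row (an assumption)
    """
    xc, yc = center
    result = []
    for y, row in rows:
        cx = xc + k * (y - yc)  # x of the line through C parallel to CL, on this row
        kept = [v if cx - a <= x <= cx + b else None for x, v in enumerate(row)]
        result.append((y, kept))
    return result
```

Because the kept interval follows the line through C from row to row, a leaning head stays inside the region where a vertical cut of the same width would clip it.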

The relative sizes of the left cut width A and the right cut width B are preferably determined according to the distances of the occupant's right and left halves from the imaging device. When the occupant leans relative to the imaging device, one of the occupant's left and right halves may be closer to the imaging device than the other. The closer half is imaged larger, and the farther half is imaged smaller. It is therefore preferable to make the cut width on the closer side larger than the cut width on the farther side.

In the example of FIG. 5(b), the image is captured from the right as seen from the occupant, so the occupant's right half is closer to the imaging device than the left half. Accordingly, A > B is set. The ratio of A to B can be determined in advance. In one example, A + B is set to 80% of the maximum width of the grayscale image 46, with A:B = 3:2. When the image is captured from the left as seen from the occupant, A < B is set, for example A:B = 2:3. The maximum width can be determined as described above.
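The example figures (80% of the maximum width, split 3:2 toward the nearer side) work out as in the sketch below. The function name and parameters are hypothetical, and the defaults simply restate the example values from the text.

```python
def cut_widths(max_width, a_side_is_closer, total_ratio=0.8, near=3, far=2):
    """Return (A, B) so that A + B = total_ratio * max_width and the nearer side gets the larger share."""
    total = total_ratio * max_width
    wide = total * near / (near + far)
    narrow = total * far / (near + far)
    return (wide, narrow) if a_side_is_closer else (narrow, wide)
```

For a maximum width of 100 pixels this gives A = 48 and B = 32 when the A side is nearer, and the reverse otherwise.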

Note that the numerical values given here are merely examples that the inventors found empirically to restrict the detection region without removing the head, and the invention is not limited to these values. Alternatively, the left and right widths may be set equal.

Which of the right and left halves is closer to the imaging device can be determined by any method. As described above, the grayscale value of each pixel in the grayscale image represents a distance. Therefore, for example, the average grayscale value of the pixels in the region to the right of the center line can be compared with the average grayscale value of the pixels in the region to the left of the center line to determine which half is closer to the imaging device.
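The comparison of the two averages might look like the following sketch. It assumes that each stored value is a distance, so the smaller mean marks the nearer side; if the grayscale encoding is such that larger values mean nearer, the comparison must be flipped. The function name and the per-row list of center-line x positions are hypothetical.

```python
def closer_side(image, center_xs):
    """Return 'left' or 'right' (image side of the center line) for the nearer half, or None if undecided.

    image     : list of rows of distance values (None = no valid pixel)
    center_xs : x position of the center line on each row
    Assumes smaller values mean nearer to the imaging device.
    """
    left, right = [], []
    for row, cx in zip(image, center_xs):
        for x, v in enumerate(row):
            if v is not None:
                (left if x < cx else right).append(v)
    if not left or not right:
        return None
    mean_left = sum(left) / len(left)
    mean_right = sum(right) / len(right)
    if mean_left == mean_right:
        return None
    return "left" if mean_left < mean_right else "right"
```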

Thus, even when the inclination k is large, the detection region 47 can be extracted without excluding the head by cutting along lines parallel to the center line CL. In addition, by making the left and right cut widths unequal according to which of the left and right halves is leaning toward the imaging device, a detection region better fitted to the person's posture can be extracted, and the computational load can be reduced.

FIG. 8 schematically shows a case in which a person who has raised one arm beside the head is captured as the grayscale image 46. As described above, the detection region 47 is cut out in the xy plane by the parallel lines PL1 and PL2. If A and B are set generously, as described above, to make sure the head is included in the detection region 47, the arm region may end up in the detection region, as shown in the figure. If the region of a raised arm (or part of it) is included in the detection region, the head may be erroneously detected in that arm region during the correlation processing described later. If, to avoid this, A and B are narrowed as indicated by A' and B', the parallel lines PL1 and PL2 may exclude part of the head from the detection region, as indicated by reference numeral 52.

To avoid such over-removal of the object to be detected (the head, in this example), one embodiment of the present invention sets A and B generously as described above (for example, 80% of the maximum width) and further proposes a technique for removing the unnecessary regions (the arm region, in this example) that may consequently be included in the detection region. Specifically, the processing shown in FIG. 9 is applied to the detection region 47 extracted in the xy plane as described above, further restricting the detection region 47 in the distance direction (z direction).

This restriction in the z direction is described with reference to FIG. 10. To make the intent of the technique clear, FIG. 10(a) uses a different image example from FIG. 5: it schematically shows a grayscale image 46 of a person who has raised both arms, one on each side of the head. In this example, the inclination k is equal to or greater than the predetermined value, so the grayscale image 46 is cut along the parallel lines PL1 and PL2. The widths A and B are set to a ratio of 3:2, and A + B is 80% of the maximum width of the grayscale image 46.

For each pixel row, the intersections of the parallel lines PL1 and PL2 with that pixel row are denoted XB and XA, respectively. A line QL1 is set so as to pass midway between the center line CL and the parallel line PL1, and a line QL2 is set so as to pass midway between the center line CL and the parallel line PL2. The intersections of each pixel row with the lines QL1 and QL2 are denoted Xb and Xa, respectively. The figure shows XB, Xb, Xa, and XA for the pixel row passing through a point M on the center line CL.

Each pixel in the grayscale image 46 has a grayscale value representing a distance (hereinafter also called a distance value). The distance values of the pixels in the first reference row La, on which the center position Ma lies, are averaged, and the resulting average distance value is taken as the distance value corresponding to the center position Ma. Similarly, the distance values of the pixels in the second reference row Lb, on which the center position Mb lies, are averaged, and the resulting average distance value is taken as the distance value corresponding to the center position Mb. As shown in FIG. 10(b), the y coordinates of the center positions Ma and Mb and the corresponding distance values (expressed as z coordinates) are plotted in the yz plane, and the plotted points Ma and Mb are connected to obtain a center line CD (step S11 in FIG. 9). The center line CD can be expressed by a linear equation.

In step S12, the pixel rows of the grayscale image 46 are scanned from top to bottom. In step S13, a reference distance value D is calculated for each pixel row from the center line CD. Specifically, the reference distance value D of a pixel row can be calculated by substituting the y coordinate of that pixel row into the linear equation representing the center line CD. The figure shows the reference distance value D for the pixel row passing through the point M.
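Steps S11 and S13 amount to building a line z = m·y + q through two (y, z) points and evaluating it per row. The sketch below assumes rows are lists with `None` for invalid pixels; all function names are hypothetical.

```python
def mean_distance(row):
    """Average distance value of the effective pixels in a row (None = no valid pixel)."""
    values = [v for v in row if v is not None]
    if not values:
        raise ValueError("row has no effective pixels")
    return sum(values) / len(values)

def depth_center_line(y_a, z_a, y_b, z_b):
    """Return (m, q) of the center line CD, z = m * y + q, through (y_a, z_a) and (y_b, z_b)."""
    if y_a == y_b:
        raise ValueError("the two reference rows must differ")
    m = (z_a - z_b) / (y_a - y_b)
    return m, z_a - m * y_a

def reference_distance(line, y):
    """Reference distance value D for the pixel row at height y."""
    m, q = line
    return m * y + q
```

With La at y = 80 and mean distance 100, and Lb at y = 20 and mean distance 130, the row at y = 50 gets D = 115.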

In step S14, for each pixel row, pixels between XB and Xb and between Xa and XA whose distance values fall outside a first predetermined range (D ± z1) are removed from that pixel row (first removal). The first predetermined range is set in consideration of the thickness of the object in the region that the first removal should leave.

FIG. 10(b) shows the first predetermined range calculated for the point M. A line DL2 parallel to the center line CD is set so as to pass through the point D − z1, and a line DL1 parallel to the center line CD is set so as to pass through the point D + z1. FIG. 10(c) shows the lines DL1 and DL2 in the xz plane; in short, it is a view from above of the region to be removed. As these figures show, in the range from XB to Xb in the x direction, the pixels enclosed by the lines DL1, DL2, and PL1 are kept and the other pixels are removed. Similarly, in the range from Xa to XA in the x direction, the pixels enclosed by the lines DL1, DL2, and PL2 are kept and the other pixels are removed.

The technical significance of the first removal is as follows. As described above, the center line CL can be regarded as a line representing the center of the person (head to torso) in the xy plane. The region between the lines QL1 and QL2 corresponds to the central part of the person. The regions outside the lines QL1 and QL2, on the other hand, contain the arms, as shown in FIG. 10(a). One purpose of the first removal is to remove such arm regions from the detection region 47, as explained with reference to FIG. 8.

To achieve this, removal is first performed in the regions outside QL1 and QL2 while the region from QL1 to QL2, which represents the central part of the head and torso, is preserved. Because the first predetermined range (D ± z1) reflects the thickness of the head and torso, the first removal eliminates pixels whose distance values do not fall within that range. When an arm is raised, it usually comes forward of the head, and such an arm region can be removed in this way. In this example, the first predetermined range D ± z1 is 50 cm, taking the thickness of the torso into account. In this way, regions corresponding to arms, as in FIG. 10(a), can be removed from the detection region 47.

Returning to FIG. 9, in step S15, pixels between Xb and Xa of each pixel row whose distance values fall outside a second predetermined range (D ± z2) are also removed (second removal). In FIG. 10(c), the second predetermined range D ± z2 is indicated by the lines DL3 and DL4. The pixels enclosed by QL1, QL2, DL3, and DL4 are kept, and the other pixels are removed.

The significance of the second removal is as follows. When the occupant is holding an accompanying object (luggage) such as a bag or a ball, a luggage region remains in the detection region 47 even after the cut in the xy plane, as shown by the hatched region 53 in FIG. 11. If such a luggage region remains in the detection region 47, the head may be erroneously detected in the luggage region during the correlation processing described later. It is therefore preferable to remove such regions, which are not part of the torso, from the detection region 47.

For this reason, removal using the threshold z2 is performed between QL1 and QL2. To avoid removing the torso itself, the threshold z2 is preferably set to a value larger than the threshold z1. The threshold z2 is also preferably chosen according to the expected size of the luggage. For example, assuming luggage about 30 cm thick, D ± z2 is set to be about 30 cm larger than D ± z1.

Returning to FIG. 9, the z-direction restriction using z1 and z2 described above is executed for each pixel row until the scan reaches the bottom row of the grayscale image 46 (S16).
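For a single pixel row, the first and second removals (steps S14 and S15) can be sketched as one pass that applies the tolerance z1 in the outer bands XB to Xb and Xa to XA, and z2 in the central band Xb to Xa. How the boundary pixels at Xb and Xa are assigned is not stated in the text; here they are treated as part of the central band. The function name and the `None` encoding are assumptions.

```python
def limit_row_in_depth(row, d, x_b_outer, x_b, x_a, x_a_outer, z1, z2):
    """Apply the first and second removals to one pixel row.

    row                  : distance values, None = no pixel
    d                    : reference distance value D for this row
    x_b_outer, x_a_outer : XB and XA (intersections with PL1 and PL2)
    x_b, x_a             : Xb and Xa (intersections with QL1 and QL2)
    z1, z2               : tolerances of the outer and central bands
    """
    out = []
    for x, v in enumerate(row):
        if v is None or x < x_b_outer or x > x_a_outer:
            out.append(None)  # outside the xy cut, or already empty
            continue
        tolerance = z2 if x_b <= x <= x_a else z1
        out.append(v if abs(v - d) <= tolerance else None)
    return out
```

Calling this for every row from top to bottom, with D recomputed per row from the center line CD, corresponds to the loop closed by step S16.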

Alternatively, the distance value of the pixel at the center position Ma and the distance value of the pixel at the center position Mb may be used as the distance values (z coordinate values) corresponding to the center positions Ma and Mb plotted in the yz plane.

In another alternative, the center line CD may be obtained, like the center line CL, by plotting the distance values for a plurality of pixel rows and applying a linear approximation. In this case, the distance value of each row may be, as described above, either the average of the distance values of the pixels in that pixel row or the distance value of the pixel at the center position of that pixel row.

In this example, the widths of the lines DL1 and DL2 from the center line CD in the first removal are both z1. Equal widths of this kind are appropriate when the values plotted in the yz plane are the average distance values of the pixels in the reference rows La and Lb. When the distance values of the pixels at the center positions Ma and Mb are plotted in the yz plane instead, however, the width from DL1 to CD is preferably made larger than the width from CD to DL2, for example in a ratio of 3:2.

The distance values of the pixels at the center positions Ma and Mb represent the distance to the surface of the head and the distance to the surface of the torso, respectively. The averages of the distance values of the pixels in the rows of Ma and Mb (the reference rows La and Lb), on the other hand, take the thickness of the head and torso into account. That is, the former distance values are smaller than the latter average distance values. Because the width between DL1 and DL2 is preferably set to reflect the thickness of the head and torso, the lines DL1 and DL2 of FIG. 10(b) are preferably shifted in the z direction relative to the center line CD so that the width from CD to DL1 is greater than the width from CD to DL2. FIG. 12 shows an example of this, in which z3:z4 = 3:2.

FIG. 10(d) shows an alternative setting of the lines DL1 and DL2. In this alternative, the value z1 varies along the x direction. The human torso tends to be thicker toward its center. Therefore, between XB and Xb, the width between the center line CD and the lines DL1 and DL2 is made larger as the x value increases, and between Xa and XA, the width between the center line CD and the lines DL1 and DL2 is made smaller as the x value increases.

As is clear from FIGS. 10(c) and 10(d), the restriction in the z direction is based on the fact that the torso is thicker toward its center. The setting of z1 is therefore not necessarily limited to those of (c) and (d); for example, z1 may be set so as to trace a convex curve from the intersection of DL1 and QL toward the intersection of DL2 and QL1. With settings such as (c) and (d), however, DL1 and DL2 are straight lines, so high-speed computation can be achieved.

Note that the detection region 47 extracted when the absolute value of the inclination k is smaller than the predetermined value T can also be restricted in the z direction by the technique shown in FIG. 9. In this case, the vertical lines VL1 and VL2 are used in place of the parallel lines PL1 and PL2.

As described above, when a person is holding luggage or the like as in FIG. 11, the corresponding region is removed from the detection region 47. As a result, a recess is formed in the detection region 47.

If such a recess is formed in the detection region 47, the head may be erroneously detected, for example around the recess, in the correlation processing described later. To avoid this, one embodiment of the present invention further executes the processing shown in FIG. 13 and, if a recess exists in the resulting detection region 47, fills it in. This processing is described with reference to FIG. 14.

FIG. 14 shows an example in which a recess 54 exists in a detection region 47 that has been cut out by the parallel lines PL1 and PL2 and restricted in the z direction using the distance values. No pixels exist in the recess 54.

In step S21 of FIG. 13, the distance values of the pixels in the detection region 47 are averaged to calculate an average distance value.

In step S22, the lowest point, that is, the point lowest in the y direction, is found in the region below the reference row Lc described above. A lowest point is found in each of the region between the center line CL and the parallel line PL1 and the region between the center line CL and the parallel line PL2; these are denoted Ta and Tb.

In step S23, the midpoint Tm of the line segment connecting the two lowest points Ta and Tb is obtained. The value obtained by adding α to the y coordinate Tmy of the midpoint Tm is then calculated. α is set, for example, to 1/3 of the y coordinate value of the reference row Lc. A point Tα located above the midpoint at the height Tmy + α is obtained. Alternatively, a normal to the line segment may be drawn from the midpoint, and the point at which the length of the normal reaches Tmy + α may be taken as the point Tα. It is then determined whether any pixel exists on the line segment connecting the midpoint Tm and the point Tα.

In the example of FIG. 14, the result of this determination is No: no pixel exists on the segment. This indicates that a recess 54 is formed in the lower part of the detection region 47. If the result is Yes, the lower part of the detection region 47 is closed.

If the determination in step S23 is No, then in step S24 the points Ta and Tb are connected by a line segment of width t (a few pixels is sufficient), which closes the lower part of the detection region 47 and turns the recess into a "hole". The line segment is made up of pixels having the average distance value calculated in step S21. If the determination in step S23 is Yes, the process proceeds to step S25.

Alternatively, the hole may be formed by forcibly filling several pixel rows from the bottom of the detection region 47 (for example, 1/10 of the number of pixel rows between the bottom and the top of the detection region) with pixels having the average distance value.

In step S25, it is checked whether a "hole" exists within the part of the closed detection region 47 below the reference row Lc. Hole detection can be implemented by any appropriate technique. For example, a hole can be detected by obtaining the contours of the region below the reference row Lc using 8-connected contour extraction. Because such hole detection techniques are well known, a detailed description is omitted. Briefly, the region is scanned to determine the starting point of a contour (boundary line). From the starting point, the eight neighboring pixel positions are searched counterclockwise to find the next pixel. Repeating this step traces out one contour.

In step S26, it is determined whether a hole has been detected. If a hole has been detected, it is filled in step S27 with pixels having the above average distance value.

Holes include not only those formed by closing a recess as described above but also, for example, holes left when a piece of luggage or the like located at the center of the torso is removed as a separate region. In addition, when the pattern light described above is strong and part of the object is white (for example, a white part of clothing), the parallax value may not be computable. A part for which no parallax value is obtained contains no pixels having a distance value (that is, no valid pixels as defined above) and may be left as a hole. Such holes are also filled with pixels having the average distance value.
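
The hole search and fill of steps S25 to S27 can be sketched as follows. This is only an illustration: it swaps in a flood fill from the image border in place of the 8-connected contour tracing mentioned above, treating any pixel without a distance value that cannot be reached from the border as part of a hole. The function name `fill_holes` and the convention that 0 marks a pixel without a distance value are assumptions made for this sketch.

```python
from collections import deque

def fill_holes(region, fill_value):
    """Fill enclosed runs of invalid (0) pixels with fill_value.

    region: list of rows; 0 means "no distance value" (assumed convention).
    Returns a new grid and the number of pixels that were filled.
    """
    h, w = len(region), len(region[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every invalid pixel on the image border.
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and region[y][x] == 0:
                outside[y][x] = True
                queue.append((y, x))
    # The foreground is treated as 8-connected, so the invalid pixels
    # are traversed with 4-connectivity.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and not outside[ny][nx] and region[ny][nx] == 0):
                outside[ny][nx] = True
                queue.append((ny, nx))
    # Any invalid pixel not reached from the border belongs to a hole.
    filled = [row[:] for row in region]
    count = 0
    for y in range(h):
        for x in range(w):
            if region[y][x] == 0 and not outside[y][x]:
                filled[y][x] = fill_value
                count += 1
    return filled, count
```

In this sketch a recess that is still open toward the bottom edge is reachable from the border and is therefore not filled, which is why the lower portion is closed first in step S24.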

Other techniques may be used to detect the recess. For example, well-known convex hull algorithms such as the gift wrapping method or the Graham scan can be used. Preferably, the edges of the region at or below the reference line Lc are extracted and the convex hull of those edges is computed. Because the convex hull yields a polygon circumscribing the region, it is not necessary to close the lower portion of the detection region with a line segment as described above. Hole detection can then be performed inside the resulting polygon.
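
As an illustration of the convex hull alternative, the sketch below computes the hull of a set of edge points. It uses Andrew's monotone chain, a sort-based variant of the Graham scan, rather than either of the two methods named above; the function name `convex_hull` and the representation of edge pixels as `(x, y)` tuples are assumptions.

```python
def convex_hull(points):
    """Return the hull vertices in order around the hull (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        # Drop points that would make a right turn or lie on the same line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

Edge points that lie inside a recess do not appear among the hull vertices, so the polygon spans the recess without an explicit closing segment.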

In step S28, any appropriate particle analysis processing is performed. For example, dilation and erosion can be applied. Since dilation and erosion are well-known techniques, they are not described in detail. In dilation, for example, the image is expanded by one pixel using the maximum of the distance values of the eight neighboring pixels. In erosion, for example, each boundary pixel of the image is replaced with the minimum of the distance values of the eight neighboring pixels. Repeating these operations removes noise. After the particle analysis processing, the region with the largest area is extracted as the final detection region 47. Even if noise pixels outside the detection region have not been removed, extracting the largest-area region ensures that only the region containing the head and torso is extracted as the detection region 47.
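
A minimal sketch of the dilation and erosion described for step S28 is given below, using the maximum and minimum over each pixel and its eight neighbors. The function names, the clipping of the neighborhood at the image border, and the use of 0 for pixels without a distance value are assumptions of this sketch; extraction of the largest-area region is not shown.

```python
def _neighborhood(img, y, x):
    """Values of the pixel and its 8 neighbors, clipped at the border."""
    h, w = len(img), len(img[0])
    return [img[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]

def dilate(img):
    """Replace each pixel with the maximum of its 3x3 neighborhood."""
    return [[max(_neighborhood(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]

def erode(img):
    """Replace each pixel with the minimum of its 3x3 neighborhood."""
    return [[min(_neighborhood(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]
```

Applying `erode` and then `dilate` (an opening; the order is a choice made here, not stated in the text) removes isolated specks while restoring the outline of larger regions.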

Note that holes can also be detected and filled by the method shown in FIG. 13 for a detection region 47 extracted when the inclination k is equal to or less than the predetermined value T. In that case, the vertical lines VL1 and VL2 are used in place of the parallel lines PL1 and PL2.

Next, the processing performed by the correlation unit 27 in FIG. 1 is described.

First, a method for generating the head model to be stored in the memory 26 is described with reference to FIG. 15. As shown in FIGS. 15(a) and 15(b), the human head is characteristically similar in shape to an ellipsoid. The human head can therefore be represented by an ellipsoid.

The ellipsoid is constructed in a spatial coordinate system inside the vehicle. The origin of this coordinate system is, for example, the position of the imaging devices 11 and 12. The z-axis extends perpendicularly from the imaging devices, and the z value indicates the distance from the imaging devices. The x-axis is set in the vehicle width direction, perpendicular to the z-axis, and the y-axis is set in the vehicle height direction, perpendicular to the z-axis.

FIG. 15(c) shows an example of an ellipsoid representing a human head. The ellipsoid has center coordinates O(X_0, Y_0, Z_0). The center is chosen as a coordinate point at which the head can be located when an occupant is sitting on the seat. The parameters a, b, and c define the size of the ellipsoid and are set to values appropriate for representing a human head.

The ellipsoid, as a three-dimensional model of the head, is converted into a two-dimensional image. Viewed along the z-axis as indicated by arrow 55, the ellipsoid has an elliptical outline, so the two-dimensional image is generated with an elliptical shape. In addition, the two-dimensional image is generated so that each xy position has a grayscale value representing the z value at that xy position.

Here, the method for generating the two-dimensional image model is described in more detail. The ellipsoid is expressed by equation (2), so the z value at each xy position of the ellipsoid is expressed by equation (3).
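
The images of equations (2) and (3) are not reproduced in this text. As a sketch only, assuming the standard axis-aligned ellipsoid with center O(X_0, Y_0, Z_0) and with a, b, and c taken as the semi-axes along x, y, and z respectively, the two equations would take the following form. The minus sign in the second line, which selects the surface facing the imaging device, is likewise an assumption.

```latex
\frac{(x - X_0)^2}{a^2} + \frac{(y - Y_0)^2}{b^2} + \frac{(z - Z_0)^2}{c^2} = 1 \tag{2}

z = Z_0 - c \sqrt{1 - \frac{(x - X_0)^2}{a^2} - \frac{(y - Y_0)^2}{b^2}} \tag{3}
```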

Next, as indicated by arrow 55, the ellipsoid is projected along the z-axis onto a two-dimensional image. This projection is given by equation (4). Here, R11 to R33 are the elements of the rotation matrix. When the ellipsoid is not rotated, R11 and R22 are set to 1 and the other elements are set to zero. When the ellipsoid is rotated, the elements are set to values representing the rotation angle. fx, fy, u0, and v0 are internal parameters of the imaging device and include, for example, parameters for correcting lens distortion.

By the projection of equation (4), each point (x, y, z) on the surface of the ellipsoid is projected onto the point (x, y) of the two-dimensional image. The coordinates (x, y) of the two-dimensional image are assigned a grayscale value I representing the z value of the source point (x, y, z). The grayscale value I is calculated according to equation (5).

Here, Zmin is the z value closest to the imaging device among the z values of the surface of the ellipsoid (after rotation, if the ellipsoid is rotated). N is the same N used in the normalization by the normalization unit 24. According to equation (5), a pixel whose z value equals Zmin is assigned the highest grayscale value. As the z value moves away from Zmin, that is, as the distance from the imaging device increases, the grayscale value of the corresponding pixel gradually decreases.

FIG. 15(d) schematically shows the two-dimensional image of the head model created in this way. The grayscale value gradually decreases from the center of the ellipse toward its periphery. For example, the point 56 shown in FIG. 15(c) is projected onto the point 57 in FIG. 15(d). Since the z value of the point 56 is Zmin, the point 57 has the highest grayscale value (white in this example). The point 58 is projected onto the point 59; since the z value of the point 58 is larger than Zmin, the grayscale value of the point 59 is lower than that of the point 57.

In this way, the elliptical outline of the head viewed from the z direction, as in FIG. 15(a), is reflected in the shape of the two-dimensional image model of FIG. 15(d), and the distance from the imaging device to each part of the head, as in FIG. 15(b), is reflected in the grayscale values of the corresponding pixels of the two-dimensional image model of FIG. 15(d). Although the human head is three-dimensional data, it can thus be modeled as a two-dimensional image by expressing the distance in the z direction as a grayscale value. The generated two-dimensional image model is stored in the memory 26.
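
The construction of the two-dimensional head model can be sketched as follows. Because the images of equations (4) and (5) are not reproduced in this text, the sketch makes several labeled assumptions: an unrotated ellipsoid, an orthographic projection along z with one pixel per unit length (in place of the projection of equation (4)), and a grayscale value of N·Zmin/z, which is merely one form consistent with the stated behavior (highest value at Zmin, approaching the minimum as z grows without bound). The function name `head_model` and its parameters are hypothetical.

```python
import math

def head_model(a, b, c, z0, n=255):
    """Render an unrotated ellipsoid as a 2D grayscale depth template.

    a, b: integer semi-axes along x and y, in pixels (assumption).
    c: semi-axis along z; z0: z of the ellipsoid center (z0 > c).
    Pixels outside the elliptical outline are set to 0.
    """
    z_min = z0 - c  # surface point nearest the sensor
    rows = []
    for dy in range(-b, b + 1):
        row = []
        for dx in range(-a, a + 1):
            t = 1.0 - (dx * dx) / (a * a) - (dy * dy) / (b * b)
            if t < 0.0:
                row.append(0)  # outside the elliptical outline
            else:
                z = z0 - c * math.sqrt(t)  # near-side surface depth
                row.append(round(n * z_min / z))  # assumed grayscale form
        rows.append(row)
    return rows
```

Different values of a, b, and c give templates of different sizes; rotated templates would additionally require applying the rotation before projecting, which this sketch omits.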

Preferably, several kinds of head models are prepared. For example, since adult and child heads differ in size, separate head models are used. An ellipsoid model of the desired size can be generated by adjusting a, b, and c. The sizes of the ellipsoid models to use may be determined from a statistical survey of the heads of a number of people. Each ellipsoid model generated in this way is converted into a two-dimensional image model by the method described above and stored in the memory 26.

It is also preferable to prepare multiple head models according to the posture of the occupant. For example, head models with different inclinations can be prepared in order to detect the head of an occupant whose neck is tilted. An ellipsoid model rotated by a desired angle can be generated by adjusting the values of the elements R11 to R33 of the rotation matrix in equation (4). This ellipsoid model is likewise converted into a two-dimensional image model and stored in the memory 26.

FIG. 16 shows the two-dimensional head models stored in the memory 26 in one embodiment. There are three sizes, increasing toward the right of the figure. For each size, a model without inclination, a model rotated by π/4, and a model rotated by 3π/4 are prepared.

The correlation unit 27 reads a two-dimensional head model from the memory 26 and scans it over the detection region 47 extracted by the detection region extraction unit 25. The correlation unit 27 performs matching between the head model and the image region currently being scanned in the detection region, and calculates a correlation value. Any matching technique can be used.

In this embodiment, the correlation value r is calculated according to the normalized correlation of equation (6), which accounts for differences in the normalization factor and for position and posture errors. Here, S denotes the size of the blocks to be matched. A block may be, for example, a single pixel, or a group of pixels (for example, 7 × 7 pixels) may be treated as one block. The higher the similarity between the head model and the scanned region, the higher the calculated correlation value r.

As described above, in the head part of the detection region 47, the distance from the imaging device gradually increases from the point of the head nearest the imaging device toward the periphery, and the grayscale values are assigned so as to decrease gradually accordingly. The head model is likewise assigned grayscale values that gradually decrease as the distance from the imaging device increases from the center toward the periphery. Therefore, when the head model is correlated with the head part of the occupant region, a higher correlation value r is calculated than when it is correlated with other parts. The head part can thus be detected from the detection region 47 on the basis of the correlation value r.

Note that the normalization of the foreground region by equation (1) and the normalization of the head model by equation (5) are alike in assigning lower grayscale values as the distance from the imaging device increases, but strictly speaking the two normalizations differ. For example, in equation (1) the grayscale value reaches its minimum at the minimum parallax value dmin, whereas in equation (5) the grayscale value reaches its minimum when the z value becomes infinite. Such a difference in scaling is compensated for by the correlation described above. Positional errors, such as an offset of the center coordinates used when constructing the ellipsoid model of the head, are compensated for in the same way. The correlation judges whether the trends of grayscale change are similar, so a high correlation value is output whenever the correlated image region has a grayscale trend similar to that of the head model (that is, grayscale values that gradually decrease from the center toward the periphery).
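
The tolerance to scaling differences can be illustrated with a zero-mean normalized cross-correlation. The image of equation (6) is not reproduced in this text, so the formula below is a common textbook form swapped in for illustration, not necessarily the patent's own; it also compares pixel by pixel and ignores the block size S. The function name `normalized_correlation` is hypothetical.

```python
import math

def normalized_correlation(model, patch):
    """Zero-mean normalized cross-correlation of two equal-size 2D grids.

    Returns a value in [-1, 1]; 0.0 is returned when either grid is flat.
    """
    m = [v for row in model for v in row]
    p = [v for row in patch for v in row]
    if len(m) != len(p):
        raise ValueError("model and patch must have the same size")
    mean_m = sum(m) / len(m)
    mean_p = sum(p) / len(p)
    num = sum((a - mean_m) * (b - mean_p) for a, b in zip(m, p))
    den = math.sqrt(sum((a - mean_m) ** 2 for a in m)
                    * sum((b - mean_p) ** 2 for b in p))
    return num / den if den else 0.0
```

Because the mean is subtracted and the result is divided by the spread of each grid, a patch that equals the model up to a positive gain and an offset still scores 1, which is the sense in which the correlation absorbs the difference between the two normalizations.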

The image region for which the highest correlation value r is calculated is detected in the detection region 47, and the position of the human head can be determined from the location of that image region in the captured image and from its grayscale values. The size and inclination (posture) of the head can be determined from the head model used in the correlation.

When there are multiple head models, the correlation processing described above is performed for each of the two-dimensional head models stored in the memory 26. The head model that yields the highest correlation value r is identified, together with the image region in the detection region that was correlated with that model. The position of the human head can be determined from the location of this image region in the captured image and from its grayscale values. The size and inclination (posture) of the head can be determined from the head model that yielded the highest correlation value r.
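
Scanning several head models over the detection region and keeping the best match can be sketched as below. The helper `ncc` is a zero-mean normalized correlation used as a stand-in for equation (6), which is not reproduced in this text; the names `best_match` and `ncc`, the dictionary of named templates, and the one-pixel scan step are all assumptions.

```python
import math

def ncc(a, b):
    """Zero-mean normalized correlation of two equal-length flat lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def best_match(region, models):
    """Slide every model over region; return (r, model name, (top, left))."""
    best = (-2.0, None, None)  # below the lowest possible correlation
    for name, model in models.items():
        mh, mw = len(model), len(model[0])
        flat_model = [v for row in model for v in row]
        for top in range(len(region) - mh + 1):
            for left in range(len(region[0]) - mw + 1):
                patch = [region[top + i][left + j]
                         for i in range(mh) for j in range(mw)]
                r = ncc(flat_model, patch)
                if r > best[0]:
                    best = (r, name, (top, left))
    return best
```

The returned model name indicates which size and inclination matched best, and the returned offset locates the matched image region inside the detection region.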

Such grayscale matching of two-dimensional images can be processed faster than conventional identification processing in three-dimensional space.

FIGS. 17(a) and 17(b) show the head region 60 identified by the above correlation processing in the detection region 47 extracted from the example image shown in FIG. 5.

In another embodiment, the object detection apparatus 1 uses a distance measuring device 70 instead of a light source and a pair of imaging devices. FIG. 18 shows the object detection apparatus 1 according to this embodiment. In one example, the distance measuring device 70 is a TOF (time of flight) distance image sensor. The distance image sensor has imaging elements arranged in two dimensions. Using an LED (light-emitting diode), the sensor emits pulses of, for example, infrared light toward the object and receives the light reflected from the object through a lens. Each imaging element determines the phase difference (time delay) between the emitted and received light, and the distance from the distance measuring device 70 to the object is calculated for each pixel from that phase difference. A distance image is generated in this way.
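
The conversion from time delay to distance can be illustrated with the usual round-trip relation d = c·Δt/2. The second function additionally assumes a sensor that modulates its light at a known frequency, so that a phase difference Δφ corresponds to a delay of Δφ/(2πf); the modulation frequency and both function names are assumptions, since the text does not specify how the sensor is driven.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def distance_from_delay(delay_s):
    """Distance for a measured round-trip time delay in seconds."""
    return SPEED_OF_LIGHT * delay_s / 2.0

def distance_from_phase(phase_rad, mod_freq_hz):
    """Distance for a phase difference at an assumed modulation frequency."""
    delay_s = phase_rad / (2.0 * math.pi * mod_freq_hz)
    return distance_from_delay(delay_s)
```

With phase measurement the result is unambiguous only up to c/(2f), for example about 7.5 m at a 20 MHz modulation frequency, which comfortably covers a vehicle cabin.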

The background removal unit 72 of the processing device 15 removes the background region from the distance image and extracts the foreground region. As described above, this can be done by any appropriate technique. For example, a distance image measured and generated by the distance measuring device 70 while the seat is vacant is stored in advance in the memory 73, and the background is removed by comparing this vacant-seat distance image with the currently captured distance image containing the occupant, so that the foreground region is extracted.
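
The comparison against the vacant-seat distance image can be sketched as a per-pixel difference test. The tolerance parameter, the use of 0 for removed pixels, and the function name `remove_background` are assumptions; the text only states that the two images are compared.

```python
def remove_background(current, empty_seat, tolerance):
    """Zero out pixels that match the vacant-seat distance image.

    A pixel is kept as foreground when its distance differs from the
    stored vacant-seat distance by more than tolerance (same units).
    """
    return [
        [cur if abs(cur - ref) > tolerance else 0
         for cur, ref in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, empty_seat)
    ]
```

A nonzero tolerance keeps small measurement fluctuations of the seat itself from being treated as foreground.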

The normalization unit 74 normalizes the foreground region. The normalization can be calculated by replacing dmin in equation (1) with the maximum distance value Lmax in the foreground region and dmax with the minimum distance value Lmin in the foreground region.
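
Equation (1) does not appear in this part of the text. Assuming it is a linear scaling of the form N·(d − dmin)/(dmax − dmin), the substitution described above gives N·(Lmax − L)/(Lmax − Lmin), so that the nearest pixel receives N and the farthest receives 0. The sketch below implements that assumed form; the function name `normalize_distance` and the use of 0 for background pixels are also assumptions.

```python
def normalize_distance(foreground, n=255):
    """Map foreground distances to grayscale, nearest = n, farthest = 0.

    foreground: list of rows of distances; 0 marks a background pixel.
    """
    valid = [v for row in foreground for v in row if v > 0]
    if not valid:
        return [row[:] for row in foreground]
    l_min, l_max = min(valid), max(valid)
    span = l_max - l_min

    def gray(v):
        if v <= 0:
            return 0  # background stays at 0
        if span == 0:
            return n  # all foreground pixels at the same distance
        return round(n * (l_max - v) / span)

    return [[gray(v) for v in row] for row in foreground]
```

Under this assumed form the farthest foreground pixel and the background both map to 0, which matches the stated trend of lower grayscale values at larger distances.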

The detection region extraction unit 75 extracts, from the foreground region, a detection region in which the head is to be detected. The detection region can be set by the same method as that of the detection region extraction unit 25 in FIG. 1.

The memory 76 (which may be the same as the memory 73) stores two-dimensional images of head models, as does the memory 26. The head models are generated by the method described above, with the imaging device in FIG. 15(c) replaced by the distance measuring device. The correlation unit 76 can perform the correlation by the same method as the correlation unit 27.

FIG. 19 shows a flowchart of the processing for detecting an object. This processing is performed at predetermined time intervals. In one embodiment, the processing is implemented by the processing device 15 shown in FIG. 1.

In step S31, an image including a human head is acquired by the pair of imaging devices. In step S32, a parallax image is generated from the acquired images. In step S33, the background is removed from the parallax image and the foreground region is extracted. After the background is removed, particle analysis processing such as dilation and erosion, and smoothing (for example, with a median filter), may be performed to remove noise.

In step S34, the foreground region is normalized according to equation (1). In step S35, a detection region is extracted from the normalized foreground region. In one embodiment, the processes of FIGS. 4, 9, and 13 are executed in step S35. In step S36, a previously generated head model is read from the memory. In step S37, the head model is scanned over the detection region, and a correlation value indicating the similarity between the head model and the image region it overlaps is calculated. In step S38, the image region for which the highest correlation value was calculated is stored in the memory in association with that correlation value and the head model.

In step S39, it is determined whether any head model remains to be correlated. If so, the processing from step S36 is repeated.

In step S40, the highest of the correlation values stored in the memory is selected, and the position and size of the human head are output on the basis of the image region and head model corresponding to that correlation value.

When a distance measuring device is used instead of the imaging devices, a distance image is generated with the distance measuring device in steps S31 and S32.

In the above embodiments, the two-dimensional image model and the image region were correlated with respect to the trend that the grayscale value decreases as the distance from the imaging device increases. A different grayscale trend may be used instead. For example, similarity may be judged with respect to a trend in which the grayscale value increases as the distance from the imaging device increases.

Although embodiments for detecting a human head have been described, the object detection apparatus of the present invention is also applicable to detecting other objects. An object other than a human can be detected in the same way by generating a two-dimensional image model that has the characteristic shape of the object viewed from a predetermined direction and that expresses the distance in that direction as grayscale values.

Although embodiments in which the object detection apparatus is mounted on a vehicle have been described, the object detection apparatus of the present invention is applicable in various settings. For example, it is applicable to detecting the approach of a certain object (for example, a human head) and taking some action (for example, issuing a message) in response to the detection.

FIG. 1 is a block diagram of an apparatus for detecting an object according to one embodiment of the present invention.
FIG. 2 is a diagram explaining a method of calculating parallax according to one embodiment of the present invention.
FIG. 3 is a diagram explaining background removal according to one embodiment of the present invention.
FIG. 4 is a flowchart of processing for extracting a detection region according to one embodiment of the present invention.
FIG. 5 is a diagram for explaining a method of extracting a detection region according to one embodiment of the present invention.
FIG. 6 is a diagram for explaining a method of obtaining a center position according to one embodiment of the present invention.
FIG. 7 is a diagram for explaining another method of obtaining a center line according to one embodiment of the present invention.
FIG. 8 is a diagram for explaining the technical significance of further limiting the detection region in the distance direction according to one embodiment of the present invention.
FIG. 9 is a flowchart of processing for further limiting the detection region in the distance direction according to one embodiment of the present invention.
FIG. 10 is a diagram for explaining the limitation of the detection region in the distance direction according to one embodiment of the present invention.
FIG. 11 is a diagram for explaining another technical significance of further limiting the detection region in the distance direction according to one embodiment of the present invention.
FIG. 12 is a diagram showing another setting of the thresholds that limit the detection region in the distance direction according to one embodiment of the present invention.
FIG. 13 is a flowchart of processing for detecting recesses and holes in the detection region according to one embodiment of the present invention.
FIG. 14 is a diagram for explaining processing for detecting a recess in the detection region according to one embodiment of the present invention.
FIG. 15 is a diagram explaining a head model according to one embodiment of the present invention.
FIG. 16 is a diagram showing an example of multiple kinds of head models according to one embodiment of the present invention.
FIG. 17 is a diagram showing an example of a detected head region according to one embodiment of the present invention.
FIG. 18 is a block diagram of an apparatus for detecting an object according to another embodiment of the present invention.
FIG. 19 is a flowchart of processing for detecting an object according to one embodiment of the present invention.

Explanation of symbols

1 Object detection apparatus
23, 26 Memory
12, 13 Imaging device
15 Processing device
70 Distance measuring device

Claims

An object detection apparatus for detecting an object, comprising:
foreground acquisition means for acquiring a foreground image region by removing a background from an image including a three-dimensional object as the object;
inclination calculating means for calculating an inclination of a center line of the foreground image region with respect to a predetermined reference axis;
region extraction means for extracting a detection region from the foreground image region based on the inclination; and
object detection means for detecting the object from the detection region.

The object detection apparatus according to claim 1, further comprising:
center position calculating means for obtaining, in the foreground image region, first and second center positions, each representing the position that divides into two equal parts the pixels present in the pixel row serving as the first or second reference line, respectively; and
center line calculating means for obtaining, as the center line, a line connecting the first and second center positions.

The object detection apparatus according to claim 1 or 2, wherein, when the inclination is equal to or greater than a predetermined value, the region extraction means extracts the detection region by cutting out the foreground image region along lines parallel to the center line.

The object detection apparatus according to claim 3, wherein the parallel lines are set at predetermined widths to the left and right of the center line.

The object detection apparatus according to claim 4, further comprising distance acquisition means for acquiring a distance to the object in the image, wherein, when the inclination is equal to or greater than the predetermined value, the region extraction means sets the relative sizes of the left width and the right width on the basis of the relative magnitudes of the distance of the object contained in the image region to the left of the center line and the distance of the object contained in the image region to the right of the center line.

The object detection apparatus according to claim 5, wherein the region extraction means makes the left width larger than the right width if the distance of the object in the left image region is shorter than that of the object in the right image region, and makes the right width larger than the left width if the distance of the object in the right image region is shorter than that of the object in the left image region.

The object detection apparatus according to claim 2, wherein, when the inclination is smaller than a predetermined value, the region extraction means extracts the detection region by cutting out the foreground image region along vertical lines set at equal widths to the left and right of one of the first and second center positions.

The object detection apparatus according to claim 4, wherein the left width and the right width are set so that their sum is a predetermined ratio of the maximum width of the foreground image region.

Further comprising:
Distance acquisition means for acquiring, for each pixel, a distance value to the object for the object in the image;
Second center line calculating means for calculating a second center line representing a reference distance value of each pixel row based on the distance value of the pixels on the center line;
First removing means for obtaining, for each pixel row in the detection area, the reference distance value from the second center line, and removing from the pixel row pixels having a distance value outside a first predetermined range with respect to the reference distance value,
The removal by the first removing means is performed for a predetermined range in the horizontal direction.
The object detection apparatus according to claim 1.
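
The row-wise removal described in this claim might be sketched as follows. The data layout (lists of rows, with `None` marking an absent pixel), the parameter names, and the symmetric `tolerance` standing in for the first predetermined range are all assumptions made for illustration.

```python
def remove_outliers(depth_rows, reference, center_col, half_span, tolerance):
    """Blank out pixels whose distance deviates too far from the row reference.

    depth_rows : list of rows, each a list of distance values (None = no pixel)
    reference  : reference distance value per row (the second center line)
    center_col : column index of the center line in each row
    half_span  : horizontal range, in pixels, examined on each side
    tolerance  : hypothetical symmetric stand-in for the first predetermined range
    """
    result = []
    for row, ref, cx in zip(depth_rows, reference, center_col):
        new_row = list(row)
        lo = max(0, cx - half_span)
        hi = min(len(row), cx + half_span + 1)
        for x in range(lo, hi):
            d = new_row[x]
            if d is not None and abs(d - ref) > tolerance:
                new_row[x] = None  # remove the pixel from this row
        result.append(new_row)
    return result
```

Pixels outside the horizontal range are left untouched, which mirrors the limitation that removal is performed only for a predetermined range in the horizontal direction.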

The second center line calculating means calculates the second center line based on a distance value obtained by averaging the distance values of the pixels included in the first reference row and a distance value obtained by averaging the distance values of the pixels included in the second reference row,
The object detection apparatus according to claim 9.

The second center line calculating means calculates the second center line based on the distance value of the pixel at the first center position and the distance value of the pixel at the second center position,
The object detection apparatus according to claim 9.

The first predetermined range is set to become smaller as the distance from the center line increases within the predetermined range in the horizontal direction.
The object detection device according to claim 9.

The first predetermined range is set to have, with respect to the second center line, a first width in the direction in which the distance value increases and a second width in the direction in which the distance value decreases, the first width being greater than the second width;
The object detection apparatus according to claim 9.
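
An asymmetric range of this kind can be expressed as a short membership test. The function and parameter names are hypothetical, and the numeric widths in the usage note are illustrative only.

```python
def in_first_range(distance, reference, far_width, near_width):
    """Return True if distance lies inside the asymmetric first range.

    far_width  : allowance in the direction of increasing distance
    near_width : allowance in the direction of decreasing distance
    """
    if far_width <= near_width:
        raise ValueError("the first (far) width must exceed the second (near) width")
    return reference - near_width <= distance <= reference + far_width
```

With a reference distance of 1.0, a far width of 0.5, and a near width of 0.2, a pixel at 1.3 is kept while a pixel at 0.7 is removed.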

The predetermined range in the horizontal direction is set outside a predetermined central region of the detection region;
The object detection device according to claim 9.

Further comprising:
Second removing means for removing from the pixel row, in the predetermined central region and for each pixel row in the detection region, pixels having a distance value outside a second predetermined range with respect to the reference distance value of the pixel row,
The object detection apparatus according to claim 14.

The second predetermined range is set to be larger than the first predetermined range;
The object detection apparatus according to claim 15.

Further comprising:
Recess detection means for detecting whether or not a recess is formed in a predetermined lower region of the detection region;
Means for closing the recess with a line consisting of pixels if the recess is detected;
Means for detecting whether there is a hole in the detection region;
Means for filling the hole with pixels having a predetermined distance value if the hole is detected;
The object detection device according to claim 1.

The recess detection means includes:
Means for determining the lowest point in the vertical direction in each of the two regions obtained by dividing the detection region by the center line;
Means for determining whether or not a pixel exists in a region between the midpoint of a line connecting the two lowest points thus obtained and a point located at a predetermined height from the midpoint;
The object detection device according to claim 17.
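
The recess test in this claim could be sketched as below. Several points are assumptions made for illustration: the detection region is a binary mask with row 0 at the top, the center line is simplified to a single vertical column, the predetermined height is measured upward from the midpoint, and a recess is reported when no pixel is found in the examined span.

```python
def lowest_point(mask, col_start, col_end):
    """Return (row, col) of the lowest foreground pixel in columns [col_start, col_end)."""
    for r in range(len(mask) - 1, -1, -1):
        for c in range(col_start, col_end):
            if mask[r][c]:
                return r, c
    return None


def has_recess(mask, center_col, height):
    """Report a recess when no pixel lies between the midpoint of the two
    lowest points and the point `height` rows above it (assumed direction)."""
    width = len(mask[0])
    left = lowest_point(mask, 0, center_col)
    right = lowest_point(mask, center_col, width)
    if left is None or right is None:
        return False
    mid_row = (left[0] + right[0]) // 2
    mid_col = (left[1] + right[1]) // 2
    top = max(0, mid_row - height)
    return not any(mask[r][mid_col] for r in range(top, mid_row + 1))
```

A region shaped like an inverted U yields a recess when the examined height stays inside the gap, whereas a solid region does not.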

The predetermined distance value is an average distance value obtained by averaging distance values of pixels included in the detection area.
The object detection device according to claim 17 or 18.
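
Filling holes with the average distance value might look like the following sketch. The claims do not define how a hole is recognized; here a hole is assumed to be any group of missing pixels (`None`) that is not connected to the image border, and the function name and data layout are likewise hypothetical.

```python
from collections import deque


def fill_holes(depth):
    """Return a copy of depth with enclosed missing pixels set to the average distance."""
    rows, cols = len(depth), len(depth[0])
    values = [d for row in depth for d in row if d is not None]
    if not values:
        return [list(row) for row in depth]
    average = sum(values) / len(values)

    # Missing pixels reachable from the border are background, not holes.
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and depth[r][c] is None:
                outside[r][c] = True
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and depth[nr][nc] is None and not outside[nr][nc]):
                outside[nr][nc] = True
                queue.append((nr, nc))

    return [[average if depth[r][c] is None and not outside[r][c] else depth[r][c]
             for c in range(cols)] for r in range(rows)]
```

Using the average of the valid distance values keeps the filled pixels consistent with the rest of the detection area, which is the choice this claim recites.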

Further comprising distance acquisition means for acquiring, for the object in the image, a distance value to the object;
Means for calculating, for each area obtained by subdividing the foreground image area, a grayscale value representing the distance value, and generating a grayscale image having the grayscale value corresponding to each area;
The region extraction means extracts the detection region from the grayscale image,
Model storage means for storing a model obtained by modeling the three-dimensional object;
Means for calculating a correlation value representing the degree of similarity between the model and the image region in the detection region;
The object detection means detects the three-dimensional object by detecting an image area having the highest correlation value with the model in the detection area.
The object detection apparatus according to claim 1.
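
The model matching step can be illustrated with a small sliding-window search. The claims speak only of a correlation value representing similarity; zero-mean normalized cross-correlation is used here as one plausible choice, and the function names and list-of-lists image layout are assumptions.

```python
import math


def ncc(patch, model):
    """Zero-mean normalized cross-correlation of two equal-sized 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in model for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0  # flat windows score zero


def best_match(image, model):
    """Return ((row, col), score) of the window with the highest correlation."""
    mh, mw = len(model), len(model[0])
    best_pos, best_score = None, -2.0
    for r in range(len(image) - mh + 1):
        for c in range(len(image[0]) - mw + 1):
            patch = [row[c:c + mw] for row in image[r:r + mh]]
            score = ncc(patch, model)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```

The window returned by `best_match` corresponds to the image area having the highest correlation value with the model; an exhaustive search is shown for clarity rather than speed.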

The three-dimensional object is a human head;
The object detection apparatus according to claim 1.

The detected human head is the head of a person riding in a vehicle.
The object detection apparatus according to claim 21.