JP5861299B2 - Detection apparatus and detection method

Info

Publication number: JP5861299B2
Application number: JP2011162490A
Authority: JP
Inventors: 林　正樹; 広和笠原
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2011-07-25
Filing date: 2011-07-25
Publication date: 2016-02-16
Anticipated expiration: 2031-07-25
Also published as: JP2013025713A

Description

The present invention relates to a detection apparatus and a detection method.

Apparatuses that detect a human body or the like using an infrared sensor or a similar device are under development. Such an apparatus is used, for example, to make a vehicle occupant aware that a person or the like is present around the vehicle.

Patent Document 1 describes using images captured by a far-infrared camera as an auxiliary means of detecting a person, whose millimeter-wave reflection is weak, in distinction from a vehicle or the like. Patent Document 2 describes detecting a person based on whether a luminance value corresponding to the temperature of an object falls within a threshold range.

Patent Document 1: JP 2010-127717 A
Patent Document 2: JP 2006-101384 A

The methods described above basically determine that a person is present when a temperature close to that of a person is found. The surface temperature of a person, however, varies with the temperature of the person's surroundings (the environmental temperature), and the methods described above do not take this variation into account. For example, a person in summer is lightly dressed, so the surface temperature is close to body temperature, whereas a person in winter is heavily dressed, so the surface temperature is close to the environmental temperature. Detection can therefore be difficult depending on the season.

The temperature of exposed skin such as the face can be regarded as nearly constant regardless of the season. However, the face is not always captured in the image, depending on its orientation, the hairstyle, the clothing, and so on, so it is not certain that it can be detected effectively. In short, when the temperature of the detection target object changes with the environmental temperature, there is a problem in that the detection target object cannot be detected accurately.

The present invention has been made in view of these circumstances, and its object is to improve the detection accuracy for a detection target object.

The main invention for achieving the above object is a detection apparatus including:
an image generation unit that generates a detection target image containing gradation values corresponding to the output of a sensor; and
a detection unit having a trained classifier used to detect a detection target object from the detection target image,
wherein the classifier has a plurality of sub-classifiers, each of which detects the detection target object in the detection target image based on the gradation values of two regions in the detection target image,
the detection unit inputs, to the plurality of sub-classifiers, the gradation values of the corresponding two regions among a plurality of regions of the detection target image, and detects the detection target object in the detection target image based on the outputs of the plurality of sub-classifiers,
the plurality of sub-classifiers output values based on a feature amount obtained when the gradation values of the two regions are input, and
the feature amount represents the difference between the gradation value of one of the two regions and an estimated gradation value of that one region, estimated based on the gradation value of the other of the two regions.

Other features of the present invention will become apparent from the description of this specification and the accompanying drawings.

The accompanying drawings show, in order: a block diagram of the schematic configuration of the person detection system 1 in the present embodiment; a diagram explaining the outline of the present embodiment; an explanatory diagram of the person detection method of a reference example; a flowchart of the learning process in the present embodiment; a flowchart of the regression equation creation process in the present embodiment; a diagram explaining how a cropped image is divided into a plurality of cells; a flowchart of the feature amount extraction process in the present embodiment; a flowchart of the person detection process (whole image) in the present embodiment; a flowchart of the person detection process (cropped image) in the present embodiment; and an explanatory diagram of how the search window is moved.

At least the following matters will become clear from the description of this specification and the accompanying drawings. Namely, a detection apparatus including:
an image generation unit that generates a detection target image containing gradation values corresponding to the output of a sensor; and
a detection unit having a trained classifier used to detect a detection target object from the detection target image,
wherein the classifier has a plurality of sub-classifiers, each of which detects the detection target object in the detection target image based on the gradation values of any two regions in the detection target image, and
the detection unit inputs, to the plurality of sub-classifiers, the gradation values of the corresponding two regions among a plurality of regions of the detection target image, and detects the detection target object in the detection target image based on the outputs of the plurality of sub-classifiers.
Because the apparatus has a plurality of sub-classifiers that each detect the detection target object based on the gradation values of any two regions, the detection target object can be detected using the one or more sub-classifiers whose detection accuracy is good. The detection accuracy for the detection target object can thereby be improved.

In this detection apparatus, it is desirable that the plurality of sub-classifiers output values based on a feature amount obtained when the gradation values of the two regions are input, and that the feature amount represent the difference between the gradation value of one of the two regions and an estimated gradation value of that one region, estimated based on the gradation value of the other of the two regions.
With this configuration, when the gradation values of two regions are input to the trained classifier, the feature amount is obtained as the difference between the gradation value of one region and the estimated gradation value of that region, estimated from the gradation value of the other region. In the trained classifier, this difference indicates how closely the detection target image resembles the detection target object. Outputting a value based on such a feature amount therefore allows the detection target object to be detected accurately.

When the relationship between the estimated gradation value and the gradation value of the other region is expressed by a linear expression, it is desirable to obtain the coefficients of the linear expression by the least squares method, using the gradation values of the one region and the gradation values of the other region in a plurality of learning images that contain the detection target object.
In this way, the relational expression between the estimated gradation value of the one region and the gradation value of the other region can be obtained statistically from a plurality of learning images. Detection accuracy can therefore be improved.

It is also desirable to input the gradation value of the one region and the gradation value of the other region of the learning images to the corresponding sub-classifiers to obtain the feature amount of each sub-classifier, and to include a predetermined number of the sub-classifiers, in ascending order of feature amount, in the evaluation formula used to detect the detection target object.
The smaller the feature amount, that is, the smaller the difference between the gradation value of one region and the estimated gradation value of that region estimated from the gradation value of the other region, the more closely the detection target image resembles the detection target object. A sub-classifier with a smaller difference is therefore a better sub-classifier. By including a predetermined number of sub-classifiers, in ascending order of feature amount, in the evaluation formula used to detect the detection target object, the detection target object can be detected accurately.

It is also desirable that the gradation values of the corresponding two regions in the detection target image be input to each of the plurality of sub-classifiers, and that whether the detection target image contains the detection target object be determined based on the sum of the outputs of the plurality of sub-classifiers.
In this way, whether the detection target image contains the detection target object can be determined based on the sum of the outputs of the plurality of sub-classifiers.

It is also desirable that the image generation unit generate a plurality of detection target images by cropping a plurality of images from a whole image that contains the detection target images, and that the position of the detection target object in the whole image be identified by performing detection of the detection target object on each of the plurality of detection target images.
In this way, even when a plurality of detection target objects are present in the whole image, each of them can be detected.

It is also desirable that the output of the sensor be an output corresponding to the temperature detected by the sensor.
In this way, the detection target object can be detected according to temperature.

At least the following matters will also become clear from the description of this specification and the accompanying drawings. Namely, a detection method including:
generating a detection target image containing gradation values corresponding to the output of a sensor; and
detecting a detection target object from the detection target image using a plurality of sub-classifiers, each of which detects the detection target object in the detection target image based on the gradation values of any two regions in the detection target image,
wherein the detecting includes inputting, to the plurality of sub-classifiers, the gradation values of the corresponding two regions among a plurality of regions of the detection target image, and detecting the detection target object in the detection target image based on the outputs of the plurality of sub-classifiers.
Because the method uses a plurality of sub-classifiers that each detect the detection target object based on the gradation values of any two regions, the detection target object can be detected using the one or more sub-classifiers whose detection accuracy is good. The detection accuracy for the detection target object can thereby be improved.

=== Embodiment ===
FIG. 1 is a block diagram showing the schematic configuration of the person detection system 1 in the present embodiment. The embodiment below is described as a system that detects persons, but the detection target object is not limited to persons. FIG. 1 shows the infrared camera 110, the person detection device 120, and the display device 130 included in the person detection system 1. In the present embodiment, the infrared camera 110, the person detection device 120, and the display device 130 are separate units that are electrically connected to one another, but at least two of them may be combined into a single device.

The infrared camera 110 (corresponding to the sensor) captures wavelengths in the mid-infrared to far-infrared range and transmits a digital video signal to the image acquisition unit 122 of the person detection device 120. The infrared camera 110 includes an imaging unit and an analog-to-digital conversion unit (A/D conversion unit), neither of which is shown. The imaging unit corresponds to the light receiving element of the infrared camera 110 and outputs, to the person detection device 120, a signal corresponding to the infrared light received by the light receiving element. The A/D conversion unit converts the analog signal obtained by the imaging unit into a digital signal.

Here, mid-infrared light is infrared light with a wavelength of 2.5 μm to 4.0 μm, and far-infrared light is infrared light with a wavelength of 4 μm to 1000 μm. In the present embodiment, the infrared camera 110 senses wavelengths of 8 μm to 14 μm and targets the body temperature of a person, but it is not limited to this band; any wavelength at which temperature can be detected may be used. The infrared camera 110 is mounted on, for example, the front grille of a vehicle, and captures the environment ahead of the host vehicle (the vehicle on which the infrared camera 110 is mounted).

The person detection device 120 includes an image acquisition unit 122, an image memory 124, a control unit 126, and a storage unit 128. Through the processing described later, it generates the data to be displayed on the display device 130. The image acquisition unit 122, the image memory 124, the control unit 126, and the storage unit 128 are realized by, for example, a central processing unit (CPU), random access memory (RAM), and a hard disk drive (HDD), none of which are shown.

The image acquisition unit 122 acquires the video captured by the infrared camera 110 (for example, 15 fps video) and obtains frame images (whole images) from it. Each image obtained is sent to the image memory 124.

The image memory 124 temporarily stores the images sent from the image acquisition unit 122. The control unit 126 performs the computation for the person detection process, which is described in detail later. The storage unit 128 stores data such as the learning model, temporary files produced during computation, computation results, and so on.

The display device 130 is, for example, a display that shows the view ahead of the host vehicle obtained as an infrared image. The display device 130 can also highlight a detected person as the result of person detection.

FIG. 2 is a diagram explaining the outline of the present embodiment. In FIG. 2, each process is drawn as a block in order to explain the outline of the person detection system 1. The person detection system 1 performs a learning process and a detection process. The person detection system 1 does not necessarily have to perform the learning process itself: an external memory or the like may be used as the storage unit 128, or the data needed for the detection process, obtained by a learning process, may be stored in the storage unit 128 from outside.

In the learning process, learning is performed using learning cropped images prepared in advance, and the learning result is stored in the learning model database (the storage unit 128). A learning cropped image is an image prepared for learning that always contains a person.

In the detection process, a cropped image (corresponding to the detection target image) is cut out of the whole image obtained from the infrared camera 110, and whether the cropped image contains a person is determined according to the learning result.

In the learning process, preprocessing such as contrast adjustment is applied to the learning cropped images, which are then passed to the learner. The learner obtains feature amounts from the preprocessed learning cropped images and stores the learning result based on those feature amounts in the learning model database.

In the detection process, part of the whole image obtained from the infrared camera 110 is cropped (cut out) to generate a cropped image. Preprocessing such as contrast adjustment is applied to the cropped image, which is then passed to the classifier. The classifier has a plurality of sub-classifiers that reflect the learning result. The gradation values of the cropped image are input to the sub-classifiers, and whether the cropped image contains a person is determined based on their outputs.

A plurality of cropped images are generated by cutting images out of the whole image while shifting the position little by little. As described above, whether a person is contained is determined for each cropped image, and the determination results are then integrated. The integrated result is output to the display device 130 as an output image that has been processed so that, for example, persons are highlighted in the whole image.
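A minimal sketch of generating cropped images by shifting a window over the whole image is given below. The window size, the step, the raster scan order, and the function name `crop_windows` are all assumptions made for illustration; this part of the text only states that the position is shifted little by little.

```python
from typing import Iterator, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # rows of gradation values


def crop_windows(image: Image, win_h: int, win_w: int,
                 step: int) -> Iterator[Tuple[Tuple[int, int], List[List[float]]]]:
    """Yield ((top, left), cropped image) for each window position.

    Assumption: the window moves in raster order, left to right and then top
    to bottom, by `step` pixels, and windows that would extend past the image
    border are skipped.
    """
    rows, cols = len(image), len(image[0])
    for top in range(0, rows - win_h + 1, step):
        for left in range(0, cols - win_w + 1, step):
            crop = [list(row[left:left + win_w])
                    for row in image[top:top + win_h]]
            yield (top, left), crop
```

Keeping the (top, left) position with each cropped image is what lets the per-window determinations be mapped back onto the whole image when the results are integrated.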

FIG. 3 is an explanatory diagram of the person detection method of the reference example. The concept of this method is described with reference to FIG. 3, which shows the person region gradation value Tf and the background region gradation value Tb in an image that contains a person. Both Tf and Tb are gradation values converted from the temperature acquired by the infrared camera. In the reference example, a person is judged to be present when the person region gradation value Tf is within a predetermined range, and this predetermined range changes with the background region gradation value Tb. This is because the temperature of a person generally changes under the influence of the surrounding temperature.

In the reference example, the person region gradation value is estimated from the background region gradation value Tb, giving the person region estimated gradation value Tf'. Tf' is expressed as a function of Tb. Here, Tf and Tb are the gradation values of regions (cells) at predetermined positions in the detection target image, for example a region near the center and a region in the upper right corner of the detection target image.

Substituting the background region gradation value Tb into this function gives the person region estimated gradation value Tf'. When the absolute value of the difference between Tf' and the actual person region gradation value Tf is within a predetermined range, the image is judged to contain a person. When the absolute value of the difference is not within the predetermined range, the image is judged not to contain a person.
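The determination in the reference example can be sketched as follows. The text only says that Tf' is a function of Tb, so the linear form with coefficients `a` and `b`, the tolerance `tol` standing in for the predetermined range, and both function names are assumptions made for this illustration.

```python
def estimate_person_level(tb: float, a: float, b: float) -> float:
    """Estimated person-region gradation value Tf' for a background value Tb.

    Assumption: a linear function Tf' = a * Tb + b is used here; the text
    does not fix the form of the function.
    """
    return a * tb + b


def contains_person(tf: float, tb: float, a: float, b: float,
                    tol: float) -> bool:
    """Judge a person present when |Tf' - Tf| is within the tolerance.

    Assumption: the predetermined range is modeled as |Tf' - Tf| <= tol.
    """
    tf_est = estimate_person_level(tb, a, b)
    return abs(tf_est - tf) <= tol
```

With illustrative values a = 0.5, b = 130 and tol = 10, a background value of 100 gives an estimate of 180, so a measured Tf of 180 is accepted and a measured Tf of 120 is rejected.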

Even the method of the reference example improves the accuracy of determining whether an image contains a person compared with the conventional techniques. In the reference example, however, the position of the cell giving the person region gradation value Tf and the position of the cell giving the background region gradation value Tb are fixed, and it is not certain that the positional relationship shown in FIG. 3 is the best one for these two cells. In other words, the method of the reference example leaves room for detecting persons more accurately. The present embodiment therefore performs the learning described below in order to detect persons more accurately.

FIG. 4 is a flowchart of the learning process in the present embodiment. The learning process is described below with reference to this flowchart. As mentioned above, a plurality of learning cropped images are prepared before the learning process is performed. The learning model is built using these learning cropped images, which are certain to contain a person.

First, the plurality of learning cropped images (learning images) are read (S102). Here, N learning cropped images are prepared. The feature amount extraction process is then performed on each of these learning cropped images (S104). The feature amount extraction process uses regression equations, so the process of creating the regression equations is described first.

FIG. 5 is a flowchart of the regression equation creation process in the present embodiment. FIG. 6 is a diagram explaining how a cropped image is divided into a plurality of cells (regions). In the regression equation creation process, each learning cropped image is divided into a plurality of cells. Because the learning cropped images are acquired by the infrared camera 110, their gradation values are related to temperature. Here, linear regression equations are obtained using the gradation values (representing temperature) of two of the cells of the learning cropped images.

FIG. 6 shows a learning cropped image divided into a plurality of cells as described above. Cell numbers are assigned in order from the upper left cell toward the right, and the gradation value T_i of each cell is used (i is the cell number). In the present embodiment each cell contains a plurality of pixels, so either the mean or the median of the gradation values of those pixels may be used as the gradation value T_i of the cell. When each cell is a single pixel, the gradation value of that pixel can be used as T_i directly.
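Computing the per-cell gradation values can be sketched as below. The image is represented as a list of rows, cells are indexed from 0 rather than 1, and numbering is assumed to continue row by row after the first row; the cell size and the function name `cell_values` are likewise assumptions for illustration.

```python
from statistics import mean, median
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]  # rows of gradation values


def cell_values(image: Image, cell_h: int, cell_w: int,
                reduce: Callable[[List[float]], float] = mean) -> List[float]:
    """Return one gradation value per cell, numbered from the upper left.

    `reduce` is either `mean` or `median`, matching the two options in the
    text. Assumption: the image size is a multiple of the cell size; any
    leftover pixels at the right or bottom edge are ignored.
    """
    rows, cols = len(image), len(image[0])
    values = []
    for top in range(0, rows - cell_h + 1, cell_h):
        for left in range(0, cols - cell_w + 1, cell_w):
            pixels = [image[r][c]
                      for r in range(top, top + cell_h)
                      for c in range(left, left + cell_w)]
            values.append(reduce(pixels))
    return values
```

Passing `reduce=median` makes a cell value less sensitive to a few outlying pixels, which is one possible reason for offering the median as an alternative to the mean.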

In step S202 of FIG. 5, a loop over the variable i is set up. The variable i is a cell number and runs from 1 to M in increments of 1. In step S204, a loop over the variable j is set up. The variable j is a cell number and runs from 1 to M in increments of 1. In step S206, a loop over the variable k is set up. The variable k is the number of a learning cropped image and runs from 1 to N in increments of 1.

In step S208, the gradation value T_ki of the i-th cell of the k-th learning cropped image is acquired. Next, in step S210, the gradation value T_kj of the j-th cell of the k-th learning cropped image is acquired. In step S212, k is incremented by 1. This processing is repeated from the first learning image to the N-th learning image (S214).

Next, based on the gradation values T_ki and T_kj thus obtained, the coefficients a_ij and b_ij of the linear regression equation T_j' = a_ij · T_i + b_ij are obtained by the least squares method (S216). In step S218, j is incremented by 1.

This processing with j as the variable is repeated from cell number 1 to cell number M (S220). In step S222, i is incremented by 1. This processing with i as the variable is repeated from cell number 1 to cell number M (S224). As a result, M^2 linear regression equations are obtained.
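The regression equation creation process can be sketched in plain Python as follows. The closed-form least squares fit of a straight line is standard; the data layout `cells[k][i]`, the 0-based indices, the function names, and the handling of a cell whose value is constant across all learning images are assumptions made for this sketch.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least squares fit of y = a * x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        # Assumption: if x never varies, fall back to predicting the mean of y.
        return 0.0, mean_y
    a = sxy / sxx
    return a, mean_y - a * mean_x


def fit_all_pairs(cells: Sequence[Sequence[float]]
                  ) -> Tuple[List[List[float]], List[List[float]]]:
    """Fit T_j' = a[i][j] * T_i + b[i][j] for every cell pair (i, j).

    cells[k][i] is the gradation value of cell i in learning image k, so
    there are N rows (images) and M columns (cells), giving M * M equations.
    """
    m = len(cells[0])
    a = [[0.0] * m for _ in range(m)]
    b = [[0.0] * m for _ in range(m)]
    for i in range(m):                      # loop of S202
        t_i = [image[i] for image in cells]
        for j in range(m):                  # loop of S204
            t_j = [image[j] for image in cells]
            a[i][j], b[i][j] = fit_line(t_i, t_j)   # S216
    return a, b
```

The loop over k in the flowchart corresponds to gathering `t_i` and `t_j` across the N learning images before each fit.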

線形回帰式を求められると、これに基づいて特徴量を抽出することができるようになる。
図７は、本実施形態における特徴量抽出処理のフローチャートである。ここでは、１枚の画像から特徴量を計算する手法について説明する。なお、この特徴量抽出処理は、学習処理だけでなく、後述する人物検出処理においても使用される。学習処理では、ステップＳ１０２において読み込まれた学習用クロッピング画像の特徴量が抽出されることになる。 When the linear regression equation is obtained, the feature amount can be extracted based on the linear regression equation.
FIG. 7 is a flowchart of the feature amount extraction processing in the present embodiment. Here, a method for calculating a feature amount from one image will be described. This feature amount extraction process is used not only in the learning process but also in the person detection process described later. In the learning process, the feature amount of the learning cropped image read in step S102 is extracted.

In step S302, a loop over the variable i is set up, so that i runs from 1 to M in increments of 1. Next, the gradation value T_i of the i-th cell is acquired (S304).

In step S306, a loop over the variable j is set up, so that j runs from 1 to M in increments of 1. Next, the feature amount v_ij is calculated (S308). Each of the feature amounts obtained through step S306 corresponds to a sub-classifier.

The feature amount v_ij is obtained by the following equation.

v_ij = T_j - T_j'
     = T_j - (a_ij·T_i + b_ij)

Here, i and j are variables that change within the loops. The values of a_ij and b_ij are those obtained when the linear regression equations described above were created. T_i is the gradation value of the i-th cell, and T_j is the gradation value of the j-th cell.

In step S310, the variable j is incremented by 1. This processing is repeated with j as the variable from cell number 1 to cell number M (S312). In step S314, the variable i is incremented by 1. This processing is repeated with i as the variable from cell number 1 to cell number M (S316).

In this way, a feature amount is obtained for each linear regression equation. That is, feature amounts v_11 to v_MM are obtained, for a total of M^2 feature amounts.
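The double loop of steps S302 to S316 can be sketched as below. The sketch follows the description of the feature amount as the difference between the actual gradation value of cell j and its estimate from cell i. The function name `extract_features` and the zero-based nested-list layout are assumptions for illustration only.

```python
def extract_features(t, a, b):
    """Return the M x M table of feature amounts for one cropped image.

    t[i] is the gradation value of cell i; a[i][j] and b[i][j] are the
    regression coefficients learned beforehand (hypothetical layout).
    v[i][j] is the residual between the actual value of cell j and the
    value estimated for cell j from cell i.
    """
    m = len(t)
    return [
        [t[j] - (a[i][j] * t[i] + b[i][j]) for j in range(m)]
        for i in range(m)
    ]
```

For an image whose cells follow the learned relationships exactly, every entry of the table is zero. The entries move away from zero as the image departs from the pattern seen in training.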

Learning is performed based on the feature amounts obtained in this way (S106). The feature amount v_ij represents the difference between the actual gradation value T_j of one cell (the gradation value of one region) and the estimated gradation value T_j' of that cell (the estimated gradation value of the one region), which is obtained from the gradation value T_i of the other cell. Because the regression equations were derived from the learning cropped images, the estimated gradation value and the actual gradation value should be nearly equal for those images. It is therefore desirable to determine whether the detection target object is included in an image based on those combinations of two cell numbers for which the feature amount is small when the learning cropped images are used.

According to this principle, whether the detection target object is included in an image can be determined using the cell combinations whose feature amounts are small when the gradation values of two cells of the learning cropped images are input.

Accordingly, with E denoting the evaluation value used for the determination, the evaluation formula

E = w_11·v_11 + w_12·v_12 + ... + w_MM·v_MM

can be used. Here, w_ij is a weighting coefficient. The weighting coefficients can be determined as follows. For example, a predetermined number of combinations of i and j are selected in ascending order of the value of the feature amount v_ij obtained when the learning cropped images are used. The weighting coefficient w_ij is then set to "1" for each selected combination of i and j, and to "0" for each combination that was not selected. In this way, cell combinations with a large feature amount v_ij are excluded from the evaluation formula, and cell combinations with a small feature amount v_ij are incorporated into it.
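The weight selection described above might be sketched as follows. The text does not say how the feature amounts of the N learning images are combined before ranking, so this sketch assumes the mean absolute value per cell pair. The function name `select_weights` and the data layout are likewise hypothetical.

```python
def select_weights(train_features, count):
    """Set w[i][j] = 1 for the `count` cell pairs with the smallest features.

    train_features[k][i][j] is the feature amount v_ij computed on
    training crop k. Ranking by mean absolute value is an assumption;
    the source only says "in ascending order of v_ij".
    """
    n = len(train_features)
    m = len(train_features[0])
    score = {
        (i, j): sum(abs(f[i][j]) for f in train_features) / n
        for i in range(m)
        for j in range(m)
    }
    # sorted() is stable, so ties keep their (i, j) iteration order.
    chosen = sorted(score, key=score.get)[:count]
    w = [[0] * m for _ in range(m)]
    for i, j in chosen:
        w[i][j] = 1
    return w
```

With binary weights, the evaluation formula reduces to a sum over the selected pairs only, so just `count` feature amounts contribute to E.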

In this way, each weighting coefficient w_ij is determined, and the evaluation formula for obtaining the evaluation value E and the M^2 feature amount calculation formulas are output as a learning model (S108). This learning model is stored in the storage unit 128 and used in the subsequent person detection processing. The learning model corresponds to a learned classifier.

Although the learner and classifier described above are adopted here, the learner and classifier are not limited to these, and AdaBoost may be used instead. When AdaBoost is used, each feature amount calculation formula may be treated as a weak classifier. A strong classifier may also be constructed by selecting, from the individual feature amounts, those suited to person detection.

FIG. 8 is a flowchart of the person detection processing (entire image) in the present embodiment. The processing for detecting a person in the entire image obtained from the infrared camera 110 is described with reference to FIG. 8.

First, the learning model obtained as described above is read (S402). As a result, the evaluation formula for obtaining the evaluation value E and the M^2 feature amount calculation formulas are acquired. The regression coefficients a_ij and b_ij used in these individual feature amount calculation formulas were determined by the learning processing described above.

The infrared camera 110 outputs, as digital data, video corresponding to the infrared rays emitted from the objects being imaged. A single image (the entire image) is acquired from this video (S404). The image obtained from the infrared camera 110 is an image in which each pixel is assigned a gradation value corresponding to temperature. In other words, the image carries temperature information on a per-pixel basis.

Preprocessing is then performed on the acquired image (S406). The preprocessing includes, for example, contrast adjustment and resizing of the image.

Next, in steps S408 to S412, the person detection processing (step S410) is performed while the search window w (see FIG. 10) is moved within the entire image. Specifically, in the loop of steps S408 to S412, the search window w is moved one pixel at a time (or several pixels at a time), the image is cropped to the size of the search window w at each position, and it is determined whether the cropped image includes a person, which is the detection target object.
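The movement of the search window in steps S408 to S412 can be sketched as a generator of cropped images. The function name `sliding_windows`, the row-major nested-list image, and the `step` parameter (1 for one pixel at a time, larger for several pixels) are assumptions made for the example.

```python
def sliding_windows(image, win_h, win_w, step=1):
    """Yield ((top, left), crop) for every search-window position.

    image is a list of pixel rows holding gradation values; the window
    is win_h x win_w pixels and advances by `step` pixels in each
    direction (hypothetical parameters).
    """
    height = len(image)
    width = len(image[0])
    for top in range(0, height - win_h + 1, step):
        for left in range(0, width - win_w + 1, step):
            crop = [row[left:left + win_w] for row in image[top:top + win_h]]
            yield (top, left), crop
```

Returning the window position with each crop lets the caller record where a detection occurred in the entire image.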

FIG. 9 is a flowchart of the person detection processing (cropped image) in the present embodiment, that is, a flowchart explaining step S410. This person detection processing is applied to each cropped image. Accordingly, the person detection processing (cropped image) begins by reading the cropped image (S502).

Next, the feature amounts of the cropped image that was read are extracted (S504). The feature amount extraction processing is almost the same as the processing described with reference to FIG. 7, and the values of the coefficients a_ij and b_ij of the linear regression equations used in step S308 are those obtained from the learning cropped images. By performing the feature amount extraction processing in this way, the M^2 feature amounts to be input to the evaluation formula for obtaining the evaluation value E are obtained for the cropped image.

Next, whether the cropped image includes a person is determined based on the M^2 feature amounts (S506). Specifically, the M^2 feature amounts are input to the evaluation formula to obtain the evaluation value E. When the evaluation value E is smaller than a predetermined value, it is determined that the cropped image includes a person, and the position of the cropped image in the entire image is stored. When the evaluation value E is greater than or equal to the predetermined value, it is determined that the cropped image does not include a person.
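The determination in step S506 can be sketched as below. The evaluation value is computed literally as the weighted sum given in the evaluation formula. The source does not state whether magnitudes of the feature amounts are taken first, so none are taken here. The function name `contains_person` and the `threshold` parameter (standing in for the predetermined value) are hypothetical.

```python
def contains_person(features, weights, threshold):
    """Return (decision, E) for one cropped image.

    features[i][j] is v_ij and weights[i][j] is w_ij (hypothetical
    layout). A person is judged present when E is smaller than the
    threshold, and absent when E is greater than or equal to it.
    """
    e = sum(
        w * v
        for w_row, v_row in zip(weights, features)
        for w, v in zip(w_row, v_row)
    )
    return e < threshold, e
```

A caller would typically apply this to every cropped image and store the window position whenever the decision is true.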

By performing this person detection processing in the loop of steps S408 to S412, the person detection processing is completed for all cropped images.

FIG. 10 is an explanatory diagram of how the search window is moved. FIG. 10 shows the search window being moved within the entire image. The person detection processing described above is performed on each image cropped as the search window w is moved within the entire image. By determining whether a person is present in each cropped image and storing the position of each cropped image in which a person was detected, the position of the person in the entire image can be identified.

When the person detection processing has been completed for every cropped image in the entire image, the detection result is displayed on the display device 130 connected to the person detection device 120 (S414). In the result display, the locations determined to be pedestrians can be highlighted in the infrared image of the view ahead of the host vehicle so as to enclose each pedestrian, or the screen can be flashed to call attention. In some cases, braking may additionally be assisted as driving support, or, as a visual aid, the front lights may be switched from low beam to high beam.

Because the person is detected using the evaluation formula derived from the learning cropped images, a cropped image that closely matches the persons in the learning cropped images can be detected accurately as including a person. Moreover, the algorithm described above is extremely simple, requiring only that the gradation values of two regions in the cropped image be input to the feature amount calculation formulas, so the detection speed can also be improved.

1 person detection system,
110 infrared camera, 120 person detection device, 130 display device,
122 image acquisition unit, 124 image memory,
126 control unit, 128 storage unit,
Tb background region gradation value, Tf person region gradation value

Claims

A detection device comprising: an image generation unit that generates a detection target image including gradation values corresponding to an output of a sensor; and a detection unit having a learned classifier used to detect a detection target object from the detection target image, wherein
the classifier includes a plurality of sub-classifiers that detect the detection target object in the detection target image based on gradation values of two regions in the detection target image,
the detection unit inputs, to the plurality of sub-classifiers, the gradation values of the corresponding two regions among a plurality of regions of the detection target image, and detects the detection target object in the detection target image based on outputs of the plurality of sub-classifiers,
the plurality of sub-classifiers output values based on feature amounts obtained when the gradation values of the two regions are input, and
the feature amount represents the difference between the gradation value of one of the corresponding two regions and an estimated gradation value of the one region, the estimated gradation value being estimated based on the gradation value of the other of the corresponding two regions.

The detection device according to claim 1, wherein, when the relationship between the estimated gradation value and the gradation value of the other region is expressed by a linear equation, the coefficients of the linear equation are obtained by the least squares method using the gradation values of the one region and the gradation values of the other region in a plurality of learning images that include the detection target object.

The detection device according to claim 2, wherein the gradation value of the one region and the gradation value of the other region of the learning images are input to the corresponding sub-classifiers to obtain a feature amount for each of the plurality of sub-classifiers, and a predetermined number of sub-classifiers, taken in ascending order of feature amount, are included in an evaluation formula used for detecting the detection target object.

The detection device according to any one of claims 1 to 3, wherein the gradation values of the corresponding two regions in the detection target image are input to each of the plurality of sub-classifiers, and whether or not the detection target object is detected in the detection target image is determined based on the total of the outputs of the plurality of sub-classifiers.

The detection device according to claim 1, wherein the image generation unit generates a plurality of the detection target images by cropping a plurality of images from an entire image that includes the detection target images, and the detection target object is detected in each of the plurality of detection target images, thereby identifying the position of the detection target object in the entire image.

The detection device according to claim 1, wherein the output of the sensor is an output corresponding to a temperature detected by the sensor.

A detection method comprising: generating a detection target image including gradation values corresponding to an output of a sensor; and
detecting a detection target object from the detection target image by using a plurality of sub-classifiers that detect the detection target object in the detection target image based on gradation values of two regions in the detection target image, wherein
the detecting includes inputting, to the plurality of sub-classifiers, the gradation values of the corresponding two regions among a plurality of regions of the detection target image, and detecting the detection target object in the detection target image based on outputs of the plurality of sub-classifiers,
the plurality of sub-classifiers output values based on feature amounts obtained when the gradation values of the two regions are input, and
the feature amount represents the difference between the gradation value of one of the corresponding two regions and an estimated gradation value of the one region, the estimated gradation value being estimated based on the gradation value of the other of the corresponding two regions.