JP6802923B2

JP6802923B2 - Object detection device and object detection method

Info

Publication number: JP6802923B2
Application number: JP2019530278A
Authority: JP
Inventors: 亮祐三木; 聡笹谷; 誠也伊藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2017-07-19
Filing date: 2017-07-19
Publication date: 2020-12-23
Anticipated expiration: 2037-07-19
Also published as: JPWO2019016879A1; WO2019016879A1

Description

The present invention relates to an object detection device and an object detection method that achieve robust object detection even when the installation state of the camera changes, or when the appearance of the detection target changes because the camera or the detection target moves.

There is strong demand for object detection technology that detects a target object (for example, a person, cargo, or a vehicle) from image information acquired by an imaging device such as a surveillance camera. Common object detection techniques include background subtraction, in which a background image containing no target object is prepared in advance and objects are detected by comparing the input captured image with it, and optical flow, which detects moving objects from the displacement of feature points between video frames. However, because these methods detect everything that moves in the image, they cannot, for example, detect only a specific target in the image.

A technique is therefore needed that detects a specific object using the object's contour information and appearance information, such as color and shape, that can be read from its outward appearance.

For example, paragraph 0034 of Patent Document 1 describes an identification unit that "determines as a person an image that is judged to be a person from contour information obtained from appearance-based HOG features, and that is also judged to be foreground (in a moving or static state) from features based on spatiotemporal characteristics obtained by pixel state analysis." With this and other descriptions, Patent Document 1 discloses a technique that realizes person detection using two means: one that extracts contour information of a person from learning samples consisting of images that contain a person and images that do not, and generates a classifier that distinguishes persons from non-persons; and one that uses the classifier to determine whether a person is present in a predetermined region of the image.

Paragraph 0016 of Patent Document 2 states: "The deformation detection region 2100 on the surveillance image 2000 reflects information on each parameter of the camera device and, as shown in FIG. 3, is created taking the distortion of the surveillance image 2000 into account. The object recognition device 1 (1a) then extracts features from the image information 100 of the deformation detection region 2100, which was created as a region containing a recognition target deformed by distortion or the like, and determines whether it is the object to be recognized." With this and other descriptions, Patent Document 2 discloses, as an application of the technique of Patent Document 1, a technique that improves the detection rate by assuming that the detection target on the image is deformed by camera-specific lens distortion and deforming the predetermined input region before the classifier determines whether a person is present.

Patent Document 1: Japanese Unexamined Patent Publication No. 2009-181220
Patent Document 2: Japanese Unexamined Patent Publication No. 2012-221437

In Patent Document 1, the classifier is trained on person images in a specific posture (for example, images of an upright person taken from the front) as learning samples, and performing person detection with this classifier raises the detection rate for persons in that posture.

However, in actually captured images, the posture (appearance) of a person varies greatly with the relative position of the camera device and the person and with the lens distortion of the camera device. When the contour information of the person in the captured image differs from that of the learning samples, the classifier of Patent Document 1 suffers from reduced person detection accuracy.

In Patent Document 2, as shown in FIG. 13 and elsewhere in that document, the contour information of the person input to the classifier is deformed (normalized), based on the parameter information of the camera device and the positional relationship between the detection target and the camera device, so that it matches a preset specific posture. This maintains the detection rate of the classifier even when the person's posture varies within a certain range.

However, the object recognition method of Patent Document 2 suffers a large drop in detection rate when the appearance of the detection target differs greatly from what is assumed, or when part of the detection target is hidden behind an occluding object. For example, even if a person can be detected easily from an image that includes the head, arms, torso, and legs, the classifier of Patent Document 2 detects persons at a much lower rate in an image taken from directly above the person, or in an image of a person whose lower body is hidden by an occluding object, for reasons such as the legs not being detectable in the image.

To solve these problems, an object of the present invention is to provide an object detection device that achieves highly accurate person detection even with captured images that contain a person in a posture the classifier does not support, or images captured with part of the human body hidden behind an obstacle.

The object detection device according to the present invention determines whether a detection target exists within a measurement range, and comprises: a three-dimensional information acquisition unit that acquires three-dimensional information within the measurement range based on input from an imaging device; an identification candidate region extraction unit that extracts identification candidate regions in which the detection target may exist; a classifier used to detect the detection target; a classifier information acquisition unit that acquires information on the classifier; an image conversion method determination unit that determines parameters for virtual viewpoint conversion of the three-dimensional information within the identification candidate region; an image conversion execution unit that generates a converted image based on the virtually viewpoint-converted three-dimensional information within the identification candidate region; an identification unit that detects the detection target from the converted image using the classifier; and an image acquisition unit that acquires image information within the measurement range based on input from the imaging device. The image conversion method determination unit uses the image information, the three-dimensional information, and the classifier information to determine the parameters that generate the converted image best suited as input to the classifier.

According to the object detection device of the present invention, the target object can be detected with high accuracy even with images in which the relative position of the camera device and the object differs greatly from what is assumed, or images in which part of the object is occluded.

A diagram showing a configuration example of the object detection device of Embodiment 1.
A diagram showing details of the identification candidate region extraction unit of Embodiment 1.
A diagram showing details of the identification candidate region information management unit of Embodiment 1.
A diagram showing an identification candidate region in a two-dimensional image.
A diagram showing an identification candidate region in the three-dimensional imaging space.
A diagram showing details of the classifiers of Embodiment 1.
A diagram showing details of the image conversion method determination unit of Embodiment 1.
A diagram explaining the processing of the viewpoint conversion unit of Embodiment 1.
A diagram explaining the effect of the image conversion method determination unit.
A diagram explaining the effect of the image conversion method determination unit.
A diagram showing details of the identification unit in the configuration example of Embodiment 1.
A diagram showing an example processing flow in Embodiment 1.
A diagram showing a configuration example of the object detection device of Embodiment 2.
A diagram explaining the processing of the image conversion method determination unit of Embodiment 2.
A diagram explaining the processing flow of the image conversion method determination unit of Embodiment 2.
A diagram explaining details of the processing flow of FIG. 12.

Embodiments of the present invention are described in detail below with reference to the drawings as appropriate. The following describes an example in which the detection target is a person, but the detection target is not limited to a person and may be cargo, a vehicle, or the like. Likewise, the following describes detecting the target from image information captured by an imaging device such as a camera, but the information containing the detection target is not limited to such image information and may be a heat map acquired by a thermal sensor.

The object detection device 2a of Embodiment 1 is described with reference to FIGS. 1 to 9.

FIG. 1 is a block diagram outlining the object detection device 2a of this embodiment, connected to an imaging device 1 such as a stereo camera. The object detection device 2a achieves robust detection of the detection target even when the target's appearance in the images captured by the imaging device 1 changes because the relative position of the imaging device 1 and the detection target changes.

In the object detection device 2a shown in FIG. 1, 3 is an image acquisition unit that acquires image information within the measurement range based on input from the imaging device 1; 4 is a three-dimensional information acquisition unit that acquires three-dimensional information within the measurement range based on input from the imaging device 1; 5 is an identification candidate region extraction unit that uses the image information and the three-dimensional information to extract, from the measurement range, identification candidate regions in which the detection target may exist; 6 is a classifier information acquisition unit that acquires information on the classifiers 64 used by the object detection device 2a; 7a is an image conversion method determination unit that uses the classifier information to determine how to convert an identification candidate region into the image best suited as input to a classifier 64; 8 is an image conversion unit that obtains a converted image from the identification candidate region according to the determined image conversion method; and 9 is an identification unit that determines whether the converted image contains the detection target. Some or all of the units from the image acquisition unit 3 to the identification unit 9 need not be dedicated hardware. They may be realized by a computing device such as a CPU processing programs stored in a main storage device such as semiconductor memory and data stored in an auxiliary storage device such as a hard disk.

The imaging device 1, the identification candidate region extraction unit 5, the classifier information acquisition unit 6, the image conversion method determination unit 7a, the image conversion unit 8, and the identification unit 9 shown in FIG. 1 are each described in detail below.
<Imaging device>
The imaging device 1 is a device that can acquire image information and three-dimensional information of the measurement range. Here, image information is the luminance information in digital image data, and three-dimensional information is the coordinate information of a three-dimensional point cloud in the measurement range (three-dimensional space).

The imaging device 1 may be a stereo camera consisting of two or more cameras, or a combination of a single camera and a distance sensor that can acquire three-dimensional information. A stereo camera, for example, photographs the same target with two or more cameras and measures the distance from the cameras to the target by triangulation, so it can acquire both image information and three-dimensional information. A distance sensor measures the distance to a target by calculating, from the phase difference between the projected and reflected light, the time taken for the projected light to reflect off the target and return to the sensor. Combined with a camera that has been registered to it in advance, it yields three-dimensional information associated with image information.
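The two distance measurements described for the imaging device can be sketched as follows. This is only an illustration under stated assumptions: the stereo pair is assumed to be rectified with a pinhole model (depth Z = f·B/d), and the distance sensor is assumed to use continuous-wave modulated light. The function and parameter names are hypothetical and do not come from the patent.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth by triangulation for a rectified stereo pair (assumption): Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def tof_distance(phase_shift_rad: float, modulation_hz: float) -> float:
    """Distance from the phase difference between projected and reflected light.

    Assumes continuous-wave modulation: the round-trip time is
    phase / (2 * pi * f), and the distance is half of c times that time.
    """
    round_trip_s = phase_shift_rad / (2.0 * math.pi * modulation_hz)
    return SPEED_OF_LIGHT * round_trip_s / 2.0
```

With a 1000-pixel focal length, a 0.1 m baseline, and a 20-pixel disparity, `stereo_depth` gives a depth of 5 m. Applying it to every matched pixel would produce the three-dimensional point cloud referred to above.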
<Identification candidate region extraction unit>
FIG. 2 shows the details of the identification candidate region extraction unit 5. The unit uses the image information acquired by the image acquisition unit 3, the three-dimensional information acquired by the three-dimensional information acquisition unit 4, or both, to extract identification candidate regions 55 in which the detection target may exist. It comprises an image processing unit 51 that extracts identification candidate regions 55 from the image information, a three-dimensional information processing unit 52 that extracts identification candidate regions 55 from the three-dimensional information, an identification candidate region ID assignment unit 53 that assigns an ID to each of the one or more extracted identification candidate regions 55, and an identification candidate region information management unit 54 that acquires and manages identification candidate region information representing the position of each identification candidate region 55. These four units are described in detail below.

The image processing unit 51 extracts identification candidate regions 55 by performing image processing on the image information acquired by the imaging device 1. One example of such processing is background subtraction, in which a background image of the scene with no detection target present is acquired in advance and the difference between it and the captured image is computed. The processing is not limited to this. Any means that can extract the region of the detection target from image information may be used, such as detection based on color information (for example, skin color detection).
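As a minimal sketch of the background subtraction mentioned above, the following function compares a grayscale frame with a background image and returns a rectangle enclosing the changed pixels. The representation (nested lists of luminance values), the single-rectangle output, and the threshold value are assumptions made for illustration. A practical system would typically label connected components to obtain one region per object.

```python
def candidate_box(background, frame, diff_threshold=30):
    """Return (x1, y1, x2, y2) enclosing pixels that differ from the background.

    `background` and `frame` are equally sized lists of rows of luminance
    values. Returns None when no pixel differs by more than the threshold.
    """
    xs, ys = [], []
    for y, (bg_row, row) in enumerate(zip(background, frame)):
        for x, (bg, px) in enumerate(zip(bg_row, row)):
            if abs(px - bg) > diff_threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    # Start point (top-left) and end point (bottom-right) of the rectangle.
    return (min(xs), min(ys), max(xs), max(ys))
```

The returned start and end points correspond to the image position that is later stored for each identification candidate region.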

The three-dimensional information processing unit 52 extracts identification candidate regions 55 by performing three-dimensional processing on the three-dimensional information acquired by the imaging device 1. One example is to acquire in advance the background three-dimensional information of the scene with no detection target present and compute the difference between it and newly acquired three-dimensional information. The processing is not limited to this. Any three-dimensional processing that yields identification candidate regions 55 may be used.
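One possible way to take the difference between background and newly acquired three-dimensional information is to compare voxel occupancy. The sketch below assumes point clouds given as (x, y, z) tuples and a hypothetical voxel size. The text does not specify how the difference is computed, so this is an illustration only.

```python
import math


def voxel_key(point, voxel_size):
    """Index of the voxel that contains the point."""
    return tuple(math.floor(c / voxel_size) for c in point)


def candidate_cuboid(background_points, points, voxel_size=0.1):
    """Return (start, end) corners of a cuboid enclosing non-background points.

    A point counts as foreground when its voxel is not occupied by any
    background point. Returns None when there is no foreground.
    """
    occupied = {voxel_key(p, voxel_size) for p in background_points}
    foreground = [p for p in points if voxel_key(p, voxel_size) not in occupied]
    if not foreground:
        return None
    start = tuple(min(p[i] for p in foreground) for i in range(3))
    end = tuple(max(p[i] for p in foreground) for i in range(3))
    return start, end
```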

Next, the identification candidate region ID assignment unit 53 and the identification candidate region information management unit 54 are described with reference to FIGS. 3A to 3C.

The identification candidate region ID assignment unit 53 assigns an ID to each identification candidate region 55 extracted by the image processing unit 51 or the three-dimensional information processing unit 52. The identification candidate region information management unit 54 attaches the position information of the region to its ID and manages the result as identification candidate region information 54_n. The position information consists of an image position, which gives the start and end points of the region in the two-dimensional image, and a three-dimensional position, which gives the start and end points of the region in the three-dimensional imaging space.

FIG. 3A illustrates the n items of identification candidate region information 54_n managed by the identification candidate region information management unit 54. Each item records, in addition to the ID, the image position and three-dimensional position of the corresponding identification candidate region 55. FIG. 3B shows the image position of identification candidate region information 54_1 concretely: 56a and 56b are the start point (x1, y1) and end point (x1', y1') of the rectangular identification candidate region 55 in the image captured by the imaging device 1. Similarly, FIG. 3C shows the three-dimensional position of identification candidate region information 54_1: 57a and 57b are the start point (X1, Y1, Z1) and end point (X1', Y1', Z1') of the cuboid identification candidate region 55. Although FIGS. 3B and 3C illustrate rectangular and cuboid regions, regions of other shapes may be used as long as the representation identifies the position of the identification candidate region 55. In that case, the image position and three-dimensional position information in FIG. 3A is of course expressed to suit that shape.
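The record described for FIG. 3A (an ID plus an image position and a three-dimensional position, each given by a start and end point) could be represented as in the following sketch. The class and field names are hypothetical and chosen only to mirror the description. The sequential ID scheme is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CandidateRegionInfo:
    region_id: int
    image_start: Tuple[int, int]              # (x1, y1) in the 2D image
    image_end: Tuple[int, int]                # (x1', y1')
    space_start: Tuple[float, float, float]   # (X1, Y1, Z1) in 3D space
    space_end: Tuple[float, float, float]     # (X1', Y1', Z1')


class CandidateRegionManager:
    """Assigns sequential IDs to regions and stores their position information."""

    def __init__(self) -> None:
        self._next_id = 1
        self._regions: Dict[int, CandidateRegionInfo] = {}

    def register(self, image_start, image_end, space_start, space_end):
        info = CandidateRegionInfo(
            self._next_id, image_start, image_end, space_start, space_end
        )
        self._regions[info.region_id] = info
        self._next_id += 1
        return info

    def get(self, region_id: int) -> CandidateRegionInfo:
        return self._regions[region_id]
```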
<Classifier information acquisition unit>
Next, the classifier information acquisition unit 6 is described with reference to FIG. 4. The classifier information acquisition unit 6 selects an appropriate classifier from the plurality of classifiers 64 provided and extracts the corresponding classifier information 65. Here, 67_n is the classifier ID assigned to manage classifier 64_n.

The classifiers 64 are used in the identification process that determines whether the image captured by the imaging device 1 contains the detection target, and each classifier 64_n has high discrimination ability for the detection target in a different posture. Each classifier 64_n can be given different characteristics by training it, with a machine learning method, on many images that contain the detection target and images that do not (learning samples). A Support Vector Machine is commonly used as the machine learning method, but other methods may be used.

The classifier information 65_n indicates the input images for which classifier 64_n shows particularly high discrimination ability. FIG. 4 illustrates, as classifier information, a template 66_1 suited to identifying front-view person images, a template 66_2 suited to top-view person images, and a template 66_n suited to side-view person images. Other information may be recorded instead, such as color information, features that represent contours, luminance information, or gradient information, as long as it is classifier information that represents an image suitable as input to classifier 64_n or a method of generating such an image.
<Image conversion method determination unit>
Next, the processing flow of the image conversion method determination unit 7a is described with reference to FIG. 5. Based on the three-dimensional information within the cuboid identification candidate region 55 illustrated in FIG. 3C, the image conversion method determination unit 7a determines the conversion method (parameters and so on) for converting the region into the image best suited as input to a classifier 64. The flow is as follows. First, the viewpoint conversion parameters are determined (S51), and a viewpoint-converted image is generated with those parameters (S52). Next, the similarity between the converted image and the classifier information 65 held for each of the classifiers 64 is calculated (S53). If the similarity is higher than a threshold, processing ends; if it is at or below the threshold, processing returns to step S51 and the parameters are changed to other values (S54). Steps S51 to S54 are described in detail below.
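The loop over steps S51 to S54 can be sketched as follows. The function is a hedged illustration, not the patented implementation: `render` (viewpoint conversion plus projection) and `similarity` are hypothetical callables supplied by the caller, the angles are searched exhaustively on a fixed grid, and the best result seen so far is returned if no candidate exceeds the threshold.

```python
import itertools


def determine_conversion(points, templates, render, similarity, threshold, step_deg=30):
    """Search rotation angles until a converted image matches some template.

    templates maps a classifier ID to its template. Returns a tuple
    (score, (alpha, beta, gamma), classifier_id) for the best match found.
    """
    best = None
    angles = range(0, 360, step_deg)
    for params in itertools.product(angles, repeat=3):       # S51: choose parameters
        image = render(points, *params)                       # S52: converted image
        score, classifier_id = max(                           # S53: best similarity
            (similarity(image, template), cid)
            for cid, template in templates.items()
        )
        if best is None or score > best[0]:
            best = (score, params, classifier_id)
        if score > threshold:                                 # S54: stop when good enough
            break
    return best
```

Keeping the classifier ID alongside the parameters reflects the later requirement that the ID of the best-matching classifier be retained for the identification step.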

In step S51, the parameters α, β, and γ needed to generate the viewpoint-converted image are determined. Each parameter is described in detail later. One way to determine α, β, and γ in step S51 is to vary them exhaustively.

In step S52, the processing shown in FIG. 6 is performed. In FIG. 6, 82 is the viewpoint from which the identification candidate region 55 is observed; 83, 84, and 85 are the x-axis, y-axis, and z-axis of the coordinate system of the three-dimensional space set in the measurement range; and 86_1 and 86_2 are examples of converted images created by viewpoint conversion. In step S52, using the parameters α, β, and γ determined in step S51, the three-dimensional information contained in the cuboid identification candidate region 55 is rotated by α about the x-axis 83, by β about the y-axis 84, and by γ about the z-axis 85, which converts it to the state observed from an arbitrary viewpoint. The converted image 86 is then obtained by projecting the viewpoint-converted identification candidate region 55 onto an image.

Conversion formulas such as Equations 1 to 3 are commonly used for viewpoint conversion, but other viewpoint conversion methods may be used.
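Equations 1 to 3 are not reproduced in this text. As an assumption, rotations by α, β, and γ about the x-, y-, and z-axes are usually written with the standard right-handed rotation matrices below. The order in which they are composed is not stated in the text, so the product in the last line is only one possible convention.

```latex
R_x(\alpha) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{pmatrix}, \quad
R_y(\beta) = \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix}, \quad
R_z(\gamma) = \begin{pmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}

\mathbf{p}' = R_z(\gamma)\, R_y(\beta)\, R_x(\alpha)\, \mathbf{p}
```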

Perspective projection is the common method for projecting the identification candidate region 55 onto the converted image 86_n, but other methods may be used. For example, when an identification candidate region 55 containing the three-dimensional information of an upright person is captured by the imaging device 1 installed in the direction of the viewpoint 82, projecting the region without viewpoint conversion yields the converted image 86_1, a top view of the person. In contrast, applying a viewpoint conversion that rotates the same captured region by α = 0°, β = 0°, and γ = 90° and then projecting it onto an image with respect to the viewpoint 82 yields the converted image 86_2, a side view of the person.
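The rotation and perspective projection of a single three-dimensional point can be sketched as follows. The sketch assumes rotation about the coordinate origin in the order x, then y, then z, and a simple pinhole camera looking along the z-axis from a hypothetical distance. The axis layout of FIG. 6 is not reproduced here, so the example does not claim to generate the specific top and side views described above.

```python
import math


def rotate(point, alpha_deg, beta_deg, gamma_deg):
    """Rotate a point about the x-, y-, then z-axis (order is an assumption)."""
    x, y, z = point
    a, b, g = (math.radians(v) for v in (alpha_deg, beta_deg, gamma_deg))
    # Rotation by alpha about the x-axis.
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    # Rotation by beta about the y-axis.
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    # Rotation by gamma about the z-axis.
    x, y = x * math.cos(g) - y * math.sin(g), x * math.sin(g) + y * math.cos(g)
    return (x, y, z)


def project(point, focal=1.0, camera_distance=5.0):
    """Pinhole projection of a point placed camera_distance in front of the camera."""
    x, y, z = point
    depth = z + camera_distance
    if depth <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / depth, focal * y / depth)
```

Applying `rotate` and then `project` to every point in a candidate region, and rasterizing the results, would give one converted image for a given (α, β, γ). In practice the region would probably be centered on the origin first so that it rotates in place.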

Further, in step S53 of FIG. 5, an optimization process determines the image conversion method that produces the image best suited to the classifier 64. One way to do this is to obtain the template 66 by referring to the classifier information 65 and calculate its similarity to the converted image 86_n obtained by applying viewpoint conversion to the identification candidate region 55. Pattern matching such as Normalized Cross-Correlation is commonly used to calculate the similarity, but other methods may be used. Here, an evaluation function is designed whose value is the similarity and whose variables are the parameters α, β, and γ, and solving the optimization problem of maximizing this function yields the image conversion method with the highest similarity for classifier 64_n. When two or more classifiers 64 exist, as illustrated in FIG. 4, the similarity to the converted image 86_n obtained for the identification candidate region 55 is calculated for each classifier, and the ID of the classifier 64_n with the highest similarity is retained.
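A minimal sketch of the similarity calculation follows, using plain normalized cross-correlation between a converted image and each template of the same size. Images are assumed to be nested lists of non-negative luminance values, and the function names are hypothetical. Sliding-window matching and the zero-mean variant are left out for brevity.

```python
import math


def ncc(image, template):
    """Normalized cross-correlation of two equally sized images."""
    a = [v for row in image for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b):
        raise ValueError("image and template must have the same size")
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if denominator == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / denominator


def best_classifier(image, templates):
    """Return (classifier_id, score) for the template most similar to the image."""
    scores = {cid: ncc(image, template) for cid, template in templates.items()}
    best_id = max(scores, key=scores.get)
    return best_id, scores[best_id]
```

Because the score is normalized, an image that is a brighter copy of a template still scores 1.0 against it, which is why this measure is often preferred over a raw pixel difference.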

In step S54, the similarity between the converted image 86_n and the classifier information 65 calculated in step S53 is compared with a threshold. If the similarity is equal to or greater than the threshold, the process ends; if it is less than the threshold, the process returns to step S51, the parameters are changed, and the same processing is repeated. The threshold used in step S54 may be set arbitrarily by the installer of the object detection device 2a, or it may be adjusted to a suitable value by feeding back the detection accuracy obtained when the object detection device 2a performs object detection with a given threshold. For example, when the accuracy of the object detection device 2a with a certain threshold is judged insufficient, the threshold may be raised.

When determining the parameters in step S51, the image conversion method determination unit 7a may create in advance a matrix map that records the aspect ratio of the converted image generated by each set of parameters α, β, and γ, and determine the parameters by referring to it. Alternatively, the imaging space may be divided into a plurality of regions, and a matrix map holding the parameters α, β, and γ that are roughly effective for each region may be prepared and referred to. In that case, if parameters more suitable than those held in the matrix map are found, the map may be updated. As a further alternative, roughly effective parameters α, β, and γ may be determined by acquiring the camera parameters and thereby obtaining information on the installation state of the imaging device 1.
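One possible shape for the region-wise matrix map is sketched below. It assumes the imaging space is divided into square cells on the x-z plane and that each cell stores the best parameters seen so far together with their similarity; the class name, the cell layout, and the update rule are all assumptions of this example.

```python
import math


class ParameterMap:
    """Grid over the x-z plane; each cell keeps the best (alpha, beta, gamma) so far."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self._cells = {}  # (ix, iz) -> (similarity, (alpha, beta, gamma))

    def _key(self, x, z):
        return (math.floor(x / self.cell_size), math.floor(z / self.cell_size))

    def lookup(self, x, z, default=(0.0, 0.0, 0.0)):
        """Return the stored parameters for the cell containing (x, z)."""
        entry = self._cells.get(self._key(x, z))
        return entry[1] if entry else default

    def update(self, x, z, params, similarity):
        """Store params only if they beat the similarity already recorded."""
        key = self._key(x, z)
        if key not in self._cells or similarity > self._cells[key][0]:
            self._cells[key] = (similarity, params)
            return True
        return False
```

The value returned by `lookup` would serve as the starting point in step S51, so that the search begins near parameters that previously worked for that part of the scene.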

The image conversion method determination unit 7a may also judge whether to continue changing the parameters α, β, and γ and calculating the similarity, returning to step S51 if it continues and ending the process if it does not. The criterion for continuing may be, for example, whether the number of times the parameters α, β, and γ have been changed exceeds a preset count. Alternatively, the process may be ended when the similarity calculated in step S53 is equal to or less than a preset minimum value. Ending the process even when the similarity never reaches the threshold avoids the waste of repeating the object detection process of the object detection device 2a when the identification candidate area 55 does not contain the detection target.
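The loop over steps S51 to S54, together with the optional stop criteria described above, can be sketched as follows. The callable `similarity_of`, the candidate list, and the returned reason strings are hypothetical; the sketch only illustrates how the threshold, the trial limit, and the minimum-similarity cut-off interact.

```python
def search_parameters(similarity_of, candidates, threshold, max_trials, floor=None):
    """Try (alpha, beta, gamma) triples until one is accepted or a stop rule fires.

    similarity_of: callable mapping a parameter triple to a similarity (assumed).
    Returns (params, similarity, reason), where params is the best triple seen.
    """
    best_params, best_sim = None, float("-inf")
    for trial, params in enumerate(candidates, start=1):
        sim = similarity_of(params)
        if sim > best_sim:
            best_params, best_sim = params, sim
        if sim >= threshold:                       # S54: similarity high enough
            return params, sim, "accepted"
        if floor is not None and sim <= floor:     # preset minimum similarity
            return best_params, best_sim, "below_floor"
        if trial >= max_trials:                    # preset number of changes
            return best_params, best_sim, "max_trials"
    return best_params, best_sim, "exhausted"
```

A caller would treat any reason other than `"accepted"` as "no suitable conversion found" and move on to the next identification candidate area.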

Next, the effect of the image conversion method determination unit 7a will be described with reference to FIGS. 7A and 7B. In FIG. 7A, 82a, 82b, and 82c denote viewpoints (installation positions and directions) of the imaging device 1, and 87a, 87b, and 87c denote rectangular images containing a person, extracted from the images captured from the respective viewpoints.

When a person is detected from the rectangular images 87a, 87b, and 87c of an upright person using the classifiers 64_1 and 64_2 shown in FIG. 7B, the rectangular image 87a captured from the viewpoint 82a has a high similarity to the template 66_1 of the classifier 64_1, and the rectangular image 87c captured from the viewpoint 82c has a high similarity to the template 66_2 of the classifier 64_2. The person can therefore be detected easily in the rectangular images 87a and 87c by using the classifier 64_1 or the classifier 64_2.

In contrast, the person in the rectangular image 87b captured from the viewpoint 82b is deformed (deviates from the templates) because of the tilt of the line of sight of the imaging device 1, and has a low similarity to both the template 66_1 and the template 66_2, so neither the classifier 64_1 nor the classifier 64_2 can identify the person. For this reason, it has conventionally been difficult to detect a person at a site where only the imaging device 1 at the viewpoint 82b is installed.

Even in such a case, the image conversion method determination unit 7a of this embodiment makes it possible to detect the deformed detection target in the rectangular image 87b. The procedure for detecting the deformed person captured from the viewpoint 82b is described below.

First, the parameters α, β, and γ are determined and used as input to apply a virtual viewpoint conversion to the rectangular image 87b, creating the converted image 86b. Next, the similarity between the converted image 86b and each of the templates 66_1 and 66_2 is calculated, and if any classifier 64_n shows a similarity equal to or greater than the threshold, the ID 67 of that classifier is acquired. If no classifier 64_n shows a similarity equal to or greater than the threshold, the parameters α, β, and γ are determined again and the same processing is performed. Although the rectangular image 87b is deformed by the tilt of the camera, it still provides image information and three-dimensional information of the front of the person. Therefore, when the rectangular image 87b captured from the viewpoint 82b is virtually converted to the viewpoint 82a, a converted image 86b similar to the rectangular image 87a is obtained, giving an image suitable for input to the classifier 64_1.

Similarly, when the rectangular image 87b captured from the viewpoint 82b is virtually converted to the viewpoint 82c, a converted image 86b similar to the rectangular image 87c is obtained, giving an image suitable for input to the classifier 64_2.

Here, the advantage of the image conversion method determination unit 7a is described for a situation in which an obstacle lies between the viewpoint 82b and the person, so that part of the human body (for example, the legs) does not appear in the rectangular image 87b. When the rectangular image 87b is converted to the viewpoint 82a, the converted image 86b lacks the legs just as the rectangular image 87b does, so the classifier 64_1, which requires leg detection, cannot detect the person. When the rectangular image 87b is converted to the viewpoint 82c, the converted image 86b likewise lacks the legs, but the classifier 64_2, which does not require leg detection, can detect the person. That is, even when a rectangular image 87b in which part of the human body is missing is input, accurate person detection can be achieved if the image conversion method determination unit 7a determines an appropriate image conversion method and the corresponding classifier 64 is selected.
<Image conversion unit>
The image conversion unit 8 converts the identification candidate area 55 according to the image conversion method determined by the image conversion method determination unit 7a and acquires a converted image 86 suitable for input to the classifier 64. As in step S52, conversion formulas such as Equations 1 to 3 can be used as the image conversion method, but other methods may be used.
<Identification unit>
FIG. 8 shows the details of the identification unit 9. The identification unit 9 determines whether the converted image 86_n acquired by the image conversion unit 8 contains the detection target. It includes a classifier recording unit 91 that records one or more classifiers 64_n, an identification processing execution unit 92 that performs identification processing on the converted image 86_n using the classifier 64_n, and an identification result output unit 93 that outputs the result of the identification processing. The identification processing execution unit 92 and the identification result output unit 93 are described in detail below.

The identification processing execution unit 92 performs identification processing on the converted image 86_n output by the image conversion unit 8, using the classifier 64_n recorded in the classifier recording unit 91. When two or more classifiers 64_n are recorded in the classifier recording unit 91, the identification processing execution unit 92 acquires the ID of the classifier 64_n selected by the image conversion method determination unit 7a, selects the classifier 64_n corresponding to that ID, and then performs identification processing on the converted image 86_n output by the image conversion unit 8.

The identification result output unit 93 outputs the identification result of the identification processing execution unit 92 to the outside. For example, when the object detection device 2a is connected to a display device such as a monitor, an image of the imaging space may be displayed on that display device. When the identification processing execution unit 92 determines that the converted image 86_n contains the detection target, the identification candidate area information 54_n of the identification candidate area 55 from which that converted image 86_n was derived is referred to, and the position of the identification candidate area 55 in the image captured by the imaging device 1 is acquired. A rectangular detection window or the like may then be displayed at the position corresponding to the detection target in the captured image shown on the display device, or a message indicating that the detection target has been detected may be displayed.
<Processing flow>
Next, the processing flow of object detection in the object detection device 2a of this embodiment will be described with reference to FIG. 9.

In step S91, the imaging device 1 first acquires image information and three-dimensional information corresponding to the measurement range and outputs them to the object detection device 2a. The image acquisition unit 3 acquires the image information based on the input from the imaging device 1, and the three-dimensional information acquisition unit 4 acquires the three-dimensional information based on the input from the imaging device 1.

In step S92, the identification candidate area extraction unit 5 extracts the identification candidate areas 55. Specifically, the rectangular area extracted by the image processing unit 51 and the rectangular parallelepiped area extracted by the three-dimensional information processing unit 52 are taken as an identification candidate area 55, and the identification candidate area ID assigning unit 53 then assigns an ID to each extracted identification candidate area 55.

In step S93, one identification candidate area 55 to be subjected to identification processing is selected from the extracted identification candidate areas 55.

In step S94, viewpoint conversion is applied to the selected identification candidate area 55, which is then projected onto an image to obtain the converted image 86_n. The optimization process yields the image conversion method that maximizes the similarity to the classifier information 65. When two or more classifiers 64_n exist, for example, the ID of the classifier 64_n whose template 66 has the highest similarity to the converted image 86_n obtained by viewpoint conversion of the identification candidate area 55 is acquired, and the image conversion method appropriate for that classifier 64_n is determined.

In step S95, image conversion is applied to the selected identification candidate area 55 using the conversion method determined in step S94, and the converted image 86_n is acquired.

In step S96, identification processing is performed on the converted image 86_n acquired in step S95, using the classifier 64_n.

In step S97, it is determined from the result of the identification processing whether the converted image contains the detection target. If it does, step S98 is performed; if it does not, step S99 is performed.

In step S98, which is reached when the identification processing has determined that the converted image contains the detection target, the identification result is output. When the object detection device 2a is connected to a display device such as a monitor, an image of the imaging space may be displayed on the display device, and a rectangular detection window may be displayed at the position in the image corresponding to the identification candidate area 55, or a message indicating that the detection target has been detected may be displayed. The position corresponding to the identification candidate area 55 is obtained by referring to the position information recorded in the identification candidate area information management unit 54.

In step S99, after the identification processing of the selected identification candidate area 55 is finished, it is determined whether identification processing has been performed on all the identification candidate areas 55 extracted in step S92. If any identification candidate area 55 has not yet been processed, step S93 is performed; if none remains, the object detection process ends.
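The per-area loop of steps S93 to S99 can be summarized in a few lines of Python. This is only a skeleton: `decide_conversion`, `convert`, and the `classifiers` dictionary are hypothetical stand-ins for the image conversion method determination unit, the image conversion unit, and the identification unit, and the acquisition and extraction of steps S91 and S92 are assumed to have already produced `candidate_areas`.

```python
def detect_objects(candidate_areas, decide_conversion, convert, classifiers):
    """Run the S93-S99 loop and return the IDs of areas containing a target.

    candidate_areas: dict mapping an area ID to its data (result of S92).
    decide_conversion(area) -> (params, classifier_id)   # S94
    convert(area, params) -> converted image              # S95
    classifiers: dict classifier_id -> callable(image) -> bool  # S96
    """
    detections = []
    for area_id, area in candidate_areas.items():       # S93, repeated via S99
        params, classifier_id = decide_conversion(area)  # S94
        image = convert(area, params)                    # S95
        if classifiers[classifier_id](image):            # S96 and S97
            detections.append(area_id)                   # S98: report the result
    return detections
```

In a real system, the returned IDs would be used to look up the image position of each area so that a detection window can be drawn.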

As described above, the object detection device 2a of Embodiment 1 converts the extracted identification candidate area 55, by virtual viewpoint conversion, into an image suitable for input to the classifier before detecting the detection target. This allows the detection target to be detected with high accuracy even when its appearance in the image differs from the template of the classifier, or when part of the detection target in the image is hidden by an occluding object.

Next, the object detection device 2b of Embodiment 2 will be described with reference to FIGS. 10 to 13. Descriptions of points in common with Embodiment 1 are omitted.

FIG. 10 is a block diagram outlining the object detection device 2b of this embodiment, connected to an imaging device 1 such as a stereo camera. The object detection device 2a of Embodiment 1 uses the image conversion method determination unit 7a, which exhaustively varies the parameters α, β, and γ for rotating the three-dimensional information, whereas the object detection device 2b of this embodiment uses an image conversion method determination unit 7b that can determine the parameters α, β, and γ more efficiently. The image conversion method determination unit 7b is described in detail below.

First, an outline of the processing in the image conversion method determination unit 7b is given with reference to FIG. 11, which shows an upright person being captured from the viewpoint 82d. In FIG. 11, Xc, Yc, and Zc are the x-axis, y-axis, and z-axis of the camera coordinate system, and 204 denotes the optical axis of the imaging device 1 installed at the viewpoint 82d. The camera coordinate system is a three-dimensional coordinate system representing the imaging space: its origin is the optical center of the camera of the imaging device 1, its z-axis (Zc) coincides with the direction of the optical axis 204, and its x-axis (Xc) and y-axis (Yc) are parallel to the horizontal and vertical directions of the image projection plane 205. Further, 205 denotes the image projection plane, 206 the image captured from the viewpoint 82d, 207 the three-dimensional information acquired from the viewpoint 82d, and 208 a straight line indicating the posture direction of the detection target. Reference signs 209 and 210 denote the intersections of the identification candidate area 55 and the straight line 208, with camera coordinates (Xct, Yct, Zct) and (Xcb, Ycb, Zcb), and 211 and 212 denote the corresponding intersections after the virtual viewpoint conversion, with camera coordinates (Xct', Yct', Zct') and (Xcb', Ycb', Zcb').

As shown in the lower part of FIG. 11, the image conversion method determination unit 7b of this embodiment calculates the parameters α, β, and γ that convert the straight line 208, which is inclined with respect to the y-axis (Yc), into a straight line 208' parallel to the y-axis (Yc). It then generates a three-view drawing of the converted identification candidate area 55' to obtain the converted image 86 best suited as input to the classifier 64.

FIG. 12 shows the processing flow of the image conversion method determination unit 7b, including the determination of the parameters α, β, and γ described above. This processing flow is outlined below.

First, the viewpoint conversion parameter β is set to 0° (S121), and the straight line 208 is acquired (S122). Next, the parameters α and γ for rotating the straight line 208 are set (S123), and a three-view drawing of the detection target is generated using the set parameters α, β, and γ (S124). One view of the three-view drawing is selected as the converted image 86 (S125), and the similarity between the selected converted image 86 and the classifier information 65 is calculated (S126). If the similarity is equal to or greater than the threshold, the selected converted image 86 is adopted as the input image to the classifier 64 and the process ends; if the similarity is less than the threshold, the process proceeds to step S128 (S127). In step S128, it is determined whether every view of the generated three-view drawing has been selected as a converted image. If not, the process returns to step S125; if the similarity has been calculated for all three views (those for β = 0° on the first pass), the process proceeds to step S129 (S128). In step S129, the parameter β is changed, that is, the identification candidate area 55 is rotated about the y-axis (Yc), and the process returns to step S124 (S129). This is repeated until a converted image 86 with a similarity equal to or greater than the threshold is obtained. The particularly important steps S122, S123, S124, and S129 are described in detail below.
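The flow of FIG. 12 outlined above can be sketched as two nested loops, the outer one over β and the inner one over the three views. In this sketch α and γ are assumed to have been fixed already (S122, S123), `make_three_views` and `similarity_of` are hypothetical callables, and the finite `beta_steps` sequence is an added assumption that bounds the search, which the flow itself leaves open-ended.

```python
def decide_input_image(make_three_views, similarity_of, alpha, gamma,
                       beta_steps, threshold):
    """Return (view, beta) for the first view whose similarity reaches threshold.

    make_three_views(alpha, beta, gamma) -> iterable of views   # S124
    similarity_of(view) -> float                                 # S126
    beta_steps: sequence of beta values, starting at 0 (S121) and then
                stepped as in S129.
    """
    for beta in beta_steps:                                   # S121, S129
        for view in make_three_views(alpha, beta, gamma):     # S124, S125, S128
            if similarity_of(view) >= threshold:              # S126, S127
                return view, beta
    return None, None  # no view reached the threshold within beta_steps
```

A call such as `decide_input_image(views, sim, alpha, gamma, range(0, 360, 30), 0.9)` would step β in 30° increments; the step size is a tuning choice not specified in the text.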

In step S122, the straight line 208 is acquired. One way to obtain the straight line 208 is to refer to the three-dimensional information of the identification candidate area 55 and take the line connecting the two points of the three-dimensional point cloud whose Euclidean distance is the largest. This works because, when the detection target is an upright person, the identification candidate area 55 containing the person can be expected to be a rectangular parallelepiped that is long in the vertical direction, so the direction of maximum Euclidean distance can be taken as the straight line 208 indicating the person's posture direction. Alternatively, principal component analysis may be applied to the three-dimensional point cloud of the identification candidate area 55, and the line along the first principal component may be used. As another alternative, if the floor surface in the imaged space can be detected by a general floor estimation method, the straight line 208 may be determined from the direction orthogonal to the floor surface and one point corresponding to the head.
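The first method in step S122, taking the pair of points with the largest Euclidean distance, can be written directly with the standard library. The function name is hypothetical, and the brute-force search below is quadratic in the number of points, so a real implementation would likely subsample the cloud or use its convex hull first.

```python
import math
from itertools import combinations


def posture_line(points):
    """Return the two points of a 3-D point cloud that are farthest apart."""
    if len(points) < 2:
        raise ValueError("at least two points are required")
    # Brute force over all pairs: O(n^2), acceptable only for small clouds.
    return max(combinations(points, 2), key=lambda pair: math.dist(pair[0], pair[1]))
```

The returned pair plays the role of the two end points through which the straight line 208 is drawn.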

In step S123, the parameters α and γ are determined so that the x values and the z values of the intersection coordinates 211 and 212 are equal, that is, so that Xct' = Xcb' and Zct' = Zcb'. The rotation angle about the z-axis (Zc) when the identification candidate area 55 is rotated so that Xct' equals Xcb' corresponds to the parameter γ, and the rotation angle about the x-axis (Xc) when it is rotated so that Zct' equals Zcb' corresponds to the parameter α. Since the parameter β was set to 0° in step S121, this processing determines all of the parameters α, β, and γ.
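Under one plausible convention, the angles of step S123 have a closed form. The sketch below assumes that the rotation about the z-axis is applied first and the rotation about the x-axis second, and that the direction from the bottom intersection to the top intersection should end up along +y; both are assumptions of this example rather than statements about Equations 1 to 3.

```python
import math


def align_to_y_axis(top, bottom):
    """Return (alpha_deg, gamma_deg) that make the line top-bottom parallel to y.

    gamma (about z) removes the x difference; alpha (about x) then removes
    the z difference. The rotation order is an assumption of this sketch.
    """
    dx, dy, dz = (t - b for t, b in zip(top, bottom))
    gamma = math.atan2(dx, dy)            # after this rotation, x' difference is 0
    r = math.hypot(dx, dy)                # remaining length in the x-y plane
    alpha = math.atan2(-dz, r)            # after this rotation, z'' difference is 0
    return math.degrees(alpha), math.degrees(gamma)


def rotate_z_then_x(v, alpha_deg, gamma_deg):
    """Apply the z-rotation by gamma, then the x-rotation by alpha, to vector v."""
    a, g = math.radians(alpha_deg), math.radians(gamma_deg)
    x, y, z = v
    x, y = x * math.cos(g) - y * math.sin(g), x * math.sin(g) + y * math.cos(g)
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    return (x, y, z)
```

Because only the difference between the two intersection points enters the formulas, the result does not depend on the point about which the identification candidate area is rotated.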

Next, the processing of step S124 will be described with reference to FIG. 13. In step S124, a three-view drawing of the identification candidate area 55 is acquired after the virtual viewpoint conversion. In FIG. 13, the viewpoints 82e, 82f, and 82g are the viewpoints used to generate the three-view drawing, and the converted images 86e, 86f, and 86g are the converted images 86 generated from the respective viewpoints. Once the parameters α and γ that make the straight line 208 parallel to the y-axis (Yc) have been determined, three-view drawings are generated while the parameter β is varied. At a certain value of β, the viewpoint is virtually converted to one facing the front of the person in the identification candidate area 55, as shown by the converted image 86e in FIG. 13, and the person can be detected using the classifier 64 that has the corresponding template.

In a real environment, however, the appearance of the detection target from a particular direction may be unsuitable for identification because of occlusion or the like. Therefore, in addition to the converted image 86e from the frontal viewpoint 82e, the converted images 86f and 86g are obtained from the side and top viewpoints 82f and 82g, which increases the number of candidate classifiers 64 and improves the accuracy of object detection. The viewpoint 82f looking at the side can be set by applying a further rotation of α = 0°, β = 90°, γ = 0° relative to the viewpoint 82e, and the viewpoint 82g looking at the top can be set by applying a further rotation of α = 90°, β = 0°, γ = 0° relative to the viewpoint 82e. Perspective projection of the identification candidate area 55 at the viewpoints 82e, 82f, and 82g yields the converted images 86e, 86f, and 86g efficiently, and using them as the three-view drawing enables efficient person detection.
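A minimal sketch of generating the three point sets behind the three-view drawing is shown below. It assumes the input points have already been brought to the frontal viewpoint, applies the additional 90° rotations about the y-axis (side) and the x-axis (top) around the centroid, and leaves the final projection to whichever projection routine is in use; the function names and the choice of pivot are assumptions.

```python
import math


def rotate_about_axis(points, axis, angle_deg):
    """Rotate 3-D points about the x or y axis through their centroid."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    n = len(points)
    cx, cy, cz = (sum(p[i] for p in points) / n for i in range(3))
    rotated = []
    for x, y, z in points:
        x, y, z = x - cx, y - cy, z - cz
        if axis == "x":
            y, z = y * c - z * s, y * s + z * c
        elif axis == "y":
            x, z = x * c + z * s, -x * s + z * c
        else:
            raise ValueError("axis must be 'x' or 'y'")
        rotated.append((x + cx, y + cy, z + cz))
    return rotated


def three_views(front_points):
    """Return point sets for the front, side (beta + 90), and top (alpha + 90) views."""
    return {
        "front": list(front_points),
        "side": rotate_about_axis(front_points, "y", 90.0),
        "top": rotate_about_axis(front_points, "x", 90.0),
    }
```

Each of the three point sets would then be projected to an image and compared with the templates, so that an occluded frontal view can still be backed up by the side or top view.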

Compared with Embodiment 1, the object detection device of Embodiment 2 described above can determine the parameters α, β, and γ efficiently, and can detect a person with high accuracy even when the person is deformed in the image or when occlusion occurs.

1: imaging device; 2a, 2b: object detection device; 3: image acquisition unit; 4: three-dimensional information acquisition unit; 5: identification candidate area extraction unit; 51: image processing unit; 52: three-dimensional information processing unit; 53: identification candidate area ID assigning unit; 54: identification candidate area information management unit; 54_n: identification candidate area information; 55: identification candidate area; 6: classifier information acquisition unit; 64: classifier; 65: classifier information; 66: template; 7a, 7b: image conversion method determination unit; 8: image conversion unit; 82: viewpoint; 86: converted image; 87: rectangular image; 9: identification unit; 91: classifier recording unit; 92: identification processing execution unit; 93: identification result output unit

Claims

An object detection device that determines whether a detection target exists within a measurement range, comprising:
a three-dimensional information acquisition unit that acquires three-dimensional information within the measurement range based on input from an imaging device;
an identification candidate area extraction unit that extracts an identification candidate area in which the detection target may exist;
a classifier used to detect the detection target;
a classifier information acquisition unit that acquires information on the classifier;
an image conversion method determination unit that determines parameters for virtually performing viewpoint conversion processing on the three-dimensional information in the identification candidate area;
an image conversion execution unit that generates a converted image based on the three-dimensional information in the identification candidate area that has virtually undergone the viewpoint conversion processing;
an identification unit that detects the detection target using the classifier based on the converted image; and
an image acquisition unit that acquires image information within the measurement range based on the input from the imaging device,
wherein the image conversion method determination unit uses the information on the classifier, the image information, and the three-dimensional information to determine the parameters that generate the converted image best suited as input to the classifier.

The object detection device according to claim 1,
wherein the identification candidate region extraction unit comprises:
an identification candidate region ID assigning unit that assigns IDs to a plurality of the identification candidate regions; and
an identification candidate region information management unit that collectively manages, for each identification candidate region, its ID, its position in the image information, and its position in the three-dimensional information.

An object detection device that determines whether a detection target exists within a measurement range, the device comprising:
a three-dimensional information acquisition unit that acquires three-dimensional information within the measurement range based on input from an imaging device;
an identification candidate region extraction unit that extracts an identification candidate region in which the detection target may exist;
a classifier used to detect the detection target;
a classifier information acquisition unit that acquires information of the classifier;
an image conversion method determination unit that determines parameters for virtually performing viewpoint conversion processing on the three-dimensional information in the identification candidate region;
an image conversion execution unit that generates a converted image based on the three-dimensional information in the identification candidate region that has been virtually subjected to the viewpoint conversion processing; and
an identification unit that detects the detection target using the classifier based on the converted image,
wherein the classifier information acquisition unit acquires a classifier ID and classifier information representing the input signal for which the classifier exhibits particularly high discrimination capability.

The object detection device according to any one of claims 1 to 3,
wherein the image conversion method determination unit
performs optimization processing on the result of the virtual viewpoint conversion processing, and determines the parameters that realize image conversion to the image best suited to the classifier for determining whether the detection target is included in the identification candidate region.

The object detection device according to claim 3,
wherein the classifier information is any one of a template, color information, luminance information, a contour, and gradient information.

The object detection device according to claim 5,
wherein, when the classifier information is a template,
the image conversion method determination unit calculates the similarity between the image obtained by performing the viewpoint conversion processing and the template, and selects the classifier having the maximum similarity.
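As a non-limiting illustration of the template-based selection recited above, the following minimal sketch scores a converted image against each classifier's template and keeps the best match. The similarity measure (zero-mean normalized cross-correlation), the function names `ncc` and `select_classifier`, and the dictionary keyed by classifier ID are assumptions made for this example; the claim does not prescribe a particular similarity measure.

```python
import numpy as np

def ncc(image, template):
    """Zero-mean normalized cross-correlation of two equally sized images (assumed measure)."""
    a = np.asarray(image, dtype=float).ravel()
    b = np.asarray(template, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A constant image has no contrast to correlate, so treat it as dissimilar.
    return float(a @ b / denom) if denom > 0 else 0.0

def select_classifier(converted_image, templates):
    """Return (classifier_id, similarity) for the template most similar to the image.

    `templates` is a hypothetical mapping from classifier ID to template image.
    """
    scores = {cid: ncc(converted_image, t) for cid, t in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

In practice the same scoring could be repeated for each candidate viewpoint, so that the viewpoint and the classifier are chosen together.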

The object detection device according to any one of claims 1 to 6,
wherein the image conversion method determination unit has a function of determining the parameters by using camera parameters that express the installation state of the imaging device.

The object detection device according to any one of claims 1 to 7,
wherein the identification unit comprises:
a classifier recording unit that records at least one classifier having discrimination capability with respect to the detection target;
an identification processing execution unit that uses the classifier to perform identification processing for identifying whether the detection target is included in the converted image; and
an identification result output unit that outputs a result when it is determined that the converted image includes the detection target.

The object detection device according to any one of claims 1 to 8,
wherein the image conversion method determination unit determines the parameters based on the three-dimensional shape of the detection target.

The object detection device according to claim 9,
wherein the image conversion method determination unit has:
a function of acquiring a straight line that passes through the identification candidate region and indicates the approximate posture direction of the detection target; and
a function of acquiring the parameters that realize a virtual viewpoint conversion such that the straight line is parallel to the Y axis of the camera coordinate system of the imaging device.
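As a non-limiting illustration of a viewpoint conversion that makes the posture line parallel to the camera Y axis, the sketch below builds a rotation matrix from the line direction using Rodrigues' formula. Representing the conversion as a single rotation matrix, and the function name `rotation_aligning_to_y`, are assumptions for this example; the claim only requires parameters that achieve the parallel state.

```python
import numpy as np

def rotation_aligning_to_y(direction):
    """Rotation matrix R such that R @ direction is parallel to the Y axis."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    y = np.array([0.0, 1.0, 0.0])
    v = np.cross(d, y)          # rotation axis, with length sin(theta)
    s = np.linalg.norm(v)
    c = float(d @ y)            # cos(theta)
    if s < 1e-12:
        # The line already lies along the Y axis (either orientation), so no rotation is needed.
        return np.eye(3)
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    # Rodrigues' formula written for an unnormalized axis v.
    return np.eye(3) + K + K @ K * ((1.0 - c) / s**2)
```

Applying `R` to every point of the three-dimensional point cloud in the identification candidate region (for example `points @ R.T`) gives the cloud as seen after the virtual conversion.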

The object detection device according to claim 10,
wherein the image conversion method determination unit,
after converting the straight line into a state parallel to the Y axis of the camera coordinate system, performs virtual viewpoint conversion to viewpoints that observe the identification candidate region from the front, from the side, and from above, and acquires a converted image at each of the viewpoints.

The object detection device according to claim 10 or 11,
wherein the image conversion method determination unit
determines the straight line by connecting the two points having the maximum Euclidean distance among the points of the three-dimensional point cloud included in the identification candidate region.
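As a non-limiting illustration of determining the straight line from the two most distant points, the sketch below compares all point pairs by brute force. The function name `farthest_pair_line` and the return convention (a point on the line plus a unit direction) are assumptions for this example; the quadratic cost is acceptable only for small point clouds.

```python
import numpy as np

def farthest_pair_line(points):
    """Return (origin, unit_direction) of the line through the two most distant points.

    `points` is an (N, 3) array with at least two distinct points.
    """
    p = np.asarray(points, dtype=float)
    diff = p[:, None, :] - p[None, :, :]            # all pairwise difference vectors
    d2 = np.einsum("ijk,ijk->ij", diff, diff)       # squared Euclidean distances
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    direction = p[j] - p[i]
    return p[i], direction / np.linalg.norm(direction)
```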

The object detection device according to claim 10 or 11,
wherein the image conversion method determination unit
performs principal component analysis on the three-dimensional point cloud included in the identification candidate region, and
determines the straight line by taking the direction of the first principal component.
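As a non-limiting illustration of the principal component analysis recited above, the sketch below takes the eigenvector of the point cloud's covariance matrix with the largest eigenvalue as the line direction. The function name `principal_axis` and the choice of the centroid as the point on the line are assumptions for this example.

```python
import numpy as np

def principal_axis(points):
    """Return (centroid, unit_direction) of the first principal component of an (N, 3) cloud."""
    p = np.asarray(points, dtype=float)
    centroid = p.mean(axis=0)
    cov = np.cov((p - centroid).T)           # 3x3 covariance; rows of the input are x, y, z
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues are returned in ascending order
    return centroid, eigvecs[:, -1]          # last column: direction of largest variance
```

Compared with the farthest-pair approach, this uses every point in the region, so a single outlying point has less influence on the resulting direction.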

The object detection device according to claim 10,
wherein the image conversion method determination unit
estimates a floor surface in the measurement range,
detects a specific part of the detection target in the identification candidate region, and
determines, as the straight line, a straight line that passes through one point corresponding to the part and extends in a direction orthogonal to the floor surface.
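As a non-limiting illustration of the floor-based determination recited above, the sketch below fits a plane to points assumed to lie on the floor and returns the line through a detected part along the plane normal. The least-squares plane fit via singular value decomposition, the function names `floor_normal` and `posture_line`, and the assumption that floor points and the part position (for example a head position) are already available are all choices made for this example; the claim does not specify how the floor or the part is obtained.

```python
import numpy as np

def floor_normal(floor_points):
    """Unit normal of the least-squares plane through an (N, 3) set of floor points."""
    p = np.asarray(floor_points, dtype=float)
    centered = p - p.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]    # right singular vector with the smallest singular value

def posture_line(floor_points, part_point):
    """Return (origin, unit_direction) of the line through the part, orthogonal to the floor."""
    return np.asarray(part_point, dtype=float), floor_normal(floor_points)
```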

An object detection method for determining whether a detection target exists within a measurement range, the method comprising:
acquiring three-dimensional information within the measurement range based on input from an imaging device;
extracting an identification candidate region in which the detection target may exist;
acquiring information of a classifier used to detect the detection target;
determining parameters for virtually performing viewpoint conversion processing on the three-dimensional information in the identification candidate region;
generating a converted image based on the three-dimensional information in the identification candidate region that has been virtually subjected to the viewpoint conversion processing;
detecting the detection target using the classifier based on the converted image; and
acquiring image information within the measurement range based on the input from the imaging device,
wherein the image information, the three-dimensional information, and the information of the classifier are used to determine the parameters for generating the converted image that is optimal as an input to the classifier.