JP2021105820A

JP2021105820A - Object detection device and object detection method

Info

Publication number: JP2021105820A
Application number: JP2019236503A
Authority: JP
Inventors: 恒進唐; Koushin Tou; 淳平加藤; Jumpei Kato; 多聞山納; Tamon Yamano; 慎也原瀬; Shinya Harase
Original assignee: NTT Comware Corp
Current assignee: NTT Comware Corp
Priority date: 2019-12-26
Filing date: 2019-12-26
Publication date: 2021-07-26
Anticipated expiration: 2039-12-26
Also published as: JP7236377B2

Abstract

To improve accuracy in object detection and position estimation. SOLUTION: An object detection device comprises: detection units 12A and 12B for detecting a target object from respective images obtained by photographing the same area using a plurality of cameras with different elevation/depression angles and calculating a plurality of position coordinate groups, one for each of the plurality of cameras; and a matching unit 20 for integrating the plurality of position coordinate groups and determining the position coordinates of the same object. The detection units 12A and 12B detect a region where the target object is present, including objects related to the target object, and then detect the target object and a portion including a shadow of the target object from inside that region. SELECTED DRAWING: Figure 1

Description

The present invention relates to an object detection device and an object detection method.

Public infrastructure is the foundation of industry and daily life. Structures such as utility poles are one example. Because a collapsed utility pole can cause a power outage or cut off communications, inspection and maintenance of public infrastructure is important. To inspect and maintain public infrastructure efficiently, the exact location of each structure to be inspected must be known.

In recent years, technology for automatically detecting objects in images using machine learning has been attracting attention. Techniques such as deep learning make it possible to detect roads, buildings, and the like from images with high accuracy.

Japanese Unexamined Patent Application Publication No. 2018-97588 (JP-A-2018-97588)

Images taken from the sky, such as aerial photographs and satellite photographs, provide information on a wide area at once. However, depending on factors such as the size of the target object and the time of shooting, it may not be possible to detect a structure using only images taken from the sky.

Images taken with an in-vehicle camera have the advantage of being easy to analyze in detail, but they are costly to capture, and the covered area tends to be narrower than that of aerial or satellite photographs. Images taken with a camera installed on an upper floor of a building give a wide view from a bird's-eye direction, but because the shooting direction is fixed, some structures cannot be detected.

Whichever of the vertical, horizontal, or bird's-eye direction images is used, objects must be detected from the images with higher accuracy.

The present invention has been made in view of the above, and an object of the present invention is to improve the accuracy of object detection and position estimation.

An object detection device according to one aspect of the present invention includes: a detection unit that detects a target object from each of images of the same area taken by a plurality of cameras having different elevation/depression angles, and obtains a plurality of position coordinate groups, one for each of the plurality of cameras; and a matching unit that integrates the plurality of position coordinate groups and determines the position coordinates of the same object. The detection unit first detects a region in which the target object exists, including objects related to the target object, and then detects, from within that region, a portion including the target object and the shadow of the target object.

According to the present invention, the accuracy of object detection and position estimation can be improved.

FIG. 1 is a diagram showing a configuration example of the object detection device of the present embodiment.

FIG. 2 is a diagram showing an example of an image taken from the vertical direction that is input to the object detection device.

FIG. 3 is a diagram showing an example of an image taken from the horizontal direction that is input to the object detection device.

FIG. 4 is a diagram showing an example of the area information input to the object detection device.

FIG. 5 is a diagram showing an example of a utility pole and utility pole accessories detected by the object detection device.

FIG. 6 is a diagram showing an example in which a utility pole and its shadow are detected together.

FIG. 7 is a diagram showing the positions of target objects detected from each of the images taken by a plurality of cameras having different elevation/depression angles.

FIG. 8 is a diagram for explaining the same-object determination process.

FIG. 9 is a diagram showing how the position coordinates of a target object are corrected.

FIG. 10 is a flowchart showing the processing flow of the object detection device.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

A configuration example of the object detection device of the present embodiment will be described with reference to FIG. 1. The object detection device of the present embodiment integrates the position coordinates obtained from each of the images of the same area taken by a plurality of cameras having different elevation/depression angles, and outputs the position information of the detected objects. Specifically, the object detection device detects target objects from an aerial photograph or a satellite photograph taken from the sky and obtains their position coordinates, detects target objects from images taken by an in-vehicle camera and obtains their position coordinates, and integrates the position coordinates obtained from each set of images. In doing so, it determines which detections are the same object and merges the position coordinates of the same object. The following description uses an aerial photograph and images taken by an in-vehicle camera as an example, but the invention is not limited to this. Bird's-eye view images taken from an upper floor of a building may also be used. It suffices that position coordinates can be obtained from each of the images taken by a plurality of cameras having different elevation/depression angles and then integrated.

The object detection device of FIG. 1 includes position detection units 10A and 10B, a matching unit 20, and a position correction unit 30. Each unit of the object detection device may be implemented by a computer including an arithmetic processing unit, a storage device, and the like, with the processing of each unit executed by a program. This program is stored in a storage device of the object detection device, and it can be recorded on a recording medium such as a magnetic disk, an optical disk, or a semiconductor memory, or provided through a network.

The position detection units 10A and 10B include input units 11A and 11B, which input images and area information, and detection units 12A and 12B, which detect target objects from the images and obtain the positions of the target objects.

The input units 11A and 11B input images taken by a plurality of cameras having different elevation/depression angles from the image storage units 51A and 51B, respectively. For example, the input unit 11A inputs, from the image storage unit 51A, an image taken from the vertical direction, such as the aerial photograph or satellite photograph shown in FIG. 2, and the input unit 11B inputs, from the image storage unit 51B, an image taken from the horizontal direction, such as the photograph taken with an in-vehicle camera shown in FIG. 3. Still images cut out frame by frame from a moving image may be used as the images. The input units 11A and 11B also input shooting information, such as the shooting location, shooting direction, and shooting time of each image, together with the image. The input units 11A and 11B may perform preprocessing such as filtering.

The input units 11A and 11B also input, from the area information storage unit 52, the area information of the area where the images were taken, and set a search target area or a non-search target area for the target object. Area information consists of geometric information, which represents figures by points, lines, and polygons, and attribute information, which is either qualitative, such as land use and geology, or quantitative, such as elevation.

An outline of setting the search target area from the area information will be described with reference to FIG. 4. In the example of FIG. 4, a road map, a topographic map, and a vegetation distribution map are input to set the search target area. The road map is a map showing roads as line segments. The topographic map is a digital elevation model (DEM), that is, data in which the ground surface is divided into squares and each square is given an elevation value. The vegetation distribution map shows, as a vegetation distribution, whether or not each area is natural forest. For example, if the target object is a traffic signal, signals exist only along roads, so the vicinity of roads is set as a search target area, areas with little undulation are set as a search target area, and areas that are not natural forest are set as a search target area. In the example of FIG. 4, the search target area is shown in gray in each of the road map, the topographic map, and the vegetation distribution map. The search target areas identified from these pieces of area information are superimposed to set the search target area in which the target object is searched for. The area information is not limited to these. For example, attribute information on rivers, ponds, and the like may be input, and such water bodies may be set as non-search target areas.
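The superimposition of per-layer search areas can be sketched as follows. This is a minimal illustration only: it assumes that each piece of area information has already been rasterized onto a common grid of boolean cells, and that "superimposing" means keeping the cells marked as search targets in every layer. The function name `overlay_search_area` and the sample grids are hypothetical and do not appear in the description.

```python
from typing import List

Grid = List[List[bool]]  # True = cell belongs to the search target area


def overlay_search_area(layers: List[Grid]) -> Grid:
    """Keep only the cells that every layer marks as a search target (assumed rule)."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [all(layer[r][c] for layer in layers) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2x3 rasters derived from a road map, a DEM, and a vegetation map.
near_road = [[True, True, False], [True, True, True]]
low_relief = [[True, False, False], [True, True, True]]
not_forest = [[True, True, True], [False, True, True]]

search_area = overlay_search_area([near_road, low_relief, not_forest])
```

Cells that come out `False` would correspond to the non-search target area that is masked or skipped during detection.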

The input units 11A and 11B may mask the non-search target area of an input image. Alternatively, they may notify the detection units 12A and 12B of the search target area or the non-search target area, and the detection units 12A and 12B may then process the image while excluding the portion corresponding to the non-search target area.

Each of the detection units 12A and 12B automatically detects target objects from the images and obtains a position coordinate group (containing one or more position coordinates) of the target objects. The detection units 12A and 12B detect target objects from the input images using a determination model generated by learning images of the target object through machine learning, and obtain the position coordinates on the map of each detected target object from the shooting information of the image. The detection unit 12A obtains a position coordinate group of target objects from an image of a wide area taken from the sky, and the detection unit 12B obtains a position coordinate group of target objects from a group of images taken by an in-vehicle camera within the area corresponding to the image taken from the sky. In other words, a plurality of position coordinate groups are obtained, one for each image group (that is, for each camera) input by the input units 11A and 11B.

The detection units 12A and 12B detect not only the target object itself but also related objects. That is, the detection units 12A and 12B are trained on the target object together with its related objects, and at detection time they detect the target object together with its related objects. For example, when the target object is a utility pole, the shadow 110 of the utility pole 100 and the overhead wires 120 it supports are also detected, as shown in FIG. 5. Thus, even when detection accuracy would be low for the utility pole alone, detecting the related objects together can be expected to improve accuracy.

By detecting the target object again from within the region detected together with its related objects, the detection units 12A and 12B can obtain the position coordinates of the target object accurately. For example, the region in the example of FIG. 5 contains the utility pole 100, which is the target object, but to obtain the position coordinates of the utility pole 100 accurately, the position of the utility pole 100 itself within the image should be identified. The detection units 12A and 12B obtain the position coordinates of the utility pole 100 based on the position of the base portion of the utility pole 100 detected within the region.

As shown in FIGS. 6(a) and 6(b), the detection units 12A and 12B may detect the utility pole 100 and the shadow 110 together and, based on the positions of the utility pole 100 and the shadow 110, use the base portion 200A of the utility pole 100 as the position coordinates. Alternatively, the corner 200B of the detection frame enclosing the utility pole 100 and the shadow 110, on the side to which the shadow does not extend, may be used as the position coordinates of the utility pole 100. In FIG. 6(a), the shadow 110 extends to the right of the utility pole 100, so the lower left corner of the detection frame is used as the position coordinates of the utility pole 100. In FIG. 6(b), the shadow 110 extends to the left of the utility pole 100, so the lower right corner of the detection frame is used as the position coordinates of the utility pole 100. This speeds up processing while maintaining sufficient accuracy. Depending on the positional relationship between the utility pole 100 and the shadow 110, the detection units 12A and 12B may use a point on a side of the detection frame, or the center of the detection frame, as the position coordinates of the utility pole 100.
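The corner rule of FIGS. 6(a) and 6(b) can be sketched as below. The `Box` type, the function `pole_anchor`, and the string values for the shadow side are hypothetical names introduced for illustration. The sketch also assumes image coordinates in which y grows downward, so the lower edge of the detection frame is `bottom`.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Detection frame enclosing the pole and its shadow (image pixels, y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def pole_anchor(box: Box, shadow_side: str) -> Tuple[float, float]:
    """Return the lower corner on the side to which the shadow does NOT extend."""
    if shadow_side == "right":   # case of FIG. 6(a): shadow extends right, use lower-left corner
        return (box.left, box.bottom)
    if shadow_side == "left":    # case of FIG. 6(b): shadow extends left, use lower-right corner
        return (box.right, box.bottom)
    raise ValueError("shadow_side must be 'left' or 'right'")


frame = Box(left=10.0, top=20.0, right=50.0, bottom=80.0)
anchor_a = pole_anchor(frame, "right")
anchor_b = pole_anchor(frame, "left")
```

Because only the frame corners are used, no second detection pass over the region is needed, which is consistent with the speed-up the description mentions.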

The detection units 12A and 12B may be trained on the target object together with its related objects, divided into classes according to how the related objects appear. Specifically, classes are assigned according to the direction of the shadow extending from the utility pole. The direction in which the shadow extends from the utility pole within an image can be identified from the shooting location, shooting direction, and shooting time of the image. For example, in an image taken facing north at around 10 a.m. in spring or autumn, the shadow of the target object extends toward the upper left of the image. Because the shooting location, shooting direction, and shooting time of the image to be processed are known, when detecting target objects from an image, the detection units detect the target object and related objects of the class that matches the conditions of the image. From an image taken facing north at around 10 a.m. in spring or autumn, target objects belonging to the class whose shadow extends toward the upper left are detected.
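One way to pick the shadow-direction class from the shooting conditions is sketched below. It assumes that the sun azimuth (degrees clockwise from north) has already been derived from the shooting location and time by some external means, that the shadow points directly away from the sun, and that the classes are four image quadrants in which "upper" means the direction the camera faces. The function `shadow_class` and the quadrant labels are hypothetical and not taken from the description.

```python
def shadow_class(sun_azimuth_deg: float, camera_heading_deg: float) -> str:
    """Map sun azimuth and camera heading to an assumed image-quadrant class."""
    # The shadow extends in the direction opposite to the sun.
    shadow_azimuth = (sun_azimuth_deg + 180.0) % 360.0
    # Angle of the shadow relative to the camera heading: 0 = straight ahead (up in image).
    relative = (shadow_azimuth - camera_heading_deg) % 360.0
    if relative < 90.0:
        return "upper-right"
    if relative < 180.0:
        return "lower-right"
    if relative < 270.0:
        return "lower-left"
    return "upper-left"


# Mid-morning sun assumed to be roughly in the southeast (azimuth 135), camera facing north.
morning_north = shadow_class(sun_azimuth_deg=135.0, camera_heading_deg=0.0)
```

With these assumed inputs the result is the upper-left class, which agrees with the example in the text of an image taken facing north at around 10 a.m.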

To make up for insufficient training data, the detection units 12A and 12B may generate data in which objects related to the target object are composited during training. For example, data in which shadows extending in various directions are composited onto images of utility poles is generated and used for training. Data to which objects attached to utility poles, such as overhead wires and closures, have been added may also be generated.

For each target object detected from the images, the detection units 12A and 12B obtain position coordinates for mapping it onto a map. The position coordinates of a target object detected from vertical or horizontal images can be estimated from the position coordinates of the camera, the elevation/depression angle of the camera, the position coordinates attached to the image (for example, the position coordinates of its four corners), and the detected coordinates of the target object within the image. For example, from horizontal images, the position coordinates can be estimated by triangulation using two images of the same target object. Alternatively, a depth map may be created from a horizontal moving image to estimate the position coordinates. The information necessary for estimating the position coordinates is input together with the images when the input units 11A and 11B input them. The method of estimating the position coordinates is not limited to these. Each of the detection units 12A and 12B outputs a position coordinate group of target objects. Because these position coordinate groups contain position coordinates that refer to the same object, the matching unit 20 in the subsequent stage determines which ones are the same object.
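The triangulation from two horizontal images can be illustrated with a planar sketch. It assumes that each image yields a camera position on a local map plane (x east, y north, in metres) and a bearing to the detected object measured clockwise from north. How the bearing is derived from the detected pixel coordinates is outside the sketch. The function `triangulate` is a hypothetical helper, not a method named in the description.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def triangulate(p1: Point, bearing1_deg: float,
                p2: Point, bearing2_deg: float) -> Optional[Point]:
    """Intersect two bearing lines; return None when they are (nearly) parallel."""
    b1, b2 = math.radians(bearing1_deg), math.radians(bearing2_deg)
    d1 = (math.sin(b1), math.cos(b1))  # unit direction of line 1
    d2 = (math.sin(b2), math.cos(b2))  # unit direction of line 2
    denom = d1[0] * d2[1] - d1[1] * d2[0]  # 2D cross product d1 x d2
    if abs(denom) < 1e-12:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t1*d1 = p2 + t2*d2 for t1.
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


# Two hypothetical camera positions 10 m apart, both sighting the same pole.
pole = triangulate((0.0, 0.0), 45.0, (10.0, 0.0), 315.0)
```

A fuller version would also reject intersections that lie behind either camera; that check is omitted here to keep the sketch short.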

Although FIG. 1 shows two detection units 12A and 12B, the configuration is not limited to this. A single detection unit may process each set of images taken by the plurality of cameras and obtain a position coordinate group for each set. Further detection units may also be added to process images taken by three or more different cameras (for example, cameras in the vertical, horizontal, and bird's-eye directions). In this case, a position coordinate group is obtained for each image group having a different elevation/depression angle (for example, the image groups in the vertical, horizontal, and bird's-eye directions).

The matching unit 20 matches the position coordinate group obtained by the detection unit 12A against the position coordinate group obtained by the detection unit 12B to determine which coordinates belong to the same object, and integrates the position coordinate groups.

FIG. 7 shows the positions of the target objects detected by each of the detection units 12A and 12B. In FIG. 7, the positions of the target objects detected by the detection unit 12A are indicated by circles (○), and the positions of the target objects detected by the detection unit 12B are indicated by triangles (△). Detecting target objects from each of the images taken by a plurality of cameras having different elevation/depression angles and integrating the detection results improves the detection rate of the target objects.

When integrating the position coordinate groups, the matching unit 20 determines which of the position coordinates output by the detection units 12A and 12B belong to the same object.

For example, as a first method, the matching unit 20 determines that the position coordinates detected by the detection units 12A and 12B are those of the same object when the deviation between them is a certain distance or less, as shown by the following equation.

$$d(x_{i1,j1}, x_{i2,j2}) \le \varepsilon$$

Here, $x_{i1,j1}$ and $x_{i2,j2}$ are the position coordinates obtained from the images taken by the $j1$-th and $j2$-th cameras, respectively. $d(x_{i1,j1}, x_{i2,j2})$ is the error between the position coordinates $x_{i1,j1}$ and $x_{i2,j2}$. $\varepsilon$ is a parameter for determining whether or not they are the same object.
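A minimal sketch of the first method follows. It assumes that the position coordinates are planar map coordinates in metres and that the error d is the Euclidean distance. Neither assumption is fixed by the description, and the function name `match_by_distance` and the sample coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def match_by_distance(group_a: List[Point], group_b: List[Point],
                      eps: float) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) whose coordinates lie within eps of each other."""
    return [
        (i, j)
        for i, a in enumerate(group_a)
        for j, b in enumerate(group_b)
        if math.dist(a, b) <= eps  # d(x_{i1,j1}, x_{i2,j2}) <= eps, with d assumed Euclidean
    ]


aerial = [(0.0, 0.0), (100.0, 0.0)]    # coordinates from the vertical images (hypothetical)
vehicle = [(1.0, 1.0), (50.0, 50.0)]   # coordinates from the horizontal images (hypothetical)
pairs = match_by_distance(aerial, vehicle, eps=2.0)
```

Coordinates left unpaired in either group are simply kept as separate objects when the groups are integrated.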

As a second method, the matching unit 20 calculates the regions defined by the position coordinates detected by each of the detection units 12A and 12B and by parameters set for each camera (or for each of the detection units 12A and 12B), and determines that the position coordinates are those of the same object when the regions share a common region, as shown by the following equation.

$$\exists\, x_i : \quad d(x_i, x_{i1,j1}) \le \varepsilon_{j1} \ \text{ and } \ d(x_i, x_{i2,j2}) \le \varepsilon_{j2}$$

Here, $x_i$ is the estimation target (the search is for whether a point satisfying the condition exists), and $\varepsilon_{j1}$ and $\varepsilon_{j2}$ are parameters set for each camera.

As shown in FIG. 8, when the region of radius $\varepsilon_{j1}$ centered on the position coordinates $x_{i1,j1}$ and the region of radius $\varepsilon_{j2}$ centered on the position coordinates $x_{i2,j2}$ share a common region 300, the position coordinates $x_{i1,j1}$ and $x_{i2,j2}$ are determined to be the position coordinates of the same object.

Even when a common region 300 exists, the matching unit 20 does not determine the detections to be the same object if the region 300 is a place where the target object cannot exist, such as in a river. Specifically, the matching unit 20 does not determine them to be the same object when the region 300 does not include any of the search target area that the input units 11A and 11B set from the area information, or when the region 300 is contained in the non-search target area. This allows the same object to be determined more accurately than when the determination is based only on the distance between the position coordinates.
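The second method, including the area-information check, might be sketched as follows. The sketch assumes circular regions with Euclidean distance. It approximates the common region 300 by sampling a grid over the first circle, so a very thin overlap can be missed. The predicate `is_searchable` stands in for a lookup into the search target area, and it and `same_object_by_region` are hypothetical names.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]


def same_object_by_region(p1: Point, eps1: float, p2: Point, eps2: float,
                          is_searchable: Callable[[Point], bool],
                          steps: int = 20) -> bool:
    """True if the two circles overlap and the overlap contains a searchable point."""
    if math.dist(p1, p2) > eps1 + eps2:
        return False  # circles do not intersect, so no common region exists
    for ix in range(steps + 1):
        for iy in range(steps + 1):
            # Grid point inside the bounding square of circle 1.
            q = (p1[0] - eps1 + 2 * eps1 * ix / steps,
                 p1[1] - eps1 + 2 * eps1 * iy / steps)
            in_common = math.dist(q, p1) <= eps1 and math.dist(q, p2) <= eps2
            if in_common and is_searchable(q):
                return True
    return False


on_land = same_object_by_region((0.0, 0.0), 2.0, (3.0, 0.0), 2.0, lambda q: True)
# Hypothetical river covering x >= 0: the whole overlap lies in the non-search area.
in_river = same_object_by_region((0.0, 0.0), 2.0, (3.0, 0.0), 2.0, lambda q: q[0] < 0.0)
```

Giving each camera its own radius lets a less accurate source, such as the in-vehicle images, tolerate a larger deviation than the aerial source.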

As a third method, the matching unit 20 calculates penalty values using the position coordinates detected by each of the detection units 12A and 12B and error functions set for each camera (or for each of the detection units 12A and 12B), and determines whether or not the detections are the same object based on the calculated penalty values. Specifically, using the error functions $c_{j1}$ and $c_{j2}$ set for each camera, penalty values $P_{j1}$ and $P_{j2}$ are obtained according to the distances from the position coordinates $x_{i1,j1}$ and $x_{i2,j2}$ detected by the detection units 12A and 12B to an estimation target $x_i$. When there exists an estimation target $x_i$ for which the sum of the penalty values divided by 2 (the number of cameras) is less than or equal to the parameter $\varepsilon$, that is, $(P_{j1} + P_{j2})/2 \le \varepsilon$, the position coordinates $x_{i1,j1}$ and $x_{i2,j2}$ are determined to be the position coordinates of the same object.

The error functions $c_{j1}$ and $c_{j2}$ are, for example, functions whose values increase with distance from the position coordinates $x_{i1,j1}$ and $x_{i2,j2}$ detected by the detection units 12A and 12B. For example, for a detection unit with a small detection error, an error function whose value rises steeply with distance from the detected position coordinates is set, and for a detection unit with a large detection error, an error function whose value rises gently with distance from the detected position coordinates is set.
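The third method can be sketched under several assumptions that the description does not fix: the error functions take a Euclidean distance as their argument, they are non-decreasing in that distance (so the best estimation target lies on the segment between the two detections), and they are quadratic with a per-camera scale. The names `min_mean_penalty`, `steep`, and `gentle` are hypothetical.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]
ErrorFn = Callable[[float], float]  # maps a distance to a penalty value


def min_mean_penalty(p1: Point, c1: ErrorFn, p2: Point, c2: ErrorFn,
                     samples: int = 200) -> float:
    """Smallest (P_j1 + P_j2) / 2 over candidate targets sampled between p1 and p2."""
    best = math.inf
    for k in range(samples + 1):
        t = k / samples
        x = (p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]))
        mean_penalty = (c1(math.dist(x, p1)) + c2(math.dist(x, p2))) / 2
        best = min(best, mean_penalty)
    return best


def same_object_by_penalty(p1: Point, c1: ErrorFn, p2: Point, c2: ErrorFn,
                           eps: float) -> bool:
    """True if some estimation target keeps the mean penalty at or below eps."""
    return min_mean_penalty(p1, c1, p2, c2) <= eps


steep = lambda d: (d / 1.0) ** 2   # small detection error: penalty rises quickly
gentle = lambda d: (d / 5.0) ** 2  # large detection error: penalty rises slowly

lowest = min_mean_penalty((0.0, 0.0), steep, (6.0, 0.0), gentle)
```

With these sample functions the best target sits close to the detection with the steep function, which reflects the idea that the more reliable camera should dominate the estimate.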

The case of three or more detection units can be handled by extending the above equation.

In the third method as well, the area information may be used in determining the same object.

The position correction unit 30 merges the position coordinates that were detected by the detection units 12A and 12B and determined to be the same object into one, and corrects position coordinates based on the position coordinates determined to be the same object.

For example, the position coordinates detected from the vertical images, from which position coordinates can be obtained more accurately, are taken as correct, and a correction amount is obtained for the position coordinates of the same objects detected from the horizontal images. The position coordinates of objects that could be detected only from the horizontal images are then corrected by the obtained correction amount. In FIG. 9, the position coordinates detected from the vertical images are indicated by circles (○), and the position coordinates detected from the horizontal images are indicated by triangles (△). The position correction unit 30 obtains the correction amount for aligning the position coordinates (△) with the position coordinates (○). The position correction unit 30 then corrects the position coordinates detected only from the horizontal images by the obtained correction amount.
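The correction step can be illustrated with a simple sketch. The description does not specify the form of the correction amount. The sketch assumes that it is a single translation, computed as the mean offset between matched vertical and horizontal coordinates. The functions `mean_offset` and `apply_offset` and the sample values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_offset(pairs: List[Tuple[Point, Point]]) -> Point:
    """Mean translation moving horizontal-image coordinates onto vertical-image ones."""
    n = len(pairs)
    dx = sum(v[0] - h[0] for v, h in pairs) / n
    dy = sum(v[1] - h[1] for v, h in pairs) / n
    return (dx, dy)


def apply_offset(points: List[Point], offset: Point) -> List[Point]:
    """Shift every point by the correction amount."""
    return [(x + offset[0], y + offset[1]) for x, y in points]


# (vertical-image coordinate, horizontal-image coordinate) for objects judged identical.
matched = [((10.0, 10.0), (9.0, 8.0)), ((20.0, 20.0), (19.0, 18.0))]
offset = mean_offset(matched)

# Objects seen only in the horizontal images receive the same correction.
corrected = apply_offset([(5.0, 5.0)], offset)
```

A real system might instead fit a local or affine correction. The single translation is used here only because it is the simplest form consistent with the text.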

When there are target objects whose accurate position coordinates are known from surveying or the like, a correction amount for aligning the detected position coordinates with the accurate position coordinates may be obtained separately for the vertical direction and the horizontal direction, and the position coordinates obtained from the vertical direction and those obtained from the horizontal direction may each be corrected by the corresponding correction amount. Specifically, for a target object whose accurate position coordinates are known, the position correction unit 30 obtains the correction amount for aligning the position coordinates obtained from the vertical direction with the accurate position coordinates, and corrects the position coordinates of the other target objects obtained from the vertical direction by that correction amount. Likewise, for the horizontal direction, the position correction unit 30 obtains the correction amount for aligning the position coordinates obtained from the horizontal direction with the accurate position coordinates, and corrects the position coordinates of the other target objects obtained from the horizontal direction by that correction amount.

別の方法として、位置補正部３０は、カメラの誤差を考慮した次式で表される目的関数Ｌを最小化する位置座標の組み合わせを探索することで、位置座標を補正してもよい。 Alternatively, the position correction unit 30 may correct the position coordinates by searching for a combination of position coordinates that minimizes the objective function L represented by the following equation in consideration of the camera error.

ここで、ｘ_iは、最適化対象であって、ｉ番目の対象物体の位置座標である。ｘ_i,jは、ｊ番目のカメラの画像から求めたｉ番目の対象物体の位置座標である。ｊ番目のカメラの画像からｉ番目の対象物体を検出していない場合はｘ_i＝ｘ_i,jとみなす。ｃ_jはカメラごとに設定した誤差関数である。マッチング部２０の第３の手法で利用する誤差関数と同じでよい。Ｎは対象物体の合計数である。同一物体の位置座標はまとめて１つと数える。Ｋはカメラの合計数である。Ｋ_iはｘ_iを検出したカメラの合計数である。 Here, x_i is the optimization target and is the position coordinate of the i-th target object. x_{i,j} is the position coordinate of the i-th target object obtained from the image of the j-th camera. If the i-th target object is not detected in the image of the j-th camera, x_i = x_{i,j} is assumed. c_j is an error function set for each camera; it may be the same as the error function used in the third method of the matching unit 20. N is the total number of target objects, where the position coordinates of the same object are counted together as one. K is the total number of cameras. K_i is the total number of cameras that detected x_i.
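The equation for L is not reproduced in this text, so the following sketch rests on two clearly labeled assumptions. First, the objective is assumed to have the form L = Σ_{i=1..N} (1/K_i) Σ_{j=1..K} c_j(x_i, x_{i,j}), which is one form consistent with the definitions above but is not confirmed by the source. Second, c_j is assumed to be a weighted squared distance w_j·‖x_i − x_{i,j}‖², with hypothetical weights w_j. Under these assumptions the search for the minimizing coordinates reduces to a per-object weighted mean; a different c_j would generally require a numerical optimizer instead.

```python
def objective(x, observations, weights):
    """ASSUMED form: L = sum_i (1/K_i) * sum_j w_j * ||x_i - x_ij||^2.
    Cameras that did not detect object i contribute nothing, which matches
    treating x_i = x_ij for them when c_j is zero at zero error."""
    total = 0.0
    for i, per_camera in observations.items():
        k_i = len(per_camera)  # number of cameras that detected object i
        for j, (ox, oy) in per_camera.items():
            ex, ey = x[i][0] - ox, x[i][1] - oy
            total += weights[j] * (ex * ex + ey * ey) / k_i
    return total

def minimize(observations, weights):
    """Closed-form minimizer for the assumed quadratic c_j:
    the weighted mean of each object's observations."""
    best = {}
    for i, per_camera in observations.items():
        w_sum = sum(weights[j] for j in per_camera)
        bx = sum(weights[j] * p[0] for j, p in per_camera.items()) / w_sum
        by = sum(weights[j] * p[1] for j, p in per_camera.items()) / w_sum
        best[i] = (bx, by)
    return best

# Hypothetical data: object 0 is seen by both cameras, object 1 by one.
observations = {
    0: {"vertical": (10.0, 20.0), "horizontal": (12.0, 18.0)},
    1: {"horizontal": (30.0, 40.0)},
}
weights = {"vertical": 3.0, "horizontal": 1.0}  # vertical view trusted more

best = minimize(observations, weights)
loss_at_best = objective(best, observations, weights)
```

With these weights, object 0 is placed closer to the vertical-view observation, which reflects the statement that the vertical image gives more accurate coordinates.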

図１０のフローチャートを参照し、本実施形態の物体検出装置の動作について説明する。 The operation of the object detection device of the present embodiment will be described with reference to the flowchart of FIG. 10.

ステップＳ１１にて、位置検出部１０Ａは、垂直方向から撮影した画像を入力し、対象物体の位置座標を検出し、ステップＳ１２にて、位置検出部１０Ｂは、水平方向から撮影した画像を入力し、対象物体の位置座標を検出する。ステップＳ１１，Ｓ１２は、並列に行ってもよいし、順番に行ってもよい。 In step S11, the position detection unit 10A receives an image captured from the vertical direction and detects the position coordinates of the target object. In step S12, the position detection unit 10B receives an image captured from the horizontal direction and detects the position coordinates of the target object. Steps S11 and S12 may be performed in parallel or sequentially.

ステップＳ１３にて、マッチング部２０は、垂直方向の画像から検出した位置座標と水平方向の画像から検出した位置座標を統合し、同一物体の位置座標を判定する。 In step S13, the matching unit 20 integrates the position coordinates detected from the vertical image and the position coordinates detected from the horizontal image, and determines the position coordinates of the same object.
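One simple way to decide which detections belong to the same object is sketched below. It is an assumption for illustration and is not one of the matching methods of the matching unit 20: it pairs points greedily by distance and accepts a pair only within a hypothetical threshold `max_dist`.

```python
import math

def match_points(vertical, horizontal, max_dist):
    """Greedy nearest-pair matching. ASSUMPTION: two detections are the
    same object if they lie within max_dist of each other; closest pairs
    are accepted first and each point is used at most once."""
    candidates = sorted(
        (math.dist(v, h), vi, hi)
        for vi, v in enumerate(vertical)
        for hi, h in enumerate(horizontal)
        if math.dist(v, h) <= max_dist
    )
    used_v, used_h, pairs = set(), set(), []
    for _, vi, hi in candidates:
        if vi not in used_v and hi not in used_h:
            pairs.append((vi, hi))
            used_v.add(vi)
            used_h.add(hi)
    vertical_only = [i for i in range(len(vertical)) if i not in used_v]
    horizontal_only = [i for i in range(len(horizontal)) if i not in used_h]
    return pairs, vertical_only, horizontal_only

vertical = [(0.0, 0.0), (10.0, 0.0)]
horizontal = [(0.5, 0.0), (20.0, 0.0)]
pairs, vertical_only, horizontal_only = match_points(vertical, horizontal, 2.0)
```

The indices returned in `vertical_only` and `horizontal_only` identify objects detected by just one camera, which are the ones that raise the overall detection rate once the groups are integrated.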

ステップＳ１４にて、位置補正部３０は、同一物体の位置座標を１つにまとめるとともに、位置座標を補正する。 In step S14, the position correction unit 30 merges the position coordinates of the same object into one and corrects the position coordinates.

以上の処理により、物体検出装置は、対象物体の位置座標群を出力する。例えば、物体検出装置は、地図上に、対象物体を検出した位置座標を示して表示する。 Through the above processing, the object detection device outputs the position coordinate group of the target objects. For example, the object detection device displays a map on which the position coordinates of the detected target objects are marked.
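The flow of steps S11 to S14 can be sketched as a small orchestration function. This is a hedged outline only: `run_pipeline` and the stub functions are hypothetical placeholders, the detectors, matcher, and corrector are passed in as callables because their internals are not part of this sketch, and a thread pool is just one way to run S11 and S12 in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def run_pipeline(image_vertical, image_horizontal,
                 detect_vertical, detect_horizontal, match, correct):
    """Run S11 and S12 in parallel, then S13 (matching) and S14 (correction)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_v = pool.submit(detect_vertical, image_vertical)      # S11
        future_h = pool.submit(detect_horizontal, image_horizontal)  # S12
        coords_v, coords_h = future_v.result(), future_h.result()
    matches = match(coords_v, coords_h)            # S13
    return correct(coords_v, coords_h, matches)    # S14

# Stub components, used only to show the data flow (all hypothetical).
def stub_detect_vertical(_image):
    return [(0.0, 0.0)]

def stub_detect_horizontal(_image):
    return [(1.0, 0.0), (5.0, 5.0)]

def stub_match(coords_v, coords_h):
    return [(0, 0)]  # vertical point 0 and horizontal point 0 are the same object

def stub_correct(coords_v, coords_h, matches):
    # Keep the vertical coordinate for matched objects and append
    # objects that only the horizontal view detected.
    matched_h = {hi for _, hi in matches}
    return list(coords_v) + [p for i, p in enumerate(coords_h) if i not in matched_h]

result = run_pipeline(None, None, stub_detect_vertical, stub_detect_horizontal,
                      stub_match, stub_correct)
```

Running the two detection steps sequentially instead only requires calling the detectors directly; the later steps are unchanged.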

以上説明したように、本実施形態の物体検出装置は、同一地域を仰俯角の異なる複数のカメラで撮影した画像のそれぞれから対象物体を検出し、複数のカメラごとの複数の位置座標群を求める検出部１２Ａ，１２Ｂと、複数の位置座標群を統合し、同一物体の位置座標を判定するマッチング部２０を備える。これにより、対象物体の検出率を向上できる。 As described above, the object detection device of the present embodiment includes the detection units 12A and 12B, which detect the target object from each of the images of the same area captured by a plurality of cameras with different elevation/depression angles and obtain a plurality of position coordinate groups, one for each camera, and the matching unit 20, which integrates the plurality of position coordinate groups and determines the position coordinates of the same object. As a result, the detection rate of the target object can be improved.

本実施形態の検出部１２Ａ，１２Ｂは、対象物体に関連する物体を含めて対象物体の存在する領域を検出したうえで、対象物体の存在する領域内から対象物体とその影を含む部分を検出する。また、検出部１２Ａ，１２Ｂは、画像から検出した対象物体の影の位置に基づいて対象物体の位置座標を求める。これにより、対象物体の位置座標をより正確に求めることができる。 The detection units 12A and 12B of the present embodiment detect a region in which the target object exists, including objects related to the target object, and then detect, from within that region, a portion including the target object and its shadow. The detection units 12A and 12B also obtain the position coordinates of the target object based on the position of the shadow of the target object detected from the image. As a result, the position coordinates of the target object can be obtained more accurately.

本実施形態の物体検出装置は、対象物体を検出する地域の幾何情報や属性情報に基づいて対象物体を検出する領域を限定することにより、対象物体の検出精度の向上および処理の高速化を実現できる。 The object detection device of the present embodiment limits the region in which the target object is detected based on geometric information or attribute information of the area in which the target object is to be detected, thereby improving the detection accuracy of the target object and speeding up processing.

１０Ａ，１０Ｂ…位置検出部 10A, 10B ... Position detection unit
１１Ａ，１１Ｂ…入力部 11A, 11B ... Input unit
１２Ａ，１２Ｂ…検出部 12A, 12B ... Detection unit
２０…マッチング部 20 ... Matching unit
３０…位置補正部 30 ... Position correction unit

Claims

An object detection device comprising:
a detection unit that detects a target object from each of images of the same area captured by a plurality of cameras with different elevation/depression angles, and obtains a plurality of position coordinate groups, one for each of the plurality of cameras; and
a matching unit that integrates the plurality of position coordinate groups and determines the position coordinates of the same object,
wherein the detection unit detects a region in which the target object exists, including an object related to the target object, and then detects, from within the region in which the target object exists, a portion including the target object and a shadow of the target object.

The object detection device according to claim 1, wherein the detection unit is trained so that the target object can be classified according to the direction of its shadow, and, when detecting the target object from an image, receives as input information that identifies the direction of shadows in the image and detects the target object.

The object detection device according to claim 1 or 2, wherein the detection unit obtains the position coordinates of the target object based on the position of the shadow of the target object detected from the image.

The object detection device according to any one of claims 1 to 3, wherein the area for detecting the target object is limited based on the geometric information and the attribute information of the area for detecting the target object.

A computer-executed object detection method comprising:
a step of detecting a target object from each of images of the same area captured by a plurality of cameras with different elevation/depression angles, and obtaining a plurality of position coordinate groups, one for each of the plurality of cameras; and
a step of integrating the plurality of position coordinate groups and determining the position coordinates of the same object,
wherein, in the step of detecting the target object, a region in which the target object exists, including the target object and its accessories, is detected, and then a portion including the target object and a shadow of the target object is detected from within the region in which the target object exists.