JP2021128759A

JP2021128759A - Method and device for detecting objects

Info

Publication number: JP2021128759A
Application number: JP2021000627A
Authority: JP
Inventors: ワン・レェフェイ; Lefei Wang; 欣底; Xin Di; ジャン・ジャオユィ; Zhaoyu Zhang; ティアン・ジュン; Jun Tian
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2020-02-10
Filing date: 2021-01-06
Publication date: 2021-09-02
Also published as: CN113255397A

Abstract

To provide a method and device for detecting objects. SOLUTION: An object detection method comprises: matching a radar detection result with a video detection result; generating grid cells in a radar coordinate system based on the radar detection result; generating a bounding box in an image coordinate system based on the grid cells; and performing object detection based on the bounding box. SELECTED DRAWING: Figure 2

Description

The present invention relates to the technical field of object detection, and more particularly to an object detection method and device.

The purpose of object detection is to determine what objects are present in an image (or in a frame of a video) and where they are. A fast and accurate object detection algorithm allows a computer to drive a car, and allows assistive devices to convey real-time scene information to users and to various other applications.

In recent years, a large body of related work has produced object detectors with good performance. However, object detection still faces challenges in two respects: long processing time, and the accuracy of the object's position in the image. Systems such as DPM (Deformable Part Models) use a sliding-window approach, in which a classifier is run at evenly spaced positions over the entire image. In terms of performance, however, the design of the sliding step lowers both the efficiency and the accuracy of such methods. More recent methods such as R-CNN (Region-Convolutional Neural Networks) use a region proposal approach: they first generate potential bounding boxes in the image and then run a classifier on these proposed boxes. After classification, post-processing is used to refine the bounding boxes, eliminate duplicate detections, and rescore the boxes based on other objects in the scene. Such complex pipelines are slow to run and difficult to optimize.

The inventors found the following. More recent methods such as YOLO (You Only Look Once) can fully resolve the problem of slow processing by reframing object detection as a single regression problem. YOLO uses a single convolutional network to simultaneously predict multiple bounding boxes and the class probabilities of those boxes.

As shown in FIG. 1, YOLO first divides the input image into an S × S grid. For each grid cell, it predicts B bounding boxes and confidence scores for those boxes. Each bounding box consists of five predictions: x, y, w, h and a confidence. The confidence score includes the confidence of the C class probabilities of the B bounding boxes. In total, therefore, these predictions are encoded as an S × S × (B × 5 + C) tensor. The network complexity is determined by S^2. B is also limited in view of processing time, because the total number of predicted boxes is B multiplied by S^2. One of the problems of YOLO is therefore that the accuracy of its bounding boxes is relatively low. In practice, however, the vast majority of grid cells contain no content, and predicting on these empty grid cells wastes processing time.
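The size of the prediction tensor described above can be checked with a short calculation. The following is a minimal Python sketch; the function names and the sample values S = 7, B = 2, C = 20 are illustrative assumptions and do not come from this document.

```python
from typing import Tuple


def yolo_output_shape(s: int, b: int, c: int) -> Tuple[int, int, int]:
    """Shape of the S x S x (B * 5 + C) prediction tensor."""
    # Each of the B boxes contributes 5 values (x, y, w, h, confidence);
    # the C class probabilities are shared by the grid cell.
    return (s, s, b * 5 + c)


def yolo_box_count(s: int, b: int) -> int:
    """Total number of boxes predicted per image: B boxes for each of S^2 cells."""
    return s * s * b


# Illustrative values only (assumed, not taken from this document).
shape = yolo_output_shape(7, 2, 20)
boxes = yolo_box_count(7, 2)
print(shape, boxes)
```

Because the box count grows as B × S^2, increasing either the grid resolution or the number of boxes per cell raises the cost for every cell, including the empty ones.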

To solve at least one of the above problems, or other similar problems, embodiments of the present invention provide an object detection method and device.

According to a first aspect of the embodiments of the present invention, an object detection method is provided, the method including:
matching a radar detection result with a video detection result;
generating grid cells in a radar coordinate system based on the radar detection result;
generating bounding boxes in an image coordinate system based on the grid cells; and
performing object detection based on the bounding boxes.

According to a second aspect of the embodiments of the present invention, an object detection device is provided, the device including:
a matching unit that matches a radar detection result with a video detection result;
a first generation unit that generates grid cells in a radar coordinate system based on the radar detection result;
a second generation unit that generates bounding boxes in an image coordinate system based on the grid cells; and
a detection unit that performs object detection based on the bounding boxes.

According to a third aspect of the embodiments of the present invention, an image processing device is provided. The image processing device includes a processor and a memory; the memory stores a computer program, and the processor is configured to execute the computer program to implement the object detection method described above.

One of the beneficial effects of the embodiments of the present invention is as follows. The radar detection result is used to indicate the possible positions of bounding boxes on the video frame, which avoids blind search, subdivision and prediction over the complete video frame. This can significantly reduce the processing time of object detection methods (for example, DPM, R-CNN, etc.), or improve the accuracy of the size and position of the bounding boxes for the same processing time.

FIG. 1 is a diagram showing the process of performing object detection with the YOLO method.
FIG. 2 is a diagram showing an object detection method according to an embodiment of the present invention.
FIG. 3 is another diagram showing the object detection method according to an embodiment of the present invention.
FIG. 4 is a diagram showing an object detection device according to an embodiment of the present invention.
FIG. 5 is a diagram showing an image processing device according to an embodiment of the present invention.

Preferred embodiments for carrying out the present invention are described in detail below with reference to the accompanying drawings.

<Embodiment of the first aspect>
An embodiment of the present invention provides an object detection method. FIG. 2 is a diagram showing the object detection method according to the embodiment of the present invention. As shown in FIG. 2, the method includes the following operations (steps).

Operation 201: match a radar detection result with a video detection result;
Operation 202: generate grid cells in a radar coordinate system based on the radar detection result;
Operation 203: generate bounding boxes in an image coordinate system based on the grid cells;
Operation 204: perform object detection based on the bounding boxes.

By using the radar detection result to indicate the possible positions of bounding boxes on the video frame, the present invention avoids blind search, subdivision and prediction over the complete video frame. This can significantly reduce the processing time of object detection methods (for example, DPM, R-CNN, etc.), or improve the accuracy of the size and position of the bounding boxes for the same processing time.

In the embodiments of the present invention, the radar and the camera acquire sensing data concurrently and asynchronously. The radar captures raw data and outputs a detection result (referred to as the radar detection result), and the camera captures video frames and outputs a detection result (referred to as the video detection result).

In the embodiments of the present invention, the radar detection result includes a point cloud, the velocity of each point in the point cloud, and other information. From the points in the point cloud, a clustering or tracking result can be obtained using an algorithm such as DBSCAN (Density-Based Spatial Clustering of Applications with Noise) or an EKF (Extended Kalman Filter). The clustering or tracking result represents the location of an object, which is the coordinate of the object relative to the radar in a (planar) Cartesian coordinate system, written as (x, y).

In operation 201, the radar detection result is matched with the video detection result. That is, radar frames are matched with video frames to find a radar frame and a video frame captured at the same time, or a radar frame and a video frame whose timestamps differ by less than a certain threshold. The coordinates (x, y) in the radar coordinate system are then calibrated to a position in the video frame, written as (u, v). By matching the radar frame with the video frame, the point cloud in the radar frame can be mapped onto the video frame; that is, the position on the video frame of each point in the radar detection result can be determined. The present invention does not limit the specific matching method, and related art may be referred to.
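The timestamp part of this matching can be sketched as follows. This is a minimal Python illustration under stated assumptions: timestamps are plain numbers (for example milliseconds), the video timestamps are sorted in ascending order, and the function name `match_frames` and the parameter `max_diff` are hypothetical. The calibration from (x, y) to (u, v) is not shown, since the document leaves the method open.

```python
from bisect import bisect_left


def match_frames(radar_ts, video_ts, max_diff):
    """Pair each radar frame with the nearest video frame in time.

    radar_ts: radar frame timestamps.
    video_ts: video frame timestamps, sorted ascending (assumption).
    max_diff: threshold; a pair is kept only if the timestamps differ by less.
    Returns a list of (radar_index, video_index) pairs.
    """
    pairs = []
    for i, t in enumerate(radar_ts):
        j = bisect_left(video_ts, t)
        # The nearest video frame is either just before or at/after position j.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(video_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(video_ts[k] - t))
        if abs(video_ts[best] - t) < max_diff:
            pairs.append((i, best))
    return pairs


# Timestamps in milliseconds (illustrative values).
print(match_frames([0, 100, 200], [10, 120, 500], max_diff=30))
```

In this example the third radar frame is dropped because no video frame lies within the threshold.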

In operation 202, when the point cloud can be clustered, that is, when some of the points of the point cloud in the radar frame can be grouped into a cluster, the cluster can be described by a width (W_c^i) and a height (H_c^i), where "i" denotes the i-th cluster. In this case, the points in the cluster lie within a rectangular region, and this rectangular region can be regarded as a grid cell.

For example, in operation 202, in some embodiments, generating grid cells based on the radar detection result means first clustering the radar detection result to obtain at least one cluster, and then generating grid cells based on the coordinates or signal strengths of the points in each cluster. Examples are given below.

In some embodiments, for each cluster, the coordinate of the center point of the cluster is taken as the center of the grid cell. The difference between the abscissa of the rightmost point in the cluster (that is, the maximum x coordinate in the cluster) and the abscissa of the leftmost point (that is, the minimum x coordinate in the cluster) is taken as the width W_c^i of the grid cell. The difference between the ordinate of the topmost point in the cluster (that is, the maximum y coordinate in the cluster) and the ordinate of the bottommost point (that is, the minimum y coordinate in the cluster) is taken as the height H_c^i of the grid cell. A grid cell is thereby obtained for each cluster.
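The extent-based construction above can be sketched in a few lines of Python. The document does not say how the "center point" of a cluster is computed; this sketch assumes the midpoint of the cluster's extent, which guarantees that every point of the cluster lies inside the cell. The function name and the tuple layout are hypothetical.

```python
def grid_cell_from_cluster(points):
    """Build a grid cell (cx, cy, width, height) from one cluster.

    points: non-empty list of (x, y) radar coordinates in the cluster.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)    # rightmost x minus leftmost x
    height = max(ys) - min(ys)   # topmost y minus bottommost y
    # Assumption: the cluster center is the midpoint of its extent.
    cx = (max(xs) + min(xs)) / 2
    cy = (max(ys) + min(ys)) / 2
    return (cx, cy, width, height)


print(grid_cell_from_cluster([(1, 2), (3, 6), (2, 4)]))
```

If the centroid of the points were used as the center instead, the cell could fail to enclose outlying points of the cluster, which is why the midpoint is chosen here.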

In some embodiments, for each cluster, the coordinate of the point with the highest signal strength in the cluster is taken as the center of the grid cell, and a preset width and height are taken as the width W_c^i and height H_c^i of the grid cell; alternatively, the products of the preset width and height with a weight are taken as the width W_c^i and height H_c^i of the grid cell. A grid cell corresponding to each cluster is thereby obtained. The preset width and height may be set empirically, or may be determined based on the distance between the point with the highest signal strength and the radar, but the present invention is not limited to this. It is also possible to preset a fixed width and height, determine a weight for the grid cell based on another policy or factor (for example, the distance between the point with the highest signal strength and the radar), and take the products of this weight with the preset fixed width and height as the width W_c^i and height H_c^i of the grid cell.
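A minimal Python sketch of this variant follows. The point layout (x, y, strength), the function name, and the way the weight is supplied are assumptions for illustration; the document leaves the choice of preset size and weight open.

```python
def grid_cell_from_strongest_point(points, preset_w, preset_h, weight=1.0):
    """Center a grid cell on the strongest point of a cluster.

    points: non-empty list of (x, y, strength) tuples.
    preset_w, preset_h: preset (fixed) width and height.
    weight: optional scale factor, e.g. derived from distance to the radar.
    Returns (cx, cy, width, height).
    """
    x, y, _ = max(points, key=lambda p: p[2])
    return (x, y, preset_w * weight, preset_h * weight)


cluster = [(0, 0, 1.0), (5, 7, 9.0), (2, 2, 3.0)]
print(grid_cell_from_strongest_point(cluster, 2.0, 4.0, weight=1.5))
```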

In the above embodiments, different clusters have different grid sizes. The present invention is not limited to this; to make the video frame easier to process, a single unified grid size may be determined in some embodiments. The unified grid size is determined by factors such as processing efficiency and detection accuracy, and the way it is calculated differs depending on the objective.

For example, to improve object detection efficiency, the width of the unified grid size (W_g) may be set equal to max(W_c^i), and the height of the unified grid size (H_g) may be set equal to max(H_c^i).

As another example, to obtain more accurate detection results, the width of the unified grid size (W_g) may be set equal to min(W_c^i), and the height of the unified grid size (H_g) may be set equal to min(H_c^i).
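The two choices of unified grid size can be expressed compactly. The following Python sketch assumes that each per-cluster cell is given as a (width, height) pair; the function name and the `mode` parameter are hypothetical.

```python
def unified_grid_size(cell_sizes, mode="max"):
    """Return the unified grid size (W_g, H_g).

    cell_sizes: non-empty list of (W_c^i, H_c^i) pairs, one per cluster.
    mode: "max" favours efficiency (fewer, larger cells);
          "min" favours accuracy (smaller cells).
    """
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    pick = max if mode == "max" else min
    w_g = pick(w for w, _ in cell_sizes)
    h_g = pick(h for _, h in cell_sizes)
    return (w_g, h_g)


sizes = [(2, 4), (3, 1), (1, 5)]
print(unified_grid_size(sizes, "max"), unified_grid_size(sizes, "min"))
```

Note that width and height are selected independently, so the unified size need not match any single cluster's cell.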

The present invention does not limit the clustering method; it may be the DBSCAN algorithm mentioned above or another clustering algorithm, and a detailed description is omitted here.

In operation 202, when it is difficult to cluster the point cloud, that is, when no cluster can be formed from the points of the point cloud in the radar frame, the coordinate of each point in the radar detection result (that is, each point of the point cloud in the radar frame) may be taken as the center of a grid cell, and the size of the grid cell (that is, its width W_c^i and height H_c^i) may then be determined by any method.

For example, the size of the grid cell centered on a point may be determined based on the distance between that point and the radar. In some embodiments, the grid cell corresponding to a point close to the radar may be relatively large, and the grid cell corresponding to a point far from the radar may be relatively small. The specific size may be determined empirically, and the present invention is not limited in this respect.
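One way to realize "nearer points get larger cells" is an inverse-distance scale with clamping. The document only states the monotonic relationship and leaves the sizes to experience, so the formula, the reference distance, and the clamp limits in this Python sketch are all assumptions.

```python
import math


def cell_size_by_distance(x, y, base_w, base_h, ref_dist,
                          min_scale=0.25, max_scale=2.0):
    """Scale a base cell size inversely with the point's distance to the radar.

    (x, y): point coordinates relative to the radar (radar at the origin).
    base_w, base_h: cell size used at the reference distance ref_dist.
    The scale ref_dist / distance is clamped to [min_scale, max_scale].
    """
    dist = math.hypot(x, y)
    if dist == 0:
        scale = max_scale
    else:
        scale = min(max_scale, max(min_scale, ref_dist / dist))
    return (base_w * scale, base_h * scale)


near = cell_size_by_distance(3, 4, 1.0, 2.0, ref_dist=10)    # distance 5
far = cell_size_by_distance(30, 40, 1.0, 2.0, ref_dist=10)   # distance 50
print(near, far)
```

The clamp keeps very close points from producing cells larger than the frame and very distant points from producing degenerate cells.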

As another example, the size of the grid cell centered on a point may be determined based on the signal strength of that point. Signal strength is affected by several factors; generally speaking, a strong signal means that the measured medium has a high dielectric constant and a large reflecting area. In some embodiments, therefore, the grid cell corresponding to a point with high signal strength may be made relatively large, and conversely the grid cell corresponding to a point with low signal strength may be made relatively small. The specific size may be determined empirically, and the present invention is not limited in this respect.

In the embodiments of the present invention, the distance between each point of the point cloud in the radar frame and the radar may be calculated from the radar detection result; a detailed description is omitted here. The embodiments of the present invention also do not limit how the signal strength of each point is calculated, detected or determined; it may be obtained in any conventional way, and a detailed description is omitted here.

As in the case where the point cloud can be clustered, a single unified grid size may also be determined when no cluster can be formed from the point cloud. The method is similar to the one described above for the case where the point cloud can be clustered, so a detailed description is omitted here.

In the embodiments of the present invention, grid cells are obtained based on the radar detection result (the points in the point cloud), and regions that contain no points are ignored. The calibrated positions from the radar frame (the grid cells) thus indicate the possible positions of bounding boxes on the video frame, and the bounding boxes can be predicted based on these grid cells.

In operation 203, in some embodiments, for each grid cell, at least one box covering the grid cell can be generated on the video frame as the bounding box corresponding to that grid cell, by sliding left and right or up and down along the u axis and the v axis through the center point of the grid cell.

The present invention does not limit the sliding unit. For example, the size (width and height) of the grid cell may be used as the unit when sliding left and right or up and down along the u axis and the v axis through the center point of the grid cell to obtain a bounding box covering the grid cell. Alternatively, a preset unit size may be used as the unit when sliding in the same way to obtain a bounding box covering the grid cell.
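The document does not fix how the sliding produces candidate boxes, so the following Python sketch shows only one possible reading: the box edges are slid outward from the grid cell in whole steps along the u and v axes, giving a small family of boxes that all cover the cell. The function name, the `n_steps` parameter, and the (left, top, width, height) layout with v increasing downward are assumptions.

```python
def boxes_covering_cell(u, v, cell_w, cell_h, step_u, step_v, n_steps=1):
    """Generate candidate bounding boxes that cover one grid cell.

    (u, v): center of the grid cell on the video frame.
    cell_w, cell_h: size of the grid cell on the video frame.
    step_u, step_v: sliding unit along each axis (for example the cell size
                    itself, or a preset unit size).
    Returns a list of (left, top, width, height); box k extends the cell by
    k steps on every side, so each box contains the cell.
    """
    boxes = []
    for k in range(n_steps + 1):
        w = cell_w + 2 * k * step_u
        h = cell_h + 2 * k * step_v
        boxes.append((u - w / 2, v - h / 2, w, h))
    return boxes


# Use the grid cell size as the sliding unit.
for box in boxes_covering_cell(100, 50, 20, 10, step_u=20, step_v=10):
    print(box)
```

Each box is reported by its upper-left corner plus width and height, which is one of the representations the description allows.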

In some embodiments, the position of the bounding box may be represented by the coordinates of its upper-left corner, and the size of the bounding box may be represented by its width and height. The present invention is not limited to this; the position and size of the bounding box may be expressed by other quantities, depending on how the bounding box is obtained or on other factors.

In some embodiments, a confidence may be determined for the bounding box. For example, the confidence of the bounding box may be determined based on the number of points in the grid cell: in some embodiments, the more points there are in the grid cell from which the bounding box is generated, the more likely it is that an object is present, and the higher the confidence of the corresponding bounding box. As another example, the confidence of the bounding box may be determined based on the signal strength of the points in the grid cell: in some embodiments, the stronger the signal of the strongest point in the grid cell from which the bounding box is generated, the more likely it is that an object is present, and the higher the confidence of the corresponding bounding box.
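Both confidence rules only require that confidence increase with the evidence in the cell. A minimal Python sketch follows, assuming a linear ramp that saturates at 1.0; the saturation count and the reference strength are hypothetical tuning parameters that the document does not specify.

```python
def confidence_from_point_count(n_points, saturation=10):
    """Confidence grows with the number of radar points in the grid cell.

    saturation: point count at which confidence reaches 1.0 (assumed).
    """
    return min(1.0, max(0.0, n_points / saturation))


def confidence_from_strength(max_strength, ref_strength):
    """Confidence grows with the strongest signal in the grid cell.

    ref_strength: signal strength at which confidence reaches 1.0 (assumed).
    """
    return min(1.0, max(0.0, max_strength / ref_strength))


print(confidence_from_point_count(5), confidence_from_strength(30.0, 40.0))
```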

In the embodiments of the present invention, the velocities of the points of the point cloud in the radar frame may further provide prior information for classifying objects in the video frame. That is, in the object detection method of the present invention, the possible classes of an object (referred to as object types) may additionally be determined based on the velocities of the points in the radar detection result, so that object detection can be performed based on the bounding boxes obtained in operation 203 and the possible classes corresponding to each point in the radar detection result. For example, if the velocity of a point is close to zero, the point represents a static object in an intelligent road intersection scene, such as a fence, a fire hydrant or a road sign; otherwise, the point represents a moving object, such as a car, a motorcycle, a bicycle or a pedestrian. This prior information can improve object detection efficiency.
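The velocity prior can be sketched as a simple filter on the candidate classes. The class lists below come from the examples in the description; the threshold `eps` that decides what counts as "close to zero" and the function name are assumptions.

```python
STATIC_CLASSES = ("fence", "fire hydrant", "road sign")
MOVING_CLASSES = ("car", "motorcycle", "bicycle", "pedestrian")


def candidate_classes(speed, eps=0.1):
    """Restrict the possible object classes using the radar point speed.

    speed: speed of the radar point (same unit as eps, e.g. m/s).
    eps: threshold below which the point is treated as static (assumed).
    """
    return STATIC_CLASSES if abs(speed) < eps else MOVING_CLASSES


print(candidate_classes(0.02))
print(candidate_classes(-4.5))
```

A downstream classifier then only needs to score the returned subset for the corresponding bounding box, which is where the efficiency gain would come from.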

In operation 204, an object detection result can be obtained based on the bounding boxes, or based on the bounding boxes and the prior information described above, using an algorithm known to those skilled in the art. The present invention does not limit the algorithm used here; it may be any artificial intelligence (AI) processing method, for example the YOLO method.

According to the method of the embodiments of the present invention, the radar detection result is used to indicate the possible positions of bounding boxes on the video frame, which avoids blind search, subdivision and prediction over the complete video frame. This can significantly reduce the processing time of object detection methods (for example, DPM, R-CNN, etc.), or improve the accuracy of the size and position of the bounding boxes for the same processing time.

The method of the embodiments of the present invention has been described above with reference to FIG. 2, but the present invention is not limited to this. In a specific implementation, the execution order of the operations may be adjusted as appropriate; for example, the order of operation 201 and operation 202 may be swapped, and operations may be added or removed. That is, those skilled in the art may modify the above as appropriate based on the description of FIG. 2.

FIG. 3 is another diagram showing the object detection method according to the embodiments of the present invention. It shows the entire process from the acquisition of raw radar data and camera video frames to the output of the object detection result. As shown in FIG. 3, the method includes the following operations (steps).

Operation 301: acquire raw radar data with the radar;
Operation 302: perform signal processing on the raw radar data to obtain the position of each object relative to the radar and the velocity of the object;
Operation 303: generate grid cells based on the point cloud in the radar frame;
Operation 304: acquire video frames with the camera;
Operation 305: calibrate the radar frame and the video frame, mapping the grid cells onto the video frame;
Operation 306: synchronize the radar frame and the video frame in time;
Operation 307: indicate the possible positions of bounding boxes based on the grid cells;
Operation 308: restrict the candidate classes using the velocities of the points of the point cloud in the radar frame;
Operation 309: perform AI processing to obtain the object detection result.

The method of the embodiments of the present invention has been described above with reference to FIG. 3, but the present invention is not limited to this. In a specific implementation, the execution order of the operations may be adjusted as appropriate. For example, operation 301 and operation 304 may be performed concurrently and asynchronously, and the order of operation 303 and operations 305-306 may be swapped, for example by performing operations 305 and 306 first and then operation 303. Operations may also be added or removed; for example, operation 308 may be omitted. That is, those skilled in the art may modify the above as appropriate based on the description of FIG. 3.

In the example of FIG. 3, operation 303 corresponds to operation 202 of FIG. 2, and operation 307 corresponds to operation 203 of FIG. 2. Since operations 202 and 203 were described in detail above, a detailed description is omitted here.

Only the steps or processes related to the present invention have been described above, but the present invention is not limited to them. The method may include further steps or processes; for their specific content, the prior art may be referred to.

According to the method of the embodiments of the present invention, as described above, the processing time of the object detection method can be significantly reduced, or the accuracy of the size and position of the bounding boxes can be improved for the same processing time.

<Embodiment of the second aspect>
An embodiment of the present invention provides an object detection device. Since the principle by which the device solves the problem is similar to that of the embodiment of the first aspect, the embodiment of the first aspect may be referred to for its specific implementation, and a detailed description is omitted here.

FIG. 4 is a diagram showing an object detection device according to an embodiment of the present invention. As shown in FIG. 4, the object detection device 400 includes a matching unit 401, a first generation unit 402, a second generation unit 403, and a detection unit 404. The matching unit 401 matches a radar detection result with a video detection result; the first generation unit 402 generates grid cells in the radar coordinate system based on the radar detection result; the second generation unit 403 generates a boundary frame in the image coordinate system based on the grid cells; and the detection unit 404 performs object detection based on the boundary frame.

In some embodiments, as shown in FIG. 4, the object detection device 400 further includes a determination unit 405, which determines an object type based on the velocity of points in the radar detection result. The detection unit 404 then performs object detection based on the boundary frame and the object type of each point in the radar detection result.
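As a minimal sketch of how a determination unit might map point velocity to an object type, the Python function below applies speed thresholds. The thresholds, the three type labels, and the name `classify_by_speed` are illustrative assumptions; the description here does not specify them.

```python
def classify_by_speed(speed_mps: float,
                      pedestrian_max: float = 3.0,
                      cyclist_max: float = 8.0) -> str:
    """Map a radar point's speed to a coarse object type.

    The thresholds (in m/s) and the labels are hypothetical placeholders,
    not values taken from the patent text.
    """
    speed = abs(speed_mps)  # radial velocity may be signed; use its magnitude
    if speed <= pedestrian_max:
        return "pedestrian"
    if speed <= cyclist_max:
        return "cyclist"
    return "vehicle"
```

In practice the thresholds would have to be tuned to the scene, and a stationary vehicle would be indistinguishable from a pedestrian under this rule alone.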

In some embodiments, the first generation unit 402 clusters the radar detection result to obtain at least one cluster. For each cluster, it takes the coordinates of the cluster's center point as the center of the grid cell, the difference between the abscissas of the rightmost and leftmost points in the cluster as the width of the grid cell, and the difference between the ordinates of the topmost and bottommost points in the cluster as the height of the grid cell, thereby obtaining a grid cell for each cluster.
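The extent-based construction can be sketched in Python as follows. The sketch assumes the cluster is already given as a list of (x, y) points in the radar coordinate system and takes the "center point" to be the centroid; both the `GridCell` type and the centroid choice are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GridCell:
    cx: float      # center abscissa
    cy: float      # center ordinate
    width: float
    height: float


def cell_from_cluster(points: List[Tuple[float, float]]) -> GridCell:
    """Build one grid cell from one cluster of radar points."""
    if not points:
        raise ValueError("cluster must contain at least one point")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Center: centroid of the cluster (assumed meaning of "center point").
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    # Width: rightmost minus leftmost abscissa; height: topmost minus bottommost ordinate.
    return GridCell(cx, cy, max(xs) - min(xs), max(ys) - min(ys))
```

Note that a single-point cluster yields a cell of zero width and height under this rule, which is one reason a preset size can be preferable.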

In some embodiments, the first generation unit 402 clusters the radar detection result to obtain at least one cluster. For each cluster, it takes the coordinates of the point with the highest signal strength in the cluster as the center of the grid cell, and takes either a preset width and height, or the product of the preset width and height and a weight, as the width and height of the grid cell, thereby obtaining a grid cell for each cluster.
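A minimal Python sketch of this variant is shown below. It assumes each radar point is a tuple (x, y, strength); the default preset size and the name `cell_from_strongest_point` are hypothetical. Setting `weight=1.0` gives the plain preset size, and any other value gives the weighted size.

```python
from typing import List, Tuple


def cell_from_strongest_point(points: List[Tuple[float, float, float]],
                              preset_w: float = 2.0,
                              preset_h: float = 2.0,
                              weight: float = 1.0) -> Tuple[float, float, float, float]:
    """Return (cx, cy, width, height) for one cluster.

    The cell is centered on the point with the highest signal strength,
    and its size is the preset size multiplied by `weight`.
    """
    if not points:
        raise ValueError("cluster must contain at least one point")
    x, y, _ = max(points, key=lambda p: p[2])  # strongest return in the cluster
    return (x, y, preset_w * weight, preset_h * weight)
```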

In some embodiments, the first generation unit 402 takes the coordinates of each point in the radar detection result as the center of a grid cell and determines the size of that grid cell based on the distance between the point and the radar.

In some embodiments, the first generation unit 402 takes the coordinates of each point in the radar detection result as the center of a grid cell and determines the size of that grid cell based on the signal strength of the point.
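The two per-point alternatives (size from distance, size from signal strength) can be sketched with a single clamped linear mapping. The text above does not state the form or even the direction of the mapping, so the linear rule, the value ranges, the square cell shape, and the assumption that the radar sits at the origin are all illustrative choices.

```python
import math
from typing import Tuple


def linear_size(value: float, v_min: float, v_max: float,
                size_at_min: float, size_at_max: float) -> float:
    """Interpolate a cell size linearly, clamping `value` to [v_min, v_max]."""
    if v_max == v_min:
        raise ValueError("v_min and v_max must differ")
    t = (value - v_min) / (v_max - v_min)
    t = min(1.0, max(0.0, t))
    return size_at_min + t * (size_at_max - size_at_min)


def cell_for_point(x: float, y: float, strength: float,
                   by: str = "distance") -> Tuple[float, float, float, float]:
    """Return (cx, cy, width, height) centered on one radar point."""
    if by == "distance":
        d = math.hypot(x, y)  # distance to a radar assumed to be at (0, 0)
        side = linear_size(d, 0.0, 100.0, 1.0, 4.0)          # hypothetical ranges
    elif by == "strength":
        side = linear_size(strength, 0.0, 50.0, 1.0, 4.0)    # hypothetical ranges
    else:
        raise ValueError("by must be 'distance' or 'strength'")
    return (x, y, side, side)
```

Swapping `size_at_min` and `size_at_max` reverses the direction of the mapping if, for example, nearer points should produce larger cells.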

In some embodiments, for each grid cell, the second generation unit 403 slides left and right or up and down along the u-axis and v-axis at the center point of the grid cell, thereby generating at least one frame that covers the grid cell as the boundary frame corresponding to that grid cell.

In some embodiments, the second generation unit 403 may further determine the confidence of the boundary frame based on the number of points in the grid cell, or based on the signal strength of the points in the grid cell.
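Both confidence rules can be sketched as simple saturating ratios in Python. The text only states which quantity the confidence depends on, so the normalization, the saturation constants, and the use of the mean strength are assumptions for illustration.

```python
from typing import Sequence


def confidence_from_count(num_points: int, saturation: int = 10) -> float:
    """Confidence grows with the number of radar points in the cell, capped at 1.0.

    `saturation` is a hypothetical point count at which confidence reaches 1.0.
    """
    return min(1.0, num_points / saturation)


def confidence_from_strength(strengths: Sequence[float], reference: float) -> float:
    """Confidence from the mean signal strength of the points in the cell, capped at 1.0.

    `reference` is a hypothetical strength that maps to full confidence.
    """
    if not strengths:
        return 0.0
    mean = sum(strengths) / len(strengths)
    return max(0.0, min(1.0, mean / reference))
```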

In some embodiments, the second generation unit 403 generates the boundary frame by sliding left and right or up and down along the u-axis and v-axis at the center point of the grid cell, using either the width and height of the grid cell as the step, or a preset unit size as the step.
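One possible reading of the sliding step, sketched in Python, is that each candidate frame extends the projected grid cell by a whole number of steps to the left, right, top, and bottom, so that every candidate still covers the cell. This interpretation, the `max_steps` limit, and the name `candidate_frames` are assumptions; the cell is assumed to be already projected into image coordinates (u to the right, v downward).

```python
from itertools import product
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max)


def candidate_frames(cu: float, cv: float, w: float, h: float,
                     step_u: Optional[float] = None,
                     step_v: Optional[float] = None,
                     max_steps: int = 1) -> List[Box]:
    """Enumerate frames that cover a grid cell centered at (cu, cv).

    With step_u/step_v left as None, the step is the cell's own width and
    height; otherwise the given preset unit size is used.
    """
    su = w if step_u is None else step_u
    sv = h if step_v is None else step_v
    u0, u1 = cu - w / 2, cu + w / 2
    v0, v1 = cv - h / 2, cv + h / 2
    frames = []
    # Slide each edge outward by 0..max_steps steps along the u and v axes.
    for left, right, up, down in product(range(max_steps + 1), repeat=4):
        frames.append((u0 - left * su, v0 - up * sv,
                       u1 + right * su, v1 + down * sv))
    return frames
```

With `max_steps=1` this yields 16 candidates per cell, the first of which is the cell itself; a detector would then score only these candidates rather than windows over the whole image.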

Note that only the components or modules related to the present invention have been described above, but the present invention is not limited to these. The object detection device 400 may further include other components or modules; for their specific content, the related art may be consulted.

As can be seen from the above embodiment, the object detection device of the embodiment of the present invention can significantly reduce the processing time of object detection, or, for the same processing time, improve the accuracy of the size and position of the boundary frame.

<Embodiment of the third aspect>
An embodiment of the present invention provides an image processing device. The image processing device may be, for example, a computer, a server, a workstation, a laptop computer, or a smartphone, but the embodiment of the present invention is not limited to these.

FIG. 5 is a diagram showing an image processing device according to an embodiment of the present invention. As shown in FIG. 5, the image processing device 500 includes at least one interface (not shown in FIG. 5), a processor (for example, a central processing unit (CPU)) 501, and a storage device 502 connected to the processor 501. The storage device 502 can store various data; it also stores a program 503 for object detection, which is executed under the control of the processor 501, and it can store various preset values, predetermined conditions, and the like.

In one embodiment, the functions of the object detection device 400 described in the embodiment of the second aspect may be integrated into the processor 501 to realize the object detection method described in the embodiment of the first aspect. For example, the processor 501 may be configured to:
match a radar detection result with a video detection result;
generate grid cells in the radar coordinate system based on the radar detection result;
generate a boundary frame in the image coordinate system based on the grid cells; and
perform object detection based on the boundary frame.

In another embodiment, the object detection device 400 described in the embodiment of the second aspect may be configured separately from the processor 501. For example, the object detection device 400 may be configured as a chip connected to the processor 501, and its functions may be realized under the control of the processor 501.

The image processing device 500 may further include a display 505 and an I/O device 504, and it need not include all of the components shown in FIG. 5. For example, it may further include a camera (not shown) for obtaining input image frames. The image processing device 500 may also include other components not shown in FIG. 5; for these, the related art may be consulted.

In the embodiment of the present invention, the processor 501 is sometimes referred to as a controller or an operation controller, and may include a microprocessor or other processing devices and/or logic devices. The processor 501 receives input and controls the operation of each component of the image processing device 500.

In the embodiment of the present invention, the storage device 502 may be, for example, one or more of a buffer, a flash memory, an HDD, a removable medium, a volatile memory, a non-volatile memory, or another suitable device, and can store various information and processing programs. The processor 501 may execute the program stored in the storage device 502 to store and process information. The functions of the other components are similar to conventional ones, so their detailed description is omitted here. Each component of the image processing device 500 may be realized by dedicated hardware, firmware, software, or a combination thereof, all of which fall within the scope of the present invention.

By performing object detection with the image processing device of the embodiment of the present invention, the processing time of object detection can be significantly reduced, or, for the same processing time, the accuracy of the size and position of the boundary frame can be improved.

An embodiment of the present invention further provides a computer-readable program which, when executed in an image processing device, causes the image processing device to execute the method described in the embodiment of the first aspect.

An embodiment of the present invention further provides a storage medium storing a computer-readable program, wherein the computer-readable program causes an image processing device to execute the method described in the embodiment of the first aspect.

The device or method described with reference to the embodiments of the present invention can be realized by hardware, by a software module executed by a processor, or by a combination of the two. For example, one or more of the functions in a functional block diagram, and/or one or more combinations of those functions, may correspond to software modules in a computer program or to hardware modules. The software modules can each correspond to a step shown in the figures illustrating the method. The hardware modules can be realized, for example, by implementing the software modules in an FPGA (field-programmable gate array).

The devices, methods, and the like according to the embodiments of the present invention may be realized by software, by hardware, or by a combination of hardware and software. The present invention also relates to a computer-readable program which, when executed by a logic component, causes the logic component to realize the above-described devices or constituent elements, or to carry out the above-described methods or their steps. The present invention further relates to a storage medium storing such a program, for example a hard disk, a magnetic disk, an optical disk, a DVD, or a flash memory.

The following appendices are further disclosed in relation to the above embodiments.

(Appendix 1)
An object detection method comprising:
matching a radar detection result with a video detection result;
generating grid cells in a radar coordinate system based on the radar detection result;
generating a boundary frame in an image coordinate system based on the grid cells; and
performing object detection based on the boundary frame.

(Appendix 2)
The method according to Appendix 1, wherein
the method further comprises determining an object type based on the velocity of points in the radar detection result, and
performing object detection based on the boundary frame comprises performing object detection based on the boundary frame and the object type of each point in the radar detection result.

(Appendix 3)
The method according to Appendix 1, wherein
generating grid cells in the radar coordinate system based on the radar detection result comprises:
clustering the radar detection result to obtain at least one cluster; and
for each cluster, taking the coordinates of the center point of the cluster as the center of the grid cell, the difference between the abscissas of the rightmost and leftmost points in the cluster as the width of the grid cell, and the difference between the ordinates of the topmost and bottommost points in the cluster as the height of the grid cell, thereby determining the grid cell corresponding to each cluster.

(Appendix 4)
The method according to Appendix 1, wherein
generating grid cells in the radar coordinate system based on the radar detection result comprises:
clustering the radar detection result to obtain at least one cluster; and
for each cluster, taking the coordinates of the point with the highest signal strength in the cluster as the center of the grid cell, and taking either a preset width and height, or the product of the preset width and height and a weight, as the width and height of the grid cell, thereby obtaining the grid cell corresponding to each cluster.

(Appendix 5)
The method according to Appendix 1, wherein
generating grid cells in the radar coordinate system based on the radar detection result comprises:
taking the coordinates of each point in the radar detection result as the center of a grid cell, and determining the size of that grid cell based on the distance between the point and the radar.

(Appendix 6)
The method according to Appendix 1, wherein
generating grid cells in the radar coordinate system based on the radar detection result comprises:
taking the coordinates of each point in the radar detection result as the center of a grid cell, and determining the size of that grid cell based on the signal strength of the point.

(Appendix 7)
The method according to Appendix 1, wherein
generating a boundary frame in the image coordinate system based on the grid cells comprises:
for each grid cell, sliding left and right or up and down along the u-axis and v-axis at the center point of the grid cell, thereby generating at least one frame that covers the grid cell as the boundary frame corresponding to that grid cell.

(Appendix 8)
The method according to Appendix 7, wherein
generating a boundary frame in the image coordinate system based on the grid cells further comprises:
determining the confidence of the boundary frame based on the number of points in the grid cell; or
determining the confidence of the boundary frame based on the signal strength of the points in the grid cell.

(Appendix 9)
The method according to Appendix 7, wherein
sliding left and right or up and down along the u-axis and v-axis at the center point of the grid cell comprises:
sliding left and right or up and down along the u-axis and v-axis at the center point of the grid cell, using the width and height of the grid cell as the step; or
sliding left and right or up and down along the u-axis and v-axis at the center point of the grid cell, using a preset unit size as the step.

A preferred embodiment of the present invention has been described above, but the present invention is not limited to this embodiment, and any modification that does not depart from the gist of the present invention falls within its technical scope.

Claims

An object detection device comprising:
a matching unit that matches a radar detection result with a video detection result;
a first generation unit that generates grid cells in a radar coordinate system based on the radar detection result;
a second generation unit that generates a boundary frame in an image coordinate system based on the grid cells; and
a detection unit that performs object detection based on the boundary frame.

The object detection device according to claim 1,
further comprising a determination unit that determines an object type based on the velocity of points in the radar detection result, wherein
the detection unit performs object detection based on the boundary frame and the object type of each point in the radar detection result.

The object detection device according to claim 1, wherein
the first generation unit clusters the radar detection result to obtain at least one cluster; and, for each cluster, takes the coordinates of the center point of the cluster as the center of the grid cell, the difference between the abscissas of the rightmost and leftmost points in the cluster as the width of the grid cell, and the difference between the ordinates of the topmost and bottommost points in the cluster as the height of the grid cell, thereby obtaining the grid cell corresponding to each cluster.

The object detection device according to claim 1, wherein
the first generation unit clusters the radar detection result to obtain at least one cluster; and, for each cluster, takes the coordinates of the point with the highest signal strength in the cluster as the center of the grid cell, and takes either a preset width and height, or the product of the preset width and height and a weight, as the width and height of the grid cell, thereby obtaining the grid cell corresponding to each cluster.

The object detection device according to claim 1, wherein
the first generation unit takes the coordinates of each point in the radar detection result as the center of a grid cell and determines the size of that grid cell based on the distance between the point and the radar.

The object detection device according to claim 1, wherein
the first generation unit takes the coordinates of each point in the radar detection result as the center of a grid cell and determines the size of that grid cell based on the signal strength of the point.

The object detection device according to claim 1, wherein,
for each grid cell, the second generation unit slides left and right or up and down along the u-axis and v-axis at the center point of the grid cell, thereby generating at least one frame that covers the grid cell as the boundary frame corresponding to that grid cell.

The object detection device according to claim 7, wherein
the second generation unit further determines the confidence of the boundary frame based on the number of points in the grid cell, or based on the signal strength of the points in the grid cell.

The object detection device according to claim 7, wherein
the second generation unit slides left and right or up and down along the u-axis and v-axis at the center point of the grid cell using the width and height of the grid cell as the step, or slides left and right or up and down along the u-axis and v-axis at the center point of the grid cell using a preset unit size as the step.

An image processing device comprising a processor and a storage device, wherein
a computer program is stored in the storage device, and
the processor, by executing the computer program, is configured to:
match a radar detection result with a video detection result;
generate grid cells in a radar coordinate system based on the radar detection result;
generate a boundary frame in an image coordinate system based on the grid cells; and
perform object detection based on the boundary frame.