JP7320660B1

JP7320660B1 - Omnidirectional image object detection device and omnidirectional image object detection method

Info

Publication number: JP7320660B1
Application number: JP2022192277A
Authority: JP
Inventors: 敦志菅
Original assignee: Hitachi Industry and Control Solutions Co Ltd
Current assignee: Hitachi Industry and Control Solutions Co Ltd
Priority date: 2022-11-30
Filing date: 2022-11-30
Publication date: 2023-08-03
Anticipated expiration: 2042-11-30

Abstract

[Problem] To enable appropriate object detection in a 360-degree omnidirectional image. [Solution] An omnidirectional image object detection device 1 includes: a cube map image conversion unit 21 that converts an equirectangular image captured by a 360-degree omnidirectional camera 11 into a cube map image; an object detection unit 23a that detects objects in a panoramic image of a plurality of faces constituting the cube map image; a detection frame determination unit 24a that determines, in the panoramic image, the coordinates of a rectangle enclosing each object detected by the object detection unit 23a; and a three-dimensional information synthesis unit 25 that displays the equirectangular image on a 360-degree display screen and composites the rectangle at the corresponding position on the 360-degree display screen. [Selected drawing] FIG. 1

Description

The present invention relates to an object detection device for omnidirectional images and an object detection method for omnidirectional images.

In recent years, devices and services that provide virtual spaces have become widespread. Such a virtual space can be provided by computer graphics, and also by equirectangular images captured with a 360-degree omnidirectional camera.

Also in recent years, object detection using AI (artificial intelligence) or similar techniques is frequently applied to two-dimensional video. However, an equirectangular image is distorted when viewed as a two-dimensional planar image, so objects could not be detected appropriately in it.

Patent Document 1 describes an omnidirectional video processing device having: means 41 for sequentially extracting omnidirectional frame images from an omnidirectional video obtained with an omnidirectional camera 20; means 42 for converting each extracted omnidirectional frame image into a cube map; means 43 for performing deep-learning object detection on the element image of each direction of the converted cube map; means 47 for storing each detected object in a record of storage means 51 in association with the direction of the element image, the position within the element image, and the omnidirectional frame image on which the cube map is based; and means 49 for restoring the cube map after object detection to an omnidirectional frame image and setting it at its original position in the omnidirectional video.

Patent Document 1: JP 2021-028813 A

According to the invention described in Patent Document 1, an object captured in an omnidirectional image can be detected, and the position at which the object was detected can be set in the omnidirectional frame image. However, when a 360-degree omnidirectional image is converted into a cube map image for object detection, objects located at the edges of the cube map faces could not be detected appropriately.

Accordingly, an object of the present invention is to enable appropriate object detection in a 360-degree omnidirectional image.

To solve the above problem, the object detection device for omnidirectional images according to the present invention comprises: an image conversion unit that converts an equirectangular image captured by an omnidirectional camera into a cube map image; a first object detection unit that detects objects in a panoramic image of a plurality of faces constituting the cube map image; a first detection frame determination unit that determines, in the panoramic image, the coordinates of a rectangle enclosing each object detected by the first object detection unit; a second object detection unit that detects objects in a wraparound image obtained by shifting the panoramic image of the cube map image so that it wraps around; a second detection frame determination unit that determines, on the plane of the wraparound image, the coordinates of a rectangle enclosing each object detected by the second object detection unit; and a synthesis unit that displays the equirectangular image on a 360-degree display screen and composites the rectangle determined by the first detection frame determination unit and the rectangle determined by the second detection frame determination unit at the corresponding positions on the 360-degree display screen.

The object detection method for omnidirectional images according to the present invention comprises the steps of: converting an equirectangular image captured by an omnidirectional camera into a cube map image; detecting objects in a panoramic image of a plurality of faces constituting the cube map image; determining the coordinates of a rectangle enclosing each object detected on the panoramic image; detecting objects in a wraparound image obtained by shifting the panoramic image of the cube map image so that it wraps around; determining the coordinates of a rectangle enclosing each object detected on the plane of the wraparound image; and displaying the equirectangular image on a 360-degree display screen and compositing the rectangles in which objects were detected in the panoramic image and the rectangles in which objects were detected in the wraparound image at the corresponding positions on the 360-degree display screen.

The object detection program according to the present invention causes a computer to execute the procedures of: converting an equirectangular image captured by an omnidirectional camera into a cube map image; detecting objects in a panoramic image of a plurality of faces constituting the cube map image; determining the coordinates of a rectangle enclosing each object detected on the panoramic image; detecting objects in a wraparound image obtained by shifting the panoramic image of the cube map image so that it wraps around; determining the coordinates of a rectangle enclosing each object detected on the plane of the wraparound image; and displaying the equirectangular image on a 360-degree display screen and compositing the rectangles in which objects were detected in the panoramic image and the rectangles in which objects were detected in the wraparound image at the corresponding positions on the 360-degree display screen.
Other means are described in the embodiments for carrying out the invention.

According to the present invention, objects can be detected appropriately in a 360-degree omnidirectional image.

FIG. 1 is a configuration diagram of the object detection device according to the present embodiment.
FIG. 2 is a diagram showing an example of an equirectangular image.
FIG. 3 is a diagram showing the coordinate system of an equirectangular image.
FIG. 4 is a diagram showing an example of a method of displaying equirectangular video.
FIG. 5 is a diagram showing an example of a cube map image.
FIG. 6 is a diagram showing the coordinate system of a cube map image.
FIG. 7 is a conceptual diagram showing a cube map image.
FIG. 8 is a conceptual diagram of a panoramic image of the horizontal periphery of a cube map image.
FIG. 9 is a conceptual diagram of a panoramic image in which the horizontal periphery of a cube map image is shifted horizontally.
FIG. 10 is an image illustrating the preparation for three-dimensional coordinate conversion of object detection frames.
FIG. 11 is an image illustrating three-dimensional coordinate conversion by placing object detection frames on a circle.
FIG. 12 is an image illustrating the conversion of the orientation of object detection frames.
FIG. 13 is a diagram illustrating panoramic images.
FIG. 14 is a diagram illustrating panoramic images.

Embodiments for carrying out the present invention are described below in detail with reference to the drawings.
The present invention is not limited to object detection in conventional two-dimensional planar images; it enables object detection in three-dimensional space. According to the present invention, the position (three-dimensional coordinates) and orientation (three-dimensional angle) of an object on the periphery of the 360-degree video can be obtained.

FIG. 1 is a configuration diagram of the object detection device 2 according to the present embodiment. FIG. 1 shows the logical configuration from capture by a 360-degree omnidirectional camera to display by an omnidirectional viewer.

The object detection device 2 includes a cube map image conversion unit 21, panorama acquisition units 22a and 22b, object detection units 23a and 23b, detection frame determination units 24a and 24b, a three-dimensional information synthesis unit 25, and a polygon generation unit 26. The object detection device 2 is, for example, a computer with a CPU (Central Processing Unit), and it implements each functional unit by executing an object detection program (not shown).

The object detection device 2 receives, as an omnidirectional image, an equirectangular image 31 captured by the 360-degree omnidirectional camera 11. The object detection device 2 detects objects captured in the equirectangular image 31, generates polygons for their detection frames, and composites them onto the 360-degree display screen 32. The viewpoint and other settings of the 360-degree display screen 32 are adjusted by an external controller, and the screen is displayed on the display unit 13.

The 360-degree omnidirectional camera 11 is a camera that can capture all 360 degrees at once. It can record, for example, 8K equirectangular video.

FIG. 2 shows the equirectangular image 31.
The equirectangular image 31 maps onto a rectangle what should properly be projected onto a sphere. The equirectangular format, also known as the equidistant cylindrical projection, is used to represent spherical panoramas in panoramic photography and is also used as a map projection. In the equirectangular image 31, lines of latitude (treating the sphere like the Earth) appear as horizontal straight lines at regular intervals. The relationship between a pixel position in the equirectangular image 31 and the corresponding position on the sphere is simple, which makes conversion to other projections easy, but distortion arises, especially at the poles.

The 360-degree omnidirectional camera 11 combines two fisheye lenses and produces circular captures called fisheye images. The camera converts the fisheye images into the equirectangular format, which is used for spatial rendering. This image format conversion is called stitching.

FIG. 3 is a diagram showing the coordinate system of the equirectangular image 31.
The equirectangular image 31 is properly mapped onto a sphere. A point P on the sphere may be expressed in XYZ coordinates or in polar coordinates.

Returning to FIG. 1, the description continues. The cube map image conversion unit 21 of the object detection device 2 functions as an image conversion unit that converts the equirectangular image 31 captured by the 360-degree omnidirectional camera 11 into a cube map image. The cube map image and its conversion are described later with reference to FIGS. 5 and 6.

The panorama acquisition units 22a and 22b then acquire a first panoramic image and a second panoramic image, respectively, each using four contiguous faces of the cube map image. The panorama acquisition unit 22a functions as a first panorama acquisition unit that acquires the first panoramic image from a plurality of faces constituting the cube map image. The first panoramic image is described later with reference to FIG. 8.
The panorama acquisition unit 22b functions as a second panorama acquisition unit that acquires a wraparound image obtained by shifting the panoramic image of the cube map image so that it wraps around. The second panoramic image is described later with reference to FIG. 9.

The object detection units 23a and 23b detect objects in the first panoramic image and the second panoramic image, respectively. The object detection unit 23a functions as a first object detection unit that detects objects in the first panoramic image. The object detection unit 23b functions as a second object detection unit that detects objects in the second panoramic image.

The detection frame determination unit 24a functions as a first detection frame determination unit that determines the coordinates of the detection frames of objects on the first panoramic image. The detection frame determination unit 24b functions as a second detection frame determination unit that determines the coordinates of the detection frames of objects on the second panoramic image. This makes it possible to determine, for example, that a person has been detected 145 degrees clockwise, toward the rear. The processing of the detection frame determination units 24a and 24b is described later with reference to FIG. 10.

The coordinates of an object's detection frame lie on the sphere, so the distance from the viewpoint cannot be detected directly. For a person, however, the distance from the viewpoint to the person is estimated from the size of the detected region and the average height of a person. The first embodiment does not consider object detection in the vertical direction; the second embodiment, described later, can detect objects in the vertical direction.
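The distance estimate mentioned above can be sketched with a simple pinhole model. This is only an illustration under stated assumptions: the detection frame lies on a cube face with a 90-degree field of view (so the focal length in pixels is half the face size), the person is upright and fully visible, and the average height of 1.7 m is a placeholder. The function name `estimate_distance_m` is hypothetical and does not appear in the source.

```python
def estimate_distance_m(box_height_px: float, face_size_px: int,
                        person_height_m: float = 1.7) -> float:
    """Rough viewpoint-to-person distance from the detection-frame height.

    Assumes a pinhole projection onto a cube face with a 90-degree field of view.
    """
    if box_height_px <= 0:
        raise ValueError("box_height_px must be positive")
    # A cube face spans 90 degrees: tan(45 deg) = (face_size / 2) / f, so f = face_size / 2.
    focal_px = face_size_px / 2.0
    # Similar triangles: person_height / distance = box_height / focal length.
    return person_height_m * focal_px / box_height_px
```

A frame that fills half of a 1024-pixel face (512 px tall) gives about 1.7 m under these assumptions; halving the frame height doubles the estimate.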

The three-dimensional information synthesis unit 25 is a synthesis unit that converts the coordinates of the object detection frames on the first panoramic image into three-dimensional information, converts the coordinates of the object detection frames on the second panoramic image into three-dimensional information, and then combines these coordinates. The processing of the three-dimensional information synthesis unit 25 is described later with reference to FIG. 11.

The polygon generation unit 26 generates polygons for the object detection frames at the coordinates combined by the three-dimensional information synthesis unit 25 and composites them onto the 360-degree display screen 32. The processing of the polygon generation unit 26 is described later with reference to FIG. 12.

FIG. 4 is a diagram showing an example of a method of displaying the equirectangular image 31.
The display unit 13 is, for example, a head-mounted display. As shown in FIG. 4, the display unit 13 pastes the equirectangular image 31 described above onto a full sphere and provides the user 5 with a 360-degree virtual reality (VR) space. The external controller 12 is, for example, the set of sensors provided in the head-mounted display. The image the user 5 sees in this case is not distorted.

FIG. 5 is a diagram showing an example of the cube map image 33.
The cube map image 33 is one image format for omnidirectional video. It reproduces a space by pasting the captures for the top, bottom, front, left, right, and back onto a cube. The cube map image 33 is obtained by converting the equirectangular image 31. The equirectangular image 31 can be projected onto the front, back, left, right, top, and bottom faces of a cube, and extracting the six faces onto which it is projected yields the cube map image. In the cube map image 33 the pixels of each face are not curved, which makes it well suited to object detection.
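As an illustration of the conversion just described, the following sketch samples one cube face (the front face) from an equirectangular image by inverse mapping: each face pixel is turned into a viewing direction, converted to spherical angles, and looked up in the equirectangular image with nearest-neighbor sampling. The axis conventions (front face on the plane x = 1, z up, image center at azimuth 0) and the function name `front_face_from_equirect` are assumptions made for this example, not details confirmed by the source.

```python
import numpy as np

def front_face_from_equirect(equi: np.ndarray, face_size: int) -> np.ndarray:
    """Sample the front (x = 1) cube face from an equirectangular image."""
    h, w = equi.shape[:2]
    # Face-plane coordinates in [-1, 1], sampled at pixel centers.
    t = (np.arange(face_size) + 0.5) / face_size * 2.0 - 1.0
    y, z = np.meshgrid(t, -t)            # y grows to the right, z grows upward
    x = np.ones_like(y)
    norm = np.sqrt(x * x + y * y + z * z)
    theta = np.arccos(z / norm)          # polar angle measured from the +z axis
    phi = np.arctan2(y, x)               # azimuth, 0 at the center of the face
    # Assumed equirectangular layout: azimuth -pi..pi across the width,
    # polar angle 0..pi down the height.
    u = ((phi + np.pi) / (2.0 * np.pi) * w).astype(int) % w
    v = np.clip((theta / np.pi * h).astype(int), 0, h - 1)
    return equi[v, u]
```

The other five faces follow the same pattern with the roles of x, y, and z permuted; a production implementation would typically use bilinear interpolation instead of nearest-neighbor lookup.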

Mathematically, let the radius of the sphere be r = 1 and let the polar coordinates θ and φ satisfy 0 < θ < π and -π/4 < φ < 7π/4. A point on the sphere then satisfies equation (1):

x = sin θ cos φ,  y = sin θ sin φ,  z = cos θ   (1)

Consider projecting these points centrally onto the cube.
First, the sphere is divided into four regions by the azimuth φ: -π/4 < φ < π/4, π/4 < φ < 3π/4, 3π/4 < φ < 5π/4, and 5π/4 < φ < 7π/4. Each point projects onto one of the four side faces, or onto the top or bottom face.
Consider the side face given by -π/4 < φ < π/4.
The central projection of (sin θ cos φ, sin θ sin φ, cos θ) is (a sin θ cos φ, a sin θ sin φ, a cos θ) for a scale factor a, and it lies on the plane x = 1 when equation (2) holds:

a sin θ cos φ = 1   (2)

This can be rearranged into equation (3):

a = 1 / (sin θ cos φ)   (3)

The projected point is therefore (1, tan φ, cot θ / cos φ).
If |cot θ / cos φ| < 1, the point lies on the front face. Otherwise it projects onto the top or bottom face, which requires a separate projection.
A quicker test for the top face uses the fact that the minimum value of cos φ in this region is cos(π/4) = 1/√2; the test is stated as cot θ / (1/√2) > 1, or tan θ < 1/√2, which corresponds to θ < 35° (0.615 radians).
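The projection onto the x = 1 plane and the front-face test can be written out directly. The sketch below is a minimal illustration for the side region -π/4 < φ < π/4 only; the function name `project_to_x1_plane` and the string labels are hypothetical, and the assignment of positive z to the top face assumes that θ is measured from the upward axis.

```python
import math

def project_to_x1_plane(theta: float, phi: float):
    """Centrally project the unit-sphere point (theta, phi) onto the plane x = 1.

    Valid for 0 < theta < pi and -pi/4 < phi < pi/4.
    Returns (y, z, face); y and z are coordinates on the plane x = 1, so they are
    positions on the front face only when face == "front".
    """
    y = math.tan(phi)                                          # tan(phi)
    z = math.cos(theta) / (math.sin(theta) * math.cos(phi))    # cot(theta) / cos(phi)
    if abs(z) < 1.0:
        face = "front"
    elif z > 0:
        face = "top"       # the ray leaves the cube through the top face
    else:
        face = "bottom"    # the ray leaves the cube through the bottom face
    return y, z, face
```

For example, a point on the horizon straight ahead (θ = π/2, φ = 0) lands at the center of the front face, while a point close to the pole (θ = 0.3) is classified as belonging to the top face.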

FIG. 6 is a diagram showing the coordinate system of the cube map image 33.
Here, the conversion from the equirectangular format to the unit sphere is expressed by equation (4).

In equation (4), x_s, y_s, and z_s denote the coordinates of point P.
The conversion from the unit sphere back to the equirectangular format is expressed by equation (5).
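For orientation, one commonly used pair of conversions between equirectangular pixel coordinates and the unit sphere is sketched below. It is an assumed convention chosen for illustration (azimuth from -π to π across the image width, polar angle from 0 to π down the image height) and is not claimed to match equations (4) and (5) term for term. The function names and the pixel variables `u` and `v` are hypothetical; `x_s`, `y_s`, `z_s` follow the naming used in the text.

```python
import math

def equirect_to_sphere(u: float, v: float, width: int, height: int):
    """Equirectangular pixel (u, v) -> point (x_s, y_s, z_s) on the unit sphere."""
    phi = (u / width) * 2.0 * math.pi - math.pi    # azimuth, 0 at the image center
    theta = (v / height) * math.pi                 # polar angle, 0 at the top row
    x_s = math.sin(theta) * math.cos(phi)
    y_s = math.sin(theta) * math.sin(phi)
    z_s = math.cos(theta)
    return x_s, y_s, z_s

def sphere_to_equirect(x_s: float, y_s: float, z_s: float, width: int, height: int):
    """Point on the unit sphere -> equirectangular pixel (u, v)."""
    theta = math.acos(max(-1.0, min(1.0, z_s)))    # clamp against rounding error
    phi = math.atan2(y_s, x_s)
    u = (phi + math.pi) / (2.0 * math.pi) * width
    v = theta / math.pi * height
    return u, v
```

Under this convention the two functions are inverses of each other away from the poles, where the azimuth is undefined.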

<<First Embodiment>>
The first embodiment detects objects and assigns detection frames only in the horizontal periphery. In most cases the objects of interest are not located directly above or below the camera, so this restriction causes no practical problem.

FIG. 7 is an explanatory diagram showing the faces that constitute the cube map image 33.
The cube map image 33 consists of a front face B, a left face A, a right face C, a back face D, a top face E, and a bottom face F. Because these faces have little distortion, objects can be detected in them; however, an object that straddles the edge of a face and is cut off may not be detected.

FIG. 8 is a conceptual diagram of a panoramic image 34 composed of the faces of the horizontal periphery of the cube map image 33.
The panorama acquisition unit 22a assembles the panoramic image 34 from the faces that form the horizontal periphery of the cube map image 33: the right side of the back face D2, the left face A, the front face B, the right face C, and the left side of the back face D1. The object detection unit 23a performs object detection on this panoramic image 34. Object detection uses, for example, an AI model such as SSD (Single Shot MultiBox Detector) or YOLO (You Only Look Once).
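The assembly of the panoramic image 34 can be sketched as a horizontal concatenation of face images. The example assumes that each face is a NumPy array of identical size and that D1 and D2 are exactly the left and right halves of the back face, which the text implies but does not state; the function name `build_panorama_34` is hypothetical.

```python
import numpy as np

def build_panorama_34(left_a: np.ndarray, front_b: np.ndarray,
                      right_c: np.ndarray, back_d: np.ndarray) -> np.ndarray:
    """Concatenate D2 | A | B | C | D1 so that the front face B sits at the center."""
    half = back_d.shape[1] // 2
    back_left_d1 = back_d[:, :half]     # left side of the back face (adjoins C)
    back_right_d2 = back_d[:, half:]    # right side of the back face (adjoins A)
    return np.hstack([back_right_d2, left_a, front_b, right_c, back_left_d1])
```

With this layout the seam of the panorama falls in the middle of the back face, so an object directly behind the camera is split between the two ends of the image.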

Because the original resolution is high, a reduced image should be used as input to the object detection unit 23a. The horizontal coordinate of the panoramic image 34 corresponds to the angle around the full circumference. Because the image is cut off at the left and right ends of the panoramic image 34, the object detection unit 23a cannot detect objects that straddle those ends.

FIG. 9 is a conceptual diagram of a panoramic image 35 in which the horizontal periphery of the cube map image 33 is shifted horizontally.
Because objects are cut off at both ends of the panoramic image 34, the object detection unit 23a may fail to detect objects that straddle those ends. To address this, the present embodiment prepares a panoramic image 35 in which the whole image is shifted horizontally so that it wraps around, and combines the object detection result for the panoramic image 35 with the object detection result for the panoramic image 34.

The panoramic image 35 is the panoramic image 34 shifted horizontally so that it wraps around. The panorama acquisition unit 22b assembles the panoramic image 35 from the right side of the front face B2, the right face C, the back face D, the left face A, and the left side of the front face B1. The object detection unit 23b takes this panoramic image 35 as input and performs two-dimensional object detection.
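The source does not spell out how the two detection results are combined, so the following is only one plausible sketch. It assumes that the panoramic image 35 is the panoramic image 34 rolled by half its width, that boxes are `(x_min, y_min, x_max, y_max)` tuples in pixels, and that duplicates are removed with an intersection-over-union threshold. All function names and the threshold value are hypothetical.

```python
def unshift_box(box, pano_width):
    """Map a box from panorama 35 back to panorama-34 coordinates.

    A box that straddles the seam of panorama 34 comes back with
    x_max > pano_width, which keeps it as one contiguous frame.
    """
    x_min, y_min, x_max, y_max = box
    new_x_min = (x_min + pano_width // 2) % pano_width
    return (new_x_min, y_min, new_x_min + (x_max - x_min), y_max)

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def merge_detections(boxes_34, boxes_35, pano_width, iou_threshold=0.5):
    """Keep all panorama-34 boxes and add panorama-35 boxes that are not duplicates."""
    merged = list(boxes_34)
    for box in boxes_35:
        cand = unshift_box(box, pano_width)
        # Also compare one full turn to the left, in case the box wraps past the seam.
        wrapped = (cand[0] - pano_width, cand[1], cand[2] - pano_width, cand[3])
        if all(max(iou(cand, m), iou(wrapped, m)) < iou_threshold for m in merged):
            merged.append(cand)
    return merged
```

In this sketch an object directly behind the camera, which is split across the ends of the panoramic image 34, is recovered from the panoramic image 35 as a single frame, while objects seen whole in both images are reported only once.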

FIG. 10 is an image illustrating the preparation for three-dimensional coordinate conversion of the object detection frames.
Using two-dimensional object detection, the object detection unit 23a detects each of the objects 51 to 53 captured in the panoramic image 34 and determines the coordinates of the corresponding detection frames 41 to 43. What the object detection unit 23a can detect at this stage is two-dimensional coordinates in the panoramic image 34; it cannot detect the distance from the camera to an object.

The three-dimensional information synthesis unit 25 converts the detection frames output by the two-dimensional object detection of the object detection unit 23a into three-dimensional coordinates. To do so, it views the 360-degree display screen 32 from directly above and establishes its relationship with the periphery.

FIG. 11 is an image illustrating three-dimensional coordinate conversion by placing the object detection frames on a circle.
The 360-degree display screen 32 is projected onto a circle. The position on the 360-degree display screen 32 that corresponds to the left end of the panoramic image 34 is at an angle of -180 degrees relative to the viewpoint. The position that corresponds to the right end of the panoramic image 34 is at +180 degrees, and the position that corresponds to the center of the panoramic image 34 is at 0 degrees.

When the 360-degree display screen 32 is viewed from above, the detection frames 41 to 43 are placed on its circumference. Here the three-dimensional information synthesis unit 25 performs the three-dimensional coordinate conversion. The X coordinate of the panoramic image 34 corresponds to the peripheral angle of the 360-degree display screen 32. The peripheral angle of each of the detection frames 41 to 43 on the 360-degree display screen 32 can therefore be calculated from the X coordinate of its position in the panoramic image 34.
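The mapping from X coordinate to peripheral angle follows directly from the three reference points given for FIG. 11 (left end at -180 degrees, center at 0 degrees, right end at +180 degrees). The sketch below uses the linear relation that the description implies; within a single cube face the exact relation involves the perspective projection, so this should be read as an approximation. The function names are hypothetical.

```python
def x_to_peripheral_angle_deg(x: float, pano_width: float) -> float:
    """Left end -> -180, center -> 0, right end -> +180 (degrees)."""
    return (x / pano_width - 0.5) * 360.0

def box_center_angle_deg(box, pano_width: float) -> float:
    """Peripheral angle of the horizontal center of an (x_min, y_min, x_max, y_max) box."""
    x_min, _, x_max, _ = box
    return x_to_peripheral_angle_deg((x_min + x_max) / 2.0, pano_width)
```

For a panorama 400 pixels wide, a frame centered at x = 320 sits at roughly 108 degrees to the right of the forward direction.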

FIG. 12 is an image illustrating the conversion of the orientation of the object detection frames 41 to 43.
The three-dimensional information synthesis unit 25 calculates the orientation of each of the detection frames 41 to 43 from its position in the circular projection. The orientation of a detection frame is its angle in three-dimensional coordinates. Each of the detection frames 41 to 43 faces from the 360-degree display screen 32 toward the origin (the viewpoint). The three-dimensional information synthesis unit 25 therefore calculates the orientation of each detection frame from its position (peripheral angle). The polygon generation unit 26 generates polygons for the detection frames 41 to 43 from their positions (peripheral angles) and orientations and composites them with the 360-degree display screen 32.
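Placing a detection frame on the circle and turning it toward the viewpoint can be sketched in a top-down view as follows. The axis convention (angle 0 along +x, positive angles toward +y), the unit radius, and the function name `frame_pose` are assumptions for illustration; the source only states that each frame faces the origin.

```python
import math

def frame_pose(angle_deg: float, radius: float = 1.0):
    """Top-down position on the circle and the yaw that makes the frame face the origin."""
    a = math.radians(angle_deg)
    # Position of the frame center on the circle of the 360-degree display screen.
    position = (radius * math.cos(a), radius * math.sin(a))
    # The frame normal points from that position back toward the origin (viewpoint).
    normal = (-math.cos(a), -math.sin(a))
    yaw_deg = math.degrees(math.atan2(normal[1], normal[0]))
    return position, normal, yaw_deg
```

A polygon for the frame can then be built by spanning a rectangle perpendicular to `normal` and centered at `position`, with its height taken from the vertical extent of the detection frame.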

<<Second Embodiment>>
The second embodiment detects objects and assigns detection frames over the entire circumference. It is described below with reference to FIGS. 13 and 14.

FIG. 13 is a drawing showing panoramic images 35 and 36.
The panoramic image 35 is assembled from the upper side of the back face D3, the top face E, the front face B, the bottom face F, and the lower side of the back face D4 of the cube map image 33. Objects are detected in this panoramic image 35. The Y coordinate of the panoramic image 35 corresponds to the latitude on a given longitude plane of the 360-degree display screen 32. This makes it possible to obtain the positions and orientations of the detection frames 41 to 43 on the 360-degree display screen 32.

However, because objects are cut off at the upper and lower ends of the panoramic image 35, objects that straddle those ends may not be detected. To address this, the present embodiment prepares a panoramic image 36 in which the whole image is shifted vertically so that it wraps around, and combines the object detection result for the panoramic image 36 with the object detection result for the panoramic image 35.

The panoramic image 36 is obtained by shifting the panoramic image 35 vertically and wrapping it around. It is assembled from the lower front side B3, the bottom surface F, the back surface D, the top surface E, and the upper front side B4. Two-dimensional object detection is performed with this panoramic image 36 as input. In this embodiment, objects can therefore be detected over the entire circumference without omission.
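A minimal sketch of the vertical shift and wrap-around follows, assuming the panoramic image is held as a list of pixel rows and a detection frame is a tuple `(x0, y0, x1, y1)`. The face order of the panoramic image 36 suggests a shift of half the strip height, but the shift amount is left as a parameter here. The function names `wrap_vertically` and `unwrap_box` are hypothetical.

```python
def wrap_vertically(rows, shift):
    """Cyclically shift the image rows downward by `shift` rows."""
    shift %= len(rows)
    if shift == 0:
        return list(rows)
    # Rows pushed past the bottom re-enter at the top.
    return rows[-shift:] + rows[:-shift]

def unwrap_box(box, shift, height):
    """Map a box found in the wrapped image back to the original Y axis.

    y0 is reduced modulo `height`; y1 may exceed `height` when the box
    straddles the seam of the original panoramic image.
    """
    x0, y0, x1, y1 = box
    box_height = y1 - y0
    y0 = (y0 - shift) % height
    return (x0, y0, x1, y0 + box_height)
```

For example, with 8 rows and a shift of 4, an object occupying original rows 7 and 0 becomes contiguous at wrapped rows 3 and 4; a box `(10, 3, 20, 5)` found there maps back to `(10, 7, 20, 9)`, which extends past the bottom edge and so marks a frame that crosses the seam.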

FIG. 14 is a diagram showing panoramic images 37 and 38.
The panoramic image 37 is assembled from the bottom surface F, the right surface C, the top surface E, and the left surface A of the cube map image 33. Object detection is performed on this panoramic image 37. The Y coordinate of the panoramic image 37 corresponds to the latitude on a predetermined longitude plane of the 360-degree display screen 32. The positions and orientations of the detection frames 41 to 43 on the 360-degree display screen 32 can therefore be determined.

However, objects are cut off at the upper and lower ends of the panoramic image 37, so an object that straddles either end may not be detected. To address this problem, the present embodiment prepares a panoramic image 38 in which the entire image is shifted vertically and wrapped around, and combines the object detection result from the panoramic image 38 with the object detection result from the panoramic image 37.

The panoramic image 38 is obtained by shifting the panoramic image 37 vertically and wrapping it around. It is assembled from the top surface E, the left surface A, the bottom surface F, and the right surface C. Two-dimensional object detection is performed with this panoramic image 38 as input. In this embodiment, objects can therefore be detected over the entire circumference without omission.
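Combining the detection results of a panoramic image and its wrapped counterpart can be sketched as follows. The specification states only that the two results are combined; the duplicate suppression by intersection-over-union (IoU) shown here is an assumption added for illustration, as are the function names and the `threshold` parameter. Boxes are `(x0, y0, x1, y1)` tuples already expressed on the Y axis of the unshifted panoramic image, which is treated as a loop of length `height`.

```python
def cyclic_overlap(a0, a1, b0, b1, period):
    """Overlap length of [a0, a1) and [b0, b1) on a loop of length `period`."""
    total = 0.0
    for k in (-1, 0, 1):
        # Compare against the second interval shifted by whole periods.
        lo = max(a0, b0 + k * period)
        hi = min(a1, b1 + k * period)
        total += max(0.0, hi - lo)
    return total

def iou(a, b, height):
    """IoU of two boxes whose Y axis wraps around with period `height`."""
    overlap_x = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_y = cyclic_overlap(a[1], a[3], b[1], b[3], height)
    inter = overlap_x * overlap_y
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def merge_detections(primary, secondary, height, threshold=0.5):
    """Keep all primary boxes; add secondary boxes that are not duplicates."""
    merged = list(primary)
    for box in secondary:
        if all(iou(box, kept, height) < threshold for kept in merged):
            merged.append(box)
    return merged
```

Objects away from the seam are typically found in both images, so without some form of duplicate handling they would receive two detection frames; a box straddling the seam appears only in the wrapped result and is added unchanged.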

(Modifications)
The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments above have been described in detail to explain the present invention clearly, and the invention is not necessarily limited to configurations having all of the described elements. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations can also be added, deleted, or substituted.

Some or all of the above configurations, functions, processing units, processing means, and the like may be implemented in hardware, for example as integrated circuits. The above configurations, functions, and the like may also be implemented in software, with a processor interpreting and executing a program that implements each function. Information such as programs, tables, and files that implement each function can be stored in a recording device such as a memory, a hard disk, or an SSD (Solid State Drive), or on a recording medium such as a flash memory card or a DVD (Digital Versatile Disk).

In each embodiment, the control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines of a product are necessarily shown. In practice, almost all of the components may be considered to be interconnected.
Modifications of the present invention include, for example, the following (a) to (c).

(a) In addition to associating the position along the long axis of the panoramic image with an angle of the 360-degree display screen, the position along the short axis of the panoramic image may also be associated with an angle of the 360-degree display screen.
(b) The conversion is not limited to cube map image conversion; the equirectangular image may be converted into a two-dimensional panoramic image by image conversion to an arbitrary polyhedron or polygon.
(c) The object detection method is not limited to SSD and YOLO, and any object detection method may be adopted.

11 360-degree omnidirectional camera
12 External controller
13 Display unit
2 Object detection device
21 Cube map image conversion unit
22a Panorama acquisition unit
22b Panorama acquisition unit
23a Object detection unit (first object detection unit)
23b Object detection unit (second object detection unit)
24a Detection frame determination unit (first detection frame determination unit)
24b Detection frame determination unit (second detection frame determination unit)
25 Three-dimensional information synthesizing unit (synthesizing unit)
26 Polygon generator
31 Equirectangular image
32 360-degree display screen
33 Cube map image
34 to 38 Panoramic images
41 to 43 Detection frames
5 User
51 to 53 Objects

Claims

An omnidirectional image object detection device comprising:
an image conversion unit that converts an equirectangular image captured by an omnidirectional camera into a cube map image;
a first object detection unit that detects an object in a panoramic image composed of a plurality of surfaces of the cube map image;
a first detection frame determination unit that determines coordinates of a rectangle surrounding the object detected by the first object detection unit in the panoramic image;
a second object detection unit that detects an object in a wraparound image obtained by shifting and wrapping around the panoramic image of the cube map image;
a second detection frame determination unit that determines coordinates of a rectangle surrounding the object detected by the second object detection unit on the plane of the wraparound image; and
a synthesizing unit that displays the equirectangular image on a 360-degree display screen and synthesizes the rectangle determined by the first detection frame determination unit and the rectangle determined by the second detection frame determination unit at corresponding positions on the 360-degree display screen.

The omnidirectional image object detection device according to claim 1,
wherein the synthesizing unit corrects the orientation of the rectangle based on the position of the rectangle in the circumferential projection.

The omnidirectional image object detection device according to claim 1,
wherein the panoramic image is the horizontal periphery of the cube map image.

The omnidirectional image object detection device according to claim 1,
wherein the panoramic image comprises:
a horizontal periphery of the cube map image;
a combination of the front surface, the top surface, the back surface, and the bottom surface of the cube map image; and
a combination of the left surface, the top surface, the right surface, and the bottom surface of the cube map image.

An omnidirectional image object detection method comprising:
a step of converting an equirectangular image captured by an omnidirectional camera into a cube map image;
a step of detecting an object in a panoramic image composed of a plurality of surfaces of the cube map image;
a step of determining coordinates of a rectangle surrounding the detected object in the panoramic image;
a step of detecting an object in a wraparound image obtained by shifting and wrapping around the panoramic image of the cube map image;
a step of determining coordinates of a rectangle surrounding the detected object on the plane of the wraparound image; and
a step of displaying the equirectangular image on a 360-degree display screen and synthesizing the rectangle of the object detected in the panoramic image and the rectangle of the object detected in the wraparound image at corresponding positions on the 360-degree display screen.

An object detection program for causing a computer to execute:
a procedure of converting an equirectangular image captured by an omnidirectional camera into a cube map image;
a procedure of detecting an object in a panoramic image composed of a plurality of surfaces of the cube map image;
a procedure of determining coordinates of a rectangle surrounding the detected object in the panoramic image;
a procedure of detecting an object in a wraparound image obtained by shifting and wrapping around the panoramic image of the cube map image;
a procedure of determining coordinates of a rectangle surrounding the detected object on the plane of the wraparound image; and
a procedure of displaying the equirectangular image on a 360-degree display screen and synthesizing the rectangle of the object detected in the panoramic image and the rectangle of the object detected in the wraparound image at corresponding positions on the 360-degree display screen.