JP7373079B2 - Image recognition simulator device

Info

Publication number: JP7373079B2
Application number: JP2022546154A
Authority: JP
Inventors: 玲宇田川; 崇之佐藤; 健永崎
Original assignee: Hitachi Astemo Ltd
Current assignee: Hitachi Astemo Ltd
Priority date: 2020-09-07
Filing date: 2021-07-27
Publication date: 2023-11-01
Anticipated expiration: 2041-07-27
Also published as: WO2022049926A1; DE112021003088T5; JPWO2022049926A1

Description

The present invention relates to an image recognition simulator device.
This application claims priority based on Japanese Patent Application No. 2020-149907 filed on September 7, 2020, the contents of which are incorporated herein.

BACKGROUND OF THE INVENTION

In recent years, preventive safety systems, which use various sensors mounted on automobiles to detect or avoid danger, have been tested extensively. If such a system fails to activate as intended when it is needed, an accident may result, so it must be tested under many assumed scenarios. However, testing whether the system activates in a dangerous scene by actually driving a vehicle is limited by safety and other considerations. There is therefore a demand for test methods that use a vehicle and a driving environment generated artificially with CG (computer graphics) or the like.

As an example, Patent Document 1 proposes a method in which multiple driving-scene patterns are created artificially by superimposing images representing weather disturbances and the like on real image data, and tests are performed on them.

Patent Document 1: Japanese Patent Application Publication No. 2010-33321

However, the method described in Patent Document 1 simply superimposes another image on the real image data, so the result lacks realism. For example, if a CG image of a pedestrian is simply superimposed on a real image, the sense of perspective is disrupted and the image looks unnatural.

The present invention was made to solve this technical problem, and its object is to provide an image recognition simulator device that can create realistic composite images.

The image recognition simulator device according to the present invention comprises: a distance calculation unit that calculates distances by stereo matching based on brightness images captured by at least two cameras and outputs a distance image representing the calculated result as an image; a region division calculation unit that obtains a region-divided image by performing region division on the brightness images; a distance image error exclusion unit that excludes stereo matching errors from the distance image based on the result of the region division by the region division calculation unit; a three-dimensional space generation unit that generates a three-dimensional space based on the distance image from which the errors have been excluded by the distance image error exclusion unit; a virtual object installation unit that installs, at an arbitrary position and time, a virtual object to be recognized as a target by an in-vehicle camera application; a virtual object synthesis unit that synthesizes the virtual object installed by the virtual object installation unit into the three-dimensional space generated by the three-dimensional space generation unit; and an image generation unit that generates brightness images for the two cameras based on the result of the synthesis by the virtual object synthesis unit.

In the image recognition simulator device according to the present invention, the distance image error exclusion unit excludes stereo matching errors from the distance image based on the result of the region division by the region division calculation unit, so a virtual object can be synthesized into the brightness images without being affected by stereo matching errors. A realistic composite image can therefore be created.

According to the present invention, a realistic composite image can be created.

FIG. 1 is a schematic configuration diagram showing an image recognition simulator device according to an embodiment.
FIG. 2 is a schematic configuration diagram showing the CG-natural image synthesis unit of the image recognition simulator device.
FIG. 3 is a schematic diagram for explaining the operation of the CG-natural image synthesis unit.
FIG. 4 is a schematic diagram for explaining the operation of the CG-natural image synthesis unit.
FIG. 5 is a schematic diagram for explaining the operation of the CG-natural image synthesis unit.
FIG. 6 is a schematic diagram for explaining the operation of the CG-natural image synthesis unit.

DESCRIPTION OF EMBODIMENTS

Hereinafter, an embodiment of the image recognition simulator device according to the present invention will be described with reference to the drawings.

FIG. 1 is a schematic configuration diagram showing an image recognition simulator device according to an embodiment. The image recognition simulator device 1 of this embodiment is a device that simulates a recognition application using natural images, the CG to be synthesized, and the behavior of that CG, based on a plurality of time-series images collected by an in-vehicle image collection device and the control signals synchronized with them. Although not shown, the in-vehicle image collection device includes an image acquisition unit having at least two cameras (here, a stereo camera).

The stereo camera is the vehicle-mounted camera. It consists of, for example, a pair of left and right cameras arranged at a predetermined optical-axis spacing (baseline length) so that their optical axes are parallel to each other, and it captures images of the surroundings of the own vehicle. Each of the left and right cameras is composed of an image sensor such as a CMOS sensor, an optical lens, and the like. The images captured by this stereo camera are the natural images mentioned above.

As shown in FIG. 1, the image recognition simulator device 1 includes a CG-natural image synthesis unit 10 and an image recognition unit 20.

FIG. 2 is a schematic configuration diagram showing the CG-natural image synthesis unit of the image recognition simulator device. As shown in FIG. 2, the CG-natural image synthesis unit 10 includes a distance calculation unit 11, a region division calculation unit 12, a distance image error exclusion unit 13, a three-dimensional space generation unit 14, a virtual object installation unit 15, a virtual object synthesis unit 16, and an image generation unit 17.

The distance calculation unit 11 calculates distances by stereo matching based on the brightness images captured by the stereo camera, and outputs a distance image representing the calculated result as an image. More specifically, the distance calculation unit 11 first calculates distances by stereo matching from the two brightness images captured by the stereo camera. It does so by acquiring the parallax of each pixel, for example on the principle of triangulation. The acquired parallax can be converted into a distance using the specifications of the stereo camera. For example, if the baseline length of the stereo camera is L, the CMOS size is μ, the focal length of the optical lens is V, and the parallax is d, the distance can be calculated as VL/(dμ). The distance calculation unit 11 then outputs the distance image representing the result of this calculation to the distance image error exclusion unit 13.
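The conversion from parallax to distance can be sketched as follows. This is a minimal illustration, not the device's implementation: it assumes that μ is the size of one sensor pixel in metres and that the parallax d is given in pixels, and the function name and numerical values are hypothetical.

```python
def disparity_to_distance(d_pixels, baseline_m, focal_length_m, pixel_size_m):
    """Return the distance V*L / (d*mu) in metres.

    Assumption: d_pixels is the parallax in pixels and pixel_size_m is the
    size of one pixel, so d_pixels * pixel_size_m is the parallax on the sensor.
    """
    if d_pixels <= 0:
        # Zero parallax corresponds to a point at infinity.
        return float("inf")
    return focal_length_m * baseline_m / (d_pixels * pixel_size_m)


# Hypothetical camera: 0.35 m baseline, 6 mm lens, 4.2 um pixels.
z = disparity_to_distance(d_pixels=20, baseline_m=0.35,
                          focal_length_m=0.006, pixel_size_m=4.2e-6)
```

Because the distance is inversely proportional to the parallax, an error of one pixel matters far more for distant objects (small d) than for near ones.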

One example of stereo matching is a method based on local image information. In this method, a window is set around the pixel of interest, and the correspondence is found by calculating the similarity of the feature values within that window between the left and right images. Here, the window has width W and height H and is centered on the pixel of interest. One way to calculate the similarity is the SAD (Sum of Absolute Differences).

Let the coordinates and brightness in the right camera image be p_R = [x, y]^T and I(p_R), the coordinates and brightness in the left camera image be p_L = [x, y]^T and I(p_L), the parallax be D = [d, 0]^T, and the offset within the window be s = [w, h]^T. The similarity R(D) used to obtain the parallax D = [d, 0]^T can then be calculated by the following formula (1).
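Window-based SAD matching of this kind can be sketched as below. Formula (1) itself is not reproduced in this text, so the sketch assumes the usual SAD form, in which the window around (x, y) in the right image is compared with the window around (x + d, y) in the left image. The function names, the odd window sizes, and the search range `max_d` are assumptions made for illustration.

```python
import numpy as np


def sad(right, left, x, y, d, W, H):
    """Sum of absolute differences between the W x H window centered at (x, y)
    in the right image and the window centered at (x + d, y) in the left image.
    Assumes W and H are odd and both windows lie inside the images."""
    hw, hh = W // 2, H // 2
    win_r = right[y - hh:y + hh + 1, x - hw:x + hw + 1].astype(np.int64)
    win_l = left[y - hh:y + hh + 1, x + d - hw:x + d + hw + 1].astype(np.int64)
    return int(np.abs(win_r - win_l).sum())


def best_disparity(right, left, x, y, max_d, W=5, H=5):
    """Return the parallax in 0..max_d with the smallest SAD cost."""
    costs = [sad(right, left, x, y, d, W, H) for d in range(max_d + 1)]
    return int(np.argmin(costs))


# Synthetic pair: the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right = rng.integers(0, 256, size=(20, 40), dtype=np.uint8)
left = np.roll(right, 3, axis=1)
d_hat = best_disparity(right, left, x=15, y=10, max_d=8)
```

On textured data the minimum is sharp; on repetitive or irregular patterns several values of d can give similar costs, which is one way false matches arise.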

Note that the parallax obtained for each pixel is determined by the window and the evaluation function set above, so it is not necessarily the true value. When a parallax different from the true parallax is obtained (called a false match), a distance error is said to exist.

The region division calculation unit 12 obtains a region-divided image by performing region division on the brightness image captured by the stereo camera. Region division means dividing an image into regions with similar characteristics, such as edges and brightness, and labeling each of the divided regions. For this purpose, an algorithm applying a convolutional neural network (CNN), for example, is used. The labels here are, for example, road, automobile, pedestrian, and grass, and an ID is preferably set for each.

The region division calculation unit 12 preferably divides the near side of the brightness image into regions more finely than the far side. The near side of the image is closer to the own vehicle than the far side, so dividing it finely reduces errors and improves safety.

The distance image error exclusion unit 13 excludes stereo matching errors from the distance image output by the distance calculation unit 11, based on the result of the region division by the region division calculation unit 12. As described above, the distance image contains errors. The distance image error exclusion unit 13 therefore uses the distance image and the region-divided image to exclude the stereo matching errors, and outputs the distance image with the errors excluded to the three-dimensional space generation unit 14.

Specifically, the distance image error exclusion unit 13 has an image division unit 131 that divides the image based on the region-divided image obtained by the region division calculation unit 12, and a distance acquisition unit 132 that acquires a distance for each image divided by the image division unit 131 and excludes the stereo matching errors. The image division unit 131 divides the image based on the region-divided image and sets the ID specified in the region-divided image on each divided image.

The distance acquisition unit 132 acquires a distance distribution based on the IDs set in the region-divided image, then acquires a predetermined distance and excludes the stereo matching errors. That is, the distance acquisition unit 132 acquires a characteristic distance for each image ID and removes, as errors, distances that are unnatural compared with that characteristic distance.
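One way to picture this per-ID exclusion is the sketch below. It is only an illustration under stated assumptions: the characteristic distance of a region is taken to be its median, and a pixel is treated as unnatural when it deviates from that median by more than a tolerance derived from the median absolute deviation. The text does not specify these statistics; the function name and the parameters `k` and `min_tol` are hypothetical.

```python
import numpy as np


def exclude_outliers(distance, labels, k=3.0, min_tol=0.5):
    """Return a copy of `distance` in which pixels that deviate strongly from
    the median distance of their labeled region are set to NaN.

    Assumptions: the median is the region's characteristic distance, and the
    tolerance is max(k * 1.4826 * MAD, min_tol) in the same unit as distance.
    """
    cleaned = distance.astype(float).copy()
    for label_id in np.unique(labels):
        mask = labels == label_id
        values = cleaned[mask]
        median = np.nanmedian(values)
        mad = np.nanmedian(np.abs(values - median))
        tol = max(k * 1.4826 * mad, min_tol)
        cleaned[mask & (np.abs(distance - median) > tol)] = np.nan
    return cleaned


# Toy example: region 2 lies at 10 m, with one false match reporting 2 m.
distance = np.full((4, 4), 30.0)
labels = np.ones((4, 4), dtype=int)
labels[1:3, 1:3] = 2
distance[1:3, 1:3] = 10.0
distance[1, 1] = 2.0
cleaned = exclude_outliers(distance, labels)
```

A single global statistic suits compact objects; regions with real depth, such as the road, need a model that varies across the image.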

Specifically, the distance acquisition unit 132 has a mechanism for changing the distance to be acquired depending on whether the region has depth. Image IDs are assigned as integers, and each integer is internally associated with a type, for example as shown in Table 1. The distance acquisition unit 132 has a distance acquisition method corresponding to each ID.

For example, when ID=1, the type is road, so the distance acquisition unit 132 applies a distance acquisition method suited to roads. Image collection is based on the premise that the vehicle is driving on a road; the distance of the road is 0 m at the position of the own vehicle, increases gradually toward the vanishing point, and is largest at the point at infinity. The distance acquisition unit 132 corrects the distance set for each pixel on this premise. Because the distance is nearly constant along the X-axis direction and falls off along the Y-axis direction, an approximate curve can be fitted. Correcting the distance in this way excludes the stereo matching errors.
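A road correction along these lines can be sketched as follows. The text only states that an approximate curve can be fitted; the specific model here is an assumption: for a flat road seen by a forward-looking camera, the inverse of the distance is taken to vary linearly with the image row. The per-row median and the function name are likewise hypothetical choices.

```python
import numpy as np


def fit_road_distance(distance, road_mask):
    """Replace road distances with a curve fitted over the image rows.

    Assumed model: 1 / distance = a * row + b on the road surface.
    Per-row medians are used so that isolated false matches do not
    dominate the fit. Distances on the road must be positive.
    """
    rows = np.where(road_mask.any(axis=1))[0]
    inv_median = np.array(
        [np.median(1.0 / distance[r, road_mask[r]]) for r in rows]
    )
    a, b = np.polyfit(rows, inv_median, deg=1)
    corrected = distance.astype(float).copy()
    for r in rows:
        corrected[r, road_mask[r]] = 1.0 / (a * r + b)
    return corrected


# Toy scene: rows 5..9 are road with true distance 100 / (row - 4) metres.
distance = np.full((10, 6), 50.0)
road_mask = np.zeros((10, 6), dtype=bool)
road_mask[5:, :] = True
for r in range(5, 10):
    distance[r, :] = 100.0 / (r - 4)
distance[7, 2] = 1.0  # a false match on the road
corrected = fit_road_distance(distance, road_mask)
```

Since the fitted value depends only on the row, every road pixel in a row receives the same distance, matching the observation that the distance is similar along the X axis.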

When ID=2, 3, or 4, the types are automobile (for example, a preceding or oncoming vehicle), pedestrian, and two-wheeled vehicle, respectively. Automobiles, pedestrians, and two-wheeled vehicles move by themselves, which distinguishes them from the road with ID=1. Their sizes and orientations also vary, so the distances in the corresponding divided image can be expected to contain a range of values. Because the purpose of the distance acquisition unit 132 here is to acquire a distance suitable for synthesis, it does not necessarily need to acquire the correct distance. The distance acquisition unit 132 acquires the temporal and spatial distribution of the distances from the changes over time in the distance image and the region-divided image, and uses it to even out the fluctuations in distance. Evening out the fluctuations in this way excludes the stereo matching errors. The temporal or spatial distance distribution may be calculated over the entire image region, on a projection onto the X axis or the Y axis, or with temporal weighting.
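The evening-out of distance fluctuations for a moving object can be sketched as below. The choices are assumptions, not taken from the text: the spatial distribution is summarized by the median over the object's region in each frame, and the temporal weighting is an exponential moving average with a hypothetical weight `alpha`.

```python
import numpy as np


def smooth_object_distance(distance_frames, mask_frames, alpha=0.5):
    """Return one smoothed distance per frame for a labeled object.

    Assumptions: the per-frame representative distance is the median over
    the object's mask, and frames are blended with an exponential moving
    average (newer frames weighted by alpha).
    """
    smoothed = []
    state = None
    for dist, mask in zip(distance_frames, mask_frames):
        if mask.any():
            representative = float(np.median(dist[mask]))
            if state is None:
                state = representative
            else:
                state = alpha * representative + (1.0 - alpha) * state
        smoothed.append(state)  # unchanged if the object is not visible
    return smoothed


# Three frames of a 2 x 2 object region; frame 1 contains a false match.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
frames = [np.full((4, 4), 50.0) for _ in range(3)]
frames[0][mask] = 20.0
frames[1][mask] = 18.0
frames[1][1, 1] = 90.0
frames[2][mask] = 16.0
result = smooth_object_distance(frames, [mask, mask, mask])
```

The result is a single distance per object per frame, which is enough to decide whether a synthesized object lies in front of or behind it.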

When ID=5, the type is grass. Grass and lawns have irregular patterns, which is a case the local window-based stereo matching considered in this embodiment handles poorly. False matches are therefore expected to be relatively frequent, and in many cases no distance can be obtained at all. In such cases, instead of the distance of the grass itself, the distance of the road, which is comparatively easy to estimate correctly, can be used as the estimated distance of the grass. This excludes the stereo matching errors. When doing so, dividing the single image obtained by region division according to the change in distance on the road surface makes the CG synthesis described later easier to perform.
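Borrowing the road distance for grass can be sketched as follows. The sketch assumes flat ground, so that grass in a given image row is taken to lie at about the same distance as the road in that row; this row-wise rule and the function name are assumptions. The IDs 1 (road) and 5 (grass) follow the examples in the text.

```python
import numpy as np


def fill_grass_from_road(distance, labels, road_id=1, grass_id=5):
    """Overwrite grass distances with the median road distance of the same row.

    Assumption: on flat ground, pixels in one image row are at a similar
    distance, so the road is a usable stand-in for the grass beside it.
    Rows without any road pixels are left unchanged.
    """
    out = distance.astype(float).copy()
    for r in range(labels.shape[0]):
        road = labels[r] == road_id
        grass = labels[r] == grass_id
        if road.any() and grass.any():
            out[r, grass] = np.median(distance[r, road])
    return out


# Toy scene: grass on both edges, road in the middle, unreliable grass values.
labels = np.array([[5, 1, 1, 5]] * 3)
distance = np.array([[3.0, 30.0, 30.0, 99.0],
                     [7.0, 20.0, 20.0, 1.0],
                     [50.0, 10.0, 10.0, 2.0]])
filled = fill_grass_from_road(distance, labels)
```

Filling row by row also splits the grass region into bands that follow the change in road distance.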

In this embodiment, the distance acquisition unit 132 acquires the distance corresponding to each ID set in the region-divided image and excludes the stereo matching errors based on the acquired distances, which improves the accuracy with which the stereo matching errors are excluded.

The three-dimensional space generation unit 14 generates a three-dimensional space based on the distance image from which the errors have been excluded by the distance image error exclusion unit 13. More specifically, the three-dimensional space generation unit 14 generates the three-dimensional space from that distance image and the brightness image captured by the stereo camera. Since the brightness image is assumed to have been obtained by orthographic projection, coordinates in the three-dimensional space can be obtained by combining it with the distance image, based on the use of the stereo camera.

The virtual object installation unit 15 installs, at an arbitrary position and time, a virtual object to be recognized as a target by the in-vehicle camera application. More specifically, the virtual object installation unit 15 determines the type of CG of the virtual object to be synthesized, and determines the time and position at which the virtual object is installed. It preferably determines the installation position and time based on information on how the virtual object is to be moved and on the travel distance of the own vehicle obtained from the own vehicle's control signals. This yields a composite image closer to a real image and improves the reliability of the simulation. The virtual object installation unit 15 then installs the virtual object based on the determined result.
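Combining the virtual object's motion with the own vehicle's travel distance can be sketched as below. This is a simplified one-dimensional illustration: the control signal is assumed to be a list of vehicle speeds sampled at a fixed interval, the object is assumed to move at constant speed along the driving direction, and all names and values are hypothetical.

```python
def ego_travel(speeds_mps, dt):
    """Travel distance of the own vehicle at the start of each frame,
    obtained by accumulating speed samples (assumed control signal)."""
    travel, total = [], 0.0
    for v in speeds_mps:
        travel.append(total)
        total += v * dt
    return travel


def object_gap(start_gap_m, object_speed_mps, speeds_mps, dt):
    """Distance from the own vehicle to the virtual object in each frame.

    The object starts start_gap_m ahead and moves at object_speed_mps
    in the driving direction (positive means moving away).
    """
    return [
        start_gap_m + object_speed_mps * i * dt - s
        for i, s in enumerate(ego_travel(speeds_mps, dt))
    ]


# Own vehicle at 10 m/s, object 30 m ahead moving at 2 m/s, 0.5 s frames.
gaps = object_gap(start_gap_m=30.0, object_speed_mps=2.0,
                  speeds_mps=[10.0, 10.0, 10.0], dt=0.5)
```

The per-frame gap is the distance at which the object would be placed in the three-dimensional space for that frame, so the object approaches at the same rate as the real scenery.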

A virtual object is an object to be recognized as a target by the in-vehicle camera application; examples include automobiles, pedestrians, and two-wheeled vehicles. Because a virtual object is generated artificially, its speed and size can be set freely.

The virtual object synthesis unit 16 synthesizes the virtual object installed by the virtual object installation unit 15 into the three-dimensional space generated by the three-dimensional space generation unit 14. In doing so, it places the virtual object installed by the virtual object installation unit 15 at the predetermined position in the three-dimensional space.

The image generation unit 17 generates the brightness images of the stereo camera based on the result of the synthesis by the virtual object synthesis unit 16. That is, it generates the brightness images that the left and right cameras of the stereo camera would obtain from the three-dimensional space into which the virtual object has been synthesized.

The image recognition unit 20, in turn, performs recognition on the left and right camera brightness images generated by the image generation unit 17.

The operation of the CG-natural image synthesis unit 10 will now be described with reference to FIGS. 3 to 6.

First, the distance calculation unit 11 calculates distances by stereo matching based on the brightness images captured by the stereo camera (see FIG. 3). In FIG. 3, the left diagram is a simulated representation of a brightness image acquired by the stereo camera, and the right diagram shows the distance image obtained by the calculation of the distance calculation unit 11, expressed in shades of color. The shades in the right diagram are set for each object according to how near or far it is.

The distance image shown in the right diagram also contains a number of irregular points. These points represent distance errors caused by stereo matching errors. If the distance image contains such distance errors, the front-to-back relationship with objects changes when the CG is synthesized, and the image looks unnatural. That is, when a virtual object is synthesized into a distance image containing these errors, part of a lawn that should be behind the virtual object may appear to be in front of it, partially occluding the virtual object and producing an unnatural image.
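The occlusion problem can be made concrete with a per-pixel depth comparison. The sketch below is a generic depth-test compositing step, not the synthesis procedure of the device: the virtual object is drawn only where it is nearer than the scene, so a single erroneous near distance in the scene punches a hole in the object. All names and values are hypothetical.

```python
import numpy as np


def composite(scene_img, scene_dist, obj_img, obj_dist, obj_mask):
    """Draw the virtual object where it is nearer than the scene.

    Returns the composited image and the mask of visible object pixels.
    Pixels whose scene distance is NaN never show the object, because
    comparisons with NaN are false.
    """
    visible = obj_mask & (obj_dist < scene_dist)
    out = scene_img.copy()
    out[visible] = obj_img[visible]
    return out, visible


# Scene: lawn at 30 m with one false match reporting 5 m.
scene_img = np.full((4, 4), 100, dtype=np.uint8)
scene_dist = np.full((4, 4), 30.0)
scene_dist[1, 1] = 5.0

# Virtual pedestrian at 10 m covering a 2 x 2 block.
obj_mask = np.zeros((4, 4), dtype=bool)
obj_mask[1:3, 1:3] = True
obj_img = np.full((4, 4), 200, dtype=np.uint8)
obj_dist = np.full((4, 4), 10.0)

out_bad, visible_bad = composite(scene_img, scene_dist, obj_img, obj_dist, obj_mask)

# After the distance error is excluded, the whole object is visible.
fixed_dist = scene_dist.copy()
fixed_dist[1, 1] = 30.0
out_ok, visible_ok = composite(scene_img, fixed_dist, obj_img, obj_dist, obj_mask)
```

In the first result one pixel of the pedestrian is hidden behind the "lawn"; in the second the object is complete.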

Next, the region division calculation unit 12 performs region division on the brightness image captured by the stereo camera (see FIG. 4). As shown in FIG. 4, the region division of the brightness image acquired by the stereo camera results in labels such as "automobile", "road", "lawn", and "other".

Next, the distance image error exclusion unit 13 excludes stereo matching errors from the distance image based on the result of the region division by the region division calculation unit 12 (see FIG. 5). Specifically, the distance acquisition unit 132 of the distance image error exclusion unit 13 recalculates the distance for each label, based on the labeling obtained by the region division, and removes the distance errors (that is, the irregular points in the right diagram of FIG. 3). Recalculating the distances from the labeled result of the region division in this way effectively excludes the errors caused by stereo matching and suppresses the synthesis of unnatural images.

Next, the three-dimensional space generation unit 14 generates a three-dimensional space based on the distance image from which the distance errors have been removed, and the virtual object installation unit 15 installs a virtual object (here, a pedestrian) at an arbitrary position and time.

Next, the virtual object synthesis unit 16 places the pedestrian, as the virtual object, in the three-dimensional space generated by the three-dimensional space generation unit 14 and synthesizes the CG (see the left diagram of FIG. 6). The image generation unit 17 generates the brightness images of the stereo camera based on the CG synthesized by the virtual object synthesis unit 16 (see the right diagram of FIG. 6).

In the image recognition simulator device 1 of this embodiment, the distance image error exclusion unit 13 excludes stereo matching errors from the distance image based on the result of the region division by the region division calculation unit 12, so a virtual object can be synthesized into the brightness images without being affected by stereo matching errors. A realistic composite image can therefore be created.

In addition, because natural images captured by the stereo camera are used, the result is more realistic than when everything is created with CG, and the reliability of the simulation improves. Furthermore, because CG images of automobiles, pedestrians, two-wheeled vehicles, and the like (that is, virtual objects) are synthesized into natural images, the variety of images can be increased easily.

Although an embodiment of the present invention has been described in detail above, the present invention is not limited to that embodiment, and various design changes can be made without departing from the spirit of the present invention as set out in the claims.

1 Image recognition simulator device
10 CG-natural image synthesis unit
11 Distance calculation unit
12 Region division calculation unit
13 Distance image error exclusion unit
14 Three-dimensional space generation unit
15 Virtual object installation unit
16 Virtual object synthesis unit
17 Image generation unit
20 Image recognition unit
131 Image division unit
132 Distance acquisition unit

Claims

1. An image recognition simulator device comprising:
a distance calculation unit that calculates distances by stereo matching based on brightness images captured by at least two cameras, and outputs a distance image representing the calculated result as an image;
a region division calculation unit that performs region division on the brightness images to obtain a region-divided image;
a distance image error exclusion unit that excludes stereo matching errors from the distance image based on the result of the region division by the region division calculation unit;
a three-dimensional space generation unit that generates a three-dimensional space based on the distance image from which the errors have been excluded by the distance image error exclusion unit;
a virtual object installation unit that installs, at an arbitrary position and time, a virtual object to be recognized as a target by an in-vehicle camera application;
a virtual object synthesis unit that synthesizes the virtual object installed by the virtual object installation unit into the three-dimensional space generated by the three-dimensional space generation unit; and
an image generation unit that generates brightness images for the two cameras based on the result of the synthesis by the virtual object synthesis unit,
wherein the region division calculation unit divides the brightness images into regions more finely on the near side than on the far side.

2. The image recognition simulator device according to claim 1, wherein the distance image error exclusion unit includes an image division unit that divides the image based on the region-divided image obtained by the region division calculation unit, and a distance acquisition unit that acquires a distance for each image divided by the image division unit and excludes the stereo matching errors.

3. The image recognition simulator device according to claim 2, wherein the virtual object installation unit determines the installation position and time of the virtual object based on movement information of the virtual object and the travel distance of the own vehicle, and installs the virtual object based on the determined result.