JPWO2004008744A1

JPWO2004008744A1 - Planar development image processing method, reverse development image conversion processing method, plane development image processing device, and reverse development image conversion processing device for plane object images such as road surfaces

Info

Publication number: JPWO2004008744A1
Application number: JP2004521184A
Authority: JP
Inventors: 岩根　和郎
Original assignee: IWANE LABORATORIES, LTD.
Current assignee: IWANE LABORATORIES, LTD.
Priority date: 2002-07-12
Filing date: 2003-07-11
Publication date: 2005-11-17
Anticipated expiration: 2023-07-11
Also published as: AU2003248269A1; WO2004008744A9; WO2004008744A1; JP4273074B2

Abstract

各種モニタカメラの遠近法で得られた斜め画像を変換することにより、地図で表現したような平面図に展開し表示するようにする方法及び装置である。通常のカメラから得られた遠近法の映像から平面展開に必要な情報を読みとり、数式により地図のような平面図に展開し、それを組み合わせ、つなぎ合わせ、一枚の大きな路面展開図にする。例えば通常のビデオ映像撮影カメラをバスの周りに取り付け、全ての視野を構成するように、複数のビデオカメラで、バスの外側の道路面をある角度、即ち伏角をもって撮影し、それを計算により、路面の映像を地図の映像のように展開し、その地図のように展開できた映像を処理をし、つなぎ合わせ、バスの周りの路面映像として生成する。これを数式を用いて計算により求める。
A method and apparatus for converting oblique images, captured in perspective by various monitoring cameras, into a plan view that is developed and displayed as if drawn on a map. The information needed for plane development is read from the perspective image obtained with an ordinary camera, the image is developed into a map-like plan view by mathematical formulas, and the developed images are combined and joined into a single large developed view of the road surface. For example, ordinary video cameras are mounted around a bus so that together they cover the entire field of view, and the road surface outside the bus is photographed at a certain angle, that is, a depression angle. The road-surface images are developed by calculation in the manner of a map image, and the developed images are processed and joined to generate an image of the road surface around the bus. All of this is obtained by calculation using mathematical formulas.

Description

本発明は、通常のカメラで撮影した場合、画面が遠近法で撮影されるがそれを地図の画面のように平面に展開する道路面等の平面対象物映像の平面展開画像処理方法、同逆展開画像変換処理方法及びその平面展開画像処理装置、逆展開画像処理装置に関するものである。
具体的には、通常のカメラで撮影した映像を展開し、例えばバスの周りの画像を複数のカメラで撮影し、夫々を路面上の平面路面として展開し、バスの周りの道路面を、バスの上方から路面を地図で見たようにそれらの展開画像を結合して一枚の平面路面として表示されるようにし、この平面画像展開により、リニアスケールとなることで画像の結合が可能となり、また、展開した平面画像内で対象物についての画像認識や計測処理が可能となり、さらにこの画像変換原理を逆に使って道路面を平面に展開した後、視点を移動して、再度、遠近法画像に戻すことにより、最初の遠近法画像とは視点を変えた映像を得ることを可能とし、さらにこの画像展開の原理を道路面だけではなく画像内の任意平面について施すようにしたものである。
また、平面展開された画像にオプティカルフロー（又は視差、マッチング等）処理を施すことで任意平面を他の平面から抽出するのであり、さらに平面の凹凸を検出し、その凹凸の平面からのずれを求め、また、オプティカルフローを用いることにより、移動距離速度方向を抽出したり、平面展開された画像にＣＧ（コンピュータグラフィックス）画像からなる物体を置き、それに対応するテクスチャーを貼り付け、平面としてあるいは遠近法画像として表示したり、遠近法画像を平面展開するときに画面内の平行線となるべき線成分が平行になるようにすることで、カメラの伏角θを求め、あるいは遠近法画像の中から画面内の平行線となるべき線成分が交差する点即ち消失点を求めることにより、カメラの伏角θを求めたり、その伏角θを一定にするように画像を移動することでカメラのぶれを補正したりすることを可能にする平面に展開する方法とその装置さらにはその応用に関するものである。
さらに、平面展開された画像情報とカメラの位置情報を送受信することで所望の動画像を再構成することができ、データ伝送量を可能な限り小さくしながら動画像データを高速に送受信できるようにするものである。
The present invention relates to a plane-development image processing method for images of planar objects such as road surfaces, which takes an image that an ordinary camera captures in perspective and develops it onto a plane, like a map. It also relates to a reverse-development image conversion processing method, a plane-development image processing device, and a reverse-development image processing device.
Specifically, images captured by ordinary cameras are developed. For example, the surroundings of a bus are photographed by several cameras, each image is developed as a flat road surface, and the developed images are combined and displayed as a single flat road surface, as if the road around the bus were viewed on a map from above the bus. Because plane development yields a linear scale, the images can be combined, and image recognition and measurement can be performed on objects within the developed plane image. Furthermore, by applying the conversion principle in reverse (developing the road surface onto a plane, moving the viewpoint, and converting back into a perspective image), an image with a viewpoint different from that of the original perspective image can be obtained. The same development principle can be applied not only to the road surface but to any plane in the image.
In addition, optical flow (or parallax, matching, or similar) processing is applied to the plane-developed image to separate an arbitrary plane from other planes. Unevenness of the plane is detected and its deviation from the plane is determined, and the distance, speed, and direction of movement are extracted using optical flow. An object consisting of a CG (computer graphics) image can be placed in the plane-developed image, the corresponding texture pasted onto it, and the result displayed as a plane or as a perspective image. The camera depression angle θ is obtained either by making line components that should be parallel become parallel when the perspective image is developed onto a plane, or by finding the point in the perspective image at which such line components intersect, that is, the vanishing point. Camera shake is corrected by shifting the image so that θ is held constant. The invention relates to this method of development onto a plane, to the device for it, and to its applications.
Furthermore, a desired moving image can be reconstructed by transmitting and receiving the plane-developed image information and the camera position information, so that moving image data can be exchanged at high speed while keeping the amount of transmitted data as small as possible.

従来、ＣＣＴＶ（ＣｌｏｓｅｄＣｉｒｃｕｉｔＴｅｌｅｖｉｓｉｏｎ）カメラ、あるいは各種モニタカメラではレンズの性質から、遠近法で撮影されてしまうのが当然であった。例えば、バスの後部に着いているバスの後方の状況を撮影するカメラでは、カメラを垂直下向きに取り付けない限り、当然遠近法による映像が撮影され、道路面を真上から見た映像を得ることはできなかった。ましてや、バスの両側の側方、前方、後方を含めてバックミラー等により見て、部分的に見えたり、死角があったり、ゆがんだ像として見えているのが現実であった。つまり、通常のカメラで撮影したのみでは、遠近法で撮影されてしまい、地図のように平面に展開した映像を描くことはできなかった。
また、映像の結合技術としては、斜め映像を結合する技術は存在するが、それは原理的に同一の撮影位置から視点を変えて撮影した複数の画像の結合であり、カメラ位置を任意に変えて撮影した複数の映像を展開し、結合し、現実の対象物と相似形に表示する技術は存在していない。そればかりでなく、ビデオ画像から視点を変えた映像を得る方法や装置は過去にはなく、さらに、複数の遠近法画像から画像処理技術で平面の凹凸を検出する手法もその凹凸の平面からのずれを求める方法も存在しなかった。
そしてまた、ビデオの遠近法画像から平面展開した画像を用いて移動距離速度方向を抽出する方法も装置も存在しないし、さらに、平面展開された画像から目的以外の画像を削除してＣＧ画像にテクスチャーを貼り付け、平面としてあるいは遠近法画像として表示する方法も装置も存在しないし、また、画像内から平面展開に必要な方程式の変数を抽出して演算する的確な方法と方程式も存在しないものであった。
しかも、平面展開の数学的解を求めようとしても、現実の座標と対応付けするために、６点から１２点以上の実測された対応点が必要であり、特に動画においてはその条件を満たすことは実質不可能である。
さらに、ビデオ映像等の動画像は、複数の装置間等においてデータを伝送しようとする場合、伝送データの圧縮が行われるが、従来の圧縮方法は、動画像中の静止背景上にある動きのある部分のみを分離・圧縮する方法が採られており（例えばＭＰＥＧ２方式）、このような圧縮方法では、カメラ自体が移動して画像全体が動き成分をもつ動画像については十分な圧縮効果が得られず、結果としてデータ伝送を行うことができなかった。
そこで本発明は叙上のような従来存した諸事情に鑑み創出されたもので、各画像毎に現実の対象物の複数の実測点を用いることなく、カメラの遠近法で得られた映像を、カメラの撮影条件と撮影された画像内の情報のみで平面展開しようとするのであり、さらに、それを結合して地図で表現したような平面図に展開して表示しようとするものである。
例えば、バス等から外部の状況を見るときに、前面から見た映像、側方から見た映像、後方から見た映像を使用して、それを展開し、つなぎ合わせ、バスの上方から見たその周りの道路面を、地図のように表示し、自車両の位置を平面図の中に表示し、運転をより安全にしようとするものである。
即ち、各種モニタカメラの遠近法で得られた斜め画像を変換することにより、地図で表現したような平面図に展開し表示しようとするもので、例えばバス等から外部の状況を見るときに、前面から見た映像、側方から見た映像、後方から見た映像を使用して、それを展開し、つなぎ合わせ、バスの上方から見たその周りの道路面を地図のように表示しようとするのであり、このような平面画像展開により、リニアスケールとなることで画像の結合、さらには画像内での計測が可能となるようにする道路面等の平面対象物映像の平面展開画像処理方法、同逆展開画像変換処理方法及びその平面展開画像処理装置、逆展開画像処理装置を提供することを目的とするものである。
また、上記原理の応用として、道路面を平面に展開した後、視点を移動して、再度、遠近法画像に戻すことにより、最初の遠近法画像とは視点を変えた映像を得ることができるものとしたり、道路以外の任意平面を抽出する際にオプティカルフロー（又は視差、マッチング等）を用いることで障害画像を削除して目的の画像のみを抽出したり、複数の遠近法画像から平面の凹凸を検出する手法、その凹凸の平面からのずれを求められるようにしたり、複数の遠近法画像からオプティカルフローを用いることにより、移動距離速度方向を抽出したり、さらに平面展開された画像にＣＧ画像からなる物体を置き、それに対応するテクスチャーを貼り付け、平面としてあるいは遠近法画像として表示したり、遠近法画像を平面展開するときに実測値をできるだけ用いなくて済むように画面内の平行線となるべき線成分が平行になるようにすることで、カメラの伏角θを求めたり、あるいは遠近法画像の中から画面内の平行線となるべき線成分が交差する点、即ち消失点を求めることにより、カメラの伏角θを求めたり、また、その伏角θを一定にするように画像を移動することでカメラのぶれを補正したりすることができる道路面等の平面対象物映像の平面展開画像処理方法、同逆展開画像変換処理方法及びその平面展開画像処理装置、逆展開画像処理装置を提供することを目的とするものである。
そして、遠近法画像を平面展開した複数の画像とカメラの位置情報から所望の三次元画像を再構成できることから、平面展開画像とカメラ位置情報のみを送信することで、受信側で元の動画像を再現することが可能となり、伝送データを可能な限り小さくすることができ、帯域の狭い回線等であっても所望の対象物の動画像をデータ通信できる道路面等の平面対象物映像の平面展開画像処理方法、同逆展開画像変換処理方法及びその平面展開画像処理装置、逆展開画像処理装置を提供することを目的とするものである。
Conventionally, CCTV (closed-circuit television) cameras and other monitoring cameras have, by the nature of their lenses, inevitably captured images in perspective. For example, a camera mounted at the rear of a bus to view the scene behind it captures a perspective image unless it is mounted pointing vertically downward, so an image of the road surface as seen from directly above cannot be obtained. Worse, the areas to the sides, front, and rear of the bus, viewed through rearview mirrors and the like, are in practice only partly visible, contain blind spots, or appear distorted. In short, an ordinary camera alone captures images in perspective and cannot produce an image developed onto a plane like a map.
As for image-combining techniques, there are techniques for combining oblique images, but in principle they combine multiple images taken from the same shooting position with different viewing directions. No technique exists that develops and combines multiple images taken from arbitrarily different camera positions and displays the result in a shape geometrically similar to the real object. Moreover, no method or device has existed for obtaining an image with a changed viewpoint from a video image, nor any method for detecting the unevenness of a plane from multiple perspective images by image processing or for determining the deviation of that unevenness from the plane.
Nor is there any method or device for extracting the distance, speed, and direction of movement using images plane-developed from perspective video images, or for deleting unwanted content from a plane-developed image, pasting texture onto a CG image, and displaying the result as a plane or as a perspective image. There has also been no accurate method, and no set of equations, for extracting from within the image the variables of the equations needed for plane development and computing with them.
Moreover, obtaining a mathematical solution for plane development requires from 6 to 12 or more actually measured corresponding points in order to relate the image to real coordinates, a condition that is practically impossible to satisfy, especially for moving images.
Furthermore, when moving images such as video are transmitted between devices, the transmitted data is compressed. Conventional compression methods (for example, MPEG-2) separate and compress only the moving parts against a still background. They therefore do not compress sufficiently when the camera itself moves and the entire image has a motion component, and as a result data transmission could not be carried out.
The present invention was devised in view of these circumstances. It aims to develop an image obtained in perspective by a camera onto a plane using only the camera's shooting conditions and the information within the captured image, without using multiple actually measured points on the real object for each image, and further to combine the results and display them as a plan view like a map.
For example, when viewing the surroundings from a bus or similar vehicle, the images seen from the front, sides, and rear are developed and joined so that the road surface around the bus, as seen from above, is displayed like a map with the position of the vehicle itself shown in the plan view, making driving safer.
That is, oblique images obtained in perspective by various monitoring cameras are converted so as to be developed and displayed as a plan view like a map, as in the bus example above. Because such plane development yields a linear scale, images can be combined and measurements can be made within the image. An object of the invention is to provide a plane-development image processing method for images of planar objects such as road surfaces, a reverse-development image conversion processing method, a plane-development image processing device, and a reverse-development image processing device that make this possible.
As applications of the above principle, further objects of the invention are to provide methods and devices of the same kind that can do the following: obtain an image with a viewpoint different from that of the original perspective image by developing the road surface onto a plane, moving the viewpoint, and converting back into a perspective image; remove obstructing image content and extract only the target image by using optical flow (or parallax, matching, or similar) when extracting an arbitrary plane other than the road; detect the unevenness of a plane from multiple perspective images and determine its deviation from the plane; extract the distance, speed, and direction of movement by applying optical flow to multiple perspective images; place an object consisting of a CG image in the plane-developed image, paste the corresponding texture onto it, and display it as a plane or as a perspective image; obtain the camera depression angle θ, with as little reliance on measured values as possible, either by making line components that should be parallel become parallel when the perspective image is developed, or by finding the point at which such line components intersect in the perspective image, that is, the vanishing point; and correct camera shake by shifting the image so that θ is held constant.
Finally, because a desired three-dimensional image can be reconstructed from multiple plane-developed images and the camera position information, the original moving image can be reproduced on the receiving side by transmitting only the plane-developed images and the camera position information. A further object is therefore to provide methods and devices of the same kind that keep the transmitted data as small as possible and allow moving images of a desired object to be communicated even over a narrow-bandwidth line.

上述した目的を達成するため、本発明にあっては、通常のカメラから得られた遠近法の映像から平面展開に必要な情報を読みとり、数式により地図のような平面図に展開し、それを組み合わせ、つなぎ合わせ、一枚の大きな路面展開図にするというものである。例えば通常のビデオ映像撮影カメラをバスの周りに取り付け、全ての視野を構成するように、複数のビデオカメラで、バスの外側の道路面をある角度、即ち伏角をもって撮影し、それを計算により、路面の映像を地図の映像のように展開し、その地図のように展開できた映像と、ビルの壁面の映像に関して同様の処理をし、つなぎ合わせ、バスの周りの路面とビルの映像として生成しようというものである。あるいは、建築物内部の床面、壁面の画像を撮影し、斜めに撮影された画像を平面に展開し、例えば部屋の壁を開いて、部屋をちょうど展開したような図面にするものである。これを数式を用いて計算により求めるのである。
そして、平面展開した複数の画像データと、カメラの位置情報データから所望の三次元の動画像を再構成することができる。これによって、例えば、撮影用のカメラとモニタ用の表示部とが離れている場合でも、カメラ側から平面展開画像とカメラ位置情報を送信することで、モニタ側で元の動画像を再現することが可能となる。平面画像は静止画像であり、斜め画像の動画像と比較してデータ量が格段に小さいため、帯域の狭い回線等で接続された装置間であっても自由にデータ伝送でき、それを受信側で再構成することで、所望の対象物の動画像をデータ通信できることになる。
具体的には、リアルタイム処理としての斜め画像平面展開装置は、例えば映像入力装置としてのＣＣＴＶカメラ又はデジタルスチルカメラ、映像再生部、画像補正部、球面収差補正部、映像展開平面処理部、展開画像結合部、表示部、記録部からなるものである。
また、オフライン処理としての斜め画像平面展開装置は、斜め画像記録済みの映像再生装置、画像補正部、映像展開平面処理部、展開画像結合部、表示部からなるものである。
平面展開として、平面を含む現実場面の対象を斜めから撮影した画像に関して、数学的演算により、元々が平面で構成されている面を、現実場面の平面と比例関係に（相似形に）となる平面画像として、平面に展開して表示するのである。
その平面展開の結合として、上記の方法で得られた、複数の平面展開画像を結合して、一枚の大きな平面展開画像として表現するのである。
また、複数のＣＣＴＶ映像による全方位表示とダイレクト表示としては、複数のＣＣＴＶ映像を上記装置で平面展開し、夫々画像を結合して、一枚の画像とし、目的領域の全域を表示し、必要に応じて、その表示された場所に対応したＣＣＴＶの斜め映像をも同時に表示させるのである。
移動方向の連続結合としては、移動する車両や航空機や船舶等にカメラを積載して撮影することで得られた移動方向の映像を平面展開して、連続結合して一枚の画像とするのである。
また、重複する対象物を一部含む複数の画像を結合する際に、移動物体が写っている場合は、その移動物体の映像を避けて画像結合させることで、静止物体のみの結合画像を生成するのである。
いわばθ方式として、ＣＣＴＶ等で得られた斜め映像を、平面画像に展開するにあたり、斜め映像から光軸位置θを読みとり、カメラ高さｈ、撮影レンズのｆ値、あるいはモニタ上の仮想ｆ値を読みとり、目的の場所の座標を以下の式
ｙ＝ｖ・２^１／２・ｈ・ｃｏｓ（π／４−θ）・ｃｏｓ（β−θ）／（ｆ・ｓｉｎβ） ………（１）
ｘ＝ｕ・ｈ・ｃｏｓ（β−θ）／（ｆ・ｓｉｎβ） ………（２）
のように表現する式、及び同じ意味を持つ式を用いることで、現実の世界の対象物の位置座標や大きさを既知の情報として与えることなく、撮影された映像内から読みとれる情報とθ、ｈ、ｒ等の撮影条件の情報を与えることで、現実の世界の座標系と、画像モニタ上の座標系を関連させて座標変換を行うのである。
ただし、θはカメラの光軸と道路面のなす角度、ｆはカメラの焦点距離、ｈはカメラの高さ、βはカメラの真下からｈ＋ｙの距離にある点と、カメラを結ぶ線分と道路面のなす角度、ｖはカメラにおける映写面であるＣＣＤ（取得映像）面上の原点から縦方向の座標、ｕはＣＣＤ（取得映像）面上の原点から横方向の座標である。また、ｙは道路面におけるカメラの真下からｈ進んだ点を原点として、そこからさらに光軸方向に進んだ距離即ち座標、ｘは道路面における横方向の距離即ち座標である。また、垂直壁面を平面展開する場合は、座標を九十度傾けて処理すればよいのである。なお、数式はこの式のみではなく、他の同様な遠近法と平面とを関係づける数式であっても良い。
ここで、計算に必要な数値、即ち目的の平面と光軸の成す角θと、画像内の任意の点に対応する現実点とカメラ等のなす角β、及び（β−θ）等の値は現実世界（実写映像）の中にある物理量であるので、当然実測することで得られる。ただ実際問題として、実測しながら複数の場所で、しかもカメラが移動する動画一枚一枚についてそれらを計測することは事実上不可能であるので、この変換式の性質から以下のようにすれば画像内から求めることができる。
光軸の画像内位置を決める多くの場合、画像の幾何学的中央が光軸位置であるが、正確に求めるにはカメラ等の撮影機材の光学中心をコリメータ等で求めておくことで実現でき、一度計測すれば、その位置はレンズを含むカメラ系に固有の値としてその光軸位置を画像内の一点として得られる。
次に、前記変換式における最も重要なθの画像内計測の一例を述べると、現実世界での平行線の部分を画像内から経験的に探し出し、その平行線の延長は画像内では交点（消失点、即ち遠近法で図面を書いたときの遠くの点で交わる点であり、パース図等でのＶａｎｉｓｈｉｎｇＰｏｉｎｔである。例えば、直線道路を遠近法で書いたとき、遠くで道路は一点になるのであり、その点が消失点である。）として表示されるから、その光軸点を含む目的平面に平行な面である平面ａと、その交点を含む目的平面に平行な面である平面ｂとの距離をｄとし、このｄと仮想焦点距離ｆとの比から、θ＝ａｒｃＴａｎ（ｄ／ｆ）としてθを求めることができる。
ここで、仮想焦点距離を求める一例を示すと、前もって現実空間における任意の対象物を見込む角と、表示された画像内の同一対象物を見込む角が同じになるような光軸上の距離を表示画像上で求めておけばよく、このときの単位をピクセルで表せば、レンズを含むカメラ系に固有の値となり、一度求めておけばよいことになる。さらには現実世界の対象物の平行線部分を探し、それが画像内では延長線上で交点を持つ交差線として表現されているから、この交差線が平面展開したときに、平行となるようにθを選択することでθを求め、あるいはθの微調整をすることができる。
なお、θは実測によって求めることもでき、例えば、移動体である自動車等に取り付けたカメラが伏角としてどの程度の角度で下方を向いているかを、単純な方法である分度器によって、また、もし正確性を期すのであれば専用の角度計測装置によって測定することで得ることができる。
また、取得した平面画像中において、消失点を求める過程で、各画像の消失点の位置を固定するように映像を移動して表示することで、手ぶれ等で揺れる映像を安定化することができる。即ち、これは画像内の目的平面内の平行線成分の部分を延長し、交点を求め、それが前記消失点となるとき、その消失点の位置がカメラの移動や揺れにともなって変動することになって、ブレを生じるが、その消失点の位置を一定に固定するように、映像全体を移動させて表示することにより、手ぶれ等で揺れる画像が安定化するのである。
平面展開によって得られた、異なる複数の平面から構成される動画像の、その中の微小領域の単位時間移動量をオプティカルフロー（本明細書・図面において、「Ｏｐｔ．Ｆ」と省略することもある）手法により必要な範囲で求め、その成分分布図から同一成分を抽出することにより、夫々単独の平面画像を分離することができる。このオプティカルフロー手法を使用するに際し、遠近法画像のオプティカルフローは、一般に同一平面内であっても、移動方向でも同一値とはならずに距離により異なる値をとるが、平面変換された平面内のオプティカルフローは同一の値をとるという性質を利用したものである。つまり平面展開により得られた異なる複数の平面が重なる画像であっても、オプティカルフローの成分分布全体図から、同一成分を抽出することにより、夫々単独の平面画像を分離することができ、例えば平面展開した画像においては、平行な平面を構成するビル等の建物の壁面と、道路のガードレール面や街灯面をオプティカルフローの値で分離することができるというものである。
なお、オプティカルフローは、複数の画像の中で対応する夫々の点がどのように動いたかを示す流れのことであり、複数の画像の中で動きがあればオプティカルフローの流れがあり、その動きを線で表示でき、動きがなければオプティカルフローの流れは線で表されなくなるのであり、対応する点が複数の画像内で移動したかどうかを知ることが重要である。
さらに、平面変換された動画像から得られた平面動画像において、オプティカルフローの分布図を生成し、その微小差から平面からのズレとして平面の凹凸を検出し、若しくは前記平面動画像において異なる画角から得られた平面画像を比較演算することにより視差を検出して、その成分分布から平面内の凹凸成分を検出し、この検出した凹凸値で元平面図の各点の平面からのズレを含めた修正平面図を生成するのであり、複数の平面図に展開された複数の平面画像を相関法若しくはマッチング法等の手法によって比較演算することにより、道路面等の複数の平面画像上の夫々の小領域毎に、夫々が対応する小領域の移動量を視差方式若しくはオプティカルフロー方式等により求め、その成分の分布から道路面等の凹凸等の三次元データを検出し、若しくは検出した三次元凹凸値で元平面図の各点の平面からのズレを含めた修正平面図を生成することができる。つまり、オプティカルフローの分布図を生成し、上記の原理を使ってその微小差から平面からのズレとして平面の凹凸等を検出し、異なる画角から得られた平面画像を比較演算することによって視差を検出し、その成分分布から平面内の凹凸成分を検出し、若しくは検出した凹凸値で元平面図の各点の平面からのズレを含めた修正平面図を生成するのである。また、平面に起伏や凹凸がある場合等において、変換された平面画像のオプティカルフローの微小差を検出することができれば、それは平面からのズレを意味しているので凹凸の分布図を生成することができるのである。また、動画あるいは複数のカメラで取得した画像による異なる方向から観察した道路面の平面図に展開された画像が複数あることから、それらを相関法あるいはマッチング等の手法を用いて比較演算し、道路面等の複数の平面画像状の夫々の微小領域毎に、視差を求めること等により、その微小領域の成分の差から移動量を求め、対応する点を組み合わせ計算することにより、道路面の凹凸を検出するというものである。
このことは、視野の重複する動画像から平面展開画像を取得し、それらから視差若しくはオプティカルフローを求めることで、道路面の凹凸に限らず、平面展開した画像内で重複する全ての対象物について、三次元データを検出することができることを意味している。
なお、オプティカルフローによる作業は、すべて視差によっても代行することができ、また、マッチングによって代行することができる。従って、本発明において「オプティカルフロー」という場合には、オプティカルフロー、視差又はマッチング等のいずれの処理であっても良いことを意味する。
そしてまた、平面展開した連統画像の平均的オプティカルフロー値、若しくはマッチング対応位置の移動距離を求め、その値から対象平面の移動距離・移動速度・移動方向、若しくは撮影したカメラの移動距離・移動速度・移動方向を求めることができる。即ち、平面展開された同一平面のオプティカルフローは一定値をとるという性質から、目的の平面のオプティカルフローからカメラの移動速度を求めることができるのであり、これはカメラと対象平面との相対位置、相対速度であることから、静止系と移動系が逆転しても同じである。また、ここで視差は幾何学的にはオプティカルフローと同じ意味を持つのであり、平面展開した連続画像の広い領域のオプティカルフロー又は視差を求め、それを用いて、対象となる平面の移動距離、移動速度、移動方向あるいは撮影したカメラの移動距離、移動速度、移動方向を求めることができる。
分離され平面展開された単独平面内の対象物平面のテクスチャーを、場所の対応するＣＧ（コンピュータグラフィックス）画像若しくは地図画像内の対象物平面に貼り付けることで、ＣＧ画像若しくは地図画像に実写画像を取り込み、平面として、若しくは逆変換して遠近法画像として表示することができる。即ち、平面展開された同一平面は同じオプティカルフローを持つという性質から、混在する複数の平面の中から、目的平面のテクスチャーのみを切り出すことが可能である。分離された平面展開された単独平面内の対象物平面のテクスチャーを、対応するＣＧ画像あるいは地図画像内の対象物平面に貼り付けることにより、ＣＧ画像あるいは地図画像に実写画像を取り込み、平面として、あるいは逆変換して遠近法画像として表示するのである。
また、前記の式（１）及び（２）において、先ずｆとｈを与え、さらに対象物の平行線が画像内で持つ交点を形成する交差線であるとき、この交差線が平面展開したときに平行となるようにθを選択することで、θを求めることができ、選択するθは微調整することができる。これは、道路等の特定の対象物においては、その対象物自体は多くの場合平行線となる部分を持っているという性質を利用するのであり、対象物若しくは対象物群が作る平行線は、遠近法の画像内では、その延長線分が交点を作り、この交点において交差する線分が平面変換された平面画像内で平行線となるようにθを選択することにより、θを求めることができるのである。
実写映像中の平行線の部分を画像内から抽出し、その交点のつくる目的平面に平行な面である平面ａと、光軸点を含む目的平面に平行な面である平面ｂとの距離をｄとし、仮想焦点距離をｆとし、これらのｄとｆとの比から、θ＝ａｒｃＴａｎ（ｄ／ｆ）として、θを求めることができる。即ち、目的の平面と光軸の成す角θは現実世界（実写映像）の中にある物理量であるので実測するべき量であるが、それを画像、及び動画像内の各フレーム画像の中から求めるために、現実世界（実写映像）での平行線の部分を画像内から経験的に探し出し、その交点のつくる目的平面に平行な面を平面ａとし、光軸点を含む目的平面に平行な面を平面ｂとし、また平面ａと平面ｂとの距離をｄとし、仮想焦点距離をｆとし、このｄとｆとの比から、θ＝ａｒｃＴａｎ（ｄ／ｆ）としてθを求めるのである。これは、道路面等の目的の平面とカメラの光軸のなす角度θを求めるために、現実世界での平行線の部分を画像内から経験的に探し出し、その交点と平行線部分の作る道路面等の目的平面と、光軸の画像内の位置とを前もって計測し、あるいは画像の幾何学的中心を近似的な光軸とし、その光軸点を含む道路等の目的平面に平行な面との距離を仮想焦点距離との比を求め、そのアークタンジェントを求めることによりθを求めるのである。
また、異なる設置場所に設置した複数の通常のカメラによって同一地点の同時映像を複数取得し、その複数の同一地点同時映像の平面展開画像を比較演算することで視差を検出し、この視差から対象物の三次元形状を生成することができる。これは上述の例においては、一台のカメラからの映像を平面展開することで、あるいは視点の異なる複数のカメラの結合による平面展開画像を得るものであったが、視点を重複させた複数のカメラ等を用いることで、同一地点の映像を異なる地点から撮影した映像を、平面展開画像として取得した後に重複部分の平面展開画像内で視差を検出するのである。つまり、これまで視差の検出による三次元データの検出は常に原画像、即ち遠近法の画像そのものから得られていたが、ここでは平面展開画像処理をしてから視差を検出するという新しい方法により、位置精度のよい三次元データを得ることができるようになり、これによって従来より更に精度のよい直接三次元形状のデータを簡単に取得可能である。ここで、視野の重複する複数の映像から得られる視差は、動画内の静止座標形においてはオプティカルフローとしばしば同一の意味を持つが、対象物が時間変化する場合や、三次元座標の精度を上げる場合にはこのような複数のカメラによって同一地点の同時映像を複数取得することが有効である。
一方、はじめから地図や平面図を用意して、それらのあらゆる平面図や平面写真やＣＧ（コンピュータグラフィックス）画像等を元にして、先の変換とは反対の逆変換を行い、結果的に先の視点の映像とは異なる任意の視点からの遠近法の映像を生成することができる。また、ビデオ画像の各フレーム画像を連続的に逆変換をすることで、視点移動を繰り返し、実際には撮影していない、仮想の移動するカメラ視点によるビデオ動画像を生成することができる。
具体的には、平面映像を含む遠近法的に表現された映像を平面図に変換して生成した平面展開図、若しくは複数の方向から撮影された複数の平面映像を含む映像を平面図に展開した後に対応点を重ねることで結合して生成した一枚の大画面平面展開図、又は平面図状のＣＧ（コンピュータグラフィックス）画像や地図を元として、前記記載の式（１）及び（２）に対する逆変換式によって任意の視点から見た仮想の遠近法画像を生成し、若しくは連続的に処理をすることで仮想の移動するカメラ視点による動画を生成することができ、視点を変えた後に、逆変換をする方法による具体的な逆変換式は、下記の式（３）及び（４）によるものとする。
ｖ＝ｙ・ｆ・ｓｉｎβ／（２^１／２・ｈ・ｃｏｓ（π／４−θ）・ｃｏｓ（β−θ）） …………………（３）
ｕ＝ｘ・ｆ・ｓｉｎβ／（ｈ・ｃｏｓ（β−θ）） …………………（４）
ただし、ｈはカメラの道路面からの高さ、θはカメラの光軸と道路面のなす角度、ｆはカメラの焦点距離、βはカメラの真下からｈ進んだ点からｙだけ先へ進んだ点とカメラのレンズとを結ぶ線分と、道路面との成す角度、ｘはカメラの光軸を道路面に正射影して得られる線分から垂直方向すなわちカメラから見て横方向の座標、ｙはカメラの真下からｈ進んだ点を原点としたときの光軸方向の座標、ｖはカメラにおける映写面であるＣＣＤ面上の縦方向の座標、ｕはカメラの映写面であるＣＣＤ面上の横方向の座標である。なお、数式はこの式のみではなく、他の同様な遠近法と平面とを関係づける数式であっても良い。
また、平面展開された画像によって、画像上での計測処理、画像認識処理等の各種の認識処理を可能にするのであり、平面展開された画像のスケールはリニアスケールとなり、画像上で計測や、画像処理、画像認識等が非常に容易に行なわれる。そして、オプティカルフローもカメラとの相対速度に比例する形で得られることから、対象物の相対速度も距離に依存せずにリニアスケールで表現されるため、計測のみならず、画像処理認識においても極めて単純化されるのである。
応用平面としての一例は、平面展開面として、道路面・海上面・湖水面・河川面・地上面・垂直壁面・同一平面に配列された対象物が作る垂直仮想平面・建築壁面床面・船の甲板面・滑走路誘導路等空港施設面等を扱うことができる。
応用機器としての乗り物の一例は、バス等の陸上乗り物における周辺道路面、ビル面、電柱の配列面、街路樹の配列面、ガードレールの配列面等、船舶等海上の乗り物の海上面等、船舶の甲板、壁面等、航空機等の滑走路、地上面等の全方位全面表示、あるいは目的領域面表示とすることができる。
さらに、他の応用例としての建築構造物では、建築物の床面、壁面等の平面部分を平面展開表示、及び平面結合表示するものである。
また、応用例としての立体地図作製は、複数のカメラで、移動する車両、航空機、船舶等で路面や地上面や水上面を連続撮影するのみならず、ビル壁面等のような垂直面、あるいは複数の電柱、ガードレール等が規則的に平面的に配列されている仮想垂直平面を持つ対象をも連続撮影することで、前記平面展開した画像を移動方向に結合延長させながら、同時に垂直面を含むより広範囲の平面垂直面展開図をつくることで、立体地図を作製するのである。
一方、上述した道路面等の平面対象物映像の平面展開画像処理方法、同逆展開画像変換処理方法に直接に使用される平面展開画像処理装置、逆展開画像処理装置は、遠近法画像を取得する映像入力部と、この映像入力部によって撮影された斜めの映像を再生する映像再生部と、映像入力装置による撮影回転角等を補正する画像補正部と、映像入力装置における球面収差等を補正する球面収差補正部と、遠近法画像を平面展開図に変換する映像展開平面処理部と、映像展開処理を行った映像を結合する展開画像結合部と、結合画像を表示する表示部とからなるものである。
また、展開された映像のオプティカルフローを生成して図示するオプティカルフローマップ生成部と、オプティカルフローマップから目的のオプティカルフローのみを抽出するオプティカルフロー抽出部とを備え、また、異なる位置からの同一地点の映像から視差を検出する視差抽出部を備え、また、複数の同一地点の展開画像を比較する展開画像比較部を備え、また、演算により路面凹凸を抽出する画像比較部と、その凹凸を考慮した修正平面生成部とを備えて構成することができる。
さらに、ＣＣＴＶカメラ又はデジタルスチルカメラ等のカメラにより映像を生成する映像入力部と、入力画像を安定化して表示する入力画像表示部と、入力映像を記録する映像記録部と、記録画像を再生する映像再生部と、球面収差等のレンズによる画像のゆがみを補正すめための座標変換を施し、カメラ回転角を補正するために、目的の平面映像を画像内の平面に方向を合わせる画像補正部と、数学的演算により遠近法映像から平面図を生成する映像展開平面処理部と、展開された映像のオプティカルフローを生成して図示するオプティカルフローマップ生成部と、それらのオプティカルフローマップから目的のオプティカルフローのみを抽出する、オプティカルフロー抽出部と、異なる位置からの同一地点の映像から視差を検出する視差抽出部と、必要な対象物を残し不必要な画像を削除し、さらには新しい映像を挿入する対象物画像処理部と、平面展開された処理された個々の画像を結合して一枚の連続した画像を生成する展開画像結合部と、それらを表示する展開画像表示部と、それらを記録する記録部と、任意視点に逆変換して表示する任意視点画像生成部と、その画像を表示する任意視点画像表示部と、複数の同一地点の展開画像を比較する展開画像比較部と、演算により路面凹凸を抽出する画像比較部、その凹凸を考慮した修正平面生成部とを適宜に組合せて構成したものである。
また、同様に道路面等の平面対象物映像の平面展開画像処理方法、同逆展開画像変換処理方法に直接に使用される逆展開画像処理装置は、任意視点に逆変換して表示する任意視点画像生成部と、その画像を表示する任意視点画像表示部とを備えて構成することができる。
さらに、平面展開画像処理装置、逆展開画像処理装置は、遠近法画像を取得する映像入力部と、この映像入力部によって撮影された遠近法画像を三次元空間を構成する一又は二以上の平面画像に分解する平面分解部と、映像入力部の三次元的位置を検出する位置検出部と、平面分解部で分解された平面画像と位置検出部で検出された映像入力部の三次元的位置から三次元画像を再構成して表示する表示部とを備える構成とすることができる。位置検出部で検出された映像入力部の三次元的位置を、平面分解部で分解された平面画像中に表記する位置表記部を備える構成とすることができる。また、映像入力部が移動する場合に、位置表記部は、移動する映像入力部の三次元的位置を、平面分解部で分解された平面画像中に連続的に表記する構成とすることができる。
そして、三次元画像を再構成する表示部が、平面分解部及び位置検出部と離間して配設される場合には、平面分解部及び位置検出部から表示部に一又は二以上の平面画像信号及び映像入力部の三次元的位置信号を送信する送受信手段を備える構成とすることができる。
以上のように構成された本発明に係る道路面等の平面対象物映像の平面展開画像処理方法、同逆展開画像変換処理方法及びその平面展開画像処理装置、逆展開画像処理装置において、映像入力装置によって取得された斜め映像である遠近法画像は、式（１）及び（２）によって平面展開図に変換され、実際的な地図様の画像として表示させられる。
取得生成された平面展開図の結合は一枚の大きな展開画像として、例えば、取得地・場所周囲の状況を含めて地図様に表示させ、目的領域の全域の全面表示、特定域のダイレクト表示等を選択させ、入力映像の同時のダイレクト表示によって周囲状況と容易に対比させる。
また、動画あるいは複数のカメラで取得した画像による異なる方向から観察した道路面の平面図に展開された画像が複数あるとき、それらを、相関法あるいはマッチング等の手法を用いて比較演算し、道路面等の複数の平面画像状の夫々の小領域毎に、視差方式あるいはオプティカルフローの手法等により、その小領域の成分の差から移動量を求め、対応する点を組み合わせ計算することは、道路面の凹凸を検出させ、平面に起伏や凹凸がある場合等において、検出した凹凸値を平面からのズレを含めた修正平面図を生成させる。
このことは視野の重複する動画像から平面展開画像を取得し、それらから視差若しくはオプティカルフローを求めることで、道路面の凹凸に限らず、平面展開した画像内で重複するすべての対象物について、三次元データを検出させる。
視点を重複させた複数のカメラによって同一地点の映像を異なる地点から撮影した複数の映像によって生成された平面展開画像は、重複部分の平面展開画像内で視差を検出させる。
生成される平面展開図は陸上、海上、空港施設面等、さらには建築構造物等における平面表示等の各種のものを展開させ、そして移動による平面展開図の生成取得で、移動方向による平面展開、垂直展開の結合延長で立体地図をも作製させる。
また、映像入力部、入力画像表示部、映像記録部、映像再生部、画像補正部、映像展開平面処理部、オプティカルフローマップ生成部、オプティカルフロー抽出部、視差抽出部、対象物画像処理部、展開画像結合部、展開画像表示部、記録部、任意視点画像生成部、任意視点画像表示部、展開画像比較部、画像比較部、修正平面生成部等を適宜に組合せて構成することは、夫々目的に応じた、例えば遠近法画像道路面を平面画像に展開したり、遠近法で表示されたビル壁面画像を平面画像に展開したり、ビル壁面とガードレール画像とを分離したり、平面画像に変換した後に視点を変えて再度遠近法画像に逆変換したり、テクスチャーを貼り付けた道路面あるいはビル面を表示したり、あるいは逆変換により遠近法画像に変換したり、さらには異なる視点からの画像を組み合わせた視差画像を得たり、それから道路面の凹凸を計算したり等の広範囲な応用を可能にさせる。
そして、平面展開した複数の画像データとカメラの位置情報データから、所望の三次元画像や立体地図等を再構成することができるので、撮影用のカメラとモニタ用の表示部とが離れて設置されているような場合でも、映像取得側（カメラ側）から平面展開画像とカメラ位置情報を送信することで、受信側（モニタ側）で元の動画像を再現することが可能となる。平面展開された画像データは静止画像であり、斜め画像の動画像と比較して格段にデータ量は小さい。従って、平面展開画像とそれを再構成するためのカメラ位置情報データを送受信することによって、データ伝送量を可能な限り小さくした動画像のデータ通信が可能となる。
To achieve the above objects, the present invention reads the information needed for plane development from a perspective image obtained with an ordinary camera, develops the image into a map-like plan view by mathematical formulas, and combines and joins the results into a single large developed view of the road surface. For example, ordinary video cameras are mounted around a bus so that together they cover the entire field of view, and the road surface outside the bus is photographed at a certain angle, that is, a depression angle. The road-surface images are developed by calculation in the manner of a map image, images of building walls are processed in the same way, and the results are joined to generate an image of the road surface and buildings around the bus. Alternatively, images of the floor and walls inside a building are captured obliquely and developed onto a plane, producing a drawing in which, for example, the walls of a room are opened out as if the room had been unfolded. All of this is obtained by calculation using mathematical formulas.
A desired three-dimensional moving image can then be reconstructed from the multiple plane-developed image data and the camera position data. Thus, even when the imaging camera and the monitor display are far apart, the original moving image can be reproduced on the monitor side by transmitting the plane-developed images and the camera position information from the camera side. A plane image is a still image, and its data volume is far smaller than that of a moving oblique image. Data can therefore be transmitted freely even between devices connected by a narrow-bandwidth line, and by reconstructing it on the receiving side a moving image of the desired object can be communicated.
Specifically, the oblique-image plane development device for real-time processing comprises, for example, a CCTV camera or digital still camera serving as the video input device, a video playback unit, an image correction unit, a spherical aberration correction unit, a video development plane processing unit, a developed image combining unit, a display unit, and a recording unit.
The oblique-image plane development device for offline processing comprises a video playback device holding recorded oblique images, an image correction unit, a video development plane processing unit, a developed image combining unit, and a display unit.
In plane development, for an image of a real scene that contains a plane and was photographed obliquely, a surface that is actually planar is developed by mathematical calculation and displayed as a plane image proportional (geometrically similar) to the plane in the real scene.
In combining plane developments, multiple plane-developed images obtained by the above method are combined and presented as a single large plane-developed image.
For omnidirectional display and direct display using multiple CCTV images, the CCTV images are plane-developed by the above device and combined into a single image that shows the entire target area. Where necessary, the oblique CCTV image corresponding to a displayed location is shown at the same time.
For continuous combination along the direction of travel, images in the direction of travel, obtained by a camera carried on a moving vehicle, aircraft, ship, or the like, are plane-developed and joined continuously into a single image.
When multiple images that partly contain the same objects are combined and a moving object appears in them, the images are joined so as to avoid the moving object, producing a combined image of stationary objects only.
In what may be called the θ method, when an oblique image obtained by CCTV or the like is developed into a plane image, the optical-axis position θ is read from the oblique image, along with the camera height h and the f value of the taking lens or the virtual f value on the monitor, and the coordinates of the target location are expressed by the following equations
y = v · 2^(1/2) · h · cos(π/4 − θ) · cos(β − θ) / (f · sin β) … (1)
x = u · h · cos(β − θ) / (f · sin β) … (2)
or by equations with the same meaning. In this way, without supplying the position coordinates or size of real-world objects as known information, coordinate conversion is performed by relating the real-world coordinate system to the coordinate system on the image monitor, using only information that can be read from the captured image together with shooting-condition information such as θ, h, and r.
Here, θ is the angle between the camera's optical axis and the road surface; f is the focal length of the camera; h is the height of the camera; β is the angle between the road surface and the line segment joining the camera to the point at a distance h + y from directly below the camera; v is the vertical coordinate measured from the origin on the CCD (acquired image) plane, which is the camera's projection plane; and u is the horizontal coordinate measured from the origin on the CCD (acquired image) plane. Further, y is the distance, that is, the coordinate, measured onward in the optical-axis direction from an origin located a distance h ahead of the point directly below the camera on the road surface, and x is the lateral distance, that is, the coordinate, on the road surface. To develop a vertical wall surface, the coordinates need only be rotated by ninety degrees. The equations are not limited to these; other similar equations relating a perspective image to a plane may be used.
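As an illustration only, equations (1) and (2) can be evaluated as in the following minimal Python sketch. The text defines β through the ground point itself and does not say how to obtain it from the image, so the sketch assumes an ideal pinhole camera with the image origin at the optical-axis point and v measured positive toward the horizon, which gives β = θ − arctan(v / f). That relation and the function name `develop_point` are assumptions made here, not part of the source.

```python
import math


def develop_point(u, v, theta, h, f):
    """Map image coordinates (u, v) to road-plane coordinates (x, y)
    using equations (1) and (2).

    u, v  : image coordinates relative to the optical-axis point, in the
            same unit as f (for example pixels); v is assumed positive
            toward the horizon.
    theta : angle between the optical axis and the road surface (radians).
    h     : camera height above the road surface.
    f     : focal length (or virtual focal length).
    """
    # ASSUMPTION (not stated in the source): pinhole model, so the ray
    # through image row v meets the road at angle beta = theta - atan(v / f).
    beta = theta - math.atan2(v, f)
    if beta <= 0.0:
        raise ValueError("the point lies at or above the horizon")

    # Factor shared by both equations: h * cos(beta - theta) / (f * sin(beta)).
    scale = h * math.cos(beta - theta) / (f * math.sin(beta))

    x = u * scale                                                     # equation (2)
    y = v * math.sqrt(2.0) * math.cos(math.pi / 4 - theta) * scale    # equation (1)
    return x, y
```

Under these assumptions, a camera 2 m above the road with θ = 45° and f = 500 px maps the image point (u, v) = (100, 500/3) to a road point about 0.85 m to the side and 2 m beyond the origin, which lies a distance h ahead of the point directly below the camera.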
Here, the numerical values necessary for the calculation, namely the angle θ formed by the target plane and the optical axis, the angle β formed with the camera for an arbitrary point in the image, (β − θ), and so on, are physical quantities in the real world (live-action video) and can therefore be obtained by actual measurement. In practice, however, it is virtually impossible to measure them at every location for a moving image taken while the camera moves. Owing to the nature of this conversion formula, they can instead be obtained from within the image.
In many cases the center of the image is taken as the optical axis position, but the position can be determined accurately by finding the optical center of the camera or other photographic equipment with a collimator. Once measured, the optical axis position is obtained as a point in the image, as a value unique to the camera system including the lens.
Next, an example of measuring θ, the most important quantity in the conversion formula, from within the image is described. Parallel-line portions in the real world are searched for empirically in the image, and the extensions of the parallel lines give an intersection in the image (the vanishing point, that is, the point at which lines meet in the distance when a drawing is made in perspective; for example, when a straight road is drawn in perspective, the road converges to a single point in the distance, and that point is the vanishing point). Let plane a be the plane parallel to the target plane that includes the optical axis, let plane b be the plane parallel to the target plane that includes the intersection, and let d be the distance between them. Then θ can be determined from the ratio of d to the virtual focal length f, as θ = arctan(d / f).
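The vanishing-point procedure can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, each line is given by two image points, and the camera is assumed to have no roll, so that d can be taken as the vertical pixel distance between the vanishing point and the optical axis point.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line through p3, p4."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if den == 0:
        raise ValueError("lines are parallel in the image; no vanishing point")
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def theta_from_vanishing_point(vp, axis_point, f):
    """theta = arctan(d / f), with d and f in the same units (pixels)."""
    # Assumption: no camera roll, so d is a purely vertical image distance.
    d = abs(vp[1] - axis_point[1])
    return math.atan(d / f)

# Two road edges that converge toward the top of a 640 x 480 image.
vp = line_intersection((0, 460), (160, 300), (640, 460), (480, 300))
theta = theta_from_vanishing_point(vp, axis_point=(320, 240), f=100)
```

In this example the two edges meet at (320, 140), which is 100 pixels above the optical axis point, so with f = 100 pixels the sketch yields θ = arctan(1).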
Here, as an example of how to obtain the virtual focal length, the distance along the optical axis may be determined on the displayed image such that the angle at which an arbitrary object is viewed in real space, measured in advance, equals the angle at which the same object is viewed in the displayed image. If this distance is expressed in pixels, it is a value specific to the camera system including the lens and needs to be obtained only once. Furthermore, because parallel-line portions of an object in the real world are found and appear in the image as lines whose extensions intersect, θ can be obtained, or finely adjusted, so that these intersecting lines become parallel when developed into a plane.
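One way to read the virtual focal length condition is sketched below in Python. The function name is hypothetical, and the sketch assumes the reference object is centered on the optical axis, so that an object subtending a known angle and spanning a known number of pixels fixes f by simple trigonometry.

```python
import math

def virtual_focal_length(span_pixels, view_angle):
    """Virtual focal length in pixels.

    span_pixels: extent of the reference object in the displayed image
    view_angle:  angle (radians) the same object subtends in real space
    Assumption: the object is centered on the optical axis.
    """
    return (span_pixels / 2) / math.tan(view_angle / 2)

# An object that subtends 90 degrees and spans 640 pixels.
f_virtual = virtual_focal_length(640, math.pi / 2)
```

Because the result depends only on the camera system including the lens, a value obtained this way could be stored and reused for every frame.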
Note that θ can also be obtained by actual measurement. For example, a protractor offers a simple way to determine how far downward a camera attached to a moving vehicle such as a car is facing; if accuracy is required, θ can be measured with a dedicated angle-measuring device.
In addition, in the process of obtaining the vanishing point in the acquired image, an image that shakes due to camera shake or the like can be stabilized by shifting each image for display so that the position of its vanishing point is fixed. That is, the parallel-line components in the target plane in the image are extended and their intersection is found as the vanishing point. The position of the vanishing point fluctuates as the camera moves or shakes, which blurs the image; by shifting the entire image for display so that the position of the vanishing point stays fixed, the shaking image is stabilized.
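The stabilization step can be illustrated with a small Python sketch that shifts a frame by whole pixels so that its vanishing point lands on a fixed reference position. The function name and the list-of-rows image representation are assumptions made for the example; sub-pixel shifts and rotation are not handled.

```python
def stabilize(frame, vp, ref_vp, fill=0):
    """Shift frame (a list of pixel rows) so that vp moves onto ref_vp.

    vp and ref_vp are (column, row) positions of the vanishing point in the
    current frame and in the reference frame. Uncovered pixels get `fill`.
    """
    dx = round(ref_vp[0] - vp[0])
    dy = round(ref_vp[1] - vp[1])
    rows, cols = len(frame), len(frame[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            sr, sc = r - dy, c - dx  # source pixel that lands on (r, c)
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = frame[sr][sc]
    return out

frame = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
# The vanishing point was found at (1, 1); the reference position is (2, 0).
shifted = stabilize(frame, vp=(1, 1), ref_vp=(2, 0))
```

Applying the same shift to every frame of a sequence keeps the vanishing point at one display position regardless of camera shake.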
The movement per unit time of a minute area in a moving image is referred to as an optical flow (abbreviated "Opt.F" in places in this specification and the drawings). In a moving image composed of a plurality of different planes obtained by plane development, a given planar image can be separated by extracting identical components, by a suitable method, from the optical flow component distribution map. The optical flow of a perspective image generally does not take the same value along the moving direction even within one plane, but varies with distance. In a plane-developed image, by contrast, the optical flow of a single plane takes the same value, and the method exploits this property. In other words, even when the image obtained by plane development contains a plurality of overlapping planes, each planar image can be separated by extracting identical components from the overall optical flow component distribution map. In the developed image, surfaces that form parallel planes, such as the wall of a building, the guardrail surface along the road, and the plane of a row of streetlights, can be separated by their optical flow values.
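Since each plane in the developed image has a single optical flow value, plane separation reduces to grouping small areas by flow value. The following Python sketch uses a simple greedy grouping with a tolerance; the grouping rule, the tolerance, and the function name are assumptions for illustration, since the text leaves the extraction method open.

```python
def separate_planes(flows, tol=0.5):
    """Group small-area indices whose optical flow values agree within tol.

    flows: one flow magnitude per small area of the plane-developed image.
    Returns a list of (representative_flow, [area indices]) per plane.
    """
    groups = []  # each entry: [running mean flow, [indices]]
    for idx, value in enumerate(flows):
        for group in groups:
            if abs(group[0] - value) <= tol:
                group[1].append(idx)
                # Update the running mean of the group.
                group[0] += (value - group[0]) / len(group[1])
                break
        else:
            groups.append([value, [idx]])
    return [(rep, idxs) for rep, idxs in groups]

# Areas 0, 1, 3 move at about 4 px/frame (for example the road surface),
# areas 2 and 4 at about 9 px/frame (for example a guardrail plane).
planes = separate_planes([4.0, 4.1, 9.0, 3.9, 9.2])
```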
The optical flow shows how each corresponding point moves across a plurality of images. If there is movement between the images, a flow appears; if there is no movement, no flow line is drawn. What matters is knowing whether the corresponding points have moved between the images.
Further, an optical flow distribution map is generated from the plane-converted moving image, and unevenness of the plane is detected, from the minute differences in the flow, as deviation from the plane. Alternatively, parallax is detected by comparing plane images obtained from different angles of view, and the uneven components in the plane are detected from the component distribution. A corrected plan view that incorporates the detected unevenness values as the deviation from the plane at each point of the original plan view is then generated. A plurality of plane images developed into plan views are compared by a method such as correlation or matching; for each small area of the plane images of a road surface or the like, the amount of movement of the corresponding small area is obtained by the parallax method or the optical flow method, and three-dimensional data such as unevenness of the road surface is detected from the distribution of the components. A corrected plan view including the deviation from the plane at each point of the original plan view can also be generated from the detected three-dimensional unevenness values.
In addition, if the plane has undulations or irregularities and a small difference in the optical flow of the converted plane image can be detected, that difference indicates a deviation from the plane, so an unevenness distribution map can be generated. Moreover, because a moving image, or images acquired by a plurality of cameras, yields a plurality of plan-view developments of the road surface observed from different directions, these are compared by a method such as correlation or matching: for each minute area of the plurality of plane images of the road surface or the like, parallax is obtained, the amount of movement is calculated from the difference in the components of that minute area, and corresponding points are combined in the calculation, whereby the unevenness of the road surface is detected.
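To make the link between a flow deviation and a height concrete, the following Python sketch uses a simple geometric model that is an assumption of this example, not a formula from the text: the camera translates parallel to the road at constant height h, so a point at height z above the road appears in the developed image with flow scaled by h / (h − z) relative to the road itself. The road flow is taken as the median of all flows, which is also an assumption.

```python
import statistics

def height_from_flow(flow_point, flow_plane, h):
    """Height above the plane implied by a flow deviation.

    Assumed model: flow_point = flow_plane * h / (h - z), which gives
    z = h * (1 - flow_plane / flow_point). A negative z means a dip.
    """
    return h * (1.0 - flow_plane / flow_point)

def unevenness_map(flows, h):
    """Per-area deviation from the plane for a list of flow magnitudes."""
    plane_flow = statistics.median(flows)  # dominant (road) flow
    return [height_from_flow(fl, plane_flow, h) for fl in flows]

# Camera 2 m above the road; one small area moves slightly faster.
heights = unevenness_map([10.0, 10.0, 10.0, 10.5, 10.0], h=2.0)
```

Under this model the area with flow 10.5 lies roughly 0.095 m above the road, and the values in `heights` could be used as the deviations in a corrected plan view.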
That is, by acquiring plane-developed images from moving images with overlapping fields of view and obtaining parallax or optical flow from them, three-dimensional data can be detected not only for the unevenness of the road surface but for every object that overlaps in the plane-developed images.
It should be noted that every operation performed with optical flow can also be performed with parallax or with matching. Accordingly, the term "optical flow" in the present invention covers processing by any of optical flow, parallax, or matching.
In addition, the average optical flow value of the continuous plane-developed images, or the movement distance of matched corresponding positions, is obtained, and from that value the movement distance, speed, and direction of the target plane, or of the photographing camera, can be obtained. That is, since the optical flow of a single plane developed into a plane takes a constant value, the moving speed of the camera can be obtained from the optical flow of the target plane. Because this is the relative speed between the camera and the target plane, the result is the same even if the stationary system and the moving system are interchanged. Further, parallax here is geometrically equivalent to optical flow: the optical flow or parallax over a wide area of the continuous plane-developed images is obtained and used to determine the movement distance, speed, and direction of the target plane or of the photographing camera.
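A minimal Python sketch of the camera-motion estimate follows. It assumes that the flow vectors come from the road plane of the plane-developed image, that the scale of the developed image (metres per pixel) is known, and that frames are captured at a fixed interval; the function and parameter names are hypothetical.

```python
import math

def camera_motion(flow_vectors, metres_per_pixel, frame_interval):
    """Distance, speed, and direction of the camera between two frames.

    flow_vectors: (dx, dy) pixel displacements of road-plane areas in the
                  plane-developed image between consecutive frames.
    Returns (distance in metres, speed in m/s, direction in radians).
    """
    n = len(flow_vectors)
    mean_dx = sum(dx for dx, _ in flow_vectors) / n
    mean_dy = sum(dy for _, dy in flow_vectors) / n
    # The road appears to move opposite to the camera.
    move_x = -mean_dx * metres_per_pixel
    move_y = -mean_dy * metres_per_pixel
    distance = math.hypot(move_x, move_y)
    return distance, distance / frame_interval, math.atan2(move_y, move_x)

# Road texture shifts 10 pixels per frame at 0.02 m/pixel and 25 frames/s.
distance, speed, direction = camera_motion([(0, -10), (0, -10)], 0.02, 0.04)
```

Because the developed image has a linear scale, the same conversion factor applies to every area of the plane regardless of its distance from the camera.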
By pasting the texture of a target plane, taken from a single plane that has been separated and plane-developed, onto the corresponding target plane in a CG (computer graphics) image or map image of that place, the live-action image can be incorporated into the CG image or map image and displayed either as a plane development or, by inverse transformation, as a perspective image. That is, because a single plane developed into a plane has a uniform optical flow, the texture of the target plane alone can be cut out from a plurality of mixed planes. The texture so obtained is pasted onto the corresponding target plane in the corresponding CG image or map image, so that the live-action image is taken into the CG image or map image, or is inversely transformed and displayed as a perspective image.
Further, in formulas (1) and (2) above, when f and h are given first and the parallel lines of the object appear in the image as lines that meet at an intersection, θ can be obtained, and the selected θ finely adjusted, by choosing θ so that those lines become parallel when developed into a plane. This exploits the property that a specific object such as a road often itself contains portions that form parallel lines. In the perspective image, the extended line segments meet at an intersection; θ is obtained by selecting the value for which the segments meeting at that intersection become parallel lines in the plane image obtained by plane conversion.
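The selection of θ can be sketched as a search over candidate angles, as in the Python example below. It is an illustration under stated assumptions: pixel coordinates (u, w) are measured from the optical axis point with w positive upward, the lateral coordinate follows formula (2), the distance along the road is computed from β directly, and a plain grid search stands in for whatever selection procedure is actually used. All names are hypothetical.

```python
import math

def to_road(u, w, theta, f, h):
    """Map a pixel to the road plane as (lateral x, distance from camera foot).

    Returns None when the ray does not hit the road in front of the camera.
    """
    beta = theta - math.atan(w / f)  # angle between the ray and the road
    if not 0.0 < beta < math.pi / 2:
        return None
    x = u * h * math.cos(beta - theta) / (f * math.sin(beta))  # formula (2)
    return x, h / math.tan(beta)

def parallel_error(theta, line_a, line_b, f, h):
    """|sin| of the angle between two image lines after plane development."""
    dirs = []
    for p, q in (line_a, line_b):
        a, b = to_road(*p, theta, f, h), to_road(*q, theta, f, h)
        if a is None or b is None:
            return math.inf
        dx, dy = b[0] - a[0], b[1] - a[1]
        n = math.hypot(dx, dy)
        dirs.append((dx / n, dy / n))
    (ax, ay), (bx, by) = dirs
    return abs(ax * by - ay * bx)

def estimate_theta(line_a, line_b, f, h, step=0.001):
    """Grid search for the theta that makes the developed lines parallel."""
    candidates = [i * step for i in range(1, int(math.pi / 2 / step))]
    return min(candidates,
               key=lambda t: parallel_error(t, line_a, line_b, f, h))
```

Each line is passed as a pair of pixel positions, for example two points picked on each edge of a straight road. The grid step bounds the precision of the result, so a finer step or a local refinement around the best candidate corresponds to the fine adjustment of θ mentioned above.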
Parallel-line portions in the live-action image are extracted from the image; the distance d between plane a, the plane parallel to the target plane that passes through their intersection, and plane b, the plane parallel to the target plane that includes the optical axis point, is obtained; and, with the virtual focal length denoted f, θ is obtained from the ratio of d to f as θ = arctan(d / f). That is, the angle θ between the target plane and the optical axis is a physical quantity in the real world (live-action video) that would ordinarily be measured, but here it is calculated from the image, for each frame of a moving image. In short, to find the angle θ between a target plane such as a road surface and the optical axis of the camera, parallel-line portions in the real world are found empirically in the image. Taking as the optical axis point either the optical axis position in the image measured in advance or, as an approximation, the geometric center of the image, the distance between the plane through the intersection of the parallel lines and the plane through the optical axis point, both parallel to the target plane, is divided by the virtual focal length, and the arc tangent of this ratio gives θ.
In addition, a plurality of simultaneous images of the same point are acquired by a plurality of ordinary cameras installed at different locations, and parallax is detected by comparing the plane-developed images of those simultaneous images, so that the three-dimensional shape of an object can be generated. In the examples above, the image from one camera is developed into a plane, or a plane-developed image is obtained by combining a plurality of cameras with different viewpoints. Here, by using a plurality of cameras or the like, video of the same spot taken from different points is first acquired as plane-developed images, and parallax is then detected in the overlapping portion of those plane-developed images. Until now, three-dimensional data has always been derived from parallax detected in the original image, that is, in the perspective image itself. The new method of detecting parallax after plane-development image processing makes it possible to obtain three-dimensional data with high positional accuracy, and thus to acquire three-dimensional shape data directly, more easily and more accurately than before. The parallax obtained from multiple images with overlapping fields of view often has the same meaning as the optical flow of the stationary coordinate system in a video; however, when the object changes over time, or when the accuracy of the three-dimensional coordinates is to be raised, it is effective to obtain a plurality of simultaneous images of the same point with a plurality of cameras in this way.
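How a parallax measured between two plane-developed images might be turned into a height can be sketched as follows. The geometry is an assumption of this example rather than a formula from the text: both cameras are at the same height h, they are separated by a baseline b parallel to the road, and the parallax p is the displacement of the same point between the two developed images, measured along the baseline in road units.

```python
def height_from_parallax(p, baseline, h):
    """Height above the road implied by parallax p between two developments.

    Assumed model: a point at height z projects onto the road plane at
    positions that differ by p = baseline * z / (h - z) between the two
    cameras, so z = p * h / (baseline + p). Points on the road give p = 0.
    """
    return p * h / (baseline + p)

# Cameras 3 m above the road and 1 m apart; measured parallax of 0.5 m.
z = height_from_parallax(0.5, baseline=1.0, h=3.0)
```

In this model a parallax of 0.5 m corresponds to a point 1 m above the road, and evaluating the function for every small area of the overlapping region gives a height map of the scene.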
On the other hand, starting from a map or plan view prepared in advance, or from any plan view, plan photograph, CG (computer graphics) image, or the like, the inverse of the preceding transform can be applied to generate a perspective video from an arbitrary viewpoint different from the original viewpoint. In addition, by inversely transforming each frame of a video in succession while moving the viewpoint, a moving image from a virtual moving camera viewpoint that was never actually photographed can be generated.
Specifically, the starting point is a plane development generated by converting a perspective image containing a planar video into a plan view; or a single large-screen plane development generated by developing into plan views a video containing a plurality of planar images taken from a plurality of directions and joining them at corresponding points; or a plan-view-like CG (computer graphics) image or map. From this, by inverting formulas (1) and (2) above, a virtual perspective image viewed from an arbitrary viewpoint is generated, or, by continuous processing, a moving image with a moving camera viewpoint is generated. The specific inverse transformation applied after the viewpoint change is given by the following formulas (3) and (4).
$$v = \frac{y \cdot f \cdot \sin\beta}{\sqrt{2} \cdot h \cdot \cos(\pi/4 - \theta) \cdot \cos(\beta - \theta)} \tag{3}$$
$$u = \frac{x \cdot f \cdot \sin\beta}{h \cdot \cos(\beta - \theta)} \tag{4}$$
Here, h is the height of the camera above the road surface, θ is the angle formed by the optical axis of the camera and the road surface, f is the focal length of the camera, and β is the angle formed by the road surface and the line connecting the camera lens to the point that lies a further distance y beyond the point h ahead of the point directly below the camera. Further, x is the coordinate perpendicular to the line obtained by orthogonally projecting the optical axis of the camera onto the road surface, that is, the horizontal coordinate as seen from the camera; y is the coordinate in the optical-axis direction with its origin at the point a distance h ahead of the point directly below the camera; v is the vertical coordinate on the CCD plane, which is the projection plane of the camera; and u is the horizontal coordinate on the CCD plane. The formulas are not limited to these, and any other formula that similarly associates a perspective image with a plane may be used.
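Formulas (3) and (4) can be evaluated directly once β is computed from y, as in the Python sketch below. The function name is hypothetical, and β is derived here from the stated definition (the ray to the road point at a distance h + y from directly below a camera of height h), which assumes the point lies in front of the camera foot.

```python
import math

def road_to_image(x, y, theta, f, h):
    """Map road-plane coordinates (x, y) to image coordinates (u, v).

    theta, f, h describe the (possibly virtual) camera of the new viewpoint.
    Assumption: h + y > 0, i.e. the point is ahead of the camera foot.
    """
    beta = math.atan2(h, h + y)  # angle between the ray and the road
    c = math.cos(beta - theta)
    v = (y * f * math.sin(beta)
         / (math.sqrt(2) * h * math.cos(math.pi / 4 - theta) * c))  # (3)
    u = x * f * math.sin(beta) / (h * c)                            # (4)
    return u, v

# Virtual camera 2 m above the road, tilted 45 degrees, f = 1000 pixels.
u, v = road_to_image(1.0, 2.0, math.pi / 4, 1000, 2.0)
```

Evaluating this mapping for every pixel of a plan view, with θ, f, and h chosen for the new viewpoint, is one way a virtual perspective image could be rendered; repeating it while the viewpoint parameters change yields a moving-viewpoint sequence.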
In addition, an image developed into a plane lends itself to various kinds of processing, such as measurement on the image and image recognition. Because the scale of the plane-developed image is linear, measurement on the image, image processing, image recognition, and the like become very easy. The optical flow is likewise obtained in a form proportional to the speed relative to the camera, so the relative speed of an object is also expressed on a linear scale independent of distance. This greatly simplifies not only measurement but also image processing and recognition.
Examples of planes to which the invention applies include road surfaces, sea surfaces, lake surfaces, river surfaces, ground surfaces, vertical wall surfaces, vertical virtual planes formed by objects arranged in the same plane, the walls and floors of buildings, the deck surfaces of ships, and airport facility surfaces such as runways and taxiways.
Examples of application to vehicles include omnidirectional display, or display of a target area, of the surrounding road surface, building surfaces, the plane of a row of utility poles, the plane of a row of roadside trees, the plane of a guardrail, and the like for a land vehicle such as a bus; of the sea surface, deck, wall surfaces, and the like for a marine vehicle such as a ship; and of the runway, ground surface, and the like for an aircraft.
Furthermore, as another application example, planar portions of a building structure, such as floor surfaces and wall surfaces, are shown in a plane-developed display and in a plane-combined display.
In addition, in three-dimensional map creation as an application example, not only are road surfaces, ground surfaces, and water surfaces photographed continuously with a plurality of cameras from moving vehicles, aircraft, ships, and the like, but vertical surfaces such as building walls, and targets that form a virtual vertical plane, such as a plurality of utility poles or guardrails regularly arranged in a plane, are also photographed continuously. The plane-developed images are joined and extended in the moving direction, and a wider development of horizontal and vertical planes, including the vertical surfaces, is produced at the same time to create a three-dimensional map.
On the other hand, the plane development image processing apparatus and the reverse development image processing apparatus used directly in the above-described plane development image processing method and reverse development image conversion processing method for planar object images such as road surfaces comprise: a video input unit that acquires perspective images; a video playback unit that reproduces the oblique video captured by the video input unit; an image correction unit that corrects the shooting rotation angle of the video input device; a spherical aberration correction unit that corrects spherical aberration in the video input device; a video development plane processing unit that converts a perspective image into a plane development; a developed image combining unit that combines the images that have undergone the development processing; and a display unit that displays the combined image.
In addition, the apparatus may be provided with an optical flow map generation unit that generates and displays the optical flow of the developed video, and an optical flow extraction unit that extracts only the target optical flow from the optical flow map. It may further be provided with a parallax extraction unit that detects parallax from videos of the same point taken from different positions, a developed image comparison unit that compares a plurality of developed images of the same point, an image comparison unit that extracts road surface unevenness by calculation, and a corrected plane generation unit that takes the unevenness into account.
Furthermore, the apparatus may be an appropriate combination of: a video input unit that generates video with a camera such as a CCTV camera or a digital still camera; an input image display unit that stabilizes and displays the input image; a video recording unit that records the input video; a video playback unit that plays back the recorded video; an image correction unit that performs coordinate transformation to correct image distortion caused by the lens, such as spherical aberration, and that aligns the target plane image with the plane in the image in order to correct the camera rotation angle; a video development plane processing unit that generates a plan view from a perspective image by mathematical operation; an optical flow map generation unit that generates and displays the optical flow of the developed video; an optical flow extraction unit that extracts only the target optical flow from the optical flow map; a parallax extraction unit that detects parallax from videos of the same point taken from different positions; a target image processing unit that keeps necessary objects, deletes unnecessary images, and inserts new video; a developed image combining unit that combines the processed plane-developed images into a single developed image; a developed image display unit that displays it; a recording unit that records it; an arbitrary viewpoint image generation unit that inversely transforms it to an arbitrary viewpoint; an arbitrary viewpoint image display unit that displays that image; a developed image comparison unit that compares developed images of the same point; an image comparison unit that extracts road surface unevenness by calculation; and a corrected plane generation unit that takes the unevenness into account.
Similarly, the reverse development image processing apparatus used directly in the plane development image processing method and the reverse development image conversion processing method for planar object images such as road surfaces may be provided with an arbitrary viewpoint image generation unit that inversely transforms to an arbitrary viewpoint, and an arbitrary viewpoint image display unit that displays the resulting image.
Furthermore, the plane development image processing apparatus and the reverse development image processing apparatus may comprise: a video input unit that acquires a perspective image; a plane decomposition unit that decomposes the perspective image captured by the video input unit into one or more plane images forming a three-dimensional space; a position detection unit that detects the three-dimensional position of the video input unit; and a display unit that reconstructs and displays a three-dimensional image from the plane images decomposed by the plane decomposition unit and the three-dimensional position of the video input unit detected by the position detection unit. A position notation unit may also be provided that marks, in the plane images decomposed by the plane decomposition unit, the three-dimensional position of the video input unit detected by the position detection unit. Further, when the video input unit moves, the position notation unit can continuously mark the three-dimensional position of the moving video input unit in the plane images decomposed by the plane decomposition unit.
When the display unit that reconstructs the three-dimensional image is located apart from the plane decomposition unit and the position detection unit, transmission/reception means may be provided for transmitting the signals of the one or more plane images and the three-dimensional position signal of the video input unit from the plane decomposition unit and the position detection unit to the display unit.
In the plane development image processing method, the reverse development image conversion processing method, the plane development image processing apparatus, and the reverse development image processing apparatus for planar object images such as road surfaces according to the present invention configured as described above, a perspective image, which is an oblique image acquired by the video input device, is converted into a plane development by formulas (1) and (2) and displayed as a practical map-like image.
The acquired and generated plane developments are combined and displayed as one large developed image that, like a map, includes the surroundings of the acquisition location, for example as a full display of the entire target area or a direct display of a specific area. Displaying the input video directly at the same time makes comparison with the surrounding situation easy.
In addition, when a moving image, or images acquired by a plurality of cameras, yields a plurality of plan-view developments of the road surface observed from different directions, these are compared by a method such as correlation or matching. For each small area of the plane images of the road surface or the like, the amount of movement is calculated from the difference in the components of that small area by the parallax method or the optical flow method, and corresponding points are combined in the calculation, whereby unevenness of the road surface is detected. When the plane has undulations or irregularities, a corrected plan view incorporating the detected unevenness values as deviations from the plane is generated.
That is, plane-developed images are acquired from moving images with overlapping fields of view, parallax or optical flow is obtained from them, and three-dimensional data is detected not only for unevenness of the road surface but for every object that overlaps in the plane-developed images.
Plane-developed images generated from a plurality of videos of the same point, taken from different points by a plurality of cameras with overlapping fields of view, allow parallax to be detected in the overlapping portion of the plane-developed images.
The generated plane developments are applied on land, at sea, at airport facilities, and so on, and are used for various displays such as plane displays of building structures. A three-dimensional map is also created by joining and extending the horizontal and vertical developments.
Also, the video input unit, input image display unit, video recording unit, video playback unit, image correction unit, video development plane processing unit, optical flow map generation unit, optical flow extraction unit, parallax extraction unit, target image processing unit, developed image combining unit, developed image display unit, recording unit, arbitrary viewpoint image generation unit, arbitrary viewpoint image display unit, developed image comparison unit, image comparison unit, corrected plane generation unit, and the like are combined as appropriate for the purpose. This allows a wide range of applications: for example, developing the road surface of a perspective image into a planar image; developing a building wall shown in perspective into a planar image; separating the images of a building wall and a guardrail; changing the viewpoint after conversion to a planar image and converting back to a perspective image; displaying a road or building surface with a texture pasted on it, or converting it to a perspective image by inverse transformation; and combining images from different viewpoints to obtain parallax and then calculating the unevenness of the road surface.
Because the desired three-dimensional image, 3D map, or the like can be reconstructed from the plurality of plane-developed image data and the camera position information data, the original moving image can be reproduced on the receiving side (monitor side) by transmitting the plane-developed images and the camera position information from the video acquisition side (camera side), even when the photographing camera and the monitoring display unit are located apart from each other. The plane-developed image data is a still image, and its data volume is far smaller than that of a moving image of oblique images. Accordingly, by transmitting and receiving the plane-developed images and the camera position information data needed to reconstruct the video, a moving image can be communicated with as small a data transmission volume as possible.

FIG. 1 is a block diagram of a plane development apparatus for images of planar objects such as road surfaces, showing an embodiment of the present invention.
FIG. 2 is a block diagram showing another embodiment of the apparatus of the present invention.
FIG. 3 is a block diagram showing another embodiment of the apparatus of the present invention.
FIG. 4 is a block diagram showing a plane development apparatus for video.
FIG. 5 is a block diagram showing an unevenness detecting device on the road.
FIG. 6 is a block diagram showing a viewpoint movement / texture pasting apparatus that similarly performs viewpoint movement and texture pasting by the road surface development method.
FIG. 7 is a block diagram in an embodiment in which an image is planarly developed by the optical flow method.
FIG. 8 is a schematic explanatory diagram in the case where a flat developed image is generated from the acquired image and an oblique image is formed by moving the viewpoint.
FIG. 9 is a schematic explanatory view showing a case where unevenness on the road surface is similarly detected.
FIG. 10 is a schematic diagram for obtaining θ from the vanishing point.
FIG. 11 shows examples of plane development for different θ values: (A) shows the perspective image as is, before plane development; (B) shows the case where the θ value differs from the actual value; and (C) shows the case where the θ value equals the actual value, that is, where θ has been obtained correctly.
FIG. 12 is a block diagram of a three-dimensional map generation apparatus.
FIG. 13 is an image of the planes extracted by optical flow.
FIG. 14 is an image of the planes extracted by optical flow and of the three-dimensional map generated by three-dimensional conversion.
FIG. 15 is an image of a state in which the trajectory of the camera position is plotted on the three-dimensional map.
FIG. 16 is a block diagram of the moving object detection solid generation apparatus.
FIG. 17 is a block diagram of a texture pasting apparatus in the three-dimensional map generation apparatus.
FIG. 18 is a block diagram of the object recognition apparatus.
FIG. 19 is a specific example of images obtained by a traffic monitoring video camera, as an example of the object recognition apparatus of FIG. 18: (a) is the monitoring video camera image (perspective image), (b) is the plane-developed image converted by the present invention, and (c) is a region analysis display showing the objects recognized by the present invention.
FIG. 20 is a flowchart showing the processing steps for object recognition with a traffic monitoring video camera, as an example of the object recognition apparatus of FIG. 18.
FIG. 21 is a table summarizing the results obtained by the object recognition process shown in FIG. 20.
FIG. 22 is a diagram showing an arrangement of a plurality of cameras that photograph the road around, for example, a bus.
FIG. 23 is an image in which the road surface has been developed like a map.
FIG. 24 shows oblique images of a road, for example the nine road images (1) to (9).
FIG. 25 is a map-like plane image obtained by processing those road images.
FIG. 26 shows another embodiment: for example, the 16 indoor images (1) to (16), each an oblique image of a planar portion of a building structure such as a floor or wall surface.
FIG. 27 is a composite image obtained by planar combination of the images shown in FIG. 26.

Embodiments of the present invention will be described below with reference to the drawings.
As shown in FIG. 1, the apparatus comprises a camera 1 serving as the input device, such as a CCTV camera or a digital still camera; a video playback unit 2 that plays back the oblique video shot by the camera 1; an image correction unit 3; a spherical aberration correction unit 4; a video development plane processing unit 5 that develops the oblique video into a plan view according to formulas (1) and (2) described later; a developed image combining unit 6 that joins the developed images by an appropriate method; a developed image display unit 7 that displays the joined image; and a recording unit 8 that records the joined image on a recording medium. The apparatus develops oblique images into a plane as real-time processing. That is, to operate as a real-time oblique image plane development apparatus, an image is input from the camera 1 and played back in real time, the image correction unit 3 corrects the rotation angle and the like of the camera 1, the spherical aberration correction unit 4 corrects the spherical aberration of the camera 1 as suited to the purpose, and the video development plane processing unit 5 converts the perspective image into a map-like plane development, which is displayed on the developed image display unit 7. If necessary, the developed image combining unit 6 combines the developed images obtained from a plurality of cameras 1, and the result is displayed and recorded in the recording unit 8.
Further, as shown in FIG. 2, the oblique image plane development apparatus for offline processing comprises an oblique video playback unit 11 holding recorded oblique video, an image correction unit 12, a video development plane processing unit 13, a developed image combining unit 14, and a developed image display unit 15. Video shot by an ordinary camera 1 and recorded is played back from the oblique video playback unit 11; the image correction unit 12 corrects spherical aberration, camera rotation angle, and the like as suited to the purpose; and the video development plane processing unit 13 develops the video into a map-like plane image, which is displayed on the developed image display unit 15. If necessary, the developed image combining unit 14 then joins the plurality of developed images by an appropriate method, and the joined image is displayed on the developed image display unit 15.
Further, as shown in FIG. 3, the oblique image plane development apparatus can be installed with the camera side (transmission side), which acquires the video, separated from the monitor side (reception side), which displays, records, and reconstructs the plane images decomposed and developed from the oblique images. In this case, the camera side (transmission side) is provided with the camera 1 as an input device, the video playback unit 2, the image correction unit 3, the spherical aberration correction unit 4, and the video development plane processing unit 5, and the monitor side (reception side) is provided with the developed image combining unit 6, the developed image display unit 7, and the recording unit 8. The camera side further has a transmission unit 5a that transmits the developed and decomposed plane image signal to the monitor side, and the monitor side has a reception unit 6a that receives it; the transmission unit 5a and the reception unit 6a are connected via a communication line so that data communication is possible. The plane-developed images of the moving image obtained on the camera side, together with predetermined information such as camera position information, are thus transmitted over the communication line, and the monitor side can reconstruct the moving image from the received plane-developed images. A desired moving image can therefore be transmitted and reproduced while keeping the amount of transmitted data as small as possible.
Note that FIG. 3 shows the real-time oblique image plane development apparatus of FIG. 1 in a separated installation, but separated installation is of course possible regardless of whether the processing is real-time or offline.
In the plane development, for an image obtained by obliquely photographing a real scene that contains a plane, a surface that is originally planar is developed by mathematical operation into a plane image that is proportional (similar in shape) to the plane in the real scene, and is displayed. In other words, an image of a scene containing a plane taken with an ordinary camera, that is, an image taken obliquely, is converted so that the originally planar surface becomes a plane image similar to the real plane. A road surface, for example, is developed into a plane like a map.
When such images are combined, a plurality of plane-developed images obtained by the above method are joined and expressed as one large plane-developed image. That is, images taken with an ordinary camera are developed into planes, and the resulting plane-developed images are joined by an appropriate method into one large plane-developed image. For example, when a plurality of road surface images are developed into map-like plane developments, they are connected into a single large plane-developed image. Because the images have been developed into a plane, any number of them can be joined freely; images left in perspective cannot be joined unless they were taken from the same camera position.
Furthermore, omnidirectional display and direct display based on images acquired by a plurality of CCTV cameras are also possible: the images of the CCTV cameras are plane-developed by the above apparatus and combined into a single image that shows the entire target area, and, if necessary, the oblique video of the CCTV camera corresponding to a displayed location can be shown at the same time.
That is, the images of a plurality of CCTV cameras are developed into plane images by the above apparatus, and the respective images are combined by aligning their corresponding points into one image that displays the entire target area. In addition, the unprocessed video of the CCTV camera corresponding to a displayed location, that is, the oblique video, can be displayed as needed, which makes monitoring and similar tasks more effective.
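One simple way to picture the combination by corresponding points is to estimate the rotation and translation that carry points in one plane-developed image onto their counterparts in another. The sketch below is an illustration under stated assumptions rather than the method of the text: it presumes both images already share the same scale (for example, metres per pixel), uses a closed-form least-squares fit of a 2D rigid motion, and works on hypothetical hand-picked point pairs. The function names are illustrative.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src points onto dst points."""
    n = len(src)
    if n < 2 or n != len(dst):
        raise ValueError("need at least two matched point pairs")
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by      # proportional to cos(phi)
        cross += ax * by - ay * bx    # proportional to sin(phi)
    phi = math.atan2(cross, dot)
    c, s = math.cos(phi), math.sin(phi)
    # Translation that carries the rotated src centroid onto the dst centroid.
    return phi, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def apply_rigid_2d(point, phi, tx, ty):
    """Rotate a point by phi about the origin, then translate by (tx, ty)."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * point[0] - s * point[1] + tx, s * point[0] + c * point[1] + ty)

# Hypothetical corresponding points picked in two plane-developed images.
points_a = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
points_b = [apply_rigid_2d(p, 0.2, 3.0, -1.0) for p in points_a]
phi, tx, ty = fit_rigid_2d(points_a, points_b)
```

With the fitted motion, every pixel position of the first image can be carried into the coordinate frame of the second before the two are pasted onto a common canvas.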
FIG. 4 shows a processing block diagram for developing an image onto a plane. First, when a video image 21 or a still image 22 is input to the apparatus as an input image, the correction unit 23 performs spherical aberration correction, rotation correction, and the like at the time of input. The video plane development processing unit 24 then performs the plane development using formulas (1) and (2) below.
That is, in the present invention, the oblique image obtained as the video image 21 or the still image 22 is converted into a plane image by what is called the θ method. For example, to develop an oblique image obtained by a CCTV camera or the like into a plane image, the optical axis position θ is read from the oblique image, the camera height h and the f value of the photographing lens, or the virtual f value on the monitor, are read, and the coordinates of the target location are obtained by the following formulas (1) and (2).
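$$ y = \frac{v \cdot 2^{1/2} \cdot h \cdot \cos(\pi/4 - \theta) \cdot \cos(\beta - \theta)}{f \cdot \sin\beta} \qquad (1) $$
$$ x = \frac{u \cdot h \cdot \cos(\beta - \theta)}{f \cdot \sin\beta} \qquad (2) $$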
By using formulas (1) and (2), or formulas with the same meaning, the coordinate system of the real world and the coordinate system on the image monitor are related and coordinate conversion is performed, without supplying the position coordinates or sizes of real-world objects as known information; only information that can be read from the captured image and the photographing conditions such as θ, h, and r need be supplied.
Here, x and y are the plane development coordinates and u and v are the in-image coordinates; θ is the angle between the optical axis of the camera and the road surface; f is the focal length of the camera; h is the height of the camera; β is the angle between the road surface and the line segment connecting the camera to the point at a distance h + y from directly below the camera; v is the vertical coordinate from the origin on the CCD plane, which is the projection surface of the camera; and u is the horizontal coordinate from the origin on the CCD plane. Further, y is the distance, that is, the coordinate, measured further along the optical axis direction from an origin at the point on the road surface h ahead of directly below the camera, and x is the lateral distance, that is, the coordinate, on the road surface.
Therefore, using formulas (1) and (2), a perspective image of the road surface is converted and developed into a map-like plane image.
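As a minimal sketch of how formulas (1) and (2) might be evaluated for one pixel, the Python below maps image coordinates (u, v) to plane coordinates (x, y). The text does not say how β is obtained from the image, so the helper `beta_from_v` is an assumption: it takes the v origin to be the image row of the road point h ahead of directly below the camera (where y = 0 and β = π/4), so that v = f·(tan(π/4 − θ) − tan(β − θ)). Under that reading, formula (1) reduces to y = h·(cot β − 1). The function names and sample values are illustrative only.

```python
import math

def beta_from_v(v, f, theta):
    """Angle beta for image row v (ASSUMED v origin: the row where y = 0)."""
    # Assumption, not stated in the text:
    #   v = f * (tan(pi/4 - theta) - tan(beta - theta))
    return theta + math.atan(math.tan(math.pi / 4 - theta) - v / f)

def develop(u, v, f, h, theta):
    """Map image coordinates (u, v) to plane coordinates (x, y) via formulas (1), (2)."""
    beta = beta_from_v(v, f, theta)
    if beta <= 0.0:
        raise ValueError("pixel is at or above the horizon; its ray never meets the road")
    k = h * math.cos(beta - theta) / (f * math.sin(beta))
    y = v * math.sqrt(2.0) * math.cos(math.pi / 4 - theta) * k  # formula (1)
    x = u * k                                                   # formula (2)
    return x, y

# Illustrative values: u, v, f in pixels, h in metres, theta in radians.
x, y = develop(u=50.0, v=100.0, f=800.0, h=5.0, theta=math.radians(30.0))
```

Because u, v, and f only appear as ratios, they can be in any common unit; the unit of h sets the unit of x and y.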
The plane image thus converted and developed is displayed and recorded by the plane-developed image display/recording unit 25. Next, the developed image combining unit 26 joins several such plane-developed images, each obtained by developing a perspective image onto a plane, to produce a large plane-developed image, which is displayed and recorded by the combined developed image display/recording unit 27.
Next, an arbitrary viewpoint is specified for the developed image, and the arbitrary viewpoint generation unit 28 generates that viewpoint. The inverse development conversion processing unit 29 then applies formulas (3) and (4) below, which are the inverses of formulas (1) and (2), to obtain a perspective image seen from a viewpoint different from the original one; this image is displayed and recorded by the inverse-developed image display/recording unit 30.
That is, the formulas for the inverse conversion are as follows.
v = y · f · sin β / (√2 · h · cos (π/4 − θ) · cos (β − θ))   … (3)
u = x · f · sin β / (h · cos (β − θ))   … (4)
Here, h is the height of the camera from the road surface, θ is the angle formed by the optical axis of the camera and the road surface, f is the focal length of the camera, and β is the angle formed by the road surface and the line connecting the camera lens to the point advanced y from the point h ahead of directly below the camera. x is the coordinate perpendicular to the line obtained by orthogonally projecting the optical axis of the camera onto the road surface, that is, the horizontal coordinate when viewed from the camera; y is the coordinate in the optical axis direction with its origin at the point h ahead of directly below the camera; v is the vertical coordinate on the CCD surface, which is the projection surface of the camera; and u is the horizontal coordinate on the CCD surface.
Note that the inverse transformation formula is not limited to these formulas (3) and (4), but may be a mathematical formula that relates other similar perspectives and planes.
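As a minimal sketch, formulas (3) and (4) can be evaluated for one road-plane point as shown below. The function name `project_point` is hypothetical, and β is computed from the definition given in the text (the ray to the road point lying h + y from directly below the camera), which gives tan β = h / (h + y).

```python
import math

def project_point(x, y, theta, h, f):
    """Evaluate formulas (3) and (4): road-plane (x, y) -> image (u, v)."""
    # beta follows the definition in the text: the ray to the road point that
    # lies h + y from directly below the camera, measured against the road.
    beta = math.atan2(h, h + y)
    c = math.cos(beta - theta)
    v = y * f * math.sin(beta) / (math.sqrt(2) * h * math.cos(math.pi / 4 - theta) * c)
    u = x * f * math.sin(beta) / (h * c)
    return u, v
```

For example, `project_point(1.0, 0.0, math.radians(30), 2.0, 800.0)` returns v = 0, because y = 0 is by definition the origin of the vertical image coordinate. To render a view from a new viewpoint, the same function would be called with the θ and h of that viewpoint.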
In FIG. 5, a new block diagram is added to a part of the block diagram of FIG.
That is, the video image 21 and the still image 22 are input, and the image plane development processing unit 24 develops the perspective image into a plane image by equations (1) and (2). The developed image comparison unit 31 then compares developed images taken from different viewpoints and thereby obtains the unevenness of the road surface.
FIG. 6 shows a configuration in which not only the road surface but also roadside buildings, roadside trees, guardrails, and the like are each developed on a plane and converted into an arbitrary viewpoint image, or in which textures are attached to buildings. Viewpoint movement is performed by the road surface development method, and texture pasting is used to create 3DCG with live-action textures.
That is, first, the road surface plane development unit 42 obtains a plane development of the road surface from the image input from the video image input unit 41, while the side vertical plane development unit 43 develops roadside portions such as building walls into a plane view. Thereafter, an optical flow map is generated by the optical flow map generation unit 44, and the target portions are selectively extracted by the optical flow map selection extraction unit 45. From these, the curb surface extraction unit 46 extracts road curbs, and the building surface extraction unit 47 extracts building wall surfaces and the like. Further, the sidewalk surface development unit 48 develops the sidewalk surface and matches it to the data from the road surface plane development unit 42, and the road horizontal plane integration unit 49 combines the sidewalk portion and the roadway portion.
On the other hand, roadside trees are extracted from the output of the optical flow map selection / extraction unit 45 by the plane tree extraction unit 50, and the road vertical plane integration unit 51 combines them with the data from the building surface extraction unit and the sidewalk surface development unit to form the vertical surfaces along the side of the road. In addition, when images are formed in the road horizontal plane integration unit 49 and the road vertical plane integration unit 51, partial images are deleted or inserted according to the purpose by passing through the image processing unit 58, and the required images are created and displayed in an integrated manner.
Next, alignment is performed by the 3DCG alignment unit 52, vertical-surface textures are pasted by the vertical surface texture pasting unit 53 for buildings and the like, and the 3DCG with live-action textures is generated by the real photograph texture 3DCG generation unit 54. Further, the data from the road horizontal plane integration unit 49 and the data from the road vertical plane integration unit 51 are combined in the road vertical horizontal plane integration unit 55 to obtain a three-dimensional map having horizontal and vertical planes. The arbitrary viewpoint perspective display unit 56 converts this into an image from an arbitrary viewpoint and displays the perspective image with the viewpoint changed.
Note that, using the outputs of the optical flow map generating unit 44 and the optical flow map selecting and extracting unit 45, the moving speed direction detecting unit 57 detects the moving direction and moving distance of the object (target plane) and the moving direction, moving speed, and moving distance of the photographing camera.
FIG. 7 shows an example constituting the apparatus of the present invention, and the video input unit 61 is a part for inputting a real video image obtained by a camera such as a CCTV camera or a digital still camera.
The video recording unit 62 records the input video, and the video reproduction unit 63 reproduces the recorded images. The image correction unit 64 performs coordinate conversion to correct image distortion caused by the lens, such as spherical aberration, and corrects the camera rotation angle so that the target plane image is aligned with the plane in the image. The image development plane processing unit 65 generates a plan view from the perspective image by mathematical operation based on formulas (1) and (2) above. The optical flow map generation unit 66 generates the optical flow of the developed image and displays it as a map, and the optical flow selection / extraction unit 67 extracts only the target optical flow from the optical flow map. The image processing unit 68 deletes unnecessary object images from the image, leaving only the necessary objects, or inserts new images. The developed image combining unit 69 combines the developed and processed individual images to generate a single continuous image. The generated developed image is displayed by the display unit 70 and recorded by the recording unit 71. Furthermore, the arbitrary viewpoint image generation unit 72 inversely converts the image to an arbitrary viewpoint and displays it as a perspective image, the developed image comparing unit 73 compares a plurality of developed images of the same location, and the image comparing unit 74 extracts road surface unevenness by calculation.
Thus, depending on the purpose, a road surface in a perspective image can be developed into a planar image, a building wall shown in perspective can be developed into a planar image, and building wall images and guardrail images can be separated and converted into planar images. The viewpoint can then be changed and the result converted back into a perspective image, road or building surfaces can be displayed with textures attached, and a parallax image can be obtained by combining images from different viewpoints, after which processing such as calculating the unevenness of the road surface can be performed.
FIG. 8 shows a procedure for performing viewpoint movement, moving object deletion, and the like. In the figure, a perspective image of a road surface viewed from a certain viewpoint is shown at the top. As represented by the three arrows for plane development, left plane development, road surface plane development, and right plane development are performed in order from the left. Below that, a number of overlapping squares indicate that, as described above, the perspective images of many frames are each developed on the three types of planes.
Then, after the image has been developed on the three planes, a moving object on the road surface, for example a vehicle ahead, is removed, so that a plan view of the road surface with the vehicle image removed can be obtained. Below the combining arrows, the left plane developed image, the road plane developed image, and the right plane developed image obtained by the three-plane development are shown. In the left and right plane developed images, street trees, guardrails, building wall surfaces, and the like are separated by optical flow calculation.
As a result, the road surface is displayed as a map, and the surfaces on both sides thereof are displayed as shown in FIG.
Finally, as indicated by the arrow labeled θ method inverse transformation, the viewpoint is changed and the inverse transformation is then performed, so that a perspective image from the new viewpoint is obtained, as shown in the drawing at the bottom of the figure. The inverse transformation formulas used here are formulas (3) and (4) above.
FIG. 9 shows a procedure for detecting an uneven surface on a road surface.
That is, in image A there is a hole in the road surface ahead in the right lane, and the same hole appears in image B; because the camera has moved between the two images, the hole in image B is closer to the camera. When the road surface is developed, the hole appears at different positions in the developed images indicated by the arrows of viewpoint change 1 and viewpoint change 2. From these two images, the three-dimensional shape of the hole is synthesized using parallax. The same applies when two cameras are mounted at the front and rear and the parallax or optical flow of corresponding points in the two videos is extracted.
In this way, the uneven portion of the road surface can be detected, and the depth of the hole or the like can be measured as depicted in the bottom diagram in FIG.
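The relation between parallax and depth can be illustrated with a simple geometric sketch. It is not a formula stated in this description: it assumes the camera translates parallel to the road by a known baseline at constant height h, and that a point lying `depth` below the road plane is drawn on the developed image where its viewing ray crosses the road plane. Under those assumptions the apparent position shifts by baseline · depth / (h + depth) between the two developed images. The function names are hypothetical.

```python
def apparent_road_position(cam_pos, point_pos, h, depth):
    """Where a point `depth` below the road appears after plane development."""
    # The viewing ray from the camera (height h) crosses the road plane before
    # it reaches the point, so the point is drawn closer to the camera.
    return cam_pos + (point_pos - cam_pos) * h / (h + depth)

def depth_from_parallax(shift, baseline, h):
    """Invert shift = baseline * depth / (h + depth) for the depth."""
    return h * shift / (baseline - shift)
```

With this sign convention a raised portion of the road gives a negative depth.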
FIG. 9 takes the depth of a hole as an example. Besides the depth of a hole, a raised portion of asphalt left after road construction or a rut caused by vehicle traffic can also be measured.
Devices currently used to measure road surface unevenness include those that measure with a laser, but they are quite expensive. In the method of the present invention, the unevenness of the road surface can be measured inexpensively by processing the video with software as described above.
FIG. 10 shows a specific method for obtaining θ. Lines that are parallel in the real world (the live-action video) are found empirically in the image, and their extensions are drawn so that they meet at an intersection in the image. Let plane a be the plane parallel to the target plane that passes through this intersection, and let plane b be the plane parallel to the target plane that contains the optical axis point. With d the distance between plane a and plane b and f the virtual focal length, θ is obtained from the ratio of d to f as θ = arctan (d / f).
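The computation θ = arctan (d / f) can be sketched as follows, assuming the two road edges have each been marked by two image points, the optical axis point is known in pixel coordinates, f is expressed in pixels, and the camera is not rolled (so that d is simply the vertical pixel distance between the intersection and the optical axis point). The function names are hypothetical.

```python
import math

def line_intersection(p1, p2, p3, p4):
    """Intersection of line p1-p2 with line p3-p4 (image coordinates)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(den) < 1e-12:
        raise ValueError("lines are parallel in the image")
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

def theta_from_vanishing_point(left_edge, right_edge, axis_point, f_pixels):
    """theta = arctan(d / f), d measured between the intersection and the axis point."""
    _, vy = line_intersection(*left_edge, *right_edge)
    d = abs(vy - axis_point[1])   # assumes the camera is not rolled
    return math.atan2(d, f_pixels)
```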
Here, as one way of obtaining the virtual focal length, the distance along the optical axis may be determined on the displayed image such that the angle at which an arbitrary object is viewed in real space equals the angle at which the same object is viewed in the displayed image. If this distance is expressed in pixels, it is a value specific to the camera system including the lens, and it needs to be determined only once. Furthermore, because real-world parallel lines are found and appear in the image as lines whose extensions intersect, θ can also be obtained by selecting the value for which these lines become parallel when developed on the plane, which also allows fine adjustment of θ.
FIG. 11 shows a method of measuring the actual value of θ by developing a perspective image on a plane.
FIG. 11A shows the perspective image as it is, and FIG. 11B shows the state when plane development is performed with θ set to a certain value. Lines that are parallel in the real world, such as the two edges of the road, are not parallel in the perspective image but have an intersection, as shown in the figure. When the image is developed on the plane with a value of θ that differs from the actual value, the line segments on both sides of the road, which should be parallel, do not become parallel, as shown in the figure. On the other hand, when θ is correct, the line segments on both sides of the road become parallel in the developed image, as shown in the figure. θ can thereby be obtained.
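This search for θ can be sketched as a simple grid scan. The sketch is illustrative only and makes several assumptions that are not in the text: a pinhole camera, road edges sampled as (u_left, u_right, v_axis) triples where v_axis is the height of the image row above the optical axis point, f in pixels, and a brute-force scan rather than any particular optimizer. The function names are hypothetical. Note that h only sets the overall scale and does not affect which θ is selected.

```python
import math

def developed_width(edge_pairs, theta, h, f):
    """Road width after development, one value per image row."""
    widths = []
    for u_left, u_right, v_axis in edge_pairs:
        # v_axis: height of the row above the optical axis point, in pixels.
        beta = theta - math.atan(v_axis / f)
        if beta <= 0:          # row would lie above the horizon for this theta
            return None
        scale = h * math.cos(beta - theta) / (f * math.sin(beta))
        widths.append((u_right - u_left) * scale)
    return widths

def estimate_theta(edge_pairs, h, f, step=0.001):
    """Pick the theta for which the developed road width is most constant."""
    best_theta, best_spread = None, float("inf")
    for i in range(1, int(math.pi / 2 / step)):
        theta = i * step
        widths = developed_width(edge_pairs, theta, h, f)
        if widths is None:
            continue
        spread = max(widths) / min(widths) - 1.0
        if spread < best_spread:
            best_theta, best_spread = theta, spread
    return best_theta
```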
Next, a case where a three-dimensional map is created will be described with reference to FIGS. Here, a case where a three-dimensional map is generated from an image photographed in a substantially traveling direction from a video camera mounted on a vehicle traveling on a road will be described as an example.
In FIG. 12, a moving image input unit 81 inputs an image captured in the approximate traveling direction by a video camera mounted on a vehicle traveling on a road, and the acquired moving image is decomposed into a plurality of planes by the decomposition unit 82. The reference plane designating unit 83 interprets the video as being composed of a plurality of planes and sets a plurality of planes in the video. When the vehicle is traveling on a road, the road surface is set as the reference plane.
On the other hand, the arbitrary-purpose plane designating unit 84 sets, for example, a street lamp plane on the assumption that a plurality of regularly installed street lamps lie within one plane. Similarly, a plurality of planes such as a curb plane, a street tree plane, and a whole-building plane can be set.
FIG. 13 shows an image of a plane extracted by the optical flow.
As shown in the figure, if D is the perpendicular distance from the standard position of the camera to the plane to which each object belongs, all planes can be separated and extracted as a group of parallel planes. As with the roadside tree shown in the figure, even a single curved object having points or faces that do not lie on one plane can be treated as a collection of a plurality of planes; given its distance from the reference plane (street tree (1) in the figure), it can be handled as information on one object belonging to that plane.
The θ and h detection unit 85 reads from the image the angle θ between the road surface and the optical axis and the distance h between the optical center of the camera system and the road surface. θ is obtained either by adjusting it, with f and h set in expressions (1) and (2), so that the intersecting lines become parallel when developed on the plane, or from the ratio of d and f associated with planes a and b; if θ can be measured directly, the measured value may be used instead. The coordinate conversion unit 86 substitutes θ and h into the plane development conversion formulas (1) and (2), and the image plane development unit 87 acquires the plane developed image.
Because the video is a moving image, the Opt.F (optical flow) value calculation unit 88 divides the image into small regions and calculates the optical flow of each part of the image using matching and correlation methods. The Opt.F (optical flow) map generation unit 89 displays the calculation result as an image map.
The reference plane image extraction unit 90 obtains the reference plane image by extracting only the characteristic optical flow value that indicates the road surface. That is, on a road surface that has been developed in a plane, the relative motion within the plane is constant everywhere, so the optical flow is the same throughout and the road surface can be extracted easily. In an ordinary perspective image, by contrast, the relative speed within the same plane varies with distance, so the reference plane cannot be extracted with a single optical flow value; moreover, apparent size also varies with distance, so comparison is not straightforward.
The parallel plane extraction unit 91 extracts, in the same manner as for the reference plane, regions whose optical flow value differs from that of the reference plane. Since a plane parallel to the reference plane has a characteristic optical flow value different from that of the reference plane, it can be obtained separately from the reference plane.
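The separation of the reference plane and parallel planes by flow value can be sketched as below. The relation between flow and plane height is an added assumption, not a formula from this description: for a camera at height h translating parallel to the road, a plane at height z above the road (z < h) appears on the road-developed image with flow road_flow · h / (h − z). Function names and the tolerance are hypothetical.

```python
def plane_height_from_flow(flow, road_flow, h):
    """Height above the road of a plane whose developed flow is `flow`."""
    # Assumption: pure translation parallel to the road, plane below the camera.
    return h * (1.0 - road_flow / flow)

def split_by_flow(block_flows, road_flow, tol=0.05):
    """Separate block indices into road (reference plane) and other planes."""
    road, other = [], []
    for idx, flow in enumerate(block_flows):
        (road if abs(flow - road_flow) <= tol * road_flow else other).append(idx)
    return road, other
```

Blocks placed in `other` could then be grouped by their own flow value, one group per parallel plane.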
The plane image construction unit 92 treats each obtained plane as an image plane as it is, and can acquire an image in the set plane. The three-dimensional map generation unit 93 generates a three-dimensional map having a reference plane and a plane parallel to the reference plane by assembling each parallel plane with three-dimensional coordinates. However, since all the objects cannot be expressed only by the reference plane and a plane parallel to the reference plane, it is necessary to form a plane image in the same manner for another plane different from the reference plane.
Further, any one of the initially set planes can be handled in the same manner as the reference plane, and a three-dimensional map can be generated by the same process. Since θ and h must be determined for each plane, they are converted for each arbitrary plane from the obtained three-dimensional coordinates of the reference plane. That is, the position of the arbitrary plane is calculated from the obtained three-dimensional image of the reference plane by the conversion and designation unit 95 for the θ and h values of the designated plane; the conversion can also simply be done manually. Similarly, the target image is subjected to the same processing through the θ, r designation unit (θ, h designation unit) 96, and a three-dimensional map is generated by the generation unit 93 via the target plane image extraction unit 97.
FIG. 14 shows a plane extracted by the optical flow and an image of a three-dimensional map generated by three-dimensional transformation.
Further, through the processing of the θ and h detection unit 85, the coordinate conversion unit 86, and the image plane development unit 87 described above, the position and direction of the camera that acquired the video can be detected and plotted on the reconstructed three-dimensional map. That is, when the angle between the optical axis of the camera and the target plane such as a road surface is given by this processing and the coordinate origin is designated in the image, the camera position and/or camera direction obtained by calculation can be described in the image developed by the conversion formulas, or in the coordinates of the target plane. Formulas (1) and (2) above are evaluated given the camera focal length, the angle formed by the optical axis of the camera and the target plane such as the road surface, and the coordinate origin, which yields a plane development image of the road image acquired by the camera; at the same time, the camera position and camera direction follow by calculation from the conditions of the conversion formulas. The camera position and camera direction can thereby be detected in the converted image and plotted on a three-dimensional map or the like.
Also, a plurality of images taken by a moving camera and developed in a plane can be combined into a single image expressed in a new common coordinate system, and the camera positions and camera directions obtained from equations (1) and (2) above can be described one after another in that coordinate system. For example, the images from an in-vehicle camera are developed in a plane, corresponding points on the target plane in each frame image are searched for automatically or manually, the frames are joined so that the corresponding points coincide to generate a combined image of the target plane, and the result is displayed in a single coordinate system. The camera position and camera direction can then be detected one after another in the common coordinate system, and the position, direction, and locus can be plotted.
FIG. 15 shows an image in a state where the locus of the camera position is described on the three-dimensional map.
Thus, by detecting the camera position and direction, the position of the object can be specified from the camera positions of a plurality of images, and a three-dimensional map or a three-dimensional image can be reconstructed from the planar development image. Accordingly, it is possible to automatically generate a three-dimensional map of the traveled range simply by photographing with an in-vehicle camera or the like. Further, since the camera position and direction can be detected in this way, it is possible to plot and describe the position and direction in which the camera moves on a three-dimensional map or a three-dimensional image reconstructed from the planar development image.
Since a three-dimensional map can be generated as described above, a perspective video can be decomposed into a plurality of plane images plus camera position information, and a desired three-dimensional image can be reconstructed from that information. By transmitting the plane-decomposed images and the camera position information, the original moving image can be reconstructed on the receiving side.
Conventionally, as represented by the MPEG2 system, known methods for compressing moving images mainly separate the moving parts of the image, predict their motion, and eliminate signal redundancy. This type of conventional method is effective when an object moves against a static background, where only part of the image moves. In a moving image in which the camera itself moves, however, the entire image has motion components and the moving speed is not uniform, so the whole image must be updated and the compression effect is significantly reduced.
As described above, in the three-dimensional map generation according to the present invention, a moving image is analyzed and handled as an image composed of three-dimensional planes. As a general property of images, the image space is surrounded by a plurality of planes, so the image can be rebuilt by extracting each plane with the apparatus and reassembling the planes. Since each plane is defined three-dimensionally and the reconstructed planes are arranged in three-dimensional space, the final image is a three-dimensional image. As long as the camera moves in a fixed direction, the relative speed between the camera and each plane is constant, and the movement speed is determined uniquely for each plane. In the range where the camera moves at a constant speed, each plane has its own velocity component, and it is only necessary to define as many velocities as there are planes.
In this way, the image is compressed, and the original moving image can be reproduced by moving each plane at a defined speed on the receiving side. In addition, since the image includes three-dimensional information, it can be expressed three-dimensionally.
As described above, the present invention achieves a sufficient compression effect even for video in which the camera itself moves, so that the whole image has motion components and each part of the image moves at a different speed. Moreover, since the image is analyzed three-dimensionally during compression, a three-dimensional image can be extracted from the moving image as a result.
FIG. 16 shows another embodiment. The same parts as those in the embodiment shown in FIG. 12 are denoted by the same reference numerals, and detailed description thereof is omitted.
Here, when a moving body such as a preceding vehicle or an oncoming vehicle is included in the input image, its optical flow takes a value different from the Opt.F value of the reference plane. Therefore, if an abnormal Opt.F value is detected on the road surface (reference plane) in the Opt.F map, that portion is the region of a moving body on the road surface. If the purpose is to delete the moving body, this region may be extracted and deleted by the moving body Opt.F (optical flow) extraction unit 101 and the moving body part extraction unit 102, and the deleted region may be filled in from the overlapping developed images before and after. In the figure, reference numeral 103 denotes a simple moving body relative speed extraction unit that simply extracts the relative speed of the moving body.
Furthermore, when the purpose is to extract the moving body itself, reproduce its three-dimensional shape, and measure its speed, the processing on the right side of FIG. 16 is performed through the moving body plane designating unit 105. However, the overall Opt.F value cannot be used as it is as the Opt.F value of the moving body. To obtain the three-dimensional shape of the moving body and its Opt.F values, the moving body is decomposed into a plurality of planes by the each-surface Opt.F (optical flow) extracting unit 106, the moving body plane extracting unit 107, and the like, and the plane development of each plane constituting the moving body is calculated by the same process. The three-dimensional shape of the moving body can be obtained by the planar image forming unit 108, and its velocity vector can be obtained by the moving body velocity vector extracting unit 109. Furthermore, the moving body can be incorporated into the three-dimensional map by the 3D map generation unit 110, which includes moving bodies.
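A minimal sketch of this detection and fill-in step follows. It assumes the Opt.F map is available as a grid of flow magnitudes over the road region, that the neighboring developed frame has already been registered to the same road coordinates, and that the moving body does not cover the same road location in that neighboring frame. The function names and the tolerance are hypothetical.

```python
def moving_body_mask(flow_map, road_flow, tol=0.1):
    """True where the developed-image flow is abnormal for the road surface."""
    return [[abs(f - road_flow) > tol * abs(road_flow) for f in row]
            for row in flow_map]

def fill_from_neighbor(frame, mask, neighbor_frame):
    """Replace masked pixels with pixels from an aligned neighboring frame."""
    return [[n if m else p for p, m, n in zip(prow, mrow, nrow)]
            for prow, mrow, nrow in zip(frame, mask, neighbor_frame)]
```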
Next, with reference to FIG. 17, description will be given of a method of pasting the texture of an acquired image as a CG (computer graphics) image or the like as an embodiment in a 3D map generation application example.
In other words, because the three-dimensional coordinates of the plane-developed image have been acquired and only the image within the target plane has been extracted, an image such as a building wall can be obtained with the roadside trees removed. Therefore, by pasting the plane image constituting the target building wall onto the CG image in alignment with it, and doing the same for the other plane images, the textures of the road surface, building walls, roadside trees, guardrails, and so on in the video can be pasted onto the CG image without overlapping.
In FIG. 17, the same parts as in the embodiment shown in FIG. 12 are denoted by the same reference numerals, and detailed description thereof is omitted. The parallel plane image is extracted from the output of the Opt.F (optical flow) map generation unit 89 by the parallel plane image extraction unit 111, while the target plane image extracted by the target plane image extraction unit 97 is converted to a texture signal by the texture signal generation unit 112. Further, the three-dimensional coordinates are acquired from the planar image construction unit 92 by the three-dimensional coordinate acquisition unit 113 and adjusted by the CG coordinate matching unit 114 so as to match the coordinates of the CG image. The pasting unit 115 then pastes the texture signal onto the CG to obtain a composite image.
FIG. 18 shows an embodiment in which a three-dimensional map is constituted by recognized parts. Components that are the same as in the embodiments shown in FIGS. 12 to 17 are given the same reference numerals, and their detailed description is omitted.
In other words, for an object on a plane-developed surface or virtual plane, the optical flow of the object in the image depends only on the relative speed between the camera and the object, so the object is easy to track; since the speed can be extracted even for a moving object, tracking it is also easy.
If the object can be tracked, changes in its position and shape over time can be used as clues for recognition, image recognition becomes easier, and the object in the image can be replaced with a 3D CG model. For this purpose, the object selection tracking unit 121 selects and extracts a specific object existing in the screen from the image formed by the planar image configuration unit 92 and tracks it. The information is input, through the object recognition unit 122 that recognizes the object's attributes and other various information, to the attribute adding unit 123 that adds the attributes of the object.
On the other hand, from the images formed by the moving object plane image construction unit 108 and the moving object velocity vector extraction unit 109, a moving object selection tracking unit 124 selects and extracts a specific moving object existing in the screen and tracks it. The information is input, through the moving object recognition unit 125 that recognizes the attributes of the moving object and other various information, to the attribute adding unit 126 that adds the attributes of the object. From the outputs of the attribute adding units 123 and 126, the three-dimensional map generation unit 127 generates a three-dimensional map composed of recognized objects.
Hereinafter, a specific example in which the object recognition and tracking processing using the plane developed image is applied to the image analysis of a traffic monitoring video camera will be described with reference to FIGS. 19 to 21.
FIG. 19 is a specific example of an image obtained by a traffic monitoring video camera. FIG. 19 (a) is the monitoring video camera image (perspective image), FIG. 19 (b) is the plane developed image converted by the present invention, and FIG. 19 (c) is a region analysis display showing the objects recognized by the present invention.
FIG. 20 is a flowchart showing the object recognition processing steps in the traffic monitoring video camera. FIG. 21 is a list summarizing the results obtained by the object recognition.
In the examples shown in these drawings, the vehicle type, vehicle color, traffic volume, speed, acceleration, and trajectory of passing vehicles are obtained from the traffic monitoring video camera image by image recognition.
First, as shown in FIG. 19A, the monitoring video camera image obtained by the traffic monitoring video camera is a perspective image (an oblique image), and the size and speed of the target vehicle are not uniform.
This perspective image is digitized and developed on a plane to obtain the plane developed image of the present invention, as shown in FIG. 19 (b). In this plane developed image, the parameters f and θ are determined so that the road surface is developed into a plane. Therefore, on the road surface, the scale is uniform at every position in the image for dimensions such as vehicle width and length (but not in the vehicle height direction), and these can be measured. Image recognition for traffic volume monitoring can therefore be performed using the plane developed image.
Conventionally, vehicle detection and recognition were generally performed on the perspective image (see FIG. 19 (a)) by limiting the vehicle recognition area to a part of the screen and detecting and measuring vehicles passing through it. In the present invention, since the perspective image can be developed in a plane, the entire road surface of the image can be used as the vehicle recognition range by using the plane developed image (see FIG. 19 (b)).
In the example shown in FIG. 19 (b), a moving body (vehicle) captured in the upper part of the plane-developed image moves toward the lower part of the image, which is its traveling direction. Since the size of the vehicle image does not change in the plane-developed image, a plurality of images can be acquired for the same vehicle. For example, a moving body that moves from the upper part to the lower part of the plane-developed image can be acquired as image data for about 30 frames.
By using the plurality of images, detailed image analysis (region analysis) can be performed. For example, the accuracy of identifying the vehicle type can be improved, and accurate image recognition can be performed. FIG. 19 (c) shows the area analysis display.
In a plane-developed image, recognition accuracy can be increased by processing such as addition averaging of the images themselves or of contour images obtained by image processing, which is very effective. Moreover, since the scale is the same at every part of the road surface in the plane-developed image, position information can be acquired easily, and the movement locus of a moving body can be traced along with its recognition. Because the scale is uniform, the position and moving speed of a vehicle (calculated from the number of video frames per second) can be measured anywhere on the road surface, and speed and acceleration can be calculated easily.
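As a minimal sketch of the speed and acceleration calculation described above, and not the implementation of the invention, the following Python assumes that the positions of one vehicle have already been extracted from successive plane-developed frames and expressed in metres on the road plane. The function name `kinematics`, the `track` list, and the rate of 30 frames per second are hypothetical.

```python
from math import hypot

def kinematics(track, fps):
    """Estimate speeds (m/s) and accelerations (m/s^2) from road-plane positions.

    track: list of (x, y) positions in metres, one per video frame (assumed input).
    fps:   number of video frames per second.
    """
    # Uniform scale means plain Euclidean distance between frames is meaningful.
    speeds = [hypot(x1 - x0, y1 - y0) * fps
              for (x0, y0), (x1, y1) in zip(track, track[1:])]
    # Acceleration is the change in speed between consecutive frame intervals.
    accels = [(s1 - s0) * fps for s0, s1 in zip(speeds, speeds[1:])]
    return speeds, accels

# Hypothetical track: the vehicle advances 0.5 m per frame along the road.
track = [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0), (0.0, 1.5)]
speeds, accels = kinematics(track, fps=30)
print(speeds, accels)
```

Because the scale of the plane-developed image is the same everywhere on the road surface, the same function can be applied to a track taken from any part of the image.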
With reference to FIG. 20, the flow of the traffic volume recognition process using the above-described plane development technique will be described in more detail.
First, the captured perspective image is digitized (201 in FIG. 20), and a flat developed image is created from the perspective image (202).
Then, a moving body region is detected in the plane-developed image, and a candidate region is created by comparing the background image with the current image (203). The candidate region is processed by image arithmetic with the background image, small isolated regions are removed for vehicle detection, and the image region of each vehicle candidate is specified by dilating and merging the remaining regions. The background image is updated using a Kalman filter or the like (203).
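The candidate region step can be sketched as follows. This is an illustration under stated assumptions, not the processing of FIG. 20 itself: images are small grayscale grids held as lists of lists, the difference threshold and minimum area are hypothetical values, and the dilation and merging of regions and the Kalman-filter background update mentioned in the text are omitted.

```python
def candidate_regions(current, background, diff_threshold=30, min_area=3):
    """Return bounding boxes (top, left, bottom, right) of changed regions.

    Pixels differing from the background by more than diff_threshold are
    grouped into 4-connected regions; regions smaller than min_area pixels
    are discarded as noise. All parameter values are hypothetical.
    """
    rows, cols = len(current), len(current[0])
    mask = [[abs(current[r][c] - background[r][c]) > diff_threshold
             for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood fill one connected region starting from (r, c).
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                pr, pc = stack.pop()
                pixels.append((pr, pc))
                for nr, nc in ((pr - 1, pc), (pr + 1, pc), (pr, pc - 1), (pr, pc + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            if len(pixels) >= min_area:
                rs = [p[0] for p in pixels]
                cs = [p[1] for p in pixels]
                regions.append((min(rs), min(cs), max(rs), max(cs)))
    return regions

# Hypothetical 5 x 6 frame: one 2 x 2 "vehicle" and one isolated noise pixel.
background = [[100] * 6 for _ in range(5)]
current = [row[:] for row in background]
for r, c in ((1, 1), (1, 2), (2, 1), (2, 2), (4, 5)):
    current[r][c] = 200
regions = candidate_regions(current, background)
print(regions)
```

The isolated pixel is dropped by the area test, which corresponds to the removal of small independent regions described above.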
The identified candidate region is analyzed in detail, for example by changing the threshold values of the image processing (204). In addition, since the movement amount of a vehicle candidate region can be predicted in the plane-developed image, the vehicle region is predicted when its image is acquired, shape corrections and adjustments such as for image size and positional deviation are applied, and the corresponding regions to be used are analyzed and determined (205).
The determined corresponding area is registered in the database together with the position information (206). Thereafter, an addition average image is created using the image registered in the database (207).
Then, using this added average image, the vehicle type is determined (208) and registered in the database together with the recognition result (209).
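The addition average image of steps 207 and 208 can be illustrated with a short sketch. It assumes that the registered images of one vehicle have already been cropped to the same size and aligned, which the text attributes to the shape correction in step 205; the function name `addition_average` and the sample values are hypothetical.

```python
def addition_average(images):
    """Pixel-wise mean of equally sized, already aligned grayscale images."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]

# Three hypothetical 1 x 2 crops of the same vehicle from successive frames.
crops = [[[10, 20]], [[20, 20]], [[30, 20]]]
averaged = addition_average(crops)
print(averaged)
```

Averaging works here because the vehicle keeps the same size in every plane-developed frame, so the crops can be summed without rescaling.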
Various types of information are registered in the database. For example, as shown in FIG. 21, the items necessary for vehicle recognition are registered, such as vehicle ID, passage time (Passed Time), average speed (Speed), acceleration (Acc), vehicle type (Type), and color. Of course, the information is not limited to the items shown in FIG. 21, and other information can be registered. For example, in addition to the above items, a vehicle addition average image, a vehicle movement locus, and the like can be registered.
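One possible shape for such a database entry is sketched below. The field names follow the items listed for FIG. 21, but the class name `VehicleRecord`, the field types, the units, and the sample values are all assumptions made for illustration; the source does not specify a storage format.

```python
from dataclasses import dataclass, field

@dataclass
class VehicleRecord:
    vehicle_id: int        # vehicle ID
    passed_time: str       # passage time (format is an assumption)
    speed: float           # average speed (unit is an assumption, e.g. km/h)
    acc: float             # acceleration (unit is an assumption)
    vehicle_type: str      # vehicle type
    color: str             # vehicle color
    # Optional extras mentioned in the text: movement locus on the road plane.
    trajectory: list = field(default_factory=list)

# Hypothetical entry, not data from the patent.
record = VehicleRecord(1, "12:00:00", 42.0, 0.3, "truck", "white")
record.trajectory.append((0.0, 1.5))
print(record)
```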
To replace an object in an image with a three-dimensional CG model by direct comparison, the methods disclosed in, for example, Japanese Patent Application No. Hei 11-97562, Japanese Patent Application No. 2000-190725, and Japanese Patent Application No. 2002-146874 can be used to recognize, specify, fix, and track the object and replace it with a three-dimensional CG model. In outline, the method and apparatus comprise an information code conversion device that converts object information acquired for an object into an information code registered in advance for that object and transmits or outputs the information code, and a reproduction conversion device that receives or inputs the information code from the information code conversion device and converts it into reproduction object information registered in advance for that information code. More specifically, the apparatus comprises: information input means for inputting information on a required object; a first parts warehouse forming a database that stores information on various objects or their parts created in advance, their attributes, and data obtained by coding that information; an information code conversion device that compares the information input to the information input means with the information stored in the first parts warehouse and selects and outputs the data for the corresponding information; a second parts warehouse forming a database in the same manner as the first parts warehouse; and an information reproduction conversion device that compares the data output from the information code conversion device with the data stored in the second parts warehouse, selects the information for reproducing the corresponding object, and reproduces and outputs the object through required output means based on that information.
Furthermore, one or more objects at arbitrary positions on the image display of video captured from the outside world by one or more cameras are specified by name or attribute, by enclosing them with a mouse, by clicking a single point, or by pointing with a light pen or touch panel. Each specified object is then tracked, including through changes in the relative angle, direction, and distance between the camera and the object, while anything other than the object is excluded. During tracking, the features of the object in each image frame, or the features of each of its components, are detected sequentially in time series. A database storing image data on a wide variety of features is searched sequentially for image frames or feature image data that correspond to the features of the object in each image frame or to the continuous changes in the features of its components. For each time-series change, a reproduced image containing the two-dimensional or three-dimensional shape of the object for which pattern matching was obtained is constructed from the search results. The image frames containing the reproduced images are displayed sequentially, as a moving image or as a series of still images, in a required area of the same image display or of another image display connected via a communication line or the like, with the feature image data matched to a required size reference. If necessary, the name and attribute data accompanying or generated with the reproduced image are also displayed in a required area.
In this way, a 3D map can be constructed by adding attributes prepared in advance for each object to its three-dimensional CG model and then combining and integrating the models of the replaced objects. Furthermore, by texture mapping, a live-action texture can be applied to the three-dimensional CG model of an object.
Further, in the embodiment shown in FIGS. 22 to 27, images can be combined continuously along the moving direction so as to obtain images of the surroundings of moving bodies such as various vehicles. That is, images in the moving direction captured by a camera mounted on a moving vehicle, aircraft, ship, or the like are developed on a plane and continuously combined to form a single image. For example, oblique images of a road are captured one after another along the road, each is developed on a plane, and the results are joined together to obtain a single road image.
Specifically, the arrangement of cameras that photograph the road around a bus 201 as a vehicle will be described with reference to FIG. 22. Six cameras are used in this case, but the number may be increased. A first camera 201A for photographing the front of the bus 201 is provided at the front of the bus 201; a second camera 201B for photographing the rear is provided at the rear of the bus 201; a third camera 201C for photographing the right front of the bus 201 is provided at the rear of the right side of the bus 201; a fourth camera 201D for photographing the left front is provided at the rear of the left side; a fifth camera 201E for photographing the right rear is provided at the front of the right side; and a sixth camera 201F for photographing the left rear is provided at the front of the left side. The ranges photographed by these cameras 201A to 201F are shown as fan shapes. In this way, six road surface images are captured in perspective and are developed into a planar image such as a map by the above formulas (1) and (2); that is, the coordinates (u, v) on the image are converted into coordinates (x, y) on the map.
In this way, the perspective images obtained by photographing the area around the bus 201 are converted into a map-like road image, so that the road surface can be displayed as on a map, as shown in the drawing. Furthermore, if these plane-developed images are combined as far as possible along the traveling direction of the bus 201, a map is completed.
Specific examples are shown in FIGS. 24 and 25. FIG. 24 shows nine oblique images, (1) to (9). FIG. 25 shows these images developed into map-like planar images by the above formulas (1) and (2) and joined into a single road. This is an example of plane development of oblique images taken with a normal camera and of combination of the plane-developed images.
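The conversion of image coordinates (u, v) into map coordinates (x, y) by formulas (1) and (2) can be sketched in Python as follows. Formulas (1) and (2) contain the angle β, which the source defines geometrically rather than as a function of v. The step that recovers β from v below is therefore an assumption of this sketch: it takes the origin of the CCD plane to be the image of the road point at distance h from directly below the camera (where y = 0 and β = π/4), a choice that is consistent with formula (1). The function names and the numerical values are hypothetical.

```python
import math

def beta_from_v(v, f, theta):
    """Angle between the road and the ray through CCD row v (assumed relation).

    Assumes v = f * (tan(pi/4 - theta) - tan(beta - theta)), i.e. v = 0 is the
    image of the road point at distance h from directly below the camera.
    """
    return theta + math.atan(math.tan(math.pi / 4 - theta) - v / f)

def develop_point(u, v, f, h, theta):
    """Map CCD coordinates (u, v) to road-plane coordinates (x, y)."""
    beta = beta_from_v(v, f, theta)
    if not 0.0 < beta < math.pi / 2:
        raise ValueError("ray does not meet the road plane in front of the camera")
    scale = h * math.cos(beta - theta) / (f * math.sin(beta))
    y = v * math.sqrt(2) * math.cos(math.pi / 4 - theta) * scale   # formula (1)
    x = u * scale                                                  # formula (2)
    return x, y

# Hypothetical camera: f = 800 pixels, h = 5 m, theta = 0.5 rad.
x, y = develop_point(40.0, 100.0, f=800.0, h=5.0, theta=0.5)
print(x, y)
```

Applying `develop_point` to every pixel of each oblique image, and then placing the results on a common (x, y) grid, gives the map-like images that are joined together. Rows at or above the horizon have no road-plane point, so the sketch raises an error for them.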
At this time, any moving body image that is not needed for image formation can be deleted. When a plurality of images that partially overlap are combined, combining them while avoiding the moving body images generates a combined image of stationary objects only. For example, when a vehicle or the like appears on the road, an image of a long road showing only the road can be obtained by joining plane-developed images while avoiding the vehicle or the like.
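One simple way to obtain a combined image of stationary objects only is a per-pixel median over overlapping plane-developed images. This is a substitute technique chosen for illustration; the source only says that moving body images are avoided and does not name a method. The sketch assumes the images are already registered on the same road-plane grid, with `None` marking pixels an image does not cover.

```python
from statistics import median

def combine_static(images):
    """Per-pixel median of aligned images; None means 'no data at this pixel'."""
    rows, cols = len(images[0]), len(images[0][0])
    combined = []
    for r in range(rows):
        row = []
        for c in range(cols):
            samples = [img[r][c] for img in images if img[r][c] is not None]
            # A vehicle that covers a pixel in a minority of frames is outvoted.
            row.append(median(samples) if samples else None)
        combined.append(row)
    return combined

# Hypothetical 1 x 3 strips: road brightness about 50, a vehicle (200) in one frame.
frames = [[[50, 200, None]], [[50, 50, 52]], [[50, 50, 50]]]
mosaic = combine_static(frames)
print(mosaic)
```

The median only suppresses a moving body where that pixel is seen free of it in more than half of the overlapping images, so enough overlap between successive frames is needed.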
Next, some application examples of the present invention will be described.
That is, the plane-developed image can handle a road surface, sea surface, lake surface, river surface, ground surface, vertical wall surface, a vertical virtual plane formed by objects arranged in the same plane, building wall and floor surfaces, a ship deck surface, and airport facility surfaces such as runways and taxiways. In other words, when an image taken with a normal camera is developed on a plane, the object of the plane development may be any of these surfaces.
The applied equipment may also be a vehicle. For a land vehicle such as a bus, the surfaces are the surrounding road surface, building surfaces, utility pole array planes, roadside tree array planes, guardrail array planes, and the like; for a marine vehicle such as a ship, they are the sea surface, the ship's deck, wall surfaces, and the like; and for an aircraft, they are the runway, the ground surface, and the like. This enables an omnidirectional display of the whole surroundings or a display of a target area.
That is, as shown in FIGS. 22 to 25, images of the surrounding road surface, building surfaces, utility pole array planes, roadside tree array planes, guardrail array planes, and the like taken by normal cameras such as the camera 201A attached to a land vehicle such as the bus 201 are developed in plan view, so that these surfaces can be displayed in all directions around the vehicle. Likewise, although not shown in the figures, an omnidirectional display can be made of the sea surface around a marine vehicle such as a ship, of the ship's deck and wall surfaces, and of the runway and ground surface for an aircraft.
Furthermore, as another application example, the invention can be applied to building structures. As shown in FIGS. 26 and 27, the inside of a building is photographed with a normal camera, and flat portions such as the floor surface and wall surfaces are displayed by plane development and plane combination.
That is, FIG. 26 shows oblique indoor images taken with a normal camera, here 16 images (1) to (16). FIG. 27 shows the result of converting these images into planar images by applying the above formulas (1) and (2) and joining them together. In this way, an image that cannot actually be captured can be obtained: the floor surface and wall surfaces of a room in a building are developed, with the surrounding wall surfaces unfolded around the floor surface.
Furthermore, three-dimensional maps can also be created. With a plurality of cameras mounted on a moving vehicle, aircraft, ship, or the like, not only the road surface, ground surface, and water surface are photographed continuously, but also objects having vertical surfaces such as building walls, or virtual vertical planes in which a plurality of utility poles, guardrails, or the like are regularly arranged in a plane. The plane-developed images are joined and extended in the moving direction, and at the same time a wide-range plane development including the vertical surfaces is created, whereby a three-dimensional map is created.
That is, images of the road surface, ground surface, and water surface taken with a video camera mounted on a moving body such as a vehicle, aircraft, or ship are developed in plan view and combined by an appropriate method into a combined plane development, which is then extended in the direction of movement to create a map. In addition, by continuously photographing objects having vertical surfaces such as building walls, or virtual vertical planes in which a plurality of utility poles, guardrails, or the like are regularly arranged in a plane, the plane-developed images are joined and extended in the moving direction while a wider-range plane development including the vertical surfaces is created at the same time, so that a three-dimensional map is created.

Since the present invention is configured as described above, an image that cannot actually be captured can be obtained by using the present method and apparatus: an oblique image is converted into a planar image using formulas (1) and (2), so the range of application is wide.
In addition, by generating plane-developed images from a plurality of videos of the same point taken from different positions by a plurality of cameras with overlapping fields of view, parallax can be detected within the plane-developed images of the overlapping portion. Conventionally, three-dimensional data was always obtained by detecting parallax in the original image, that is, the perspective image itself; in the present invention, parallax is detected after plane development, and this new method yields three-dimensional data with good positional accuracy.
Moreover, direct three-dimensional shape data can thereby be obtained easily and with higher accuracy than before. The parallax obtained from a plurality of videos with overlapping fields of view often has the same meaning as optical flow in a stationary coordinate system within a moving image, but it is particularly effective when the object changes with time or when the accuracy of the three-dimensional coordinates is to be increased.
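How parallax in plane-developed images relates to height can be illustrated with a small geometric sketch. It is not taken from the source and rests on stated assumptions: two pinhole cameras at the same height h above a flat road, separated horizontally by a baseline B, and a point at height z (with z below h). Plane development places such a point where its viewing ray meets the road, so the two developed images disagree by d = B · z / (h − z), which gives z = h · d / (B + d). All names and values are hypothetical.

```python
import math

def project_to_road(point, camera):
    """Where the ray from the camera through a 3D point meets the road plane."""
    px, py, pz = point
    cx, cy, ch = camera
    s = ch / (ch - pz)          # ray stretch factor; requires pz < ch
    return cx + (px - cx) * s, cy + (py - cy) * s

def height_from_parallax(disparity, baseline, camera_height):
    """Height above the road from the shift between two plane-developed images."""
    return camera_height * disparity / (baseline + disparity)

# Hypothetical setup: cameras 2 m apart at 5 m height, point 1 m above the road.
cam_a, cam_b = (0.0, 0.0, 5.0), (2.0, 0.0, 5.0)
pa = project_to_road((1.0, 10.0, 1.0), cam_a)
pb = project_to_road((1.0, 10.0, 1.0), cam_b)
disparity = math.hypot(pa[0] - pb[0], pa[1] - pb[1])
z = height_from_parallax(disparity, baseline=2.0, camera_height=5.0)
print(disparity, z)
```

A point lying on the road itself projects to the same place in both images, so zero parallax corresponds to zero height, and only deviations from the plane produce a measurable shift.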
Furthermore, a desired moving image can be reconstructed by transmitting and receiving the plane-developed image information and the camera position information, so moving image data can be transmitted and received at high speed with the data volume kept as small as possible. This is particularly effective for video transmission over narrow-band telephone lines, Internet lines, and the like.
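Reconstructing a perspective view from a plane-developed image amounts to mapping road-plane coordinates (x, y) back to image coordinates (u, v), which is what the inverse formulas (3) and (4) stated in the claims express. The following is a minimal sketch of that mapping, assuming the symbol definitions given with those formulas; in particular β is computed as the angle whose tangent is h / (h + y). The function name and the camera parameters are hypothetical.

```python
import math

def road_to_image(x, y, f, h, theta):
    """Map road-plane coordinates (x, y) to CCD coordinates (u, v)."""
    # beta: angle between the road and the line from the camera to the point
    # located y beyond the origin, which lies h ahead of directly below the camera.
    beta = math.atan2(h, h + y)
    c = math.cos(beta - theta)
    v = y * f * math.sin(beta) / (
        math.sqrt(2) * h * math.cos(math.pi / 4 - theta) * c)   # formula (3)
    u = x * f * math.sin(beta) / (h * c)                         # formula (4)
    return u, v

# Hypothetical receiver-side camera: f = 800 pixels, h = 5 m, theta = 30 degrees.
theta = math.radians(30)
u, v = road_to_image(1.5, 10.0, f=800.0, h=5.0, theta=theta)
print(u, v)
```

On the receiving side, each pixel of the desired view would be filled by looking up the plane-developed image at the (x, y) that maps to it, using whatever camera position and angle are wanted.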

Claims

A plane development image processing method for a planar object image such as a road surface, characterized in that information necessary for plane development is read from perspective images obtained from a normal camera, the images are developed into plan views, and the plan views are combined and joined into a single large development view.

A plane development image processing method for a planar object image such as a road surface, characterized in that information necessary for plane development is read from perspective images obtained from a normal camera, the images are developed into plan views using the following formulas (1) and (2), and the plan views are combined and joined into a single large development view.
y = v · √2 · h · cos(π/4 − θ) · cos(β − θ) / (f · sin β)   (1)
x = u · h · cos(β − θ) / (f · sin β)   (2)
Where θ is the angle formed by the optical axis of the camera and the road surface, f is the focal length of the camera, h is the height of the camera, β is the angle formed between the road surface and the line segment connecting the camera to the point at distance h + y from directly below the camera, v is the vertical coordinate from the origin on the CCD plane, which is the projection plane of the camera, u is the horizontal coordinate from the origin on the CCD plane, y is the distance or coordinate measured further along the optical axis direction on the road surface from an origin located at distance h from directly below the camera, and x is the lateral distance or coordinate on the road surface.

The plane development image processing method for a planar object image such as a road surface according to claim 1 or 2, wherein, for an image of a real scene including a plane taken from an oblique angle, the portion that is originally a plane is developed by mathematical calculation and displayed as a planar image proportional to the plane in the real scene.

The flat developed image processing method for a planar object image such as a road surface according to any one of claims 1 to 3, wherein a plurality of flat developed images are combined and expressed as a single large flat developed image.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 4, wherein a plurality of input images are developed on a plane and combined into a single image so that the entire target area is displayed, and, if necessary, the input image corresponding to a displayed location is also displayed directly at the same time.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 5, wherein images in the moving direction captured from a moving body are developed on a plane and continuously combined into a single image.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 6, wherein moving body images unnecessary for image formation are deleted, and the images are combined while avoiding the moving body images to generate a combined image of stationary objects only.

8. The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 7, wherein, in the process of obtaining the vanishing point in each acquired plane image, the image is shifted for display so that the position of the vanishing point of each image is fixed, thereby stabilizing an image that shakes due to camera shake or the like.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 8, wherein, in a moving image composed of different planes obtained by plane development over the necessary range, the movement amount per unit time of each minute area is obtained by an optical flow method, and a single planar image is separated by extracting the same components from the component distribution map.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 9, wherein an optical flow distribution map is generated in a planar moving image obtained by plane conversion and unevenness of the plane is detected from minute differences as deviation from the plane, or parallax is detected by comparing plane images obtained from different angles of view in the planar moving image and the uneven components of the plane are detected from the component distribution, and a corrected plan view is generated that incorporates the detected unevenness values as the deviation from the plane of each point of the original plan view.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 9, wherein a plurality of plane images developed into plan views are compared by a method such as a correlation method or a matching method, the movement amount of each corresponding small region on the plurality of plane images of a road surface or the like is obtained by a parallax method, an optical flow method, or the like, and three-dimensional data such as the unevenness of the road surface is detected from the distribution of those components, or a corrected plan view is generated that incorporates the detected three-dimensional unevenness values as the deviation from the plane of each point of the original plan view.

The plane development image processing method for a planar object image such as a road surface according to claim 10 or 11, wherein the average optical flow value of continuous plane-developed images or the movement distance of matched corresponding positions is obtained, and the movement distance, movement speed, and movement direction of the target plane, or those of the photographing camera, are obtained from that value.

13. The plane development image processing method for a planar object image such as a road surface according to claim 1, wherein the texture of the object plane in a separated single plane is pasted onto the corresponding object plane in a CG (computer graphics) image or map image of that place, so that the live-action image is incorporated into the CG image or map image and displayed as a plane, or inversely converted and displayed as a perspective image.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 2 to 13, wherein, in the formulas (1) and (2) according to claim 2, f and h are first given, and, when parallel lines of the object appear in the image as intersecting lines that meet at an intersection point, θ is selected so that these intersecting lines become parallel when developed on a plane.

The plane development image processing method for a planar object image such as a road surface according to claim 14, wherein the selected θ is finely adjusted.

The plane development image processing method for a planar object image such as a road surface according to any one of claims 2 to 15, wherein a parallel line portion of a live-action image is extracted from the image, the distance between a plane a, which is the plane parallel to the target plane that contains the intersection point of those lines, and a plane b, which is the plane parallel to the target plane that contains the optical axis point, is denoted d, the virtual focal length is denoted f, and θ is calculated from the ratio of d to f as θ = arctan(d / f).

The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 16, wherein a plurality of simultaneous images of the same point are acquired with a plurality of normal cameras installed at different locations, parallax is detected by comparing the plane-developed images of those images of the same point, and a three-dimensional shape is generated.

A plane development image processing method for a planar object image such as a road surface according to any one of the above claims, wherein, based on a plane development view generated by converting a perspective image including a planar image into a plan view, on a single large plane development view generated by developing images including a plurality of planar images taken from a plurality of directions into plan views and then overlapping and combining their corresponding points, or on a plan-view-like CG (computer graphics) image or map, a virtual perspective image viewed from an arbitrary viewpoint is generated by the inverse transformation formulas of the formulas (1) and (2) according to claim 2, or a moving image from a virtual moving camera viewpoint is generated by performing this processing continuously.

The flat developed image processing method for a planar object image such as a road surface according to claim 18, wherein the inverse transformation formula is the following formulas (3) and (4).
v = y · f · sin β / (√2 · h · cos(π/4 − θ) · cos(β − θ))   (3)
u = x · f · sin β / (h · cos(β − θ))   (4)
Where h is the height of the camera from the road surface, θ is the angle formed by the optical axis of the camera and the road surface, f is the focal length of the camera, β is the angle formed between the road surface and the line connecting the camera lens to the point advanced by y from the point at distance h ahead of directly below the camera, x is the coordinate perpendicular to the line obtained by orthogonally projecting the optical axis of the camera onto the road surface, that is, the lateral coordinate as seen from the camera, y is the coordinate in the optical axis direction whose origin is the point advanced by h from directly below the camera, v is the vertical coordinate on the CCD plane, which is the projection plane of the camera, and u is the horizontal coordinate on the CCD plane.

20. The plane development image processing method for a planar object image such as a road surface according to any one of claims 1 to 19, wherein using the plane-developed image enables various kinds of processing on the image, such as measurement processing and recognition processing including image recognition.

21. The planar development image processing method for roads or the like according to any one of claims 1 to 20, wherein the plane-developed image is of a road surface, sea surface, lake surface, river surface, ground surface, vertical wall surface, a vertical virtual plane formed by objects arranged in the same plane, building wall and floor surfaces, a ship deck surface, or an airport facility surface such as a runway or taxiway.

The planar development image processing method for roads or the like according to any one of claims 1 to 21, wherein the plane-developed images acquired by a moving body are, for a land moving body, of the surrounding road surface, building surfaces, utility pole array planes, roadside tree array planes, guardrail array planes, and the like; for a marine moving body, of the sea surface, the ship deck, wall surfaces, and the like; and for an air moving body, of the runway, the ground surface, and the like.

The plane development image processing of a road or the like according to any one of claims 1 to 22, wherein the plane development image is a plane development display and a plane combination display of a plane portion such as a floor surface or a wall surface of a building. Method.

The planar development image processing method for roads or the like according to any one of claims 1 to 23, wherein a plurality of video input devices continuously photograph, while moving, the road surface, ground surface, or water surface and also objects having vertical surfaces such as building walls or virtual vertical planes in which a plurality of utility poles, guardrails, or the like are regularly arranged in a plane, the plane-developed images are joined and extended in the moving direction, and a wide-area plane development view including the vertical surfaces is created at the same time, thereby creating a three-dimensional map.

A reverse development image conversion processing method for a planar object image such as a road surface, characterized in that, in a video including planes in a plurality of directions, not only a plane in one direction but also the planar images in the plurality of directions are converted into a plurality of plan views, or the plan views in the plurality of directions are combined three-dimensionally into a three-dimensional plane development view, and a perspective image viewed from an arbitrary viewpoint is generated from it by inverse transformation.

A plane development image processing apparatus for a planar object image such as a road surface, characterized by comprising: a video input unit that acquires a perspective image; a video playback unit that plays back the oblique video shot by the video input unit; an image correction unit that corrects the shooting rotation angle and the like of the video input unit; a spherical aberration correction unit that corrects spherical aberration and the like of the video input unit; a video development plane processing unit that converts the perspective image into a plane development view; a development image combination unit that combines the plane-developed videos; and a display unit that displays the combined image.

27. The plane development image processing apparatus for a planar object image such as a road surface, further comprising: an optical flow map generation unit that generates and illustrates the optical flow of the developed video; and an optical flow extraction unit that extracts only the target optical flow from the optical flow map.

28. The flat developed image processing apparatus for a planar object image such as a road surface according to claim 26 or 27, further comprising a parallax extracting unit that detects parallax from video at the same point from different positions.

29. A flat developed image processing apparatus for a planar object image such as a road surface according to any one of claims 26 to 28, further comprising a developed image comparing unit that compares a plurality of developed images at the same point.

30. The planar development image processing apparatus for a planar object image such as a road surface according to any one of claims 26 to 29, further comprising: an image comparison unit that extracts road surface unevenness by calculation; and a corrected plane generation unit that takes the unevenness into account.
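One way such unevenness could be calculated is from the parallax between two developed images. The sketch below is an assumption for illustration, not the method stated in the claims. It considers two cameras at the same height, separated by a known baseline, and works in one dimension along the baseline. A point lying exactly on the road plane lands at the same place in both developed images; a point raised above the plane lands at two different places, and the offset gives its height. All function and parameter names are hypothetical.

```python
def apparent_ground_position(cam_x, cam_height, point_x, point_z):
    """Where a point at height point_z appears in a developed image.

    Plane development assumes every pixel lies on the road, so a raised
    point is pushed away from the camera along the viewing ray until
    that ray meets the road plane.
    """
    scale = cam_height / (cam_height - point_z)
    return cam_x + (point_x - cam_x) * scale

def height_from_parallax(disparity, baseline, cam_height):
    """Recover point height from the offset between two developed images.

    disparity : position in image 1 minus position in image 2
    baseline  : camera 2 position minus camera 1 position
    Derived from disparity = baseline * z / (cam_height - z).
    """
    return disparity * cam_height / (baseline + disparity)

# Illustrative values: cameras 2 m high and 1 m apart, bump 0.1 m high.
g1 = apparent_ground_position(0.0, 2.0, 5.0, 0.1)
g2 = apparent_ground_position(1.0, 2.0, 5.0, 0.1)
bump_height = height_from_parallax(g1 - g2, 1.0, 2.0)
```

A zero disparity returns a height of zero, which matches the case of a perfectly flat road.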

31. A planar development image processing apparatus for a planar object image such as a road surface, comprising: a video input unit that generates video with a camera; an input image display unit that stabilizes and displays the input image; a video recording unit that records the input video; a video playback unit that reproduces the recorded video; an image correction unit that corrects, by coordinate transformation, image distortion caused by the lens, such as spherical aberration, and corrects the camera rotation angle so that the target plane image is aligned with the plane in the image; a video development plane processing unit that mathematically generates a plane development view from the perspective image; an optical flow map generation unit that generates and illustrates an optical flow of the developed video; an optical flow extraction unit that extracts only a target optical flow from the optical flow map; a parallax extraction unit that detects parallax from videos of the same point captured from different positions; an object image processing unit that deletes unnecessary images while leaving necessary objects, and inserts new video; a developed image combining unit that combines the individually processed plane-developed images to generate a single continuous image; a developed image display unit that displays the combined image; a recording unit that records it; an arbitrary viewpoint image generation unit that performs reverse conversion to an arbitrary viewpoint; an arbitrary viewpoint image display unit that displays the resulting image; a developed image comparison unit that compares a plurality of developed images of the same point; an image comparison unit that extracts road surface unevenness by calculation; and a corrected plane generation unit that takes the unevenness into account, the apparatus being composed of these units as appropriate.

32. A reverse development image conversion processing apparatus for a planar object image such as a road surface, characterized in that the planar development image processing apparatus according to any one of claims 26 to 31 further comprises: an arbitrary viewpoint image generation unit that performs reverse conversion to an arbitrary viewpoint; and an arbitrary viewpoint image display unit that displays the resulting image.
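The reverse conversion to an arbitrary viewpoint can be sketched as a projection of road-plane points into the image of a virtual camera. The sketch assumes an ideal pinhole model and a virtual camera placed above the origin of the developed plane, looking along the +Y direction with a chosen height and depression angle. The function name `ground_to_pixel` and its parameters are hypothetical and do not appear in the claims.

```python
import math

def ground_to_pixel(X, Y, focal_px, cam_height, depression_rad):
    """Project a road-plane point into a virtual camera (illustrative).

    X, Y           : position on the developed plane (X right, Y ahead),
                     measured from the point directly below the camera
    focal_px       : focal length of the virtual camera in pixels
    cam_height     : height of the virtual camera above the road
    depression_rad : downward tilt of its optical axis, in radians

    Returns pixel offsets (u, v) from the image center (u right, v down),
    or None if the point is not in front of the virtual camera.
    """
    s, c = math.sin(depression_rad), math.cos(depression_rad)
    # Distance of the point along the optical axis.
    depth = Y * c + cam_height * s
    if depth <= 0:
        return None
    u = focal_px * X / depth
    v = focal_px * (cam_height * c - Y * s) / depth
    return (u, v)
```

Choosing a depression angle of 90 degrees gives a straight-down view, in which a point 1 m to the right of a camera 10 m high with an 800 pixel focal length lands about 80 pixels right of the image center. Smaller angles reproduce an oblique perspective view from the chosen height.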

33. A planar development image processing apparatus for a planar object image such as a road surface, comprising: a video input unit that acquires a perspective image; a plane decomposition unit that decomposes the perspective image captured by the video input unit into one or more plane images constituting a three-dimensional space; a position detection unit that detects the three-dimensional position of the video input unit; and a display unit that reconstructs and displays a three-dimensional image from the plane images decomposed by the plane decomposition unit and the three-dimensional position of the video input unit detected by the position detection unit.

34. The planar development image processing apparatus for a planar object image such as a road surface according to claim 33, further comprising a position notation unit that marks, in a plane image decomposed by the plane decomposition unit, the three-dimensional position of the video input unit detected by the position detection unit.

35. The planar development image processing apparatus for a planar object image such as a road surface according to claim 34, wherein, when the video input unit moves, the position notation unit continuously marks the three-dimensional position of the moving video input unit in the plane image decomposed by the plane decomposition unit.

36. The planar development image processing apparatus for a planar object image such as a road surface according to any one of claims 33 to 35, wherein, when the display unit that reconstructs the three-dimensional image is disposed apart from the plane decomposition unit and the position detection unit, the apparatus comprises transmission/reception means for transmitting the one or more plane image signals and the three-dimensional position signal of the video input unit from the plane decomposition unit and the position detection unit to the display unit.