JP4776983B2

JP4776983B2 - Image composition apparatus and image composition method

Info

Publication number: JP4776983B2
Application number: JP2005161540A
Authority: JP
Inventors: 学近藤; 和美中山; 英雄木ノ下; 泰弘道正; 敏雄源間; 秀史保坂; 玲児高幣; 信治武田
Original assignee: Tokyo Electric Power Co Inc; Kozo Keikaku Engineering Inc
Current assignee: Tokyo Electric Power Co Inc; Kozo Keikaku Engineering Inc
Priority date: 2005-06-01
Filing date: 2005-06-01
Publication date: 2011-09-21
Anticipated expiration: 2025-06-01
Also published as: JP2006338284A

Description

The present invention relates to an image composition apparatus and an image composition method for combining a plurality of images photographed with, for example, a digital still camera.

One method of image composition places high-resolution images, taken at a different angle of view, onto a wide-area image. The input conditions in this case include that the wide-area image was taken under suitable shooting conditions and that the high-resolution images were taken from the same position as far as possible.

As a technique for this kind of image composition, Patent Document 1 proposes a structure inspection apparatus comprising: imaging means that captures an inspection object as a plurality of divided images; image data storage means that stores the captured image data with the position information and shooting information corresponding to each image attached as additional information; and image data selection and display means that joins the image data of a desired inspection position and its surroundings and displays them as stitched image data. An operator inspects the degree of deterioration of the structure on the displayed stitched image data.
Patent Document 1: Japanese Patent Laid-Open No. 11-132961

However, the apparatus disclosed in Patent Document 1 provides no specific means for joining the plurality of image data. As a result, the whole image of the photographing object obtained by combining enlarged images, taken by dividing the object into parts, cannot be made a high-definition image.

The present invention has been made in view of this situation, and its object is to provide an image composition apparatus and an image composition method capable of solving the above problem.

The image composition apparatus of the present invention obtains a whole image of a photographing object by combining enlarged images taken by dividing the object into parts. It comprises: whole orthographic projection means that orthographically projects the whole image, which is taken with a widened angle of view and has a tilt, based on rectangle information (the pixel coordinates of the four corners of a rectangle in the whole image) and on the vertical and horizontal dimensions; color correction means that analyzes the color components of the whole image and the enlarged image and shifts the color of the enlarged image toward that of the whole image; position detection means that detects the position, on the whole image, of the enlarged image taken with a narrowed angle of view; enlarged orthographic projection means that detects the tilt of the enlarged image from the position information detected by the position detection means and orthographically projects the enlarged image based on that tilt; and image composition means that combines the enlarged images based on the position information to obtain an orthographically projected whole composite image. The color correction means calculates the average of each of the RGB values of the enlarged image, calculates the average RGB values of the background region of the enlarged image's area (a part of the area of the whole image), extracts the color difference as the difference between the two sets of averages, and shifts the RGB values of the enlarged image by that color difference. The position detection means either obtains the position information based on the shifted RGB values, or moves and/or rotates the enlarged image within a fixed range while also enlarging or reducing its size, and obtains the position information that minimizes the squared error between the RGB values of the enlarged image and those of the whole image.
The whole orthographic projection means may perform the orthographic projection of the whole image based on the rectangle information and the vertical and horizontal dimensions of the photographing object; the position detection means may obtain the position of the enlarged image on the whole image by template matching; the enlarged orthographic projection means may perform the orthographic projection of the enlarged image based on scale information indicating the size ratio of the enlarged image to the whole image, together with the rectangle information and the vertical and horizontal dimensions of the whole image; and the image composition means may combine the enlarged images using the scale information.
The whole orthographic projection means may also use the perspective three-point problem to obtain, from the rectangle information, three-dimensional information on the wall surface of the photographing object with the lens center of the photographing means as the origin; and the position detection means may reduce the enlarged image to the scale of the whole image based on focal length information measured from the lens center of the photographing means, and calculate the position of the reduced image on the whole image by searching the pixel information of each image.
The image composition method of the present invention obtains a whole image of a photographing object by combining enlarged images taken by dividing the object into parts. The method orthographically projects the whole image, which is taken with a widened angle of view and has a tilt, based on rectangle information (the pixel coordinates of the four corners of a rectangle in the whole image) and on the vertical and horizontal dimensions; analyzes the color components of the whole image and the enlarged image and shifts the color of the enlarged image toward that of the whole image; detects the position, on the whole image, of the enlarged image taken with a narrowed angle of view; detects the tilt of the enlarged image from the detected position information and orthographically projects the enlarged image based on that tilt; and combines the enlarged images based on the position information to obtain an orthographically projected whole composite image. For the color shift, the method calculates the average of each of the RGB values of the enlarged image, calculates the average RGB values of the background region of the enlarged image's area (a part of the area of the whole image), extracts the color difference as the difference between the two sets of averages, and shifts the RGB values of the enlarged image by that color difference. For the position, the method either obtains the position information based on the shifted RGB values, or moves and/or rotates the enlarged image within a fixed range while also enlarging or reducing its size, and obtains the position information that minimizes the squared error between the RGB values of the enlarged image and those of the whole image.
In the method, the orthographic projection of the whole image may be performed based on the rectangle information and the vertical and horizontal dimensions of the photographing object; the position of the enlarged image on the whole image may be obtained by template matching; the orthographic projection of the enlarged image may be performed based on scale information indicating the size ratio of the enlarged image to the whole image, together with the rectangle information and the vertical and horizontal dimensions of the whole image; and the enlarged images may be combined using the scale information.
The method may also use the perspective three-point problem to obtain, from the rectangle information, three-dimensional information on the wall surface of the photographing object with the lens center of the photographing means as the origin, reduce the enlarged image to the scale of the whole image based on focal length information measured from the lens center of the photographing means, and calculate the position of the reduced image on the whole image by searching the pixel information of each image.
In the present invention, the whole orthographic projection means orthographically projects the whole image based on the rectangle information (the pixel coordinates of the four corners of a rectangle in the whole image) and the vertical and horizontal dimensions, obtained from the whole image that was taken with a widened angle of view and has a tilt; the position detection means detects the position, on the whole image, of the enlarged image taken with a narrowed angle of view; the enlarged orthographic projection means detects the tilt of the enlarged image from that position information and orthographically projects the enlarged image based on the tilt; and the composition means combines the enlarged images based on the position information to obtain an orthographically projected whole image.

According to the present invention, the whole image is orthographically projected by the whole orthographic projection means based on the rectangle information (the pixel coordinates of the four corners of a rectangle in the whole image) and the vertical and horizontal dimensions, obtained from the whole image that was taken with a widened angle of view and has a tilt. The position detection means detects the position, on the whole image, of the enlarged image taken with a narrowed angle of view; the enlarged orthographic projection means detects the tilt of the enlarged image from that position information and orthographically projects the enlarged image based on the tilt; and the composition means combines the enlarged images based on the position information to obtain an orthographically projected whole image. Consequently, the whole image of the photographing object obtained by combining enlarged images, taken by dividing the object into parts, can be made a high-definition image.

In the present embodiment, the whole orthographic projection means orthographically projects the whole image based on the rectangle information (the pixel coordinates of the four corners of a rectangle in the whole image) and the vertical and horizontal dimensions, obtained from the whole image that was taken with a widened angle of view and has a tilt. The position detection means detects the position, on the whole image, of the enlarged image taken with a narrowed angle of view; the enlarged orthographic projection means detects the tilt of the enlarged image from that position information and orthographically projects the enlarged image based on the tilt; and the composition means combines the enlarged images based on the position information to obtain an orthographically projected whole image. In this way, the whole image of the photographing object obtained by combining enlarged images, taken by dividing the object into parts, is made a high-definition image.

Embodiments of the present invention are described below.
FIG. 1 is a block diagram showing an embodiment of the image composition apparatus of the present invention, and FIGS. 2 to 10 are diagrams for explaining the image composition method performed by the image composition apparatus of FIG. 1.

As shown in FIG. 1, the image composition apparatus 10 combines enlarged images, taken by a digital camera 20 by dividing the photographing object into parts, to obtain a whole image of the object. It comprises whole orthographic projection means 11, position detection means 12, enlarged orthographic projection means 13, color correction means 14, and composition means 15.

The whole orthographic projection means 11 orthographically projects the whole image based on rectangle information, namely the pixel coordinates of the four corners of a rectangle in the whole image, and on the vertical and horizontal dimensions. The whole image is a single image taken with a widened angle of view, and it has a tilt. The projection uses the rectangle information together with the vertical and horizontal dimensions of the photographing object. The whole orthographic projection means 11 also uses the perspective three-point problem to obtain, from the rectangle information, three-dimensional information on the wall surface of the photographing object with the digital camera 20 as the origin. The orthographic projection performed by the whole orthographic projection means 11 is described in detail later.

The position detection means 12 detects the positions, on the whole image, of the n enlarged images taken with a narrowed angle of view. It obtains the position of each enlarged image on the whole image by template matching: it reduces the enlarged image to the scale of the whole image based on the focal length information from the digital camera 20, and calculates the position of the reduced image on the whole image by searching the pixel information of each image. It also moves and/or rotates the enlarged image within a fixed range, enlarging or reducing its size at the same time, to obtain the position information that minimizes the difference between the enlarged image and the whole image. The position detection performed by the position detection means 12 is described in detail later.

The enlarged orthographic projection means 13 detects the tilt of the enlarged image from the position information detected by the position detection means 12, and orthographically projects the enlarged image based on that tilt. It also obtains the scale information of the enlarged image relative to the whole image. The orthographic projection performed by the enlarged orthographic projection means 13 is described in detail later.

The color correction means 14 analyzes the color components of the whole image and the enlarged image, and shifts the color of the enlarged image toward that of the whole image. The color correction performed by the color correction means 14 is described in detail later.

The composition means 15 combines the enlarged images based on the position information detected by the position detection means 12 to obtain an orthographically projected whole image.

Next, an outline of the image composition method is described.
First, the digital camera 20 photographs the target wall surface as one whole image, as shown in FIG. 2(a), and as n enlarged images, as shown in FIGS. 2(b) and 2(c); these serve as the input images.

Photographing with the digital camera 20 is subject to the following conditions.
(Prerequisites for the captured images)
1. The whole image (one image) and the divided enlarged images (n images) are all taken from the same place.
2. The focal length at the time of shooting is known for the whole image (one image) and for every divided enlarged image (n images).
3. The actual dimensions of a rectangular part included in the whole image are known.
Recent digital cameras record the focal length in the captured image file, so the focal length information can be obtained from that file.

Next, the whole image is orthographically projected as shown in FIG. 2(e), based on the rectangle information in the whole image of FIG. 2(d) (the pixel coordinates and the actual vertical and horizontal dimensions). Here, the perspective three-point problem can be used to obtain, from the rectangle information, three-dimensional information on the wall surface with the digital camera 20 as the origin.

Next, as shown in FIG. 2(f), the position of each enlarged image in the whole image is obtained. The position (offset) of the captured enlarged image on the whole image can be obtained by template matching.

Next, as shown in FIG. 2(g), the enlarged image is orthographically projected using the orthographic projection of FIG. 2(e) and the alignment result of FIG. 2(f), and its position and scale information relative to the whole image are obtained on the enlarged (high-definition) image. Specifically, performing the orthographic projection with the three-dimensional information of FIG. 2(e), based on the offset of each enlarged image relative to the whole image obtained in FIG. 2(f), yields the position in the whole image and the actual dimensions on the enlarged image.

The orthographic projection method is as follows.
To create an orthographically projected image, the three-dimensional coordinates of the wall surface are obtained from the positions of the four right-angled corners of the wall surface using the perspective three-point problem. The perspective three-point problem is a method of three-dimensional recognition using reference points: when the coordinates in three-dimensional space of points in an image are known, using those points as references can make it possible to reconstruct the three-dimensional data.

In the present embodiment, therefore, as shown in FIG. 3, the distances la, lb, and lc from the viewpoint O to three points A, B, and C of the target object are solved as unknown parameters from an image of the wall surface photographed obliquely, and the three-dimensional coordinates of the viewpoint O and the direction of the line of sight are obtained.

Obtaining a solution, however, requires that the following be known.
1. The lengths of sides AB and BC and of diagonal CA (the dimensions of the right-angled wall surface being photographed)
2. The focal length of the digital camera 20 (the distance between the image plane and the viewpoint O)
3. The angles θab, θbc, and θca subtended at the viewpoint O by the sides of △ABC (these can be calculated by the law of cosines from the coordinates of points a, b, and c on the image plane)
Because these three items are all known, the three-dimensional coordinates can be obtained.
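Item 3 can be illustrated with a short sketch. It assumes a pinhole camera whose image-plane coordinates are measured from the principal point in the same unit as the focal length; the function name `subtended_angle` and that coordinate convention are assumptions made for illustration, not taken from the patent.

```python
import math

def subtended_angle(p, q, focal_length):
    """Angle at the viewpoint O between the rays through image points p and q.

    p and q are (x, y) image-plane coordinates measured from the principal
    point, in the same unit as focal_length (an assumption of this sketch).
    """
    # Distances from O to each image point: the ray is (x, y, f).
    kp = math.sqrt(p[0] ** 2 + p[1] ** 2 + focal_length ** 2)
    kq = math.sqrt(q[0] ** 2 + q[1] ** 2 + focal_length ** 2)
    # Squared distance between the two points on the image plane.
    pq2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    # Law of cosines in triangle O-p-q: |pq|^2 = kp^2 + kq^2 - 2 kp kq cos(theta)
    cos_theta = (kp * kp + kq * kq - pq2) / (2.0 * kp * kq)
    # Clamp against rounding before taking the arc cosine.
    return math.acos(max(-1.0, min(1.0, cos_theta)))
```

Calling it for the pairs (a, b), (b, c), and (c, a) would give θab, θbc, and θca in radians.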

Next, the method of calculating the three-dimensional coordinates of the wall surface is described.
As shown in FIG. 4, the distances la, lb, and lc from the viewpoint O to the three points A, B, and C of the target object are the unknowns. Let P be the intersection of the diagonals AC and BD, let l be the length of the line segments AP and CP, and let α, β, and γ denote ∠AOP, ∠COP, and ∠APO, respectively. Then, by the law of sines, la and lc are expressed as in Equation 1.

Here, using the law of cosines in Equation 2, la can be obtained from Equation 3.

The angles α (∠AOP) and β (∠COP), measured at the viewpoint O against the diagonal of the wall surface, can be calculated from the rectangle information, that is, the pixel coordinates of the four corners of the rectangle in the image, and the length l can be calculated from the dimensions of the wall surface. With the above method, the distances la and lc to points A and C can be obtained, and the distance lb to point B can be obtained in the same way.
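Equations 1 to 3 are not reproduced in this text. The sketch below shows one way to obtain la and lc from α, β, and l that is consistent with the definitions above: the law of sines in triangles AOP and COP, combined with the law of cosines in triangle AOC. It is an illustration under those assumptions and is not claimed to match the patent's Equation 3 term for term; the function name is hypothetical.

```python
import math

def distances_to_diagonal_ends(alpha, beta, half_diagonal):
    """Return (la, lc), the distances from O to A and C, or None if degenerate.

    alpha = angle AOP, beta = angle COP (radians);
    half_diagonal = l = |AP| = |CP|.
    """
    sa, sb = math.sin(alpha), math.sin(beta)
    if sa <= 0.0 or sb <= 0.0:
        return None
    # Law of sines: la = l*sin(g)/sin(a) in AOP and lc = l*sin(g)/sin(b) in COP,
    # where g = angle APO (angle CPO = pi - g has the same sine).
    # Law of cosines in AOC: (2l)^2 = la^2 + lc^2 - 2*la*lc*cos(a + b).
    # Substituting the first two relations leaves sin(g)^2 = 4 / denom.
    denom = 1.0 / sa ** 2 + 1.0 / sb ** 2 - 2.0 * math.cos(alpha + beta) / (sa * sb)
    if denom <= 0.0:
        return None
    sin_gamma = min(1.0, math.sqrt(4.0 / denom))
    return (half_diagonal * sin_gamma / sa, half_diagonal * sin_gamma / sb)
```

Returning `None` for a degenerate configuration leaves room for the caller to retry with a different set of points.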

After the distances la, lb, and lc to the points A, B, and C of the object have been obtained, the coordinates of the wall surface with the viewpoint O as the origin are calculated with Equation 4 from the ratios of these distances to the distances ka, kb, and kc from the viewpoint O to the points a, b, and c on the image plane, and from the image-plane coordinates (xa, ya, za), (xb, yb, zb), and (xc, yc, zc) of the points a, b, and c.
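Equation 4 is not reproduced in this text. Assuming it scales each image-plane ray out to the recovered distance, which is what the description of the ratio of la to ka suggests, it would take a form like the following for point A. The symbols (X_A, Y_A, Z_A) for the wall coordinates are introduced here for illustration, and the same form would apply to B and C.

```latex
\begin{pmatrix} X_A \\ Y_A \\ Z_A \end{pmatrix}
  = \frac{l_a}{k_a}
    \begin{pmatrix} x_a \\ y_a \\ z_a \end{pmatrix},
\qquad
k_a = \sqrt{x_a^{2} + y_a^{2} + z_a^{2}}
```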

Orthographic projection can be performed using these three-dimensional coordinate data.
However, depending on the three points (A, B, C) used in the calculation, the denominator on the right-hand side of Equation 3 may be zero, or la^2 < 0 may result, so that an accurate value cannot be calculated.

In such a case, recalculating with a different combination of three of the four points (for example A, B, and D) allows as appropriate a solution as possible to be obtained.
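The retry strategy can be sketched as a loop over the three-corner subsets of the four corners. The helper name `solve_with_fallback` and the convention that the solver returns `None` on a degenerate case are assumptions made for this illustration.

```python
from itertools import combinations

def solve_with_fallback(labels, solve_triple):
    """Try each 3-corner subset until solve_triple yields a valid result.

    labels: the four corner labels, e.g. "ABCD".
    solve_triple: callable taking a tuple of three labels and returning a
    result, or None when the computation is degenerate (zero denominator,
    negative squared distance, and so on).
    """
    for triple in combinations(sorted(labels), 3):
        result = solve_triple(triple)
        if result is not None:
            return triple, result
    return None  # every combination failed
```

With four corners there are only four subsets (ABC, ABD, ACD, BCD), so trying all of them is cheap.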

The procedure for obtaining the three-dimensional coordinates of the image is described with reference to the flow of FIG. 5.
First, the center coordinates of the rectangle on the image are calculated from the coordinates of the rectangular region on the image (step S1).

Next, the actual distances between the viewpoint and two points on the wall surface are calculated from the viewpoint, the center coordinates of the rectangle on the image, the rectangle coordinates of the two points, and the actual distance (step S2).

If an abnormality occurs in the calculation (step S3), a different combination of two points is selected (step S4) and the process returns to step S2 to calculate the actual distances. If no abnormality occurs in step S3, the same processing as in step S2 is performed for the combination of the remaining two points to calculate the actual distances (step S5).

If an abnormality occurs in this calculation (step S6), the three-dimensional coordinates of the remaining two points are calculated from the three-dimensional coordinates of the first two points, using the fact that the corners of the wall rectangle are right angles (step S7). If no abnormality occurs in step S6, the three-dimensional coordinates of the four points on the wall surface, with the viewpoint as the origin, are obtained from the calculated distances and the focal length (step S8).
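Once the four wall corners have been located, the rectification step itself can be illustrated with a planar homography estimated from the four corner correspondences. This is a standard technique swapped in for illustration, not the procedure based on the perspective three-point problem that the text describes; the function names are hypothetical, and the target rectangle is assumed to be the wall's known width and height in output pixels.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Return the 3x3 matrix H with dst ~ H @ src for four (x, y) point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The null vector of the 8x9 system is the last right-singular vector.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # normalise; assumes h[2, 2] is not zero

def apply_homography(h, point):
    """Map one (x, y) point through H and dehomogenise."""
    x, y, w = h @ np.array([point[0], point[1], 1.0])
    return x / w, y / w
```

Warping every pixel of the tilted image through such a matrix (or its inverse, with interpolation) would produce the front-facing view.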

Next, the position acquisition method (template matching) for obtaining the position of an enlarged image in the whole image is described.

First, the captured enlarged image is reduced to the scale of the whole image based on the focal length information, and the position (offset) of the reduced image on the whole image is calculated by searching the pixel information of each image (template matching).
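The reduction ratio can be sketched as follows. The sketch assumes a pinhole model with both images taken from the same position on the same sensor, so that image magnification is proportional to focal length; the function names and the millimetre unit are assumptions for illustration.

```python
def reduction_ratio(focal_whole_mm, focal_enlarged_mm):
    """Factor by which to shrink the enlarged image to the whole image's scale."""
    if focal_whole_mm <= 0 or focal_enlarged_mm <= 0:
        raise ValueError("focal lengths must be positive")
    return focal_whole_mm / focal_enlarged_mm

def reduced_size(width, height, focal_whole_mm, focal_enlarged_mm):
    """Pixel size of the enlarged image after reduction (at least 1 x 1)."""
    r = reduction_ratio(focal_whole_mm, focal_enlarged_mm)
    return max(1, round(width * r)), max(1, round(height * r))
```

For example, a whole image taken at 18 mm and an enlarged image taken at 72 mm give a ratio of 0.25, so a 3000 x 2000 enlarged image would be reduced to 750 x 500 before matching.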

The template matching technique is as follows.
As shown in FIG. 6, the enlarged image I_1 is moved over the whole image I_2 within a fixed range (movement amount d), and the squared error per pixel (E) is calculated with Equation 5 over the overlapping region (M_2 × N_2). The movement amount d at which E is smallest is taken as the optimum relative position of the two images.
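Equation 5 is not reproduced in this text. The sketch below assumes E is the sum of squared RGB differences averaged over the overlapping pixels, and searches candidate offsets exhaustively with NumPy. The function name, the (dy, dx) offset convention, and the restriction to fully overlapping placements are assumptions of this illustration.

```python
import numpy as np

def best_offset(whole, patch, search):
    """Find the (dy, dx) in `search` that minimises the squared error per pixel.

    whole: H x W x 3 array (I_2); patch: h x w x 3 array (I_1, already reduced);
    search: iterable of candidate (dy, dx) top-left offsets.
    """
    whole = whole.astype(np.float64)
    patch = patch.astype(np.float64)
    h, w = patch.shape[:2]
    best, best_err = None, np.inf
    for dy, dx in search:
        if dy < 0 or dx < 0 or dy + h > whole.shape[0] or dx + w > whole.shape[1]:
            continue  # this sketch only scores fully overlapping placements
        diff = whole[dy:dy + h, dx:dx + w] - patch
        err = np.mean(np.sum(diff * diff, axis=2))  # squared RGB error per pixel
        if err < best_err:
            best, best_err = (dy, dx), err
    return best, best_err
```

An exhaustive search like this costs time proportional to the number of candidate offsets times the patch area, which is why the search is limited to a fixed range around the expected position.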

However, a simple comparison of RGB values may fail to yield an appropriate solution (position), for the following reasons.
1. The color of the same location differs between the whole image and the enlarged image because of, for example, differences in exposure at the time of shooting.
2. Although a fixed shooting position is a prerequisite, the camera origin may shift in practice, which introduces an error when the image is reduced based on the focal length information (see FIGS. 8(a) and 8(b) for this state).

To resolve item 1 above, a color correction process is added that analyzes the color components of the whole image and the enlarged image and automatically shifts the enlarged image to a closer color. This process is described with reference to the flow of FIG. 7.

First, the average of each of the RGB values of the enlarged image is calculated (step S1). Next, the average RGB values of the background region of the enlarged image's area (a part of the area of the whole image) are calculated (step S2). The color difference is then extracted as the difference between the averages from step S1 and step S2 (step S3). Finally, the RGB values of the enlarged image are shifted by that color difference (step S4).
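Steps S1 to S4 can be sketched with NumPy as below. The function name, the sign convention (background mean minus enlarged-image mean), and the rounding and clipping to the 8-bit range are assumptions of this illustration; the text specifies only the mean-difference shift.

```python
import numpy as np

def shift_to_background(patch, background):
    """Shift the patch's RGB values so its channel means match the background.

    patch: h x w x 3 uint8 enlarged image; background: the matching region of
    the whole image (any size, 3 channels).
    """
    patch_f = patch.astype(np.float64)
    mean_patch = patch_f.reshape(-1, 3).mean(axis=0)                     # S1
    mean_bg = background.astype(np.float64).reshape(-1, 3).mean(axis=0)  # S2
    color_diff = mean_bg - mean_patch                                    # S3
    shifted = np.clip(np.rint(patch_f + color_diff), 0, 255)             # S4
    return shifted.astype(np.uint8), color_diff
```

A uniform per-channel shift corrects a global exposure or white-balance offset but leaves local contrast differences untouched.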

To resolve item 2 above, scale correction is taken into account during template matching: in addition to being moved and rotated within a certain range, the enlarged image is also resized (enlarged or reduced), and the configuration at which the difference between the enlarged image and the entire image is smallest is taken as the correct answer.

This process is described with reference to the flow of FIG. 9.
First, once the search range has been set, the squared error is calculated while varying the scale (step S1). Next, the squared error at a given position is obtained (step S2): the squared error of the RGB values is calculated, taking rotation into account (step S3), and this is repeated for every pixel of the rectangular region of the enlarged image (step S4).

It is then determined whether the squared error sDiff at a given position in the search range is smaller than the current minimum squared error minDiff (step S5). If it is, minDiff is set to sDiff and the position is saved (step S6). The squared-error calculation is repeated for each scale change (step S7), and when it is complete the position of minimum squared error is determined (step S8).
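The loop structure of FIG. 9 can be sketched as follows. This is a simplified illustration rather than the patented procedure: rotation is omitted, resizing uses a nearest-neighbour stand-in, and the names `resize_nearest` and `search_scale_and_position` and their parameters are hypothetical. The variables `s_diff` and `min_diff` mirror sDiff and minDiff in the text.

```python
import numpy as np

def resize_nearest(img, scale):
    """Nearest-neighbour resize; a stand-in for any real resampling routine."""
    h, w = img.shape[:2]
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows][:, cols]

def search_scale_and_position(whole, enlarged, scales, x_range, y_range):
    """Return (min_diff, scale, x, y) minimising the mean squared RGB error."""
    whole = whole.astype(np.float64)
    min_diff, best = np.inf, None
    for scale in scales:                      # vary the scale (S1, S7)
        patch = resize_nearest(enlarged, scale).astype(np.float64)
        ph, pw = patch.shape[:2]
        for y in y_range:                     # each position in the range (S2)
            for x in x_range:
                if x < 0 or y < 0 or y + ph > whole.shape[0] or x + pw > whole.shape[1]:
                    continue
                # Squared RGB error over all pixels of the patch (S3, S4).
                s_diff = np.mean((whole[y:y + ph, x:x + pw] - patch) ** 2)
                if s_diff < min_diff:         # S5
                    min_diff, best = s_diff, (scale, x, y)  # S6
    if best is None:
        raise ValueError("no candidate position fits inside the entire image")
    return (min_diff,) + best                 # minimum position decided (S8)
```

Adding rotation would introduce one more outer loop over candidate angles, with the patch rotated before the error is computed; the minimum-tracking logic stays the same.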

Next, orthographic projection of the enlarged image is described.
FIG. 10 shows the case where the enlarged image is orthographically projected based on the three-dimensional coordinates whose origin is the digital camera 20, described with reference to FIG. 3, and on the offset of the enlarged image relative to the entire image, described with reference to FIG. 6.

By orthographically projecting the enlarged image in this way, the position of the enlarged image relative to the entire image, and the number of millimeters that one pixel corresponds to, can be obtained on the orthographic projection image.

As described above, in the present embodiment the whole orthographic projection means 11 orthographically projects the entire image based on rectangle information (the pixel coordinates of the four corners of a rectangle in the entire image, obtained from the tilted entire image shot with a widened angle of view) and on the vertical and horizontal dimensions. The position detection means 12 detects the position, on the entire image, of each enlarged image shot with a narrowed angle of view. The enlarged orthographic projection means 13 detects the tilt of the enlarged image from that position information and orthographically projects the enlarged image based on the tilt. The composition means 15 then combines the enlarged images based on the position information to obtain an orthographically projected overall image. As a result, the overall image of the object, obtained by combining enlarged images shot by dividing the object into sections, can be a high-definition image.

Incidentally, many applications exist that support panoramic composition in the horizontal direction only or in the vertical direction only. There are also applications that support orthographic projection only for pan-up and pan-down (vertical panning), and applications that composite multiple images with a restricted angle-of-view region.

In contrast, the combination of algorithms in the present embodiment differs from conventional composition methods in that it achieves horizontal and vertical panoramic composition at the same time (solved by tilt correction of the shooting equipment, the perspective three-point problem, and the divided-composition algorithm), and likewise achieves orthographic projection of images in which upward and downward pans are mixed.

Furthermore, the composition method of the present embodiment requires only the lens length and the rectangular region information as shooting and input conditions. It is unique in that, with so few input conditions, it obtains a solution that other applications cannot, and the processing that corrects color errors and shooting-position errors at the time of shooting is particularly distinctive.

Although the algorithms adopted in the present embodiment would ordinarily be regarded as general-purpose, they have rarely been applied to, for example, deterioration diagnosis of concrete wall surfaces, and they are extremely effective as a tool for obtaining a solution.

As an application example, the method can be applied to the creation of high-resolution area maps.
At present, a wide-area map cannot be generated at high resolution, but applying this method makes it possible to generate a wide-area map at high resolution. In addition, because the method includes algorithms for tilt and angle-of-view correction, it can also handle aerial photography from a helicopter or similar platform.

The method can also be applied to the creation of a multi-resolution three-dimensional model space from images.
Creating a three-dimensional model space from actually photographed images is a very laborious and time-consuming task. Even software that claims to make three-dimensional model space creation simple requires a number of manual operations by the user.

In this case, by applying this method, rectangular markers of predetermined size are attached to, for example, each surface of a room and photographed as digital images; the markers are then recognized automatically and a three-dimensional model space of the room can be created.

Furthermore, by shooting enlarged images of each surface of the room and locating them with this method, higher-resolution textures can be created.

The method can also be applied to the creation of wall images using a DV camera.
Using the algorithm of this method, a high-definition image of a large wall surface can be created from video shot with a DV camera. The video is assumed to contain, in one continuous take, the entire target wall surface and enlarged footage of the wall zoomed in from that view. The entire image is first orthographically projected; the position of each zoomed image within the entire image is then detected and the zoomed image is orthographically projected, so that a high-definition wall image is generated from the continuous video. This makes on-site shooting easier than with an ordinary digital still camera.

FIG. 1 is a block diagram showing an embodiment of the image composition apparatus of the present invention.
FIG. 2 is a diagram for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 3 is a diagram for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 4 is a diagram for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 5 is a flowchart for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 6 is a diagram for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 7 is a flowchart for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 8 is a diagram for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 9 is a flowchart for explaining the image composition method performed by the image composition apparatus of FIG. 1.
FIG. 10 is a diagram for explaining the image composition method performed by the image composition apparatus of FIG. 1.

Explanation of symbols

10 Image composition apparatus
11 Whole orthographic projection means
12 Position detection means
13 Enlarged orthographic projection means
14 Color correction means
15 Composition means
20 Digital camera (imaging means)

Claims

An image composition apparatus that combines enlarged images, obtained by photographing an object in divided sections, to obtain a whole image of the object, comprising:
whole orthographic projection means for orthographically projecting the whole image based on rectangle information, which is the pixel coordinates of the four corners of a rectangle in the whole image and is obtained from the tilted whole image shot with a widened angle of view, and on vertical and horizontal dimensions;
color correction means for performing color component analysis of the whole image and the enlarged image, and shifting the color of the enlarged image toward the color of the whole image;
position detecting means for detecting the position, on the whole image, of the enlarged image shot with a narrowed angle of view;
enlarged orthographic projection means for detecting an inclination of the enlarged image based on the position information detected by the position detecting means, and orthographically projecting the enlarged image based on the inclination; and
image synthesis means for combining the enlarged image based on the position information to obtain an orthographically projected overall synthesized image,
wherein the color correction means calculates the average of the RGB values of the enlarged image, calculates the average of the RGB values of the background region of the area of the enlarged image, which is a part of the area of the whole image, extracts a color difference from the difference between the average of the RGB values of the enlarged image and the average of the RGB values of the background region, and shifts the RGB values of the enlarged image by the color difference, and
wherein the position detecting means obtains the position information based on the shifted RGB values, or moves and/or rotates the enlarged image within a certain range while simultaneously enlarging or reducing its size, and obtains the position information that minimizes the squared error between the RGB values of the enlarged image and those of the whole image.

The image composition apparatus according to claim 1, wherein
the whole orthographic projection means orthographically projects the whole image based on the rectangle information and the vertical and horizontal dimensions of the object,
the position detecting means acquires the position of the enlarged image on the whole image by template matching,
the enlarged orthographic projection means orthographically projects the enlarged image based on scale information indicating the size ratio of the enlarged image to the whole image, the rectangle information of the whole image, and the vertical and horizontal dimensions, and
the image synthesis means combines the enlarged image using the scale information.

The image composition apparatus according to claim 1, wherein
the whole orthographic projection means uses the perspective three-point problem to acquire, from the rectangle information, three-dimensional information of the wall surface of the object with the lens center of the photographing means as the origin, and
the position detecting means reduces the enlarged image to the scale of the whole image based on focal length information from the lens center of the photographing means, and calculates the position of the reduced image on the whole image by searching the pixel information of each image.

An image composition method for obtaining a whole image of an object by combining enlarged images obtained by photographing the object in divided sections, the method comprising:
orthographically projecting the whole image based on rectangle information, which is the pixel coordinates of the four corners of a rectangle in the whole image and is obtained from the tilted whole image shot with a widened angle of view, and on vertical and horizontal dimensions;
performing color component analysis of the whole image and the enlarged image, and shifting the color of the enlarged image toward the color of the whole image;
detecting the position, on the whole image, of the enlarged image shot with a narrowed angle of view;
detecting an inclination of the enlarged image based on the position information detected by the position detecting means, and orthographically projecting the enlarged image based on the inclination; and
combining the enlarged image based on the position information to obtain an orthographically projected overall synthesized image,
wherein the average of the RGB values of the enlarged image is calculated, the average of the RGB values of the background region of the area of the enlarged image, which is a part of the area of the whole image, is calculated, a color difference is extracted from the difference between the average of the RGB values of the enlarged image and the average of the RGB values of the background region, and the RGB values of the enlarged image are shifted by the color difference, and
wherein the position information is obtained based on the shifted RGB values, or the enlarged image is moved and/or rotated within a certain range while its size is simultaneously enlarged or reduced, and the position information that minimizes the squared error between the RGB values of the enlarged image and those of the whole image is obtained.

The image composition method according to claim 4, wherein
the whole image is orthographically projected based on the rectangle information and the vertical and horizontal dimensions of the object,
the position of the enlarged image on the whole image is acquired by template matching,
the enlarged image is orthographically projected based on scale information indicating the size ratio of the enlarged image to the whole image, the rectangle information of the whole image, and the vertical and horizontal dimensions, and
the enlarged image is combined using the scale information.

The image composition method according to claim 4, wherein
three-dimensional information of the wall surface of the object, with the lens center of the photographing means as the origin, is acquired from the rectangle information using the perspective three-point problem, and
the enlarged image is reduced to the scale of the whole image based on focal length information from the lens center of the photographing means, and the position of the reduced image on the whole image is calculated by searching the pixel information of each image.