JP2017085297A

JP2017085297A - Image processing apparatus, image processing method, and program

Info

Publication number: JP2017085297A
Application number: JP2015210251A
Authority: JP
Inventors: 恭平菊田; Kyohei Kikuta
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-10-26
Filing date: 2015-10-26
Publication date: 2017-05-18

Abstract

PROBLEM TO BE SOLVED: To align images more appropriately when generating a panoramic image by joining a plurality of images that include the same subject. SOLUTION: An image processing apparatus comprises: image acquisition means for acquiring at least two images captured facing mutually different directions such that parts of them overlap; searching means for searching for corresponding points between the images that correspond to the same subject in each image acquired by the acquisition means; position determination means for determining a positional relationship of the images acquired by the acquisition means on the basis of the corresponding points; and composition means for generating a panoramic image by joining the at least two images on the basis of the positional relationship determined by the position determination means. The position determination means sets a reliability for each corresponding point detected by the searching means on the basis of the viewing-angle dependence of the appearance of the subject captured in the two images, and determines the positional relationship of the images by referring to the reliability of the corresponding points. SELECTED DRAWING: Figure 2

Description

The present invention relates to processing for generating a wide viewing angle (panoramic) image by combining a plurality of images.

Panoramic images covering a wide viewing angle, obtained by combining a plurality of images so that they connect, are known. Patent Document 1 describes an alignment method for combining a plurality of images of the same subject captured from different shooting positions. When performing alignment, a landmark located in the vicinity of the shooting origin of each image is identified, and alignment is performed with the landmark as a reference.

JP 2014-086948 A

In general, corresponding points of the same subject are searched for between a plurality of images. However, when the shooting directions of the imaging apparatuses differ greatly, or when the subject has a complicated shape, corresponding points may not be detected properly. If the positional relationship of the images is obtained from incorrect corresponding points, the joints between images become conspicuous in the panoramic image.

Accordingly, an object of the present invention is to align the images more appropriately in order to generate a panoramic image by joining a plurality of images that include the same subject.

In order to solve the above problem, the present invention comprises: image acquisition means for acquiring at least two images captured facing mutually different directions and partially overlapping; search means for searching for corresponding points between the images that correspond to the same subject in each image acquired by the acquisition means; position determination means for determining the positional relationship of the images acquired by the acquisition means on the basis of the corresponding points; and composition means for generating a panoramic image by joining the at least two images on the basis of the positional relationship of the images determined by the position determination means, wherein the position determination means sets the reliability of the corresponding points detected by the search means on the basis of the angle dependence of the appearance of the subject captured in the two images, and determines the positional relationship of the images by referring to the reliability of each corresponding point.

According to the present invention, the images can be aligned more appropriately in order to generate a panoramic image by joining a plurality of images that include the same subject.

Schematic diagram showing the image processing system
Block diagram showing the functional configuration of the image processing apparatus
Flowchart showing the flow of processing performed by the image processing apparatus
Example of panoramic image composition by the image processing apparatus
Flowchart showing the flow of processing performed by the position determination unit 205
Schematic diagram showing the relationship between the cameras and the subjects
Schematic diagram showing an example of acquired images
Diagram showing an example of the thresholds used to evaluate the power spectrum
Schematic diagram showing the relationship between the camera and the subject
Schematic diagram showing an example of an acquired image
Flowchart of processing performed by the position determination unit 205
Diagram showing the relationship between the camera 901 and the normal of surface A
Flowchart of processing performed by the position determination unit 205
Flowchart of processing performed by the position determination unit 205

Hereinafter, the present invention will be described in detail according to preferred embodiments with reference to the accompanying drawings. Note that the configurations shown in the following embodiments are merely examples, and the present invention is not limited to the illustrated configurations.

<First Embodiment>
In the first embodiment, a panoramic image is generated by joining images captured by two imaging devices (cameras). FIG. 1 shows an image processing system, applicable to the first embodiment, that generates a panoramic image. For the image processing apparatus 100, the hardware configuration is shown. The image processing apparatus 100 is realized by, for example, a personal computer (PC) or a tablet terminal. Although the image processing apparatus 100 in this embodiment is described as an apparatus connected to the imaging device 110 via an I/F, it can also be realized as an image processing chip built into an imaging device (camera).

The CPU 101 is a processor that comprehensively controls each unit in the image processing apparatus 100. The RAM 102 functions as the main memory, work area, and the like of the CPU 101. The ROM 103 stores a group of programs executed by the CPU 101. The HDD 104 stores applications executed by the CPU 101, data used for image processing, and the like. The output I/F 105 is an image output interface such as DVI or HDMI (registered trademark), and connects an output device 106 such as a liquid crystal display. The input I/F 107 is a serial bus interface such as USB or IEEE 1394, and connects an input device 108, such as a keyboard or mouse, with which the user performs various instruction operations. The imaging I/F 109 is an image input interface such as 3G/HD-SDI or HDMI (registered trademark), and connects an imaging device (camera) 110. The general-purpose I/F 111 is, like the input I/F, a serial bus interface such as USB or IEEE 1394, and connects a distance measuring device 112 that can measure, for the same scene as the imaging device, the distance from the shooting position. Any known device can be used as the distance measuring device 112. For example, devices are known that project pulses of an infrared pattern, recognize the pattern projected onto an object, and estimate the distance. The distance measuring device 112 may also be integrated with the imaging device 110. An external storage 113 is connected via the general-purpose I/F 111 as necessary.

FIG. 2 is a block diagram showing the functional configuration of the image processing apparatus 100 in the first embodiment. The CPU 101 serves as each functional block shown in FIG. 2 by reading a program stored in the ROM 103 or the HDD 104 and executing it with the RAM 102 as a work area. Note that the CPU 101 does not have to serve as all of the functional blocks; a dedicated processing circuit corresponding to each functional block may be provided instead. The image processing apparatus 100 includes an image input unit 202, a distance information acquisition unit 203, an image correction unit 204, a position determination unit 205, a joint determination unit 206, an image composition unit 207, an image output unit 208, an input terminal 201, and an output terminal 209.

The image input unit 202 acquires a plurality of images for generating a panoramic image. The images acquired here are captured so that at least two of them have partially overlapping shooting areas. The distance information acquisition unit 203 acquires distance information (a depth map) corresponding to each of the images acquired by the image input unit 202. The distance information is data in which each pixel stores information indicating the distance from the shooting position. Each pixel of an image corresponds to the pixel at the same position in the distance information. Therefore, for each subject captured in an image, the distance from the viewpoint (the position of the imaging device) is known.

The image correction unit 204 performs various kinds of image processing on each set of image data. Here, it performs correction processing that corrects the optical distortion caused by the lens. The image correction unit 204 rearranges the pixels on the basis of the characteristics of the lens used during imaging.

The position determination unit 205 determines the positional relationship between the images used for panorama composition. In this embodiment, it first searches for corresponding points in the regions where the two images overlap. Using the corresponding points detected by the search, it determines the position and orientation of each camera, and then determines the positional relationship of the images. The joint determination unit 206 refers to the positional relationship of the images determined by the position determination unit 205 and determines the position of the joint at which the two images are joined.

The image composition unit 207 refers to the joint determined by the joint determination unit 206, joins the two images, applies boundary processing near the boundary so that the joined boundary looks natural, and composes the panoramic image. The panoramic image is larger than each individual image and looks as if it had been obtained by capturing a wider field of view. The composition processing that connects images to generate a panoramic image is also called stitch processing. The image output unit 208 outputs the panoramic image obtained from the image composition unit 207.

FIG. 3 is a flowchart showing the flow of processing executed by the image processing apparatus 100 in the first embodiment. The CPU 101 reads and executes a program that follows the flowchart below. Here, the case where a panoramic image 403 is generated from the image data 401 and 402 shown in FIG. 4 is described as an example.

In step S301, the image input unit 202 acquires a plurality of input images input from the input terminal 201. Here, the images 401 and 402 are acquired. In step S302, the distance information acquisition unit 203 acquires the distance information corresponding to each input image. The distance information acquisition unit 203 links each pixel in the input image with the corresponding pixel in the distance information so that the distance of each pixel in the input image can be looked up. Specifically, position information indicating the pixel position of each pixel in the input image is associated with each pixel in the distance information.

In step S303, the image correction unit 204 performs correction processing on each input image to correct the influence of optical distortion caused by the lens. Each pixel is rearranged according to the characteristics, acquired from the imaging device 110, of the lens used at the time of imaging. For pixels that are not rearranged, the position information is left unchanged. For pixels that are rearranged, the position information is updated. Because the distance information was associated with the position information of each pixel by the distance information acquisition unit 203 in step S302, it is rearranged at the same time, and the correspondence is not lost.

In step S304, the position determination unit 205 determines the positional relationship between the images and aligns them. At least two images that are joined adjacently in the panoramic image each contain an overlapping region in which the same scene has been captured. FIG. 7 is a diagram showing the overlapping regions of two images. The region 703 in the image 401 and the region 704 in the image 402 are regions in which each image captures the same scene. Alignment is performed by searching the overlapping regions for pixels that represent the same position on the same subject and associating them with each other. The processing in step S304 is described in detail later.

In step S305, the joint determination unit 206 determines the position of the joint between the images in the panoramic image. Here, a joint is a boundary that defines which image is used in a region that has been captured redundantly by a plurality of images. In this embodiment, the joint determination unit 206 determines the joint position using a graph cut method in which the difference between pixel values serves as the energy cost. In this method, pixels are regarded as nodes and the links between adjacent pixels as edges, and the minimum cut with respect to the energy cost assigned to each edge according to the pixel values is adopted as the joint. The energy cost E(v, u) for the edge between adjacent pixels v and u is defined as in equations (1) and (2).

Here, I_1(v) represents the pixel value (R, G, B) at pixel v of overlapping image 1, and norm2L(x) represents the 2-norm (Euclidean norm) of the vector x. grad(v, u) represents the gradient of the pixel values between pixels v and u. More specifically, Sobel_1_u-v() represents Sobel filter processing in the u-v direction applied to overlapping image 1. Equation (1) represents the sum of the L2 norms of the pixel values at pixel v or u after a first-order Sobel filter in the u-v direction has been applied to each overlapping image. Overall, E(v, u) is an energy cost that is small when the pixel values of the overlapping images are close to each other, and large when the pixel values of adjacent pixels are close to each other. The minimum cut over edges with this energy cost is computed. As a result, the joint is determined so that portions where the overlapping images differ little, and edge portions of the image, are more likely to be selected.
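Equations (1) and (2) themselves are not reproduced in this text. Purely as an illustration, the following sketch computes a cost that has the two properties described above: it shrinks when the overlapping images agree at v and u, and it grows when adjacent pixel values are close (low gradient). The quotient form, the finite difference used in place of the Sobel filter, and the names `l2`, `edge_cost`, and `eps` are all assumptions and should not be read as the formula of the embodiment.

```python
import math

def l2(a, b):
    """Euclidean distance between two (R, G, B) tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def edge_cost(img1, img2, v, u, eps=1e-6):
    """Hypothetical cost for the edge between adjacent pixels v and u.

    img1, img2: dicts mapping (x, y) -> (R, G, B) over the overlapping region.
    """
    # Disagreement between the two overlapping images at both pixels.
    mismatch = l2(img1[v], img2[v]) + l2(img1[u], img2[u])
    # Finite-difference stand-in for the Sobel gradient along u - v (assumption).
    grad = l2(img1[u], img1[v]) + l2(img2[u], img2[v])
    return mismatch / (grad + eps)
```

With such a cost, a cut is cheap where the two images already agree or where it runs along a strong image edge, which matches the qualitative behaviour described in the text.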

In step S306, the image composition unit 207 combines the images on the basis of the joint position determined in step S305 and creates a single panoramic image. First, the image composition unit 207 refers to the position of the joint in the panoramic image and takes, as the pixel value of each pixel in the panoramic image, the pixel value of a pixel in one of the images. That is, the pixel value of each pixel in the panoramic image is the pixel value of a pixel belonging to one of the input images. Further, to keep the joint from being perceived in the panoramic image, boundary processing is applied to the region near the boundary. One specific method of boundary processing is alpha blending, which mixes the images that overlap in the vicinity of the joint. In step S307, the image output unit 208 outputs the panoramic image obtained in step S306 to the HDD 104, the output device 106, the external storage 113, or the like. This completes the panoramic image composition processing.
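As a minimal sketch of the alpha blending mentioned above, the following function blends one row of two already aligned images across a band centred on the joint. The linear ramp, the band half-width `width`, and the function name `blend_row` are illustrative assumptions; the text does not specify how the blend weights are chosen.

```python
def blend_row(left, right, seam, width):
    """Blend two aligned rows of pixel values around column `seam`.

    Outside the band [seam - width, seam + width] each output pixel comes
    from a single image; inside it the two images are mixed linearly.
    `width` must be greater than 0.
    """
    out = []
    for x, (a, b) in enumerate(zip(left, right)):
        # alpha is the weight of `right`: 0 left of the band, 1 right of it.
        alpha = min(1.0, max(0.0, (x - (seam - width)) / (2.0 * width)))
        out.append((1.0 - alpha) * a + alpha * b)
    return out
```

For example, `blend_row([0] * 8, [100] * 8, seam=4, width=2)` keeps the left image at column 0, the right image at column 7, and an equal mix at the joint column.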

Next, the processing in step S304 executed by the position determination unit 205 is described in detail. The position determination unit 205 in this embodiment determines, for each region of the images used to generate the panoramic image, whether the region is suitable for searching for corresponding points. FIG. 6 shows the camera arrangement and the scene when the images 401 and 402 shown in FIG. 4 were captured. Two cameras 601 and 602 each capture a scene that includes a sign 603 and a tree 604. The image 401 is acquired by the camera 601, and the image 402 is acquired by the camera 602. The sign 603 and the tree 604 appear in both images. The position determination unit 205 searches for corresponding points among the feature points of the overlapping region 703 in the image 401 and the feature points of the overlapping region 704 in the image 402.

When the direction (angle) from which a subject is viewed changes, the appearance of both the sign 603 and the tree 604 changes. However, the change differs between the two. For the sign 603, the appearance of feature points such as the pattern drawn on it does not change greatly even when the viewing direction changes. The feature points of the sign 603 in the image 401 are therefore easy to associate with those in the image 402. The tree 604, on the other hand, looks very different between the images 401 and 402, which view it from different directions: the way branches and leaves overlap and the way light falls on them change greatly. For a subject whose appearance changes greatly with the viewing direction, corresponding points are considered more likely to be erroneously detected than for a subject whose appearance does not. This is because, for such a subject, a single point in real space does not necessarily look the same in two images with different viewing directions. In particular, when the viewing directions (shooting directions) of the cameras differ greatly as shown in FIG. 6, detecting corresponding points on subjects whose feature points look different depending on the viewing direction easily leads to erroneous detection. For example, when the angle θ formed by the optical axes of the two cameras is between 30 and 150 degrees, as shown in FIG. 6, corresponding points are particularly hard to find. It follows that corresponding points on a subject whose appearance does not change greatly with the viewing direction are more reliable than those on a subject whose appearance does.
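The 30 to 150 degree range mentioned above can be checked directly when the optical-axis directions of the two cameras are known. The sketch below assumes each axis is given as a 3-D direction vector; the function names and the idea of using the range as an explicit test are illustrative and not taken from the text.

```python
import math

def axis_angle_deg(d1, d2):
    """Angle in degrees between two optical-axis direction vectors."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))

def is_hard_case(d1, d2, lo=30.0, hi=150.0):
    """True when the angle falls in the range the text calls hard to match."""
    return lo <= axis_angle_deg(d1, d2) <= hi
```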

The position determination unit 205 therefore sets, for each region of each image, the reliability of the corresponding points detected there. The reliability is set per region according to whether the appearance of the subject in that region changes greatly with the viewing direction; when a region contains a corresponding point, the value indicates the reliability of that corresponding point. The position determination unit 205 determines from the distance information whether the appearance of a subject changes greatly with the viewing direction. A subject whose appearance changes greatly with the viewing direction is taken to be one whose distance information varies sharply, that is, a subject with an intricate structure. The tree 604 is an example. In contrast, for a subject whose appearance does not change greatly with the viewing direction, the distance information shows no large variation and its values change relatively smoothly. That is, the subject is planar: its distance from the camera may vary in the depth direction, but the gradient of that variation is constant. The sign 603 is an example. The appearance of such a subject is considered not to change greatly even when the viewing angle changes.

In this embodiment, the position determination unit 205 determines the position and orientation of the cameras using the corresponding points, and then determines the positional relationship of the images corresponding to each camera. When determining the position and orientation of the cameras, it refers to corresponding points in regions of high reliability with greater weight or priority than other corresponding points.
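How the reliability enters the estimation is not spelled out at this point. As one simplified illustration, the sketch below estimates only a 2-D translation between two images as a weighted mean of the displacements of corresponding points, so that high-reliability points dominate. The embodiment estimates camera position and orientation rather than a translation, and the weight values and the name `weighted_translation` are hypothetical.

```python
# Hypothetical weights per reliability flag; the text gives no numeric values.
WEIGHTS = {"high": 1.0, "medium": 0.5, "low": 0.1}

def weighted_translation(matches):
    """Estimate (dx, dy) from corresponding points.

    matches: list of ((x1, y1), (x2, y2), flag), where flag is the
    reliability of the region containing the corresponding point.
    """
    total = sum(WEIGHTS[flag] for _, _, flag in matches)
    dx = sum(WEIGHTS[f] * (p2[0] - p1[0]) for p1, p2, f in matches) / total
    dy = sum(WEIGHTS[f] * (p2[1] - p1[1]) for p1, p2, f in matches) / total
    return dx, dy
```

A stricter variant would discard low-reliability points entirely, which corresponds to the "priority" reading of the text rather than the "weight" reading.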

FIG. 5 is a detailed flowchart of the processing in step S304. In step S501, the position determination unit 205 extracts feature points from each set of image data corrected by the image correction unit 204. Known techniques such as SURF (Speeded-Up Robust Features) or ORB (Oriented FAST and Rotated BRIEF) can be used for this feature point extraction. Here, it is assumed that a plurality of feature points have been extracted from each of the images 401 and 402.

In step S502, the position determination unit 205 searches, for each feature point in each image, for a feature point in the other images that can be associated with it. Feature points that are associated in this way are detected as corresponding points. This corresponding point search amounts to checking whether the same point on the same subject appears in more than one set of image data. In step S503, the position determination unit 205 divides each image and its distance information into predetermined regions. Here, the image is divided into regions of 32 × 32 pixels. The region size is not limited to 32 × 32 pixels, however; it may be set according to the resolution, the angle of view, and the processing capability of the image processing apparatus. The position determination unit 205 then performs frequency analysis (a Fourier transform) on each region of the distance information. By taking the power of the coefficients obtained from the Fourier transform, it calculates the two-dimensional power spectrum P(fx, fy) within the region.
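The division into regions and the per-region power spectrum of step S503 can be sketched as follows. This is a naive discrete Fourier transform written for clarity, not speed; a real implementation would normally use an FFT library. The function names, the list-of-lists depth map, and the normalization to P(0, 0) = 1 applied inside the function are assumptions made for the example.

```python
import cmath

def split_blocks(depth, size):
    """Yield (row, col, block) for each full size x size tile of a 2-D list."""
    for r in range(0, len(depth) - size + 1, size):
        for c in range(0, len(depth[0]) - size + 1, size):
            yield r, c, [row[c:c + size] for row in depth[r:r + size]]

def power_spectrum(block):
    """Return P[fy][fx] = |F(fx, fy)|^2, normalized so that P[0][0] == 1.

    Assumes a square block whose values do not sum to zero
    (distances are positive, so the DC term is nonzero).
    """
    n = len(block)

    def coeff(fx, fy):
        return sum(
            block[y][x] * cmath.exp(-2j * cmath.pi * (fx * x + fy * y) / n)
            for y in range(n) for x in range(n)
        )

    dc = abs(coeff(0, 0)) ** 2
    return [[abs(coeff(fx, fy)) ** 2 / dc for fx in range(n)]
            for fy in range(n)]
```

For real-valued input, only frequencies up to n / 2 carry independent information; the higher indices mirror the lower ones.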

Next, loop 1, consisting of steps S504 to S510, is entered. For each region of the image, a reliability is set on the basis of the result of the frequency analysis in step S503. A region whose distance information has high-frequency components is regarded as a region where corresponding points are likely to be erroneously detected, and is given a low reliability. A region whose distance information consists only of low-frequency components is regarded as a region where corresponding points can be detected properly, and is given a high reliability. In this way, a reliability map is created in which a reliability is set for each predetermined region. In this embodiment, the power spectrum P(fx, fy) is first compared with predetermined thresholds to classify each region as one whose distance information consists only of low-frequency components, one that contains strong high-frequency components, or any other region. First, in step S505, the power P(fx, fy) is compared with the threshold Th1 in the range fx ≥ 2, fy ≥ 2. Because Th1 is used to determine that there are almost no high-frequency components, it is set to a value close to 0. However, Th1 is set larger than the noise of the distance information. Here, with the power spectrum at frequency f = 0 taken as 1, Th1 = 0.03.

If the power spectrum P(fx, fy) is at or below the threshold Th1 everywhere in the range fx ≥ 2, fy ≥ 2, the region is regarded as consisting only of sufficiently low-frequency components. In this case, the distance of the subject captured in the region being processed varies gently, and its appearance is considered unlikely to change with the viewing angle. The process therefore proceeds to step S506, and the reliability flag "high" is set for this region. If, on the other hand, a frequency component whose power P(fx, fy) exceeds Th1 is found in step S505, the process proceeds to step S507. In step S507, the power spectrum P(fx, fy) is compared with the threshold Th2 in the higher-frequency range fx ≥ 4, fy ≥ 4. Th2 is used to determine whether the distance information of the region being processed contains high-frequency components, so if it is set too large, subjects whose appearance changes with the viewing direction can no longer be identified. For example, Th2 is larger than Th1; with the power spectrum at frequency f = 0 taken as 1, Th2 = 0.15.

If the region being processed has a frequency component whose power P(fx, fy) is at or above Th2, the region is regarded as containing sufficiently strong high-frequency components. When the distance information of a region contains high-frequency components, the subject captured in that region has a complex structure, and its appearance is likely to change with the viewing angle. The process therefore proceeds to step S508, and the reliability flag "low" is set for the region. If neither of the above conditions applies, the process proceeds to step S509 as the remaining case, and the reliability flag "medium" is set for the region.

FIG. 8 shows an example of distance information along one dimension in a certain region, together with its power spectrum P(f). FIG. 8(a) shows a case where the distance value changes gently, and FIG. 8(b) shows a case where it changes sharply. The result of the frequency analysis is shown for each case. The power spectrum is normalized for display so that P(0) = 1 at frequency f = 0. In the case of FIG. 8(a), the comparison with Th1 gives the region the reliability "high". In the case of FIG. 8(b), the comparisons with Th1 and Th2 result in the reliability "low". In step S510, loop 1 is repeated until a reliability flag has been determined for every region, producing a reliability map that indicates the reliability of each region. In the present embodiment, the three reliability levels are represented with 2 bits, and the reliability map has the same resolution as the image. That is, the information indicating the assigned reliability is set for every pixel in the region.
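As an illustrative sketch only, the classification of steps S505 to S509 and the per-pixel reliability map could be written as follows. The function names, the 2-bit codes, the default block size of 8, and the reading of "fx ≥ 2, fy ≥ 2" as both conditions holding at once (with negative frequencies folded onto their magnitudes) are assumptions made for this example, not details stated in the embodiment.

```python
import numpy as np

# Hypothetical 2-bit codes for the three reliability flags.
LOW, MEDIUM, HIGH = 0, 1, 2

def classify_block(depth_block, th1=0.03, th2=0.15):
    """Return a reliability flag for one N x N block of distance values."""
    n = depth_block.shape[0]
    power = np.abs(np.fft.fft2(depth_block)) ** 2
    if power[0, 0] == 0:
        return MEDIUM  # assumption: normalization is undefined without DC power
    power = power / power[0, 0]  # normalize so that P(0, 0) = 1
    # Integer frequency magnitude along each axis (negative frequencies folded).
    f = np.abs(np.rint(np.fft.fftfreq(n) * n))
    fy, fx = np.meshgrid(f, f, indexing="ij")
    mid_band = (fx >= 2) & (fy >= 2)    # range checked against Th1 in S505
    high_band = (fx >= 4) & (fy >= 4)   # range checked against Th2 in S507
    if np.all(power[mid_band] <= th1):
        return HIGH                      # S506
    if np.any(power[high_band] >= th2):
        return LOW                       # S508
    return MEDIUM                        # S509

def reliability_map(depth, n=8):
    """Per-pixel map: every pixel of a block receives the block's flag."""
    h, w = depth.shape
    out = np.full((h, w), MEDIUM, dtype=np.uint8)  # assumption: leftover edge pixels stay MEDIUM
    for y in range(0, h - n + 1, n):
        for x in range(0, w - n + 1, n):
            out[y:y + n, x:x + n] = classify_block(depth[y:y + n, x:x + n])
    return out
```

Under these assumptions, a constant-distance block is flagged "high", while a block whose distance alternates from pixel to pixel is flagged "low".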

In step S511, the position determination unit 205 determines the position and orientation of the camera based on the corresponding points. Let x_i be the coordinates of corresponding point i in the corrected image, p_i the coordinates of corresponding point i in the panoramic image, and H the homography matrix relating the two. Their relationship is then given by Equation (3).

p_i = H x_i ... (3)

Here the coordinates are expressed in homogeneous coordinates, with x_i = (x_i, y_i, w_i)^T and p_i = (X_i, Y_i, W_i)^T, and the element in row k, column l of H is written h_kl (1 ≤ k, l ≤ 3). The matrix H contains the camera position and orientation information. The points p_i and x_i are known, and H is the unknown to be found. Using p_i × H x_i = 0 (where × denotes the vector cross product) and setting the scale of the homogeneous coordinates to 1, Equation (3) can be rewritten with respect to the matrix H as Equation (4).

This is newly written as y_i = A_i h. The above is the equation for a single corresponding point i of interest, but in practice multiple corresponding points are detected. Because h, which represents the camera position and orientation, is common to all corresponding points of the same image, the equations are combined as in Equation (5), with rows of y and A in the same form as Equation (4) appended for each corresponding point. Equation (5) is written as y = A h.
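Equations (4) and (5) are not reproduced in this text. As a hedged reconstruction from the definitions above, assuming the usual direct linear formulation with w_i = W_i = 1 and h_33 = 1 (the text only states that the scale is set to 1), they would take the following form. Bold symbols denote the stacked vectors, to distinguish them from the scalar coordinates x_i and y_i.

```latex
% Assumed form of Equation (4): two rows per corresponding point i
\underbrace{\begin{pmatrix} X_i \\ Y_i \end{pmatrix}}_{\mathbf{y}_i}
=
\underbrace{\begin{pmatrix}
x_i & y_i & 1 & 0 & 0 & 0 & -X_i x_i & -X_i y_i \\
0 & 0 & 0 & x_i & y_i & 1 & -Y_i x_i & -Y_i y_i
\end{pmatrix}}_{A_i}
\underbrace{\begin{pmatrix}
h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32}
\end{pmatrix}}_{\mathbf{h}}

% Assumed form of Equation (5): rows stacked over all n corresponding points
\mathbf{y} =
\begin{pmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \\ \vdots \\ \mathbf{y}_n \end{pmatrix}
=
\begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_n \end{pmatrix} \mathbf{h}
= A\,\mathbf{h}
```

The first row follows from X_i (h_31 x_i + h_32 y_i + 1) = h_11 x_i + h_12 y_i + h_13, and the second row follows in the same way for Y_i.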

The squared error e_i of the coordinates on the panoramic image for corresponding point i can be expressed as in Equation (6).

If the corresponding point coordinates and the camera position and orientation could be determined exactly, the error would be e_i = 0. In practice, however, it is difficult to obtain the corresponding point coordinates exactly, and the resulting A_i and y_i contain errors. Moreover, multiple corresponding points are almost always found, so Equation (5) yields more equations than are needed to solve for the unknowns in h. The camera position and orientation are therefore determined by finding the h that makes the error e_i of each corresponding point as small as possible. In doing so, priority is given to reducing the error e_i of corresponding points with high reliability, according to the reliability of the region (or pixel) in which each corresponding point lies. That is, with w_i as the weight coefficient for corresponding point i, the h that satisfies Equation (7) is sought. In the present embodiment, w_i is obtained by converting the reliability "high" to a weight coefficient of 1.3, "medium" to 1.0, and "low" to 0.5.

Using the concept of a weighted pseudo-inverse matrix, the solution of Equation (7) is obtained as Equation (8). Here, W is a matrix representing the weights: a block-diagonal matrix in which the blocks w_i E_2, each being the 2 × 2 diagonal matrix E_2 multiplied by the weight coefficient w_i, are arranged in order (W = diag[w_1 E_2, w_2 E_2, w_3 E_2, ...]). T denotes transposition. In this way, the camera position and orientation h can be determined. By using the weight coefficient w_i corresponding to the reliability of each corresponding point, the camera position and orientation can be obtained while reflecting a contribution from each corresponding point that depends on the subject.
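Equations (6) to (8) are likewise not reproduced in this text. The following is a hedged reconstruction that is consistent with the surrounding definitions, assuming that E_2 is the 2 × 2 identity matrix and that the error is the squared Euclidean norm of the residual of y_i = A_i h.

```latex
% Assumed form of Equation (6): squared error for corresponding point i
e_i = \lVert \mathbf{y}_i - A_i \mathbf{h} \rVert^2

% Assumed form of Equation (7): reliability-weighted least squares
\mathbf{h} = \operatorname*{arg\,min}_{\mathbf{h}} \sum_i w_i \, e_i
           = \operatorname*{arg\,min}_{\mathbf{h}}
             (\mathbf{y} - A\mathbf{h})^{T} W (\mathbf{y} - A\mathbf{h})

% Setting the gradient to zero gives A^T W A h = A^T W y, hence
% the assumed form of Equation (8): weighted pseudo-inverse solution
\mathbf{h} = \left( A^{T} W A \right)^{-1} A^{T} W \, \mathbf{y},
\qquad W = \operatorname{diag}\left[ w_1 E_2,\; w_2 E_2,\; w_3 E_2,\; \dots \right]
```

Each corresponding point contributes two rows to A, which is why each weight w_i appears as a 2 × 2 block in W.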

In the example of FIG. 7, a large weight coefficient is given to corresponding points in the region containing the signboard 603, and a small weight coefficient is given to corresponding points in the region containing the tree 604. This corresponds to determining the positional relationship of the images with more emphasis on the signboard than on the tree.
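The weighted estimation of step S511 can be illustrated with the following minimal sketch. It assumes the direct linear formulation with h_33 fixed to 1 and two rows per corresponding point; the function name `estimate_homography` and the dictionary `WEIGHTS` are hypothetical, while the weight values 1.3, 1.0, and 0.5 are those given in the embodiment.

```python
import numpy as np

# Weight coefficient per reliability flag, as given in the embodiment.
WEIGHTS = {"high": 1.3, "medium": 1.0, "low": 0.5}

def estimate_homography(src, dst, flags):
    """Weighted least-squares homography with h33 fixed to 1 (an assumption).

    src, dst: (n, 2) arrays of corresponding points (image -> panorama).
    flags:    n reliability flags, one per corresponding point.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows, rhs, w = [], [], []
    for (x, y), (X, Y), flag in zip(src, dst, flags):
        # Two rows of A and two entries of y per corresponding point.
        rows.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        rows.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
        rhs += [X, Y]
        w += [WEIGHTS[flag]] * 2     # one 2 x 2 block w_i * E_2 of W
    A = np.array(rows)
    yv = np.array(rhs)
    W = np.diag(w)
    # Normal equations of the weighted problem: (A^T W A) h = A^T W y.
    h = np.linalg.solve(A.T @ W @ A, A.T @ W @ yv)
    return np.append(h, 1.0).reshape(3, 3)
```

In the FIG. 7 example, points on the signboard 603 would be passed with the flag "high" and points on the tree 604 with the flag "low", so that residuals on the signboard count 1.3 / 0.5 = 2.6 times as much as residuals on the tree.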

By determining the camera position using corresponding points weighted according to their reliability as described above, a loss of alignment accuracy caused by erroneously detected corresponding points can be suppressed. The reliability of each detected corresponding point is judged in consideration of how the subject's appearance changes when the viewing direction changes, so the camera position and orientation can be determined with priority given to reliable corresponding points. As a result, the image data can be aligned accurately when a panoramic image is composited.

In the first embodiment, a subject whose appearance changes greatly with the viewing direction is assumed to be a subject with an intricate structure in which the distance value changes sharply, and the power spectrum is obtained by frequency analysis of the distance image. However, the method of examining the change of values in the distance image is not limited to this. For example, the severity of the change in the distance image can be examined by smoothing the distance information and then computing derivatives of second or higher order.
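A minimal sketch of this alternative is shown below. The 3 × 3 box filter, the use of discrete second differences, and the function name `roughness` are assumptions chosen for illustration; the embodiment does not specify the smoothing kernel or the derivative operator.

```python
import numpy as np

def roughness(depth_block):
    """Mean absolute second difference of a box-smoothed distance block.

    The block is assumed to be at least 5 x 5 so that second differences
    exist after the unpadded 3 x 3 smoothing.
    """
    d = np.asarray(depth_block, dtype=float)
    h, w = d.shape
    # 3 x 3 box smoothing over the valid interior only (no padding).
    smooth = sum(d[i:h - 2 + i, j:w - 2 + j]
                 for i in range(3) for j in range(3)) / 9.0
    dxx = np.diff(smooth, n=2, axis=1)   # second difference along x
    dyy = np.diff(smooth, n=2, axis=0)   # second difference along y
    return float(np.abs(dxx).mean() + np.abs(dyy).mean())
```

A tilted plane gives a value near zero because second differences of a linear ramp vanish, so a plane that is merely inclined toward the camera is not penalized. A threshold on this value, chosen above the noise level of the distance information, could then play the role of Th1 and Th2.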

Also, because the first embodiment uses a Fourier transform, the unit region to which a reliability flag is assigned is a block of N × N pixels. When a Fourier transform is not used, however, the way the image is divided into regions for setting reliability flags can be chosen more flexibly. For example, color changes and edges in the image may be detected, and regions may be determined dynamically for each object along those color changes and edges. This makes it possible to assign a reliability flag to each subject in the image rather than to each region of a simple grid. Although this requires additional computation to determine the regions, the joint between images is later determined in S305 according to the color information of the pixels and its gradient, so the results of this computation can be reused at that point.

<Second Embodiment>
The first embodiment described a method of setting the reliability of detected corresponding points based on the result of frequency analysis of the distance information corresponding to each region. The second embodiment describes a method in which, after corresponding points are detected, the reliability is set based on a determination of whether each detected corresponding point is visible from each camera. Components that are the same as in the foregoing embodiment are given the same reference numerals, and their description is omitted.

FIG. 9 shows the arrangement of two cameras with different optical axes and the state of the scene. An object 903 in the shape of a triangular prism is present in the region where the angles of view of the two cameras 901 and 902 overlap. FIG. 10 shows the images captured by the cameras 901 and 902 in the situation shown in FIG. 9. The image 1001 is the image captured by the camera 901, and the image 1002 is the image captured by the camera 902. The region 1003 in the image 1001 and the region 1004 in the image 1002 are regions that overlap in space.

As shown in FIG. 10, the same object 903 appears in the overlapping regions 1003 and 1004. However, the faces of the object 903 visible from the two cameras differ. If face A and face B of the object 903 have similar textures, a point on face A in the region 1003 and a point on face B in the region 1004 may be detected as corresponding points. Those corresponding points are not the same point in real space. Consequently, if the camera position and orientation are determined using corresponding points on the object 903 as clues, the estimate will not be correct. The second embodiment therefore focuses on this angle dependence of an object's appearance.

FIG. 11 is a flowchart of the position determination process executed by the position determination unit 205, applicable to the second embodiment. Steps S501 to S502 in FIG. 11 are the same as in the first embodiment. In step S1101, the position determination unit 205 refers to the distance information of each image and provisionally determines the distance information of the entire scene being captured and the position and orientation of each camera. A known method can be used for this.

In step S1102, the position determination unit 205 calculates the normals of the planar portions in the image using the distance information obtainable from each camera. Here, a planar portion is a region in which the object itself is planar; it is not required that the distance be constant across the region in the image. Planar portions can be found, for example, by identifying the regions given the reliability "high" in the first embodiment. The value of N need not be the same as in the first embodiment, however, and a larger area may be required as a condition. Because the distance information has been acquired, the normal direction of each planar portion can be calculated.

Next, in step S1103, when a corresponding point lies on a planar portion, the position determination unit 205 examines the relationship between the normal of that planar portion and the provisional position and orientation of each camera. FIG. 12 illustrates the relationship between the normal of a planar portion and the camera position and orientation. In the image 1001 captured by the camera 901, the normal of face A of the object 903 points toward the left as seen from the camera 901. In this case, face A of the object 903 is difficult to see from a camera placed to the right of the camera 901. Therefore, for a camera placed to the right of the camera 901, the reliability of corresponding points on face A of the object 903 can be regarded as low. Conversely, if the normal of a planar portion captured in the image 1001 points toward the right as seen from the camera 901, that planar portion is difficult to see from a camera placed to the left of the camera 901. In that case, for a camera placed to the left of the camera 901, the reliability of corresponding points on the planar portion is regarded as low. In other words, the reliability is not set low merely because a corresponding point lies on face A of the object 903. Rather, the reliability of a corresponding point on face A is set low specifically for correspondences with the image from a camera located to the right of the camera 901. Once the reliability has been set in step S1103 in this way, the process proceeds to step S511, where the weight coefficients are determined based on the reliability and the camera position and orientation are determined.

In this way, the reliability is set based on which direction the subject bearing a corresponding point faces relative to each camera, and is reflected in the weight coefficients, which improves the accuracy of the corresponding points.
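The visibility check of steps S1102 and S1103 can be sketched as follows. This is one possible generalization of the left/right reasoning above, not the procedure of the embodiment itself: it assumes that 3-D points of the planar portion are available in a common coordinate system, fits the normal by a least-squares plane fit, and treats the plane as visible from a camera when its normal points toward that camera. All function names are hypothetical.

```python
import numpy as np

def plane_normal(points, observer):
    """Least-squares normal of a 3-D point patch, oriented toward `observer`."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    if np.dot(normal, np.asarray(observer, dtype=float) - centroid) < 0:
        normal = -normal
    return normal, centroid

def faces_camera(normal, centroid, camera_pos, min_cos=0.0):
    """True if the plane's front side is turned toward the camera position."""
    to_cam = np.asarray(camera_pos, dtype=float) - centroid
    to_cam /= np.linalg.norm(to_cam)
    return float(np.dot(normal, to_cam)) > min_cos

def should_lower_reliability(points, cam_a, cam_b):
    """Lower the reliability of points seen by cam_a for pairing with cam_b
    when the plane they lie on is turned away from cam_b."""
    normal, centroid = plane_normal(points, cam_a)
    return not faces_camera(normal, centroid, cam_b)
```

With x pointing right and z pointing forward from the camera 901, a face whose normal points left and toward the camera 901 yields `True` for a second camera far to the right and `False` for one to the left, matching the case described for face A of the object 903.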

The second embodiment described a method in which the camera positions and orientations are provisionally obtained from the distance information, and the reliability is then set by referring to the distance information to determine whether the subject is visible from each camera. As another way of obtaining the provisional positional relationship of the cameras, information acquired by GPS or the like at the time of image capture may be used to obtain the position of each camera as a provisional value. Alternatively, the provisional camera positions may be obtained by having the user input the rough position of each camera. For example, in FIG. 9 the user may indicate with a mouse or the like that "the camera 902 is to the right of the camera 901" or that "the camera 901 and the camera 902 are arranged facing outward from each other", and this can be used as a reference for the provisional positional relationship.

<Third Embodiment>
The third embodiment describes a method for the case where distance information cannot be acquired: whether a subject's appearance changes greatly with direction is estimated from the acquired images alone, and the reliability flag is set accordingly. Whether the appearance of a subject changes greatly with the direction from which it is viewed can be estimated from the characteristics of the subject. For example, a subject bearing written characters often has a planar structure, as a signboard does. A subject bearing characters can therefore be regarded as one whose appearance is unlikely to change greatly when the viewing direction changes. Similarly, structures such as office buildings are largely planar, and their appearance can be regarded as unlikely to change greatly when the viewing direction changes. Furthermore, the sky and clouds are normally far enough from the camera (effectively at infinity) that their appearance is unlikely to change even if the shooting direction of the camera changes. By contrast, things such as the branches and leaves of trees or flowers have complex structures, and their appearance is highly likely to change greatly with the viewing direction.

The third embodiment uses this correspondence between specific subjects and the direction dependence of their appearance. Specifically, specific subjects are detected in the acquired image, and the reliability associated in advance with each specific subject is set.

FIG. 13 is a flowchart of the position determination process executed by the position determination unit 205, applicable to the third embodiment. Steps S501 to S502 are the same as in the first embodiment. In step S1303, the position determination unit 205 determines whether a specific subject is present in the image and detects it. In step S1304, based on the identification result of step S1303, the reliability associated with the subject is set as the reliability of the region in which that subject was identified.

Template matching, for example, can be used for the process of identifying a specific subject in step S1303. A template of the sky is held in advance, and a region identified as sky in step S1303 is given the reliability "high" in step S1304. Likewise, templates of leaves and trees are held, and a region identified as leaves or trees in step S1303 is given the reliability flag "low" in step S1304. Character recognition can also be used for the identification process in step S1303; a region in which characters are recognized in step S1303 is given the reliability flag "high" in step S1304. Straight-line recognition may also be used in the identification process of step S1303, so that a region enclosed by straight lines is regarded as a building-like artificial object and given the reliability "high" in step S1304. These processes may also be combined. Such processing makes it possible to set the reliability of corresponding points based on the direction dependence of the subject's appearance even when distance information cannot be acquired. Based on the reliability flags set in this way, the positional relationship of the images is determined in S1305. This process is the same as S511.
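The association between detected subjects and reliability flags in step S1304 can be sketched as a simple lookup. The label names, the default of "medium" for regions with no detected subject, and the rule of taking the lowest flag when several detectors fire on the same region are assumptions for illustration; the embodiment does not specify them.

```python
# Reliability associated in advance with each specific subject (S1304).
# The label strings are hypothetical detector outputs.
SUBJECT_RELIABILITY = {
    "sky": "high",        # effectively at infinity
    "text": "high",       # characters usually lie on planar signboards
    "building": "high",   # region enclosed by straight lines
    "foliage": "low",     # leaves and trees: complex structure
}
ORDER = {"low": 0, "medium": 1, "high": 2}

def region_reliability(detected_labels, default="medium"):
    """Return the reliability flag for a region given its detected subjects."""
    levels = [SUBJECT_RELIABILITY[label]
              for label in detected_labels if label in SUBJECT_RELIABILITY]
    if not levels:
        return default                   # assumption: nothing detected
    return min(levels, key=ORDER.get)    # assumption: most cautious flag wins
```

For example, a region in which both characters and foliage are detected would receive "low" under this rule, on the reasoning that a single angle-dependent subject is enough to make matches in the region unreliable.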

<Fourth Embodiment>
The fourth embodiment describes a method in which the position and orientation are determined first, and the reliability of the corresponding points is then set based on the result. Once the camera position and orientation have been determined, the position error of each corresponding point used as a clue can be obtained with respect to that single determined position and orientation (each i in Equation (1)). If the appearance of a subject differs depending on the shooting direction of each camera and erroneous corresponding points are included, the errors of those corresponding points are likely to vary widely. That is, after the camera position and orientation have been obtained, a region in which the errors of the corresponding points with respect to that position and orientation vary widely is judged to be a region where the corresponding points may not have been found correctly because the subject's appearance differs between images. By setting the reliability of such regions low and then calculating the position and orientation again, the camera position and orientation can be obtained with priority given to more reliable corresponding points.

FIG. 14 is a flowchart of the position determination process performed by the position determination unit 205, applicable to the fourth embodiment. First, steps S501 to S511 are the same as in the first embodiment and determine the position and orientation for each image.

In step S1401, the position determination unit 205 examines, for a given block, the error of each corresponding point with respect to the determined camera position and orientation. If this error exceeds a predetermined tolerance, the process proceeds to step S1402. In step S1402, the position determination unit 205 lowers the reliability of the block being processed and recalculates the camera position and orientation. This is repeated to obtain the final camera position and orientation.
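The loop of steps S1401 and S1402 can be sketched as an iteratively reweighted least-squares problem on the stacked system y = A h. The use of the mean squared residual per block as the error measure, the halving factor, and the iteration limit are assumptions for illustration, as are the function names.

```python
import numpy as np

def solve_weighted(A, y, w):
    """Weighted least squares: minimizes sum_k w[k] * (y[k] - A[k] @ h) ** 2."""
    Aw = A * w[:, None]
    return np.linalg.solve(A.T @ Aw, Aw.T @ y)

def refine(A, y, block_ids, tol, factor=0.5, max_iter=10):
    """Lower the weight of blocks whose error exceeds `tol`, then re-solve.

    block_ids[k] names the image block that produced row k of A and y.
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    block_ids = np.asarray(block_ids)
    w = np.ones(len(y))
    h = solve_weighted(A, y, w)
    for _ in range(max_iter):
        err = (y - A @ h) ** 2
        # S1401: blocks whose mean squared residual exceeds the tolerance.
        bad = [b for b in np.unique(block_ids)
               if err[block_ids == b].mean() > tol]
        if not bad:
            break
        w[np.isin(block_ids, bad)] *= factor   # S1402: lower the reliability
        h = solve_weighted(A, y, w)            # S1402: recalculate
    return h, w
```

In this sketch a block with mismatched points keeps exceeding the tolerance, so its weight keeps shrinking until the iteration limit is reached and its influence on h becomes negligible.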

The approach of lowering the reliability of a subject's corresponding points by comparing an error in the calculated result against a tolerance is not limited to the error of the corresponding points. For example, if a contradiction with the distance information or with the normals of the planar portions is found after the camera position and orientation have been determined, the reliability of the corresponding points on that subject may be lowered and the calculation repeated.

<Other Embodiments>
The embodiments described so far can be used not only individually but also in combination. The following forms are also conceivable as modifications of each. In the embodiments above, the reliability flag was described as having three levels: "high", "medium", and "low". However, the way the reliability is set is not limited to this. For example, there may be two levels, or four or more. A continuous value representing the weight coefficient may also be used as the reliability instead of a flag.

In the embodiments above, two input images were assumed in order to simplify the description, but the number of input images is not limited to two. The embodiments can be applied in the same way when a panoramic image is generated from more than two images. They can also be applied without difficulty when a plurality of cameras are arranged to cover all surrounding directions and a 360-degree omnidirectional panoramic image is composited. The apparatus may also be a stereo imaging apparatus for acquiring images for stereoscopic viewing. The cameras may also be arranged facing inward toward one another rather than facing outward. Alternatively, a plurality of input images may be acquired by changing the orientation of a single camera. Specifically, in FIG. 6, instead of using the cameras 601 and 602 simultaneously, a single camera may first acquire an image at the position of 601 and then be moved to the position of 602 to acquire another image. In this case the images used to generate the panoramic image are captured at different times, but the embodiments can be applied without difficulty as long as the scene is the same, even if the capture times differ.

In the embodiments above, the position determination unit 205 performs the corresponding point search between images, the analysis of distance information or subject determination, the setting of reliability, and the estimation of the camera position. However, each of these, or at least some of them, may be realized by separate means. For example, a corresponding point search unit searches for corresponding points between images and outputs the detected corresponding points. An analysis means or subject determination means evaluates the angle dependence of the subject captured in each image and outputs the evaluation result. A reliability setting unit then sets the reliability of each corresponding point based on the evaluation result, and the position determination unit determines the positions of the images based on the corresponding points and on weights corresponding to their reliability. When a high-quality panoramic image without visible seams is actually to be created, it is often the case that the user finally checks the composite image and makes fine corrections and adjustments by manually entering matching points or parameters before the panoramic image is output. By making the components independent as described above, the reliability set in the embodiments above can be presented to the user via an output device (monitor). The user can then refer to the reliability of each corresponding point when making manual adjustments. The user may also be allowed to correct or set the reliability of a region freely via an input device. This allows the user to control the image alignment while checking the images and to obtain the desired panoramic image.

The present invention can also be realized by executing the following process: software (a program) that implements the functions of the embodiments described above is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of that system or apparatus reads and executes the program.

205 Position determination unit
206 Joint determination unit
207 Image composition unit

Claims

An image processing apparatus comprising:
image acquisition means for acquiring at least two images captured in mutually different directions so that they partially overlap;
search means for searching, in the images acquired by the acquisition means, for corresponding points between images that correspond to the same subject;
position determining means for determining the positional relationship between the images acquired by the acquisition means based on the corresponding points; and
combining means for generating a panoramic image by joining the at least two images based on the positional relationship between the images determined by the position determining means,
wherein the position determining means sets the reliability of each corresponding point detected by the search means based on the angle dependency of the appearance of the subject captured in the two images, and determines the positional relationship between the images with reference to the reliability of each corresponding point.

The image processing apparatus according to claim 1, wherein the position determining means sets a low reliability for corresponding points on a subject whose feature points look different depending on the viewing direction, and sets a high reliability for corresponding points on a subject whose feature points change little in appearance depending on the viewing direction.

The image processing apparatus according to claim 1, further comprising:
distance information acquisition means for acquiring distance information corresponding to each of the images acquired by the acquisition means; and
analysis means for analyzing, for each predetermined region of the distance information, the gradient of change in the distance information,
wherein the position determining means sets the reliability for each of the predetermined regions based on the result of the analysis by the analysis means.

The image processing apparatus according to claim 3, wherein the analysis means performs frequency analysis on the distance image for each of the predetermined regions, and
the position determining means sets the reliability low for a region whose power spectrum, obtained by the analysis means, contains a frequency component above a predetermined threshold, and sets the reliability high for a region that contains no frequency component above the predetermined threshold.

The image processing apparatus according to claim 1, further comprising distance information acquisition means capable of acquiring distance information corresponding to each of the images acquired by the acquisition means,
wherein the position determining means sets the reliability based on the distance between the position of each of the images and the subject.

The image processing apparatus according to claim 1, further comprising identification means for identifying a subject in the image,
wherein the position determining means sets a reliability for each subject using the identification result obtained by the identification means.

The image processing apparatus according to claim 1, wherein the position determining means assumes a positional relationship between the images, sets the reliability based on the magnitude of the error between the assumed positional relationship and each corresponding point or the distance information, and recalculates the positional relationship of each of the images based on the set reliability.

The image processing apparatus according to claim 1, further comprising display means for displaying the reliability set by the position determining means on a display,
wherein the display means receives an instruction from the user, and the position determining means changes the reliability based on the instruction.

A program for causing a computer to function as:
image acquisition means for acquiring at least two images captured in mutually different directions so that they partially overlap;
search means for searching, in the images acquired by the acquisition means, for corresponding points between images that correspond to the same subject;
position determining means for setting the reliability of each corresponding point detected by the search means based on the angle dependency of the appearance of the subject captured in the two images, and for determining the positional relationship between the images acquired by the acquisition means with reference to the reliability of each corresponding point; and
combining means for generating a panoramic image by joining the at least two images based on the positional relationship between the images determined by the position determining means.

An image processing method comprising:
acquiring at least two images captured in mutually different directions so that they partially overlap;
searching, in the acquired images, for corresponding points between images that correspond to the same subject;
setting the reliability of each corresponding point detected by the search based on the angle dependency of the appearance of the subject captured in the two images, and determining the positional relationship between the acquired images based on the corresponding points and the reliability of each corresponding point; and
generating a panoramic image by joining the at least two images based on the determined positional relationship between the images.