JP4321251B2 - Apparatus and method for generating and displaying composite image

Info

Publication number: JP4321251B2
Application number: JP2003417435A
Authority: JP
Inventors: 高斉松本; 俊夫守屋
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2003-12-16
Filing date: 2003-12-16
Publication date: 2009-08-26
Anticipated expiration: 2023-12-16
Also published as: JP2005182098A

Description

The present invention relates to an apparatus that automatically generates and displays a composite image by finding the portions common to a plurality of images and joining the images together. More specifically, it relates to an apparatus and a method that detect the common portions of a plurality of images by matching edge features and hue features, reduce false correspondences between feature points by using the relative positional relationships among the feature points, and generate and display one or more composite images.

When a single image is generated from a plurality of images, the usual approach is for a person to identify the features common to the images and select them manually. When the process is automated, it can be difficult to extract a large number of common features from the images, and matching feature points by correlation often produces false correspondences, so it is known to be difficult to combine the images into a single image automatically and accurately (see, for example, Patent Document 1 and Non-Patent Documents 1 to 3).

Patent Document 1: JP 2000-90232 A

Non-Patent Document 1: Naoki Chiba, Haruo Hatanaka, Takashi Iida, "High-speed and high-accuracy panoramic image synthesis software based on image features", Sanyo Electric Technical Report, Vol. 35, No. 1, pp. 75-82, 2003.

Non-Patent Document 2: John Krumm, "Object Detection with Vector Quantized Binary Features", Proc. of the 1997 IEEE Int. Conf. on Computer Vision and Pattern Recognition, pp. 179-185, 1997.

Non-Patent Document 3: Tatsuya Yoshida, Masataka Kagesawa, Tetsuya Tomonaka, Katsushi Ikeuchi, "Recognition of Vehicles by Local Feature Recognition Algorithm", IEICE Technical Report, Vol. 101, No. 302, pp. 9-14, 2001.

The problem addressed is to construct an apparatus that detects the portions common to a plurality of images obtained from an image input device, or from a storage device in which images are already stored, and thereby joins the images and automatically generates and displays them as a single image. The one or more images generated from the plurality of images are defined as a composite image, and this term is used below.

To solve this problem, images are input by controlling an image input device or a storage device in which images are already stored; an edge image and a hue image are generated from each input image; in each of these images, points whose edge or hue values have low similarity to their neighborhood are extracted as feature points; the feature points are collated between the images to find the sets of images that share common feature points; among images sharing feature points, only the feature points whose relative positional relationship is preserved are retained; the average of the relative positions of those feature points is calculated; and, taking this average as the relative position between the images, the images are superimposed to generate and display a composite image.

According to the present invention, features common to a plurality of images can be extracted and collated, and a single composite image can be generated and displayed automatically. In particular, a large number of edge and hue feature points are extracted from each image and collated between images to detect their common portions, and false correspondences between feature points are reduced by comparing the relative positions of the feature points, which allows a more accurate composite image to be generated and displayed automatically.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 outlines the processing flow of the present invention. When processing starts (101), images are input by controlling an image input device such as a camera (102). An edge image and a hue image are generated from each input image, and in each of these images the points whose edge similarity or hue similarity to their neighborhood is low are extracted as feature points (103). The feature points are collated between the images to find the sets of images that share common feature points, and within each such set only the feature points whose relative positional relationship is preserved are retained (104). For images sharing common feature points, the average of the relative positions of the retained feature points is calculated and used as the relative position between the images to generate a composite image (105). The composite image is displayed (106), and the program ends (107).

Building on this outline, FIG. 2 shows the processing flow of FIG. 1 in more detail. Step 102 of FIG. 1 corresponds to 202 of FIG. 2, step 103 to 203 to 206, step 104 to 207 to 209 and 212, step 105 to 210, and step 106 to 211.

FIG. 3 shows process 208 of FIG. 2 in detail.

When the program starts (201), an image input device such as a camera is controlled as the image input process (102), and a plurality of images are captured (202).

Next, image feature calculation (103) is performed. FIG. 4 shows examples of steps 204 and 206 among the steps (203 to 206) of image feature calculation (103). For each of the images input by image input (102), an edge image binarized by brightness is first generated (203). A window (402) is set in the generated edge image (401) and scanned across the image (403). For the window at a given position (404), the binary value (white or black) of each pixel is read as its pixel value and compared with the corresponding pixel value of a window (405) placed around it: each pixel scores 1 if the values match and 0 if they do not, and the scores are summed to obtain what the description calls the Hamming distance, which as defined here counts the matching pixels. If this value is equal to or less than a threshold, the window is taken as a feature point, and its coordinates in the image and the pixel values in the window centered on those coordinates are recorded (204). The threshold is set as the proportion of pixel values that match when the windows are compared. In FIG. 4 the surrounding window is drawn as a single window (405) for simplicity, but in practice surrounding windows are placed all around the window (402).
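The edge feature test of step 204 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names (`match_count`, `window_at`, `is_edge_feature`), the placement of the eight surrounding windows at offsets of one window size, and the rule that every surrounding window must be dissimilar are all assumptions made for the example.

```python
def match_count(win_a, win_b):
    """Count the pixel positions at which two equal-sized binary windows agree."""
    return sum(a == b
               for row_a, row_b in zip(win_a, win_b)
               for a, b in zip(row_a, row_b))


def window_at(img, top, left, size):
    """Cut a size x size window out of a binary image (list of rows)."""
    return [row[left:left + size] for row in img[top:top + size]]


def is_edge_feature(img, top, left, size, max_ratio):
    """Return True if the window agrees with each in-bounds surrounding
    window in at most max_ratio of its pixels (assumed decision rule)."""
    height, width = len(img), len(img[0])
    center = window_at(img, top, left, size)
    limit = max_ratio * size * size
    for dy in (-size, 0, size):          # assumed neighbour spacing
        for dx in (-size, 0, size):
            if dy == 0 and dx == 0:
                continue
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue                 # neighbour falls outside the image
            if match_count(center, window_at(img, t, l, size)) > limit:
                return False             # too similar to this neighbour
    return True


# A 6 x 6 black image with a 2 x 2 white block in the middle.
img = [[0] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        img[y][x] = 1

print(is_edge_feature(img, 2, 2, 2, 0.5))  # window on the white block
print(is_edge_feature(img, 0, 0, 2, 0.5))  # window in a uniform corner
```

The window on the white block differs from all of its neighbours and is reported as a feature point, while the window in the uniform corner is not.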

Image feature calculation (103) also generates a hue image from each captured image (205). As with the edge image, a window (402) is set in the generated hue image (401) and scanned across the image (403). For the window at a given position (404), the hue of each pixel, expressed as an angle from 0 to 360 degrees, is read as its pixel value. The pixel values of a window (405) placed around the window (404) are read in the same way. Taking the upper-left corner of each window as the origin, the difference between the pixel values at the same coordinates in the window (404) and the window (405) is then taken, and the differences are summed over all pixels. Specifically, the sum is calculated by Equations 1 and 2 below.

Here n is the number of pixels along the vertical or horizontal side of the window, and H1 and H2 are the pixel values of a given pixel in each window. Equations 1 and 2 are evaluated only when i = j. When the absolute value dij of H1 minus H2 is 180 or more, d = 360 - dij is calculated and substituted for dij.

Next, points where this sum of differences is equal to or greater than a threshold are taken as feature points, and their coordinates in the image and the pixel values in the window centered on those coordinates are recorded (206). The threshold is defined as the ratio of the sum of per-pixel differences between the windows being compared to the sum of differences in the case where the pixel values of the windows differ as much as possible. Denoting this threshold by R, when the similarity of windows is actually compared, Equation 3 gives the ratio r of the sum of per-pixel differences between the windows currently being compared to that maximum sum.

If the value r obtained in this way is smaller than the threshold R, the windows are judged not to be similar.
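The hue comparison described above can be sketched as follows. Equations 1 to 3 are not reproduced in this text, so this is a reading of the prose rather than the equations themselves: the names `hue_diff` and `hue_window_ratio` are hypothetical, and the maximum per-pixel difference is assumed to be 180 degrees, which follows from the wrap-around rule. How r is then compared with R is left to the surrounding description.

```python
def hue_diff(h1, h2):
    """Absolute hue difference in degrees, wrapped so it never exceeds 180."""
    d = abs(h1 - h2)
    return 360 - d if d >= 180 else d


def hue_window_ratio(win_a, win_b):
    """Ratio r of the summed per-pixel hue differences between two
    equal-sized windows to the largest possible sum (180 per pixel)."""
    total = sum(hue_diff(a, b)
                for row_a, row_b in zip(win_a, win_b)
                for a, b in zip(row_a, row_b))
    n_pixels = sum(len(row) for row in win_a)
    return total / (180 * n_pixels)


print(hue_diff(350, 10))                                          # wraps around 0/360
print(hue_window_ratio([[0, 0], [0, 0]], [[180, 180], [180, 180]]))  # maximally different
```

Wrapping the difference keeps hues on either side of the 0/360 boundary, such as 350 and 10 degrees, close to each other instead of 340 degrees apart.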

Next, image feature matching (104) is performed. It consists of a process (207) that finds the combinations of images having corresponding edge or hue feature points, and a process (208) that, for the image combinations found by 207, selects from the feature points common to the images only those whose relative positions also agree. FIG. 5 shows an example of process 207. The edge feature points (502 to 505) obtained for one image (501) are used as reference images and collated with the feature points extracted from the other images (506 to 508) captured by image input (102). For image 506, for example, their similarity is compared with the feature points (509 to 512) already extracted by image feature extraction (103).

At this time, each pixel in the window scores 1 if the pixel values match and 0 if they do not, the scores are summed to obtain the Hamming distance, and the points where the Hamming distance is equal to or greater than a threshold are found and their coordinates recorded. In this way the positions of edge feature points common to the images are calculated and recorded. Similarly, the hue feature points (502 to 505) obtained for an image (501) are used as reference images and collated while scanning the other images (506 to 508). The difference between each pixel value of the image in the window and the corresponding pixel value of the image in the window at the reference destination is taken, and the differences are summed, in the same way as Equations 1 and 2 in the description of image feature extraction (103). Points where the value of Equation 3 is equal to or greater than a threshold are found and their coordinates recorded. In this way the positions of hue feature points common to the images are calculated and recorded.

In this way, all combinations of images having common edge or hue feature points are found. FIG. 6 shows an example of process 208 of image feature matching (104). In 208, images (601, 607) that share a plurality of feature points are selected, and the combination of feature points whose relative positions agree best is found. First, supposing that images 601 and 607, which have common feature points, are to be overlaid, a suitable feature point (604) is selected as the reference (301).

Next, images 601 and 607 are placed so that they overlap at the reference feature point (604) (302). The relative distances of the feature points that match between the images are then calculated (303); here the relative distance is the distance from the reference point to a feature point. Among the matching feature points, those whose mutually corresponding relative distances are equal to or less than a threshold are counted (304). FIG. 6 shows an example of the relationship among the arbitrarily selected reference point, the relative distance, and the threshold. The reference point may be any feature point, or any location other than a feature point. Likewise, when images 601 and 607 are overlaid with feature point 602 as the reference, the matching feature points whose relative positions are within the threshold are counted, and this count is made with every feature point taken in turn as the reference. When the counting has been completed for all feature points (305), the reference feature point that gives the largest count is selected (306).

When the images are arranged by overlaying the feature points that give the largest count, the number of common feature points whose relative positional relationship is preserved is at its maximum. Accordingly, with the images arranged on the feature point selected in 306, matched feature points whose mutual distance is equal to or greater than the threshold are excluded, only those within the threshold are selected, and the image in each feature point's window and its coordinates are recorded as a file (307). In FIG. 6, for example, when images 601 and 607 are arranged with feature point 602 as the reference, the matching pairs 602 and 608, 603 and 609, and 605 and 610 coincide exactly, so more matching pairs fall within the threshold than when the images are overlaid with 604 as the reference. Feature point 602 is therefore selected as the reference and recorded (307). The pair of feature points 613 and 616 has matching features, but because their mutual distance is equal to or greater than the threshold, it is excluded from the record (308).
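Steps 301 to 308 can be sketched as a simple voting loop. This is an illustrative reading under stated assumptions: each matched pair is given as ((x, y) in the first image, (x, y) in the second image), the images are related by pure translation, and the function name `select_consistent_matches` is hypothetical.

```python
import math


def select_consistent_matches(pairs, max_dist):
    """Try every matched pair as the reference, overlay the images so that
    the reference points coincide, and keep the pairs that land within
    max_dist of each other for the reference with the most such pairs."""
    best_ref, best_inliers = None, []
    for ref_a, ref_b in pairs:
        # Shift that moves the second image's reference onto the first's.
        dx, dy = ref_a[0] - ref_b[0], ref_a[1] - ref_b[1]
        inliers = [(a, b) for a, b in pairs
                   if math.hypot(a[0] - (b[0] + dx),
                                 a[1] - (b[1] + dy)) <= max_dist]
        if len(inliers) > len(best_inliers):
            best_ref, best_inliers = (ref_a, ref_b), inliers
    return best_ref, best_inliers


pairs = [
    ((10, 10), (0, 0)),    # consistent with a shift of (10, 10)
    ((20, 15), (10, 5)),   # consistent
    ((30, 40), (20, 30)),  # consistent
    ((50, 50), (5, 90)),   # features match but positions do not
]
ref, kept = select_consistent_matches(pairs, max_dist=2)
print(ref)
print(len(kept))
```

The last pair plays the role of the pair 613 and 616 in FIG. 6: its features match, but it lies far from its partner once the images are overlaid, so it is dropped.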

Next, the number of feature points recorded in process 208 of image feature matching (104) and the distances between the feature points on the image are evaluated against thresholds (209). If they are below the thresholds, it is judged that the feature points needed to generate a composite image have not been obtained, and an error message informs the operator that the composite image cannot be generated (212). If they are equal to or greater than the thresholds, composite image generation (105) calculates the relative position of each pair of feature points that match between the images, averages the relative positions over all matching pairs, and takes this average as the relative position of the images sharing the feature points. The relative positions of images sharing common feature points are found in turn, and a single image is generated by overlapping those images according to their relative positions; this is the composite image (210). Finally, as image display (106), the generated composite image is displayed (211), and the program ends (213).
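The averaging and overlay of step 210 can be sketched as follows. This is a minimal example under assumptions not stated in the description: images are lists of pixel rows, the average offset is rounded to whole pixels, pixels of the second image overwrite the first where they overlap, and the names `mean_offset` and `composite` are hypothetical.

```python
def mean_offset(pairs):
    """Average displacement from second-image to first-image coordinates
    over all matched pairs, rounded to whole pixels."""
    n = len(pairs)
    dx = sum(a[0] - b[0] for a, b in pairs) / n
    dy = sum(a[1] - b[1] for a, b in pairs) / n
    return round(dx), round(dy)


def composite(img_a, img_b, offset, fill=0):
    """Place img_a and img_b on one canvas, with img_b's origin at
    `offset` in img_a's coordinates. img_b overwrites the overlap."""
    dx, dy = offset
    ha, wa = len(img_a), len(img_a[0])
    hb, wb = len(img_b), len(img_b[0])
    x0, y0 = min(0, dx), min(0, dy)            # canvas origin
    x1, y1 = max(wa, dx + wb), max(ha, dy + hb)
    canvas = [[fill] * (x1 - x0) for _ in range(y1 - y0)]
    for y, row in enumerate(img_a):
        for x, value in enumerate(row):
            canvas[y - y0][x - x0] = value
    for y, row in enumerate(img_b):
        for x, value in enumerate(row):
            canvas[y + dy - y0][x + dx - x0] = value
    return canvas


pairs = [((3, 0), (1, 0)), ((4, 1), (2, 1))]
offset = mean_offset(pairs)
print(offset)
print(composite([[1, 2, 3, 4]], [[3, 4, 5]], offset))
```

Averaging over all retained pairs, rather than trusting a single pair, spreads small localization errors in individual feature points across the whole set.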

Next, the configuration of the apparatus in an embodiment of the present invention is described with reference to FIG. 7. The apparatus comprises a main processor (701) that processes data according to programs, a main storage (707) such as memory that holds programs and data, an auxiliary storage (706) such as a hard disk, a display (702) and a graphics board (703), and a camera (704) and a video capture board (705) for image input (102). This configuration can be realized in hardware by a processor, memory, and other LSIs, and in software by programs loaded into memory; what is described here are the functional blocks realized by their cooperation. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms by hardware alone, software alone, or a combination of the two.

The programs and data that operate in the main storage (707) are as follows. The operating system (709) is the basic software that manages program execution and the like. The programs that generate a composite image from a plurality of images are: a main program (710) that controls the overall processing flow; an image input program (711) corresponding to camera image input (102) in FIG. 1; an image feature calculation program (712) corresponding to image feature calculation (103); an image feature matching program (713) corresponding to image feature matching (104); a composite image generation program (714) corresponding to composite image generation (105); and an image display program (715) corresponding to image display (106). There is also an image buffer (716) that holds processing results during image processing. The auxiliary storage (706) stores the programs (709 to 715) to be read into the main storage as programs (708). These programs are read into memory and executed by the processor to carry out the processing.

The processing of each program is as follows. The main program (710) controls the overall flow, shown in FIGS. 1 to 6, from image input to generation and display of the composite image. The image input program (711), as camera image input (102), controls the video capture board (705) and the camera (704) and loads the images taken by the camera into the image buffer (716) in the main storage (707) (202 in FIG. 2). The image buffer (716) also serves as the work area for the programs described below.

Next, the image feature calculation program (712) generates an edge image from each image captured by image input (102) (203 in FIG. 2) and extracts as feature points the points whose edge similarity to their neighborhood is low (204 in FIG. 2). It also generates a hue image (205 in FIG. 2) and extracts as feature points the points whose hue similarity to their neighborhood is low (206 in FIG. 2). The image feature matching program (713) calculates, for edges and for hues separately, the combinations of feature points common to the images (207 in FIG. 2).

More specifically, among the edge and hue feature points common to the images, only those for which the difference between the relative position of the feature point in one image and its relative position in the referenced image is equal to or less than a threshold are selected and recorded (208 in FIG. 2). The number of matched feature points and the distances between them on the image are evaluated against thresholds, and if either falls short of its threshold an error is displayed (212 in FIG. 2). If both are equal to or greater than their thresholds, the composite image generation program (714) calculates the average relative position between the images from the matched feature points and combines the images into a single composite image according to this relative position (210 in FIG. 2). The image display program (715) then displays the composite image generated by the composite image generation program (714) (211 in FIG. 2).

Needless to say, the object of the present application is also achieved by supplying a system or apparatus with a recording medium on which a software program realizing the functions of the above embodiment is recorded, and having the processor of that system or apparatus read and execute the program stored on the recording medium.

The above description used a camera as the image input device in the embodiment. The same results can be obtained when images are input from other devices such as a camera-equipped mobile phone, an electronic organizer such as a camera-equipped PDA (Personal Digital Assistant), a video camera, or a scanner, or from a storage device on a network that holds still-camera images.

As examples of industrial application, the present invention makes it possible to automatically generate and display a single composite image from a plurality of images taken by a camera mounted on a moving object such as an automobile, aircraft, or satellite, or from a plurality of images taken by a camera array in which several cameras are arranged side by side, which can be expected to save labor and shorten working time.

FIG. 1: A diagram of the overall functional configuration of the present invention.
FIG. 2: A diagram showing the overall processing flow of the present invention.
FIG. 3: A diagram showing the flow of process 208, which relates to image feature matching, within the overall processing flow of FIG. 2.
FIG. 4: A diagram showing processes 204 and 206, which relate to image feature extraction, within the overall processing flow of FIG. 2.
FIG. 5: A diagram showing process 207, which relates to image feature matching, within the overall processing flow of FIG. 2.
FIG. 6: A diagram showing process 208, which relates to image feature matching, within the overall processing flow of FIG. 2.
FIG. 7: A diagram of the overall configuration of the apparatus in one embodiment of the present invention.

Explanation of symbols

101: program start process, 102: image input process, 103: image feature extraction process, 104: image feature matching process, 105: composite image generation process, 106: image display process, 107: program end process

Claims

An image generation device that generates a composite image from a plurality of images, comprising:
image input means for inputting images;
image feature calculating means for calculating features in each of the plurality of images input from the image input means;
image feature collating means for comparing the features of each image obtained by the image feature calculating means and collating images having similar features;
image generating means for generating a composite image by superimposing at least two of the plurality of images based on the collation result of the image feature collating means; and
means for displaying the composite image,
wherein the image feature calculating means generates an edge image and a hue image for each of the plurality of images and extracts feature points in each image based on the edge image and the hue image, and
wherein the image feature collating means, based on the pixel values of the edge image and the hue image at the feature points in each of the plurality of images, determines that a feature point in one image is similar to a feature point in another image when the difference between their pixel values is equal to or less than a predetermined threshold; takes, in at least two images having the similar feature points, one of the plurality of feature points in each image as a reference point and determines whether the other feature points in that image are within a predetermined distance from the reference point; and, when any of the other feature points is outside the predetermined distance, deletes that feature point from the extracted feature points.

The image generation device according to claim 1,
wherein the image generating means combines two or more images having the similar feature points by superimposing the positions of the feature points.