JP2015200918A

JP2015200918A - Image processor, method, and program

Info

Publication number: JP2015200918A
Application number: JP2014077401A
Authority: JP
Inventors: Mitsuharu Oki (大木 光晴)
Original assignee: Individual
Current assignee: Individual
Priority date: 2014-04-04
Filing date: 2014-04-04
Publication date: 2015-11-12
Anticipated expiration: 2034-04-04
Also published as: JP6348318B2

Abstract

PROBLEM TO BE SOLVED: To achieve a photographic expression that conveys the passage of time by seamlessly compositing an image photographed in the past (image A) and a current image (image B) photographed from almost the same position.
SOLUTION: The relation between the photographing positions of image A and image B is acquired (photographing position calculation part 13); a virtual photographing position that moves over time, starting at the photographing position of image A and ending at the photographing position of image B, is determined (virtual photographing position calculation part 14); an image as seen from the virtual photographing position is created by deforming image A (deformation image creation part 15); image B and the deformed image A are then composited (composite image creation part 16); and the result is displayed on an image display part 17.

Description

The present technology relates to an image processing method, apparatus, and program for creating one group of images from a plurality of images.

One form of photographic expression is known as rephotography (Re-photography). In rephotography, a new picture is taken from the same place as an existing image (photograph) of a historical building or natural landscape taken in the past, and the two images (the past image and the newly taken image) are combined into one. This enables a photographic expression that conveys the passage of time. To achieve this, the new picture must be taken from the same shooting position as the past image. If it is taken from a different position, misalignment occurs when it is superimposed on the past image, and the result is not fit for viewing. However, it is difficult to identify the position from which a past image was taken just by looking at it.

Patent Document 1 or Non-Patent Document 1 therefore describes a technique that automatically analyzes the past image and tells the photographer where to shoot from. With this technique, the photographer, carrying the photographic equipment, first takes a picture near the position where the past image was taken. The past image and the newly taken image are then analyzed, and an instruction such as "move 5 m to the right" is given. However, this technique only indicates the shooting position; the photographer still has to move with the equipment as instructed. Historical buildings in particular are often surrounded by many tourists, and moving without getting in their way is difficult. Moreover, in the example above, moving 5 m to the right might put the photographer on a busy road, making it too dangerous to shoot.

Ideally, the photographer's work should end after shooting once from approximately the same position as the past shooting position. In other words, what is desired is a photographic expression technique that, from a single shot taken at approximately the same position as the past shooting position, fuses the past image with the newly taken image and conveys the passage of time. No such technique existed, nor did any apparatus or method for realizing it.

Patent Document 1: JP 2011-239361 A

Non-Patent Document 1: Soonmin Bae, Aseem Agarwala, Fredo Durand. "Computational Re-Photography", ACM Transactions on Graphics (Presented at SIGGRAPH Asia 2010), 29(3), pp. 24:1-24:15.

What is desired is a photographic expression technique that, simply by shooting from approximately the same position as the past shooting position, fuses the past image with the newly taken image and conveys the passage of time. No such technique existed, nor did any apparatus or method for realizing it.

The present technology has been made in view of this situation. It makes it possible to seamlessly combine an image taken from almost the same place with an image taken in the past, so the photographer does not need to move. The combined images form a photographic expression that conveys the passage of time.

An image processing apparatus according to one aspect of the present technology generates a third image using a first image and a second image, and includes: a shooting position calculation unit that, by matching the first image and the second image, obtains the positional relationship between a first shooting position at which the first image was shot and a second shooting position at which the second image was shot; a virtual position determination unit that determines a position between the first shooting position and the second shooting position (referred to as a third shooting position); and an image generation unit that generates, as the third image, the image that would be obtained if shot from the third shooting position. The virtual position determination unit moves the third shooting position over time along a path that starts at one of the first and second shooting positions and ends at the other. The image generation unit generates the third image by applying to the first image a deformation that corresponds to the positional relationship of the third shooting position relative to the first shooting position.

This makes it possible to seamlessly combine an image taken in the past with a current image taken from almost the same place, and to realize a photographic expression that conveys the passage of time.

Further, in the above image processing apparatus, the first image has depth information for each pixel position, and the shooting position calculation unit uses the depth information to obtain the positional relationship between the first shooting position and the second shooting position.

This improves accuracy.

Further, in the above image processing apparatus, the first image has mask information for each pixel position, and the shooting position calculation unit performs the matching using the information of the first image only at positions that have the mask information, without using the information of the first image at positions that lack it.

This improves accuracy.

An image processing method or image processing program according to one aspect of the present technology generates a third image using a first image and a second image, and includes: a shooting position calculation step of obtaining, by matching the first image and the second image, the positional relationship between a first shooting position at which the first image was shot and a second shooting position at which the second image was shot; a virtual position determination step of determining a position between the first shooting position and the second shooting position (referred to as a third shooting position); and an image generation step of generating, as the third image, the image that would be obtained if shot from the third shooting position. In the virtual position determination step, the third shooting position is moved over time along a path that starts at one of the first and second shooting positions and ends at the other. In the image generation step, the third image is generated by applying to the first image a deformation that corresponds to the positional relationship of the third shooting position relative to the first shooting position.

In image processing according to one aspect of the present technology, a third image is generated using a first image and a second image. By matching the first image and the second image, the positional relationship between a first shooting position at which the first image was shot and a second shooting position at which the second image was shot is obtained; a position between the first shooting position and the second shooting position (referred to as a third shooting position) is determined; and the image that would be obtained if shot from the third shooting position is generated as the third image. In particular, the third shooting position is moved over time along a path that starts at one of the first and second shooting positions and ends at the other, and the third image is generated by applying to the first image a deformation that corresponds to the positional relationship of the third shooting position relative to the first shooting position.

According to one aspect of the present technology, an image taken in the past and a current image taken from almost the same place can be combined seamlessly, realizing a photographic expression that conveys the passage of time.

The drawings, in order, are as follows.
A block diagram showing the configuration of an embodiment of the present invention.
A flowchart explaining the processing of the embodiment of the present invention.
A diagram explaining a specific shooting situation to which the embodiment can be applied.
A diagram showing an image taken in the past to which the embodiment can be applied.
A diagram explaining an image taken in the past to which the embodiment can be applied.
A diagram explaining an image taken in the past to which the embodiment can be applied.
A diagram explaining a specific shooting situation to which the embodiment can be applied.
A diagram showing an image taken at the present time to which the embodiment can be applied.
A diagram showing an image generated when the embodiment is applied.
A diagram showing an image generated when the embodiment is applied.
A diagram explaining the embodiment of the present invention.
A diagram showing an image generated when the embodiment is applied.
A block diagram showing a configuration example of a computer apparatus that executes the processing of the embodiment.

<< Embodiment of the Invention >>
An embodiment of the present invention is described below. First, a typical way of using the invention and an example of the processing performed by the apparatus are outlined.
<User action 1> The user first prepares an image taken in the past (image A). Image A is, for example, a photograph of a historical building taken several decades ago.
<User action 2> Referring to image A, the user goes to the site where image A was taken and shoots from almost the same shooting position. The image taken is image B.
<User action 3> The user inputs image A and image B to the apparatus of the present invention.
The apparatus of the present invention performs the following processing.
<Process 1> Image A and image B are matched to obtain the positional relationship between the position at which image A was shot (shooting position PA) and the position at which image B was shot (shooting position PB).
<Process 2> Next, a virtual shooting position PC(t) is introduced. t is a parameter representing time and takes integer values from 0 to 100. PC(t) is varied with t so that it coincides with shooting position PA at t = 0 and with shooting position PB at t = 100. The range 0 to 100 is chosen so that 101 images are generated; for a different number of images, let t take a correspondingly different range of values.
<Process 3> For each time t, the image that would be obtained if shot from PC(t) is created by deforming image A. This image is C(t).
<Process 4> Image C(100) and image B are combined. The result is image D.
<Process 5> Finally, the image display unit displays images C(0) to C(100) in succession, followed by image D.
Through this series of processes, the images ultimately displayed in <Process 5> (102 images played back in succession) convey the passage of time, for the following reason. The images C(t) (t = 0 to 100) created in <Process 3> are created from image A and are images taken in the past or, more precisely, deformed versions of an image taken in the past. That is, C(t) (t = 0 to 100) depict the past scene. Furthermore, C(100) is the image that would be obtained from shooting position PC(100) = PB, that is, from the same shooting position as image B. Therefore, when image C(100) and image B are superimposed there is no misalignment, and compositing them does not produce artifacts. Image D is thus an artifact-free image in which the current scene and the past scene are combined. By displaying images C(0) through C(100) in order and then image D, image A (the past photograph of the historical building) slowly deforms, and finally the "historical building of the past" is shown fused with the current scene. In this way, a photographic expression that conveys the passage of time becomes possible.
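As a rough illustration of the display order in <Process 5>, the sketch below only enumerates the frames; it performs no image processing. The function name `playback_order` and the string labels are hypothetical and were introduced for this example.

```python
def playback_order(last_t: int = 100) -> list[str]:
    """Return frame labels in display order: C(0), ..., C(last_t), then D."""
    if last_t < 0:
        raise ValueError("last_t must be non-negative")
    # Deformed versions of image A, one per integer time step.
    frames = [f"C({t})" for t in range(last_t + 1)]
    # Composite of C(last_t) and image B, shown last.
    frames.append("D")
    return frames


frames = playback_order()
print(len(frames), frames[0], frames[-2], frames[-1])  # 102 C(0) C(100) D
```

With the default range of 0 to 100 this yields the 101 deformed images plus the composite, which matches the 102 images mentioned in the text.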

The above is a brief description of a typical way of using the present invention and of the processing in the apparatus of the present invention.

In general, given two images, the problem reduces to a stereo matching problem, so the positional relationship between shooting position PA and shooting position PB (<Process 1> above) can be obtained with an existing two-image stereo matching method. However, because the solution process is unstable, the positional relationship obtained this way is often inaccurate. In this embodiment, the accuracy of the positional relationship is therefore improved by attaching depth information to image A. That is, each position (x, y) of image A has a pixel value A(x, y) and, in addition, a depth value Z(x, y). Let fA be the focal length at which image A was shot. A(x, y), Z(x, y), and fA are thus assumed to be known.

Being given image A is equivalent to being given the pixel value A(x, y) for every position (x, y), so A(x, y) is clearly known. fA can be found from information about the camera that took image A in the past. The depth information Z(x, y) can be created automatically, for example with the method described in D. Hoiem, A. A. Efros, and M. Hebert, "Automatic Photo Pop-up", ACM SIGGRAPH 2005. Alternatively, the user may create the depth information Z(x, y) by hand.

Information about which of the subjects projected in image A still exist today is also assumed to be known. The user can create it by hand on site, at the same time as <User action 2> above, by comparing the current scene with image A. This information is called the matching mask information M(x, y), and its value is 0 or 1. If the subject at position (x, y) in image A still exists today, M(x, y) = 1; if it no longer exists, M(x, y) = 0.

Besides the historical building, image A also contains unneeded subjects, such as the sky and the ground. The user therefore creates information for excluding the unneeded parts. This information is called the subject mask information N(x, y), and its value is 0 or 1. If the subject at position (x, y) in image A is an important subject (such as the historical building), N(x, y) = 1; otherwise, N(x, y) = 0.

Several users may each want to combine the same "past photograph of a historical building" with a current photograph of their own. If one person creates the depth information Z(x, y), the matching mask information M(x, y), and the subject mask information N(x, y) for image A and shares them, other users are spared the effort of creating Z(x, y), M(x, y), and N(x, y).

The user can also find out the focal length of their own camera when shooting image B, so this focal length (denoted fB) is also assumed to be known.

To summarize, the information input to the apparatus of this embodiment is as follows.
(Input 1) Image A, that is, the pixel value A(x, y) for each position (x, y).
(Input 2) Depth information for image A, that is, the depth value Z(x, y) for each position (x, y).
(Input 3) The focal length fA at which image A was shot.
(Input 4) Information about which of the subjects projected in image A still exist today, that is, the matching mask information M(x, y) for each position (x, y).
(Input 5) Information about whether each subject projected in image A is an important subject, that is, the subject mask information N(x, y) for each position (x, y).
(Input 6) Image B, that is, the pixel value B(x, y) for each position (x, y).
(Input 7) The focal length fB at which image B was shot.
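The seven inputs can be pictured as one record. The following is a minimal sketch, not the patent's data format: the class name `RephotoInputs`, the field names, the nested-list layout (`grid[y][x]`), and the use of single-channel pixel values are all assumptions made for illustration.

```python
from dataclasses import dataclass

Grid = list[list[float]]  # row-major: grid[y][x]
Mask = list[list[int]]    # row-major, each entry 0 or 1


@dataclass
class RephotoInputs:
    image_a: Grid       # (Input 1) pixel values A(x, y); grayscale for brevity
    depth_z: Grid       # (Input 2) depth Z(x, y) for image A
    focal_a: float      # (Input 3) focal length fA
    match_mask: Mask    # (Input 4) M(x, y): 1 if the subject still exists today
    subject_mask: Mask  # (Input 5) N(x, y): 1 if the subject is important
    image_b: Grid       # (Input 6) pixel values B(x, y)
    focal_b: float      # (Input 7) focal length fB

    def validate(self) -> None:
        """Check that the per-pixel data attached to image A are consistent."""
        shape = _shape(self.image_a)
        for name in ("depth_z", "match_mask", "subject_mask"):
            if _shape(getattr(self, name)) != shape:
                raise ValueError(f"{name} must have the same size as image A")
        for name in ("match_mask", "subject_mask"):
            if any(v not in (0, 1) for row in getattr(self, name) for v in row):
                raise ValueError(f"{name} must contain only 0 or 1")
        if self.focal_a <= 0 or self.focal_b <= 0:
            raise ValueError("focal lengths must be positive")


def _shape(grid) -> tuple[int, int]:
    """Return (height, width), rejecting ragged rows."""
    width = len(grid[0]) if grid else 0
    if any(len(row) != width for row in grid):
        raise ValueError("rows must all have the same length")
    return len(grid), width
```

Image B is deliberately not required to have the same size as image A, since the text does not state that the two images share a resolution.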

The embodiment of the present invention is described in detail below.

FIG. 1 shows the apparatus of this embodiment. Reference numeral 10 in FIG. 1 denotes the image processing apparatus that embodies the present invention. The image processing apparatus 10 contains an input terminal 11, a feature point matching unit 12, a shooting position calculation unit 13, a virtual shooting position calculation unit 14, a deformed image creation unit 15, a composite image creation unit 16, and an image display unit 17. The lines connecting the units in the figure show the main data flows. Each unit is described in detail later; only an outline of the processing is given here.

The data of (Input 1) to (Input 7) above are input through the input terminal 11.

The feature point matching unit 12 receives the pixel values A(x, y), the matching mask information M(x, y), and the pixel values B(x, y), and outputs the positions (xi, yi) and (ui, vi) of the paired feature points.

The shooting position calculation unit 13 receives the depth information Z(x, y), the focal length fA, the focal length fB, and the positions (xi, yi) and (ui, vi) of the paired feature points, and outputs the shooting position of image B relative to the shooting position of image A (R0, R1, R2, Tx, Ty, and Tz).

The virtual shooting position calculation unit 14 receives the focal length fA, the focal length fB, and the shooting position of image B (R0, R1, R2, Tx, Ty, and Tz), and outputs the shooting position corresponding to time t (R00(t), R10(t), R20(t), Tx0(t), Ty0(t), Tz0(t), and fB0(t)).

The deformed image creation unit 15 receives the pixel values A(x, y), the depth information Z(x, y), the focal length fA, and the shooting position corresponding to time t (R00(t), R10(t), R20(t), Tx0(t), Ty0(t), Tz0(t), and fB0(t)), and outputs image C(t). Here, t is an integer from 0 to 100.

The composite image creation unit 16 receives the pixel values A(x, y), the depth information Z(x, y), the subject mask information N(x, y), the focal length fA, the pixel values B(x, y), and the shooting position corresponding to time t = 100 (R00(t), R10(t), R20(t), Tx0(t), Ty0(t), Tz0(t), and fB0(t)), and outputs image D.

The image display unit 17 receives the images C(t) (t = 0 to 100) and image D, and displays them.

Before the processing flow is described, the equations relating image A and image B are explained. In a coordinate system whose origin is the optical center of the camera at the time image A was shot, the subject projected at an arbitrary position (x, y) of image A is located at the three-dimensional position given by Equation 1.

This follows from the fact that a ray arriving from the direction (x, y, fA) is exposed at position (x, y) of image A, and that the distance (depth) from the camera to the subject along the z axis is Z(x, y).
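Equation 1 itself is not reproduced in this text. From the geometry just described, the point lies on the ray (x, y, fA) scaled until its z component equals Z(x, y), which gives (x·Z(x, y)/fA, y·Z(x, y)/fA, Z(x, y)). The sketch below encodes that reading; the function name `backproject` is hypothetical, and it assumes (x, y) is measured from the image center in the same units as fA.

```python
def backproject(x: float, y: float, depth: float, focal_a: float) -> tuple[float, float, float]:
    """3D position, in camera-A coordinates, of the subject seen at pixel (x, y)."""
    if focal_a <= 0:
        raise ValueError("focal_a must be positive")
    # Stretch the ray (x, y, fA) until its z component equals the depth.
    scale = depth / focal_a
    return (x * scale, y * scale, depth)


print(backproject(128.0, -64.0, 8.0, 512.0))  # (2.0, -1.0, 8.0)
```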

Next, consider where in image B the subject located at the position of Equation 1 is projected. In general, the relationship between the shooting positions of two images is expressed by a three-dimensional rotation matrix and a three-dimensional translation vector. The elements of a rotation matrix are trigonometric functions, but in this embodiment image B is taken from almost the same position as image A, so the rotation matrix can be approximated by first-order (linear) terms. This gives Equation 2.

R0, R1, R2, Tx, Ty, and Tz in Equation 2 are real numbers with appropriate values. The subject projected at position (x, y) of image A is projected at position (u, v) of image B, where (u, v) is the position calculated by Equation 2. In other words, (u, v) is obtained by rotating the position given by Equation 1 with the rotation matrix and translating it with the translation vector.
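Equation 2 is not reproduced in this text, so its exact sign and axis conventions are unknown. The sketch below shows one common first-order form, in which (R0, R1, R2) are treated as small rotation angles about the x, y, and z axes and the rotation matrix is approximated as I + [r]×. That convention, the function name `project_to_b`, and the use of fB for the final perspective division are assumptions, not the patent's stated formula.

```python
Point3 = tuple[float, float, float]
Pose = tuple[float, float, float, float, float, float]  # (r0, r1, r2, tx, ty, tz)


def project_to_b(point: Point3, pose: Pose, focal_b: float) -> tuple[float, float]:
    """Project a 3D point given in camera-A coordinates into image B.

    The rotation is linearized as R ~ I + [r]x with r = (r0, r1, r2);
    this convention is an assumption made for illustration.
    """
    px, py, pz = point
    r0, r1, r2, tx, ty, tz = pose
    # First-order rotation followed by translation.
    qx = px - r2 * py + r1 * pz + tx
    qy = r2 * px + py - r0 * pz + ty
    qz = -r1 * px + r0 * py + pz + tz
    if qz <= 0:
        raise ValueError("point is not in front of camera B")
    # Perspective projection with the focal length of image B.
    return (focal_b * qx / qz, focal_b * qy / qz)


# With zero rotation and translation, the point projects as in image A.
print(project_to_b((2.0, -1.0, 8.0), (0.0, 0.0, 0.0, 0.0, 0.0, 0.0), 512.0))  # (128.0, -64.0)
```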

Rearranging Equation 2 gives Equation 3.

FIG. 2 shows the processing flow of this embodiment.

First, in step 1 of FIG. 2, the following inputs described above are input from the input terminal 11.
(Input 1) Image A, that is, the pixel value A(x, y) for each position (x, y).
(Input 2) Depth information for image A, that is, the depth value Z(x, y) for each position (x, y).
(Input 3) The focal length fA at which image A was shot.
(Input 4) Information about which of the subjects projected in image A still exist today, that is, the matching mask information M(x, y) for each position (x, y).
(Input 5) Information about whether each subject projected in image A is an important subject, that is, the subject mask information N(x, y) for each position (x, y).
(Input 6) Image B, that is, the pixel value B(x, y) for each position (x, y).
(Input 7) The focal length fB at which image B was shot.
The process then proceeds to step 2.

In step 2, feature points are detected in each of image A and image B. For image A, however, positions (x, y) where M(x, y) = 0 are excluded, and feature points are detected only at positions where M(x, y) = 1. The feature points obtained from image A are then matched with those obtained from image B; that is, for each feature point of image A, a feature point with the same (or a similar) feature is selected from the feature points of image B. This pairs up the feature points. Let the positions of the paired feature points be (xi, yi) and (ui, vi), where i is an index identifying the pair. For any i, the feature at position (xi, yi) of image A and the feature at position (ui, vi) of image B are the same (or similar), so the subject projected at (xi, yi) in image A and the subject projected at (ui, vi) in image B are considered to be the same. This processing is performed by the feature point matching unit 12.

Excluding positions (x, y) where M(x, y) = 0 in step 2 removes subjects that are projected in image A but no longer exist (that is, do not appear in image B), which suppresses erroneous pairings.
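The masked pairing of step 2 can be sketched as follows. This is a minimal illustration, not the matching method of the embodiment: the feature layout `((x, y), descriptor)`, the Euclidean descriptor distance, the `max_distance` threshold, and the use of integer pixel indices into the mask are all assumptions made for the example.

```python
from math import dist

def match_features(feats_a, feats_b, mask_m, max_distance=0.5):
    """Pair feature points of image A with feature points of image B.

    feats_a, feats_b: lists of ((x, y), descriptor) tuples (hypothetical layout).
    mask_m: 2D list with mask_m[y][x] == 1 where the subject still exists.
    Returns a list of ((xi, yi), (ui, vi)) pairs.
    """
    pairs = []
    for (x, y), desc_a in feats_a:
        # Step 2: skip positions of image A where M(x, y) = 0.
        if mask_m[y][x] != 1:
            continue
        # Pick the feature point of image B with the most similar descriptor.
        best = min(feats_b, key=lambda fb: dist(desc_a, fb[1]), default=None)
        if best is not None and dist(desc_a, best[1]) <= max_distance:
            pairs.append(((x, y), best[0]))
    return pairs

# Toy data: the feature at (1, 0) lies where M = 0 and is ignored.
mask = [[1, 0], [1, 1]]
feats_a = [((0, 0), (0.1, 0.9)), ((1, 0), (0.8, 0.2))]
feats_b = [((5, 6), (0.1, 0.8)), ((7, 8), (0.9, 0.2))]
pairs = match_features(feats_a, feats_b, mask)
```

Without the mask, the feature at (1, 0) would have been paired with the feature at (7, 8) of image B, which is the kind of erroneous pairing the mask is meant to prevent.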

ステップ２の後、ステップ３に進む。 After step 2, proceed to step 3.

In step 3, the shooting position of image B relative to the shooting position of image A is obtained. That is, R0, R1, R2, Tx, Ty, and Tz satisfying Equation 3 (described above) are obtained. Because the data contain errors in practice, the R0, R1, R2, Tx, Ty, and Tz that minimize Equation 4 are obtained instead.

ここで、ｉについての合算（Σ）は、ステップ２で求めたペアリングされた特徴点すべてについて行う。この処理は、撮影位置計算部１３で行われる。 Here, the summation (Σ) for i is performed for all the paired feature points obtained in step 2. This processing is performed by the photographing position calculation unit 13.
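The quantity minimized in step 3 can be illustrated with a reprojection cost. Equations 1, 3, and 4 are not reproduced in this part of the text, so the following is only a sketch under stated assumptions: a pinhole model in which position (x, y) with depth Z lies at (xZ/fA, yZ/fA, Z), the triple (R0, R1, R2) read as a rotation vector (axis times angle), the convention q = R(p - T), and a sum of squared position differences as the error. The function names are hypothetical.

```python
from math import cos, sin, sqrt

def rotate(r, p):
    """Rotate point p by rotation vector r (axis * angle) using Rodrigues' formula."""
    theta = sqrt(sum(c * c for c in r))
    if theta == 0.0:
        return tuple(p)
    k = tuple(c / theta for c in r)
    k_dot_p = sum(a * b for a, b in zip(k, p))
    k_cross_p = (k[1] * p[2] - k[2] * p[1],
                 k[2] * p[0] - k[0] * p[2],
                 k[0] * p[1] - k[1] * p[0])
    return tuple(p[i] * cos(theta) + k_cross_p[i] * sin(theta)
                 + k[i] * k_dot_p * (1 - cos(theta)) for i in range(3))

def project(x, y, z, f_a, r, t, f):
    """Map position (x, y) of image A with depth z to the view from pose (r, t)."""
    # Back-project into 3D in the frame of image A (assumed pinhole model).
    p = (x * z / f_a, y * z / f_a, z)
    # Express the point in the frame of the other camera (assumed convention).
    q = rotate(r, tuple(p[i] - t[i] for i in range(3)))
    return (f * q[0] / q[2], f * q[1] / q[2])

def reprojection_cost(pairs, depth, f_a, f_b, r, t):
    """Sum over all pairs of the squared distance between predicted and matched positions."""
    total = 0.0
    for (x, y), (u, v) in pairs:
        pu, pv = project(x, y, depth[(x, y)], f_a, r, t, f_b)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total

pairs = [((10.0, -5.0), (10.0, -5.0))]
depth = {(10.0, -5.0): 50.0}
cost_same_pose = reprojection_cost(pairs, depth, 100.0, 100.0, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
cost_shifted = reprojection_cost(pairs, depth, 100.0, 100.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

The six parameters that minimize such a cost could then be searched for with any general-purpose nonlinear least-squares routine; the search itself is not shown here.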

ステップ３の後、ステップ４に進む。 After step 3, proceed to step 4.

ステップ４以降では、画像Ａを時間とともに変形させていった画像を作成する。そのために、ステップ４で、時間を表すパラメータとして、ｔに０をセットする。そして、ステップ５に進む。 In step 4 and subsequent steps, an image obtained by deforming the image A with time is created. Therefore, in step 4, 0 is set to t as a parameter representing time. Then, the process proceeds to Step 5.

In step 5, the shooting position corresponding to time t is calculated. That is, the virtual shooting position calculation unit 14 evaluates Equation 5.

At time t, the image is assumed to be taken from the position reached, relative to the shooting position of image A, by the rotation represented by R00(t), R10(t), R20(t) and the translation represented by Tx0(t), Ty0(t), Tz0(t). The focal length is fB0(t). As Equation 5 shows, at t = 0 this coincides with the shooting position of image A, and at t = 100 it coincides with the shooting position of image B.
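One way to obtain a pose with the endpoint behaviour described above is linear interpolation. Equation 5 itself is not reproduced in this part of the text, so the linear form, the treatment of (R0, R1, R2) as a rotation vector, and the focal length running from fA to fB are assumptions of this sketch; `virtual_pose` is a hypothetical name.

```python
def virtual_pose(t, r, trans, f_a, f_b, t_end=100):
    """Return (rotation, translation, focal length) for time t in [0, t_end]."""
    s = t / t_end  # 0 at the pose of image A, 1 at the pose of image B
    rot_t = tuple(s * c for c in r)        # R00(t), R10(t), R20(t)
    trans_t = tuple(s * c for c in trans)  # Tx0(t), Ty0(t), Tz0(t)
    f_t = (1 - s) * f_a + s * f_b          # fB0(t)
    return rot_t, trans_t, f_t

start = virtual_pose(0, (0.1, 0.2, 0.3), (1.0, 2.0, 3.0), 35.0, 50.0)
end = virtual_pose(100, (0.1, 0.2, 0.3), (1.0, 2.0, 3.0), 35.0, 50.0)
```

Scaling a rotation vector keeps the rotation axis and scales the angle, so the intermediate poses turn steadily from the orientation of image A toward that of image B.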

ステップ５の後、ステップ６に進む。 After step 5, proceed to step 6.

ステップ６で、時刻ｔに対応する画像Ｃ（ｔ）を、画像Ａを変形することで生成する。つまり、画像Ａの各位置（ｘ，ｙ）について、式６を計算し、位置（ｕ（ｘ，ｙ，ｔ），ｖ（ｘ，ｙ，ｔ））を求める。 In step 6, an image C (t) corresponding to time t is generated by transforming the image A. That is, for each position (x, y) in the image A, Equation 6 is calculated to obtain the position (u (x, y, t), v (x, y, t)).

次に、新しいキャンバスを用意する。そして、画像Ａの各位置（ｘ，ｙ）の画素値Ａ（ｘ，ｙ）を、このキャンバス上の位置（ｕ（ｘ，ｙ，ｔ），ｖ（ｘ，ｙ，ｔ））に書き込む。画像Ａのすべての位置について書き込んだ後、キャンバス上にできた画像を、画像Ｃ（ｔ）とする。この処理は、変形画像作成部１５で行われる。 Next, prepare a new canvas. Then, the pixel value A (x, y) at each position (x, y) of the image A is written to the position (u (x, y, t), v (x, y, t)) on the canvas. After writing all the positions of the image A, an image formed on the canvas is defined as an image C (t). This process is performed by the deformed image creation unit 15.
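The canvas-writing procedure just described can be sketched as a forward warp. The mapping from (x, y) to (u, v) is passed in as a function because Equation 6 is not reproduced here; rounding to the nearest pixel, skipping positions that fall outside the canvas, and the `background` fill value are assumptions of this sketch.

```python
def forward_warp(image_a, mapping, background=None):
    """Write every pixel A(x, y) to the canvas position given by mapping(x, y)."""
    height, width = len(image_a), len(image_a[0])
    canvas = [[background] * width for _ in range(height)]  # new canvas
    for y in range(height):
        for x in range(width):
            u, v = mapping(x, y)
            ui, vi = round(u), round(v)  # nearest canvas pixel (assumption)
            if 0 <= ui < width and 0 <= vi < height:
                canvas[vi][ui] = image_a[y][x]
    return canvas

image_a = [[1, 2], [3, 4]]
unchanged = forward_warp(image_a, lambda x, y: (x, y))
shifted = forward_warp(image_a, lambda x, y: (x + 1, y))
```

With a forward mapping of this kind, canvas positions that no source pixel maps to keep the background value; how such gaps are handled is outside the scope of this sketch.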

As can be seen from Equations 5 and 6, at t = 0 image C(t) is image A unchanged, and at t = 100 it is the image that would have been obtained by shooting from the same position as image B. Also, as Equation 5 shows, as t increases continuously, the shooting position corresponding to time t (the position determined by R00(t), R10(t), R20(t), Tx0(t), Ty0(t), Tz0(t), together with the focal length fB0(t)) changes continuously.

ステップ６の後、ステップ７に進む。 After step 6, proceed to step 7.

In step 7, it is determined whether t equals 100. If it does not, image C(t) for the next time t still has to be created, so the process proceeds to step 8. If it does, all images C(t) have been created, so the process proceeds to step 9.

ステップ８で、次の時刻の処理をするために、ｔを１だけインクリメントする。そして、ステップ５に戻る。 In step 8, t is incremented by 1 for processing at the next time. Then, the process returns to step 5.

In step 9, the final image D is created. First, a new canvas is prepared and image B is copied onto it; that is, the pixel value B(x, y) at each position (x, y) of image B is written to position (x, y) on the canvas. Then, for each position (x, y) of image A where N(x, y) = 1, the pixel value A(x, y) is written over the canvas at position (u(x, y, 100), v(x, y, 100)). Here (u(x, y, 100), v(x, y, 100)) is the value given by Equation 6 with t = 100. The image formed on the canvas is image D. This process is performed by the composite image creation unit 16.

このようにして作成された画像Ｄについて、以下で説明する。画素値Ａ（ｘ，ｙ）は、キャンバス上の位置（ｕ（ｘ，ｙ，１００），ｖ（ｘ，ｙ，１００））に書き込まれる。書き込まれた画像は、時刻ｔ＝１００における撮影位置から撮影したと仮定した場合の画像Ａに投影されている被写体の画像である。時刻ｔ＝１００における撮影位置は、画像Ｂと同じ撮影位置であるから、既にキャンバスにコピーされている画像Ｂとは位置ずれのない被写体像である。そして、Ｎ（ｘ，ｙ）＝１である位置のみが書き込まれているので、重要な被写体のみが書き込まれている。したがって、画像Ｄは、画像Ｂの中で、重要な被写体部分が、画像Ａに投影されている被写体像により置き換えられた画像となる。 The image D created in this way will be described below. The pixel value A (x, y) is written at a position (u (x, y, 100), v (x, y, 100)) on the canvas. The written image is an image of the subject projected on the image A when it is assumed that the image is taken from the photographing position at time t = 100. Since the shooting position at time t = 100 is the same shooting position as that of the image B, the subject image has no positional deviation from the image B that has already been copied to the canvas. Since only the position where N (x, y) = 1 is written, only the important subject is written. Therefore, the image D is an image in which an important subject portion in the image B is replaced by the subject image projected on the image A.
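The construction of image D can be sketched as follows. As before, the mapping for t = 100 is passed in as a function, and nearest-pixel rounding and bounds checking are assumptions; `composite` is a hypothetical name.

```python
def composite(image_a, image_b, mask_n, mapping):
    """Copy image B, then overwrite it with the important pixels of image A."""
    height, width = len(image_b), len(image_b[0])
    canvas = [row[:] for row in image_b]  # copy image B onto a new canvas
    for y in range(len(image_a)):
        for x in range(len(image_a[0])):
            if mask_n[y][x] != 1:
                continue  # only positions with N(x, y) = 1 are written
            u, v = mapping(x, y)  # stands in for (u(x, y, 100), v(x, y, 100))
            ui, vi = round(u), round(v)
            if 0 <= ui < width and 0 <= vi < height:
                canvas[vi][ui] = image_a[y][x]
    return canvas

image_a = [["a0", "a1"], ["a2", "a3"]]
image_b = [["b0", "b1"], ["b2", "b3"]]
mask_n = [[1, 0], [0, 1]]
image_d = composite(image_a, image_b, mask_n, lambda x, y: (x, y))
```

In this toy case the two positions marked important in image A replace the corresponding pixels of image B, and every other pixel of image B is kept.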

ステップ９の後、ステップ１０に進む。 After step 9, proceed to step 10.

ステップ１０で、作成された画像を表示していく。すなわち、画像表示部１７にて、画像Ｃ（０）から画像Ｃ（１００）までを順番に表示した後、最後に画像Ｄを表示する。これにより、画像Ａ（歴史的な建造物の過去の写真）がゆっくりと変形していき、最後に、「過去の歴史的建造物」が現在の風景と融合した情景を表現できる。このようにして、時代の流れを感じることのできる写真を表現することが可能となる。 In step 10, the created image is displayed. That is, the image display unit 17 displays the image C (0) to the image C (100) in order, and finally displays the image D. As a result, the image A (a past photograph of a historical building) is slowly deformed, and finally, a scene where the “historic building of the past” is merged with the current landscape can be expressed. In this way, it is possible to express photographs that can feel the flow of the times.

以上で、処理を終える。もちろん、画像Ｃ（ｔ）（ｔ＝０〜１００）と画像Ｄを、図示省略したデータ保存部に保存して、後で、再度、鑑賞できるようにしても良い。 Now, the process is finished. Of course, the image C (t) (t = 0 to 100) and the image D may be stored in a data storage unit (not shown) so that they can be viewed again later.
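The control flow of steps 4 to 10 can be summarized in a short driver. The callables `make_c` and `make_d` are hypothetical stand-ins for the deformed image creation unit 15 and the composite image creation unit 16; the sketch shows only the order in which frames are produced.

```python
def build_sequence(make_c, make_d, t_end=100):
    """Return the frames in display order: C(0), ..., C(t_end), then D."""
    frames = []
    t = 0                         # step 4: start at t = 0
    while True:
        frames.append(make_c(t))  # steps 5 and 6: pose for t, then image C(t)
        if t == t_end:            # step 7: stop once t reaches 100
            break
        t += 1                    # step 8: advance to the next time
    frames.append(make_d())       # step 9: composite image D
    return frames                 # step 10 displays the frames in this order

frames = build_sequence(lambda t: f"C({t})", lambda: "D")
```

The resulting list can be shown once or stored and replayed later, as noted above.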

To aid understanding of the present invention, the processing described above is explained again with reference to the drawings.

図３は、画像Ａを撮影したときの状況の一例を示している。当時の歴史的建造物２０Ａと、現在は存在しない当時の歴史的建造物２１Ａと、現在は存在しない樹木２２Ａが図のように配置されている。なお説明の都合上、２０Ａには、図中に示すように、４つの白色の窓と１つの白色のドアがあるとする。また、図を簡略化するため、物体は、四角柱や円錐などで表現している。座標系３０Ａから見てスクリーン３１Ａに投影された画像が、画像Ａであるとする。なお、ここで、座標系３０Ａの原点からスクリーン３１Ａまでの距離は、ｆＡである。 FIG. 3 shows an example of a situation when the image A is taken. A historic building 20A at that time, a historic building 21A at that time that does not currently exist, and a tree 22A that does not currently exist are arranged as shown in the figure. For convenience of explanation, it is assumed that 20A has four white windows and one white door as shown in the figure. In order to simplify the drawing, the object is represented by a quadrangular prism or a cone. It is assumed that the image projected onto the screen 31A as viewed from the coordinate system 30A is the image A. Here, the distance from the origin of the coordinate system 30A to the screen 31A is fA.

図３に示す状況で撮影された画像Ａを、図４に示す。図４に示すように、当時の歴史的建造物２０Ａと、現在は存在しない当時の歴史的建造物２１Ａと、現在は存在しない樹木２２Ａのそれぞれの投影像が画像Ａに写っている。 An image A photographed in the situation shown in FIG. 3 is shown in FIG. As shown in FIG. 4, projected images of a historical building 20A at that time, a historical building 21A at that time that does not currently exist, and a tree 22A that does not currently exist are shown in image A.

Before the processing of the embodiment of the present invention is executed, depth information Z(x, y) is created as described above, for example by the technique described in D. Hoiem, A.A. Efros, and M. Hebert, "Automatic Photo Pop-up", ACM SIGGRAPH 2005. Each position (x, y) of image A is the position (x, y, fA) as viewed from coordinate system 30A. The subject projected at position (x, y) is therefore actually located at the position given by Equation 1 (described above) as viewed from coordinate system 30A. This is illustrated in FIG. 3.

In FIG. 3, 21A and 22A are assumed to no longer exist, while 20A is assumed to still exist. The information on which of the subjects projected in image A still exist, that is, the matching mask information M(x, y) for each position (x, y), is therefore as shown in FIG. 5. In FIG. 5, the hatched region is where M(x, y) = 1, and M(x, y) = 0 elsewhere.

Further, 20A and 21A are assumed to be historically significant, and 22A is assumed not to be. The information on which of the subjects projected in image A are important, that is, the subject mask information N(x, y) for each position (x, y), is therefore as shown in FIG. 6. In FIG. 6, the hatched region is where N(x, y) = 1, and N(x, y) = 0 elsewhere.

FIG. 7 shows an example of the situation when image B is captured. The present-day historic building 20B, a person 23B, and a building 24B that did not exist at the time but has since been built are arranged as shown. The present-day historic building 20B is the same building as the historic building 20A of FIG. 3, but its windows and door have turned black with age. As stated above, 21A and 22A do not exist in FIG. 7 (the present). The image projected onto the screen 31B as viewed from the coordinate system 30B of FIG. 7 is image B. The distance from the origin of the coordinate system 30B to the screen 31B is fB.

Coordinate system 30A and coordinate system 30B are at almost the same position, but they are drawn a considerable distance apart to make the explanation easier to follow.

図７に示す状況で撮影される画像Ｂを、図８に示す。図８に示すように、現在の歴史的建造物２０Ｂと、人２３Ｂと、当時はなかったが現在は建てられている建造物２４Ｂのそれぞれの投影像が画像Ｂに写っている。 An image B shot in the situation shown in FIG. 7 is shown in FIG. As shown in FIG. 8, a projected image of each of the current historic building 20B, the person 23B, and the building 24B that was not built at the time but is now built is shown in the image B.

図４に示す画像Ａ（ただし、図５に示すＭ（ｘ，ｙ）が１の領域のみ）と、図８に示す画像Ｂとのマッチングが行われる（図２のステップ２およびステップ３）。これにより、座標系３０Ａと、座標系３０Ｂの位置関係Ｒ０、Ｒ１、Ｒ２、Ｔｘ、Ｔｙ、および、Ｔｚが求まる。図３に、この位置関係を図示してある。すなわち、座標系３０Ａに対して、Ｒ０、Ｒ１、Ｒ２で表現される回転と、Ｔｘ、Ｔｙ、Ｔｚで表現される平行移動を行った位置が座標系３０Ｂである。ステップ２およびステップ３で求めた座標系３０Ｂ（図３）は、もちろん、図７に示した座標系３０Ｂと同一となる。 The image A shown in FIG. 4 (however, only the region where M (x, y) shown in FIG. 5 is 1) and the image B shown in FIG. 8 are matched (steps 2 and 3 in FIG. 2). As a result, the positional relationships R0, R1, R2, Tx, Ty, and Tz between the coordinate system 30A and the coordinate system 30B are obtained. FIG. 3 illustrates this positional relationship. That is, the coordinate system 30B is a position where the rotation represented by R0, R1, and R2 and the parallel movement represented by Tx, Ty, and Tz are performed on the coordinate system 30A. Of course, the coordinate system 30B (FIG. 3) obtained in step 2 and step 3 is the same as the coordinate system 30B shown in FIG.

Once the positional relationship between coordinate system 30A and coordinate system 30B has been obtained, a coordinate system that changes smoothly over time from coordinate system 30A to coordinate system 30B is obtained using the time parameter t. This coordinate system is given by R00(t), R10(t), R20(t), Tx0(t), Ty0(t), and Tz0(t), obtained in step 5 of FIG. 2. In FIG. 3, the position reached from coordinate system 30A by the rotation represented by R00(t), R10(t), R20(t) and the translation represented by Tx0(t), Ty0(t), Tz0(t) is drawn as coordinate system 30C. In FIG. 3, the image projected onto the screen 31C as viewed from coordinate system 30C is image C(t) (step 6 of FIG. 2). The distance from the origin of coordinate system 30C to the screen 31C is fB0(t).

図３に示す状況で、仮想的な位置から撮影された画像（座標系３０Ｃから見てスクリーン３１Ｃに投影された画像）Ｃ（ｔ）を、図９に示す。図９に示すように、画像Ｃ（ｔ）には、画像Ａと比べて、各投影像は変形された投影像となって写っている。 FIG. 9 shows an image (image projected onto the screen 31C as viewed from the coordinate system 30C) C (t) taken from a virtual position in the situation shown in FIG. As shown in FIG. 9, in the image C (t), each projected image is shown as a modified projected image as compared with the image A.

FIG. 10 shows image C(t) at t = 100. As in FIG. 9, each projected image in image C(100) is deformed compared with image A. Because the shooting position at time t = 100 coincides with coordinate system 30B, note that the "projected image of 20B" in image B (FIG. 8) and the "deformed projected image of 20A" in image C(100) (FIG. 10) coincide in position.

Next, the image D generated in this situation is described. As can be seen from the description of step 9 above, image D is created by replacing part of image B with image A (more precisely, with image C(100), the deformed image A). The part replaced is the part of image A where N(x, y) = 1 (see FIG. 6), that is, the "deformed projected image of 20A" and the "deformed projected image of 21A" in image C(100) shown in FIG. 10. FIG. 11 shows this: the data obtained by deforming image A only at positions where N(x, y) = 1, which is used to create image D.

Image D is obtained by overwriting image B (FIG. 8) with the image shown in FIG. 11, and is therefore the image shown in FIG. 12. The point to note in image D of FIG. 12 is that, compared with image B (FIG. 8), the projected image of 20B (the present-day historic building 20B) has been replaced by the projected image of 20A (more precisely, the deformed projected image of 20A). In addition, the projected image of the historic building 21A, which no longer exists, also appears. In image D, the windows and door of the historic building are white rather than black, so the building is rendered in the state of the historic building 20A at that time. Image D thus shows 20A and 21A together with the present-day person 23B and the building 24B, which did not exist at the time but has since been built.

かくして、画像Ｃ（０）から画像Ｃ（１００）までを順番に表示した後、最後に画像Ｄを表示することで、画像Ａ（歴史的な建造物の過去の写真）がゆっくりと変形していき、最後に、「過去の歴史的建造物」が現在の風景と融合した情景を表現できる。このようにして、時代の流れを感じることのできる写真を表現することが可能となる。 Thus, after displaying image C (0) to image C (100) in order, and finally displaying image D, image A (a past photo of a historic building) is slowly deformed. Finally, you can express a scene where “historic buildings of the past” merged with the current landscape. In this way, it is possible to express photographs that can feel the flow of the times.

<< Other Embodiments of the Present Invention >>
In the embodiment described above, processing is performed by the dedicated image processing apparatus 10. Alternatively, the images can be sent to a computer device, and image generation and display can be performed within that computer device. Although a detailed description is omitted, the computer device can create and display the images C(t) and image D by the same processing as in the image processing apparatus 10 described above.

図１３は、画像Ｃ（ｔ）および画像Ｄを作成し、それを表示するコンピュータ装置４０の構成例を示している。 FIG. 13 shows a configuration example of the computer apparatus 40 that creates the image C (t) and the image D and displays them.

このコンピュータ装置４０は、ＣＰＵ（Central Processing Unit）４０１１、メモリ４０１２、ディスプレイコントローラ４０１３、入力機器インターフェース４０１４、ネットワークインターフェース４０１５を有している。また、このコンピュータ装置４０は、外部機器インターフェース４０１６およびデジタルカメラインターフェース４０１８を有している。これらは、バス４０１７に接続されている。 The computer apparatus 40 includes a CPU (Central Processing Unit) 4011, a memory 4012, a display controller 4013, an input device interface 4014, and a network interface 4015. Further, the computer device 40 includes an external device interface 4016 and a digital camera interface 4018. These are connected to the bus 4017.

ディスプレイ４０１９は、ディスプレイコントローラ４０１３を介してバス４０１７に接続されている。キーボード（ＫＢＤ）４０２０およびマウス４０２１は、入力機器インターフェース４０１４を介してバス４０１７に接続されている。また、ハードディスクドライブ（ＨＤＤ）４０２２およびメディアドライブ４０２３は、外部機器インターフェース４０１６を介してバス４０１７に接続されている。デジタルカメラは、デジタルカメラインターフェース４０１８を介してバス４０１７に接続されている。また、コンピュータ装置４０は、ネットワークインターフェース４０１５を介して、インターネット等のネットワークに接続される。 The display 4019 is connected to the bus 4017 via the display controller 4013. A keyboard (KBD) 4020 and a mouse 4021 are connected to the bus 4017 via the input device interface 4014. Further, the hard disk drive (HDD) 4022 and the media drive 4023 are connected to the bus 4017 via the external device interface 4016. The digital camera is connected to the bus 4017 via the digital camera interface 4018. The computer device 40 is connected to a network such as the Internet via a network interface 4015.

ＣＰＵ４０１１は、コンピュータ装置４０のメイン・コントローラである。このＣＰＵ４０１１は、オペレーティング・システム（ＯＳ）の制御下で、各種のアプリケーションを実行する。ＣＰＵ４０１１は、例えば、デジタルカメラから一度ハードディスクドライブ４０２２にダウンロードされた画像（たとえば、上述の画像Ｂ）を処理するアプリケーション・プログラムを実行することができる。アプリケーション・プログラムの中には、図２に示した処理を行うプログラムがインプリメントされている。ＣＰＵ４０１１は、バス４０１７によって他の機器類と相互接続されている。 The CPU 4011 is a main controller of the computer device 40. The CPU 4011 executes various applications under the control of an operating system (OS). The CPU 4011 can execute an application program that processes an image (for example, the above-described image B) once downloaded from the digital camera to the hard disk drive 4022, for example. A program for performing the processing shown in FIG. 2 is implemented in the application program. The CPU 4011 is interconnected with other devices by a bus 4017.

メモリ４０１２は、ＣＰＵ４０１１において実行されるプログラム・コードを格納し、また実行中の作業データを一時保管するために使用される記憶装置である。メモリ４０１２は、例えば、ＲＯＭなどの不揮発性メモリおよびＤＲＡＭなどの揮発性メモリの双方を含む構成とされている。 A memory 4012 is a storage device used for storing program codes executed by the CPU 4011 and temporarily storing work data being executed. The memory 4012 includes, for example, a nonvolatile memory such as a ROM and a volatile memory such as a DRAM.

ディスプレイコントローラ４０１３は、ＣＰＵ４０１１が発行する描画命令を実際に処理するための専用コントローラである。ディスプレイコントローラ４０１３において処理された描画データは、例えばフレーム・バッファ（図示しない）に一旦書き込まれた後、ディスプレイ４０１９によって画面出力される。例えば、ハードディスクドライブ４０２２から読み出された画像はディスプレイ４０１９で画面表示されて、ユーザは見て楽しむことができる。 A display controller 4013 is a dedicated controller for actually processing a drawing command issued by the CPU 4011. The drawing data processed in the display controller 4013 is once written in a frame buffer (not shown), for example, and then output on the screen by the display 4019. For example, an image read from the hard disk drive 4022 is displayed on the screen of the display 4019 so that the user can see and enjoy it.

入力機器インターフェース４０１４は、キーボード４０２０やマウス４０２１などのユーザ入力機器をコンピュータ装置４０に接続するための装置である。ユーザは、キーボード４０２０やマウス４０２１を介して、図２に示した処理を行うプログラムを実行するためのコマンドなどを入力することができる。 The input device interface 4014 is a device for connecting user input devices such as a keyboard 4020 and a mouse 4021 to the computer device 40. The user can input a command for executing a program for performing the processing shown in FIG. 2 via the keyboard 4020 and the mouse 4021.

ネットワークインターフェース４０１５は、Ｅｔｈｅｒｎｅｔ（登録商標）などの所定の通信プロトコルに従って、コンピュータ装置４０を、ＬＡＮ（Local Area Network）などの局所的ネットワーク、さらにはインターネットのような広域ネットワークに接続する。 The network interface 4015 connects the computer device 40 to a local network such as a LAN (Local Area Network) and further to a wide area network such as the Internet according to a predetermined communication protocol such as Ethernet (registered trademark).

ネットワーク上では、複数のホスト端末やサーバー（図示しない）がトランスペアレントな状態で接続され、分散コンピューティング環境が構築されている。ネットワーク上では、ソフトウェア・プログラムやデータ・コンテンツなどの配信サービスを行うことができる。例えば、上述の画像Ａが保存されている他のサーバーから画像データを、ネットワーク経由でハードディスクドライブ４０２２へダウンロードすることができる。 On the network, a plurality of host terminals and servers (not shown) are connected in a transparent state, and a distributed computing environment is constructed. On the network, distribution services such as software programs and data contents can be provided. For example, image data can be downloaded to the hard disk drive 4022 from another server in which the image A is stored.

デジタルカメラインターフェース４０１８は、デジタルカメラから供給される画像を取り込むための装置である。外部機器インターフェース４０１６は、ハードディスクドライブ４０２２、メディアドライブ４０２３などの外部装置を、コンピュータ装置４０に接続するための装置である。 The digital camera interface 4018 is a device for capturing an image supplied from the digital camera. The external device interface 4016 is a device for connecting external devices such as the hard disk drive 4022 and the media drive 4023 to the computer device 40.

ハードディスクドライブ４０２２は、従来周知のように、記憶担体としての磁気ディスクを固定的に搭載した外部記憶装置であり、記憶容量やデータ転送速度などの点で他の外部記憶装置よりも優れている。また、このハードディスクドライブ４０２２は、ランダムアクセスも可能である。ソフトウェア・プログラムを実行可能な状態でハードディスクドライブ４０２２上に置くことをプログラムのシステムへの「インストール」と呼ぶ。通常、ハードディスクドライブ４０２２には、ＣＰＵ４０１１が実行すべきオペレーティング・システムのプログラム・コードや、アプリケーション・プログラム、デバイス・ドライバなどが不揮発的に格納されている。例えば、図２に示した処理を行うプログラムを、ハードディスクドライブ４０２２上にインストールすることができる。 As conventionally known, the hard disk drive 4022 is an external storage device in which a magnetic disk as a storage carrier is fixedly mounted, and is superior to other external storage devices in terms of storage capacity and data transfer speed. The hard disk drive 4022 can also be randomly accessed. Placing the software program on the hard disk drive 4022 in an executable state is called “installation” of the program in the system. Usually, the hard disk drive 4022 stores program codes of an operating system to be executed by the CPU 4011, application programs, device drivers, and the like in a nonvolatile manner. For example, a program for performing the processing shown in FIG. 2 can be installed on the hard disk drive 4022.

メディアドライブ４０２３は、ＣＤ（Compact Disc）、ＭＯ（Magneto-Optical disc）、ＤＶＤ（Digital Versatile Disc）などの可搬型メディアを装填して、そのデータ記録面にアクセスするための装置である。可搬型メディアは、主として、ソフトウェア・プログラムやデータ・ファイルなどをコンピュータ可読形式のデータとしてバックアップすることや、これらをシステム間で移動（すなわち販売・流通・配布を含む）する目的で使用される。画像処理を行うためのアプリケーション・プログラムを、これら可搬型メディアを利用して複数の機器間で物理的に流通・配布することができる。 The media drive 4023 is a device for loading portable media such as CD (Compact Disc), MO (Magneto-Optical disc), DVD (Digital Versatile Disc), etc. and accessing the data recording surface. The portable media is mainly used for the purpose of backing up software programs, data files, and the like as data in a computer-readable format, and for moving them between systems (that is, including sales, distribution, and distribution). Application programs for performing image processing can be physically distributed and distributed among a plurality of devices by using these portable media.

Embodiments of the present invention have been described in detail above, but the specific configuration is not limited to these embodiments and includes design changes and the like that do not depart from the gist of the present invention.

Furthermore, the present technology may also be configured as follows.

[1]
An image processing apparatus that generates a third image using a first image and a second image, comprising:
a shooting position calculation unit that obtains, by matching the first image and the second image, the positional relationship between a first shooting position at which the first image was taken and a second shooting position at which the second image was taken;
a virtual position determination unit that determines a position (referred to as a third shooting position) between the first shooting position and the second shooting position; and
an image generation unit that generates, as the third image, the image that would be obtained by shooting from the third shooting position,
wherein the virtual position determination unit moves the third shooting position over time along a path that starts at one of the first shooting position and the second shooting position and ends at the other, and
the image generation unit generates the third image by applying to the first image a deformation corresponding to the positional relationship of the third shooting position relative to the first shooting position.
[2]
The image processing apparatus according to [1], wherein the first image further has depth information for each pixel position, and the shooting position calculation unit obtains the positional relationship between the first shooting position and the second shooting position using the depth information.
[3]
The image processing apparatus according to [1] or [2], wherein the first image further has mask information for each pixel position, and the shooting position calculation unit performs the matching using the information of the first image only at positions that have the mask information, without using the information of the first image at positions that do not.
[4]
An image processing method for generating a third image using a first image and a second image, comprising:
a shooting position calculation step of obtaining, by matching the first image and the second image, the positional relationship between a first shooting position at which the first image was taken and a second shooting position at which the second image was taken;
a virtual position determination step of determining a position (referred to as a third shooting position) between the first shooting position and the second shooting position; and
an image generation step of generating, as the third image, the image that would be obtained by shooting from the third shooting position,
wherein, in the virtual position determination step, the third shooting position is moved over time along a path that starts at one of the first shooting position and the second shooting position and ends at the other, and
in the image generation step, the third image is generated by applying to the first image a deformation corresponding to the positional relationship of the third shooting position relative to the first shooting position.
[5]
An image processing program for generating a third image using a first image and a second image, the program causing a computer to execute:
a shooting position calculation step of obtaining, by matching the first image and the second image, the positional relationship between a first shooting position at which the first image was taken and a second shooting position at which the second image was taken;
a virtual position determination step of determining a position (referred to as a third shooting position) between the first shooting position and the second shooting position; and
an image generation step of generating, as the third image, the image that would be obtained by shooting from the third shooting position,
wherein, in the virtual position determination step, the third shooting position is moved over time along a path that starts at one of the first shooting position and the second shooting position and ends at the other, and
in the image generation step, the third image is generated by applying to the first image a deformation corresponding to the positional relationship of the third shooting position relative to the first shooting position.

DESCRIPTION OF SYMBOLS
10 ... Image processing apparatus
11 ... Input terminal
12 ... Feature point matching processing part
13 ... Shooting position calculation part
14 ... Virtual shooting position calculation part
15 ... Deformation image creation part
16 ... Composite image creation part
17 ... Image display part
20A, 21A, 22A, 20B, 23B, 24B ... Subjects
30A, 30B, 30C ... Coordinate systems in three-dimensional space
31A, 31B, 31C ... Screens
40 ... Computer device

Claims

An image processing apparatus that generates a third image using a first image and a second image, the apparatus comprising:
a shooting position calculation unit that obtains, by matching the first image and the second image, the positional relationship between a first shooting position at which the first image was captured and a second shooting position at which the second image was captured;
a virtual position determination unit that determines a position (referred to as a third shooting position) between the first shooting position and the second shooting position; and
an image generation unit that generates, as the third image, the image that would be obtained if captured from the third shooting position,
wherein the virtual position determination unit moves the third shooting position with time along a path that starts at one of the first shooting position and the second shooting position and ends at the other, and
the image generation unit generates the third image by deforming the first image according to the positional relationship of the third shooting position with respect to the first shooting position.

The image processing apparatus according to claim 1, wherein the first image further has depth information for each pixel position, and
the shooting position calculation unit obtains the positional relationship between the first shooting position and the second shooting position using the depth information.

The image processing apparatus according to claim 1 or 2, wherein the first image further has mask information for each pixel position, and
the shooting position calculation unit performs the matching using the information of the first image only at positions where the mask information exists, and does not use the information of the first image at positions where the mask information does not exist.

An image processing method for generating a third image using a first image and a second image, the method comprising:
a shooting position calculation step of obtaining, by matching the first image and the second image, the positional relationship between a first shooting position at which the first image was captured and a second shooting position at which the second image was captured;
a virtual position determination step of determining a position (referred to as a third shooting position) between the first shooting position and the second shooting position; and
an image generation step of generating, as the third image, the image that would be obtained if captured from the third shooting position,
wherein, in the virtual position determination step, the third shooting position moves with time along a path that starts at one of the first shooting position and the second shooting position and ends at the other, and
in the image generation step, the third image is generated by deforming the first image according to the positional relationship of the third shooting position with respect to the first shooting position.

An image processing program for generating a third image using a first image and a second image, the program causing a computer to execute:
a shooting position calculation step of obtaining, by matching the first image and the second image, the positional relationship between a first shooting position at which the first image was captured and a second shooting position at which the second image was captured;
a virtual position determination step of determining a position (referred to as a third shooting position) between the first shooting position and the second shooting position; and
an image generation step of generating, as the third image, the image that would be obtained if captured from the third shooting position,
wherein, in the virtual position determination step, the third shooting position moves with time along a path that starts at one of the first shooting position and the second shooting position and ends at the other, and
in the image generation step, the third image is generated by deforming the first image according to the positional relationship of the third shooting position with respect to the first shooting position.