JP5925007B2 - Image processing apparatus, image processing method, and program

Info

Publication number: JP5925007B2
Application number: JP2012070109A
Authority: JP
Inventors: 邦洋長谷川
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2012-03-26
Filing date: 2012-03-26
Publication date: 2016-05-25
Anticipated expiration: 2032-03-26
Also published as: JP2013200824A

Description

The present invention relates to an image processing apparatus, an image processing method, and a program, and more particularly to a technique suitable for arranging content in a layout.

Conventionally, techniques are known for automatically laying out an electronic document in consideration of metadata, such as position information, attached to content. As one specific example, a technique is known for displaying a still image or a moving image at the corresponding point on three-dimensional data, based on the shooting position, azimuth, and angle acquired at the time of shooting (see, for example, Patent Document 1). As another example, a technique is known for arranging an object that provides information to an observer of a virtual space at a position and orientation that is suitable as seen from the observer's viewpoint, based on information on the position and orientation of a pointing tool held by the observer (see, for example, Patent Document 2). A technique is also known that refers to data indicating the real-world vertical (up-down) positional relationship among a plurality of elements included in a captured image and determines the orientation of the image so that the arrangement best reflects that vertical relationship (see, for example, Patent Document 3). Furthermore, a technique for detecting the overlapping portion between two images and compositing the images is also known (see, for example, Patent Document 4).

Patent Document 1: JP-A-11-122638
Patent Document 2: JP 2006-252468 A
Patent Document 3: Japanese Patent No. 4064186
Patent Document 4: Japanese Patent No. 3832894
Patent Document 5: JP 2010-102501 A

Non-Patent Document 1: C. Harris and M.J. Stephens, "A combined corner and edge detector," In Alvey Vision Conference, pages 147-152, 1988.
Non-Patent Document 2: David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
Non-Patent Document 3: M.A. Fischler and R.C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, no. 6, pp. 381-395, June 1981.

An important point in creating a layout is that it should not give the viewer a sense of incongruity. For example, if a layout is created such that the positional relationship between the subjects in the photographs is the reverse of that in the real world, it feels wrong to a person who knows the real scene. A person who does not know the real scene, in turn, cannot fully grasp the positional relationship of the subjects.

With the method described in Patent Document 1, arranging content on a two-dimensional plane such as a paper album requires projecting the three-dimensional data, in which case the contents may overlap one another. The method described in Patent Document 2 requires a pointing tool and restricts the origin to the viewpoint, so it cannot be applied to creating an album-style layout from images captured with other cameras.

The method described in Patent Document 3 cannot create an appropriate layout when data indicating the positional relationship of the subjects is not held. The method described in Patent Document 4 is not a technique for creating a layout, and therefore cannot be applied when the size of the image is restricted.

In view of the problems described above, an object of the present invention is to make it possible to easily determine content positions that are easy for the viewer to see, without requiring any special device or data.

An image processing apparatus according to the present invention comprises: extraction means for extracting features from each of a plurality of contents in which a series of subjects is partially imaged; estimation means for estimating, from the features extracted by the extraction means, the positional relationship in the real world at the time the subjects were imaged; and determination means for determining, based on the result of the estimation by the estimation means, arrangement positions of the plurality of contents that preserve the positional relationship in the real world. The extraction means extracts straight-line components of the contents as the features, and, when the plurality of contents have no portion in common, the estimation means estimates the positional relationship of the plurality of contents in the real world from the continuity of the straight-line components.

According to the present invention, content positions that are easy for the viewer to see can be determined easily, without requiring any special device or data.

FIG. 1 is a diagram illustrating an example of a layout created in the first embodiment.
FIG. 2 is a block diagram illustrating a hardware configuration example of the layout processing apparatus according to the embodiments.
FIG. 3 is a block diagram illustrating a functional configuration example of the layout processing apparatus according to the embodiments.
FIG. 4 is a flowchart illustrating an example of a processing procedure for creating a layout in the first embodiment.
FIG. 5 is a flowchart illustrating an example of a processing procedure for determining whether or not to estimate the relative positions of subjects in the first embodiment.
FIG. 6 is a diagram illustrating an example of a layout created in the second embodiment.
FIG. 7 is a diagram illustrating an example of a layout created in the third embodiment.
FIG. 8 is a diagram illustrating an example of a layout created in the fourth embodiment.
FIG. 9 is a diagram illustrating an example of a layout created in the fifth embodiment.
FIG. 10 is a flowchart illustrating an example of a processing procedure for determining whether or not to estimate the relative positions of subjects in the sixth embodiment.
FIG. 11 is a diagram illustrating an example of a layout created in the seventh embodiment.
FIG. 12 is a diagram illustrating an example of a layout created in the eighth embodiment.

(First embodiment)
Hereinafter, a first embodiment of the present invention will be described. Suppose, for example, that as shown in FIG. 1(a), a scene in the real world 101 is photographed in which a prism is placed on the left side of a patterned paper object 108 and a cylinder is placed on the right side. FIG. 1(b) shows a photograph 102 in which the prism is captured and a photograph 103 in which the cylinder is captured. If the photographs were taken in the order of photograph 103 and then photograph 102, a layout created in the usual way is the layout 104 shown in FIG. 1(c). That is, arranging the photographs in chronological order in a left-opening album yields the layout 104, in which the photographs are lined up from left to right in shooting order. In this layout the photographs are arranged in the direction opposite to the actual arrangement of the subjects, which gives the viewer a sense of incongruity. In the present embodiment, therefore, the positional relationship between the photographs is estimated using feature data of the images, and the layout 105 shown in FIG. 1(c) is created. In the layout 105, the photographs are arranged in the same direction as the subjects in the real world, which makes the layout easy to view.

FIG. 2 is a block diagram illustrating a hardware configuration example of the layout processing apparatus 200 according to the present embodiment.
As illustrated in FIG. 2, the layout processing apparatus 200, which is an image processing apparatus, includes a CPU (Central Processing Unit) 201, an input device 202, an output device 203, and a storage device 204. It further includes a RAM (Random Access Memory) 205, a ROM (Read Only Memory) 206, and a BUS 207.

The CPU 201 performs logical operations, determinations, and the like for various kinds of data processing, and controls the components connected via the BUS 207. A keyboard is connected to the input device 202; the keyboard has various function keys, such as character and symbol input keys (alphabet keys, hiragana keys, katakana keys, punctuation keys, and the like) and cursor movement keys for instructing cursor movement. Other devices that may be connected include an imaging device such as a camera that inputs the images to be laid out, and storage such as an HDD that holds images recorded by an imaging device and supplies them as input. A pointing device such as a mouse or a stick pointer, which indicates a controllable position on the screen of a GUI (Graphical User Interface) and is used to select functions and the like, may also be connected.

The output device 203 is any of various display devices such as a liquid crystal panel. The storage device 204 stores various kinds of information such as input/output data and processing programs. A hard disk, CD-ROM, DVD-ROM, flash memory, or the like can be used as the storage medium for these data and programs. The RAM 205 is used to temporarily store various data from the components. The ROM 206 stores control programs, such as the processing programs executed in the present embodiment. All of these components are connected via the BUS 207.

FIG. 3 is a block diagram illustrating a functional configuration example of the layout processing apparatus 200 according to the present embodiment.
The layout processing apparatus 200 includes a content feature information acquisition unit 301, a content correspondence acquisition unit 302, a position and orientation estimation unit 303, and a content arrangement position determination unit 304. In the following, "content" means a photograph. These functions are described below together with FIG. 4, which shows the overall flow.

FIG. 4 is a flowchart illustrating an example of a processing procedure for creating a layout with the layout processing apparatus 200 according to the present embodiment.
First, in step S401, the content feature information acquisition unit 301 reads content from the storage device 204, performs predetermined processing on the read content to acquire feature data, and stores the feature data in the RAM 205. The data acquired here need only include the data used in subsequent processing; any other data may or may not be included. Details of the feature data acquisition process are described later.

Next, in step S402, the content feature information acquisition unit 301 selects the content to be laid out. The content may be selected using information attached to the content, or according to information input from the input device 202 by a user operation.

When content is selected using information attached to it, possible methods include, for example, using the shooting location to select content whose positions are within a predetermined distance of one another, or using the shooting time to select content whose times are within a predetermined interval. Alternatively, content with similar composition may be selected based on the positions of faces or of objects with similar shapes.
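As an illustration of the time-based selection mentioned above, the following is a minimal sketch that groups photographs whose shooting times fall within a fixed interval of one another. The function name, the file names, and the five-minute threshold are hypothetical; the source only speaks of "a predetermined interval".

```python
from datetime import datetime, timedelta


def group_by_shooting_time(photos, max_gap=timedelta(minutes=5)):
    """Group (name, shot_at) pairs by shooting time.

    A new group starts whenever the gap to the previous shot exceeds
    max_gap (a hypothetical stand-in for the "predetermined interval").
    """
    groups = []
    for name, shot_at in sorted(photos, key=lambda p: p[1]):
        if groups and shot_at - groups[-1][-1][1] <= max_gap:
            groups[-1].append((name, shot_at))
        else:
            groups.append([(name, shot_at)])
    return groups


# Hypothetical input: two shots 40 seconds apart and one taken hours later.
photos = [
    ("photo_103.jpg", datetime(2012, 3, 26, 10, 0, 0)),
    ("photo_102.jpg", datetime(2012, 3, 26, 10, 0, 40)),
    ("unrelated.jpg", datetime(2012, 3, 26, 15, 30, 0)),
]
groups = group_by_shooting_time(photos)
candidates = [name for name, _ in groups[0]]
```

Each resulting group is a candidate set of content for one layout; the later shot ends up in a group of its own.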

When content is selected according to information input from the input device 202 by a user operation, the content to be used may be selected simply by clicking or the like, or by means such as a switch button. When the content to be used is selected manually, the process of acquiring the feature data of the content (step S401) may be performed after this selection.

Next, in step S403, the content correspondence acquisition unit 302 compares the acquired feature data, estimates a portion or an object that is common to a plurality of contents, and acquires information on that correspondence. Then, in step S404, the position and orientation estimation unit 303 estimates the real-world position and orientation, relative to the imaging device, of the subjects in each content.

Next, in step S405, the content arrangement position determination unit 304 determines the layout position of each content on the album based on the estimated position and orientation. Then, in step S406, each content is arranged at the layout position determined in step S405. In this way, a layout that reproduces the real-world position and orientation of the subjects in the contents, such as the layout 105, can be created on the album.

Various types of feature data and acquisition methods are conceivable for step S401. The present embodiment acquires local features of an image as the feature data and uses a method of searching for similar images with these local features. In this method, characteristic points (local feature points) are first extracted from the image, for example by the method disclosed in Non-Patent Document 1. Next, a feature amount (local feature amount) corresponding to each local feature point is calculated from the local feature point and the image information around it, for example by the method disclosed in Non-Patent Document 2. Images are then searched by matching the local feature amounts against one another.
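Non-Patent Document 1 describes the Harris corner detector. The following is a minimal pure-Python sketch of its corner response, offered only as an illustration: it assumes a grayscale image given as a list of rows, uses an unweighted 3x3 window instead of the Gaussian window usually used, and takes k = 0.04 as a commonly used constant. None of these choices come from the source.

```python
def harris_response(img, k=0.04):
    """Return the Harris response R = det(M) - k * trace(M)^2 per pixel.

    img is a list of rows of floats. M is the structure tensor summed
    over an unweighted 3x3 window (a simplification of the usual
    Gaussian weighting).
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Central-difference image gradients on interior pixels.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(2, h - 2):
        for x in range(2, w - 2):
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


# Synthetic 9x9 image: a bright quadrant whose corner lies at (x=4, y=4).
img = [[1.0 if x >= 4 and y >= 4 else 0.0 for x in range(9)] for y in range(9)]
resp = harris_response(img)
# Pick the pixel with the strongest response as a local feature point.
best = max(((resp[y][x], (x, y)) for y in range(9) for x in range(9)))[1]
```

In this synthetic image the response is positive at the corner, negative along the straight edges, and zero in flat areas, which is the behavior that lets corner-like points be selected as local feature points.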

In methods that use local feature amounts, a local feature amount is defined as information composed of a plurality of elements that are invariant to rotation and to enlargement and reduction. This makes the search possible even when the image has been rotated, enlarged, or reduced. A local feature amount is generally expressed as a vector. Invariance to rotation and scaling holds only in theory, however; in an actual digital image, a slight variation arises between a local feature amount before the image is rotated or scaled and the corresponding local feature amount afterward. To calculate a rotation-invariant local feature amount, the method described in Non-Patent Document 2, for example, calculates a main direction from the pixel pattern of the local region around the local feature point and, when calculating the local feature amount, rotates the local region with respect to the main direction to normalize the direction. To calculate a scale-invariant local feature amount, images of different scales are generated internally, and local feature points are extracted and local feature amounts are calculated from the image at each scale. The set of internally generated images of different scales is generally called a scale space. For each feature point, the rotation angle used for direction normalization and the scale (the level of the scale space) at which the feature amount was calculated are stored along with the feature amount.
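The main-direction calculation described above can be sketched as a magnitude-weighted histogram of gradient orientations, in the spirit of the method of Non-Patent Document 2. This is a simplified illustration under stated assumptions: the patch is a small list of rows, the histogram has 36 bins, and the Gaussian weighting and peak interpolation of the original method are omitted.

```python
import math


def dominant_orientation(patch, bins=36):
    """Return the main gradient direction of a patch, in degrees.

    Each interior pixel votes for the histogram bin nearest to its
    gradient orientation, weighted by its gradient magnitude.
    """
    h, w = len(patch), len(patch[0])
    bin_width = 360.0 / bins
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(round(angle / bin_width)) % bins] += magnitude
    return hist.index(max(hist)) * bin_width


# A horizontal intensity ramp and the same ramp turned by a quarter turn.
ramp_x = [[float(x) for x in range(5)] for y in range(5)]
ramp_y = [[float(y) for x in range(5)] for y in range(5)]
angle_x = dominant_orientation(ramp_x)
angle_y = dominant_orientation(ramp_y)
```

The two patches differ only by a quarter turn, and their main directions differ by 90 degrees. Rotating each patch back by its own main direction before computing the descriptor is what makes the descriptor approximately rotation invariant.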

Step S403 is described in detail with reference to FIG. 5. FIG. 5 is a flowchart illustrating an example of the detailed processing procedure of step S403 of FIG. 4 in the present embodiment.
First, in step S501, it is determined whether the target contents were captured at the same position or at the same point in time. If so, the process proceeds to step S502. This determination is needed to reproduce the relative position and orientation of each subject as observed from a specific place, that is, as seen by the photographer at the time of shooting. For this determination, latitude and longitude information provided by GPS, for example, or manually input location information may be used.

Also, when there is no large difference in shooting time between the contents, it can be presumed that the photographer did not change the shooting position, so the shooting times attached to the contents may be used instead. Because sensor measurement errors and shooting intervals need to be taken into account, differences within a predetermined range are regarded as the same position. If the determination in step S501 finds that the contents were not captured at the same position or at the same point in time, the rearrangement of the present embodiment is not applied, and the layout of the contents is created according to another rule, such as arranging them in chronological order. In that case, the process proceeds to step S504, the flag for creating a layout based on relative position is set to OFF, and the process ends.
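The determination in step S501 could be sketched as follows. This is only an illustration: the dictionary keys, the 10 m and 60 s tolerances, and the choice to prefer GPS over shooting time when both photographs carry GPS data are assumptions, not details given in the source.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def distance_m(gps_a, gps_b):
    """Great-circle (haversine) distance between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, gps_a)
    lat2, lon2 = map(math.radians, gps_b)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def same_viewpoint(a, b, max_dist_m=10.0, max_gap_s=60.0):
    """Treat two photos as taken from the same place if they are close in
    space (when both have GPS) or, failing that, close in time."""
    if a.get("gps") is not None and b.get("gps") is not None:
        return distance_m(a["gps"], b["gps"]) <= max_dist_m
    if a.get("time") is not None and b.get("time") is not None:
        return abs((a["time"] - b["time"]).total_seconds()) <= max_gap_s
    return False


# Hypothetical metadata records.
near_1 = {"gps": (35.65860, 139.74540)}
near_2 = {"gps": (35.65861, 139.74541)}
far = {"gps": (34.69370, 135.50230)}
timed_1 = {"time": datetime(2012, 3, 26, 10, 0, 0)}
timed_2 = {"time": datetime(2012, 3, 26, 10, 0, 40)}
```

The tolerances play the role of the "predetermined range" that absorbs sensor error and shooting intervals; when `same_viewpoint` returns False, the flow would fall through to step S504.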

Next, in step S502, the feature data (local feature amounts) acquired in step S401 are compared. There are various methods of matching images by comparing local feature points or local feature amounts; the present embodiment uses RANSAC, which is disclosed in Non-Patent Document 3. An outline of the search process is disclosed in Patent Document 5.

As a specific comparison procedure, first, for each local feature amount of the search source image, the local feature amount of the comparison destination image with the smallest inter-feature distance is recorded as its pair. Next, three local feature amounts are selected at random from the search source image, and an affine transformation matrix is obtained from the coordinate correspondence between them and their paired local feature amounts in the comparison destination image. Using this affine transformation matrix, the coordinates of the remaining local feature amounts of the search source image are transformed into the coordinates of the comparison destination image, and it is checked whether the paired local feature amount (the one with the smallest inter-feature distance) exists in the vicinity of each transformed position. If it does, one vote is cast; if not, no vote is cast.

If the number of votes ultimately reaches a predetermined value, it is determined that a partially matching region exists between the search source image and the comparison destination image; the larger the number of votes, the larger the matching region is considered to be. If the number of votes does not reach the predetermined value, three local feature amounts are again selected at random from the search source image and the process is repeated from the calculation of the affine transformation matrix, up to a predetermined number of iterations. If the number of votes still falls short of the predetermined value when the iteration count is reached, it is determined that no partially matching region exists, and the comparison process ends.
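The voting procedure described above can be sketched in plain Python as follows. The sketch assumes the nearest-descriptor pairing has already been done, so the input is a list of coordinate pairs; the function names, the 3-pixel tolerance, the vote threshold, and the iteration limit are all hypothetical values chosen for illustration.

```python
import random


def _det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule; None if singular."""
    d = _det3(m)
    if abs(d) < 1e-9:
        return None
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(_det3(replaced) / d)
    return solution


def affine_from_three(sample):
    """Fit u = a*x + b*y + tx, v = c*x + d*y + ty to three point pairs."""
    m = [[x, y, 1.0] for (x, y), _ in sample]
    row_u = _solve3(m, [dst[0] for _, dst in sample])
    row_v = _solve3(m, [dst[1] for _, dst in sample])
    if row_u is None or row_v is None:
        return None  # the three source points are collinear
    return row_u + row_v  # [a, b, tx, c, d, ty]


def ransac_affine(pairs, min_votes, max_iter=100, tol=3.0, seed=0):
    """Return (model, votes) once enough remaining pairs agree, else (None, 0)."""
    rng = random.Random(seed)
    for _ in range(max_iter):
        sample = rng.sample(pairs, 3)
        model = affine_from_three(sample)
        if model is None:
            continue
        a, b, tx, c, d, ty = model
        votes = 0
        for pair in pairs:
            if pair in sample:
                continue
            (x, y), (u, v) = pair
            # One vote if the paired point lies near the transformed position.
            if (a * x + b * y + tx - u) ** 2 + (c * x + d * y + ty - v) ** 2 <= tol ** 2:
                votes += 1
        if votes >= min_votes:
            return model, votes
    return None, 0


# Hypothetical matches: eight consistent pairs (shift by +5, +2) and two wrong ones.
src = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 7), (3, 2), (8, 1), (2, 9)]
pairs = [((float(x), float(y)), (x + 5.0, y + 2.0)) for x, y in src]
pairs += [((4.0, 4.0), (90.0, 90.0)), ((6.0, 3.0), (-50.0, 20.0))]
model, votes = ransac_affine(pairs, min_votes=5)
```

A sample that contains a wrong pair yields a transformation that almost no other pair agrees with, so it collects few votes and is discarded; only a sample of consistent pairs reaches the threshold.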

When a partially matching region exists, the feature amount corresponding to a local feature amount of interest in the search source image can be obtained using the affine transformation matrix obtained above and the pairs with the smallest inter-feature distance. The rotation relationship between the search source image and the comparison destination image can then be obtained from the rotation angles used for direction normalization when the local feature amounts were calculated. Likewise, the scaling relationship between the two images can be obtained from the scale-space levels at which the local feature amounts were calculated.
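One way to turn the stored rotation angles and scales of matched features into an image-to-image relationship is sketched below. The aggregation used here (a circular mean of the angle differences and a median of the scale ratios) is an assumption made for robustness; the source does not say how the per-feature values are combined.

```python
import math
from statistics import median


def relative_rotation_and_scale(matches):
    """Estimate rotation (degrees) and scale factor from source to destination.

    matches is a list of ((angle_src_deg, scale_src), (angle_dst_deg, scale_dst))
    for feature pairs already judged to correspond.
    """
    # Average the angle differences on the unit circle so that values such
    # as 350 -> 20 degrees are handled as +30 rather than -330.
    sin_sum = sum(math.sin(math.radians(dst[0] - src[0])) for src, dst in matches)
    cos_sum = sum(math.cos(math.radians(dst[0] - src[0])) for src, dst in matches)
    rotation = math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0
    scale = median(dst[1] / src[1] for src, dst in matches)
    return rotation, scale


# Hypothetical matched features: about 30 degrees of rotation, about 2x magnification.
matches = [
    ((10.0, 2.0), (40.0, 4.0)),
    ((350.0, 1.0), (20.0, 2.0)),
    ((100.0, 3.0), (130.0, 6.3)),
]
rotation, scale = relative_rotation_and_scale(matches)
```

The median keeps a single noisy scale ratio (here 2.1) from shifting the estimate.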

Next, in step S503, it is determined whether the processing in step S502 established a correspondence between the contents, that is, whether at least a predetermined number of local feature pairs with the smallest inter-feature distance exist. If a correspondence is established, the flag for creating a layout based on relative position is set to ON in step S505, and the process ends. If no correspondence is established, the process proceeds to step S504, the flag is set to OFF, and the process ends.

As a specific example, the creation of a layout of the photographs 102 and 103 shown in FIG. 1(b) is described. First, in step S401, local feature amounts in the images are acquired as feature data. Next, in step S402, the photographs 102 and 103 are selected as the content.

In step S403, the information acquired for the selected photographs 102 and 103 is compared. In detail, first, the shooting time information is compared in step S501. For the two photographs 102 and 103, the shooting interval is shorter than the predetermined interval, so both are determined to have been taken from the same place. Next, in step S502, the local feature amounts are compared. As shown in FIG. 1(b), this reveals that the local region 106 of the photograph 102 and the local region 107 of the photograph 103 are the same portion, so a correspondence can be established. Accordingly, step S503 determines that a correspondence exists between the contents, and step S505 sets the flag for creating a layout based on relative position to ON.

Next, in step S404, the position and orientation relative to the subjects are estimated based on the local regions 106 and 107 that were determined to be the same portion. Specifically, the local region 106 lies at the right end of the photograph 102 and the local region 107 lies at the left end of the photograph 103, so it can be estimated that, as seen from the photographer, the photograph 102 captures the left part of the scene and the photograph 103 captures the right part.

In step S405, based on this estimation result, it is decided to arrange the photograph 102 on the left side of the page and the photograph 103 on the right side. In step S406, a layout is created with the arrangement decided in step S405. As a result, the layout 105 shown in FIG. 1(c) is created. Although the present embodiment estimates the positional relationship in the horizontal direction, the positional relationship in the vertical direction can also be estimated as long as a valid corresponding portion exists. In that case, the contents are arranged one above the other.
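The left/right reasoning of steps S404 and S405 can be sketched as a comparison of where the shared region sits in each photograph. The function name, the pixel values, and the use of the normalized horizontal center of the shared region are assumptions made for illustration.

```python
def horizontal_order(shared_a, width_a, shared_b, width_b):
    """Decide which of two photos goes on the left.

    shared_a and shared_b are (x_min, x_max) extents, in pixels, of the
    region the two photos have in common. The photo in which the shared
    region lies further to the right shows the left part of the scene.
    Returns ("a", "b") or ("b", "a") as (left, right), or None if undecided.
    """
    center_a = (shared_a[0] + shared_a[1]) / 2.0 / width_a
    center_b = (shared_b[0] + shared_b[1]) / 2.0 / width_b
    if center_a > center_b:
        return ("a", "b")
    if center_b > center_a:
        return ("b", "a")
    return None


# Hypothetical values: photo 102 ("a") has the shared region at its right
# edge, photo 103 ("b") has it at its left edge; both are 640 pixels wide.
order = horizontal_order((560, 640), 640, (0, 80), 640)
```

Here `order` places photograph 102 on the left and photograph 103 on the right, matching the layout 105. The same comparison applied to vertical extents would give the top/bottom order.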

As described above, the present embodiment reduces the sense of incongruity given to the viewer by contradictions in the positional relationship between the subjects in the contents or by an inability to move the line of sight naturally.

(Second embodiment)
Next, a second embodiment of the present invention will be described. Whereas the first embodiment creates a layout by estimating the positional relationship in the horizontal and vertical directions, the present embodiment creates a layout by estimating the positional relationship in the depth (front-rear) direction. The configuration and processing procedure of the layout processing apparatus according to the present embodiment are basically the same as those of the first embodiment, so their description is omitted. Only the differences from the first embodiment are described below.

Suppose, for example, that as shown in FIG. 6(a), a scene in the real world 601 is photographed in which a prism is placed on the near side of a patterned sheet of paper and a cylinder is placed on the far side. FIG. 6(b) shows a photograph 602 in which the prism is captured and a photograph 603 in which the cylinder is captured. If the photographs were taken in the order of photograph 603 and then photograph 602, a layout created in the usual way is the layout 604 shown in FIG. 6(c). That is, arranging the photographs in chronological order in a left-opening album, with their sizes changed according to depth, yields the layout 604, in which the photographs are lined up from top to bottom in shooting order. In this layout the photographs are arranged in the direction opposite to the actual arrangement of the subjects, which gives the viewer a sense of incongruity. In the present embodiment, therefore, the position and orientation of each photograph relative to the imaging device are estimated using feature data of the images, the positional relationship between the photographs is derived from them, and the layout 605 shown in FIG. 6(c) is created. In the layout 605, the photographs are arranged in the same direction as the subjects in the real world, which makes the layout easy to view.

In the present embodiment, the method of estimating the relative position in step S404 differs. As a specific example, the sizes of the regions determined to be the same part are compared. Because a subject located farther back may be photographed at a higher magnification, the same part may appear larger in the content that captures the rear subject than in the content that captures the front subject. As described above, the local feature amount explained in the first embodiment is invariant to scaling. The regions can therefore be compared even when their sizes differ, and the subject of the content in which the common part appears larger can be estimated to be at the back.
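The size comparison described above can be illustrated with a minimal Python sketch. It assumes that the region judged to be the same part has already been located in both contents and is summarized as a bounding box; the names `Region` and `order_by_depth` and the `tolerance` parameter are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Bounding box of the common part as it appears in one content (pixels)."""
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height


def order_by_depth(region_a: Region, region_b: Region, tolerance: float = 0.05) -> str:
    """Guess which content shows the rear subject from the size of the common part.

    Following the heuristic in the text, the content in which the common part
    appears larger is taken to show the subject at the back. `tolerance` is an
    assumed dead band that avoids a decision when the sizes are nearly equal.
    """
    if region_a.area <= 0 or region_b.area <= 0:
        return "undetermined"
    ratio = region_a.area / region_b.area
    if ratio > 1.0 + tolerance:
        return "a_behind"
    if ratio < 1.0 / (1.0 + tolerance):
        return "b_behind"
    return "undetermined"
```

When the result is "undetermined", a caller would fall back to another cue, since the text itself notes that the magnification assumption holds only in some cases.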

Even when this property does not hold, the position and orientation of the subjects can be estimated by performing plane estimation from the feature points of the common part. A plane can be determined from a minimum of three points, so a plane is estimated from three or more common feature points, and the position and orientation of the subjects are estimated by determining where on the plane each subject lies. Alternatively, because the scale information of matching feature points is proportional to the size of the corresponding part, the depth can also be estimated from this property.
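As a rough illustration of the scale-based alternative, the following Python sketch takes the scale values of matched feature points in two contents and returns a robust estimate of how much larger the corresponding part appears in the first content. The function name `relative_scale` and the use of the median are assumptions made for this sketch; the specification only states that the scale information is proportional to the size of the corresponding part.

```python
from statistics import median


def relative_scale(scale_pairs):
    """Estimate the apparent size ratio of the common part between two contents.

    scale_pairs: iterable of (scale_in_a, scale_in_b), one pair per matched
    feature point. The median of the per-match ratios is used (an assumption
    of this sketch) so that a few wrong matches do not dominate the result.
    """
    ratios = [sa / sb for sa, sb in scale_pairs if sa > 0 and sb > 0]
    if not ratios:
        raise ValueError("no usable matches")
    return median(ratios)
```

A ratio well above 1 means the common part appears larger in the first content, which can then be interpreted in the same way as the region-size comparison.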

The layout 605 shown in FIG. 6C is the result of the arrangement performed in step S406 based on the estimation described above. Although the present embodiment targets a two-dimensional plane such as a paper album, the contents can also be laid out on a three-dimensional display device, such as a three-dimensional display, in a manner that reproduces their relative positional relationship.

As described above, according to the present embodiment, it is possible to reduce, in the depth direction as well as in the horizontal and vertical directions, the sense of incongruity that the viewer feels because of inconsistencies in the positional relationship between the subjects in the contents or because the line of sight cannot move naturally.

(Third embodiment)
Hereinafter, a third embodiment of the present invention will be described. In the present embodiment, an example will be described in which a layout is created by estimating the positional relationship in the horizontal and vertical directions when the contents have no part in common. Note that the configuration and processing procedure of the layout processing apparatus according to the present embodiment are basically the same as those in the first embodiment, and a description thereof will be omitted. In the present embodiment, differences from the first embodiment will be described.

For example, assume that photographs 701 and 702 shown in FIG. 7A are taken in the real world 101 shown in FIG. 1A. If the photographs were taken in the order of photograph 702 and then photograph 701, they are arranged in chronological order, so the layout 703 shown in FIG. 7B is generally created. In this layout the photographs are arranged in the reverse of the actual arrangement of the subjects, which gives the viewer a sense of incongruity. Moreover, with the method of the first embodiment the photographs are still arranged in chronological order, because photographs 701 and 702 have no part in common. The present embodiment therefore describes an example in which, even in such a case, the positional relationship between the photographs is estimated using image feature data and the layout 704 shown in FIG. 7B is created.

The specific processing flow is the same as in the first embodiment, but in this case photographs 701 and 702 have no part in common. A photograph 705 shown in FIG. 7C, which contains both subjects (the prism and the cylinder) and the whole of the rectangular object 108, is therefore prepared, and in step S403 correspondences are obtained between this photograph 705 and each of photographs 701 and 702. Note that the photograph need not contain the whole of the rectangular object 108; in some cases a photograph that contains it only partially may be used.

The photograph 705 shown in FIG. 7C may be one taken by any person, or may be taken in advance with an ultra-high-pixel-count camera. Alternatively, the camera may shoot continuously whenever its power is on and the resulting photographs may be combined, or a moving image may be shot and several frames extracted from it and combined. The correspondences are obtained and the position and orientation are estimated in the same manner as in the first embodiment. As a result, photograph 701 is found to correspond to region 706 in photograph 705, and photograph 702 to region 707, so it is estimated that, as seen from the photographer, photograph 701 captures the part that was on the left and photograph 702 the part that was on the right.
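Once each photograph has been matched to a region of the reference photograph, the left/right decision reduces to comparing the positions of those regions. The sketch below is one hedged way to express that step in Python; the function name `horizontal_order`, the box format `(x_min, y_min, x_max, y_max)`, and the comparison of box centers are assumptions of this sketch rather than details given in the specification.

```python
def horizontal_order(region_a, region_b):
    """Decide which of two photographs shows the left-hand part of the scene.

    region_a, region_b: (x_min, y_min, x_max, y_max) of the area in the
    reference photograph that each photograph was matched to.
    Returns ("a", "b") if a is to the left of b, ("b", "a") if b is to the
    left of a, or None when the horizontal centers coincide.
    """
    center_a = (region_a[0] + region_a[2]) / 2.0
    center_b = (region_b[0] + region_b[2]) / 2.0
    if center_a < center_b:
        return ("a", "b")
    if center_b < center_a:
        return ("b", "a")
    return None
```

The same comparison applied to the vertical centers would give a top/bottom order, and the returned tuple can be mapped directly to page positions when the arrangement is decided.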

Based on this estimation result, it is decided in step S405 to place photograph 701 on the left side of the page and photograph 702 on the right side. The layout is then actually created in step S406. As a result, the layout 704 shown in FIG. 7B is created. Although the example in FIG. 7 estimates the positional relationship in the horizontal direction, the positional relationship in the vertical direction can also be estimated, as in the first embodiment, as long as valid corresponding parts exist. In that case the contents are arranged one above the other.

As described above, according to the present embodiment, by using content that is not itself a layout target, it is possible to reduce the sense of incongruity that the viewer feels because of inconsistencies in the position and orientation of the subjects in the contents or because the line of sight cannot move naturally.

(Fourth embodiment)
Hereinafter, a fourth embodiment of the present invention will be described. In the present embodiment, an example will be described in which a layout is created by estimating the positional relationship in the depth direction when the contents have no part in common. Note that the configuration and processing procedure of the layout processing apparatus according to the present embodiment are basically the same as those in the second embodiment, and a description thereof will be omitted. In the present embodiment, differences from the second embodiment will be described.

For example, assume that photographs 801 and 802 shown in FIG. 8A are taken in the real world 601 shown in FIG. 6A. If the photographs were taken in the order of photograph 802 and then photograph 801, they are arranged in chronological order, and the layout 803 shown in FIG. 8B is generally created. In this layout the photographs are arranged in the reverse of the actual arrangement of the subjects, which gives the viewer a sense of incongruity. Moreover, with the method of the second embodiment the photographs are still arranged in chronological order, because photographs 801 and 802 have no part in common. The present embodiment therefore describes an example in which, even in such a case, the positional relationship between the photographs is estimated using image feature data and the layout 804 shown in FIG. 8B is created.

The specific processing flow is the same as in the second embodiment, but photographs 801 and 802 have no part in common. As in the third embodiment, a photograph 805 shown in FIG. 8C, in which both subjects (the prism and the cylinder) appear, is therefore prepared, and in step S403 correspondences are obtained between this photograph 805 and each of photographs 801 and 802. The photograph 805 is obtained in the same way as in the third embodiment.

The correspondences are obtained and the position and orientation are estimated in the same manner as in the second embodiment. As a result, photograph 801 is found to correspond to region 806 in photograph 805, and photograph 802 to region 807, so it is estimated that, as seen from the photographer, photograph 801 captures the part that was at the front and photograph 802 the part that was at the back.

The arrangement is performed in step S406 based on the estimation described above, and as a result the layout 804 shown in FIG. 8B is created. In the present embodiment, as in the second embodiment, the contents can be laid out in a manner that reproduces their relative positional relationship not only on a two-dimensional plane such as a paper album but also on a three-dimensional display device such as a three-dimensional display.

As described above, according to the present embodiment, by using content that is not itself a layout target, it is possible to reduce, in the depth direction as well, the sense of incongruity that the viewer feels because of inconsistencies in the position and orientation of the subjects in the contents or because the line of sight cannot move naturally.

(Fifth embodiment)
Hereinafter, a fifth embodiment of the present invention will be described. In the present embodiment, as in the third and fourth embodiments, an example will be described in which a layout is created by estimating the positional relationship between the contents when two photographs have no part in common. In the present embodiment, however, the layout is created by exploiting the fact that the same object appears split across the two photographs. The configuration and processing procedure of the layout processing apparatus according to the present embodiment are basically the same as those in the first embodiment, and a description thereof will be omitted. In the present embodiment, differences from the first embodiment will be described.

For example, assume that photographs 901 and 902 shown in FIG. 9A are taken of the scene in the real world 101 shown in FIG. 1A. Here, the object 907 in photograph 901 and the object 908 in photograph 902 are both the rectangular object 108 that exists in the real world 101. Therefore, if the objects 907 and 908 can be determined to be the same object, that fact can be used as a clue to estimate the position and orientation of the subjects relative to the imaging apparatus.

The specific processing flow is the same as in the first embodiment, but in the present embodiment the feature data acquired in step S401 and the method of matching the contents in step S403 differ. In the present embodiment, each side of the object 108 is extracted as a straight line, and correspondences are obtained by comparing these linear components as features.

Specifically, the linear components in each image are first extracted in step S401. A known technique may be used for this processing; for example, the straight lines contained in an image can be extracted by applying the Hough transform to the image. In this example, straight lines 903 and 904 are extracted from photograph 901, and straight lines 905 and 906 are extracted from photograph 902.

In step S403, the linear components extracted from the images are compared. As described above, the images used are constrained so that the differences in their shooting times or locations fall within a predetermined range. Therefore, if there is a pair of straight lines whose parameters are almost identical between the two images, the two lines can be judged to be parts of one continuous straight line. In the example shown in FIG. 9A, straight lines 903 and 905 are judged to be continuous with each other, as are straight lines 904 and 906. Using this fact, objects that have a continuous straight line as a side can be regarded as the same object. That is, the objects 907 and 908 are determined to be the same object.
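The continuity test described above can be sketched in Python as follows, assuming that each straight line is represented by the usual Hough parameters (rho, theta) and that "almost identical parameters" means both values agree within a tolerance. The function name `find_continuous_lines` and the tolerance values are hypothetical; the specification does not state how closeness is measured.

```python
import math


def find_continuous_lines(lines_a, lines_b, rho_tol=10.0, theta_tol=math.radians(3.0)):
    """Pair up lines from two images whose Hough parameters nearly coincide.

    lines_a, lines_b: lists of (rho, theta), with rho in pixels and theta in
    radians. Returns a list of (index_in_a, index_in_b) for every pair that
    is judged to belong to one continuous straight line.
    This sketch ignores the wrap-around of theta near 0 and pi.
    """
    pairs = []
    for i, (rho_a, theta_a) in enumerate(lines_a):
        for j, (rho_b, theta_b) in enumerate(lines_b):
            if abs(rho_a - rho_b) <= rho_tol and abs(theta_a - theta_b) <= theta_tol:
                pairs.append((i, j))
    return pairs
```

If at least one pair is returned, the objects bounded by the paired lines could be treated as the same object, which is the clue used for the subsequent position estimation.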

By using this correspondence, the position and orientation of the subjects can be estimated as in the first to fourth embodiments, and the placement of the photographs in the vertical, horizontal, and depth directions can be decided. In photograph 901, the object 907 that is regarded as the same object in both photographs lies on the right side of the photograph, whereas in photograph 902 the object 908 lies on the left side. It is therefore estimated that, as seen from the photographer, photograph 901 captures the part that was on the left and photograph 902 the part that was on the right.

Based on this estimation result, it is decided in step S405 to place photograph 901 on the left side of the page and photograph 902 on the right side. The layout is then actually created in step S406. As a result, the layout 909 shown in FIG. 9B is generated. Note that the feature used to obtain the correspondence need not be a linear component. For example, the shape or color of an object may be used. When an object with no linear components, such as a circle, appears in both photographs, using color as the feature is effective.

As described above, according to the present embodiment, even when the contents have no part in common, it is possible to reduce the sense of incongruity that the viewer feels because of inconsistencies in the position and orientation of the subjects in the contents or because the line of sight cannot move naturally.

(Sixth embodiment)
Hereinafter, a sixth embodiment of the present invention will be described. In the present embodiment, an example will be described in which a layout is created by estimating the positional relationship between the contents when the correspondence between images cannot be estimated from image feature data. Note that the configuration and processing procedure of the layout processing apparatus according to the present embodiment are basically the same as those in the first embodiment, and a description thereof will be omitted. In the present embodiment, differences from the first embodiment will be described.

In the first to fifth embodiments, local feature amounts, straight lines, colors, and the like were given as examples of image feature data for estimating the correspondence between images. However, it is possible that no correspondence can be obtained with any feature amount derived from the images, including these. In that case, features other than those that can be extracted from the images are used as an alternative method. One example of such a feature is sensor information, for example from an electronic compass. Specifying the correspondence manually is also conceivable. Even when this situation is taken into account, the basic processing flow is the same as in the first embodiment, but the processing in step S403 differs in the present embodiment.

FIG. 10 is a flowchart illustrating an example of the detailed processing procedure of step S403 in FIG. 4 in the present embodiment. The differences from FIG. 5 are described below.

If the determination in step S503 finds that no correspondence can be obtained using image feature data, it is determined in step S1001 whether the alternative method described above can be used. If the alternative method cannot be used, the process proceeds to step S504. If the alternative method can be used, a flag indicating that the alternative method is to be used is turned ON in step S1002. As a result, in step S404 of FIG. 4 the positional relationship between the contents is estimated by the alternative method.
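The branching around steps S503, S1001, and S1002 can be summarized in a short Python sketch. The enum `Method` and the function `select_method` are hypothetical names introduced only to make the control flow explicit; what step S504 does is not restated here, so the sketch simply reports that neither method is available.

```python
from enum import Enum


class Method(Enum):
    IMAGE_FEATURES = "image_features"  # correspondence found in S503
    ALTERNATIVE = "alternative"        # flag turned ON in S1002
    NONE = "none"                      # processing continues at S504


def select_method(matched_by_image_features: bool, alternative_available: bool) -> Method:
    """Choose how the positional relationship will be estimated in S404."""
    if matched_by_image_features:
        return Method.IMAGE_FEATURES
    if alternative_available:  # decision corresponding to S1001
        return Method.ALTERNATIVE
    return Method.NONE
```

Here `alternative_available` would be true when, for example, compass readings are attached to both contents or the user has specified the correspondence manually.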

As described above, according to the present embodiment, even when no correspondence can be obtained using feature amounts, it is possible to reduce the sense of incongruity that the viewer feels because of inconsistencies in the position and orientation of the subjects in the contents or because the line of sight cannot move naturally.

(Seventh embodiment)
Hereinafter, a seventh embodiment of the present invention will be described. In the present embodiment, an example of creating a layout of three or more subjects (photographs) will be described. Note that the configuration and processing procedure of the layout processing apparatus according to the present embodiment are basically the same as those in the first embodiment, and a description thereof will be omitted. In the present embodiment, differences from the first embodiment will be described.

For example, as shown in FIG. 11A, assume that a scene in the real space 1101 is photographed in which a prism is placed on the left, a cylinder in the center, and a cube on the right, near a patterned sheet of paper. FIG. 11B shows a photograph 1102 of the prism, a photograph 1103 of the cylinder, and a photograph 1104 of the cube.

In the specific processing flow, the position and orientation are first estimated for two photographs at a time by the procedure shown in FIG. 4, and this is repeated for every pair whose positional relationship can be estimated. Then, among the pairs whose positional relationship has been estimated, the photographs that appear in more than one pair are used as clues to estimate the overall positional relationship.

For example, in the case of photographs 1102, 1103, and 1104, the position and orientation of photographs 1102 and 1103 are first estimated by the same method as in the fifth embodiment, and it is estimated that, as seen from the photographer, photograph 1102 captures the part that was on the left and photograph 1103 the part that was on the right. Similarly, photographs 1103 and 1104 are processed by the same method as in the first embodiment, and it is estimated that, as seen from the photographer, photograph 1103 captures the part that was on the left and photograph 1104 the part that was on the right. From these two results it can be estimated that photographs 1102, 1103, and 1104 capture parts of the scene that were located, from left to right as seen from the photographer, in the order 1102, 1103, 1104.
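Combining the pairwise results into one overall order can be viewed as sorting the photographs so that every estimated "left of" relation is respected. The Python sketch below uses a topological sort from the standard library for this; treating the problem this way, and the function name `global_left_to_right`, are assumptions of this sketch rather than the procedure stated in the specification.

```python
from graphlib import TopologicalSorter


def global_left_to_right(pairs):
    """Merge pairwise (left, right) estimates into a single left-to-right order.

    pairs: iterable of (left_photo, right_photo) identifiers.
    Raises graphlib.CycleError if the pairwise estimates contradict each other.
    When the pairs do not fix a unique order, one consistent order is returned.
    """
    sorter = TopologicalSorter()
    for left, right in pairs:
        # Register `left` as a predecessor of `right`, so it is emitted first.
        sorter.add(right, left)
    return list(sorter.static_order())
```

For the example in the text, the pairs (1102, 1103) and (1103, 1104) share photograph 1103, and that overlap is what allows the three photographs to be chained into a single order.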

Based on this estimation result, it is decided in step S405 to place photograph 1102 on the left side of the page, photograph 1103 in the center of the page, and photograph 1104 on the right side of the page. The layout is then actually created in step S406. As a result, the layout 1105 shown in FIG. 11C is generated. In the above example, the methods of the first and fifth embodiments are used to estimate the positional relationship between two photographs, but the methods of the other embodiments may of course be used instead. It is also possible to estimate the horizontal positional relationship for one pair and the vertical positional relationship for another pair and then combine the results.

As described above, according to the present embodiment, even when there are three or more target contents, it is possible to reduce the sense of incongruity that the viewer feels because of inconsistencies in the position and orientation of the subjects in the contents or because the line of sight cannot move naturally.

(Eighth embodiment)
Hereinafter, an eighth embodiment of the present invention will be described. In the present embodiment, an example will be described in which a layout is created by estimating the positional relationship between the contents when a subject has been photographed at a tilt. Note that the configuration and processing procedure of the layout processing apparatus according to the present embodiment are basically the same as those in the first embodiment, and a description thereof will be omitted. In the present embodiment, differences from the first embodiment will be described.

For example, assume that the scene in the real world 101 shown in FIG. 1A is photographed and the photographs 1201 and 1202 shown in FIG. 12A are obtained. Here, photograph 1201 is tilted relative to the real world 101. If a layout is created from the photographs as they are according to the first to seventh embodiments, the viewer will find it hard to view. The present embodiment therefore describes an example in which the invariance of the local feature amount to rotation is used to correct the tilt, so that the layout does not feel hard to view.

The basic processing flow is the same as in the first embodiment, but in the present embodiment the rotation of the images is taken into account when the contents are matched in step S403. In photographs 1201 and 1202, the local regions 1203 and 1204 correspond to each other. As explained in the first embodiment, the local feature amounts in these two local regions have been rotated with respect to their main direction so that the direction is normalized. In addition, as described above, the relationship between the rotation angles can also be obtained when the local regions are determined. Therefore, if the angle of rotation applied at that time is stored, the tilt can be corrected.
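One hedged way to turn the stored rotation angles into a tilt estimate is to average, over all matched feature points, the difference between the main direction in the tilted photograph and the main direction in the reference photograph. The Python sketch below does this with a circular mean so that angles near the wrap-around are handled correctly; the function name `estimate_tilt` and the averaging step are assumptions of this sketch, not details from the specification.

```python
import math


def estimate_tilt(orientation_pairs):
    """Estimate how far one photograph is rotated relative to a reference.

    orientation_pairs: iterable of (angle_in_tilted, angle_in_reference) in
    degrees, one pair per matched feature point (their main directions).
    Returns the circular mean of the differences, in degrees in [-180, 180].
    Rotating the tilted photograph by the negative of this value aligns it
    with the reference.
    """
    sum_sin = 0.0
    sum_cos = 0.0
    for tilted, reference in orientation_pairs:
        diff = math.radians(tilted - reference)
        sum_sin += math.sin(diff)
        sum_cos += math.cos(diff)
    if math.hypot(sum_sin, sum_cos) < 1e-12:
        raise ValueError("tilt is undefined for these orientations")
    return math.degrees(math.atan2(sum_sin, sum_cos))
```

Which photograph serves as the reference still has to be chosen separately, for example manually or from the number of near-horizontal straight lines in each photograph.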

Here, the reference used for correcting the tilt may be selected by, for example, manually choosing the photograph to serve as the reference, comparing the linear components in the photographs and choosing the one with more nearly horizontal lines, or using a photograph in which all of the subjects appear, as in the third and fourth embodiments. After the tilt has been estimated by this processing, the tilt of the photograph is corrected, and a portion is then cut out using a rectangle inscribed in the subject, which generates the photograph 1205 shown in FIG. 12B.

Based on this correction and on the same position and orientation estimation as in the first embodiment, it is decided in step S405 to place photograph 1205 on the left side of the page and photograph 1202 on the right side of the page. The layout is then actually created in step S406. As a result, the layout 1206 shown in FIG. 12C is generated. The above example shows estimation for the same real-world scene as in the first embodiment, but the correction is of course also applicable to the examples shown in the other embodiments.

As described above, according to the present embodiment, even when a target content is tilted, it is possible to reduce the sense of incongruity that the viewer feels because of inconsistencies in the position and orientation of the subjects in the contents or because the line of sight cannot move naturally.

(Other embodiments)
The first to eighth embodiments described above basically deal with photographs and describe examples in which the relative position between images is estimated in one direction based on one kind of feature data. A moving image can nevertheless also be used as a layout target, provided that the features in the captured images do not change greatly. In addition, the relative position may be estimated by combining two or more kinds of feature data, or by combining relative positions in two or more directions.

For example, consider the relative positions of the prism and the cylinder in the real world 601 shown in FIG. 6. The second embodiment considered only the relative position of these two subjects in the depth direction and used only local feature amounts. However, the two subjects are actually positioned diagonally with respect to each other. To reproduce this diagonal relationship, the positional relationship in the horizontal direction is therefore estimated together with that in the depth direction. In doing so, not only the local feature amounts but also the shapes of the objects are used as features. This allows the diagonal positional relationship to be reproduced in the layout. Using multiple kinds of information also improves the accuracy with which the positional relationship is estimated. Furthermore, a moving image in which the above features do not change appreciably can be used in place of a still image. In this way, the sense of incongruity that the viewer feels because of inconsistencies in the position and orientation of the subjects in the contents or because the line of sight cannot move naturally can be reduced more accurately.

The present invention can also be realized by the following processing: software (a program) that implements the functions of the above-described embodiments is supplied to a system or apparatus via a network or any of various storage media, and a computer (or a CPU, MPU, or the like) of that system or apparatus reads out and executes the program.

301 Content feature information acquisition unit
302 Content correspondence acquisition unit
303 Position and orientation estimation unit
304 Content arrangement position determination unit

Claims

1. An image processing apparatus comprising:
extraction means for extracting features from a plurality of contents in each of which a part of a series of subjects is imaged;
estimation means for estimating, from the features extracted by the extraction means, a positional relationship in the real world at the time the subjects were imaged; and
determination means for determining arrangement positions of the plurality of contents, based on the result of the estimation by the estimation means, so as to maintain the positional relationship in the real world,
wherein the extraction means extracts a linear component of the content as the feature, and, when there is no common part between the plurality of contents, the estimation means estimates the positional relationship of the plurality of contents in the real world based on the continuity of the linear components.

2. The image processing apparatus according to claim 1, wherein the feature is information on a portion common to the plurality of contents.

3. The image processing apparatus according to claim 2, wherein the extraction means calculates local feature amounts from the plurality of contents and extracts the information on the portion common to the plurality of contents.

4. The image processing apparatus according to claim 1, wherein the feature is information on corresponding portions between the plurality of contents.

5. The image processing apparatus according to claim 4, wherein the information on the corresponding portions between the plurality of contents is at least one of a linear component, an object color, and an object shape.

6. The image processing apparatus according to claim 4 or 5, wherein the estimation means estimates the positional relationship by estimating the continuity of the subjects using the information on the corresponding portions between the plurality of contents.

7. The image processing apparatus according to claim 1, wherein the estimation means further estimates the positional relationship based on content in which all or part of the series of subjects is imaged and which includes a portion common to the plurality of contents.

8. The image processing apparatus according to claim 1, wherein, when the plurality of contents comprises three or more contents, the estimation means estimates the positional relationship based on combinations of two contents among the three or more contents.

9. The image processing apparatus according to any one of claims 1 to 8, wherein the positional relationship estimated by the estimation means is a front-rear, left-right, or up-down relationship.

10. The image processing apparatus according to any one of claims 1 to 9, wherein the content is a still image or a moving image.

11. An image processing method comprising:
an extraction step of extracting features from a plurality of contents in each of which a part of a series of subjects is imaged;
an estimation step of estimating, from the features extracted in the extraction step, a positional relationship in the real world at the time the subjects were imaged; and
a determination step of determining arrangement positions of the plurality of contents, based on the result of the estimation in the estimation step, so as to maintain the positional relationship in the real world,
wherein, in the extraction step, a linear component of the content is extracted as the feature, and, in the estimation step, when there is no common part between the plurality of contents, the positional relationship of the plurality of contents in the real world is estimated based on the continuity of the linear components.

12. A program causing a computer to execute:
an extraction step of extracting features from a plurality of contents in each of which a part of a series of subjects is imaged;
an estimation step of estimating, from the features extracted in the extraction step, a positional relationship in the real world at the time the subjects were imaged; and
a determination step of determining arrangement positions of the plurality of contents, based on the result of the estimation in the estimation step, so as to maintain the positional relationship in the real world,
wherein, in the extraction step, a linear component of the content is extracted as the feature, and, in the estimation step, when there is no common part between the plurality of contents, the positional relationship of the plurality of contents in the real world is estimated based on the continuity of the linear components.