JP2019045993A

JP2019045993A - Image processing device, image processing method and program

Info

Publication number: JP2019045993A
Application number: JP2017166102A
Authority: JP
Inventors: 貴之猿田; Takayuki Saruta; 優和真継; Masakazu Matsugi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2017-08-30
Filing date: 2017-08-30
Publication date: 2019-03-22

Abstract

To extract a subject region from a captured image with high accuracy.

SOLUTION: An image processing device according to the present invention acquires, as photographing conditions, position information and orientation information of the photographing means at the time of capture, generates a model image based on the photographing conditions, and extracts a subject region from the captured image based on the generated model image.

SELECTED DRAWING: Figure 4

Description

The present invention relates to a technique for extracting a subject region from an image.

One image recognition technique divides a captured image into a plurality of regions and identifies, for each region, a class relating to the classification of the subject. The class of each region is identified based on feature quantities extracted from that region. Dividing an image into appropriate regions facilitates many kinds of image processing, such as subject recognition, scene recognition, and image quality correction suited to the subject.

However, when identification relies only on the feature quantities of each region, a region may be misclassified even though the reliability (identification score or identification likelihood) is high. For example, a human can easily distinguish a blue sky from a blue wall, but it is difficult for a classifier to distinguish a region cut out of the sky from a region cut out of a blue wall.

Patent Document 1 therefore discloses a technique for estimating the category or part of a subject based on GPS information and shooting information obtained when a camera captures an image.

Patent Document 1: Japanese Patent No. 5655821

Non-Patent Document 1: R. Socher, "Parsing Natural Scenes and Natural Language with Recursive Neural Networks", International Conference on Machine Learning, 2011.
Non-Patent Document 2: P. Krähenbühl, "Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials", Neural Information Processing Systems, 2011.
Non-Patent Document 3: J. Tighe, "SuperParsing: Scalable Nonparametric Image Parsing with Superpixels", European Conference on Computer Vision, 2010.
Non-Patent Document 4: Y. Taigman, "DeepFace: Closing the Gap to Human-Level Performance in Face Verification", Conference on Computer Vision and Pattern Recognition, 2014.

However, as with the technique of Patent Document 1, information obtained from GPS information and shooting information alone is sometimes insufficient to extract a subject region in a captured image accurately. An object of the present invention is therefore to enable a subject region to be extracted accurately from a captured image.

The present invention comprises: first acquisition means for acquiring a captured image in which a subject has been photographed by photographing means; second acquisition means for acquiring, as photographing conditions, position information and orientation information of the photographing means at the time the acquired image was captured; generation means for generating a model image of the subject based on the acquired photographing conditions; and extraction means for extracting the region of the subject from the captured image based on the generated model image.

According to the present invention, a subject region can be extracted accurately from a captured image.

FIG. 1 is a diagram showing a configuration example of an image processing system according to the first embodiment.
FIG. 2 is a diagram showing an example of an image from which a subject region has been extracted in the first embodiment.
FIG. 3 is a schematic block diagram showing the hardware configuration of the image processing apparatus according to the first embodiment.
FIG. 4 is a schematic block diagram showing the functional configuration of the image processing apparatus according to each embodiment.
FIG. 5 is a schematic block diagram showing the functional configuration of the image processing apparatus according to the third embodiment.
FIG. 6 is a schematic block diagram showing the functional configuration of the image processing apparatus according to each embodiment.
FIG. 7 is a flowchart showing the processing executed by the image processing apparatus according to each embodiment.
FIG. 8 is a diagram showing an example of a subject model in the first embodiment.
FIG. 9 is a diagram showing an example of a subject model image in the first embodiment.
FIG. 10 is a diagram showing another example of a subject model image in the first embodiment.
FIG. 11 is a diagram explaining the details of the subject region extraction step in the present embodiment.
FIG. 12 is a diagram showing an example of a subject model coordinate system in the first embodiment.
FIG. 13 is a diagram showing an example of edge-based three-dimensional model fitting in the first embodiment.
FIG. 14 is a diagram explaining a method of calculating the position and orientation of a three-dimensional model using line segment information in the first embodiment.
FIG. 15 is a schematic block diagram showing the functional configuration of the registered object image setting unit according to the third embodiment.
FIG. 16 is a flowchart showing the processing performed by the registered object image setting unit according to the third embodiment.
FIG. 17 is a diagram showing an example of setting a registered object region in the third embodiment.

[First Embodiment]
The first embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 is a diagram showing a schematic configuration example of a system that includes the image processing apparatus according to the present embodiment. In the system of FIG. 1, a camera 10 and an image processing apparatus 20 are connected via a network 15. The image processing apparatus 20 and the camera 10 may be configured as a single unit.

The camera 10 captures an image to be processed by the image processing apparatus 20. FIG. 1 shows an example in which the camera 10 captures a scene 30 whose angle of view (shooting range) contains a tree 30a, a car 30b, a building 30c, the sky 30d, a road 30e, a human body 30f, and so on. The image processing apparatus 20 extracts each subject region in the scene 30 captured by the camera 10.

FIG. 2 is a diagram showing an example of an image from which a subject region has been extracted. FIG. 2(a) shows a captured identification target image 100, in which subjects 101 to 105 appear. The subject region extracted from the identification target image 100 is indicated by 200 in FIG. 2(b). FIG. 2(b) shows an example in which only the subject region 200 of the building 101 is extracted, but a plurality of subject regions may be extracted. FIG. 2(c) shows an example in which a plurality of subject regions 201 and 202 are extracted. The method of extracting the subject regions 200 to 202, and the processing applied to the extracted subject regions 200 to 202 and to the other regions 102 to 106, are described later.

FIG. 3 is a schematic block diagram showing an example of the hardware configuration of the image processing apparatus 20 according to the present embodiment. The CPU 401 controls the entire image processing apparatus 20. The functional configuration of the image processing apparatus 20 and the flowchart processing described later are realized when the CPU 401 executes programs stored in the ROM 403, the HD 404, or the like. The RAM 402 is a storage area that serves as a work area into which the CPU 401 loads programs for execution. The ROM 403 is a storage area that stores programs and other data executed by the CPU 401. The HD 404 is a storage area that stores the various programs the CPU 401 needs for processing and various data, including threshold data. The operation unit 405 accepts input operations from the user. The display unit 406 displays information from the image processing apparatus 20. The network I/F 407 connects the image processing apparatus 20 to external devices.

FIG. 4 is a schematic block diagram showing the functional configuration of the image processing apparatus 20 in each embodiment; FIG. 4(a) is the block diagram for the present embodiment. In addition to the functional blocks in the image processing apparatus 20, FIG. 4(a) also shows an imaging unit 500, which corresponds to the camera 10 and acquires the identification target image. The image processing apparatus 20 of the present embodiment includes a captured image acquisition unit 501, a photographing condition acquisition unit 502, a subject information acquisition unit 503, a subject model image generation unit 504, a subject region extraction unit 505, an image processing unit 506, and a subject information holding unit 507. The subject information holding unit 507 may instead be a non-volatile storage device connected to the image processing apparatus 20. The details of each of these functions are described later.

FIG. 7 is a flowchart outlining the processing executed by the image processing apparatus 20 in each embodiment; FIG. 7(a) is the flowchart for the present embodiment. In the captured image acquisition step S110 of FIG. 7(a), the captured image acquisition unit 501 receives the identification target image captured by the imaging unit 500 as input data. The acquired identification target image is sent to the subject model image generation unit 504 and the subject region extraction unit 505.

Next, in the photographing condition acquisition step S120, the photographing condition acquisition unit 502 acquires the photographing conditions corresponding to the identification target image captured by the imaging unit 500. The specific content of the photographing conditions is described later. The acquired photographing conditions are sent to the subject information acquisition unit 503 and the subject model image generation unit 504. Next, in the subject information acquisition step S130, a three-dimensional model of the subject, estimated from the position information and orientation information (direction and heading) of the imaging unit acquired in step S120, is acquired from the subject information holding unit 507 together with the surface reflection characteristic information (texture information) of each face. Details are described later. The acquired subject information, namely the three-dimensional model of the subject and the surface reflection characteristic information (texture information) of each face, is sent to the subject model image generation unit 504.

Next, in the subject model image generation step S140, a subject model image is generated based on the photographing conditions acquired by the photographing condition acquisition unit 502 and the subject information acquired by the subject information acquisition unit 503. The generation method is described later. The generated subject model image has the same size as the captured image acquired by the captured image acquisition unit 501 and depicts the same subject. The generated subject model image is sent to the subject region extraction unit 505. Next, in the subject region extraction step S150, the subject region in the captured image is extracted by fitting the subject model image generated by the subject model image generation unit 504 to the captured image acquired by the captured image acquisition unit 501. The specific method is described later. Information on the extracted subject region is sent to the image processing unit 506.

Finally, in the image processing step S160, image processing is performed based on the subject region extracted by the subject region extraction unit 505. The specific processing is described later. Through the above steps, the subject region in the captured image is extracted, and the image is processed based on that region and its contour.
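
The data flow across steps S110 to S160 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the apparatus: the class name `ExtractionPipeline` and its callable fields are hypothetical stand-ins for units 501 to 506, and each step is injected as a plain function.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ExtractionPipeline:
    """Hypothetical wiring of units 501-506; every field is an assumed callable."""
    acquire_image: Callable[[], Any]                      # S110, unit 501
    acquire_conditions: Callable[[Any], Any]              # S120, unit 502
    acquire_subject_info: Callable[[Any], Any]            # S130, unit 503
    generate_model_image: Callable[[Any, Any, Any], Any]  # S140, unit 504
    extract_region: Callable[[Any, Any], Any]             # S150, unit 505
    process_image: Callable[[Any, Any], Any]              # S160, unit 506

    def run(self) -> Any:
        image = self.acquire_image()
        conditions = self.acquire_conditions(image)
        subject_info = self.acquire_subject_info(conditions)
        # Unit 504 receives the image, the conditions, and the subject information.
        model_image = self.generate_model_image(image, conditions, subject_info)
        # Unit 505 fits the model image to the captured image.
        region = self.extract_region(image, model_image)
        return self.process_image(image, region)
```

Injecting each step as a callable keeps the sketch independent of any particular renderer or fitting method.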

Next, the specific flow of each step is described with reference to the flowchart shown in FIG. 7(a). In the captured image acquisition step S110, an image of a scene 30 such as that shown in FIG. 1, captured by the imaging unit 500, is acquired as the identification target image 100. The identification target image may instead be an image stored in an external device (not shown). In that case, the captured image acquisition unit 501 acquires the image read from the external device as the identification target image. The image stored in the external device may be, for example, an image captured in advance by the imaging unit 500 or the like, or an image that was obtained by another route, such as over a network, and then stored. In that case, information such as the photographing parameters at the time of capture (shooting information) must be stored in advance together with the captured image. The specific content of this information is described in the photographing condition acquisition step S120 below.

In the photographing condition acquisition step S120, the photographing conditions under which the imaging unit 500 captured a scene 30 such as that shown in FIG. 1 are acquired. The photographing conditions include the position information of the imaging unit 500 obtained from GPS information or the like, the orientation (direction and heading) of the imaging unit 500 when the identification target image 100 was captured, and photographing parameters such as the angle of view and the shutter speed. Light source information and environmental information at the time of capture may also be acquired. For example, values obtained by the camera's sensors at capture, such as photometric values, color temperature, and AF values, may be used. Environmental information such as weather and air temperature may also be obtained over a network using GPS information or the like. This information is used when generating the subject model image and allows a more detailed subject model image to be generated.

In the subject information acquisition step S130, subject information is acquired based on the photographing conditions acquired in the photographing condition acquisition step S120. Among the photographing conditions, the position information obtained from GPS information or the like is the main basis for acquiring subject information. Alternatively, the subject itself may hold the information, for example in an RFID tag, and the image processing apparatus may acquire the subject information by receiving it. This makes it possible, for example, to acquire subject information for a moving subject. In either case, subject information in the present embodiment is the information used in the subsequent subject model image generation step S140. FIG. 8 shows an example of subject models in the present embodiment. As indicated by 300 to 302 in FIG. 8(a), subject information represents, for example, a three-dimensional model of the subject, the color and texture information of each face, and the reflectance, transparency, and refractive index of each face. A subject model image under the current photographing conditions is generated from this information.
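
One way to picture the position-based lookup in step S130 is a nearest-neighbor query against the stored subject models. The sketch below is an assumption-laden illustration: the store layout (a list of dicts with `name`, `lat`, `lon`), the model names, the search radius, and the use of the haversine great-circle distance on a spherical Earth are all hypothetical choices, not details given in the description.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_models(store, lat, lon, radius_m):
    """Names of stored subject models within radius_m of the camera, nearest first."""
    hits = [(haversine_m(lat, lon, m["lat"], m["lon"]), m["name"]) for m in store]
    return [name for dist, name in sorted(hits) if dist <= radius_m]
```

A fuller version could also discard models that fall outside the angle of view implied by the camera heading.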

In the present embodiment a three-dimensional model of the subject is acquired, but since it is sufficient that a subject model image can be generated, model images of the subject as seen from each viewpoint may be held instead. For an articulated object such as a person or a character, as shown in FIG. 8(b), the model of each part may be held separately.

In the subject model image generation step S140, a subject model image is generated using the photographing conditions and subject information obtained in the photographing condition acquisition step S120 and the subject information acquisition step S130. A rendering technique such as ray tracing may be used for generation. By using the position information and orientation of the imaging device, the angle of view and other photographing parameters, and information about the environment in which the imaging device and the subject are located, a subject model image close to the captured image can be generated. FIGS. 9 and 10 show examples of subject model images in the present embodiment. For subject models such as those shown in FIG. 8(a), subject model images such as those shown in FIGS. 9(a) and 9(b) are generated. For a subject model of an articulated object such as that shown in FIG. 8(b), the joint angles between parts are parameters in addition to the position information, orientation, and photographing parameters of the imaging device, so a plurality of subject model images are generated, as shown in FIGS. 10(a) to 10(d). These may be generated at random, or the postures the subject commonly takes (the frequency of each joint angle) may be stored in advance as subject model information and used as the basis for generation. In FIGS. 9 and 10 a subject model image is generated from a single subject model, but a subject model image may also be generated from a plurality of models.
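
As a rough illustration of how the camera position, heading, and angle of view determine where a model vertex lands in the model image, the following sketch projects a world point through a pinhole camera. It is far simpler than ray tracing and rests on several assumptions that are not in the description: the `Camera` class and its field names are hypothetical, world coordinates are local east/north/up metres, the heading is measured clockwise from north, and the camera is level (no pitch or roll).

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    east: float          # camera position in a local east/north/up frame (assumption)
    north: float
    up: float
    azimuth_deg: float   # heading, clockwise from north
    hfov_deg: float      # horizontal angle of view
    width: int           # image size in pixels
    height: int

    def project(self, point):
        """Return pixel (u, v) of a world point, or None if it is not in front of the camera."""
        psi = math.radians(self.azimuth_deg)
        dx = point[0] - self.east
        dy = point[1] - self.north
        dz = point[2] - self.up
        depth = dx * math.sin(psi) + dy * math.cos(psi)   # along the viewing direction
        right = dx * math.cos(psi) - dy * math.sin(psi)   # toward the camera's right
        if depth <= 0:
            return None
        # Focal length in pixels derived from the horizontal angle of view.
        focal = (self.width / 2) / math.tan(math.radians(self.hfov_deg) / 2)
        return (self.width / 2 + focal * right / depth,
                self.height / 2 - focal * dz / depth)
```

Projecting every vertex of a subject model this way and joining the projected vertices yields a wireframe at the same size as the captured image, which is enough for the edge-based fitting described next.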

In the subject region extraction step S150, the subject region is extracted by fitting the subject model image generated in the subject model image generation step S140 to the captured image acquired in the captured image acquisition step S110. FIG. 11 is a diagram explaining the details of the subject region extraction step in the present embodiment. As shown in FIG. 11, the subject region is extracted by fitting the contour 600 of the subject model, obtained from the subject model image, to the captured image 100.

In the present embodiment, the subject region is extracted by edge-based three-dimensional model fitting. This method yields, together with the subject region, the three-dimensional position and orientation in a predefined coordinate system. FIG. 12 shows an example of the subject model coordinate system in the present embodiment. As shown in FIG. 12, a subject model coordinate system 304 whose origin is a certain position on the subject model 300 is defined, and the three-dimensional position and orientation of the camera 10 in the subject model coordinate system 304 are calculated. The present embodiment is described with the subject 101 as the only subject, but the fitting may be performed for a plurality of subjects, or it may be performed for one subject and the other subject regions extracted using their position and orientation relative to that subject.

FIG. 13 is a diagram showing an example of edge-based three-dimensional model fitting in the present embodiment. As shown in FIG. 13, edge-based three-dimensional model fitting processes each division point obtained by dividing each edge of the model at equal intervals. The three-dimensional model is projected onto the image plane, and the processing is performed on the image plane. At each division point 52, an edge of the object being measured is searched for along a line segment (hereinafter, search line 53) that passes through the division point and is perpendicular to the projected line 51 of the three-dimensional model. Once all division points have been processed, the position and orientation of the three-dimensional model are calculated. Let N_C denote the total number of division points for which a corresponding point was found. The position and orientation of the three-dimensional model are calculated by correcting them through iterative computation.
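
The division-point search can be sketched as below. This is a minimal illustration under assumptions that go beyond the description: the image is a grayscale list of rows, `theta` is the inclination of the projected line relative to the horizontal image axis so that the search line runs along (sin θ, −cos θ), and an "edge" is simply taken to be the largest intensity step between consecutive samples. The function names are hypothetical.

```python
import math


def division_points(p0, p1, n):
    """Return n equally spaced points strictly inside the projected segment p0-p1."""
    return [(p0[0] + (p1[0] - p0[0]) * t / (n + 1),
             p0[1] + (p1[1] - p0[1]) * t / (n + 1)) for t in range(1, n + 1)]


def search_edge(image, point, theta, half_len):
    """Walk the search line through `point` and return the pixel (u, v) just after
    the largest intensity step, or None if no step is found."""
    nu, nv = math.sin(theta), -math.cos(theta)   # direction normal to the projected line
    height, width = len(image), len(image[0])
    best, best_step, prev = None, 0.0, None
    for s in range(-half_len, half_len + 1):
        u = round(point[0] + s * nu)
        v = round(point[1] + s * nv)
        if not (0 <= u < width and 0 <= v < height):
            prev = None                          # left the image; restart the comparison
            continue
        value = image[v][u]
        if prev is not None and abs(value - prev) > best_step:
            best, best_step = (u, v), abs(value - prev)
        prev = value
    return best
```

Division points for which `search_edge` returns a pixel are the ones counted in N_C.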

Next, a method of calculating the position and orientation of the three-dimensional model using line segment (edge) information is described. FIG. 14 is a diagram explaining this method. In FIG. 14, the horizontal direction of the image is taken as the u axis and the vertical direction as the v axis. The coordinates of a division point 52 are denoted (u_k, v_k), and the inclination on the image of the line segment 51 to which the division point belongs is denoted by the angle θ relative to the u axis. The normal vector of the line segment 51 is (sin θ, −cos θ). The coordinates of the corresponding point 60, which lies on an edge of the object being measured and corresponds to the division point 52, are denoted (u'_k, v'_k). The distance from the division point 52 to the corresponding point 60 is calculated as follows. A point (u, v) on the line segment 51 satisfies Formula 1:

\[ u \sin\theta - v \cos\theta = r \tag{1} \]

where

\[ r = u_k \sin\theta - v_k \cos\theta \tag{2} \]

A point (u, v) on the straight line that passes through the corresponding point 60 and is parallel to the line segment 51 satisfies Formula 3:

\[ u \sin\theta - v \cos\theta = d \tag{3} \]

where

\[ d = u'_k \sin\theta - v'_k \cos\theta \tag{4} \]

The distance from the division point 52 to the corresponding point 60 is therefore d − r. The image coordinates of the division point 52 are a function of the position and orientation of the three-dimensional model. The position and orientation of the three-dimensional model have six degrees of freedom. Let p denote the parameter representing the position and orientation of the three-dimensional model. p is a six-dimensional vector consisting of three elements representing the position of the three-dimensional model and three elements representing its orientation. The three orientation elements are expressed, for example, as Euler angles, or as a three-dimensional vector whose direction represents the rotation axis and whose magnitude represents the rotation angle.
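
The two quantities introduced above can be made concrete with a short sketch. It assumes plain tuples for points and a nested-list rotation matrix; the function names are hypothetical. The first function evaluates the distance d − r directly from its definition, and the second converts the axis-angle form of the orientation (direction is the rotation axis, magnitude is the rotation angle) into a rotation matrix using Rodrigues' formula, which is one standard way to interpret such a vector.

```python
import math


def signed_distance(theta, division_pt, corresponding_pt):
    """Distance d - r between a division point and its corresponding point,
    measured along the normal (sin θ, -cos θ) of the projected line."""
    r = division_pt[0] * math.sin(theta) - division_pt[1] * math.cos(theta)
    d = corresponding_pt[0] * math.sin(theta) - corresponding_pt[1] * math.cos(theta)
    return d - r


def rotation_matrix(rvec):
    """3x3 rotation matrix for an axis-angle vector (Rodrigues' formula)."""
    angle = math.sqrt(sum(c * c for c in rvec))
    if angle < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (c / angle for c in rvec)          # unit rotation axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [[t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
            [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
            [t * x * z - s * y, t * y * z + s * x, t * z * z + c]]
```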

Taking (u, v) to be the image coordinates of the division point 52 and approximating them by a first-order Taylor expansion in the neighborhood of (u_k, v_k) gives Formula 5:

\[ u \approx u_k + \sum_{i=1}^{6} \frac{\partial u}{\partial p_i} \Delta p_i, \qquad v \approx v_k + \sum_{i=1}^{6} \frac{\partial v}{\partial p_i} \Delta p_i \tag{5} \]

Here the partial derivatives ∂u/∂p_i and ∂v/∂p_i form the image Jacobian.

The correction Δp to the position and orientation parameter p of the three-dimensional model is calculated so that the point (u, v) given by Formula 5 lies on the straight line given by Formula 3. Substituting Formula 5 into Formula 3 gives Formula 6:

\[ \left( u_k + \sum_{i=1}^{6} \frac{\partial u}{\partial p_i} \Delta p_i \right) \sin\theta - \left( v_k + \sum_{i=1}^{6} \frac{\partial v}{\partial p_i} \Delta p_i \right) \cos\theta = d \tag{6} \]

Rearranging Formula 6, with r = u_k sin θ − v_k cos θ, gives Formula 7:

\[ \sin\theta \sum_{i=1}^{6} \frac{\partial u}{\partial p_i} \Delta p_i - \cos\theta \sum_{i=1}^{6} \frac{\partial v}{\partial p_i} \Delta p_i = d - r \tag{7} \]

Formula 7 holds for each of the N_C division points, so a system of linear equations in Δp is obtained as Formula 8, where the subscripts 1 to N_C on θ, u, v, d, and r index the division points:

\[
\begin{bmatrix}
\sin\theta_1 \frac{\partial u_1}{\partial p_1} - \cos\theta_1 \frac{\partial v_1}{\partial p_1} & \cdots & \sin\theta_1 \frac{\partial u_1}{\partial p_6} - \cos\theta_1 \frac{\partial v_1}{\partial p_6} \\
\vdots & \ddots & \vdots \\
\sin\theta_{N_C} \frac{\partial u_{N_C}}{\partial p_1} - \cos\theta_{N_C} \frac{\partial v_{N_C}}{\partial p_1} & \cdots & \sin\theta_{N_C} \frac{\partial u_{N_C}}{\partial p_6} - \cos\theta_{N_C} \frac{\partial v_{N_C}}{\partial p_6}
\end{bmatrix}
\begin{bmatrix} \Delta p_1 \\ \vdots \\ \Delta p_6 \end{bmatrix}
=
\begin{bmatrix} d_1 - r_1 \\ \vdots \\ d_{N_C} - r_{N_C} \end{bmatrix}
\tag{8}
\]

Formula 8 is written compactly as Formula 9:

\[ J \Delta p = E \tag{9} \]

From Formula 9, Δp is obtained using the generalized inverse (J^T J)^{-1} J^T of the matrix J:

\[ \Delta p = (J^{\mathsf T} J)^{-1} J^{\mathsf T} E \]

The calculation is repeated until Δp falls to or below a threshold, after which the subject region is extracted. When a plurality of subject model images have been generated in the subject model image generation step S140, each subject model image is fitted to the captured image, and the subject region for which Δp is smallest is taken as the optimal subject region. For an articulated object, three-dimensional model fitting may be performed part by part. Alternatively, three-dimensional model fitting may be performed for each of a plurality of subject model images, as in FIGS. 10(a) to 10(d), and the subject region for which Δp is smallest taken as the optimal subject region.
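
The iterative correction can be sketched as follows, assuming the matrix J and the vector E have already been assembled for the current pose. The sketch solves the normal equations (JᵀJ) Δp = Jᵀ E by Gaussian elimination using only the standard library. The function names, the `build_system` callback, and the default threshold and iteration limit are hypothetical, and a production implementation would more likely call a numerical library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                        # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def pose_update(jacobian, residuals):
    """Least-squares correction: solve (J^T J) dp = J^T E for dp."""
    cols = len(jacobian[0])
    jtj = [[sum(row[i] * row[j] for row in jacobian) for j in range(cols)]
           for i in range(cols)]
    jte = [sum(row[i] * e for row, e in zip(jacobian, residuals)) for i in range(cols)]
    return solve_linear(jtj, jte)


def refine_pose(p, build_system, threshold=1e-6, max_iter=50):
    """Apply corrections until every component of dp is at or below the threshold.
    build_system(p) is an assumed callback returning (J, E) for the pose p."""
    for _ in range(max_iter):
        jacobian, residuals = build_system(p)
        dp = pose_update(jacobian, residuals)
        p = [pi + di for pi, di in zip(p, dp)]
        if max(abs(d) for d in dp) <= threshold:
            break
    return p
```

In the fitting described above, each row of J and each entry of E would come from one division point, so J has N_C rows and six columns.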

In the present embodiment, only the regions of subjects for which a subject model is held are extracted, but other regions may also be extracted. In that case, region segmentation and category identification may be performed using region segmentation methods such as those disclosed in Non-Patent Documents 1 and 2. Alternatively, the class of each small region of the identification target image may be determined from similar images selected by a similar-image search based on global feature quantities of the identification target image, as disclosed in Non-Patent Document 3. The search for similar images may use the camera position information acquired in the photographing condition acquisition step S120 or the subject information acquired in the subject information acquisition step S130.

本実施形態では、被写体領域を抽出するためにエッジベースの３次元モデルフィッティングを用いて行ったが、被写体領域を前景、それ以外の領域を背景としてグラフカットを行い被写体領域を抽出する方法を用いてもよい。また、エッジ以外の画像特徴量を用いたフィッティングを行ってもよい。 In the present embodiment, edge-based three-dimensional model fitting is used to extract the subject area, but the subject area may instead be extracted by a graph cut that treats the subject area as foreground and all other areas as background. Fitting may also be performed using image features other than edges.

画像処理工程Ｓ１６０では、被写体領域抽出工程Ｓ１５０において抽出された被写体領域を用いて撮影画像に対して高画質化処理を行う。具体的な処理について、図２（ｂ）のように被写体１０１に対して被写体領域２００が抽出された場合で説明する。被写体領域２００が抽出された被写体１０１に対しては被写体モデルの各面の色情報を用いて、色調補正やホワイトバランスの調整、ノイズの調整などを行う。また、被写体１０１以外の被写体１０６などの不要物に対しては不要物除去やぼかし処理などを行う。不要物除去を行う場合には被写体モデル画像を用いて背景を補間する。また、図２（ｃ）のように複数の被写体領域２０１、２０２を抽出した場合には、各被写体領域に対して別々の処理を行うこともできる。 In the image processing step S160, image quality enhancement processing is performed on the photographed image using the subject area extracted in the subject area extraction step S150. The processing is described for the case where the subject area 200 has been extracted for the subject 101, as shown in FIG. 2B. For the subject 101, color tone correction, white balance adjustment, noise adjustment, and the like are performed using the color information of each surface of the subject model. For unnecessary objects other than the subject 101, such as the subject 106, processing such as unnecessary object removal or blurring is performed. When unnecessary objects are removed, the background is interpolated using the subject model image. When a plurality of subject areas 201 and 202 are extracted as shown in FIG. 2C, separate processing can be performed on each subject area.

また、先に述べたように被写体モデルを保持している被写体以外の領域も抽出した場合には、そのカテゴリや領域に従って高画質化処理を行ってもよい。例えば、図２（ｃ）の場合には人物１０３、木１０６が抽出出来た場合には、人物や木に合わせた高画質化処理を行えばよい。 Further, as described above, when an area other than the subject holding the subject model is also extracted, the high image quality processing may be performed according to the category or the area. For example, in the case of FIG. 2C, when the person 103 and the tree 106 can be extracted, the high image quality processing may be performed according to the person and the tree.

以上のように、本実施形態によれば、画像処理装置２０は撮影時の撮影条件から被写体の３次元モデルを取得し、被写体モデル画像を生成する。生成された被写体モデル画像と撮影画像をフィッティングさせることで撮影画像中の被写体領域を抽出し、その被写体領域に基づいて、領域ごとに高画質化処理を行うことができるようになる。被写体の３次元モデルと撮影条件から被写体モデル画像を生成することによって、精度よく識別対象画像内の被写体領域を抽出することができる。 As described above, according to the present embodiment, the image processing apparatus 20 acquires a three-dimensional model of an object from the imaging conditions at the time of imaging, and generates an object model image. By fitting the generated object model image and the photographed image, the object region in the photographed image can be extracted, and the image quality improvement processing can be performed for each region based on the object region. By generating the subject model image from the three-dimensional model of the subject and the shooting conditions, it is possible to accurately extract the subject region in the identification target image.

［第２の実施形態］ Second Embodiment
次に、本発明の第２の実施形態について説明する。本実施形態では、一度生成された被写体モデル画像と撮影画像をフィッティングさせることで撮影画像中の被写体領域を抽出したあと、被写体領域の抽出結果より撮影部の位置・方位もしくは光源情報、環境情報の再推定を行う。そして、再推定されたそれらの情報から再度被写体モデル画像を生成し、撮影画像とフィッティングさせることで撮影画像中の被写体領域を精度よく推定する。以下、図面を参照しつつ、本発明の第２の実施形態について説明する。なお、第１の実施形態で既に説明をした構成については、その説明を省略し、同一の符号を付す。 Next, a second embodiment of the present invention will be described. In the present embodiment, the subject region in the photographed image is first extracted by fitting the generated subject model image to the photographed image, and then the position and orientation of the imaging unit, or the light source information and environment information, are re-estimated from the extraction result. A subject model image is then generated again from the re-estimated information and fitted to the photographed image, so that the subject region in the photographed image is estimated accurately. Hereinafter, the second embodiment of the present invention will be described with reference to the drawings. Components already described in the first embodiment are given the same reference numerals, and their description is omitted.

図４（ｂ）に、本実施形態における画像処理装置２０の機能構成を示す。第１の実施形態と異なる点は、撮影条件再推定部５０８、被写体領域再抽出部５０９が追加されていることである。 FIG. 4B shows a functional configuration of the image processing apparatus 20 in the present embodiment. The difference from the first embodiment is that a photographing condition re-estimation unit 508 and a subject region re-extraction unit 509 are added.

第１の実施形態では撮影条件および被写体モデルを用いて、被写体モデル画像を１つないし複数作成して被写体モデル画像と撮影画像をフィッティングさせることで被写体領域を抽出していた。一方、本実施形態ではフィッティング結果をフィードバックして、撮影条件を再推定し、再度被写体領域を抽出する。この方法によりＧＰＳなどが得られる位置情報やカメラのセンサ情報などから得られる環境情報が正確ではない場合や、粗い場合にも位置情報などの撮影情報を正確に推定し、被写体領域も高精度に抽出することができる。 In the first embodiment, one or more subject model images are created using the shooting conditions and the subject model, and the subject region is extracted by fitting the subject model images to the photographed image. In the present embodiment, by contrast, the fitting result is fed back to re-estimate the shooting conditions, and the subject region is extracted again. With this method, even when the position information obtained from GPS or the like, or the environment information obtained from camera sensor information or the like, is inaccurate or coarse, shooting information such as position information can be estimated accurately and the subject region can be extracted with high accuracy.

図７（ｂ）は、本実施形態における画像処理装置による処理を示すフローチャートである。本実施形態の撮影画像取得工程Ｓ２１０〜被写体領域抽出工程Ｓ２５０、画像処理工程Ｓ２８０における処理内容は、第１の実施形態における撮影画像取得工程Ｓ１１０〜被写体領域抽出工程Ｓ１５０、画像処理工程Ｓ１６０と同様である。本実施形態においてはそれ以外に撮影条件再推定工程Ｓ２６０、被写体領域再抽出工程Ｓ２７０が追加されている。 FIG. 7B is a flowchart showing processing by the image processing apparatus in the present embodiment. The processing in the photographed image acquisition step S210 to the subject area extraction step S250 and in the image processing step S280 of the present embodiment is the same as that in the photographed image acquisition step S110 to the subject area extraction step S150 and in the image processing step S160 of the first embodiment. In the present embodiment, a photographing condition re-estimation step S260 and a subject region re-extraction step S270 are added.

撮影条件再推定工程Ｓ２６０では、撮影条件再推定部５０８が被写体領域抽出工程Ｓ２５０において抽出された被写体領域推定結果から撮影条件を再推定する。再推定された撮影条件は被写体領域再抽出部５０９に送信される。次に被写体領域再抽出工程Ｓ２７０では、被写体領域再抽出部５０９が撮影条件再推定工程Ｓ２６０において再推定された撮影条件から被写体モデル画像を再生成して、被写体モデル画像と撮影画像を用いて被写体領域を抽出する。抽出された被写体領域は画像処理部５０６に送信される。 In the photographing condition re-estimation step S260, the photographing condition re-estimation unit 508 re-estimates the photographing conditions from the subject region estimation result obtained in the subject region extraction step S250. The re-estimated photographing conditions are sent to the subject region re-extraction unit 509. Next, in the subject region re-extraction step S270, the subject region re-extraction unit 509 regenerates the subject model image from the photographing conditions re-estimated in the photographing condition re-estimation step S260, and extracts the subject region using the subject model image and the photographed image. The extracted subject region is sent to the image processing unit 506.

次に、撮影条件再推定工程Ｓ２６０、被写体領域再抽出工程Ｓ２７０の具体的な処理内容を説明する。撮影条件再推定工程Ｓ２６０では、被写体領域抽出工程Ｓ２５０において抽出された被写体領域を抽出した結果から撮影条件を再推定する。具体的には、図１２で示したように被写体モデルの３次元的な位置を原点とする被写体モデル座標におけるカメラの位置・姿勢を用いる。推定された被写体モデル座標におけるカメラの位置・姿勢から、ＧＰＳ情報などから得られる位置情報に変換するために予め被写体の世界座標系における位置・姿勢情報は得ておく。得られる世界座標系におけるカメラの位置姿勢は、撮影条件取得工程Ｓ２２０において取得されたカメラの位置・姿勢より高精度な位置・姿勢となる。また、被写体の各面のテクスチャ情報も保持しておけば、撮影画像の各面のテクスチャおよび先の世界座標系におけるカメラの位置姿勢より、撮影条件取得工程Ｓ２２０において取得された環境情報より高精度な環境情報、光源情報を取得することができる。 Next, the specific processing of the photographing condition re-estimation step S260 and the subject region re-extraction step S270 will be described. In the photographing condition re-estimation step S260, the photographing conditions are re-estimated from the subject region extracted in the subject region extraction step S250. Specifically, as shown in FIG. 12, the position and orientation of the camera in subject model coordinates, whose origin is the three-dimensional position of the subject model, are used. So that the estimated camera position and orientation in subject model coordinates can be converted into position information of the kind obtained from GPS information or the like, the position and orientation of the subject in the world coordinate system are obtained in advance. The resulting camera position and orientation in the world coordinate system are more accurate than those acquired in the photographing condition acquisition step S220. Furthermore, if texture information for each surface of the subject is also held, environment information and light source information more accurate than the environment information acquired in the photographing condition acquisition step S220 can be obtained from the texture of each surface in the photographed image and the camera position and orientation in the world coordinate system described above.
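The coordinate conversion described above amounts to chaining two rigid transforms: the pose of the subject in the world coordinate system (known in advance) and the pose of the camera in subject model coordinates (obtained from fitting). The following is a minimal sketch under simplifying assumptions: poses are 4x4 homogeneous matrices, rotation is restricted to the vertical axis for brevity, and all function names are hypothetical rather than taken from the source.

```python
import math

def rigid_transform(yaw_deg, translation):
    """Build a 4x4 homogeneous transform: rotation about the z axis, then translation."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def compose(a, b):
    """Matrix product a * b of two 4x4 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def camera_pose_in_world(world_from_subject, subject_from_camera):
    """Camera pose in world coordinates from the two known poses."""
    return compose(world_from_subject, subject_from_camera)

def position_of(pose):
    """Translation part of a 4x4 pose."""
    return tuple(pose[i][3] for i in range(3))
```

For example, a subject placed at (100, 50, 0) and rotated 90 degrees, with the camera 10 units along the subject's x axis, yields a camera position of approximately (100, 60, 0) in world coordinates.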

被写体領域再抽出工程Ｓ２７０では、撮影条件再推定工程Ｓ２６０において再推定された撮影条件を用いて、再度被写体モデル画像を生成し、被写体モデル画像と撮影画像を用いて再度３次元モデルフィッティングを行う。それにより、被写体領域抽出工程Ｓ２５０において抽出された被写体領域よりも高精度な被写体領域を抽出することができる。 In the subject region re-extracting step S270, the subject model image is generated again using the photographing conditions re-estimated in the photographing condition re-estimating step S260, and three-dimensional model fitting is performed again using the subject model image and the photographed image. Thus, it is possible to extract a subject area with higher accuracy than the subject area extracted in the subject area extraction step S250.
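The feedback flow of this embodiment (extract, re-estimate, regenerate, re-extract) can be summarized as the control-flow sketch below. The three callables `render`, `fit`, and `reestimate` are placeholders for the model image generation, three-dimensional model fitting, and condition re-estimation described in the text; their signatures are assumptions made for illustration only.

```python
def extract_with_refinement(image, conditions, render, fit, reestimate):
    """One feedback pass: fit, re-estimate the shooting conditions, then fit again.

    render(conditions) -> model image
    fit(model_image, image) -> (region, pose)
    reestimate(conditions, region, pose) -> refined conditions
    """
    # Initial model image and extraction (corresponds to step S250).
    model_image = render(conditions)
    region, pose = fit(model_image, image)

    # Re-estimate the shooting conditions from the fitting result (step S260).
    refined = reestimate(conditions, region, pose)

    # Regenerate the model image and extract the region again (step S270).
    model_image = render(refined)
    region, pose = fit(model_image, image)
    return region, refined
```

Passing the stages in as functions keeps the sketch independent of any particular renderer or fitting method.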

以上のように、本実施形態によれば、画像処理装置２０は撮影時の撮影条件から被写体の３次元モデルを取得し、被写体モデル画像を生成する。生成された被写体モデル画像と撮影画像をフィッティングさせることで撮影画像中の被写体領域を抽出し、その被写体領域に基づいて、再度撮影条件を推定する。再度推定された撮影条件から被写体モデル画像を再度生成して、撮影画像とフィッティングすることで精度よく識別対象画像内の被写体領域を抽出することができる。 As described above, according to the present embodiment, the image processing apparatus 20 acquires a three-dimensional model of an object from the imaging conditions at the time of imaging, and generates an object model image. A subject area in the photographed image is extracted by fitting the generated subject model image and the photographed image, and the photographing condition is estimated again based on the subject area. The subject region in the identification target image can be extracted with high accuracy by regenerating the subject model image from the shooting condition re-estimated and fitting it with the shooting image.

［第３の実施形態］ Third Embodiment
次に、本発明の第３の実施形態について説明する。本実施形態では、画像処理装置２０は被写体モデルを用いた被写体領域の他に、予めユーザが登録した登録物体を検出する。被写体領域の他に登録物体を検出することで、被写体領域と登録物体の両方を利用して画像処理を行うことや撮影アシストが可能になる。以下、図面を参照しつつ、本発明の第３の実施形態について説明する。なお、第１、第２の実施形態で既に説明をした構成については、その説明を省略し、同一の符号を付す。 Next, a third embodiment of the present invention will be described. In the present embodiment, the image processing apparatus 20 detects a registered object registered in advance by the user, in addition to the subject area obtained using the subject model. Detecting a registered object in addition to the subject area makes it possible to perform image processing and imaging assistance using both the subject area and the registered object. Hereinafter, the third embodiment of the present invention will be described with reference to the drawings. Components already described in the first and second embodiments are given the same reference numerals, and their description is omitted.

図５は、本実施形態における画像処理装置２０の機能構成を示す概略ブロック図である。図５（ａ）は画像処理を行う際のブロック図であり、図５（ｂ）は撮影アシストを行う際のブロック図である。図５（ａ）では、第１の実施形態の構成に加えて、登録物体検出部５１０が追加され、図５（ｂ）では、登録物体検出部５１０が追加され、画像処理部５０６の代わりに撮影アシスト部５１１が追加されている。第１、第２の実施形態においては予め被写体情報保持部に記憶されている被写体の情報を用いて被写体領域を抽出する。これに対し、本実施形態においてはユーザがそれ以外の被写体に関する情報を予め登録しておくことで、撮影時に登録された物体に関しても検出を行うことができる。 FIG. 5 is a schematic block diagram showing the functional configuration of the image processing apparatus 20 in the present embodiment. FIG. 5A is the block diagram for performing image processing, and FIG. 5B is the block diagram for performing imaging assistance. In FIG. 5A, a registered object detection unit 510 is added to the configuration of the first embodiment. In FIG. 5B, the registered object detection unit 510 is added and a photographing assist unit 511 is provided in place of the image processing unit 506. In the first and second embodiments, the subject area is extracted using subject information stored in advance in the subject information holding unit. In the present embodiment, by contrast, the user registers information on other subjects in advance, so that the registered objects can also be detected at the time of shooting.

図７（ｃ）、（ｄ）は、本実施形態における画像処理装置２０により実行される処理を示すフローチャートである。図７（ｃ）は被写体領域と登録物体の両方を利用して画像処理を行う際のフローチャートであり、図７（ｄ）は撮影アシストを行う際のフローチャートである。図７（ｃ）、（ｄ）において、撮影画像取得工程Ｓ３１０〜被写体領域抽出工程Ｓ３５０における処理内容は、第１の実施形態の撮影画像取得工程Ｓ１１０〜被写体領域抽出工程Ｓ１５０と同様である。図７（ｃ）では、第１の実施形態の処理に加えて登録物体検出工程Ｓ３６０が追加され、画像処理工程Ｓ３７０の処理内容が異なっている。一方、図７（ｄ）では、第１の実施形態の処理に加えて登録物体検出工程Ｓ３６０が追加され、画像処理工程Ｓ３７０の代わりに撮影アシスト工程Ｓ３８０が追加されている。 FIGS. 7C and 7D are flowcharts showing processing executed by the image processing apparatus 20 in the present embodiment. FIG. 7C is the flowchart for performing image processing using both the subject area and the registered object, and FIG. 7D is the flowchart for performing imaging assistance. In FIGS. 7C and 7D, the processing in the photographed image acquisition step S310 to the subject area extraction step S350 is the same as that in the photographed image acquisition step S110 to the subject area extraction step S150 of the first embodiment. In FIG. 7C, a registered object detection step S360 is added to the processing of the first embodiment, and the processing content of the image processing step S370 differs. In FIG. 7D, the registered object detection step S360 is added to the processing of the first embodiment, and a photographing assist step S380 is provided in place of the image processing step S370.

図７（ｃ）のフローチャートにおいて、登録物体検出工程Ｓ３６０では、登録物体検出部５１０が登録物体情報保持部に保持されている登録物体の情報を用いて撮影画像中の登録物体を検出する。検出された登録物体の検出位置および領域情報は画像処理部５０６に送信される。画像処理工程Ｓ３７０では、被写体領域抽出工程Ｓ３５０で抽出された被写体領域および登録物体検出工程Ｓ３６０で検出された登録物体の検出情報に基づいて高画質化処理を行う。具体的な処理内容については後述する。 In the flowchart of FIG. 7C, in the registered object detection step S360, the registered object detection unit 510 detects the registered object in the photographed image using the registered object information held in the registered object information holding unit. The detected position and area information of the registered object is sent to the image processing unit 506. In the image processing step S370, image quality enhancement processing is performed based on the subject area extracted in the subject area extraction step S350 and the detection information of the registered object detected in the registered object detection step S360. The specific processing is described later.

図７（ｄ）の場合には、登録物体検出部５１０における処理内容は図７（ｃ）と同様であるが、画像処理工程Ｓ３７０の代わりに撮影アシスト工程Ｓ３８０が追加されている。撮影アシスト工程Ｓ３８０では、撮影アシスト部５１１は被写体領域抽出工程Ｓ３５０で抽出された被写体領域および登録物体検出工程Ｓ３６０で検出された登録物体の検出情報に基づいてユーザに対して撮影アシストを行う。 In the case of FIG. 7D, the processing content in the registered object detection unit 510 is the same as that of FIG. 7C, but a photographing assist step S380 is added instead of the image processing step S370. In the shooting assistance step S380, the shooting assistance unit 511 performs shooting assistance for the user based on the subject region extracted in the subject region extraction step S350 and the detection information of the registered object detected in the registered object detection step S360.

次に、図７（ｃ）のフローチャートに従い、登録物体検出工程Ｓ３６０および画像処理工程Ｓ３７０の処理内容をより具体的に説明する。登録物体検出工程Ｓ３６０では、ユーザが予め登録していた物体や人物を検出する。例えば、ユーザが子供の写真を登録すれば撮影した撮影画像から被写体モデルを用いて抽出する被写体以外に、子供領域を検出する。 Next, the processing of the registered object detection step S360 and the image processing step S370 will be described more specifically with reference to the flowchart of FIG. 7C. In the registered object detection step S360, an object or person registered in advance by the user is detected. For example, if the user registers a photograph of a child, a child area is detected in the photographed image in addition to the subject extracted using the subject model.

ここで、先に登録物体の登録方法について説明する。図１５に、本実施形態における登録物体画像設定部２１の機能構成の概略ブロック図を示す。また、登録物体画像設定部２１による処理を示すフローチャートを図１６に示す。登録物体画像取得工程Ｔ１１０では、撮影部５００で撮影された登録物体画像を取得する。ここでは、撮影部によって撮影することによって登録物体画像を取得することにしているが、予め撮影部によって撮影しておいた画像を登録してもよい。また登録物体検出精度を向上するため複数枚登録することが望ましい。次に、登録物体情報設定工程Ｔ１２０では、登録物体画像取得工程Ｔ１１０で取得された登録物体の情報を設定する。例えば、ユーザが子供の写真を登録する場合には、ユーザインターフェースなどを用いて“子供”というタグを付与する。その他にも“犬”や“車”などのタグを付与する。図１７は、本実施形態における登録物体領域の設定の例を示す図である。図１７（ａ）、（ｂ）に示すように、ユーザがその登録物体７０１に対して登録物体領域７０２を設定してもよい。また、登録物体情報保持部５１５に登録物体の画像を保持してもよいが、登録された登録物体の画像もしくは登録物体領域から特徴量を取得して保持しておけばよい。特徴量の例としては画像（領域）内の色特徴やテクスチャ特徴の統計量を用いればよい。例えば、
・ＲＧＢ、ＨＳＶ、Ｌａｂ、ＹＣｂＣｒ色空間の各成分
・Ｇａｂｏｒｆｉｌｔｅｒ、ＬｏＧのフィルタ応答
を用いるとする。色特徴は４（色空間）×３（成分）の１２次元となる。また、フィルタ応答に関してはＧａｂｏｒｆｉｌｔｅｒ、ＬｏＧフィルタの数に対応した次元数となる。画像（領域）の特徴づけを行うため、画像（領域）の内の画素ごとに得られる特徴量から統計量を求める。用いる統計量は、平均、標準偏差、歪度、尖度の４つを用いるとする。歪度は分布の非対称性の度合いを示し、尖度は分布が平均の近くに密集している度合いを示す統計量である。よって、色特徴は４（色空間）×３（成分）×４（統計量）の４８次元となり、テクスチャ特徴の次元数は（フィルタ応答数）×４（統計量）となる。この特徴量と登録物体情報を登録物体情報保持部に保持しておけばよい。

Here, the method of registering a registered object will be described first. FIG. 15 shows a schematic block diagram of the functional configuration of the registered object image setting unit 21 in the present embodiment, and FIG. 16 shows a flowchart of the processing performed by the registered object image setting unit 21. In the registered object image acquisition step T110, a registered object image photographed by the photographing unit 500 is acquired. Here the registered object image is acquired by photographing with the photographing unit, but an image photographed in advance may be registered instead. To improve the detection accuracy for the registered object, it is desirable to register a plurality of images. Next, in the registered object information setting step T120, information on the registered object acquired in the registered object image acquisition step T110 is set. For example, when the user registers a photograph of a child, the tag "child" is attached using a user interface or the like. Tags such as "dog" and "car" may be attached in the same way. FIG. 17 is a diagram showing an example of setting a registered object area in the present embodiment. As shown in FIGS. 17A and 17B, the user may set a registered object area 702 for the registered object 701. The registered object information holding unit 515 may hold the image of the registered object itself, but it is sufficient to acquire feature amounts from the registered image or the registered object area and hold those. As the feature amounts, statistics of color features and texture features within the image (area) may be used. For example, suppose that the following are used:
・Each component of the RGB, HSV, Lab, and YCbCr color spaces
・Filter responses of Gabor filters and LoG (Laplacian of Gaussian) filters
The color feature then has 4 (color spaces) × 3 (components) = 12 dimensions, and the filter responses have as many dimensions as there are Gabor and LoG filters. To characterize the image (area), statistics are computed from the feature values obtained for each pixel in the image (area). Four statistics are used: mean, standard deviation, skewness, and kurtosis. Skewness indicates the degree of asymmetry of the distribution, and kurtosis indicates how densely the distribution is concentrated near the mean. The color feature therefore has 4 (color spaces) × 3 (components) × 4 (statistics) = 48 dimensions, and the texture feature has (number of filter responses) × 4 (statistics) dimensions. These feature amounts and the registered object information are held in the registered object information holding unit.
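The statistical feature described above can be sketched as follows. The sketch assumes that the per-pixel channel values (the 12 color components and any filter responses) have already been computed and are passed in as plain lists; the color-space conversions and filters themselves are omitted. Kurtosis is taken here as the standardized fourth moment, which is one common convention and an assumption on our part, and the function names are hypothetical.

```python
import math

def four_stats(values):
    """Return [mean, standard deviation, skewness, kurtosis] of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # A constant channel has no spread; higher moments are reported as 0.
        return [mean, 0.0, 0.0, 0.0]
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n
    return [mean, std, skew, kurt]

def region_feature(channels):
    """Concatenate the four statistics of every channel, in sorted name order.

    channels maps a channel name (for example "RGB.R" or "gabor_0") to the
    list of per-pixel values inside the image or registered object area.
    """
    feature = []
    for name in sorted(channels):
        feature += four_stats(channels[name])
    return feature
```

With the 12 color channels alone, `region_feature` returns 12 × 4 = 48 values, matching the dimensionality stated in the text.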

次に、設定された登録物体の情報に従って、撮影画像から登録物体を検出する方法について説明する。撮影画像から登録物体情報設定工程Ｔ１２０で設定した特徴量を取得する。登録物体の領域を設定した場合には、画像から同サイズの領域を切り出してその領域から特徴量を取得すればよい。その取得された特徴量と登録された特徴量とを比較して類似度が閾値以上であれば登録物体であると判断する。登録物体であると判断された場合にはその登録物体の位置および領域の情報は、次の画像処理工程Ｓ３７０で利用される。本実施形態では特徴量を比較して登録物体であるかを判断したが、顔などを登録した場合には非特許文献４に示されているような顔認証技術を用いてもよい。 Next, a method of detecting a registered object from a photographed image according to the set information of the registered object will be described. The feature amount set in the registered object information setting step T120 is acquired from the photographed image. When the area of the registered object is set, an area of the same size may be cut out from the image and the feature amount may be acquired from the area. The acquired feature amount is compared with the registered feature amount, and if the similarity is equal to or more than a threshold value, it is determined that the object is a registered object. If it is determined that the object is a registered object, the information on the position and area of the registered object is used in the next image processing step S370. In the present embodiment, the feature amounts are compared to determine whether the object is a registered object, but when a face or the like is registered, a face authentication technique as shown in Non-Patent Document 4 may be used.
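The threshold test described above can be sketched as below. The source does not specify the similarity measure, so cosine similarity is used here purely as an assumption, and the candidate windows are assumed to arrive with their feature vectors already computed. The names `detect_registered` and `cosine_similarity` and the default threshold are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def detect_registered(windows, registered_feature, threshold=0.9):
    """Return (box, score) for every window whose similarity reaches the threshold.

    windows is an iterable of (box, feature) pairs, where box is (x, y, w, h)
    and feature is computed from a region the same size as the registered area.
    Results are sorted with the best match first.
    """
    hits = []
    for box, feature in windows:
        score = cosine_similarity(feature, registered_feature)
        if score >= threshold:
            hits.append((box, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

The returned boxes correspond to the position and area information that the text passes on to the image processing step S370.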

画像処理工程Ｓ３７０では、被写体領域抽出工程Ｓ３５０、登録物体検出工程Ｓ３６０で抽出、検出された被写体領域、登録物体検出情報を用いて撮影画像に対して高画質化処理を行う。具体的には、被写体領域および登録物体領域に合わせてホワイトバランスを調整したり、その領域のコントラストが最大になるように現像パラメータを調整する。被写体領域および登録物体領域以外の領域に対しては、第１の実施形態と同様に不要物除去するか、領域分割の手法を適用してカテゴリに応じた高画質化処理を行う。 In the image processing step S370, image quality enhancement processing is performed on the photographed image using the subject area extracted in the subject area extraction step S350 and the registered object detection information obtained in the registered object detection step S360. Specifically, the white balance is adjusted to suit the subject area and the registered object area, or the development parameters are adjusted so that the contrast of those areas is maximized. For areas other than the subject area and the registered object area, either unnecessary objects are removed as in the first embodiment, or a region division method is applied and image quality enhancement processing is performed according to category.

次に、図７（ｄ）のフローチャートに従い、撮影画像に高画質化処理を行うのではなく、ユーザに対して撮影パラメータや構図を提示する撮影アシストを行う場合の処理の詳細について説明する。 Next, according to the flowchart of FIG. 7D, details of processing in the case of performing imaging assistance for presenting imaging parameters and composition to the user instead of performing the image quality enhancement processing on the captured image will be described.

撮影アシスト工程Ｓ３８０では、被写体領域抽出工程Ｓ３５０、登録物体検出工程Ｓ３６０で抽出、検出された被写体領域、登録物体検出情報を用いてユーザに対して撮影アシストを行う。具体的には、被写体領域および登録物体領域、またその位置に従いユーザに対してその被写体および登録物体を好適な構図で撮影できるようにカメラ位置、アングルもしくは登録物体の位置をユーザに提示する。被写体領域抽出工程Ｓ３５０の処理の際に３次元モデルフィッティングを用いた場合には、被写体とカメラの３次元位置姿勢が詳細に得られているため、正確にカメラ位置をアシストすることができる。また、被写体の３次元位置および登録物体のおよその３次元位置から、その２つの領域を被写界深度に収める撮影パラメータをユーザに提示することができる。その他、第１の実施形態と同様に、被写体および登録物体の色情報からそれを現在の環境において撮影するのに好適なパラメータを提示してもよい。 In the shooting assist step S380, shooting assist is performed on the user using the subject region extracted and detected in the subject region extraction step S350 and the registered object detection step S360, and the registered object detection information. Specifically, the camera position, the angle, or the position of the registered object is presented to the user so that the user can shoot the subject and the registered object with a suitable composition according to the subject region and the registered object region and the position thereof. When three-dimensional model fitting is used in the process of the subject region extraction step S350, the three-dimensional position and orientation of the subject and the camera are obtained in detail, so that the camera position can be accurately assisted. Also, from the three-dimensional position of the subject and the approximate three-dimensional position of the registered object, it is possible to present the user with imaging parameters that bring the two regions into the depth of field. Besides, as in the first embodiment, from the color information of the subject and the registered object, parameters suitable for photographing it in the current environment may be presented.
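As one illustration of presenting shooting parameters that bring two regions into the depth of field, the sketch below derives a focus distance and an f-number from the distances to the nearer and farther object. It is not taken from the source: it relies on the standard thin-lens approximation (valid when the distances are much larger than the focal length), and the circle of confusion of 0.03 mm is an assumed, conventional value. The function name and parameters are hypothetical.

```python
def depth_of_field_settings(near_m, far_m, focal_mm, coc_mm=0.03):
    """Return (focus distance in metres, f-number) that just covers [near_m, far_m].

    Approximations used (distances much larger than the focal length):
        1 / near = 1 / focus + 1 / H
        1 / far  = 1 / focus - 1 / H
    where H = f^2 / (N * c) is the hyperfocal distance.
    """
    if not 0.0 < near_m < far_m:
        raise ValueError("expected 0 < near_m < far_m")
    f = focal_mm / 1000.0
    c = coc_mm / 1000.0
    focus = 2.0 * near_m * far_m / (near_m + far_m)
    hyperfocal = 2.0 * near_m * far_m / (far_m - near_m)
    f_number = f * f / (hyperfocal * c)
    return focus, f_number
```

For instance, with a 50 mm lens and objects at 3 m and 6 m, the function suggests focusing at 4 m with an aperture of roughly f/6.9; a real assist function would round this to the next available aperture stop.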

以上のように、本実施形態によれば、画像処理装置２０は撮影時の撮影条件から被写体の３次元モデルを取得し、被写体モデル画像を生成する。生成された被写体モデル画像と撮影画像をフィッティングさせることで撮影画像中の被写体領域を抽出する。さらに、予めユーザが登録した物体を検出し、その物体検出結果と被写体領域に基づいて高画質化処理を行うことができるようになる。また、その物体検出結果と被写体領域に基づいてユーザに対して撮影アシストを行うことができるようになる。 As described above, according to the present embodiment, the image processing apparatus 20 acquires a three-dimensional model of an object from the imaging conditions at the time of imaging, and generates an object model image. The subject region in the photographed image is extracted by fitting the generated subject model image and the photographed image. Furthermore, it is possible to detect an object registered in advance by the user, and to perform the high image quality processing based on the object detection result and the subject region. Also, imaging assistance can be performed for the user based on the object detection result and the subject area.

［第４の実施形態］ Fourth Embodiment
次に、本発明の第４の実施形態について説明する。本実施形態では、被写体領域を抽出したあと、抽出された被写体領域と保持されている被写体モデルとを比較し、被写体の時間的な変化を検出する。被写体とモデルの差異が大きい場合には、被写体モデルの該当部分を更新するか、もしくは撮影画像中の被写体の該当部分を被写体モデルの該当部分に変更する。これにより、モデルの更新を可能にしたり、被写体が劣化した場合に被写体モデルを利用して撮影画像を生成することができる。以下、図面を参照しつつ、本発明の第４の実施形態について説明する。なお、第１〜第３の実施形態で既に説明をした構成については、その説明を省略し、同一の符号を付す。 Next, a fourth embodiment of the present invention will be described. In the present embodiment, after the subject area is extracted, the extracted subject area is compared with the held subject model to detect changes in the subject over time. If the difference between the subject and the model is large, either the corresponding part of the subject model is updated, or the corresponding part of the subject in the photographed image is replaced with the corresponding part of the subject model. This makes it possible to update the model, or to generate a photographed image using the subject model when the subject has deteriorated. Hereinafter, the fourth embodiment of the present invention will be described with reference to the drawings. Components already described in the first to third embodiments are given the same reference numerals, and their description is omitted.

図６は、各実施形態における画像処理装置２０の機能構成を示す概略ブロック図であり、図６（ａ）が本実施形態の概略ブロック図である。本実施形態の画像処理装置２０では、第１の実施形態の構成に加えて、被写体情報評価部５１２が追加されている。被写体情報評価部の具体的な処理内容については後述する。 FIG. 6 is a schematic block diagram showing a functional configuration of the image processing apparatus 20 in each embodiment, and FIG. 6 (a) is a schematic block diagram of the present embodiment. In addition to the configuration of the first embodiment, a subject information evaluation unit 512 is added to the image processing apparatus 20 of the present embodiment. Specific processing contents of the subject information evaluation unit will be described later.

本実施形態の画像処理装置２０による画像処理を示すフローチャートを図７（ｅ）に示す。本実施形態の撮影画像取得工程Ｓ４１０〜被写体領域抽出工程Ｓ４５０における処理内容は、第１の実施形態における撮影画像取得工程Ｓ１１０〜被写体領域抽出工程Ｓ１５０と同様である。本実施形態においては、それ以外に被写体情報評価工程Ｓ４６０が追加されている。また、画像処理工程Ｓ４７０の処理内容が変更されている。 FIG. 7E is a flowchart showing image processing by the image processing apparatus 20 of the present embodiment. The processing in the photographed image acquisition step S410 to the subject area extraction step S450 of the present embodiment is the same as that in the photographed image acquisition step S110 to the subject area extraction step S150 of the first embodiment. In the present embodiment, a subject information evaluation step S460 is added, and the processing content of the image processing step S470 is changed.

被写体情報評価工程Ｓ４６０では、被写体情報評価部５１２が被写体領域抽出工程Ｓ４５０において抽出された被写体領域の情報と被写体情報保持部５０７に保持された被写体の情報を比較して、現状の被写体の状態を評価する。評価結果は画像処理部５０６に送信する。画像処理工程Ｓ４７０では、被写体領域抽出工程Ｓ４５０において抽出された被写体領域と被写体情報評価工程Ｓ４６０において評価された現状の被写体情報を用いて撮影画像に対して高画質化処理を行う。 In the subject information evaluation step S460, the subject information evaluation unit 512 compares the information on the subject area extracted in the subject area extraction step S450 with the subject information held in the subject information holding unit 507, and evaluates the current state of the subject. The evaluation result is sent to the image processing unit 506. In the image processing step S470, image quality enhancement processing is performed on the photographed image using the subject area extracted in the subject area extraction step S450 and the current subject information evaluated in the subject information evaluation step S460.

次に、図７（ｅ）のフローチャートに従い、被写体情報評価工程Ｓ４６０および画像処理工程Ｓ４７０の処理内容を具体的に説明する。 Next, the processing of the subject information evaluation step S460 and the image processing step S470 will be described specifically with reference to the flowchart of FIG. 7E.

被写体情報評価工程Ｓ４６０では、被写体領域抽出工程Ｓ４５０において抽出された被写体領域の情報と被写体情報保持部５０７に保持された被写体の情報を比較して、現状の被写体の状態を評価する。具体的には、被写体の３次元位置・姿勢情報から撮影画像上の各被写体面のテクスチャと被写体情報保持部に保持されている各被写体面のテクスチャを比較する。そして、大きく違う面および面の一部があれば次の画像処理工程Ｓ４７０において被写体モデル保持部に保持されている被写体モデルの該当箇所の色情報・テクスチャ情報で置換する。もしくは被写体モデル情報を現状の被写体の状態に合わせて更新する。他の画像処理工程の処理内容については、第１〜第３の実施形態と同様であるため、説明を省略する。 In the subject information evaluation step S460, the information on the subject area extracted in the subject area extraction step S450 is compared with the subject information held in the subject information holding unit 507 to evaluate the current state of the subject. Specifically, using the three-dimensional position and orientation of the subject, the texture of each subject surface in the photographed image is compared with the texture of the same surface held in the subject information holding unit. If a surface or part of a surface differs greatly, it is replaced in the subsequent image processing step S470 with the color and texture information of the corresponding part of the subject model held in the subject model holding unit. Alternatively, the subject model information is updated to match the current state of the subject. The rest of the image processing is the same as in the first to third embodiments, and its description is omitted.
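The per-surface comparison and the two possible responses (restoring the image from the model, or updating the model from the image) can be sketched as below. The source does not state how texture difference is measured, so mean absolute pixel difference with a fixed threshold is used here as an assumption. Textures are simplified to flat lists of intensities that are already aligned per surface, and all names are hypothetical.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def changed_faces(observed, stored, threshold=30.0):
    """Ids of surfaces whose observed texture deviates strongly from the model."""
    return [face for face in stored
            if face in observed
            and mean_abs_diff(observed[face], stored[face]) > threshold]

def restore_image(observed, stored, changed):
    """Replace changed surfaces in the image with the stored model texture."""
    return {face: (list(stored[face]) if face in changed else list(tex))
            for face, tex in observed.items()}

def update_model(observed, stored, changed):
    """Overwrite changed surfaces in the model with the observed texture."""
    return {face: (list(observed[face]) if face in changed else list(tex))
            for face, tex in stored.items()}
```

Which of the two responses applies would depend on the use case: restoring suits a deteriorated subject, while updating suits a subject whose appearance has legitimately changed.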

以上のように、本実施形態によれば、画像処理装置２０は撮影時の撮影条件から被写体の３次元モデルを取得し、被写体モデル画像を生成する。生成された被写体モデル画像と撮影画像をフィッティングさせることで撮影画像中の被写体領域を抽出する。その被写体領域抽出結果に基づいて、被写体モデルの表面反射特性情報と撮影画像中の被写体領域を比較し、モデルとの差異を検出する。これによりモデルの更新を可能にしたり、被写体が劣化した場合に被写体モデルを利用して撮影画像を生成することができる。 As described above, according to the present embodiment, the image processing apparatus 20 acquires a three-dimensional model of an object from the imaging conditions at the time of imaging, and generates an object model image. The subject region in the photographed image is extracted by fitting the generated subject model image and the photographed image. Based on the subject region extraction result, the surface reflection characteristic information of the subject model is compared with the subject region in the photographed image to detect a difference from the model. This makes it possible to update the model or to generate a photographed image using the subject model when the subject is deteriorated.

［第５の実施形態］ Fifth Embodiment
次に、本発明の第５の実施形態について説明する。本実施形態では、被写体モデルを追加することを可能にした構成について説明する。以下、図面を参照しつつ、本発明の第５の実施形態について説明する。なお、第１〜第４の実施形態で既に説明をした構成については、その説明を省略し、同一の符号を付す。 Next, a fifth embodiment of the present invention will be described. The present embodiment describes a configuration that allows subject models to be added. Hereinafter, the fifth embodiment of the present invention will be described with reference to the drawings. Components already described in the first to fourth embodiments are given the same reference numerals, and their description is omitted.

FIG. 6(b) is a schematic block diagram showing the functional configuration of the image processing apparatus 20 in the present embodiment. In the image processing apparatus 20 of the present embodiment, a subject information adding unit 513 is added to the configuration of the first embodiment. The specific processing performed by the subject information adding unit is described later.

FIG. 7(f) is a flowchart showing the processing executed during image processing in the present embodiment. The processing in the photographed image acquisition step S510 through the subject area extraction step S550 of the present embodiment is the same as in the photographed image acquisition step S110 through the subject area extraction step S150 of the first embodiment. In addition, a subject information addition step S560 is added in the present embodiment. In the subject information addition step S560, the subject information adding unit 513 adds subject information to the subject information holding unit based on the information on the subject area extracted in the subject area extraction step S550. The added subject information is used when image quality enhancement processing is performed on other photographed images.

Next, the processing in the subject information addition step S560 is described more specifically with reference to the flowchart of FIG. 7(f). In the subject information addition step S560, subject information is added based on the information extracted in the subject area extraction step S550. The information to be added includes the three-dimensional position and orientation of the camera, the environment and light source information at the time the subject was photographed, the photographing parameters, and the texture information of each face of the subject. Because this information is held in the subject information holding unit, it can be used the next time the subject is photographed and thereafter.
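
The kind of record added in step S560 can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the class names `SubjectRecord` and `SubjectInfoStore`, the field names and types, and the in-memory dictionary keyed by a subject identifier are hypothetical and do not appear in the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SubjectRecord:
    """One observation of a subject; fields mirror the items listed for S560."""
    camera_position: Tuple[float, float, float]      # 3D position of the camera
    camera_orientation: Tuple[float, float, float]   # orientation, e.g. yaw/pitch/roll (assumed)
    environment: str                                 # environment at photographing time
    light_source: str                                # light source information
    shooting_params: Dict[str, float]                # photographing parameters
    face_textures: Dict[str, List[List[float]]] = field(default_factory=dict)


class SubjectInfoStore:
    """Hypothetical stand-in for the subject information holding unit."""

    def __init__(self) -> None:
        self._records: Dict[str, List[SubjectRecord]] = {}

    def add(self, subject_id: str, record: SubjectRecord) -> None:
        """Register an additional record for a subject (step S560)."""
        self._records.setdefault(subject_id, []).append(record)

    def lookup(self, subject_id: str) -> List[SubjectRecord]:
        """Return a copy of all records held for a subject, for later photographs."""
        return list(self._records.get(subject_id, []))
```

Keeping a list of records per subject, rather than overwriting a single entry, reflects the stated aim of accumulating data for the camera positions and parameters a user photographs most often.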

As described above, according to the present embodiment, the image processing apparatus 20 acquires a three-dimensional model of the subject based on the photographing conditions at the time of photographing, and generates a subject model image. The subject area in the photographed image is extracted by fitting the generated subject model image to the photographed image. By additionally registering the information on the extracted subject area in the subject information holding unit of the image processing apparatus, more subject information becomes available for the image quality enhancement processing. Furthermore, this approach effectively increases the data available for the camera positions, orientations, and photographing parameters with which the user frequently photographs the subject, which improves the accuracy of the image quality enhancement processing.

[Other Embodiments]
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

501 Photographed image acquisition unit
502 Photographing condition acquisition unit
503 Subject information acquisition unit
504 Subject model image generation unit
505 Subject area extraction unit
506 Image processing unit
507 Subject information holding unit

Claims

An image processing apparatus comprising:
first acquisition means for acquiring a photographed image of a subject photographed by photographing means;
second acquisition means for acquiring, as a photographing condition, position information and orientation information of the photographing means at the time the acquired photographed image was photographed;
generation means for generating a model image of the subject based on the acquired photographing condition; and
extraction means for extracting an area of the subject from the photographed image based on the generated model image.

The image processing apparatus according to claim 1, wherein the extraction means extracts the area of the subject from the photographed image by fitting the area of the subject included in the model image to the photographed image.

The image processing apparatus according to claim 1, wherein the second acquisition means further acquires, as the photographing condition, at least one of environment information, light source information, and a photographing parameter of the photographing means at the time the acquired photographed image was photographed.

The image processing apparatus according to any one of claims 1 to 3, further comprising third acquisition means for acquiring information on the subject based on the acquired photographing condition,
wherein the generation means generates the model image based on the acquired photographing condition and the information on the subject.

The image processing apparatus according to claim 4, wherein the third acquisition means further acquires, as the information on the subject, information on surface reflection characteristics of the subject, and
the generation means generates the model image further based on the acquired information on the surface reflection characteristics.

The image processing apparatus according to claim 4, further comprising an addition unit that adds information on the subject.

The image processing apparatus according to any one of claims 1 to 6, further comprising image processing means for performing image processing on the area of the subject extracted from the photographed image.

The image processing apparatus according to claim 7, further comprising detection means for detecting, from the photographed image, an object registered by a user,
wherein the image processing means further performs image processing on the area of the detected object.

The image processing apparatus according to any one of the preceding claims, further comprising presentation means for presenting at least one of a photographing parameter of the photographing means and a composition at the time of photographing, based on the area of the subject extracted from the photographed image.

The image processing apparatus according to any one of claims 1 to 9, wherein the generation unit generates model images of a plurality of the subjects.

The image processing apparatus according to any one of claims 1 to 10, further comprising estimation means for estimating at least one of the photographing conditions based on the result of the extraction of the area of the subject by the extraction means,
wherein the extraction means extracts the area of the subject from the photographed image based on a model image generated by the generation means based on the estimated photographing condition.

The image processing apparatus according to any one of claims 1 to 11, further comprising evaluation means for evaluating temporal change of the subject based on the result of the extraction of the area of the subject by the extraction means.

An image processing method comprising:
acquiring a photographed image of a subject photographed by photographing means;
acquiring, as a photographing condition, position information and orientation information of the photographing means at the time the acquired photographed image was photographed;
generating a model image of the subject based on the acquired photographing condition; and
extracting an area of the subject from the photographed image based on the generated model image.

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 12.