JP2019032742A

JP2019032742A - Image processing device, image processing method, and program

Info

Publication number: JP2019032742A
Application number: JP2017154145A
Authority: JP
Inventors: 知頼岩尾; Tomoyori IWAO
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2017-08-09
Filing date: 2017-08-09
Publication date: 2019-02-28

Abstract

To provide an image processing device, an image processing method, and a program that improve the image quality of a virtual viewpoint image including an area where the background shape can vary. SOLUTION: An image processing device for generating a virtual viewpoint image from images captured by a plurality of imaging devices includes: specifying means for specifying, in a region serving as the background of the virtual viewpoint image, an area in which a person appears; and generation means for generating the virtual viewpoint image using a three-dimensional model whose resolution is higher in an area of the background region in which no person appears than in the area specified by the specifying means. SELECTED DRAWING: Figure 3

Description

The present invention relates to an image processing apparatus, an image processing method, and a program capable of generating free-viewpoint video.

Virtual viewpoint image technology reproduces, from video captured by a plurality of real cameras, an image as seen from a camera that does not actually exist (a virtual camera) placed virtually in three-dimensional space. Taking video of a competition in a stadium or similar venue as an example, generating a virtual viewpoint image requires generating not only the main subjects, such as the competing players and the ball, but also a background image of the spectator seats, the roof, and so on. As a technique for creating this background image, Patent Document 1, for example, discloses a rendering method in which multi-viewpoint images are projected as textures onto a background shape on a cylindrical surface of constant radius.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2008-217547 (JP 2008-217547 A)

In the method described in Patent Document 1, projecting the multi-viewpoint images onto a cylindrical surface that differs from the actual background shape reduces the processing load compared with projecting them onto a detailed shape of the actual background. However, the larger the deviation between the actual background shape and the cylindrical shape, the larger the difference between the generated virtual viewpoint image and an actually captured image, and the worse the image quality of the virtual viewpoint image. One conceivable countermeasure is to project the multi-viewpoint images onto a detailed shape of the actual background generated in advance, thereby reducing the difference from the actual background shape. Even then, however, in a region whose background shape changes over time, such as spectator seats, the difference between the generated virtual viewpoint image and an actually captured image becomes large, and the image quality of the virtual viewpoint image can deteriorate.
An object of the present invention is therefore to improve the image quality of a virtual viewpoint image that includes a region where the background shape can vary.

The present invention is an image processing apparatus that generates a virtual viewpoint image from images captured by a plurality of imaging devices, the apparatus comprising: specifying means for specifying, in a region serving as the background of the virtual viewpoint image, an area in which a person appears; and generation means for generating the virtual viewpoint image using a three-dimensional model whose resolution is higher in an area of the background region in which no person appears than in the area specified by the specifying means.
Another aspect of the present invention is an image processing apparatus that generates a virtual viewpoint image from images captured by a plurality of imaging devices, the apparatus comprising: first acquisition means for acquiring a degree of variation of a shape serving as a background; determination means for determining a resolution of the background shape based on the degree of variation of the background shape; and generation means for generating the virtual viewpoint image using the background shape at the resolution determined by the determination means.

According to the present invention, the image quality of a virtual viewpoint image that includes a region where the background shape can vary can be improved.

FIG. 1 is a diagram showing a configuration example of the image processing apparatus of the present embodiment.
FIG. 2 is a diagram showing an example of the cameras constituting the camera group.
FIG. 3 is a functional block diagram of the image processing apparatus.
FIG. 4 is a flowchart showing the flow of processing performed by the image processing apparatus.
FIG. 5 is a schematic diagram showing a region whose shape is reduced in resolution.
FIG. 6 is a flowchart showing the flow of processing for generating a detailed shape.
FIG. 7 is a diagram showing an example of a front view of a shape.

Embodiments of the present invention are described below with reference to the drawings. The following embodiments do not limit the present invention, and not all combinations of the features described in the embodiments are essential to the solution provided by the present invention. Identical components are denoted by the same reference numerals.

FIG. 1 is a diagram showing a schematic configuration example of a virtual viewpoint image system that includes the image processing apparatus 100 of the present embodiment. The virtual viewpoint image system shown in FIG. 1 comprises the image processing apparatus 100 and a plurality of imaging devices (a camera group) 109.
The image processing apparatus 100 is an apparatus that generates virtual viewpoint images. In the present embodiment, unless otherwise noted, the term "image" covers both moving images and still images. The image processing apparatus 100 includes a CPU 101, a main memory 102, a storage unit 103, an input unit 104, a display unit 105, and an external I/F unit 106, which are connected to one another via a bus 107.

The CPU 101 is an arithmetic processing unit that performs overall control of the image processing apparatus 100 and carries out various kinds of processing by executing programs stored in the storage unit 103 or elsewhere. The main memory 102 temporarily stores data, parameters, and the like used in the various kinds of processing, and also provides a work area for the CPU 101. The storage unit 103 is a large-capacity storage device that stores the programs and the data required for GUI (graphical user interface) display; for example, a nonvolatile memory such as a hard disk or a silicon disk is used. The input unit 104 is a device such as a keyboard, a mouse, an electronic pen, or a touch panel, and accepts operation input from the user. The display unit 105 is composed of a liquid crystal panel or the like and displays, among other things, a GUI for setting the path of the virtual camera when a virtual viewpoint image is generated. The external I/F unit 106 is connected via a LAN 108 to each camera in the camera group 109 and transmits and receives video data and control signal data. The bus 107 connects the units described above and transfers data between them.

The camera group 109 is connected to the image processing apparatus 100 via the LAN 108 and, based on control signals from the image processing apparatus 100, starts and stops shooting, changes camera settings (shutter speed, aperture, and the like), and transfers the captured video data.
The system configuration includes various components other than those described above, but their description is omitted.

FIG. 2 is a diagram showing an arrangement example of the cameras 203 that constitute the camera group 109 of FIG. 1. Many kinds of scenes, such as sports and concerts, are conceivable as targets for virtual viewpoint image generation; here, a case in which ten cameras 203 are installed in a stadium where rugby is played is described as an example. Players and a ball are present as subjects 202 on the field 201 where the game is played, and the ten cameras 203 are arranged so as to surround the field 201. It is assumed that each camera 203 in the camera group 109 has been given an appropriate orientation, focal length, exposure control parameters, and so on, so that the entire field 201, or a region of interest of the field 201, falls within its angle of view.

FIG. 3 is a functional block diagram showing the functional configuration of the image processing apparatus 100 in the present embodiment. In the present embodiment, the image processing apparatus 100 generates the shape of the stadium that serves as the background of the virtual viewpoint image. The image processing apparatus 100 includes a detailed shape generation unit 301, a shape variation region specifying unit 302, a shape resolution reduction unit 303, an imaging region mapping unit 304, an occlusion region specifying unit 305, and a shape correction unit 306. These functional components are realized by the CPU 101 executing, for example, a control program stored in the storage unit 103 to compute and process information and to control each piece of hardware. Some or all of the functional components shown in FIG. 3 may instead be implemented by dedicated hardware such as an ASIC or an FPGA.

The detailed shape generation unit 301 generates a detailed shape of the stadium from the stadium's design drawings and multi-viewpoint captured images. That is, the detailed shape generation unit 301 generates three-dimensional model information indicating the shape of the object that serves as the background. The shape variation region specifying unit 302 specifies regions in which the shape of the stadium varies greatly over time. The shape resolution reduction unit 303 changes the resolution of the shape in regions with a large degree of shape variation; specifically, it lowers the resolution. The shape resolution reduction unit 303 may change the resolution of the shape by changing the fineness, coarseness, smoothness, or number of quantization bits of the three-dimensional model. The imaging region mapping unit 304 maps the region captured by each camera onto the shape of the stadium. The occlusion region specifying unit 305 specifies occlusion regions within the shape of the stadium. An occlusion region is a region that is not captured by any camera, either because it lies outside the cameras' coverage or because an object in front hides the object behind it. The shape correction unit 306 corrects the shape corresponding to an occlusion region. In the present embodiment, a shape is formed of a plurality of polygons. The shape may instead be represented by a wireframe model, a surface model, or a point cloud.

Next, the processing performed by the image processing apparatus 100 is described with reference to the flowchart shown in FIG. 4. This series of processing is realized by the CPU 101 reading a predetermined program from the storage unit 103, loading it into the main memory 102, and executing it. Not all of the processing described below needs to be executed by the CPU 101; the image processing apparatus 100 may be configured so that some or all of the processing is performed by one or more processing circuits other than the CPU 101. In the following description, processing steps S401 to S406 are abbreviated as S401 to S406 for simplicity, and the same applies to the other flowcharts described later.

In S401, the detailed shape generation unit 301 generates the detailed shape of the stadium. Specifically, the detailed shape generation unit 301 generates the detailed shape of the stadium based on information loaded into the main memory 102, such as the stadium's design drawings and edge features of the multi-viewpoint images. The details of this detailed shape generation method are described later.

Next, in S402, the shape variation region specifying unit 302 takes the multi-viewpoint images loaded into the main memory 102 as input and specifies the regions of the stadium's detailed shape that have a large degree of shape variation.
Here, the degree of shape variation of the stadium is an index of how much the shape of the stadium has changed from its initial shape during the sequence for which virtual viewpoint images are generated, where the initial shape is, for example, the shape of the stadium when no game is being played and no people are inside. For example, once a game starts and spectators enter the seating area, that seating area is specified as a region with a large degree of shape variation. One way to specify regions with a large degree of shape variation is to use prior knowledge. For a soccer match, for example, information can be supplied in advance that the spectator seats have a large degree of shape variation because many spectators enter them, whereas the roof has a small degree of shape variation because nobody goes onto it.

The regions with a large degree of shape variation are not necessarily constant throughout a virtual viewpoint image sequence. For example, suppose that the weather is fine at the start of the sequence and many spectators are in the first-floor seats, but it starts to rain during the sequence and the spectators move to the roofed second-floor seats; in that case the region with a large degree of shape variation changes during the sequence. When the degree of shape variation changes during a game in this way, shapes of different resolutions may be prepared in advance for each region of the stadium, and when the virtual viewpoint image is rendered, the resolution of the stadium shape may be switched region by region according to the magnitude of the degree of shape variation.
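The per-region switching described above could be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the function name `select_region_shapes`, the dictionary layout, the `"high"`/`"low"` keys, and the threshold value are all assumptions introduced here.

```python
def select_region_shapes(prepared, variation, threshold):
    """Pick, for each stadium region, one of the shapes prepared in advance.

    prepared:  {region: {"high": shape, "low": shape}} (hypothetical layout)
    variation: {region: current degree of shape variation}
    Regions whose variation exceeds the threshold get the low-resolution shape.
    """
    selected = {}
    for region, levels in prepared.items():
        degree = variation.get(region, 0.0)  # unknown regions count as static
        selected[region] = levels["low" if degree > threshold else "high"]
    return selected


# Example: rain starts and spectators move from the first to the second floor.
prepared = {
    "first_floor": {"high": "first_hi", "low": "first_lo"},
    "second_floor": {"high": "second_hi", "low": "second_lo"},
}
before_rain = select_region_shapes(
    prepared, {"first_floor": 0.9, "second_floor": 0.1}, threshold=0.5)
after_rain = select_region_shapes(
    prepared, {"first_floor": 0.1, "second_floor": 0.8}, threshold=0.5)
```

Keeping both resolutions per region means that a change in the degree of shape variation during the sequence only changes which prepared shape is looked up at render time.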

The degree of shape variation of the stadium does not necessarily have to be specified manually by the user; regions with a large degree of shape variation may be determined automatically. One method of automatic determination is, for example, to identify regions of the multi-viewpoint images in which the color changes greatly. In a region occupied by many spectators, the spectators move as time passes, so the color change over time is larger than in other regions. Conversely, in the roof region the color change is small. Using the color change therefore makes it possible to determine automatically the regions with a large degree of shape variation. For example, for one of the cameras 203 performing multi-viewpoint shooting, let I_n be the matrix that stores the luminance value of each pixel (x, y) of the image of width w and height h in the n-th frame; the total S_n of the luminance differences over the k frames before and after the n-th frame is then expressed by equation (1).
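Equation (1) itself is not reproduced in this text. Under the definitions given above, one plausible form, offered here only as an assumption and not as the patent's exact expression, sums the absolute luminance differences between the n-th frame and each of the k frames before and after it:

```latex
S_n = \sum_{i=1}^{k} \left( \left| I_n - I_{n-i} \right| + \left| I_n - I_{n+i} \right| \right)
```

Here the absolute value is taken element by element, so S_n is a w-by-h matrix holding one accumulated difference per pixel (x, y). A sum of differences between consecutive frames over the same window would be another reading consistent with the description.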

The shape variation region specifying unit 302 determines whether a region has a large degree of shape variation by applying a threshold to this luminance difference. The threshold may be tuned as a parameter by the user or may be determined automatically from the distribution of the per-pixel luminance differences. The shape variation region specifying unit 302 performs this processing on the images of all the cameras performing multi-viewpoint shooting. The reason equation (1) totals the differences over several preceding and following frames is to prevent erroneous determination for regions such as a large video display installed in the stadium, whose degree of shape variation is small but which is likely to be misjudged because of the video being displayed. A large video display shows a clock, each team's score, and the like, so judging the degree of shape variation from only the immediately adjacent frames would produce erroneous determinations.
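The thresholding step can be illustrated with a short sketch. It assumes one reading of the luminance total (absolute differences between frame n and up to k frames on each side) and one possible automatic threshold (mean plus a multiple of the standard deviation of the per-pixel totals); both choices, and all function names, are assumptions made for illustration.

```python
from statistics import mean, pstdev


def luminance_change(frames, n, k):
    """Per-pixel total of |I_i - I_n| over up to k frames before and after n.

    frames is a list of 2-D lists of luminance values (rows of pixels).
    The window is clamped at the ends of the sequence.
    """
    height, width = len(frames[n]), len(frames[n][0])
    total = [[0.0] * width for _ in range(height)]
    for i in range(max(0, n - k), min(len(frames), n + k + 1)):
        if i == n:
            continue
        for y in range(height):
            for x in range(width):
                total[y][x] += abs(frames[i][y][x] - frames[n][y][x])
    return total


def auto_threshold(change, num_std=1.0):
    """Assumed heuristic: mean plus num_std standard deviations of all pixels."""
    values = [v for row in change for v in row]
    return mean(values) + num_std * pstdev(values)


def high_variation_mask(change, threshold):
    """True where the accumulated luminance change exceeds the threshold."""
    return [[v > threshold for v in row] for row in change]


# Five 2x2 frames: pixel (0, 0) flickers, the other pixels stay constant.
frames = [[[0 if i % 2 == 0 else 100, 50], [50, 50]] for i in range(5)]
change = luminance_change(frames, n=2, k=2)
mask = high_variation_mask(change, auto_threshold(change))
```

In this toy input only the flickering pixel is marked. A real system would run the same computation for every camera and would likely smooth the mask before mapping it onto the shape.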

After specifying the regions with a large degree of shape variation in the multi-viewpoint images as described above, the image processing apparatus 100 projects those image regions onto the shape, based on the position and orientation of the camera that captured each image, and thereby performs the mapping. In other words, the image processing apparatus 100 associates the regions with a large degree of shape variation in the multi-viewpoint images captured by the camera group 109 with the background shape generated in S401. After mapping a region with a large degree of shape variation, the image processing apparatus 100 encloses it, as shown in FIG. 5(a), with edges 502 that connect vertices 501 of quadrilateral polygons so that the mapped region is contained. The region enclosed by these edges 502 is the region processed in S403, described later. Here, an edge is a line that connects two polygon vertices 501 and forms one side of a polygon. The image processing apparatus 100 may instead specify the regions with a large degree of shape variation directly from the background shape, or from world coordinates. The image processing apparatus 100 may also specify a region in which a person appears (or may appear) as a region with a large degree of shape variation. It may detect persons in the multi-viewpoint images by image processing within the region that serves as the background of the virtual viewpoint image, and specify the detected regions as regions with a large degree of shape variation. Of the regions of a facility to be captured, such as a stadium, that serve as the background of the virtual viewpoint image, the image processing apparatus 100 may specify spectator seats and other regions where people may be present as regions with a large degree of shape variation. Likewise, it may specify regions where no people are present, such as the roof of the facility, as regions with a small degree of shape variation.
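The mapping from image regions to the shape can be sketched with a simple pinhole camera model. The sketch below flags a polygon when its centroid projects onto a pixel marked as high-variation; it ignores lens distortion and visibility between polygons, and the function names, the centroid test, and the camera parameterization (world-to-camera rotation and translation, focal length in pixels, principal point) are all assumptions rather than details from the source.

```python
def project_point(point, rotation, translation, focal, cx, cy):
    """Project a world-space point to pixel coordinates (pinhole model).

    rotation (3x3) and translation (3) map world to camera coordinates.
    Returns None when the point lies behind the camera.
    """
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if cam[2] <= 0:
        return None
    return (focal * cam[0] / cam[2] + cx, focal * cam[1] / cam[2] + cy)


def flag_polygons(polygons, mask, rotation, translation, focal, cx, cy):
    """Indices of polygons whose centroid lands on a True pixel of mask."""
    height, width = len(mask), len(mask[0])
    flagged = []
    for index, vertices in enumerate(polygons):
        centroid = [sum(v[a] for v in vertices) / len(vertices) for a in range(3)]
        pixel = project_point(centroid, rotation, translation, focal, cx, cy)
        if pixel is None:
            continue
        u, v = int(round(pixel[0])), int(round(pixel[1]))
        if 0 <= u < width and 0 <= v < height and mask[v][u]:
            flagged.append(index)
    return flagged


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
mask = [[False, False, False], [False, False, True], [False, False, False]]
polygons = [
    [(0.5, -0.5, 1), (1.5, -0.5, 1), (1.5, 0.5, 1), (0.5, 0.5, 1)],    # hits mask
    [(-0.5, -0.5, 1), (0.5, -0.5, 1), (0.5, 0.5, 1), (-0.5, 0.5, 1)],  # misses
    [(0, 0, -1), (1, 0, -1), (1, 1, -1)],                              # behind camera
]
flagged = flag_polygons(polygons, mask, identity, [0, 0, 0], 1.0, 1.0, 1.0)
```

The flagged polygons would then be enclosed by polygon edges, as in FIG. 5(a), to form the region handed to S403.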

In S403, the shape resolution reduction unit 303 determines, based on the degree of shape variation, the resolution of the shape to be used when generating the virtual viewpoint image. Specifically, within the detailed shape of the stadium generated by the detailed shape generation unit 301, the shape resolution reduction unit 303 changes the resolution of the shape in regions with a large degree of shape variation by lowering it. Here, the resolution of a shape is an index of how detailed the shape is, and it depends on the number of polygons that form the shape. Lowering the resolution of a shape therefore corresponds to reducing the number of polygons in that shape. The shape resolution reduction unit 303 may also raise the resolution of the shape in regions with a small degree of shape variation, for example by increasing the number of polygons. When the shape of the region that serves as the background of the virtual viewpoint image is represented by a point cloud, the shape resolution reduction unit 303 may change the resolution of the shape by increasing or decreasing the number of points.
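For the point-cloud case, one common way to reduce the number of points is voxel-grid downsampling, shown below as a hedged sketch. The source only says that the number of points may be increased or decreased; the voxel-grid technique, the function name, and the `voxel_size` parameter are assumptions chosen for illustration.

```python
import math


def voxel_downsample(points, voxel_size):
    """Replace all points falling in the same cubic cell by their average.

    A larger voxel_size yields fewer points, i.e. a lower-resolution shape.
    """
    cells = {}
    for point in points:
        key = tuple(math.floor(coord / voxel_size) for coord in point)
        cells.setdefault(key, []).append(point)
    return [
        tuple(sum(p[axis] for p in group) / len(group) for axis in range(3))
        for group in cells.values()
    ]


cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 0.0, 0.0)]
coarse = voxel_downsample(cloud, voxel_size=1.0)
```

Applying a larger cell size only inside regions with a large degree of shape variation would lower the resolution there while leaving static regions such as the roof untouched.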

The simplest way to lower the resolution of a shape is, as shown in FIG. 5(b), to use an approximating polygon (an m-sided polygon) that passes through the vertices 511 of the edges 512 enclosing the projected region 514 as a substitute for the region with a large degree of shape variation. In general, after the shape has been replaced by the approximating m-sided polygon, the m-sided polygon is divided into polygons with four or fewer sides so that the resolution does not depart too far from that of the original shape. The method of lowering the resolution of the shape is not limited to the above. Other methods include classifying the shape within the projected region by resolution and lowering the resolution preferentially starting from the parts whose resolution is already low, and fitting a convex curved surface that passes through the vertices of the edges enclosing the projected region; any of these methods may be used.
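The replacement of a region by an m-sided polygon and its subsequent division can be sketched as a fan triangulation of the boundary vertices. This is an illustrative assumption: the source requires only that the m-sided polygon be divided into polygons with four or fewer sides, and a fan of triangles is valid only when the boundary is convex and roughly planar. The function name is hypothetical.

```python
def fan_triangulate(boundary):
    """Split an m-sided polygon into m - 2 triangles that share boundary[0].

    boundary is the ordered list of vertices on the enclosing edges.
    Assumes a convex, approximately planar boundary.
    """
    if len(boundary) < 3:
        raise ValueError("a polygon needs at least three vertices")
    return [
        (boundary[0], boundary[i], boundary[i + 1])
        for i in range(1, len(boundary) - 1)
    ]


# A pentagonal boundary around a spectator-seat region becomes 3 triangles,
# regardless of how many small polygons originally filled the region.
pentagon = [(0, 0, 0), (2, 0, 0), (3, 1, 0), (1, 2, 0), (-1, 1, 0)]
triangles = fan_triangulate(pentagon)
```

The polygon count of the region after replacement depends only on the number of boundary vertices m, which is what makes this the simplest of the reduction methods mentioned.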

Next, in S404, the imaging region mapping unit 304 maps the imaging region of each camera onto the stadium shape output by the shape resolution reduction unit 303, based on the positions, orientations, and angles of view of the multi-viewpoint cameras loaded into the main memory 102. The imaging regions can be mapped onto the stadium shape by the same method as that used in S402 to map the image regions with a large degree of shape variation onto the shape.

Next, in S405, the occlusion region specifying unit 305 determines whether an occlusion region exists in the stadium shape onto which the imaging region mapping unit 304 has mapped the imaging regions. Here, an occlusion region is a region of the stadium shape onto which no imaging region has been mapped. If it is determined in S405 that no occlusion region exists, the image processing apparatus 100 ends the shape generation processing of the flowchart of FIG. 4.
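The check in S405 amounts to finding the parts of the shape that no camera's mapped imaging region covers. A minimal sketch, assuming that coverage has already been reduced to sets of polygon identifiers per camera (the data layout and names are hypothetical):

```python
def find_occluded(polygon_ids, coverage_by_camera):
    """Return the polygons onto which no camera's imaging region is mapped.

    coverage_by_camera: {camera name: set of polygon ids it covers}
    """
    covered = set().union(*coverage_by_camera.values())
    return [pid for pid in polygon_ids if pid not in covered]


coverage = {"camera_1": {0, 1}, "camera_2": {1, 3}}
occluded = find_occluded([0, 1, 2, 3], coverage)
```

An empty result corresponds to the branch in which the shape generation processing ends.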

If, on the other hand, it is determined in S405 that an occlusion region exists, the shape correction unit 306 corrects the shape of the occlusion region in S406. Specifically, in the same manner as described for S402, the shape correction unit 306 encloses the region onto which no imaging region has been mapped with nearby edges and treats the enclosed region as the region to be corrected. The shape correction unit 306 then corrects the shape corresponding to the occlusion region specified by the occlusion region specifying unit 305. As described above, the shape of the occlusion region is enclosed by edges that connect polygon vertices. The shape can therefore be corrected by the same methods as those described in S403 for lowering the resolution of polygons. In this case, the resolution of the polygons is lowered using one of the methods listed for S403, and the processing returns to S405. In general, lowering the resolution of a shape removes fine surface irregularities, so the occlusion regions that cannot be seen from the cameras decrease.

After the shape correction in S406, the occlusion region specifying unit 305 again determines in S405 whether an occlusion region exists. If an occlusion region is detected again, the shape correction unit 306 performs shape correction in S406. For this correction, a method is chosen from among the available polygon resolution reduction methods that has not yet been used in the repeated processing of S405 and S406. Repeating S405 and S406 in this way allows every selectable polygon resolution reduction method to be applied to the occlusion region. When the entire stadium is being captured by the multi-viewpoint cameras, using one of the polygon resolution reduction methods can bring the occlusion regions of the shape to nearly zero.
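The repetition of S405 and S406 can be sketched as a loop that tries each not-yet-used reduction method until no occlusion region remains or the methods run out. The callables below are placeholders: `find_occlusions` stands for the S405 check and each entry of `methods` for one resolution reduction method; their signatures are assumptions made for this sketch.

```python
def correct_occlusions(shape, find_occlusions, methods):
    """Apply unused reduction methods in turn while occlusion regions remain.

    Returns the corrected shape and whatever occlusion regions are left.
    """
    remaining = list(methods)
    occluded = find_occlusions(shape)          # S405
    while occluded and remaining:
        method = remaining.pop(0)              # a method not used so far
        shape = method(shape, occluded)        # S406
        occluded = find_occlusions(shape)      # back to S405
    return shape, occluded


# Toy stand-ins: the "shape" is just a roughness level, and an occlusion
# region is reported while the level is above 1.
calls = []

def make_method(name):
    def method(shape, occluded):
        calls.append(name)
        return shape - 1
    return method

result, left = correct_occlusions(
    3,
    lambda level: ["seat_block"] if level > 1 else [],
    [make_method("m_gon"), make_method("priority"), make_method("convex_fit")],
)
```

In the toy run the loop stops after two methods because the occlusion check comes back empty, leaving the third method unused.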

Next, the generation of the detailed shape of the stadium in S401 of FIG. 4 is described in detail. FIG. 6 is a flowchart that explains in detail the flow of processing, according to the present embodiment, for generating the detailed shape of the stadium from design drawings. Here, the design drawings of the stadium are, for example, projection drawings.

In S601, the detailed shape generation unit 301 reads the data of the stadium's design drawings and draws the contour lines in the design drawings in a 3DCG (three-dimensional computer graphics) space. First, the detailed shape generation unit 301 reads each of the projection drawings, which consist of a front view, a side view, a top view, and so on, and places each drawing in the 3DCG space as the image plane of a camera corresponding to the viewpoint of that orthographic projection.

FIG. 7 shows an example of a front view. The front view of the stadium is a drawing of the contour lines obtained when the stadium is captured from the front by an orthographic camera 700. The detailed shape generation unit 301 places the front view in the 3DCG space as an image plane 702 on which the contour line 710 of the projection target 701, as seen from the viewpoint of the orthographic camera 700, is arranged. Similarly, the side view and the top view are drawings of the contour lines of the stadium as captured from the side and from above. Because the detailed shape generation unit 301 uses these contour lines as the edges of the shape when generating the detailed shape, it draws the contour lines in the design drawings as lines in the 3DCG space.

Next, in S602, the detailed shape generation unit 301 projects the contour lines generated in S601 into the three-dimensional space and aligns them. First, the detailed shape generation unit 301 translates each contour line toward the corresponding orthographic camera. The translation distance of a line is the distance at which, during the translation, the line collides with a vertex of a line generated from another projection drawing. After determining the translation distance, the detailed shape generation unit 301 rotates the line so that it coincides with the vertices of the adjacent lines. These processes need not be fully automatic; manual position adjustment by the user may be added.

Next, in S603, the detailed shape generation unit 301 generates polygons for the edges of the shape obtained in S602 by joining the contour lines of the individual design drawings. When generating the polygons, the lines connecting the vertices that form the contour lines are used as the polygon edges. As described for S403, the polygons are limited to quadrilaterals or polygons with fewer sides in order to preserve the shape resolution.

In the flowchart of FIG. 6, the detailed shape is generated using the contour lines of the design drawings, but the method of generating a detailed shape from contour lines is not limited to this. For example, the stadium can be captured from multiple viewpoints and the shape generated from the edges of the captured images. As in the case of the design drawings, each captured image is placed in the 3DCG space as the image plane of the viewpoint of the camera that captured it. Lines are then drawn in the 3DCG space based on the edges of the captured images. The edges of a captured image can be obtained by convolving a grayscale image with an edge detection filter commonly used in image processing, such as a Sobel filter, a Laplacian filter, or a Prewitt filter. Adjusting the filter coefficients so that weak edges are also detected yields the contour of the shape in a certain amount of detail, but it also makes the result strongly susceptible to noise in the image, so extremely fine edges have to be identified visually by the user. In practice, in CG modeling, there are many cases in which the user finds the edges of an image by eye and forms the contour of the shape without using an edge detection filter. Using these edges as contours, the shape of the stadium can be obtained by the procedures of S602 and S603.
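As an illustration of the edge detection step mentioned above, the following is a minimal sketch of applying the 3x3 Sobel kernels to a grayscale image using only the Python standard library. The function names, the nested-list image format, and the threshold parameter are assumptions made for this example; they are not taken from this disclosure. The kernels are applied without flipping (correlation), which gives the same gradient magnitude as a true convolution for the Sobel kernels.

```python
import math
from typing import List

Image = List[List[float]]  # assumed format: rows of grayscale intensities

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(gray: Image) -> Image:
    """Gradient magnitude per pixel; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = gray[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * p
                    gy += SOBEL_Y[j][i] * p
            out[y][x] = math.hypot(gx, gy)
    return out


def edge_mask(gray: Image, threshold: float) -> List[List[bool]]:
    """Mark pixels whose gradient magnitude reaches the threshold."""
    return [[m >= threshold for m in row] for row in sobel_magnitude(gray)]


# A 5x5 test image with a vertical step between columns 1 and 2.
step_image = [[0.0, 0.0, 10.0, 10.0, 10.0] for _ in range(5)]
mask = edge_mask(step_image, threshold=20.0)
```

Lowering `threshold` corresponds to detecting weaker edges, which, as the text notes, also lets more image noise through.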

The method of generating the detailed shape is also not limited to methods that use contour lines; other methods may be used. One such method is to acquire point cloud data of the stadium in advance with a laser scanner and convert the point cloud into polygons. Known methods for converting point cloud data into polygons include Poisson surface reconstruction and ball pivoting. After the point cloud has been converted into a polygon shape by one of these methods, smoothing the shape reduces noise and yields a smooth stadium shape. A commonly known smoothing method is to convolve a moving average filter or a Gaussian filter over the vertex coordinates of the polygon shape.
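The moving average smoothing of vertex coordinates can be sketched as below. This is only an illustrative sketch: it assumes the mesh connectivity is given as a hypothetical `neighbors` mapping from each vertex index to the indices of its adjacent vertices, and it replaces each vertex with the unweighted mean of itself and its neighbors.

```python
from typing import Dict, List, Sequence, Tuple

Vertex = Tuple[float, float, float]


def smooth_vertices(
    vertices: Sequence[Vertex],
    neighbors: Dict[int, List[int]],
    iterations: int = 1,
) -> List[Vertex]:
    """Moving average over each vertex and its adjacent vertices."""
    pts = [tuple(v) for v in vertices]
    for _ in range(iterations):
        new_pts = []
        for i, p in enumerate(pts):
            # Average the vertex with its neighbors from the previous pass.
            group = [p] + [pts[j] for j in neighbors[i]]
            n = len(group)
            new_pts.append(tuple(sum(q[k] for q in group) / n for k in range(3)))
        pts = new_pts
    return pts


# Three vertices in a chain; the middle one is a noisy spike in y.
chain = [(0.0, 0.0, 0.0), (1.0, 3.0, 0.0), (2.0, 0.0, 0.0)]
adjacency = {0: [1], 1: [0, 2], 2: [1]}
smoothed = smooth_vertices(chain, adjacency)
```

A Gaussian filter would follow the same pattern with distance-dependent weights instead of a uniform average.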

As described above, according to the present embodiment, lowering the resolution of regions where the degree of variation of the background shape is large, and correcting the shape of the occlusion areas of the background shape, makes it possible to generate a high-quality free-viewpoint video in which the background and other parts of the image do not break down even when the multi-view images are projected. In addition, according to the present embodiment, the resolution of regions with a small degree of shape variation, such as the roof, is made higher than the resolution of regions with a large degree of shape variation, such as the spectator seats. Lowering the resolution of regions with a large degree of shape variation, such as the spectator seats, smooths over the difference between the background shape generated in advance and the background shape after it has changed, for example when spectators are present, and thereby reduces the loss of image quality when the virtual viewpoint image is generated. In regions with a small degree of shape variation, such as the roof, the difference between the background shape generated in advance and the actual background shape is small, so a high-quality virtual viewpoint image can be generated by using a high-resolution background shape.
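The relationship between the degree of shape variation and the chosen resolution can be illustrated with a short sketch. The threshold, the variation values, and the polygon budgets below are hypothetical numbers chosen for the example; the text above only states that regions with larger variation are given lower resolution.

```python
from typing import Dict


def choose_resolution(
    variation: float, threshold: float, low_res: int, high_res: int
) -> int:
    """Return the low polygon budget when variation exceeds the threshold."""
    return low_res if variation > threshold else high_res


def plan_resolutions(
    variations: Dict[str, float],
    threshold: float = 0.5,   # hypothetical cut-off for "large" variation
    low_res: int = 100,       # hypothetical polygon budget, coarse region
    high_res: int = 10000,    # hypothetical polygon budget, detailed region
) -> Dict[str, int]:
    """Assign a polygon budget to each named background region."""
    return {
        name: choose_resolution(v, threshold, low_res, high_res)
        for name, v in variations.items()
    }


plan = plan_resolutions({"roof": 0.05, "spectator_seats": 0.8})
```

With these assumed values the roof keeps the detailed shape while the spectator seats are assigned the coarse one.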

The embodiment described above generates a stadium shape suitable for virtual viewpoint image generation by lowering the resolution of a detailed shape. As another embodiment, the stadium shape may be generated by gradually transforming a low-resolution shape into a higher-resolution shape.

The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The above-described embodiments are merely concrete examples of carrying out the present invention, and the technical scope of the present invention must not be interpreted restrictively on their basis. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

100: image processing apparatus, 101: CPU, 102: main memory, 301: detailed shape generation unit, 302: shape variation area specifying unit, 303: shape resolution reduction unit, 304: imaging area mapping unit, 305: occlusion area specifying unit, 306: shape correction unit

Claims

1. An image processing apparatus that generates a virtual viewpoint image from images captured by a plurality of imaging devices, comprising:
specifying means for specifying an area in which a person appears within a region serving as a background of the virtual viewpoint image; and
generating means for generating the virtual viewpoint image using a three-dimensional model having a higher resolution in a region of the background of the virtual viewpoint image in which no person appears than in the area specified by the specifying means.

2. The image processing apparatus according to claim 1, wherein the region of the background of the virtual viewpoint image in which no person appears is a region corresponding to a roof of a stadium that is the imaging target of the plurality of imaging devices.

3. The image processing apparatus according to claim 1, wherein the specifying means specifies an area corresponding to the spectator seats of a stadium that is the imaging target of the plurality of imaging devices.

4. The image processing apparatus according to claim 1, further comprising changing means for reducing the resolution of the three-dimensional model corresponding to the area specified by the specifying means,
wherein the generating means generates the virtual viewpoint image using the three-dimensional model whose resolution has been reduced by the changing means.

5. The image processing apparatus according to claim 1, wherein the specifying means specifies the area in which a person appears within the region serving as the background of the virtual viewpoint image based on a user input, detection of a person from the images captured by the plurality of imaging devices, or a color change in the images captured by the plurality of imaging devices.

6. An image processing apparatus that generates a virtual viewpoint image from images captured by a plurality of imaging devices, comprising:
first acquisition means for acquiring a degree of variation of a shape serving as a background;
determining means for determining a resolution of the background shape based on the degree of variation of the background shape; and
generating means for generating the virtual viewpoint image using the background shape having the resolution determined by the determining means.

7. The image processing apparatus according to claim 6, wherein the degree of variation of the shape is an index indicating the magnitude of change in the background shape over a sequence for which the virtual viewpoint image is generated.

8. The image processing apparatus according to claim 6, further comprising second acquisition means for acquiring the background shape,
wherein the second acquisition means uses, as an edge of the background shape, at least one of a contour line of a design drawing and an edge detected in the images captured by the plurality of imaging devices.

9. The image processing apparatus according to claim 8, wherein the second acquisition means acquires the background shape generated based on a point cloud acquired in advance.

10. The first acquisition means specifies a region having a large degree of variation of the shape based on a user input or a color change in the images captured by the plurality of imaging devices. The image processing apparatus according to any one of 9.

11. The image processing apparatus according to claim 6, wherein the determining means determines that the resolution of a region having a large degree of variation of the background shape is to be reduced.

12. The image processing apparatus according to claim 11, wherein the generating means generates the virtual viewpoint image using the background shape whose resolution has been reduced by approximating a region having a large degree of variation of the background shape with polygons.

13. The image processing apparatus according to claim 11, wherein the generating means subdivides, by resolution, a region having a large degree of variation of the background shape, and generates the virtual viewpoint image using the background shape whose resolution has been reduced preferentially starting from the low-resolution portions.

14. The generating means generates the virtual viewpoint image using a shape obtained by fitting a convex curved surface to a region having a large degree of variation of the background shape. The image processing apparatus described.

15. The image processing apparatus according to claim 6, further comprising correction means for correcting the shape of an area of the background shape that is not captured by the imaging devices,
wherein, when the processing for correcting the shape is repeated, the correction means uses a method that has not yet been used among a plurality of methods for correcting the shape.

16. The image processing apparatus according to claim 15, further comprising:
mapping means for mapping the images captured by the plurality of imaging devices onto the background shape; and
area specifying means for specifying, within the background shape, an occlusion area onto which no captured image is mapped,
wherein the correction means corrects the shape of the occlusion area.

17. An image processing method executed by an image processing apparatus that generates a virtual viewpoint image from images captured by a plurality of imaging devices, the method comprising:
a specifying step of specifying an area in which a person appears within a region serving as a background of the virtual viewpoint image; and
a generating step of generating the virtual viewpoint image using a three-dimensional model having a higher resolution in a region of the background of the virtual viewpoint image in which no person appears than in the area specified in the specifying step.

18. An image processing method executed by an image processing apparatus that generates a virtual viewpoint image from images captured by a plurality of imaging devices, the method comprising:
an acquisition step of acquiring a degree of variation of a shape serving as a background;
a determination step of determining a resolution of the background shape based on the degree of variation of the background shape; and
a generation step of generating the virtual viewpoint image using the background shape having the resolution determined in the determination step.

19. A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 16.