JP2018194979A

JP2018194979A - Three-dimensional information restoration method, restoration program and restoration apparatus

Info

Publication number: JP2018194979A
Application number: JP2017096826A
Authority: JP
Inventors: 昌平中潟; Shohei Nakagata; 智史島田; Tomohito Shimada
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-05-15
Filing date: 2017-05-15
Publication date: 2018-12-06

Abstract

To reduce the processing load or memory usage associated with three-dimensional information restoration.

SOLUTION: A three-dimensional information restoration method takes a first image, captured by a first camera and selected from among a plurality of images captured by a plurality of cameras at different shooting positions, and specifies, for each pixel of the first image, whether an object is present on the straight line in three-dimensional space connecting the optical center of the first camera and that pixel. The method sets a region including the specified pixels of the first image as a first range in which the object search is performed, and sets the rest of the first image as a second range. It sets a first interval as the search interval for pixels in the first range and a second interval, longer than the first interval, for pixels in the second range. It then matches blocks between the first image and a second image, which is one of the plurality of images not selected as the first image, in accordance with the set ranges and intervals, and estimates a position in the three-dimensional space for each pixel of the first image on the basis of the correlation between the matched blocks.

SELECTED DRAWING: Figure 2

Description

The present invention relates to a three-dimensional information restoration method, a restoration program, and a restoration device.

A technique called free viewpoint video is known. For example, three-dimensional information is restored from multi-viewpoint images captured by a plurality of cameras with different viewpoints. Using such three-dimensional information makes it possible to generate a virtual viewpoint image in which a three-dimensional object is observed from a virtual viewpoint where no camera actually exists.

JP 2012-69090 A; JP 2006-215939 A; JP 2000-215311 A; JP 2002-197443 A

However, restoring the above three-dimensional information may increase the processing load and memory usage. For example, when three-dimensional information is restored, one of the multi-viewpoint images is used as the base image and the others are used as reference images. For each pixel of the base image, a block of the base image, set on the straight line in three-dimensional space connecting that pixel and the optical center of the camera that captured the base image, is projected onto a reference image, whereby blocks are matched between the base image and the reference image. However, the more finely the interval for matching blocks between the base image and the reference image is set, the more the processing load and memory usage increase, which hinders high-accuracy restoration of the three-dimensional information.

In one aspect, an object of the present invention is to provide a three-dimensional information restoration method, a restoration program, and a restoration device that can reduce the processing load or memory usage accompanying the restoration of three-dimensional information.

In one aspect of the three-dimensional information restoration method, a computer executes a process of: acquiring a plurality of images captured by a plurality of cameras at different shooting positions; specifying, for a first image captured by a first camera and selected from among the plurality of images, whether an object exists on the straight line in three-dimensional space connecting the optical center of the first camera and each pixel of the first image; setting a region including the specified pixels of the first image as a first range in which the search for the object is executed on the straight line in the three-dimensional space, and setting the part of the first image other than the first range as a second range; setting, for pixels included in the first range, a first interval as the interval at which the search for the object is executed on the straight line in the three-dimensional space, and setting, for pixels included in the second range, a second interval longer than the first interval; matching blocks between the first image and a second image, which is one of the plurality of images not selected as the first image, in accordance with the set ranges and the set intervals; and estimating a position in the three-dimensional space for each pixel of the first image on the basis of the correlation between the matched blocks.

The processing load or memory usage accompanying the restoration of three-dimensional information can be reduced.

FIG. 1 is a diagram illustrating a configuration example of a three-dimensional information restoration system according to the first embodiment.
FIG. 2 is a block diagram illustrating a functional configuration of the server device according to the first embodiment.
FIG. 3 is a diagram illustrating an example of separation of the foreground and the background.
FIG. 4 is a diagram illustrating an example of a Visual Hull.
FIG. 5 is a diagram illustrating an example of a Visual Hull calculation process.
FIG. 6 is a diagram illustrating an example of a Visual Hull calculation process.
FIG. 7 is a diagram illustrating an example of setting an object search range.
FIG. 8 is a diagram illustrating an example of a search interval.
FIG. 9 is a diagram illustrating an example of the relationship between the search interval and the depth.
FIG. 10 is a diagram illustrating an example of an arrangement condition between cameras.
FIG. 11 is a diagram illustrating an example of block matching.
FIG. 12 is a diagram illustrating an example of a depth image.
FIG. 13 is a flowchart illustrating the procedure of the three-dimensional information restoration process according to the first embodiment.
FIG. 14 is a diagram illustrating an example of the proportion of the region to which each pixel of the base image belongs.
FIG. 15 is a diagram illustrating an example of the ratio of the average value of the search range of each pixel of the base image.
FIG. 16 is a diagram illustrating a hardware configuration example of a computer that executes the restoration program according to the first embodiment and the second embodiment.

Hereinafter, a three-dimensional information restoration method, restoration program, and restoration device according to the present application will be described with reference to the accompanying drawings. Note that these embodiments do not limit the disclosed technology. The embodiments can be combined as appropriate to the extent that their processing contents do not contradict each other.

[System configuration]
FIG. 1 is a diagram illustrating a configuration example of a three-dimensional information restoration system according to the first embodiment. In one aspect, the three-dimensional information restoration system 1 shown in FIG. 1 provides a restoration service that restores three-dimensional information from multi-viewpoint images captured by a plurality of cameras 30A to 30M having different viewpoints.

As illustrated in FIG. 1, the three-dimensional information restoration system 1 includes a server device 10 and a plurality of cameras 30A to 30M. Hereinafter, the cameras 30A to 30M may be referred to as “cameras 30”.

The server device 10 and the cameras 30 are connected via a predetermined network NW. The network NW can be built from any type of communication network, wired or wireless, such as the Internet, a LAN (Local Area Network), or a VPN (Virtual Private Network).

As described above, the three-dimensional information restoration system 1 shown in FIG. 1 illustrates the case where multi-viewpoint images are transmitted from the cameras 30 to the server device 10 via the network NW. This is merely one example of a transmission form, and communication between the server device 10 and the cameras 30 does not necessarily have to be bidirectional. For example, the multi-viewpoint images may be transmitted from the cameras 30 to the server device 10 via broadcast waves without going through the network NW.

The server device 10 is a computer that provides the restoration service described above. The server device 10 is an example of a restoration device.

In one embodiment, the server device 10 can be implemented by installing, on a desired computer, a restoration program that realizes the functions of the restoration service, distributed as packaged software or online software. For example, the server device 10 may be implemented as a Web server that provides the restoration service, or as a cloud that provides the restoration service by outsourcing.

The camera 30 is an imaging device equipped with an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor.

FIG. 1 shows, purely as an example, an arrangement of the cameras 30 for watching a soccer match. In this example, the three-dimensional objects contained in the three-dimensional space above the soccer field F are restored, such as the ball, the players, and match officials such as referees. In this case, the cameras 30 are arranged around the field F, facing its interior. The cameras 30 are installed so that their combined shooting ranges cover the entire field F, and each camera 30 is arranged so that part of its shooting range overlaps that of another camera 30. With this arrangement, the cameras 30 shoot in synchronization frame by frame, so that a plurality of images captured at the same timing from different viewpoints is obtained for each frame. Hereinafter, the plurality of images captured in the same frame by the cameras 30A to 30M at different shooting positions may be referred to as “multi-viewpoint images”, and the series of images captured in time series by one camera 30 may be referred to as a “moving image”.

[Configuration of Server Device 10]
Next, the functional configuration of the server device 10 according to the present embodiment will be described. FIG. 2 is a block diagram illustrating the functional configuration of the server device 10 according to the first embodiment. As illustrated in FIG. 2, the server device 10 includes a communication I/F (InterFace) unit 11, a storage unit 13, and a control unit 15. Note that FIG. 2 shows only an excerpt of the functional units of the server device 10 related to the restoration service; this does not preclude the server device 10 from having functional units other than those illustrated, for example functional units that an existing computer is equipped with by default or as an option. For example, when the multi-viewpoint images are propagated from the cameras 30 to the server device 10 via broadcast waves or satellite waves, the server device 10 may further include a receiver for broadcast waves or satellite waves.

The communication I/F unit 11 is an interface that controls communication with other devices.

In one embodiment, a network interface card such as a LAN card corresponds to the communication I/F unit 11. For example, the communication I/F unit 11 receives multi-viewpoint images from the cameras 30 and transmits instructions related to imaging control to the cameras 30, for example power ON/OFF as well as pan and tilt instructions.

The storage unit 13 is a storage device that stores data used by the various programs executed by the control unit 15, including the OS (Operating System) and the restoration program described above.

In one embodiment, the storage unit 13 is implemented as an auxiliary storage device of the server device 10. For example, an HDD (Hard Disk Drive), an optical disk, or an SSD (Solid State Drive) corresponds to the auxiliary storage device. In addition, a flash memory such as an EPROM (Erasable Programmable Read Only Memory) also corresponds to the auxiliary storage device.

As an example of data used by the programs executed by the control unit 15, the storage unit 13 stores camera parameters 13a, which include external parameters such as the position and orientation of each camera 30 and internal parameters such as the angle of view and lens distortion of each camera 30. Other electronic data can also be stored in addition to the parameters 13a. For example, the storage unit 13 can store time-series data of the multi-viewpoint images transmitted from the cameras 30.

The control unit 15 is a processing unit that performs overall control of the server device 10.

In one embodiment, the control unit 15 can be implemented by a hardware processor such as a CPU (Central Processing Unit) or an MPU (Micro Processing Unit). Here, a CPU and an MPU are given as examples of the processor, but the control unit 15 can be implemented by any processor, whether general-purpose or specialized. Alternatively, the control unit 15 may be realized by hard-wired logic such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

The control unit 15 virtually realizes the processing units described below by loading the restoration program onto the work area of a RAM, such as a DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory), implemented as a main storage device (not shown).

As shown in FIG. 2, the control unit 15 includes an acquisition unit 15a, a separation unit 15b, a first setting unit 15c, a second setting unit 15d, a matching unit 15e, and an estimation unit 15f.

The acquisition unit 15a is a processing unit that acquires multi-viewpoint images.

In one embodiment, the acquisition unit 15a can acquire the multi-viewpoint images transmitted from the cameras 30 frame by frame. The source from which the acquisition unit 15a acquires multi-viewpoint images may be arbitrary and is not limited to the cameras 30. For example, the acquisition unit 15a can acquire multi-viewpoint images by reading them from an auxiliary storage device that stores them, such as a hard disk or an optical disk, or from removable media such as a memory card or a USB (Universal Serial Bus) memory. The acquisition unit 15a can also acquire multi-viewpoint images by receiving them from an external device via the network NW.

The separation unit 15b is a processing unit that separates the foreground from the background. Here, “foreground” refers to objects such as moving bodies among the objects present in the three-dimensional space within the shooting range of the cameras 30, while “background” refers to objects that are not moving bodies.

In one embodiment, each time the acquisition unit 15a acquires a frame of the moving image for each camera 30, the separation unit 15b extracts the region corresponding to the foreground from the image of that frame for each camera 30. For example, the separation unit 15b extracts the region corresponding to the foreground by detecting differences in pixel value between the image of that frame and the image of a frame acquired earlier. FIG. 3 is a diagram illustrating an example of separation of the foreground and the background. FIG. 3 shows a series of images, arranged in time series, from the moving image captured by one of the cameras 30A to 30M. For example, the separation unit 15b detects pixels whose pixel value differs by a predetermined threshold or more between the image F0 of a frame in which a moving body is unlikely to be observed and the image FL of the latest frame. As the image F0, for example, the image of a frame for which no inter-frame difference was detected over a predetermined number of frames can be used. The pixels detected in this way can be extracted as blobs by performing labeling. The separation unit 15b then sets the pixel value of the pixels belonging to the blobs extracted by labeling to “1” and sets the pixel value of all other pixels to “0”. This generates a mask image ML in which the region corresponding to the foreground is expressed in “white” and the region corresponding to the background is expressed in “black”. Although FIG. 3 illustrates the case where a binarized image is generated as an example of the mask image, it goes without saying that the number of gradations is not limited to two.
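The separation steps above (differencing against F0, thresholding, labeling, binarizing) can be sketched as follows. This is a minimal illustration, assuming grayscale images stored as nested lists and 4-connected labeling; the function name `foreground_mask` and the connectivity choice are assumptions for illustration and are not taken from the publication.

```python
from collections import deque


def foreground_mask(background, frame, threshold):
    """Return (mask, blob_count); mask holds 1 for foreground and 0 for background."""
    height, width = len(frame), len(frame[0])
    # Step 1: flag pixels whose absolute difference from F0 reaches the threshold.
    changed = [[abs(frame[y][x] - background[y][x]) >= threshold
                for x in range(width)] for y in range(height)]
    # Step 2: label 4-connected groups of flagged pixels as blobs (assumed connectivity).
    labels = [[0] * width for _ in range(height)]
    blob_count = 0
    for y in range(height):
        for x in range(width):
            if not changed[y][x] or labels[y][x]:
                continue
            blob_count += 1
            labels[y][x] = blob_count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and changed[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = blob_count
                        queue.append((ny, nx))
    # Step 3: binarize so that every labeled pixel becomes 1 and the rest 0.
    mask = [[1 if labels[y][x] else 0 for x in range(width)] for y in range(height)]
    return mask, blob_count
```

Keeping the labels separate from the final mask leaves room for filtering blobs, for example by size, before binarizing; no such filter is described in the text, so none is applied here.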

The first setting unit 15c is a processing unit that specifies, for a first image captured by a first camera and selected from among the plurality of images, whether an object exists on the straight line in three-dimensional space connecting the optical center of the first camera and each pixel of the first image. The first setting unit 15c further sets a region including the specified pixels of the first image as a first range in which the object search is executed on the straight line in the three-dimensional space, and sets the region of the first image other than the first range as a second range.

For each pixel of the base image among the multi-viewpoint images, the first setting unit 15c sets the range in which the object search is executed on the straight line in three-dimensional space connecting that pixel and the optical center of the camera that captured the base image. Hereinafter, the range in which the object search is executed on the straight line in three-dimensional space may be referred to as the “search range”. The first setting unit 15c is also an example of a specifying unit.

Here, the base image is an image selected from among the multi-viewpoint images, and the images other than the base image are identified as reference images. The role of base image rotates: eventually every image of the multi-viewpoint images is selected as the base image, and block matching and depth map generation are carried out for it. Hereinafter, the camera 30 that captures the base image may be referred to as the “base camera 30α”, and a camera 30 that captures a reference image may be referred to as a “reference camera 30β”. The base camera is an example of the first camera, and the base image is an example of the first image.
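The rotation of the base-image role can be sketched as a simple loop in which every image index becomes the base once while the remaining indices serve as references. The helper name `base_reference_pairs` is hypothetical and only illustrates the bookkeeping.

```python
def base_reference_pairs(images):
    """Yield (base_index, reference_indices) so that every image is the base once."""
    for base_index in range(len(images)):
        references = [i for i in range(len(images)) if i != base_index]
        yield base_index, references
```

For three images this produces the pairs (0, [1, 2]), (1, [0, 2]) and (2, [0, 1]); block matching and depth map generation would then run once per pair.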

In one embodiment, the first setting unit 15c refers to the camera parameters 13a stored in the storage unit 13 and obtains the intersection of the cones formed by the foreground silhouette in the mask image of each camera 30 and the optical center of that camera 30, that is, the “Visual Hull”. The first setting unit 15c then uses the Visual Hull to narrow the object search range down to the range in which the object exists in the three-dimensional space. FIG. 4 is a diagram illustrating an example of a Visual Hull. FIG. 4 shows the case where the mask images 31A to 31C of three cameras 30, namely the cameras 30A to 30C, are used to calculate the Visual Hull. As shown in FIG. 4, the region E in three-dimensional space enclosed by the group of straight lines connecting the optical center of the camera 30A and the foreground silhouette SA on the mask image 31A, the group of straight lines connecting the optical center of the camera 30B and the foreground silhouette SB on the mask image 31B, and the group of straight lines connecting the optical center of the camera 30C and the foreground silhouette SC on the mask image 31C is obtained as the Visual Hull. Calculating the Visual Hull in this way narrows the existence range of the object in the three-dimensional space imaged by the cameras 30 down to the region E.

An example of an algorithm for calculating the Visual Hull is described below. FIG. 5 is a diagram illustrating an example of a Visual Hull calculation process. As illustrated in FIG. 5, the first setting unit 15c selects one of the cameras 30A to 30M as the base camera 30α. The cameras 30 that were not selected as the base camera 30α are hereinafter identified as reference cameras 30β. Next, the first setting unit 15c selects, from among the pixels of the mask image 31α of the base camera 30α, a pixel (u1, v1) that forms part of the foreground silhouette. The first setting unit 15c then selects one of the reference cameras 30β. The reference camera 30β selected in this way is hereinafter identified as the “reference camera 30β1”.

The first setting unit 15c then draws an epipolar line EL by projecting the straight line in three-dimensional space that passes through the optical center O of the base camera 30α and the pixel (u1, v1) onto the mask image 31β1 of the reference camera 30β1. The first setting unit 15c then determines the range over which the epipolar line EL exists on the mask image 31β1 of the reference camera 30β1. For example, the start position of the range corresponds to the point at which the optical center O of the base camera 30α is projected onto the mask image 31β1 of the reference camera 30β1. The end position of the range corresponds to the point on the mask image 31β1 of the reference camera 30β1 toward which the projection converges when the straight line from the optical center O of the base camera 30α through the pixel (u1, v1) is extended to infinity in the three-dimensional space.

After that, the first setting unit 15c calculates the range over which the epipolar line EL overlaps the foreground silhouette Sβ on the mask image 31β1 of the reference camera 30β1, that is, the intersection start position and the intersection end position. The first setting unit 15c then refers to the camera parameters 13a stored in the storage unit 13 and converts the range over which the epipolar line EL and the foreground silhouette Sβ overlap into depth information (z) measured from the optical center O of the base camera 30α.
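One way to picture the conversion from the silhouette overlap to a depth range is the discretized sketch below. It does not reproduce the procedure in the text, which finds the intersection start and end positions on the epipolar line and then converts them; instead it samples depths along the base-camera ray, projects each sample into the reference mask, and records the first and last depth that land inside the silhouette. The assumptions are all illustrative: an undistorted pinhole model, a base camera at the origin with identity rotation, shared intrinsics (`focal`, `center`), a reference pose given as `X_ref = rotation * X + translation`, and the function names themselves.

```python
def ray_point(pixel, depth, focal, center):
    """3D point at the given depth on the ray through a base-camera pixel."""
    u, v = pixel
    return ((u - center[0]) / focal * depth,
            (v - center[1]) / focal * depth,
            depth)


def project(point, rotation, translation, focal, center):
    """Project a 3D point into the reference camera; None if it is not in front."""
    x, y, z = (sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
               for i in range(3))
    if z <= 0:
        return None
    return (focal * x / z + center[0], focal * y / z + center[1])


def silhouette_depth_range(pixel, mask, rotation, translation, focal, center,
                           z_near, z_far, step):
    """First and last sampled depth whose projection falls inside the silhouette."""
    hits = []
    count = int((z_far - z_near) / step)
    for k in range(count + 1):
        depth = z_near + k * step
        projected = project(ray_point(pixel, depth, focal, center),
                            rotation, translation, focal, center)
        if projected is None:
            continue
        col, row = int(round(projected[0])), int(round(projected[1]))
        if 0 <= row < len(mask) and 0 <= col < len(mask[0]) and mask[row][col]:
            hits.append(depth)
    # Treats the overlap as one contiguous interval, which ignores gaps
    # that a non-convex silhouette could produce.
    return (hits[0], hits[-1]) if hits else None
```

Sampling in depth is simple but its accuracy depends on `step`; intersecting the epipolar line with the silhouette in image space, as the text describes, avoids that dependence.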

The derivation of the overlapping range between the epipolar line EL and the foreground silhouette Sβ is repeated in this way until all the reference cameras 30β have been selected. As a result, as shown in FIG. 6, an overlapping range between the epipolar line EL and the foreground silhouette Sβ is obtained for each reference camera 30β other than the base camera 30α. FIG. 6 is a diagram illustrating an example of the Visual Hull calculation process. In FIG. 6, the overlapping range between the epipolar line EL and the foreground silhouette Sβ for each of the four reference cameras 30β is expressed as depth information (z) from the optical center O of the base camera 30α. Specifically, the overlapping ranges of the four reference cameras 30β are indicated by a dotted line, a one-dot chain line, a broken line, and a two-dot chain line. For example, the first setting unit 15c narrows down the portion where the overlapping ranges of the reference cameras 30β coincide, that is, the portion indicated by the thick line in FIG. 6, as the existence range of the object at the pixel (u1, v1) on the mask image of the base camera 30α. The overlapping range need not be common to all the reference cameras 30β. For example, a portion where the overlapping ranges of a predetermined number of reference cameras 30β coincide can be narrowed down as the existence range of the object. A pixel (u1, v1) for which the existence range of the object has been successfully narrowed down in this way is classified into the Visual Hull region.
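The narrowing described above amounts to intersecting depth intervals obtained from the individual reference cameras. The following is a minimal sketch of that idea, not the implementation of the first setting unit 15c. The function name `common_depth_range`, the parameter `min_votes` (the "predetermined number" of reference cameras), and the sample depths are all hypothetical, and each camera's intervals are assumed to be disjoint.

```python
from typing import List, Tuple

# (z_start, z_end): depth measured from the optical center of the base camera
Interval = Tuple[float, float]


def common_depth_range(per_camera: List[List[Interval]], min_votes: int) -> List[Interval]:
    """Return the depth intervals covered by at least `min_votes` reference cameras.

    Assumes the intervals belonging to a single camera do not overlap each other.
    An empty result means the narrowing failed (non-Visual Hull pixel).
    """
    events = []
    for intervals in per_camera:
        for z_start, z_end in intervals:
            events.append((z_start, +1))  # one more camera covers this depth
            events.append((z_end, -1))    # one camera stops covering this depth
    # Sorting tuples puts -1 before +1 at equal depth, so touching intervals
    # are not reported as an overlap of zero length.
    events.sort()

    result: List[Interval] = []
    count = 0
    start = 0.0
    for z, delta in events:
        previous = count
        count += delta
        if previous < min_votes <= count:
            start = z                      # coverage just reached the threshold
        elif count < min_votes <= previous and z > start:
            result.append((start, z))      # coverage just dropped below it
    return result


# Hypothetical overlap ranges for four reference cameras, in meters.
ranges = [[(4.0, 9.0)], [(5.0, 10.0)], [(3.0, 8.0)], [(6.0, 12.0)]]
all_four = common_depth_range(ranges, min_votes=4)
any_three = common_depth_range(ranges, min_votes=3)
```

Requiring all four cameras corresponds to the thick-line portion in FIG. 6, while a smaller `min_votes` corresponds to the relaxed condition based on a predetermined number of reference cameras.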

Although FIG. 6 illustrates a case in which the overlapping ranges of the reference cameras 30β share a common portion, narrowing down the existence range of the object fails when there is no such common portion. A pixel for which the narrowing fails is classified into the non-Visual Hull region even if it is included in the foreground silhouette. For a pixel corresponding to the background, that is, a pixel not included in the silhouette, the Visual Hull calculation is not executed and the pixel is automatically classified into the non-Visual Hull region.

After the classification into the Visual Hull region or the non-Visual Hull region is completed, the first setting unit 15c sets an object search range for the pixel. FIG. 7 is a diagram illustrating an example of setting the object search range. FIG. 7 shows the search range of each pixel on the horizontal line H1 of the image 40. Search ranges set for the Visual Hull region are shown filled in black, and search ranges set for the non-Visual Hull region are shown filled with dots.

As shown in FIG. 7, for a pixel classified into the Visual Hull region, the first setting unit 15c sets the existence range of the foreground object as the search range. For a pixel classified into the non-Visual Hull region, the first setting unit 15c sets a wide search range for searching both the foreground and the background. As an example of this wide search range, zNear and zFar are set. For example, when the cameras 30 capture a soccer stadium, zFar is set to "60 m", as an example, based on the position farthest from the camera 30. zNear is set to "1 m", as an example, on the assumption that objects are at least a certain distance away from the camera 30. zNear and zFar are spaced far enough apart that the foreground can exist between them.

The second setting unit 15d is a processing unit that sets a first interval, as the interval at which the object search is executed on the straight line in the three-dimensional space, for pixels included in the first range, and sets a second interval longer than the first interval for pixels included in the second range.

That is, the second setting unit 15d sets the interval at which the object search is executed on the straight line in the three-dimensional space connecting the optical center of the base camera 30α and a pixel of the base image. Hereinafter, the interval at which the object search is executed on the straight line in the three-dimensional space may be referred to as the "search interval".

As one embodiment, the second setting unit 15d sets, for each pixel of the base image, a search interval N that differs depending on whether the pixel belongs to the Visual Hull region or the non-Visual Hull region. FIG. 8 is a diagram illustrating an example of the search interval. FIG. 8 shows a straight line in the three-dimensional space extending from the optical center of the base camera 30α through a pixel of the base image toward infinity. As shown in FIG. 8, a pixel of the base image classified into the Visual Hull region is given a finer search interval than a pixel classified into the non-Visual Hull region. Conversely, a pixel classified into the non-Visual Hull region is given a coarser (longer) search interval than a pixel classified into the Visual Hull region. In other words, the second setting unit 15d sets the search interval so as to raise the density at which the object search is executed on the straight line in the three-dimensional space for pixels in the Visual Hull region, and so as to lower that density for pixels in the non-Visual Hull region. In the example shown in FIG. 8, the search density in the Visual Hull region is set to about four times that in the non-Visual Hull region. Furthermore, the scale of the base image's epipolar line in the reference image becomes coarser with increasing distance from the optical center of the base camera 30α. For this reason, so that the search positions are spaced at approximately equal intervals on the epipolar line, the search interval is set more densely nearer the optical center of the base camera 30α and more sparsely farther from it.

Here, the search interval is related to the number of gradations N + 1 used to discretize the depth z along the straight line in the three-dimensional space into the depth d. The following description assumes, purely as an example, that the number of scale marks of the search interval equals the number of depth gradations. FIG. 9 is a diagram illustrating an example of the relationship between the search interval and the depth. As shown in FIG. 9, the search interval is set more densely nearer the optical center of the base camera 30α and more sparsely farther from it. The depth gradations, on the other hand, are set at equal intervals on the straight line in the three-dimensional space. For example, the depth z along the straight line in the three-dimensional space can be converted into the depth d according to the following equation (1).
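Equation (1) itself is not reproduced in this text. Purely to illustrate a discretization with the property described above (search positions dense near the base camera and sparse far from it), the sketch below uses sampling that is uniform in inverse depth 1/z, which is a common choice in multi-view stereo. This is an assumption and may differ from equation (1). The function names and the convention that d = 0 corresponds to zNear are hypothetical.

```python
def depth_index_to_z(d: int, n: int, z_near: float, z_far: float) -> float:
    """Map a depth index d in [0, n] to a depth z, uniformly in 1/z (assumed scheme)."""
    inv_near, inv_far = 1.0 / z_near, 1.0 / z_far
    return 1.0 / (inv_near + (d / n) * (inv_far - inv_near))


def z_to_depth_index(z: float, n: int, z_near: float, z_far: float) -> int:
    """Inverse mapping: quantize a depth z to the nearest index d in [0, n]."""
    inv_near, inv_far = 1.0 / z_near, 1.0 / z_far
    d = round(n * (1.0 / z - inv_near) / (inv_far - inv_near))
    return max(0, min(n, d))


# zNear, zFar and N below are the example values given later in the description.
Z_NEAR, Z_FAR, N = 3.0, 24.0, 1080
first_step = depth_index_to_z(1, N, Z_NEAR, Z_FAR) - depth_index_to_z(0, N, Z_NEAR, Z_FAR)
last_step = depth_index_to_z(N, N, Z_NEAR, Z_FAR) - depth_index_to_z(N - 1, N, Z_NEAR, Z_FAR)
```

Under this assumed mapping, `first_step` is much smaller than `last_step`, which is the "dense near, sparse far" behavior the description attributes to the search interval.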

Under this relationship between the search interval and the depth, N can be determined, as an example, according to the following criterion. For example, N can be determined according to how many pixels the shift in pixel position on the epipolar line of the reference image is to be kept within when the depth d changes by 1.

In the following, therefore, the description assumes, as an example, a design in which each N is set so that the shift in pixel position, taking the larger of the horizontal and vertical shifts, is within "1" pixel in the non-Visual Hull region and within "0.25" pixels in the Visual Hull region. The shift is limited to 1 pixel in the non-Visual Hull region because a search interval of about 1 pixel on the reference image maintains a minimum level of depth accuracy. In the Visual Hull region, which is the foreground, higher depth accuracy is desired, so the search interval is made finer by reducing it to 1/4.

The camera conditions are assumed to be as follows: the resolution of the camera 30 is 1920 × 1080 pixels and the horizontal angle of view is 66 degrees. The conditions between the cameras are assumed to be as follows. FIG. 10 is a diagram illustrating an example of the arrangement conditions between cameras. As shown in FIG. 10, the distance between the base camera 30α and the reference camera 30β is 3.71 m. The intersection C of the optical axis of the base camera 30α and the optical axis of the reference camera 30β is 11.33 m from the optical center of the base camera 30α and 11.93 m from the optical center of the reference camera 30β.

Further, when zNear is 3.0 m and zFar is 24.0 m, the search interval N for the Visual Hull region and for the non-Visual Hull region is as follows: N is 4320 for the Visual Hull region and 1080 for the non-Visual Hull region. With these settings, the shift in pixel position can be kept within "1" pixel in the non-Visual Hull region and within "0.25" pixels in the Visual Hull region.

The matching unit 15e is a processing unit that matches blocks between the base image and the reference image.

As one embodiment, for each pixel of the base image, the matching unit 15e places a block of the base image on the straight line in the three-dimensional space connecting the optical center of the base camera 30α and the pixel of the base image, at every search interval set by the second setting unit 15d, starting from the start position of the search range set by the first setting unit 15c. Each time a block of the base image is placed, the matching unit 15e projects the block onto each reference image. In this way, a block can be placed at the position where the block of the base image is observed on the reference image. For example, when block matching is performed according to the search range shown in FIG. 7 and the search interval shown in FIG. 8, the block matching shown in FIG. 11 can be realized. FIG. 11 is a diagram illustrating an example of block matching. As shown in FIG. 11, when the pixel of the base image 40α belongs to the non-Visual Hull region, the block of the base image 40α is matched along the epipolar line indicated by the one-dot chain line in the reference image 40β. When the pixel of the base image 40α belongs to the Visual Hull region, the block of the base image 40α is matched along the epipolar line indicated by the two-dot chain line in the reference image 40β. As this comparison shows, in the Visual Hull region the processing load and memory usage can be reduced by restricting block matching to the existence range of the object. Moreover, because block matching is executed more densely within the existence range of the object than in the non-Visual Hull region, the three-dimensional information can also be expected to be restored with high accuracy.

A correlation, or so-called similarity, is calculated between the block of the base image and the block of the reference image matched in this way. For example, the SAD (Sum of Absolute Differences) can be calculated between the block of the base image and the block of the reference image. With the SAD, a lower value means a higher correlation between the block of the base image and the block of the reference image, so the SAD value may hereinafter be referred to as the "cost". For each pixel of the base image, the minimum of the costs calculated for the individual reference images can be used as the representative cost of the pixel, or the average of the costs calculated for the individual reference images can be used as the representative value. In this way, a cost cost(x, y, d) for each depth d can be calculated for each pixel of the base image. Although the SAD is given here as an example of the correlation, other correlations can also be calculated. For example, a correlation coefficient can be calculated between the block of the base image and the block of the reference image. In that case, a larger correlation coefficient means a higher correlation between the block of the base image and the block of the reference image.
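The SAD cost and the two ways of choosing a representative value described above can be sketched as follows. This is an illustrative example only; the function names `sad` and `representative_cost`, the `mode` parameter, and the 2 × 2 sample blocks are hypothetical, and the blocks are assumed to have already been projected and sampled at the same size.

```python
from typing import List, Sequence

Block = Sequence[Sequence[int]]  # rows of pixel intensities


def sad(block_a: Block, block_b: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def representative_cost(base_block: Block, ref_blocks: List[Block], mode: str = "min") -> float:
    """Combine the per-reference-image costs into one cost for the pixel and depth."""
    costs = [sad(base_block, ref_block) for ref_block in ref_blocks]
    if mode == "min":
        return float(min(costs))
    if mode == "mean":
        return sum(costs) / len(costs)
    raise ValueError("mode must be 'min' or 'mean'")


base = [[10, 20], [30, 40]]
refs = [
    [[12, 18], [30, 45]],  # SAD = 2 + 2 + 0 + 5 = 9
    [[10, 20], [30, 41]],  # SAD = 0 + 0 + 0 + 1 = 1
]
cost_min = representative_cost(base, refs, mode="min")    # 1.0
cost_mean = representative_cost(base, refs, mode="mean")  # 5.0
```

Evaluating `representative_cost` for every search position of a pixel yields the values that the description denotes cost(x, y, d).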

The estimation unit 15f is a processing unit that estimates, for each pixel of the base image, the position of the pixel in the three-dimensional space.

As one embodiment, the estimation unit 15f generates a depth image d*(x, y) based on the cost cost(x, y, d) for each depth d calculated by the matching unit 15e for each pixel of the base image. FIG. 12 is a diagram illustrating an example of the depth image. As shown in FIG. 12, the estimation unit 15f formulates the problem of selecting the optimal depth for each pixel from the per-depth costs cost(x, y, d) of the pixels of the base image 40α as an energy minimization problem. The estimation unit 15f then generates the depth image 50 according to an algorithm that solves the energy minimization problem, for example, an algorithm that minimizes the energy in terms of both the local solution and the consistency between adjacent pixels.
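The description does not name a specific energy minimization algorithm. As one illustration of balancing the local cost against the consistency between adjacent pixels, the sketch below minimizes, along a single scanline, the sum of cost(x, d) plus a penalty proportional to the depth difference between neighboring pixels, using dynamic programming. The one-dimensional restriction, the linear penalty, and the names `scanline_depths` and `smooth` are assumptions made for this example.

```python
from typing import List


def scanline_depths(cost: List[List[float]], smooth: float) -> List[int]:
    """Pick a depth index per pixel on one scanline.

    Minimizes sum_x cost[x][d_x] + smooth * sum_x |d_x - d_(x-1)| exactly
    by dynamic programming (Viterbi-style forward pass and backtracking).
    """
    n_depth = len(cost[0])
    best = list(cost[0])            # best[d]: minimal energy ending at depth d
    back: List[List[int]] = []      # back[x-1][d]: best predecessor depth
    for x in range(1, len(cost)):
        new_best, pointers = [], []
        for d in range(n_depth):
            prev_d = min(range(n_depth), key=lambda p: best[p] + smooth * abs(p - d))
            new_best.append(cost[x][d] + best[prev_d] + smooth * abs(prev_d - d))
            pointers.append(prev_d)
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest final state.
    d = min(range(n_depth), key=lambda k: best[k])
    path = [d]
    for pointers in reversed(back):
        d = pointers[d]
        path.append(d)
    path.reverse()
    return path


# Three pixels, three depth candidates (hypothetical costs).
costs = [[0, 10, 10], [6, 5, 10], [0, 10, 10]]
local_only = scanline_depths(costs, smooth=0.0)   # per-pixel minimum
with_smooth = scanline_depths(costs, smooth=2.0)  # neighbors pulled together
```

With `smooth=0.0` the middle pixel takes its locally cheapest depth, whereas with `smooth=2.0` the slightly more expensive depth that agrees with both neighbors gives the lower total energy.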

[Process flow]
FIG. 13 is a flowchart illustrating the procedure of the three-dimensional information restoration process according to the first embodiment. This process can be executed in real time each time a frame of the multi-viewpoint images is acquired, or as a batch process when the frames of the multi-viewpoint images have already been acquired as a moving image.

As shown in FIG. 13, when a frame of the multi-viewpoint images is acquired by the acquisition unit 15a (step S101), the separation unit 15b, for each camera 30, extracts the region corresponding to the foreground by detecting differences in pixel values between the image of that frame and the image of a previously acquired frame, thereby separating the foreground from the background (step S102). Separating the foreground from the background in this way generates a mask image for each camera 30.

Then, the first setting unit 15c selects one of the multi-viewpoint images as the base image (step S103). The other images are thereby identified as reference images. Subsequently, the first setting unit 15c selects one of the pixels of the base image selected in step S103 (step S104).

Next, using the mask images generated for the individual cameras 30 in step S102, the first setting unit 15c obtains, for each reference image, the overlapping range between the foreground silhouette Sβ and the epipolar line EL of the base image pixel selected in step S104, and narrows down the portion where these overlapping ranges coincide as the existence range of the object (step S105).

If the existence range of the object is successfully narrowed down (Yes in step S106), the pixel of the base image selected in step S104 is classified into the Visual Hull region. In this case, the first setting unit 15c sets the existence range of the foreground object as the search range of the base image pixel selected in step S104 (step S107). Then, the second setting unit 15d sets, as the search interval of the base image pixel selected in step S104, a first interval shorter than the second interval described below (step S108).

If, on the other hand, the existence range of the object is not successfully narrowed down (No in step S106), the pixel of the base image selected in step S104 is classified into the non-Visual Hull region. In this case, the first setting unit 15c sets the wide search range, that is, the start position zNear and the end position zFar, as the search range of the base image pixel selected in step S104 (step S109). Then, the second setting unit 15d sets, as the search interval of the base image pixel selected in step S104, the second interval longer than the first interval (step S110).
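The branch in steps S106 to S110 can be summarized as a small per-pixel configuration routine. The sketch below is illustrative only. The names `SearchSetting` and `configure_pixel` are hypothetical, the default values are the example figures given earlier in the description (zNear 3.0 m, zFar 24.0 m, N of 4320 and 1080), and interpreting N as the number of gradations over the whole span from zNear to zFar is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SearchSetting:
    region: str      # "visual_hull" or "non_visual_hull"
    z_start: float   # start of the search range along the ray
    z_end: float     # end of the search range along the ray
    n: int           # search interval parameter N (assumed: gradations over zNear..zFar)


def configure_pixel(
    object_range: Optional[Tuple[float, float]],
    z_near: float = 3.0,
    z_far: float = 24.0,
    n_fine: int = 4320,
    n_coarse: int = 1080,
) -> SearchSetting:
    """Choose search range and interval for one base-image pixel (steps S106 to S110)."""
    if object_range is not None:
        # Yes in S106: narrowing succeeded, so search only where the object can be
        # (S107) and use the shorter first interval (S108).
        z_start, z_end = object_range
        return SearchSetting("visual_hull", z_start, z_end, n_fine)
    # No in S106: search the wide range (S109) with the longer second interval (S110).
    return SearchSetting("non_visual_hull", z_near, z_far, n_coarse)


foreground = configure_pixel((10.0, 11.5))
background = configure_pixel(None)
```

Here `object_range` stands for the result of step S105, with `None` representing a pixel for which the narrowing failed or which lies outside the silhouette.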

Thereafter, the matching unit 15e matches blocks between each reference image and the base image according to the search range and the search interval, and calculates the correlation between the matched blocks (step S111).

The processing from step S104 to step S111 is repeated until all the pixels of the base image have been selected (No in step S112). When all the pixels of the base image have been selected (Yes in step S112), the estimation unit 15f generates the depth image d*(x, y) based on the cost cost(x, y, d) for each depth d calculated by the matching unit 15e for each pixel of the base image (step S113).

The processing from step S103 to step S113 is repeated until every one of the multi-viewpoint images has been selected as the base image (No in step S114). When every one of the multi-viewpoint images has been selected as the base image (Yes in step S114), the process ends.

[One aspect of the effects]
As described above, the server device 10 according to the present embodiment uses the Visual Hull to narrow down, for each pixel of the base image among the multi-viewpoint images, the range in which the object is searched for on the straight line in the three-dimensional space connecting the optical center of the base camera and the pixel of the base image, and sets a fine block matching interval. Therefore, the server device 10 according to the present embodiment can reduce the processing load or the memory usage associated with restoring the three-dimensional information.

[Comparison of the increase in memory usage]
FIG. 14 is a diagram illustrating an example of the proportion of the regions to which the pixels of the base image belong. FIG. 15 is a diagram illustrating an example of the ratio of the average search ranges of the pixels of the base image. As shown in FIG. 14, assume that the Visual Hull region accounts for 7% of the base image and the non-Visual Hull region accounts for 93%. Further, as shown in FIG. 15, assume that "(average search depth range of a pixel in the Visual Hull region) / (search depth range of a pixel in the non-Visual Hull region)", that is, the average ratio of the black fill to the dot fill shown in FIG. 7, is 1.3%. This means that a pixel in the Visual Hull region needs only 1.3% of the memory needed by a pixel in the non-Visual Hull region.

In this case, the relative amounts of memory used for the cost calculation in each region are as follows. The non-Visual Hull region is 0.93 × 100 = 93. The Visual Hull region is 0.07 × 1.3 = 0.091. The total is 93 + 0.091 = 93.091.

Further, when the search interval of only the Visual Hull region is reduced to 1/4, only the memory for the Visual Hull region quadruples, becoming 0.091 × 4 = 0.364. The total is then 93 + 0.364 = 93.364. The increase in memory is therefore approximately 0.29%, from 93.364 / 93.091.
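The arithmetic above can be reproduced with a few lines of code. The sketch below uses only the figures stated in this section (7% Visual Hull pixels, a 1.3% search range ratio, and a fourfold search density); the function name `memory_units` and its parameters are hypothetical.

```python
def memory_units(vh_fraction: float, vh_range_percent: float, vh_density: float = 1.0) -> float:
    """Relative cost-volume memory, where a full-range non-Visual Hull pixel counts as 100."""
    non_vh = (1.0 - vh_fraction) * 100.0                 # 0.93 x 100
    vh = vh_fraction * vh_range_percent * vh_density     # 0.07 x 1.3 (x 4 when refined)
    return non_vh + vh


baseline = memory_units(0.07, 1.3)                   # about 93.091
refined = memory_units(0.07, 1.3, vh_density=4.0)    # about 93.364
increase_percent = (refined / baseline - 1.0) * 100.0
print(f"{baseline:.3f} -> {refined:.3f} (+{increase_percent:.2f}%)")
```

The printed increase rounds to 0.29%, matching the figure stated above.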

Although an embodiment of the disclosed device has been described above, the present invention may be implemented in various forms other than the embodiment described above. Other embodiments included in the present invention are therefore described below.

[Stand-alone]
In the first embodiment described above, an implementation as a client-server system was illustrated by way of the server device 10 that provides the restoration service, but a stand-alone implementation is also possible. In that case, the restoration program may be installed on a computer that operates stand-alone.

[Changing the search interval]
In the first embodiment described above, an example was described in which the search interval is set according to whether or not the pixel of the base image belongs to the Visual Hull region, but different search intervals can also be set according to further criteria. For example, when the pixel of the base image belongs to the Visual Hull region, a different search interval can be set according to the importance of the pixel. As an example, the importance can be set according to whether or not the pixel of the base image corresponds to an important region detected from the base image, such as a face, which is important for display, or a uniform number, which is important for tracking. For example, when the pixel of the base image corresponds to a face or a uniform number, it is given a higher importance than when it does not. When the pixel of the base image belongs to the Visual Hull region, the second setting unit 15d can then set a finer search interval the higher the importance of the pixel.

[Distribution and integration]
The components of each illustrated device do not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to that illustrated, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. For example, the acquisition unit 15a, the separation unit 15b, the first setting unit 15c, the second setting unit 15d, the matching unit 15e, or the estimation unit 15f may be connected via a network as an external device of the server device 10. Alternatively, separate devices may each have the acquisition unit 15a, the separation unit 15b, the first setting unit 15c, the second setting unit 15d, the matching unit 15e, or the estimation unit 15f, and may be connected via a network and cooperate to realize the functions of the server device 10 described above.

[Restoration program]
The various processes described in the above embodiments can be realized by executing a program prepared in advance on a computer such as a personal computer or a workstation. An example of a computer that executes a restoration program having the same functions as those of the above embodiments is therefore described below with reference to FIG. 16.

FIG. 16 is a diagram illustrating an example of the hardware configuration of a computer that executes the restoration program according to the first and second embodiments. As shown in FIG. 16, the computer 100 includes an operation unit 110a, a speaker 110b, a camera 110c, a display 120, and a communication unit 130. The computer 100 further includes a CPU 150, a ROM 160, an HDD 170, and a RAM 180. These units 110 to 180 are connected via a bus 140.

As shown in FIG. 16, the HDD 170 stores a restoration program 170a that provides the same functions as the acquisition unit 15a, the separation unit 15b, the first setting unit 15c, the second setting unit 15d, the matching unit 15e, and the estimation unit 15f described in the first embodiment. Like the components shown in FIG. 2, namely the acquisition unit 15a, the separation unit 15b, the first setting unit 15c, the second setting unit 15d, the matching unit 15e, and the estimation unit 15f, the restoration program 170a may be integrated or separated. That is, the HDD 170 does not necessarily have to store all the data described in the first embodiment; it suffices that the data used for processing is stored in the HDD 170.

In this environment, the CPU 150 reads the restoration program 170a from the HDD 170 and loads it into the RAM 180. As a result, the restoration program 170a functions as a restoration process 180a, as shown in FIG. 16. The restoration process 180a loads various data read from the HDD 170 into the area of the RAM 180 allocated to the restoration process 180a, and executes various processes using the loaded data. For example, the processes executed by the restoration process 180a include the process shown in FIG. 13. Note that not all the processing units described in the first embodiment have to operate on the CPU 150; it suffices that the processing units corresponding to the processes to be executed are virtually realized.

Note that the restoration program 170a need not be stored in the HDD 170 or the ROM 160 from the beginning. For example, the restoration program 170a may be stored on a "portable physical medium" inserted into the computer 100, such as a flexible disk (a so-called FD), a CD-ROM, a DVD disc, a magneto-optical disk, or an IC card. The computer 100 may then acquire the restoration program 170a from such a portable physical medium and execute it. Alternatively, the restoration program 170a may be stored in another computer or a server device connected to the computer 100 via a public line, the Internet, a LAN, a WAN, or the like, and the computer 100 may acquire the restoration program 170a from there and execute it.

DESCRIPTION OF SYMBOLS
1 Restoration system
10 Server apparatus
11 Communication I/F unit
13 Storage unit
13a Camera parameters
15 Control unit
15a Acquisition unit
15b Separation unit
15c First setting unit
15d Second setting unit
15e Matching unit
15f Estimation unit

Claims

A three-dimensional information restoration method in which a computer executes processing comprising:
acquiring a plurality of images captured by a plurality of cameras having different shooting positions;
identifying, on a first image that is captured by a first camera and selected from the plurality of images, whether an object exists on a straight line in a three-dimensional space connecting the optical center of the first camera and each pixel of the first image;
setting a region including the identified pixels of the first image as a first range in which a search for the object is executed on the straight line in the three-dimensional space, and setting the region of the first image other than the first range as a second range;
setting, for the pixels included in the first range, a first interval as the interval at which the search for the object is executed on the straight line in the three-dimensional space, and setting, for the pixels included in the second range, a second interval longer than the first interval;
matching blocks between the first image and a second image, among the plurality of images, that is not selected as the first image, in accordance with the set ranges and the set intervals; and
estimating a position in the three-dimensional space for each pixel of the first image based on a correlation between the matched blocks.
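
The interval-setting and estimation steps recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the shared `near`/`far` depth bounds, and the `score` callable standing in for the block-matching correlation are all assumptions introduced here for illustration.

```python
import math

def depth_candidates(near, far, interval):
    """Depths sampled along one pixel's viewing ray at a fixed interval.

    near/far are hypothetical depth bounds shared by all pixels (an assumption).
    """
    if interval <= 0 or far < near:
        raise ValueError("interval must be positive and far >= near")
    steps = math.floor((far - near) / interval + 1e-9)
    return [near + i * interval for i in range(steps + 1)]

def plan_search(object_mask, near, far, first_interval, second_interval):
    """Assign each pixel its list of depth candidates.

    Pixels flagged in object_mask (the first range) are sampled at the
    finer first interval; all other pixels (the second range) are sampled
    at the longer second interval.
    """
    if second_interval <= first_interval:
        raise ValueError("second interval must be longer than the first")
    return [
        [depth_candidates(near, far, first_interval if inside else second_interval)
         for inside in row]
        for row in object_mask
    ]

def best_depth(candidates, score):
    """Pick the candidate depth with the highest matching score.

    score is a hypothetical callable standing in for the correlation
    between matched blocks at a given depth.
    """
    return max(candidates, key=score)
```

With a one-row mask such as `[[True, False]]`, bounds of 1.0 and 2.0, and intervals of 0.25 and 0.5, the first pixel receives five depth candidates and the second receives three, so the number of block-matching evaluations falls for pixels outside the first range.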

The three-dimensional information restoration method according to claim 1, wherein the process of setting the first interval sets a finer first interval as the importance of the pixel of the first image for which the range was successfully identified is higher.

The three-dimensional information restoration method according to claim 1, wherein the identifying process uses Visual Hull to identify a range in which the object exists.

A three-dimensional information restoration program that causes a computer to execute processing comprising:
acquiring a plurality of images captured by a plurality of cameras having different shooting positions;
identifying, on a first image that is captured by a first camera and selected from the plurality of images, whether an object exists on a straight line in a three-dimensional space connecting the optical center of the first camera and each pixel of the first image;
setting a region including the identified pixels of the first image as a first range in which a search for the object is executed on the straight line in the three-dimensional space, and setting the region of the first image other than the first range as a second range;
setting, for the pixels included in the first range, a first interval as the interval at which the search for the object is executed on the straight line in the three-dimensional space, and setting, for the pixels included in the second range, a second interval longer than the first interval;
matching blocks between the first image and a second image, among the plurality of images, that is not selected as the first image, in accordance with the set ranges and the set intervals; and
estimating a position in the three-dimensional space for each pixel of the first image based on a correlation between the matched blocks.

A three-dimensional information restoration apparatus comprising:
an acquisition unit that acquires a plurality of images captured by a plurality of cameras having different shooting positions;
an identification unit that identifies, on a first image that is captured by a first camera and selected from the plurality of images, whether an object exists on a straight line in a three-dimensional space connecting the optical center of the first camera and each pixel of the first image;
a first setting unit that sets a region including the identified pixels of the first image as a first range in which a search for the object is executed on the straight line in the three-dimensional space, and sets the region of the first image other than the first range as a second range;
a second setting unit that sets, for the pixels included in the first range, a first interval as the interval at which the search for the object is executed on the straight line in the three-dimensional space, and sets, for the pixels included in the second range, a second interval longer than the first interval;
a matching unit that matches blocks between the first image and a second image, among the plurality of images, that is not selected as the first image, in accordance with the set ranges and the set intervals; and
an estimation unit that estimates a position in the three-dimensional space for each pixel of the first image based on a correlation between the matched blocks.