JP2021077120A - Image processing system, controller and control method therefor, image transmission method and program

Info

Publication number: JP2021077120A
Application number: JP2019203506A
Authority: JP
Inventors: Hiroyasu Ito (伊藤 博康)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-11-08
Filing date: 2019-11-08
Publication date: 2021-05-20

Abstract

PROBLEM TO BE SOLVED: To provide an image processing system, having a plurality of imaging apparatuses and a plurality of image processing apparatuses, which performs load distribution and transmits images captured at the same timing to the same image processing apparatus.

SOLUTION: An image processing system (100) includes a plurality of camera adapters (120), which correspond to a plurality of cameras (112) and acquire a plurality of images captured by the plurality of cameras (112), and a plurality of front-end servers (131), which acquire the plurality of images transmitted from the plurality of camera adapters and execute image processing. Each of the plurality of camera adapters (120) transmits an image to a first front-end server (131a) included in the plurality of front-end servers based on the imaging timing information given to the image being first imaging timing information, and transmits the image to a second front-end server (131b) included in the plurality of front-end servers based on the imaging timing information given to the image being second imaging timing information.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing system, a control device and a control method therefor, an image transmission method, and a program.

Recently, attention has been drawn to techniques in which a plurality of cameras are installed at different positions to perform synchronous imaging from multiple viewpoints, and virtual viewpoint content is generated using the plurality of images obtained by that imaging. With a technique for generating virtual viewpoint content from a plurality of images, highlight scenes of soccer or basketball, for example, can be viewed from various angles. The user can therefore be given a greater sense of presence than with ordinary images.

Generation and viewing of virtual viewpoint content based on a plurality of images is realized by aggregating the images captured by a plurality of cameras in an image processing unit such as a server, performing processing such as three-dimensional model generation and rendering in that image processing unit, and transmitting the result to a user terminal. However, performing processing such as three-dimensional model generation and rendering on images captured by a plurality of cameras requires high performance from the image processing unit such as a server. It is therefore conceivable to perform load distribution using a plurality of servers.

Load distribution requires a technique for appropriately distributing a plurality of images to a plurality of servers. Patent Document 1 describes distributing data accesses to load balancers according to the result of calculating a hash value. Patent Document 2 describes performing load distribution by taking the current time into account in a parameter used to determine the server to which data is transmitted.

Japanese Unexamined Patent Publication No. 2003-131961
Japanese Unexamined Patent Publication No. 2014-098975

In generating virtual images, it is desirable to aggregate the plurality of captured images obtained at the same timing by synchronous imaging in a single image processing unit. However, the load distribution methods described in Patent Documents 1 and 2 cannot distribute images in such a way that a plurality of captured images obtained at the same timing are collected in one image processing unit.

According to one aspect of the present invention, there is provided a technique that, in an image processing system including a plurality of imaging devices and a plurality of image processing devices, makes it possible to perform load distribution and to transmit captured images obtained at the same timing to the same image processing device.

An image processing system according to one aspect of the present invention includes:
a plurality of control devices that correspond to a plurality of imaging devices and acquire a plurality of images captured by the plurality of imaging devices; and
a plurality of image processing devices that acquire the plurality of images transmitted from the plurality of control devices,
wherein the plurality of control devices transmit an image to a first image processing device included in the plurality of image processing devices based on the imaging timing information given to the image being first imaging timing information, and transmit an image to a second image processing device included in the plurality of image processing devices based on the imaging timing information given to the image being second imaging timing information.
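The routing rule summarized above can be sketched as follows. This is a minimal illustration, not the method claimed in the publication: the use of a frame number as the imaging timing information, the modulo rule, and the names `Image` and `select_front_end` are all assumptions introduced here. The point it shows is that when the destination depends only on the imaging timing, every control device independently picks the same server for images captured at the same timing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    camera_id: str     # which camera captured the image
    frame_number: int  # assumed stand-in for the imaging timing information


def select_front_end(image: Image, servers: list[str]) -> str:
    """Choose a destination from the imaging timing only (hypothetical modulo rule)."""
    return servers[image.frame_number % len(servers)]


servers = ["131a", "131b"]
images = [Image(cam, frame) for frame in range(4) for cam in ("112a", "112b", "112c")]

# Collect, per frame, the set of servers chosen across all cameras.
routes: dict[int, set[str]] = {}
for img in images:
    routes.setdefault(img.frame_number, set()).add(select_front_end(img, servers))
```

Because the camera identifier never enters the decision, each frame maps to exactly one server, while consecutive frames alternate between servers and so spread the load.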

According to the present invention, in an image processing system including a plurality of imaging devices and a plurality of image processing devices, it is possible to perform load distribution and to transmit captured images obtained at the same timing to the same image processing device.

A block diagram showing a configuration example of the image processing system.
A block diagram showing a functional configuration example of the camera adapter.
A block diagram showing a functional configuration example of the front-end server.
A block diagram showing a functional configuration example of the data input control unit of the front-end server.
A flowchart explaining the overall workflow.
A flowchart explaining the workflow before equipment installation.
A flowchart explaining the workflow at the time of equipment installation.
A sequence diagram explaining the installation-time calibration process.
A sequence diagram explaining the imaging start process.
A sequence diagram explaining the imaging start process.
A sequence diagram explaining the process of generating three-dimensional model information.
A diagram showing the data format output by the camera adapter.
A flowchart explaining the operation of the camera adapter of the first embodiment.
A diagram explaining address setting by the camera adapter of the first embodiment.
A diagram explaining address setting by the camera adapter of the first embodiment.
A flowchart explaining the file generation process.
A block diagram showing a hardware configuration example of the camera adapter.
A flowchart explaining the operation of the camera adapter of the second embodiment.
A diagram explaining address setting by the camera adapter of the second embodiment.

<First Embodiment>
A system that performs imaging and sound collection using a plurality of cameras and microphones installed in a facility such as a stadium or a concert hall will be described with reference to the system configuration diagram of FIG. 1. The image processing system 100 of the first embodiment includes sensor systems 110a to 110z, an image computing server 130, a controller 140, a switching hub 180, and an end user terminal 150. The image processing system 100 generates a virtual viewpoint image based on images obtained from a plurality of cameras (sensor systems 110a to 110z).

The controller 140 has a control station 141 and a virtual camera operation UI 142. The control station 141 manages the operating state of, and controls parameter settings for, each block constituting the image processing system 100 through networks 190a to 190d, 180a, 180b, and 170a to 170y. Here, the networks may be GbE (Gigabit Ethernet) or 10GbE, which are Ethernet (registered trademark) variants conforming to IEEE standards, or may be configured by combining InfiniBand interconnects, industrial Ethernet, and the like. The networks are not limited to these and may be of other types.

Next, the operation of transmitting the 26 sets of images and audio acquired by the sensor systems 110a to 110z from the sensor system 110z to the image computing server 130 will be described. In the image processing system 100 of the first embodiment, the sensor systems 110a to 110z are connected in a daisy chain.

In the first embodiment, unless otherwise specified, the 26 sets of systems from the sensor system 110a to the sensor system 110z are referred to as the sensor system 110 without distinction. Similarly, unless otherwise specified, the devices in each sensor system 110 are not distinguished and are referred to as a microphone 111, a camera 112, a pan head 113, an external sensor 114, and a camera adapter 120. Although the case of 26 sensor systems is described, this is merely an example, and the number of sensor systems is not limited to it. Further, in the first embodiment, unless otherwise noted, the term "image" is used as including the concepts of both moving images and still images. That is, the image processing system 100 of the first embodiment can process both still images and moving images.

Further, the first embodiment is described mainly using an example in which the virtual viewpoint content provided by the image processing system 100 includes a virtual viewpoint image and virtual viewpoint audio, but the present invention is not limited to this. For example, the virtual viewpoint content need not include audio. As another example, the audio included in the virtual viewpoint content may be the audio picked up by the microphone closest to the virtual viewpoint. In the first embodiment, the description of audio is partially omitted to simplify the explanation, but images and audio are basically processed together.

The image processing system 100 has a plurality of cameras for imaging a subject from a plurality of directions. In the present embodiment, the sensor systems 110a to 110z each have one of the cameras 112a to 112z as an imaging device. One sensor system 110 may be connected to a plurality of cameras 112. The plurality of sensor systems 110 are connected to one another in a daisy chain. This connection form has the effect of reducing the number of connection cables and saving labor in wiring work as image data volumes grow with the move to higher resolutions such as 4K and 8K and to higher frame rates. The connection form of the plurality of sensor systems is not limited to this. For example, a star network configuration may be used in which each of the sensor systems 110a to 110z is connected to the switching hub 180 and data is transmitted and received between the sensor systems 110 via the switching hub 180.

Further, FIG. 1 shows a configuration in which all of the sensor systems 110a to 110z are cascade-connected so as to form a daisy chain, but the configuration is not limited to this. For example, the plurality of sensor systems 110 may be divided into several groups, and the sensor systems 110 may be daisy-chained within each group. In this case, the camera adapter 120 at the end of each group may be connected to the switching hub to input images to the image computing server 130. Such a configuration is particularly effective in a stadium. For example, a stadium may have a plurality of floors with sensor systems 110 deployed on each floor. In this case, the sensor systems can be grouped by floor, or by each half of the stadium's circumference, for input to the image computing server 130. As a result, installation can be simplified and the system made more flexible even in places where wiring all the sensor systems 110 into a single daisy chain is difficult.
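The per-floor grouping described above can be illustrated with a short sketch. The floor assignments and the helper names `group_by_floor` and `chain_terminals` are hypothetical and exist only for this example; the sketch assumes that the last sensor system listed in each group is the one whose camera adapter connects to the switching hub.

```python
def group_by_floor(sensor_floor: dict[str, int]) -> dict[int, list[str]]:
    """Split sensor systems into one daisy chain per floor, keeping listing order."""
    groups: dict[int, list[str]] = {}
    for sensor, floor in sensor_floor.items():
        groups.setdefault(floor, []).append(sensor)
    return groups


def chain_terminals(groups: dict[int, list[str]]) -> dict[int, str]:
    """Return, per group, the sensor system at the end of the chain (assumed hub uplink)."""
    return {floor: chain[-1] for floor, chain in groups.items()}


# Hypothetical installation: two sensor systems on each of two floors.
sensor_floor = {"110a": 1, "110b": 1, "110c": 2, "110d": 2}
groups = group_by_floor(sensor_floor)
terminals = chain_terminals(groups)
```

With this layout, only one cable per floor has to reach the switching hub, rather than a single chain spanning both floors.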

Further, control of image processing in the image computing server 130 is switched depending on whether the number of daisy-chained camera adapters 120 that input images to the image computing server 130 is one or is two or more. That is, control is switched depending on whether or not the sensor systems 110 are divided into a plurality of groups. When only one camera adapter 120 inputs images to the image computing server 130, the image of the entire circumference of the stadium is generated while images are transmitted over a single daisy-chained group. Therefore, the timing at which the image data for the entire circumference becomes available in the image computing server 130 is synchronized. In other words, if the sensor systems 110 are not divided into groups, synchronization is achieved.

However, when a plurality of camera adapters 120 input images to the image computing server 130 (that is, when the sensor systems 110 are divided into groups), the delay time may differ depending on the daisy chain lane (path) of each group. Therefore, the image computing server 130 needs to perform the subsequent image processing while checking that the image data has been collected, using synchronization control that waits until the image data for the entire circumference is available.
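The synchronization control described above, waiting until the image data from every camera for a given timing has arrived before passing it on, can be sketched as below. The class name `FrameAssembler`, the use of a frame number as the key, and the in-memory dictionary are assumptions made for illustration and do not come from the publication.

```python
class FrameAssembler:
    """Buffers images per frame until every expected camera has delivered one."""

    def __init__(self, camera_ids):
        self.expected = frozenset(camera_ids)
        self.pending = {}  # frame_number -> {camera_id: data}

    def add(self, frame_number, camera_id, data):
        """Store one image; return the full set for the frame once complete, else None."""
        frame = self.pending.setdefault(frame_number, {})
        frame[camera_id] = data
        if self.expected.issubset(frame):  # iterating a dict yields its keys
            return self.pending.pop(frame_number)
        return None


assembler = FrameAssembler(["112a", "112b"])
first = assembler.add(0, "112a", b"a0")   # frame 0 is still incomplete
other = assembler.add(1, "112b", b"b1")   # a faster lane delivers frame 1 early
done = assembler.add(0, "112b", b"b0")    # frame 0 is now complete
```

Images arriving early over a faster lane simply wait in `pending`, so later processing only ever sees complete sets for a single timing.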

Next, the sensor system 110 will be described. In the first embodiment, the sensor system 110 includes a microphone 111, a camera 112, a pan head 113, an external sensor 114, and a camera adapter 120. The configuration is not limited to this; it suffices for the sensor system to have at least one camera adapter 120 and either one camera 112 or one microphone 111. For example, the sensor system 110 may consist of one camera adapter 120 and a plurality of cameras 112, or of one camera 112 and a plurality of camera adapters 120. That is, the plurality of cameras 112 and the plurality of camera adapters 120 in the image processing system 100 correspond N to M (where N and M are both integers of 1 or more).

Further, the sensor system 110 may include devices other than the microphone 111, the camera 112, the pan head 113, and the camera adapter 120. The camera 112 and the camera adapter 120 may also be configured as a single unit. Furthermore, the front-end servers 131a to 131b may have at least some of the functions of the camera adapter 120. In the first embodiment, the sensor systems 110a to 110z are assumed to have the same configuration, but they are not limited to all having the same configuration, and the individual sensor systems 110 may have different configurations.

The audio picked up by the microphone 111a and the image captured by the camera 112a undergo the image processing described later in the camera adapter 120a, and are then transmitted through the daisy chain 170a to the camera adapter 120b of the sensor system 110b. Similarly, the sensor system 110b transmits its own picked-up audio and captured image to the sensor system 110c together with the image and audio acquired from the sensor system 110a. By continuing this operation, the images and audio acquired by the sensor systems 110a to 110z are conveyed from the sensor system 110z to the switching hub 180 via the network 180b, and are then transmitted to the image computing server 130.
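The hop-by-hop forwarding described above can be modeled in a few lines. This is only a toy model under stated assumptions: the functions `relay` and `run_chain` and the payload representation (a list of identifier and byte-string pairs) are invented for the example, and real transmission would be packetized rather than accumulated in a list.

```python
def relay(upstream: list[tuple[str, bytes]], own_id: str, own_data: bytes) -> list[tuple[str, bytes]]:
    """Forward everything received from upstream together with this node's own data."""
    return upstream + [(own_id, own_data)]


def run_chain(sensor_ids: list[str]) -> list[tuple[str, bytes]]:
    """Pass data along the chain; the return value is what the last node emits."""
    payload: list[tuple[str, bytes]] = []
    for sensor_id in sensor_ids:
        payload = relay(payload, sensor_id, f"image+audio from {sensor_id}".encode())
    return payload


sensor_ids = [f"110{letter}" for letter in "abcdefghijklmnopqrstuvwxyz"]
output_of_last_node = run_chain(sensor_ids)
```

The last node in the chain ends up emitting all 26 sets of data, which is why only the sensor system 110z needs a link toward the switching hub in the single-chain configuration.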

In the present embodiment, the camera 112 and the camera adapter 120 are separate, but they may instead be integrated in the same housing. In that case, the microphone 111 may be built into the integrated camera 112 or may be connected externally to the camera 112.

Next, the configuration and operation of the image computing server 130 will be described. The image computing server 130 of the present embodiment processes the data acquired from the sensor system 110z. The image computing server 130 includes front-end servers 131a to 131b, a database 132 (hereinafter also referred to as the DB), a back-end server 133, and a time server 134. In the first embodiment, unless otherwise specified, the front-end servers 131a to 131b are referred to as the front-end server 131 without distinction. Although two front-end servers 131 are described, this is merely an example, and the number is not limited to two. That is, there may be three or more front-end servers 131.

The time server 134 has a function of distributing the time and a synchronization signal, and distributes them to the sensor systems 110a to 110z via the switching hub 180. The camera adapters 120a to 120z that have received the time and the synchronization signal genlock the cameras 112a to 112z based on them to perform image frame synchronization. That is, the time server 134 synchronizes the imaging timings of the plurality of cameras 112. As a result, the image processing system 100 can generate a virtual viewpoint image based on a plurality of captured images obtained at the same timing (synchronously captured), which suppresses degradation of virtual viewpoint image quality caused by deviations in imaging timing. In the present embodiment, the time server 134 manages the time synchronization of the plurality of cameras 112, but this is not a limitation; each camera 112 or each camera adapter 120 may perform the processing for time synchronization independently.

The front-end server 131 is an example of an image processing device that processes a plurality of images captured by the plurality of cameras 112. The image processing system 100 has a plurality of front-end servers. In the present embodiment, each of the plurality of front-end servers 131 reconstructs the segmented transmission packets of images and audio acquired from the sensor system 110z and converts the data format. The front-end server 131 writes the images and audio thus obtained to the database 132 according to the camera identifier, data type, and frame number. Details of the front-end server 131 will be described later. Further, in the present embodiment, among the images obtained by synchronous imaging with the plurality of cameras 112, images captured at the same timing are delivered to the same front-end server 131 among the plurality of front-end servers 131. This configuration improves the efficiency of image processing in the front-end server 131. The back-end server 133 accepts the designation of a viewpoint from the virtual camera operation UI 142, reads the corresponding image and audio data from the database 132 based on the accepted viewpoint, and performs rendering processing to generate a virtual viewpoint image.
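The keyed write to the database described above can be pictured with an in-memory stand-in. The class `FrameStore`, its method names, and the tuple key layout are assumptions for illustration only; the publication does not specify the schema of the database 132 in this passage.

```python
class FrameStore:
    """In-memory stand-in for a store keyed by camera identifier, data type, and frame number."""

    def __init__(self):
        self._rows = {}

    def write(self, camera_id: str, data_type: str, frame_number: int, payload: bytes) -> None:
        """Record one item under its (camera, type, frame) key, as a front-end server would."""
        self._rows[(camera_id, data_type, frame_number)] = payload

    def read_frame(self, data_type: str, frame_number: int) -> dict[str, bytes]:
        """Return all items of one type for one frame, keyed by camera, as a renderer might need."""
        return {
            cam: payload
            for (cam, dtype, frame), payload in self._rows.items()
            if dtype == data_type and frame == frame_number
        }


store = FrameStore()
store.write("112a", "image", 0, b"a0")
store.write("112b", "image", 0, b"b0")
store.write("112a", "audio", 0, b"s0")
store.write("112a", "image", 1, b"a1")
frame0_images = store.read_frame("image", 0)
```

Keying on the frame number is what lets a later stage fetch every camera's view of the same instant in a single lookup.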

The configuration of the image computing server 130 is not limited to this. For example, at least two of the front-end server 131, the database 132, and the back-end server 133 may be configured as a single unit. A plurality of at least one of the front-end server 131, the database 132, and the back-end server 133 may be included. Devices other than those described above may be included at any position in the image computing server 130. Furthermore, the end user terminal 150 or the virtual camera operation UI 142 may have at least some of the functions of the image computing server 130.

The rendered image is transmitted from the back-end server 133 to the end user terminal 150. In this way, a user operating the end user terminal 150 can view images and listen to audio according to the designated viewpoint. That is, the back-end server 133 generates virtual viewpoint content based on the captured images (multi-viewpoint images) obtained by the plurality of cameras 112 and on viewpoint information. More specifically, the back-end server 133 generates virtual viewpoint content based on, for example, image data of a predetermined region extracted by the plurality of camera adapters 120 from the images captured by the plurality of cameras 112, and on a viewpoint designated by user operation. The back-end server 133 then provides the generated virtual viewpoint content to the end user terminal 150. Details of the extraction of the predetermined region by the camera adapter 120 will be described later. The virtual viewpoint content in the present embodiment is content that includes a virtual viewpoint image, that is, an image that would be obtained if the subject were imaged from a virtual viewpoint. In other words, the virtual viewpoint image can be said to be an image representing the view from the designated viewpoint. The virtual viewpoint may be designated by the user, or may be designated automatically based on the result of image analysis or the like. That is, virtual viewpoint images include arbitrary viewpoint images (free viewpoint images) corresponding to a viewpoint arbitrarily designated by the user. Images corresponding to a viewpoint the user selects from a plurality of candidates, and images corresponding to a viewpoint automatically designated by a device, are also included in virtual viewpoint images.

The present embodiment is described mainly using an example in which the virtual viewpoint content includes audio data, but audio data need not necessarily be included. The back-end server 133 may compress and encode the virtual viewpoint image using a standard technique such as H.264 or HEVC and then transmit it to the end user terminal 150 using the MPEG-DASH protocol. Alternatively, the virtual viewpoint image may be transmitted to the end user terminal 150 uncompressed. The former case, with compression encoding, assumes a smartphone or tablet as the end user terminal 150; the latter assumes a display capable of showing uncompressed images. That is, the image format can be switched according to the type of the end user terminal 150. The image transmission protocol is not limited to MPEG-DASH; for example, HLS (HTTP Live Streaming) or another transmission method may be used.

As described above, the image processing system 100 has three functional domains: a video collection domain, a data storage domain, and a video generation domain. The video collection domain includes the sensor systems 110a to 110z. The data storage domain includes the database 132, the front-end servers 131a to 131b, and the back-end server 133. The video generation domain includes the virtual camera operation UI 142 and the end user terminal 150. The configuration is not limited to this; for example, the virtual camera operation UI 142 could acquire images directly from the sensor systems 110a to 110z. In the present embodiment, however, the data storage function is placed in between rather than acquiring images directly from the sensor systems 110a to 110z. Specifically, the front-end servers 131a to 131b convert the image data and audio data generated by the sensor systems 110a to 110z, together with the meta information of that data, into the common schema and data types of the database 132. As a result, even if the camera 112 of a sensor system 110 is changed to a camera of another model, the front-end servers 131a to 131b can absorb the resulting differences and register the data in the database 132. This reduces the risk that the virtual camera operation UI 142 fails to operate properly when the camera 112 is changed to another model.

また、仮想カメラ操作ＵＩ１４２は、データベース１３２に直接アクセスするのではなく、バックエンドサーバ１３３を介してアクセスする構成である。バックエンドサーバ１３３で画像生成処理に係わる共通処理を行い、操作ＵＩに係わるアプリケーションの差分部分を仮想カメラ操作ＵＩ１４２で行っている。このことにより、仮想カメラ操作ＵＩ１４２の開発において、ＵＩ操作デバイスや、生成したい仮想視点画像を操作するＵＩの機能要求に対する開発に注力する事ができる。また、バックエンドサーバ１３３は、仮想カメラ操作ＵＩ１４２の要求に応じて画像生成処理に係わる共通処理を追加又は削除する事も可能である。このことによって仮想カメラ操作ＵＩ１４２の要求に柔軟に対応する事ができる。 Further, the virtual camera operation UI 142 is configured to access the database 132 via the back-end server 133 instead of directly accessing the database 132. The back-end server 133 performs common processing related to the image generation processing, and the virtual camera operation UI 142 performs the difference portion of the application related to the operation UI. As a result, in the development of the virtual camera operation UI 142, it is possible to focus on the development of the UI function requirements for the UI operation device and the UI for operating the virtual viewpoint image to be generated. Further, the back-end server 133 can add or delete common processing related to the image generation processing in response to the request of the virtual camera operation UI 142. This makes it possible to flexibly respond to the demands of the virtual camera operation UI 142.

このように、画像処理システム１００においては、被写体を複数の方向から撮像するための複数のカメラ１１２による撮像に基づく画像データに基づいて、バックエンドサーバ１３３により仮想視点画像が生成される。なお、本実施形態における画像処理システム１００は、上記で説明した物理的な構成に限定される訳ではなく、論理的に構成されていてもよい。 As described above, in the image processing system 100, the back-end server 133 generates a virtual viewpoint image based on the image data based on the images taken by the plurality of cameras 112 for capturing the subject from a plurality of directions. The image processing system 100 in the present embodiment is not limited to the physical configuration described above, and may be logically configured.

次に、図１に記載のシステムにおける各ノード（カメラアダプタ１２０、フロントエンドサーバ１３１、データベース１３２、バックエンドサーバ１３３、仮想カメラ操作ＵＩ１４２、エンドユーザ端末１５０）の機能構成を説明する。 Next, the functional configuration of each node (camera adapter 120, front-end server 131, database 132, back-end server 133, virtual camera operation UI 142, end-user terminal 150) in the system shown in FIG. 1 will be described.

図２は、カメラアダプタ１２０の機能構成例を示すブロック図である。第１実施形態におけるカメラアダプタ１２０の機能構成について図２を参照して説明する。カメラアダプタ１２０は、ネットワークアダプタ２１０、伝送部２２０、画像処理部２３０及び、外部機器制御部２４０を有する。 FIG. 2 is a block diagram showing a functional configuration example of the camera adapter 120. The functional configuration of the camera adapter 120 according to the first embodiment will be described with reference to FIG. The camera adapter 120 includes a network adapter 210, a transmission unit 220, an image processing unit 230, and an external device control unit 240.

ネットワークアダプタ２１０は、データ送受信部２１１及び時刻制御部２１２を備える。 The network adapter 210 includes a data transmission / reception unit 211 and a time control unit 212.

データ送受信部２１１は、デイジーチェーン１７０、ネットワーク１３５、及びネットワーク１９０ａを介して他のカメラアダプタ１２０、フロントエンドサーバ１３１ａ、１３１ｂ、タイムサーバ１３４、及び制御ステーション１４１とデータ通信を行う。例えばデータ送受信部２１１は、カメラ１１２による撮像画像から前景背景分離部２３１により分離された前景画像と背景画像とを、別のカメラアダプタ１２０に対して出力する。出力先のカメラアダプタ１２０は、例えば、画像処理システム１００内のカメラアダプタ１２０のうち、データルーティング処理部２２２の処理に応じて予め定められた順序において次のカメラアダプタ１２０である。各々のカメラアダプタ１２０が前景画像と背景画像とを出力することで、複数の視点から撮像された前景画像と背景画像に基づいて仮想視点画像が生成される。なお、撮像画像から分離した前景画像を出力して背景画像は出力しないカメラアダプタ１２０が存在してもよい。 The data transmission/reception unit 211 performs data communication with other camera adapters 120, the front-end servers 131a and 131b, the time server 134, and the control station 141 via the daisy chain 170, the network 135, and the network 190a. For example, the data transmission/reception unit 211 outputs, to another camera adapter 120, the foreground image and the background image that the foreground background separation unit 231 separated from the image captured by the camera 112. The destination camera adapter 120 is, for example, the camera adapter 120 that comes next in an order predetermined among the camera adapters 120 in the image processing system 100 according to the processing of the data routing processing unit 222. Because each camera adapter 120 outputs a foreground image and a background image, a virtual viewpoint image is generated based on foreground images and background images captured from a plurality of viewpoints. Note that there may be a camera adapter 120 that outputs the foreground image separated from the captured image but does not output the background image.

時刻制御部２１２は、例えばＩＥＥＥ１５８８規格のＯｒｄｉｎａｙＣｌｏｃｋに準拠し、タイムサーバ１３４との間で送受信したデータのタイムスタンプを保存する機能と、タイムサーバ１３４と時刻同期を行う。なお、ＩＥＥＥ１５８８に限定する訳ではなく、他のＥｔｈｅｒＡＶＢ規格や、独自プロトコルによってタイムサーバとの時刻同期を実現してもよい。本実施形態では、ネットワークアダプタ２１０としてＮＩＣ（ＮｅｔｗｏｒｋＩｎｔｅｒｆａｃｅＣａｒｄ）を利用するが、ＮＩＣに限定するものではなく、同様の他のＩｎｔｅｒｆａｃｅを利用してもよい。また、ＩＥＥＥ１５８８はＩＥＥＥ１５８８−２００２、ＩＥＥＥ１５８８−２００８のように標準規格として更新されており、後者については、ＰＴＰｖ２（ＰｒｅｃｉｓｉｏｎＴｉｍｅＰｒｏｔｏｃｏｌＶｅｒｓｉｏｎ２）とも呼ばれる。 The time control unit 212 conforms to, for example, the Ordinary Clock of the IEEE 1588 standard, has a function of storing time stamps of data transmitted to and received from the time server 134, and performs time synchronization with the time server 134. Time synchronization with the time server is not limited to IEEE 1588 and may be realized by another standard such as EtherAVB or by a proprietary protocol. In the present embodiment, a NIC (Network Interface Card) is used as the network adapter 210, but the network adapter is not limited to a NIC, and another similar interface may be used. IEEE 1588 has been revised as a standard, as in IEEE 1588-2002 and IEEE 1588-2008, and the latter is also called PTPv2 (Precision Time Protocol Version 2).

伝送部２２０は、ネットワークアダプタ２１０を介してスイッチングハブ１８０等に対するデータの伝送を制御する機能を有している。伝送部２２０が備える機能部について、以下に説明する。 The transmission unit 220 has a function of controlling the transmission of data to the switching hub 180 and the like via the network adapter 210. The functional unit included in the transmission unit 220 will be described below.

データ圧縮・伸張部２２１は、データ送受信部２１１を介して送受信されるデータに対して所定の圧縮方式、圧縮率、及びフレームレートを適用した圧縮を行う機能と、圧縮されたデータを伸張する機能を有している。 The data compression/decompression unit 221 has a function of compressing data transmitted and received via the data transmission/reception unit 211 by applying a predetermined compression method, compression rate, and frame rate, and a function of decompressing compressed data.

データルーティング処理部２２２は、後述するデータルーティング情報保持部２２５が保持するデータを利用し、データ送受信部２１１が受信したデータ及び画像処理部２３０で処理されたデータのルーティング先を決定する。さらに、データルーティング処理部２２２は、決定したルーティング先へデータを送信する機能を有している。ルーティング先としては、同一の注視点にフォーカスされたカメラ１１２に対応するカメラアダプタ１２０とするのが、画像処理を行う上で好適である。同一の注視点にフォーカスされたカメラ１１２同士の画像フレーム相関が高いためである。複数のカメラアダプタ１２０それぞれのデータルーティング処理部２２２による決定に応じて、画像処理システム１００内において前景画像や背景画像をリレー形式で出力するカメラアダプタ１２０の順序が定まる。 The data routing processing unit 222 uses the data held by the data routing information holding unit 225, which will be described later, to determine the routing destination of the data received by the data transmitting / receiving unit 211 and the data processed by the image processing unit 230. Further, the data routing processing unit 222 has a function of transmitting data to the determined routing destination. As the routing destination, it is preferable to use a camera adapter 120 corresponding to the camera 112 focused on the same gazing point in order to perform image processing. This is because the image frame correlation between the cameras 112 focused on the same gazing point is high. The order of the camera adapters 120 that output the foreground image and the background image in the relay format in the image processing system 100 is determined according to the determination by the data routing processing unit 222 of each of the plurality of camera adapters 120.
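
The preference for relaying between adapters whose cameras share a gazing point can be sketched as follows. This is a hypothetical illustration only: the function names, the use of sorted adapter identifiers as the relay order, and the choice to hand data to a front-end server at the end of each chain are assumptions, not details given in the text.

```python
from typing import Dict, List


def build_relay_order(gaze_point_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Group adapter IDs by gazing point; each group forms one relay chain."""
    groups: Dict[str, List[str]] = {}
    for adapter_id in sorted(gaze_point_of):
        groups.setdefault(gaze_point_of[adapter_id], []).append(adapter_id)
    return groups


def next_hop(adapter_id: str, gaze_point_of: Dict[str, str],
             sink: str = "front_end_server") -> str:
    """Return the next adapter in the same gazing-point chain, or the sink."""
    chain = build_relay_order(gaze_point_of)[gaze_point_of[adapter_id]]
    index = chain.index(adapter_id)
    return chain[index + 1] if index + 1 < len(chain) else sink
```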

時刻同期制御部２２３は、ＩＥＥＥ１５８８規格のＰＴＰ（ＰｒｅｃｉｓｉｏｎＴｉｍｅＰｒｏｔｏｃｏｌ）に準拠し、タイムサーバ１３４と時刻同期に係わる処理を行う機能を有している。なお、ＰＴＰに限定するのではなく他の同様のプロトコルを利用して時刻同期してもよい。 The time synchronization control unit 223 conforms to PTP (Precision Time Protocol) of the IEEE1588 standard, and has a function of performing processing related to time synchronization with the time server 134. The time may be synchronized by using another similar protocol instead of limiting to PTP.
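
For reference, the clock offset in the PTP delay request-response exchange is commonly computed from four timestamps under the assumption of a symmetric path delay. The sketch below shows that arithmetic only; the function name is hypothetical, and it does not represent how the time synchronization control unit 223 is actually implemented.

```python
from typing import Tuple


def ptp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Estimate slave clock offset and one-way path delay.

    t1: master sends Sync (master clock)
    t2: slave receives Sync (slave clock)
    t3: slave sends Delay_Req (slave clock)
    t4: master receives Delay_Req (master clock)
    Assumes the path delay is the same in both directions.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay
```

For example, with a slave clock running 5 units ahead and a path delay of 2 units, the timestamps (100, 107, 110, 107) yield an offset of 5 and a delay of 2.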

画像・音声伝送処理部２２４は、画像データ又は音声データを、データ送受信部２１１を介して他のカメラアダプタ１２０またはフロントエンドサーバ１３１へ転送するためのメッセージを作成する機能を有している。メッセージには画像データ又は音声データ、及び各データのメタ情報が含まる。本実施形態のメタ情報には画像の撮像または音声のサンプリングをした時のタイムコードまたはシーケンス番号、データ種別、及びカメラ１１２やマイク１１１の個体を示す識別子などが含まれる。なお、送信される画像データまたは音声データは、データ圧縮・伸張部２２１でデータ圧縮されていてもよい。また、画像・音声伝送処理部２２４は、他のカメラアダプタ１２０からデータ送受信部２１１を介してメッセージを受取る。そして、画像・音声伝送処理部２２４は、メッセージに含まれるデータ種別に応じて、伝送プロトコルで規定されたパケットサイズにフラグメントされたデータ情報を画像データまたは音声データに復元する。なお、データを復元した際にデータが圧縮されている場合は、データ圧縮・伸張部２２１が伸張処理を行う。 The image / audio transmission processing unit 224 has a function of creating a message for transferring image data or audio data to another camera adapter 120 or a front-end server 131 via the data transmission / reception unit 211. The message includes image data or audio data, and meta information of each data. The meta information of the present embodiment includes a time code or sequence number at the time of image capturing or sound sampling, a data type, and an identifier indicating an individual of the camera 112 or the microphone 111. The transmitted image data or audio data may be data-compressed by the data compression / decompression unit 221. Further, the image / audio transmission processing unit 224 receives a message from another camera adapter 120 via the data transmission / reception unit 211. Then, the image / audio transmission processing unit 224 restores the data information fragmented to the packet size defined by the transmission protocol into the image data or the audio data according to the data type included in the message. If the data is compressed when the data is restored, the data compression / decompression unit 221 performs the decompression process.
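
The message handling described here, in which data is split to a packet size and later restored, can be sketched as below. The `Fragment` fields mirror the meta information listed in the text (time code, data type, source identifier); the index and total fields, the function names, and the error handling are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    timecode: str    # time code at capture or sampling
    data_type: str   # e.g. "foreground", "background", "audio"
    source_id: str   # identifier of the camera or microphone
    index: int       # position of this chunk (assumed field)
    total: int       # number of chunks in the message (assumed field)
    chunk: bytes


def fragment(payload: bytes, timecode: str, data_type: str,
             source_id: str, max_chunk: int) -> List[Fragment]:
    """Split a payload into fragments no larger than max_chunk bytes."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    chunks = [payload[i:i + max_chunk]
              for i in range(0, len(payload), max_chunk)] or [b""]
    return [Fragment(timecode, data_type, source_id, i, len(chunks), c)
            for i, c in enumerate(chunks)]


def reassemble(fragments: List[Fragment]) -> bytes:
    """Restore the payload; fragments may arrive in any order."""
    if not fragments:
        raise ValueError("no fragments")
    total = fragments[0].total
    by_index = {f.index: f.chunk for f in fragments}
    if set(by_index) != set(range(total)):
        raise ValueError("missing fragments")
    return b"".join(by_index[i] for i in range(total))
```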

データルーティング情報保持部２２５は、データ送受信部２１１で送受信されるデータの送信先を決定するためのアドレス情報を保持する機能を有する。ルーティング方法については後述する。 The data routing information holding unit 225 has a function of holding address information for determining a transmission destination of data transmitted / received by the data transmitting / receiving unit 211. The routing method will be described later.

画像処理部２３０は、カメラ制御部２４１の制御によりカメラ１１２が撮像した画像データ及び他のカメラアダプタ１２０から受取った画像データに対して処理を行う機能を有している。画像処理部２３０が備える機能部について以下に説明する。 The image processing unit 230 has a function of processing the image data captured by the camera 112 and the image data received from the other camera adapter 120 under the control of the camera control unit 241. The functional unit included in the image processing unit 230 will be described below.

前景背景分離部２３１は、カメラ１１２が撮像した画像データを前景画像と背景画像に分離する機能を有している。すなわち、複数のカメラアダプタ１２０それぞれの前景背景分離部２３１は、複数のカメラ１１２のうち対応するカメラ１１２による撮像画像から所定領域を抽出する。所定領域は例えば撮像画像に対するオブジェクト検出の結果得られる前景画像である。所定領域の抽出により、前景背景分離部２３１は、撮像画像を前景画像と背景画像に分離する。なお、オブジェクトとは、例えば人物である。ただし、オブジェクトが特定人物（選手、監督、及び／又は審判など）であっても良いし、ボールやゴールなど、画像パターンが予め定められている物体であっても良い。また、オブジェクトとして、動体が検出されるようにしても良い。人物等の重要なオブジェクトを含む前景画像とそのようなオブジェクトを含まない背景領域を分離して処理することで、画像処理システム１００において生成される仮想視点画像の上記のオブジェクトに該当する部分の画像の品質を向上できる。また、前景と背景の分離を複数のカメラアダプタ１２０それぞれが行うことで、複数のカメラ１１２を備えた画像処理システム１００における負荷を分散させることができる。なお、所定領域は前景画像に限らず、例えば背景画像であってもよい。 The foreground background separation unit 231 has a function of separating the image data captured by the camera 112 into a foreground image and a background image. That is, the foreground background separation unit 231 of each of the plurality of camera adapters 120 extracts a predetermined region from the image captured by the corresponding camera 112 among the plurality of cameras 112. The predetermined region is, for example, a foreground image obtained as a result of object detection on the captured image. By extracting the predetermined region, the foreground background separation unit 231 separates the captured image into the foreground image and the background image. The object is, for example, a person. However, the object may be a specific person (a player, a manager, and/or a referee, etc.), or may be an object having a predetermined image pattern, such as a ball or a goal. A moving object may also be detected as the object. By separating and processing the foreground image, which includes important objects such as persons, and the background region, which does not include such objects, the image quality of the portion of the virtual viewpoint image generated by the image processing system 100 that corresponds to those objects can be improved. Further, because each of the plurality of camera adapters 120 separates the foreground and the background, the load on the image processing system 100 provided with the plurality of cameras 112 can be distributed. The predetermined region is not limited to the foreground image and may be, for example, the background image.

三次元モデル情報生成部２３２は、前景背景分離部２３１で分離された前景画像及び他のカメラアダプタ１２０から受取った前景画像を利用し、例えばステレオカメラの原理を用いて三次元モデルに係わる画像情報を生成する機能を有している。 The three-dimensional model information generation unit 232 has a function of generating image information related to a three-dimensional model by using the foreground image separated by the foreground background separation unit 231 and the foreground images received from other camera adapters 120, for example based on the principle of a stereo camera.

キャリブレーション制御部２３３は、キャリブレーションに必要な画像データを、カメラ制御部２４１を介してカメラ１１２から取得し、キャリブレーションに係わる演算処理を行うフロントエンドサーバ１３１に送信する機能を有している。なお本実施形態ではキャリブレーションに係わる演算処理をフロントエンドサーバ１３１で行っているが、演算処理を行うノードはフロントエンドサーバ１３１に限定されない。例えば、制御ステーション１４１やカメラアダプタ１２０（他のカメラアダプタ１２０を含む）など他のノードで演算処理が行われてもよい。またキャリブレーション制御部２３３は、カメラ制御部２４１を介してカメラ１１２から取得した画像データに対して、予め設定されたパラメータに応じて撮像中のキャリブレーション（動的キャリブレーション）を行う機能を有している。 The calibration control unit 233 has a function of acquiring image data required for calibration from the camera 112 via the camera control unit 241 and transmitting the image data to the front-end server 131, which performs arithmetic processing related to the calibration. In the present embodiment, the arithmetic processing related to the calibration is performed by the front-end server 131, but the node that performs the arithmetic processing is not limited to the front-end server 131. For example, the arithmetic processing may be performed at another node such as the control station 141 or a camera adapter 120 (including another camera adapter 120). The calibration control unit 233 also has a function of performing calibration during imaging (dynamic calibration), according to preset parameters, on the image data acquired from the camera 112 via the camera control unit 241.

外部機器制御部２４０は、カメラアダプタ１２０に接続する機器を制御する機能を有している。外部機器制御部２４０が備えている機能部について、以下に説明する。 The external device control unit 240 has a function of controlling a device connected to the camera adapter 120. The functional unit included in the external device control unit 240 will be described below.

カメラ制御部２４１は、カメラ１１２と接続し、カメラ１１２の制御、撮像画像取得、同期信号提供、及び時刻設定などを行う機能を有している。カメラ１１２の制御には、例えば撮像パラメータ（画素数、色深度、フレームレート、及びホワイトバランスの設定など）の設定及び参照、カメラ１１２の状態（撮像中、停止中、同期中、及びエラーなど）の取得、撮像の開始及び停止や、ピント調整などがある。なお、本実施形態ではカメラ１１２を介してピント調整を行っているが、取り外し可能なレンズがカメラ１１２に装着されている場合は、カメラアダプタ１２０がレンズに接続し、直接レンズの調整を行ってもよい。また、カメラアダプタ１２０がカメラ１１２を介してズーム等のレンズ調整を行ってもよい。 The camera control unit 241 is connected to the camera 112 and has a function of controlling the camera 112, acquiring captured images, providing a synchronization signal, setting the time, and the like. Control of the camera 112 includes, for example, setting and referencing imaging parameters (number of pixels, color depth, frame rate, white balance setting, etc.), acquiring the state of the camera 112 (imaging, stopped, synchronizing, error, etc.), starting and stopping imaging, and focus adjustment. In the present embodiment, focus adjustment is performed via the camera 112, but when a removable lens is attached to the camera 112, the camera adapter 120 may be connected to the lens and adjust the lens directly. The camera adapter 120 may also perform lens adjustment such as zooming via the camera 112.

同期信号提供は、時刻同期制御部２２３がタイムサーバ１３４と同期した時刻を利用し、撮像タイミング（制御クロック）をカメラ１１２に提供することで行われる。時刻設定は、時刻同期制御部２２３がタイムサーバ１３４と同期した時刻を例えばＳＭＰＴＥ１２Ｍのフォーマットに準拠したタイムコードで提供することで行われる。これにより、カメラ１１２から受取る画像データに提供したタイムコードが付与されることになる（詳細は図９Ａの参照により後述する）。なおタイムコードのフォーマットはＳＭＰＴＥ１２Ｍに限定されるわけではなく、他のフォーマットであってもよい。また、カメラ制御部２４１は、カメラ１１２に対するタイムコードの提供はせず、カメラ１１２から受取った画像データに自身がタイムコードを付与してもよい（詳細は図９Ｂの参照により後述する）。 The synchronization signal is provided by the time synchronization control unit 223 using the time synchronized with the time server 134 and providing the imaging timing (control clock) to the camera 112. The time setting is performed by the time synchronization control unit 223 providing the time synchronized with the time server 134 with a time code conforming to, for example, the SMPTE 12M format. As a result, the provided time code is added to the image data received from the camera 112 (details will be described later with reference to FIG. 9A). The time code format is not limited to SMPTE12M, and may be another format. Further, the camera control unit 241 may not provide the time code to the camera 112, but may assign the time code to the image data received from the camera 112 (details will be described later with reference to FIG. 9B).
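
Deriving a time code from the synchronized time can be sketched as below. This is a simplified illustration that assumes an integer frame rate, non-drop-frame counting, and the customary HH:MM:SS:FF notation; the function name and the nanosecond input are hypothetical, and the sketch does not cover the full SMPTE 12M format.

```python
NANOS_PER_SECOND = 1_000_000_000


def to_timecode(nanos_since_midnight: int, fps: int) -> str:
    """Format a synchronized time of day as HH:MM:SS:FF (non-drop-frame)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    # Integer arithmetic avoids floating-point rounding at frame boundaries.
    total_frames = nanos_since_midnight * fps // NANOS_PER_SECOND
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours = (total_seconds // 3600) % 24
    minutes = (total_seconds // 60) % 60
    seconds = total_seconds % 60
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"
```

Because every camera adapter derives the value from the same synchronized clock, frames captured at the same instant receive the same time code regardless of which adapter stamps them.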

マイク制御部２４２は、マイク１１１と接続し、マイク１１１の制御、収音の開始及び停止や収音された音声データの取得などを行う機能を有している。マイク１１１の制御は例えば、ゲイン調整や、状態取得などである。またカメラ制御部２４１と同様に、マイク制御部２４２はマイク１１１に対して音声サンプリングするタイミングとタイムコードを提供する。音声サンプリングのタイミングとなるクロック情報としては、タイムサーバ１３４からの時刻情報が例えば４８ＫＨｚのワードクロックに変換されてマイク１１１に供給される。 The microphone control unit 242 is connected to the microphone 111 and has a function of controlling the microphone 111, starting and stopping sound collection, acquiring the collected audio data, and the like. Control of the microphone 111 includes, for example, gain adjustment and state acquisition. Like the camera control unit 241, the microphone control unit 242 provides the microphone 111 with the audio sampling timing and a time code. As the clock information that determines the audio sampling timing, the time information from the time server 134 is converted into, for example, a 48 kHz word clock and supplied to the microphone 111.

雲台制御部２４３は雲台１１３と接続し、雲台１１３の制御を行う機能を有している。雲台１１３の制御は例えば、パン・チルト制御や、状態取得などがある。 The pan head control unit 243 has a function of connecting to the pan head 113 and controlling the pan head 113. Control of the pan head 113 includes, for example, pan / tilt control and state acquisition.

センサ制御部２４４は、外部センサ１１４と接続し、外部センサ１１４がセンシングしたセンサ情報を取得する機能を有する。例えば、外部センサ１１４としてジャイロセンサが利用される場合は、振動を表す情報を取得することができる。そして、センサ制御部２４４が取得した振動情報を用いて、画像処理部２３０は、前景背景分離部２３１での処理に先立って、振動を抑えた画像を生成することができる。振動情報は例えば、８Ｋカメラの画像データを、振動情報を考慮して、元の８Ｋサイズよりも小さいサイズで切り出して、隣接設置されたカメラ１１２の画像との位置合わせを行う場合に利用される。これにより、建造物の躯体振動が各カメラに異なる周波数で伝搬しても、カメラアダプタ１２０に配備された本機能で位置合わせを行う。その結果、電子的に防振された画像データを生成でき、画像コンピューティングサーバ１３０におけるカメラ１１２の台数分の位置合わせの処理負荷を軽減する効果が得られる。なお、センサシステム１１０のセンサは外部センサ１１４に限定するわけではなく、カメラアダプタ１２０に内蔵されたセンサであっても同様の効果が得られる。 The sensor control unit 244 is connected to the external sensor 114 and has a function of acquiring sensor information sensed by the external sensor 114. For example, when a gyro sensor is used as the external sensor 114, information representing vibration can be acquired. Using the vibration information acquired by the sensor control unit 244, the image processing unit 230 can generate an image in which vibration is suppressed, prior to the processing by the foreground background separation unit 231. The vibration information is used, for example, when the image data of an 8K camera is cut out at a size smaller than the original 8K size, taking the vibration information into account, and aligned with the image of an adjacently installed camera 112. As a result, even if the vibration of the building frame propagates to each camera at a different frequency, alignment is performed by this function provided in the camera adapter 120. Electronically stabilized image data can thus be generated, which reduces the alignment processing load that the image computing server 130 would otherwise bear for each of the cameras 112. The sensor of the sensor system 110 is not limited to the external sensor 114; the same effect can be obtained with a sensor built into the camera adapter 120.
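
The idea of cutting out a region smaller than the full 8K frame so that it can be shifted to cancel vibration can be sketched as follows. How gyro readings are converted into a pixel shift is not described in the text, so the sketch takes the shift as an input; the function name and the clamping behavior are assumptions.

```python
from typing import Tuple


def stabilized_crop(frame_w: int, frame_h: int, crop_w: int, crop_h: int,
                    shift_x: int, shift_y: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a crop window moved by the shift.

    The window starts centered in the frame and is clamped so that it
    never leaves the frame, however large the requested shift is.
    """
    max_x = frame_w - crop_w
    max_y = frame_h - crop_h
    if max_x < 0 or max_y < 0:
        raise ValueError("crop must not exceed the frame size")
    left = min(max(max_x // 2 + shift_x, 0), max_x)
    top = min(max(max_y // 2 + shift_y, 0), max_y)
    return left, top, left + crop_w, top + crop_h
```

The margin between the frame and the crop (here `max_x` and `max_y`) bounds the largest vibration that can be compensated without losing image content.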

図３はフロントエンドサーバ１３１の機能構成例を示したブロック図である。制御部３１０はＣＰＵやＤＲＡＭ、プログラムデータや各種データを記憶したＨＤＤやＮＡＮＤメモリなどの記憶媒体、Ｅｔｈｅｒｎｅｔ（登録商標）等のハードウェアで構成される。制御部３１０は、フロントエンドサーバ１３１の各機能部及びフロントエンドサーバ１３１のシステム全体の制御を行う。また、制御部３１０は、モード制御を行って、キャリブレーション動作や撮像前の準備動作、及び撮像中動作などの動作モードを切り替える。また、制御部３１０は、Ｅｔｈｅｒｎｅｔ（登録商標）を通じて制御ステーション１４１からの制御指示を受信し、各モードの切り替えやデータの入出力などを行う。また、制御部３１０は、同じくネットワークを通じて制御ステーション１４１からスタジアムＣＡＤデータ（スタジアム形状データ）を取得し、スタジアムＣＡＤデータをＣＡＤデータ記憶部３３５と非撮像データファイル生成部３８５に送信する。なお、本実施形態におけるスタジアムＣＡＤデータ（スタジアム形状データ）はスタジアムの形状を示す三次元データであり、メッシュモデルやその他の三次元形状を表すデータであればよく、ＣＡＤ形式に限定されない。 FIG. 3 is a block diagram showing a functional configuration example of the front-end server 131. The control unit 310 is composed of a CPU, a DRAM, a storage medium such as an HDD or a NAND memory that stores program data and various data, and hardware such as Ethernet (registered trademark). The control unit 310 controls each functional unit of the front-end server 131 and the entire system of the front-end server 131. Further, the control unit 310 performs mode control to switch operation modes such as a calibration operation, a preparatory operation before imaging, and an operation during imaging. Further, the control unit 310 receives a control instruction from the control station 141 through Ethernet (registered trademark), switches each mode, inputs / outputs data, and the like. Further, the control unit 310 also acquires stadium CAD data (stadium shape data) from the control station 141 through the network, and transmits the stadium CAD data to the CAD data storage unit 335 and the non-imaging data file generation unit 385. The stadium CAD data (stadium shape data) in the present embodiment is three-dimensional data indicating the shape of the stadium, and may be any data representing a mesh model or other three-dimensional shape, and is not limited to the CAD format.

データ入力制御部３２０は、Ｅｔｈｅｒｎｅｔ（登録商標）等の通信路とスイッチングハブ１８０を介して、カメラアダプタ１２０とネットワーク接続されている。データ入力制御部３２０は、ネットワークを通してカメラアダプタ１２０から前景画像、背景画像、被写体の三次元モデル、音声データ、及びカメラキャリブレーション撮像画像データを取得する。ここで、前景画像は仮想視点画像の生成のための撮像画像の前景領域に基づく画像データであり、背景画像は当該撮像画像の背景領域に基づく画像データである。上述のようにカメラアダプタ１２０は、カメラ１１２による撮像画像に対する所定のオブジェクトの検出処理の結果に応じて、前景領域及び背景領域を特定し、前景画像及び背景画像を生成する。所定のオブジェクトとは、例えば人物である。なお、所定のオブジェクトは特定の人物（選手、監督、及び／又は審判など）であっても良い。また、所定のオブジェクトには、ボールやゴールなど、画像パターンが予め定められている物体が含まれていてもよい。また、所定のオブジェクトとして、動体が検出されるようにしても良い。 The data input control unit 320 is network-connected to the camera adapter 120 via a communication path such as Ethernet (registered trademark) and a switching hub 180. The data input control unit 320 acquires a foreground image, a background image, a three-dimensional model of a subject, audio data, and camera calibration captured image data from the camera adapter 120 through a network. Here, the foreground image is image data based on the foreground region of the captured image for generating the virtual viewpoint image, and the background image is image data based on the background region of the captured image. As described above, the camera adapter 120 specifies the foreground area and the background area according to the result of the detection processing of a predetermined object for the image captured by the camera 112, and generates the foreground image and the background image. The predetermined object is, for example, a person. The predetermined object may be a specific person (player, manager, and / or referee, etc.). Further, the predetermined object may include an object having a predetermined image pattern, such as a ball or a goal. Further, a moving object may be detected as a predetermined object.

また、データ入力制御部３２０は、カメラアダプタ１２０から取得した前景画像及び背景画像をデータ同期部３３０に送信し、カメラアダプタ１２０から取得したカメラキャリブレーション撮像画像データをキャリブレーション部３４０に送信する。また、データ入力制御部３２０は受信したデータの圧縮伸張やデータルーティング処理等を行う機能を有する。また、制御部３１０とデータ入力制御部３２０は共にＥｔｈｅｒｎｅｔ（登録商標）等のネットワークによる通信機能を有しているが、通信機能はこれらで共有されていてもよい。その場合は、制御ステーション１４１からの制御コマンドによる指示やスタジアムＣＡＤデータをデータ入力制御部３２０で受けて、制御部３１０に対して送る方法を用いてもよい。 Further, the data input control unit 320 transmits the foreground image and the background image acquired from the camera adapter 120 to the data synchronization unit 330, and transmits the camera calibration captured image data acquired from the camera adapter 120 to the calibration unit 340. Further, the data input control unit 320 has a function of performing compression / decompression of received data, data routing processing, and the like. Further, both the control unit 310 and the data input control unit 320 have a communication function by a network such as Ethernet (registered trademark), but the communication function may be shared by these. In that case, a method may be used in which the data input control unit 320 receives the instruction by the control command from the control station 141 or the stadium CAD data and sends it to the control unit 310.

データ同期部３３０は、カメラアダプタ１２０から取得したデータをＤＲＡＭ上に一次的に記憶し、前景画像、背景画像、音声データ及び三次元モデルデータが揃うまでバッファする。なお、前景画像、背景画像、音声データ及び三次元モデルデータをまとめて、以降では撮像データと称する。撮像データにはルーティング情報やタイムコード情報（時間情報）、カメラ識別子等のメタ情報が付与されており、データ同期部３３０はこのメタ情報を元にデータの属性を確認する。これによりデータ同期部３３０は、同一時刻のデータであることなどを判断してデータがそろったことを確認する。これは、ネットワークによって各々のカメラアダプタ１２０から転送されたデータについてはネットワークパケットの受信順序が保証されず、ファイル生成に必要なデータが揃うまでバッファする必要があるためである。本実施形態では、複数のフロントエンドサーバ（画像処理装置）により分散処理を実現するとともに、各フロントエンドサーバには同一時刻のデータが振り分けられるので、フロントエンドサーバにおける画像処理の処理効率が向上する。 The data synchronization unit 330 temporarily stores the data acquired from the camera adapters 120 in DRAM and buffers it until the foreground image, the background image, the audio data, and the three-dimensional model data are all available. The foreground image, background image, audio data, and three-dimensional model data are hereinafter collectively referred to as imaging data. Meta information such as routing information, time code information (time information), and a camera identifier is attached to the imaging data, and the data synchronization unit 330 checks the attributes of the data based on this meta information. In this way, the data synchronization unit 330 determines, for example, that pieces of data belong to the same time and confirms that the data is complete. This is necessary because the reception order of network packets is not guaranteed for data transferred over the network from the individual camera adapters 120, so the data must be buffered until everything required for file generation has arrived. In the present embodiment, distributed processing is realized by a plurality of front-end servers (image processing devices), and data of the same time is distributed to the same front-end server, so that the efficiency of image processing in the front-end servers is improved.
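
Buffering until all data of the same time has arrived can be sketched as a table keyed by time code. This is a minimal illustration under stated assumptions: the class name `SyncBuffer`, the use of (camera ID, data kind) pairs as the completeness criterion, and the in-memory dictionary are hypothetical and stand in for whatever the data synchronization unit 330 actually does.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Key = Tuple[str, str]  # (camera_id, data_kind)


class SyncBuffer:
    """Hold items per time code until every expected item has arrived."""

    def __init__(self, camera_ids: Iterable[str], kinds: Iterable[str]) -> None:
        kinds = list(kinds)
        self._expected = {(c, k) for c in camera_ids for k in kinds}
        self._pending: Dict[str, Dict[Key, bytes]] = defaultdict(dict)

    def push(self, timecode: str, camera_id: str, kind: str,
             payload: bytes) -> Optional[Dict[Key, bytes]]:
        """Store one item; return the full set once the time code is complete."""
        slot = self._pending[timecode]
        slot[(camera_id, kind)] = payload
        if self._expected.issubset(slot.keys()):
            return self._pending.pop(timecode)
        return None
```

Because items are grouped by time code rather than by arrival order, out-of-order delivery from different camera adapters does not affect the result.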

データがそろったら、データ同期部３３０は、前景画像及び背景画像を画像処理部３５０に、三次元モデルデータを三次元モデル結合部３６０に、音声データを撮像データファイル生成部３８０にそれぞれ送信する。なお、ここで揃えるデータは、後述される撮像データファイル生成部３８０においてファイル生成を行うために必要なデータである。また、背景画像は前景画像とは異なるフレームレートで撮像されてもよい。例えば、背景画像のフレームレートが１ｆｐｓである場合、１秒毎に１つの背景画像が取得されるため、背景画像が取得されない時間については、背景画像が無い状態で全てのデータがそろったとしてよい。また、データ同期部３３０は、所定時間が経過してもデータが揃っていない場合には、データ集結ができないことを示す情報をデータベース１３２に通知する。そして、後段のデータベース１３２が、データを格納する際に、カメラ番号やフレーム番号とともにデータ欠落を示す情報を格納する。これにより、仮想カメラ操作ＵＩ１４２からバックエンドサーバ１３３への視点指示に応じて、データ集結したカメラ１１２の撮像画像から所望の画像が形成できるか否かをレンダリング前に自動通知することが可能となる。その結果、仮想カメラ操作ＵＩ１４２のオペレータの目視負荷を軽減できる。 When the data is complete, the data synchronization unit 330 transmits the foreground image and the background image to the image processing unit 350, the three-dimensional model data to the three-dimensional model coupling unit 360, and the audio data to the imaging data file generation unit 380. The data collected here is the data necessary for file generation in the imaging data file generation unit 380, which will be described later. The background image may be captured at a frame rate different from that of the foreground image. For example, when the frame rate of the background image is 1 fps, one background image is acquired every second; therefore, for times at which no background image is acquired, the data may be regarded as complete without a background image. If the data is still incomplete after a predetermined time has elapsed, the data synchronization unit 330 notifies the database 132 of information indicating that the data could not be collected. Then, when storing the data, the database 132 in the subsequent stage stores information indicating the missing data together with the camera number and the frame number. This makes it possible, in response to a viewpoint instruction from the virtual camera operation UI 142 to the back-end server 133, to automatically report before rendering whether or not a desired image can be formed from the captured images of the cameras 112 whose data was collected. As a result, the visual inspection burden on the operator of the virtual camera operation UI 142 can be reduced.
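
The completeness rule for a lower-rate background image, together with the timeout that triggers a missing-data notification, can be sketched as below. The function names, the 60 fps foreground rate, the status strings, and the assumption that the background rate evenly divides the foreground rate are all hypothetical choices made for illustration.

```python
from typing import Iterable, List, Tuple


def background_expected(frame_index: int, foreground_fps: int,
                        background_fps: int) -> bool:
    """True for frames on which a background image is captured."""
    step = foreground_fps // background_fps
    return frame_index % step == 0


def check_frame(received: Iterable[str], frame_index: int, waited_s: float,
                timeout_s: float, foreground_fps: int = 60,
                background_fps: int = 1) -> Tuple[str, List[str]]:
    """Classify a frame as complete, still waiting, or missing data."""
    required = {"foreground", "audio", "model"}
    if background_expected(frame_index, foreground_fps, background_fps):
        required.add("background")
    missing = sorted(required - set(received))
    if not missing:
        return "complete", missing
    if waited_s >= timeout_s:
        # At this point the missing kinds would be reported to the database.
        return "missing", missing
    return "waiting", missing
```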

ＣＡＤデータ記憶部３３５は制御部３１０から受け取ったスタジアム形状を示す三次元データをＤＲＡＭ、ＨＤＤ、またはＮＡＮＤメモリ等の記憶媒体に保存する。そして、画像結合部３７０に対して、スタジアム形状データの要求を受け取った際に保存されたスタジアム形状データを送信する。 The CAD data storage unit 335 stores the three-dimensional data indicating the stadium shape received from the control unit 310 in a storage medium such as a DRAM, an HDD, or a NAND memory. Then, the stadium shape data saved when the request for the stadium shape data is received is transmitted to the image combining unit 370.

キャリブレーション部３４０はカメラのキャリブレーション動作を行い、キャリブレーションによって得られたカメラパラメータを後述する非撮像データファイル生成部３８５に送る。また同時に、自身の記憶領域にもカメラパラメータを保持し、後述する三次元モデル結合部３６０にカメラパラメータ情報を提供する。 The calibration unit 340 performs a camera calibration operation and sends the camera parameters obtained by the calibration to the non-imaging data file generation unit 385, which will be described later. At the same time, the camera parameters are also held in its own storage area, and the camera parameter information is provided to the three-dimensional model coupling unit 360 described later.

画像処理部３５０は前景画像や背景画像に対して、カメラ間の色や輝度値の合わせこみ、ＲＡＷ画像データが入力される場合には現像処理、及びカメラのレンズ歪みの補正等の処理を行う。そして、画像処理を行った前景画像は撮像データファイル生成部３８０に、背景画像は画像結合部３７０にそれぞれ送信する。 The image processing unit 350 performs processing on the foreground image and the background image such as matching color and luminance values between cameras, development processing when RAW image data is input, and correction of camera lens distortion. The processed foreground image is then transmitted to the imaging data file generation unit 380, and the processed background image is transmitted to the image combining unit 370.

三次元モデル結合部３６０は、カメラアダプタ１２０から取得した同一時刻の三次元モデルデータをキャリブレーション部３４０が生成したカメラパラメータを用いて結合する。そして、ＶｉｓｕａｌＨｕｌｌと呼ばれる方法を用いて、スタジアム全体における前景画像の三次元モデルデータを生成する。生成した三次元モデルは撮像データファイル生成部３８０に送信される。 The three-dimensional model coupling unit 360 combines the three-dimensional model data at the same time acquired from the camera adapter 120 using the camera parameters generated by the calibration unit 340. Then, using a method called VisualHull, three-dimensional model data of the foreground image in the entire stadium is generated. The generated three-dimensional model is transmitted to the imaging data file generation unit 380.

画像結合部３７０は画像処理部３５０から背景画像を取得し、ＣＡＤデータ記憶部３３５からスタジアムの三次元形状データ（スタジアム形状データ）を取得し、取得したスタジアムの三次元形状データの座標に対する背景画像の位置を特定する。背景画像の各々についてスタジアムの三次元形状データの座標に対する位置が特定できると、背景画像を結合して１つの背景画像とする。なお、本背景画像の三次元形状データの作成については、バックエンドサーバ１３３が実施してもよい。 The image combining unit 370 acquires a background image from the image processing unit 350, acquires three-dimensional shape data (stadium shape data) of the stadium from the CAD data storage unit 335, and a background image with respect to the coordinates of the acquired three-dimensional shape data of the stadium. Identify the location of. When the position of each of the background images with respect to the coordinates of the three-dimensional shape data of the stadium can be specified, the background images are combined into one background image. The back-end server 133 may create the three-dimensional shape data of the background image.

The imaging data file generation unit 380 acquires audio data from the data synchronization unit 330, foreground images from the image processing unit 350, three-dimensional model data from the three-dimensional model coupling unit 360, and the background image combined onto the three-dimensional shape from the image combining unit 370. It then outputs the acquired data to the DB access control unit 390. Here, the imaging data file generation unit 380 outputs these data in association with one another based on their respective time information. However, only some of these data may be associated and output. For example, the imaging data file generation unit 380 outputs a foreground image and a background image in association with each other based on the time information of the foreground image and the time information of the background image. As another example, the imaging data file generation unit 380 outputs a foreground image, a background image, and three-dimensional model data in association with one another based on the time information of each. The imaging data file generation unit 380 may output the associated data as a separate file for each data type, or may output a plurality of data types together as one file for each time indicated by the time information. Because the imaging data associated in this way is output to the database 132 by the DB access control unit 390, the back-end server 133 can generate a virtual viewpoint image from a foreground image and a background image whose time information corresponds.
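As a non-limiting illustration, the two filing options described above (one file per data type, or one file per time) can be sketched as two groupings of the same records. The `Record` type, its field names, and the use of an integer frame number as the time information are assumptions made only for this sketch and do not appear in the embodiment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    kind: str      # hypothetical label: "foreground", "background", "model", "audio"
    timecode: int  # time information; an integer frame number is assumed here
    payload: bytes


def group_by_kind(records):
    """One group per data type, each ordered by time information."""
    groups = defaultdict(list)
    for rec in sorted(records, key=lambda r: r.timecode):
        groups[rec.kind].append(rec)
    return dict(groups)


def group_by_time(records):
    """One group per time, holding every data type captured at that time."""
    groups = defaultdict(dict)
    for rec in records:
        groups[rec.timecode][rec.kind] = rec
    return dict(groups)


records = [
    Record("foreground", 2, b"fg2"),
    Record("foreground", 1, b"fg1"),
    Record("background", 1, b"bg1"),
    Record("model", 1, b"m1"),
]
by_kind = group_by_kind(records)
by_time = group_by_time(records)
```

In either grouping the time information remains the key that lets a downstream consumer pair a foreground image with the matching background image.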

When the foreground images and background images acquired by the data input control unit 320 have different frame rates, it is difficult for the imaging data file generation unit 380 to always associate and output a foreground image and a background image of the same time. In that case, the imaging data file generation unit 380 associates and outputs a foreground image with a background image whose time information has a relationship, based on a predetermined rule, with the time information of the foreground image. Such a background image is, for example, the background image that, among the background images acquired by the imaging data file generation unit 380, has the time information closest to that of the foreground image. By associating foreground and background images based on a predetermined rule in this way, a virtual viewpoint image can be generated from a foreground image and a background image captured at close times even when their frame rates differ.
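As a non-limiting illustration, the "closest time information" rule can be sketched as follows. The function name, the use of millisecond timestamps, and the tie-breaking behavior (the earlier-listed background image wins when two are equally close) are assumptions of this sketch; the embodiment does not specify them.

```python
def nearest_background(fg_time, bg_times):
    """Return the background time closest to the foreground time.

    fg_time and bg_times are timestamps in the same unit
    (milliseconds are assumed in the example below).
    """
    if not bg_times:
        raise ValueError("no background image has been acquired yet")
    # min() keeps the first of equally close candidates, so a tie
    # resolves to whichever background image appears first in bg_times.
    return min(bg_times, key=lambda t: abs(t - fg_time))


# Hypothetical rates: backgrounds once per second, foregrounds far more often.
background_times = [0, 1000, 2000]
pairs = {fg: nearest_background(fg, background_times) for fg in (400, 600, 1900)}
```

Here the foreground images at 400 ms, 600 ms, and 1900 ms are paired with the background images at 0 ms, 1000 ms, and 2000 ms respectively.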

The method of associating a foreground image with a background image is not limited to the above. For example, the background image whose time information has a relationship based on a predetermined rule with the time information of the foreground image may be, among the acquired background images whose time information corresponds to a time before that of the foreground image, the one with the time information closest to that of the foreground image. With this method, the associated foreground and background images can be output with low delay, without waiting for the acquisition of a background image whose frame rate is lower than that of the foreground image. Because the plurality of front-end servers 131 communicate with one another, the selection can also include background images processed by another front-end server. For example, the front-end server 131a (131b) may select the background image with the time information closest to that of the foreground image from among the background images it processed itself and the background images processed by the front-end server 131b (131a). Alternatively, the background image may be, among the acquired background images whose time information corresponds to a time after that of the foreground image, the one with the time information closest to that of the foreground image.
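As a non-limiting illustration, the alternative rules (closest preceding, closest following) and the pooling of background images from another front-end server can be sketched as below. The rule names `"before"`, `"after"`, and `"nearest"`, and the representation of each server's background images as a list of timestamps, are assumptions of this sketch.

```python
from bisect import bisect_left, bisect_right


def select_background(fg_time, bg_times, rule="nearest"):
    """Pick a background time for fg_time; return None if no candidate fits."""
    times = sorted(bg_times)
    if rule == "before":
        # bisect_left counts the times strictly earlier than fg_time.
        i = bisect_left(times, fg_time)
        return times[i - 1] if i > 0 else None
    if rule == "after":
        # bisect_right is the index of the first time strictly later than fg_time.
        i = bisect_right(times, fg_time)
        return times[i] if i < len(times) else None
    if rule == "nearest":
        return min(times, key=lambda t: abs(t - fg_time)) if times else None
    raise ValueError(f"unknown rule: {rule}")


# Background times processed by this server and by a peer server (hypothetical values).
own_backgrounds = [0, 2000]
peer_backgrounds = [1000]
candidates = own_backgrounds + peer_backgrounds
```

With the `"before"` rule a foreground image never waits for a background image that has not yet arrived, which is the low-delay property noted above; the `"after"` rule trades that for waiting until a later background image is available.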

The non-imaging data file generation unit 385 acquires camera parameters from the calibration unit 340 and the three-dimensional shape data of the stadium from the control unit 310, formats them according to the file format, and then transmits them to the DB access control unit 390. The camera parameters and the stadium shape data input to the non-imaging data file generation unit 385 are each formatted individually according to the file format. That is, when the non-imaging data file generation unit 385 receives either one of these data, it transmits that data individually to the DB access control unit 390.

The DB access control unit 390 is connected to the database 132 by InfiniBand or the like so that high-speed communication is possible, and transmits the files received from the imaging data file generation unit 380 and the non-imaging data file generation unit 385 to the database 132. In the present embodiment, the imaging data that the imaging data file generation unit 380 has associated based on time information is output, via the DB access control unit 390, to the database 132, which is a storage device connected to the front-end server 131 via a network. However, the output destination of the associated imaging data is not limited to this. For example, the front-end server 131 may output the imaging data associated based on time information to the back-end server 133, which is an image generation device that is connected to the front-end server 131 via a network and generates virtual viewpoint images. The data may also be output to both the database 132 and the back-end server 133.

In the present embodiment, the front-end server 131 associates the foreground image with the background image, but the association is not limited to this and may be performed by the database 132. For example, the database 132 acquires foreground images and background images having time information from the front-end server 131. The database 132 may then associate a foreground image and a background image with each other based on the time information of the foreground image and the time information of the background image, and output them to a storage unit included in the database 132.

The functional configuration of the data input control unit 320 of the front-end server 131 will be described with reference to the block diagram of FIG. 4. The data input control unit 320 includes a server network adapter 410, a server transmission unit 420, and a server image processing unit 430.

The server network adapter 410 includes a server data receiving unit 411 and has a function of receiving data transmitted from the camera adapter 120.

The server transmission unit 420 has a function of processing the data received from the server data receiving unit 411, and is composed of the following functional units. The server data decompression unit 421 has a function of decompressing compressed data. The server data routing processing unit 422 determines the transfer destination of data based on routing information, such as addresses, held by the server data routing information holding unit 424 described below, and transfers the data received from the server data receiving unit 411. The server image/audio transmission processing unit 423 receives a message from the camera adapter 120 via the server data receiving unit 411 and, according to the data type included in the message, restores the fragmented data into image data or audio data. If the restored image data or audio data is compressed, the server data decompression unit 421 decompresses it. The server data routing information holding unit 424 has a function of holding address information for determining the transmission destination of the data received by the server data receiving unit 411.
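As a non-limiting illustration, restoring fragmented data and then decompressing it can be sketched as follows. The fragment layout `(data_type, sequence_number, total_fragments, chunk)` is a hypothetical message format, and zlib merely stands in for whatever compression the camera adapter 120 actually applies; neither is specified by the embodiment.

```python
import zlib


def fragment(data_type, body, size):
    """Split body into fragments of at most `size` bytes (sender side, for the demo)."""
    chunks = [body[i:i + size] for i in range(0, len(body), size)] or [b""]
    return [(data_type, seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(fragments):
    """Restore the original byte string from fragments received in any order."""
    frags = sorted(fragments, key=lambda f: f[1])
    if not frags:
        raise ValueError("no fragments")
    data_type, _, total, _ = frags[0]
    if any(f[0] != data_type or f[2] != total for f in frags):
        raise ValueError("inconsistent fragment headers")
    if [f[1] for f in frags] != list(range(total)):
        raise ValueError("missing or duplicated fragment")
    return data_type, b"".join(f[3] for f in frags)


def restore(fragments, compressed=False):
    """Reassemble, then decompress if the restored data is compressed."""
    data_type, body = reassemble(fragments)
    return data_type, (zlib.decompress(body) if compressed else body)


payload = b"frame-data" * 50
frags = fragment("image", zlib.compress(payload), 4)
restored = restore(reversed(frags), compressed=True)
```

The order of the two steps mirrors the description: the data type in the message selects how the fragments are restored, and decompression is applied only afterwards and only when needed.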

The server image processing unit 430 has a function of performing processing related to the image data or audio data received from the camera adapter 120. This processing includes, for example, shaping the data into a format to which attribute information such as the camera number, the imaging time of the image frame, the image size, the image format, and the image coordinates has been added, according to the data entity of the image data (foreground image, background image, or three-dimensional model information).

Next, the workflow in the present embodiment will be described with reference to the flowchart of FIG. 5. The description covers a workflow in which a plurality of cameras 112 and microphones 111 are installed in a facility such as a stadium or a concert hall to perform imaging. Before the process shown in FIG. 5 starts, the operator (user) who installs and operates the image processing system 100 collects the information required before installation (prior information) and draws up a plan. It is also assumed that the operator has installed the equipment in the target facility before the process shown in FIG. 5 starts.

In S500 (pre-installation processing), the control station 141 of the controller 140 accepts settings from the user based on the prior information. Details of S500 will be described later with reference to FIG. 6. Next, in S501 (installation processing), each device of the image processing system 100 executes processing for confirming the operation of the system according to commands issued from the controller 140 based on user operations. Details of S501 will be described later with reference to FIG. 7. Next, in S502 (pre-imaging processing), the virtual camera operation UI 142 outputs images and audio before imaging for a competition or the like starts. This allows the user to check the audio picked up by the microphones 111 and the images captured by the cameras 112 before the competition or the like.

In S503 (imaging processing), the control station 141 of the controller 140 causes each microphone 111 to pick up sound and each camera 112 to capture images. Each camera adapter 120 connected to a camera 112 starts counting a time code (an imaging time or a frame number), which is imaging timing information, taking the imaging instruction from the control station 141 as the starting point. The imaging in S503 is assumed to include sound pickup using the microphones 111, but is not limited to this and may consist of image capture alone. When the settings made in S501 are to be changed (YES in S504), the process proceeds to S506. When imaging is to be ended (NO in S504, YES in S505), the process proceeds to S507. The determinations in S504 and S505 are typically made based on input from the user to the controller 140, although they are not limited to this example.

In S506 (setting change processing), the controller 140 changes the settings made in S501. The content of the change is typically determined based on the user input acquired in S504. If the change of settings in S506 requires imaging to be stopped, imaging is stopped once and resumed after the settings have been changed. If imaging does not need to be stopped, the settings are changed in parallel with imaging.

In S507 (editing processing), the controller 140 edits the images captured by the plurality of cameras 112 and the audio picked up by the plurality of microphones 111. The editing is typically performed based on user operations input via the virtual camera operation UI 142. The processing of S507 and S503 may be performed in parallel. For example, when a sports competition, a concert, or the like is distributed in real time (for example, images of a competition are distributed during the competition), the imaging of S503 and the editing of S507 are performed at the same time. When highlight images of a sports competition are distributed after the competition, the editing of S507 is performed after imaging has ended.

Next, the details of S500 (pre-installation processing) described above will be explained with reference to the flowchart of FIG. 6. First, in S600, the control station 141 accepts input from the user regarding information about the facility to be imaged (stadium information). Stadium information refers to the shape, acoustics, lighting, power supply, and transmission environment of the stadium, the three-dimensional model data of the stadium, and the like. That is, the stadium information includes the stadium shape data described above. The present embodiment describes the case where the facility to be imaged is a stadium, on the assumption that images of a sports competition held at the stadium are to be generated. However, because some sports competitions are held indoors, the facility to be imaged is not limited to a stadium. In addition, a virtual viewpoint image of a concert in a concert hall may be generated, as may an image of an outdoor concert in a stadium, so it should be noted that the event to be imaged is not limited to a competition.

Next, in S601, the control station 141 accepts input from the user regarding device information. Device information refers to information on imaging equipment such as cameras, pan heads, lenses, and microphones, on information equipment such as LANs, PCs, servers, and cables, and on relay vehicles. In particular, the control station 141 accepts input of the number of front-end servers 131 and the MAC addresses of all the front-end servers 131. In the present embodiment, MAC addresses are used to specify a front-end server as a destination, but the address is not limited to this and may be any address that uniquely identifies the front-end server in the network. The control station 141 also notifies all the camera adapters 120 of the MAC addresses of all the front-end servers 131 that were input. In the configuration example of FIG. 1, the MAC addresses of the front-end servers 131a and 131b are notified to the camera adapters 120a to 120z. Furthermore, the scheduling method for transmission from the camera adapters 120 to the front-end servers 131 is input at the control station 141. The scheduling method is, for example, the correspondence between the frame number of a captured image and the front-end server 131 to which it is delivered. More specifically, it means, for example, that captured images are transmitted frame by frame, starting from frame number 1, alternately to the front-end servers 131a and 131b.
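As a non-limiting illustration, the alternating assignment of frames to front-end servers can be sketched as a round-robin lookup on the frame number. The function name, the generalization to an arbitrary number of servers, and the placeholder MAC addresses are assumptions of this sketch.

```python
def destination_for_frame(frame_number, server_addresses, first_frame=1):
    """Return the address of the front-end server that receives this frame.

    Frames are assigned round-robin in the order the addresses are listed,
    starting with first_frame going to the first address.
    """
    if not server_addresses:
        raise ValueError("at least one front-end server address is required")
    if frame_number < first_frame:
        raise ValueError("frame_number precedes first_frame")
    index = (frame_number - first_frame) % len(server_addresses)
    return server_addresses[index]


# Placeholder locally administered MAC addresses standing in for 131a and 131b.
SERVERS = ["02:00:00:00:00:0a", "02:00:00:00:00:0b"]
schedule = [destination_for_frame(n, SERVERS) for n in range(1, 5)]
```

Because the destination depends only on the frame number and the shared address list, every camera adapter that evaluates the same rule sends images with the same frame number, that is, images captured at the same timing, to the same front-end server without any coordination between adapters.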

Next, in S602, the control station 141 accepts input regarding the placement information of the cameras, pan heads, and microphones among the imaging equipment whose device information was input in S601. The placement information can be input using the three-dimensional model data of the stadium described above.

Next, in S603, the control station 141 accepts user input regarding the operation information of the image processing system 100. Operation information refers to the imaging target, frame rate, imaging time, camera work, gazing point, and the like. For example, when the imaging target is an opening ceremony or similar event in which the captured images contain overwhelmingly more foreground images, such as athletes, than in a match, the image generation method can be changed to one suited to that situation. In addition, depending on the type of competition, such as track and field or a field sport such as soccer, the gazing point and the constraints on camera work may be changed. A table of setting information composed of combinations of such operation information is managed, changed, and issued as instructions by the control station 141. With S600 to S603 described above, the workflow before system installation is complete.

Next, the details of S501 (installation processing) described above will be explained with reference to the flowchart of FIG. 7. First, in S700, the control station 141 accepts user input (equipment confirmation) as to whether there is any excess or shortage of installed equipment. The user can determine this by comparing the device information input in S601 with the equipment to be installed. Next, in S701, the control station 141 executes installation confirmation processing for the equipment determined to be missing in S700. That is, the user can install the missing equipment between S700 and S701, and the control station 141 confirms that the user has installed it.

Next, in S702, the control station 141 starts up the equipment installed in S701 and performs a pre-adjustment system operation check to confirm that it operates normally. The processing of S702 may instead be performed by having the user check the system operation and input the result to the control station 141. If there is an excess or shortage of equipment or an error occurs in operation (NO in S702), the control station 141 notifies the user of the error in S703. The control station 141 then enters a locked state in which it does not proceed to the next step until the error is cleared. If no error state is detected, or once the error state has been cleared (YES in S702), the control station 141 notifies the user in S704 that operation is normal and proceeds to the subsequent processing. This makes it possible to detect errors at an early stage. After the check, processing related to the cameras 112 is executed in S705 to S707, and processing related to the microphones 111 is executed in S708 to S710.

First, the processing related to the cameras 112 will be described. In S705, the control station 141 adjusts the installed cameras 112. The adjustment of the cameras 112 in S705 includes, for example, angle-of-view matching and color matching, and is performed for all the installed cameras 112. The adjustment of S705 may be performed based on user operations or may be realized by an automatic adjustment function. In angle-of-view matching, zoom, pan, tilt, and focus are adjusted in parallel, and the adjustment results are stored in the control station 141. In color matching, iris, ISO/gain, white balance, sharpness, and shutter speed are adjusted in parallel, and the adjustment results are stored in the control station 141.

Next, in S706, the control station 141 makes adjustments so that all the installed cameras are synchronized. The synchronization adjustment in S706 may be realized by an automatic adjustment function or may be performed based on user operations. Then, in S707, the control station 141 performs calibration at the time of camera installation. More specifically, the control station 141 makes adjustments so that the coordinates of all the installed cameras match the world coordinates. The details of the calibration will be described later with reference to FIG. 8. Connectivity of the network paths used for the control commands of the cameras 112 and for synchronization with the time server is also checked at this point. The process then waits at the post-adjustment system operation check (S711) until the microphone adjustment has progressed.

Next, the processing related to the microphones 111 will be described. First, in S708, the control station 141 adjusts the installed microphones 111. The adjustment of the microphones 111 in S708 includes, for example, gain adjustment, and is performed for all the installed microphones. The adjustment in S708 may be realized by an automatic adjustment function or may be performed based on user operations.

Next, in S709, the control station 141 makes adjustments so that all the installed microphones are synchronized. Specifically, it checks the synchronization clock. The synchronization adjustment in S709 may be performed based on user operations or may be realized by an automatic adjustment function.

Next, in S710, the control station 141 adjusts the positions of those microphones 111, among the installed microphones 111, that are installed in the field. The position adjustment in S710 may be performed based on user operations or may be realized by an automatic adjustment function. Connectivity of the network paths used for the control commands of the microphones 111 and for synchronization with the time server is also checked at this point.

Next, in S711, the control station 141 performs a post-adjustment system operation check to confirm that the cameras 112a to 112z and the microphones 111a to 111z have been adjusted correctly. The processing of S711 can be executed based on a user instruction. If normal post-adjustment operation is confirmed for both the cameras 112 and the microphones 111 (YES in S711), the control station 141 notifies the user in S713 that operation is normal. If an error has occurred (NO in S711), the control station 141 notifies the user of the error in S712. The error notification reports the type and individual number of the camera 112 or microphone 111 in which the error occurred. The control station 141 issues a readjustment instruction based on the type and individual number of the device in which the error occurred.

Next, the installation processing (S501) and the pre-imaging processing (S502) will be described. The image processing system 100 can switch, by changing the operation mode, between a state in which installation calibration is performed and a state in which normal imaging is performed. In some cases a particular camera needs to be calibrated during imaging; in such cases the two operations, imaging and calibration, are carried out concurrently.

The installation calibration processing will be described with reference to the sequence diagram shown in FIG. 8. FIG. 8 omits the notifications of data reception completion and processing completion that are returned for instructions exchanged between devices, but it is assumed that some response is returned for each instruction.

First, when installation of the cameras 112 is complete, the user instructs the control station 141 to execute installation calibration. The control station 141 then instructs the front-end server 131 and the camera adapter 120 to start calibration (S800).

Upon receiving the calibration start instruction, the front-end server 131 treats image data received from then on as calibration data and changes its control mode so that the calibration unit 340 can process it (S802a). Although FIG. 8 shows only one front-end server 131, a plurality of front-end servers are used in the present embodiment. The control station 141 therefore issues the calibration start instruction, as well as the camera parameter estimation instruction and the calibration end instruction described later, to each of the plurality of front-end servers, and each of the plurality of front-end servers executes the processing shown in FIG. 8. Upon receiving the calibration start instruction, the camera adapter 120 shifts to a control mode in which it handles uncompressed frame images without performing image processing such as foreground/background separation (S802b). The camera adapter 120 also instructs the camera 112 to change its camera mode (S801). In response, the camera 112 changes its control mode to a mode for performing calibration (S802c). In this mode, for example, the frame rate of the camera 112 is set to 1 fps. Alternatively, the camera 112 may be set to transmit still images instead of video, or a mode may be set in which the camera adapter 120 controls the frame rate at which calibration images are transmitted.

The control station 141 instructs the camera adapter 120 to acquire the zoom value and focus value of the camera (S803), and the camera adapter 120 transmits the zoom value and focus value of the camera 112 to the control station 141 (S804). Although FIG. 8 shows only one camera adapter 120 and one camera 112, the control relating to the camera adapter 120 and the camera 112 is executed for every camera adapter 120 and every camera 112 in the image processing system 100. S803 and S804 are therefore executed once per camera, and when S803 and S804 have been completed for all the cameras 112, the control station 141 has received the zoom values and focus values of all the cameras. Next, the control station 141 transmits the zoom values and focus values of all the cameras received in S804 to the front-end server 131 (S805).

次いで、制御ステーション１４１は、フロントエンドサーバ１３１に、設置時キャリブレーション用撮像の撮像パターンを通知する（Ｓ８０６）。ここで撮像パターンには、画像特徴点となるマーカ等をグラウンド内で動かして複数回撮像する場合の、別タイミングで撮像された画像を区別するためのパターン名（例えばパターン１〜１０）の属性が付加される。つまり、フロントエンドサーバ１３１は、Ｓ８０６以降に受信したキャリブレーション用の画像データを、Ｓ８０６で受信した撮像パターンにおける撮像画像であると判定する。 Next, the control station 141 notifies the front-end server 131 of the imaging pattern for the installation-time calibration imaging (S806). Here, a pattern name attribute (for example, patterns 1 to 10) is added to the imaging pattern; it distinguishes images captured at different timings when a marker or the like serving as an image feature point is moved around the ground and imaged multiple times. That is, the front-end server 131 treats the calibration image data received after S806 as images captured under the imaging pattern received in S806.

制御ステーション１４１は、カメラアダプタ１２０に対して同期静止画撮像を指示し（Ｓ８０７）、カメラアダプタ１２０は、全てのカメラで同期した静止画撮像を行うようにカメラ１１２に指示する（Ｓ８０８）。そして、カメラ１１２は撮像画像をカメラアダプタ１２０に送信する（Ｓ８０９）。制御ステーション１４１は、カメラアダプタ１２０に対して、Ｓ８０７で撮像指示した画像を、全てのフロントエンドサーバ１３１に伝送するように指示する（Ｓ８１０）。さらに、カメラアダプタ１２０は、伝送先として指定された全てのフロントエンドサーバ１３１にＳ８０９で受信した画像を伝送する（Ｓ８１１）。なお、注視点のグループが複数ある場合には、注視点グループ毎にＳ８０６からＳ８１１のキャリブレーション用画像撮像を行っても良い。 The control station 141 instructs the camera adapter 120 to take a synchronized still image (S807), and the camera adapter 120 instructs the camera 112 to take a synchronized still image with all the cameras (S808). Then, the camera 112 transmits the captured image to the camera adapter 120 (S809). The control station 141 instructs the camera adapter 120 to transmit the image captured in S807 to all the front-end servers 131 (S810). Further, the camera adapter 120 transmits the image received in S809 to all the front-end servers 131 designated as transmission destinations (S811). When there are a plurality of gazing point groups, the calibration images of S806 to S811 may be captured for each gazing point group.

Ｓ８１１で伝送されるキャリブレーション用画像は、前景背景分離等の画像処理が行われず、撮像された画像を圧縮せずにそのまま伝送されるものとする。そのため、全カメラが高解像度で撮像を行う場合や、カメラ台数が多くなった場合、伝送帯域の制約上、全ての非圧縮画像を同時に送信することができなくなる虞がある。その結果、ワークフローの中でキャリブレーションに要する時間が長くなる虞がある。その場合、Ｓ８１０の画像伝送指示において、カメラアダプタ１２０の１台ずつに対して、キャリブレーションのパターン属性に応じた非圧縮画像の伝送指示が順番に行われる。さらにこのような場合、マーカのパターン属性に応じたより多くの特徴点を撮像する必要があるため、複数マーカを用いたキャリブレーション用の画像撮像が行われる。この場合、負荷分散の観点から、画像撮像と非圧縮画像伝送を非同期に行ってもよい。また、キャリブレーション用の画像撮像で取得した非圧縮画像を、カメラアダプタ１２０にパターン属性ごとに逐次蓄積し、並行して非圧縮画像の伝送をＳ８１０の画像伝送指示に応じて行ってもよい。これによりワークフローの処理時間やヒューマンエラーの削減を図ることができる効果がある。 The calibration image transmitted in S811 is not subjected to image processing such as foreground background separation, and the captured image is transmitted as it is without being compressed. Therefore, when all cameras perform high-resolution imaging or when the number of cameras increases, there is a risk that all uncompressed images cannot be transmitted at the same time due to restrictions on the transmission band. As a result, the time required for calibration in the workflow may increase. In that case, in the image transmission instruction of S810, the transmission instruction of the uncompressed image according to the calibration pattern attribute is sequentially given to each of the camera adapters 120. Further, in such a case, since it is necessary to image more feature points according to the pattern attribute of the marker, image imaging for calibration using a plurality of markers is performed. In this case, from the viewpoint of load distribution, image imaging and uncompressed image transmission may be performed asynchronously. Further, the uncompressed image acquired by image imaging for calibration may be sequentially accumulated in the camera adapter 120 for each pattern attribute, and the uncompressed image may be transmitted in parallel in response to the image transmission instruction of S810. This has the effect of reducing workflow processing time and human error.
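
For illustration only, the bandwidth concern above can be made concrete with rough arithmetic. The following is a minimal sketch, not the claimed implementation: the resolution, bit depth, camera count used in the example and the link rate are hypothetical figures that do not appear in this description, and `sequential_schedule` is a hypothetical helper that issues the S810 transmission instruction to one camera adapter at a time for each pattern attribute.

```python
def uncompressed_frame_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of one uncompressed frame in bits."""
    return width * height * bits_per_pixel


def transfer_seconds(num_frames: int, frame_bits: int, link_bits_per_second: float) -> float:
    """Time needed to push num_frames uncompressed frames through one shared link."""
    return num_frames * frame_bits / link_bits_per_second


def sequential_schedule(adapter_ids, pattern_names):
    """Yield (pattern, adapter) transmission instructions one at a time,
    so that only one camera adapter sends uncompressed data at any moment."""
    for pattern in pattern_names:
        for adapter in adapter_ids:
            yield pattern, adapter


# Hypothetical figures: 26 cameras, 3840x2160 pixels, 24 bits per pixel, 10 Gbit/s link.
frame_bits = uncompressed_frame_bits(3840, 2160, 24)
total_seconds = transfer_seconds(26, frame_bits, 10e9)
```

Under these assumed figures, one synchronized still image from every camera occupies the link for roughly 0.52 seconds, which is why the instruction is issued to the camera adapters in turn rather than to all of them at once.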

全てのカメラ１１２においてＳ８１１の処理が完了した時点で、全てのフロントエンドサーバ１３１は、全カメラ分の撮像画像を受信できている状態となる。前述したように、撮像パターンが複数ある場合には、Ｓ８０６からＳ８１１の処理がパターン数分繰り返される。 When the processing of S811 has been completed for all the cameras 112, every front-end server 131 has received the captured images from all the cameras. As described above, when there are a plurality of imaging patterns, the processes of S806 to S811 are repeated once per pattern.

次いで、全てのキャリブレーション用撮像が完了すると、制御ステーション１４１は、フロントエンドサーバ１３１に対して、カメラパラメータ推定処理を指示する（Ｓ８１２）。フロントエンドサーバ１３１は、カメラパラメータ推定処理指示を受けると、Ｓ８０５で受信した全カメラ分のズーム値とフォーカス値、及びＳ８１１で受信した全カメラ分の撮像画像を用いて、カメラパラメータ推定処理を行う（Ｓ８１３）。Ｓ８１３におけるカメラパラメータ推定処理の詳細については後述する。なお、注視点が複数ある場合には、注視点グループ毎にＳ８１３のカメラパラメータ推定処理を行うものとする。そして、フロントエンドサーバ１３１は、Ｓ８１３のカメラパラメータ推定処理の結果として導出された全カメラ分のカメラパラメータをデータベース１３２に送信して保存する（Ｓ８１４）。 Next, when all the calibration imaging is completed, the control station 141 instructs the front-end server 131 to perform camera parameter estimation processing (S812). Upon receiving the camera parameter estimation instruction, the front-end server 131 performs the camera parameter estimation processing using the zoom values and focus values for all the cameras received in S805 and the captured images for all the cameras received in S811 (S813). The details of the camera parameter estimation processing in S813 will be described later. When there are a plurality of gazing points, the camera parameter estimation processing of S813 is performed for each gazing point group. The front-end server 131 then transmits the camera parameters for all the cameras, derived as the result of the camera parameter estimation processing in S813, to the database 132, where they are stored (S814).

また、フロントエンドサーバ１３１は、制御ステーション１４１に対しても同様に全カメラ分のカメラパラメータを送信（Ｓ８１５）する。制御ステーション１４１は、カメラアダプタ１２０に対して、カメラ１１２に対応するカメラパラメータを送信する（Ｓ８１６）。カメラアダプタ１２０は、自身に接続されているカメラ１１２のカメラパラメータを受信し、保存する（Ｓ８１７）。 Further, the front-end server 131 also transmits camera parameters for all cameras to the control station 141 (S815). The control station 141 transmits the camera parameters corresponding to the camera 112 to the camera adapter 120 (S816). The camera adapter 120 receives and stores the camera parameters of the camera 112 connected to itself (S817).

そして、制御ステーション１４１は、キャリブレーション結果を確認する（Ｓ８１８）。確認方法としては、導出されたカメラパラメータの数値を確認しても良いし、Ｓ８１４のカメラパラメータ推定処理の演算過程を確認しても良いし、カメラパラメータを用いて画像生成を行い、生成された画像を確認するようにしても良い。そして、制御ステーション１４１は、カメラアダプタ１２０とフロントエンドサーバ１３１に対して、キャリブレーション終了を指示する（Ｓ８１９）。 Then, the control station 141 confirms the calibration result (S818). As the confirmation method, the numerical values of the derived camera parameters may be checked, the calculation process of the camera parameter estimation processing of S814 may be checked, or an image may be generated using the camera parameters and the generated image checked. The control station 141 then instructs the camera adapter 120 and the front-end server 131 to end the calibration (S819).

フロントエンドサーバ１３１はキャリブレーション終了指示を受けると、Ｓ８０２ａのキャリブレーション開始処理とは逆に、それ以降に受信した画像データをキャリブレーション用データでないと判定するよう制御モードを変更する（Ｓ８２１ａ）。またカメラアダプタ１２０はキャリブレーション終了指示を受けると、Ｓ８０２ｂで実行したキャリブレーション開始処理とは逆に、前景背景分離等の画像処理を行う制御モードに移行する（Ｓ８２１ｂ）。さらに、カメラアダプタ１２０は、カメラ１１２に対してカメラモード変更を指示する（Ｓ８２０）。これを受けたカメラ１１２は、Ｓ８０２ｃで実行したキャリブレーション開始処理とは逆に、通常の撮像を行うモードに移行する（Ｓ８２１ｃ）。 Upon receiving the calibration end instruction, the front-end server 131 changes the control mode so as to determine that the image data received thereafter is not the calibration data, contrary to the calibration start process of S802a (S821a). When the camera adapter 120 receives the calibration end instruction, the camera adapter 120 shifts to the control mode for performing image processing such as foreground and background separation, contrary to the calibration start process executed in S802b (S821b). Further, the camera adapter 120 instructs the camera 112 to change the camera mode (S820). Upon receiving this, the camera 112 shifts to the normal imaging mode (S821c), contrary to the calibration start process executed in S802c.

以上の処理により、設置時キャリブレーション処理として、全カメラ分のカメラパラメータを導出し、導出されたカメラパラメータをカメラアダプタ１２０及びデータベース１３２に保存することができる。 By the above processing, as the calibration process at the time of installation, the camera parameters for all the cameras can be derived, and the derived camera parameters can be saved in the camera adapter 120 and the database 132.
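
As a non-limiting illustration, the mode handling on the front-end server side (S802a, S806 and S821a) can be sketched as a small state holder. The class and method names below are hypothetical and are used only for this example; the description does not specify how the front-end server 131 stores the received data.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    CALIBRATION = "calibration"


class FrontEndServerModel:
    """Toy model of how received images are classified by control mode."""

    def __init__(self):
        self.mode = Mode.NORMAL
        self.current_pattern = None
        self.calibration_images = {}  # pattern name -> list of (camera_id, image)
        self.normal_images = []       # list of (camera_id, image)

    def start_calibration(self):
        """Corresponds to S802a: later images count as calibration data."""
        self.mode = Mode.CALIBRATION

    def set_pattern(self, pattern_name):
        """Corresponds to S806: later calibration images belong to this pattern."""
        self.current_pattern = pattern_name

    def end_calibration(self):
        """Corresponds to S821a: later images are no longer calibration data."""
        self.mode = Mode.NORMAL
        self.current_pattern = None

    def receive(self, camera_id, image):
        """Store an incoming image according to the current control mode."""
        if self.mode is Mode.CALIBRATION:
            bucket = self.calibration_images.setdefault(self.current_pattern, [])
            bucket.append((camera_id, image))
        else:
            self.normal_images.append((camera_id, image))
```

In this sketch an image is classified purely by the mode in force when it arrives, which mirrors the statement that image data received after the start instruction is judged to be calibration data.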

また、上述した設置時キャリブレーション処理は、カメラ設置後及び撮像前に実施され、カメラが動かされなければ再度処理する必要はないが、カメラを動かす場合（例えば、試合の前半と後半とで注視点を変更するなど）には、再度同様の処理が行われる。 The installation-time calibration process described above is performed after the cameras are installed and before imaging, and does not need to be repeated as long as the cameras are not moved. When a camera is moved (for example, when the gazing point is changed between the first half and the second half of a game), the same process is performed again.

また、撮像中にボールがぶつかる等のアクシデントにより所定の閾値以上にカメラ１１２が動いてしまった場合に、そのカメラ１１２を撮像状態からキャリブレーション開始状態に遷移させ上述の設置時キャリブレーションを行っても良い。その場合、システムとしては通常の撮像状態を維持し、そのカメラ１１２のみがキャリブレーション用画像を伝送している旨をフロントエンドサーバ１３１に通知することで、システム全体をキャリブレーションモードにする必要はなく撮像の継続性を図れる。さらには、本システムのデイジーチェーンでの伝送においては、通常の撮像における画像データの伝送帯域にキャリブレーション用の非圧縮画像を送ると、伝送帯域制限を超過する場合が考えられる。この場合、非圧縮画像の伝送優先度を下げたり、非圧縮画像を分割して送信したりすることで対応する。さらには、カメラアダプタ１２０間の接続が１０ＧｂＥなどの場合は、全二重の特徴を使うことで、通常の撮像の画像データ伝送とは逆向きに非圧縮画像を伝送することで帯域確保が図れる。 Further, when a camera 112 has moved by a predetermined threshold or more due to an accident such as being hit by a ball during imaging, that camera 112 may be transitioned from the imaging state to the calibration start state, and the installation-time calibration described above may be performed. In that case, the system as a whole maintains the normal imaging state, and the front-end server 131 is notified that only that camera 112 is transmitting calibration images. The entire system therefore does not need to be placed in the calibration mode, and imaging can continue. Furthermore, in the daisy-chain transmission of this system, sending uncompressed calibration images over the transmission band used for image data in normal imaging may exceed the transmission band limit. In this case, the transmission priority of the uncompressed images is lowered, or the uncompressed images are divided before transmission. In addition, when the camera adapters 120 are connected by 10 GbE or the like, the full-duplex property can be used to transmit the uncompressed images in the direction opposite to the image data transmission of normal imaging, thereby securing bandwidth.

また、複数の注視点のうちの１つの注視点を変更したい場合など、１つの注視点グループのカメラ１１２のみ、上述した設置時キャリブレーション処理を再度行うようにしても良い。その場合、キャリブレーション処理中は、対象の注視点グループのカメラ１１２については、通常の画像撮像及び仮想視点画像生成を行うことができない。そのため、キャリブレーション処理中であることが制御ステーション１４１に通知され、制御ステーション１４１が仮想カメラ操作ＵＩ１４２に対して視点操作の制限をかけるなどの処理を要求する。フロントエンドサーバ１３１では、仮想視点画像生成の処理に影響が出ないよう制御してカメラパラメータ推定処理を行うものとする。 Further, when it is desired to change the gazing point of one of the plurality of gazing points, the above-described installation calibration process may be performed again only for the camera 112 of one gazing point group. In that case, during the calibration process, it is not possible to perform normal image imaging and virtual viewpoint image generation for the camera 112 of the target gazing point group. Therefore, the control station 141 is notified that the calibration process is in progress, and the control station 141 requests a process such as restricting the viewpoint operation on the virtual camera operation UI 142. The front-end server 131 controls the camera parameter estimation process so as not to affect the virtual viewpoint image generation process.

次に、カメラ１１２による撮像、マイク１１１による収音、及び、撮像又は収音されたデータをカメラアダプタ１２０及びフロントエンドサーバ１３１を介してデータベース１３２へ蓄積する処理について説明する。 Next, imaging by the camera 112, sound collection by the microphone 111, and processing of accumulating the imaged or collected data in the database 132 via the camera adapter 120 and the front-end server 131 will be described.

図９Ａ及び図９Ｂを参照して、カメラ１１２の撮像開始処理シーケンスについて説明する。図９Ａ及び図９Ｂはそれぞれ内容が異なる処理シーケンスを示しているが、何れのシーケンスに従っても同様の結果を得ることができる。カメラアダプタ１２０は、図９Ａに示した処理を行うか図９Ｂに示した処理を行うかを、カメラ１１２の仕様に応じて選択する。例えば、カメラアダプタ１２０は、接続されているカメラ１１２がタイムコード（ＴｉｍｅＣｏｄｅ）を含むメタ情報を撮像画像に付与することが可能である場合には図９Ａの処理を、そうでない場合には図９Ｂの処理を実行する。 The imaging start processing sequence of the camera 112 will be described with reference to FIGS. 9A and 9B. FIGS. 9A and 9B show processing sequences with different contents, but the same result is obtained with either sequence. The camera adapter 120 selects whether to perform the process shown in FIG. 9A or the process shown in FIG. 9B according to the specifications of the camera 112. For example, the camera adapter 120 executes the process of FIG. 9A when the connected camera 112 is capable of adding meta information including a time code (TimeCode) to the captured image, and executes the process of FIG. 9B otherwise.
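
By way of example only, the selection between the two sequences can be sketched as a single capability check. The `CameraSpec` type and its field name are hypothetical; the description states only that the choice depends on whether the camera 112 can add meta information including a time code to the captured image.

```python
from dataclasses import dataclass


@dataclass
class CameraSpec:
    # Hypothetical capability flag: True if the camera itself can embed
    # a time code in the meta information of each captured image.
    supports_timecode_metadata: bool


def select_start_sequence(spec: CameraSpec) -> str:
    """Return "9A" when the camera stamps its own time code, otherwise "9B"."""
    return "9A" if spec.supports_timecode_metadata else "9B"
```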

まず図９Ａについて説明する。タイムサーバ１３４は例えばＧＰＳ９００などと時刻同期を行い、タイムサーバ１３４内で管理される時刻の設定を行う（Ｓ９０１）。なおＧＰＳ９００を用いた方法に限定されるものではなく、ＮＴＰ（ＮｅｔｗｏｒｋＴｉｍｅＰｒｏｔｏｃｏｌ）など他の方法で時刻を設定してもよい。次にカメラアダプタ１２０はタイムサーバ１３４との間でＰＴＰ（ＰｒｅｃｉｓｉｏｎＴｉｍｅＰｒｏｔｏｃｏｌ）を使用した通信を行い、カメラアダプタ１２０内で管理される時刻を補正しタイムサーバ１３４と時刻同期を行う（Ｓ９０２）。 First, FIG. 9A will be described. The time server 134 synchronizes its time with, for example, the GPS 900, and sets the time managed in the time server 134 (S901). The method is not limited to one using the GPS 900; the time may be set by another method such as NTP (Network Time Protocol). Next, the camera adapter 120 communicates with the time server 134 using PTP (Precision Time Protocol), corrects the time managed in the camera adapter 120, and synchronizes its time with the time server 134 (S902).

カメラアダプタ１２０はカメラ１１２に対して、Ｇｅｎｌｏｃｋ信号や３値同期信号等の同期撮像信号及びタイムコード信号を、撮像フレームに同期して提供し始める（Ｓ９０３）。なお提供される情報はタイムコードに限定されるものではなく、撮像フレームを識別できる識別子であれば他の情報でもよい。次に、カメラアダプタ１２０はカメラ１１２に対して撮像開始指示を行う（Ｓ９０４）。カメラ１１２は撮像開始指示を受けると、Ｇｅｎｌｏｃｋ信号に同期して撮像を行う（Ｓ９０５）。 The camera adapter 120 starts to provide the camera 112 with a synchronized imaging signal such as a Genlock signal and a ternary synchronization signal and a time code signal in synchronization with the imaging frame (S903). The information provided is not limited to the time code, and other information may be used as long as it is an identifier that can identify the imaging frame. Next, the camera adapter 120 gives an imaging start instruction to the camera 112 (S904). When the camera 112 receives an imaging start instruction, it performs imaging in synchronization with the Genlock signal (S905).

次に、カメラ１１２は撮像した画像にタイムコード信号を含めて、画像データ（撮像画像）としてカメラアダプタ１２０へ送信する（Ｓ９０６）。カメラ１１２が撮像を停止するまでＧｅｎｌｏｃｋ信号に同期した撮像が行われる。カメラアダプタ１２０は撮像途中にタイムサーバ１３４との間でのＰＴＰ時刻補正処理を行い、Ｇｅｎｌｏｃｋ信号の発生タイミングを補正する（Ｓ９０７）。必要な補正量が大きくなる場合は、予め設定された変更量に応じた補正を適用してもよい。 Next, the camera 112 includes a time code signal in the captured image and transmits it as image data (captured image) to the camera adapter 120 (S906). Imaging synchronized with the Genlock signal is performed until the camera 112 stops imaging. The camera adapter 120 performs PTP time correction processing with the time server 134 during imaging to correct the generation timing of the Genlock signal (S907). When the required correction amount becomes large, the correction according to the preset change amount may be applied.
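
As one possible reading of the statement that a correction according to a preset change amount may be applied when the required correction is large, the correction applied in each cycle can be clamped to a maximum step so that the Genlock timing is shifted gradually. The following is a minimal sketch under that assumption; the function names, the nanosecond unit and the step size are hypothetical.

```python
from typing import List


def limited_correction(offset_ns: int, max_step_ns: int) -> int:
    """Return the correction to apply in this cycle, clamped to +/- max_step_ns."""
    if max_step_ns <= 0:
        raise ValueError("max_step_ns must be positive")
    return max(-max_step_ns, min(max_step_ns, offset_ns))


def converge(offset_ns: int, max_step_ns: int) -> List[int]:
    """Apply clamped corrections repeatedly until the remaining offset is zero."""
    steps = []
    while offset_ns != 0:
        step = limited_correction(offset_ns, max_step_ns)
        steps.append(step)
        offset_ns -= step
    return steps
```

For instance, with an offset of 2500 ns and a preset step of 1000 ns, the sketch applies corrections of 1000, 1000 and 500 ns over three cycles instead of a single 2500 ns jump.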

以上により、システム内の複数のカメラアダプタ１２０に接続する複数のカメラ１１２の同期撮像を実現する事ができる。 As described above, it is possible to realize synchronous imaging of a plurality of cameras 112 connected to a plurality of camera adapters 120 in the system.

次に図９Ｂについて説明する。まず図９Ａの場合と同様に、カメラアダプタ１２０、タイムサーバ１３４及びＧＰＳ９００の間で時刻同期処理が行われる（Ｓ９０１、Ｓ９０２）。次に、カメラアダプタ１２０は撮像開始指示を行う（Ｓ９５３）。撮像開始指示の中には撮像期間やフレーム数を指定する情報が含まれる。カメラ１１２は撮像開始指示に従い撮像を行い（Ｓ９５４）、撮像した画像データ（撮像画像）をカメラアダプタ１２０へ送信する（Ｓ９５５）。画像データを受取ったカメラアダプタ１２０は、画像データのメタ情報にタイムコードを付与する（Ｓ９５６）。 Next, FIG. 9B will be described. First, as in the case of FIG. 9A, the time synchronization process is performed between the camera adapter 120, the time server 134, and the GPS 900 (S901, S902). Next, the camera adapter 120 gives an imaging start instruction (S953). The imaging start instruction includes information for specifying the imaging period and the number of frames. The camera 112 performs imaging according to the imaging start instruction (S954), and transmits the captured image data (captured image) to the camera adapter 120 (S955). The camera adapter 120 that has received the image data adds a time code to the meta information of the image data (S956).
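
For illustration, the time code assignment of S956 can be sketched as follows. This is a sketch under stated assumptions: a non-drop-frame `HH:MM:SS:FF` representation, an integer frame rate, and a frame index counted from the start of imaging are all choices made for this example, and the dictionary used as meta information is hypothetical.

```python
def frame_to_timecode(frame_index: int, fps: int) -> str:
    """Convert a frame index into a non-drop-frame HH:MM:SS:FF time code."""
    if fps <= 0 or frame_index < 0:
        raise ValueError("fps must be positive and frame_index non-negative")
    frames = frame_index % fps
    total_seconds = frame_index // fps
    seconds = total_seconds % 60
    minutes = (total_seconds // 60) % 60
    hours = (total_seconds // 3600) % 24
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def attach_timecode(image: bytes, frame_index: int, fps: int) -> dict:
    """Bundle image data with meta information carrying its time code."""
    return {"meta": {"timecode": frame_to_timecode(frame_index, fps)}, "image": image}
```

Because every camera adapter 120 is synchronized to the time server 134, adapters that derive the time code in the same way would assign the same value to frames captured at the same timing.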

カメラアダプタ１２０は撮像途中にタイムサーバ１３４との間でのＰＴＰ時刻補正処理を行い、カメラ１１２に対して撮像タイミングの補正を行う。必要な補正量が大きくなる場合は、予め設定された変更量に応じた補正を適用してもよい。例えば、１フレーム毎など短いタイミングで撮像開始指示が繰返し行われ、上述したＳ９５３〜Ｓ９５６の処理が繰り返される（Ｓ９５７〜Ｓ９６０）。 The camera adapter 120 performs PTP time correction processing with the time server 134 during imaging, and corrects the imaging timing for the camera 112. When the required correction amount becomes large, the correction according to the preset change amount may be applied. For example, the imaging start instruction is repeatedly performed at a short timing such as every frame, and the above-mentioned processes S953 to S956 are repeated (S957 to S960).

なお、図９Ａ及び図９Ｂではカメラ１１２の撮像開始処理シーケンスについて説明したが、マイク１１１もカメラ１１２の同期撮像と同様の処理を行い、同期収音を行う。 Although the imaging start processing sequence of the camera 112 has been described in FIGS. 9A and 9B, the microphone 111 also performs the same processing as the synchronous imaging of the camera 112 to perform synchronous sound collection.

次に、本実施形態における複数のカメラアダプタ１２０（１２０ａ、１２０ｂ、１２０ｃ、及び１２０ｄ）が連動して三次元モデル情報を生成する処理シーケンスについて図１０を用いて説明する。なお、処理の順番は図に示したものに限定される訳ではない。 Next, a processing sequence in which a plurality of camera adapters 120 (120a, 120b, 120c, and 120d) in the present embodiment work together to generate three-dimensional model information will be described with reference to FIG. The order of processing is not limited to that shown in the figure.

なお、本実施形態の画像処理システム１００には２６台のカメラ１１２とカメラアダプタ１２０が含まれるが、ここでは２台のカメラ１１２ｂと１１２ｃ、及び、４台のカメラアダプタ１２０ａ、１２０ｂ、１２０ｃ、及び１２０ｄに注目して説明する。４台のカメラアダプタは、デイジーチェーンで接続されており、上流から下流に向かってカメラアダプタ１２０ａ、１２０ｂ、１２０ｃ、１２０ｄの順番で接続されている。また、カメラ１１２ｂはカメラアダプタ１２０ｂに、カメラ１１２ｃはカメラアダプタ１２０ｃに、其々接続されている。なおカメラアダプタ１２０ａ及びカメラアダプタ１２０ｄに接続されるカメラ１１２ａ、１１２ｄ、各々のカメラアダプタ１２０に接続するマイク１１１、雲台１１３、及び外部センサ１１４については省略する。また、カメラアダプタ１２０ａ〜１２０ｄはタイムサーバ１３４と時刻同期が完了し、撮像状態となっているものとする。 The image processing system 100 of the present embodiment includes 26 cameras 112 and camera adapters 120, but the description here focuses on two cameras 112b and 112c and four camera adapters 120a, 120b, 120c, and 120d. The four camera adapters are connected in a daisy chain, in the order of camera adapters 120a, 120b, 120c, and 120d from upstream to downstream. The camera 112b is connected to the camera adapter 120b, and the camera 112c is connected to the camera adapter 120c. The cameras 112a and 112d connected to the camera adapters 120a and 120d, and the microphone 111, the pan head 113, and the external sensor 114 connected to each camera adapter 120, are omitted from the description. It is also assumed that the camera adapters 120a to 120d have completed time synchronization with the time server 134 and are in the imaging state.

カメラ１１２ｂ及びカメラ１１２ｃは其々の下流に接続されているカメラアダプタ１２０ｂ及び１２０ｃに対して撮像画像（１）及び撮像画像（２）を送信する（Ｓ１００１、Ｓ１００２）。カメラアダプタ１２０ｂ及び１２０ｃにおいて、其々のキャリブレーション制御部２３３が、受信した撮像画像（１）及び撮像画像（２）に対してキャリブレーション処理を行う（Ｓ１００３、Ｓ１００４）。キャリブレーション処理は例えば色補正やブレ補正等である。なお、本実施形態ではキャリブレーション処理を実施するが、必ずしも実施しなくてもよい。次に、キャリブレーション処理済の撮像画像（１）及び撮像画像（２）に対して、カメラアダプタ１２０ｂ及び１２０ｃの前景背景分離部２３１が前景背景分離処理を行う（Ｓ１００５、Ｓ１００６）。 The camera 112b and the camera 112c transmit the captured image (1) and the captured image (2) to the camera adapters 120b and 120c connected downstream thereof (S1001, S1002). In the camera adapters 120b and 120c, the calibration control units 233 perform calibration processing on the received captured images (1) and captured images (2) (S1003, S1004). The calibration process is, for example, color correction or blur correction. Although the calibration process is performed in this embodiment, it is not always necessary to perform the calibration process. Next, the foreground background separation unit 231 of the camera adapters 120b and 120c performs the foreground background separation processing on the calibrated captured image (1) and the captured image (2) (S1005, S1006).

次に、カメラアダプタ１２０ｂ及び１２０ｃにおいて、データ圧縮・伸張部２２１が前景背景分離部２３１により分離された前景画像及び背景画像に対して圧縮を行う（Ｓ１００７、Ｓ１００８）。なお分離した前景画像及び背景画像の其々の重要度に応じて圧縮率が変更されてもよい。また、場合によっては圧縮が行われなくてもよい。例えば、カメラアダプタ１２０は、背景画像よりも前景画像の圧縮率が低くなるように、前景画像と背景画像とのうち少なくとも背景画像を圧縮して下流のカメラアダプタ１２０に対して出力する。前景画像も背景画像も圧縮する場合、重要な撮像対象を含む前景画像に対してはロスレス圧縮を行い、撮像対象を含まない背景画像に対してはロスあり圧縮を行う。これにより、この後に次のカメラアダプタ１２０ｃまたはカメラアダプタ１２０ｄに伝送されるデータ量を効率的に削減する事ができる。例えばサッカー、ラグビー及び野球等が開催されるスタジアムのフィールドを撮像した場合には、画像の大半が背景画像で構成され、プレーヤ等の前景画像の領域が小さいという特徴があるため、伝送データ量を大きく削減できることをここに明記しておく。 Next, in the camera adapters 120b and 120c, the data compression/decompression unit 221 compresses the foreground image and the background image separated by the foreground/background separation unit 231 (S1007, S1008). The compression ratio may be changed according to the respective importance of the separated foreground image and background image. In some cases, compression need not be performed. For example, the camera adapter 120 compresses at least the background image, out of the foreground image and the background image, so that the compression ratio of the foreground image is lower than that of the background image, and outputs the result to the downstream camera adapter 120. When both the foreground image and the background image are compressed, lossless compression is applied to the foreground image, which contains the important imaging target, and lossy compression is applied to the background image, which does not. This efficiently reduces the amount of data subsequently transmitted to the next camera adapter 120c or camera adapter 120d. Note that when the field of a stadium hosting soccer, rugby, baseball, or the like is imaged, most of the image consists of background and the foreground regions, such as players, are small, so the amount of transmitted data can be reduced substantially.

さらには、カメラアダプタ１２０ｂ又はカメラアダプタ１２０ｃは、重要度に応じて、それぞれの下流に接続されているカメラアダプタ１２０ｃまたはカメラアダプタ１２０ｄに対して出力する画像のフレームレートを変更してもよい。例えば、重要な撮像対象を含む前景画像は高フレームレートで出力し、撮像対象を含まない背景画像は低フレームレートで出力するようにしてもよい。この事によって次のカメラアダプタ１２０ｃまたはカメラアダプタ１２０ｄに伝送されるデータ量をさらに削減する事ができる。またカメラ１１２の設置場所、撮像場所、及び／又はカメラ１１２の性能などに応じて、カメラアダプタ１２０毎に圧縮率や伝送フレームレートを変更してもよい。また、スタジアムの観客席等の三次元構造は図面を用いて事前に確認することができるため、カメラアダプタ１２０は背景画像から観客席の部分を除いた画像を伝送してもよい。これにより、後述のレンダリングの時点で、事前に生成したスタジアム三次元構造を利用することで試合中のプレーヤに重点化した画像レンダリングを実施し、システム全体で伝送及び記憶されるデータ量の削減ができるという効果が生まれる。 Further, the camera adapter 120b or the camera adapter 120c may change, according to importance, the frame rate of the images it outputs to the camera adapter 120c or the camera adapter 120d connected downstream of it. For example, the foreground image, which contains the important imaging target, may be output at a high frame rate, and the background image, which does not, may be output at a low frame rate. This further reduces the amount of data transmitted to the next camera adapter 120c or camera adapter 120d. The compression ratio and the transmission frame rate may also be changed for each camera adapter 120 according to the installation location of the camera 112, the imaging location, and/or the performance of the camera 112. Further, since the three-dimensional structure of the spectator seats and the like of the stadium can be confirmed in advance from drawings, the camera adapter 120 may transmit an image obtained by removing the spectator seat portion from the background image. As a result, at the time of rendering described later, image rendering focused on the players in the game can be performed by using the three-dimensional structure of the stadium generated in advance, which has the effect of reducing the amount of data transmitted and stored in the entire system.
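
As a non-limiting illustration of treating the foreground and background differently, the sketch below compresses the foreground losslessly, applies a lossy step to the background, and sends the background only at intervals. The use of `zlib`, the bit-depth quantization used as the lossy step, and the interval parameter are assumptions made for this example and are not the compression scheme of the embodiment.

```python
import zlib


def compress_foreground(pixels: bytes) -> bytes:
    """Lossless compression: the original bytes are fully recoverable."""
    return zlib.compress(pixels)


def compress_background(pixels: bytes, keep_bits: int = 4) -> bytes:
    """Lossy compression: discard the low-order bits of each byte, then deflate."""
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be between 1 and 8")
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    quantized = bytes(p & mask for p in pixels)
    return zlib.compress(quantized)


def should_send_background(frame_number: int, background_interval: int) -> bool:
    """Send the background only once every background_interval frames."""
    return frame_number % background_interval == 0
```

With a hypothetical interval of 60, the foreground would be sent for every frame while the background would accompany only frames 0, 60, 120 and so on.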

次にカメラアダプタ１２０は、圧縮した前景画像及び背景画像を下流に隣接するカメラアダプタ１２０に転送する（Ｓ１０１０、Ｓ１０１１、Ｓ１０１２）。なお、本実施形態では前景画像及び背景画像は同時に転送されているが、其々が個別に転送されてもよい。 Next, the camera adapter 120 transfers the compressed foreground image and background image to the adjacent camera adapter 120 downstream (S1010, S1011, S1012). In the present embodiment, the foreground image and the background image are transferred at the same time, but each may be transferred individually.

次に、カメラアダプタ１２０ｂは、上流のカメラアダプタ１２０ａから受信した前景画像と前景背景分離処理Ｓ１００５で分離した前景画像とを使用して三次元モデル情報を作成する（Ｓ１０１３）。同様にカメラアダプタ１２０ｃも三次元モデル情報を作成する（Ｓ１０１４）。 Next, the camera adapter 120b creates three-dimensional model information using the foreground image received from the upstream camera adapter 120a and the foreground image separated by the foreground background separation process S1005 (S1013). Similarly, the camera adapter 120c also creates three-dimensional model information (S1014).

次に、カメラアダプタ１２０ｂは上流のカメラアダプタ１２０ａから受信した前景画像及び背景画像を、下流のカメラアダプタ１２０ｃへ転送する（Ｓ１０１５）。カメラアダプタ１２０ｃも同様にカメラアダプタ１２０ｄへ前景画像及び背景画像を転送する（Ｓ１０１６）。なお、本実施形態では前景画像及び背景画像は同時に転送されているが、其々が個別に転送されてもよい。さらに、カメラアダプタ１２０ｃは、カメラアダプタ１２０ａが作成し、カメラアダプタ１２０ｂから受信した前景画像及び背景画像をカメラアダプタ１２０ｄへ転送する（Ｓ１０１７）。 Next, the camera adapter 120b transfers the foreground image and the background image received from the upstream camera adapter 120a to the downstream camera adapter 120c (S1015). The camera adapter 120c likewise transfers the foreground image and the background image to the camera adapter 120d (S1016). In the present embodiment, the foreground image and the background image are transferred at the same time, but they may be transferred individually. Further, the camera adapter 120c transfers to the camera adapter 120d the foreground image and the background image that were created by the camera adapter 120a and received from the camera adapter 120b (S1017).

次に、カメラアダプタ１２０ａ〜１２０ｃの各々は、作成した三次元モデル情報を其々下流に接続されているカメラアダプタ１２０ｂ〜１２０ｄへ転送する（Ｓ１０１８、Ｓ１０１９、Ｓ１０２０）。さらに、カメラアダプタ１２０ｂ及び１２０ｃは、逐次受信した三次元モデル情報を次のカメラアダプタ１２０ｃ及び１２０ｄへ転送する（Ｓ１０２１、Ｓ１０２２）。さらに、カメラアダプタ１２０ｃは、カメラアダプタ１２０ａが作成し、カメラアダプタ１２０ｂから受信した三次元モデル情報をカメラアダプタ１２０ｄへ転送する（Ｓ１０２３）。 Next, each of the camera adapters 120a to 120c transfers the created three-dimensional model information to the camera adapters 120b to 120d connected downstream (S1018, S1019, S1020). Further, the camera adapters 120b and 120c transfer the sequentially received three-dimensional model information to the next camera adapters 120c and 120d (S1021, S1022). Further, the camera adapter 120c transfers the three-dimensional model information created by the camera adapter 120a and received from the camera adapter 120b to the camera adapter 120d (S1023).

こうして、最終的に、カメラアダプタ１２０ａ〜１２０ｄが作成した前景画像、背景画像、及び三次元モデル情報は、ネットワーク接続されたカメラアダプタ１２０間を逐次伝送され、フロントエンドサーバ１３１に伝送される。 In this way, finally, the foreground image, the background image, and the three-dimensional model information created by the camera adapters 120a to 120d are sequentially transmitted between the network-connected camera adapters 120 and transmitted to the front-end server 131.
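
The relay behavior described above can be sketched, for illustration only, as a simple simulation in which each camera adapter passes downstream its own data together with everything it received from upstream. The function name and the `"front_end_server"` label are hypothetical, and the sketch ignores timing and the distinction between foreground images, background images and three-dimensional model information.

```python
def relay_through_chain(adapter_ids):
    """Return (sender, receiver, origins) for each hop of the daisy chain,
    where origins lists the adapters whose data crosses that hop."""
    hops = []
    carried = []
    for index, adapter in enumerate(adapter_ids):
        carried = carried + [adapter]  # own data joins the data from upstream
        if index + 1 < len(adapter_ids):
            receiver = adapter_ids[index + 1]
        else:
            receiver = "front_end_server"  # final stage outputs to the server
        hops.append((adapter, receiver, list(carried)))
    return hops
```

Applied to the four adapters discussed here, the last hop from the camera adapter 120d carries the data of all four adapters, so the server side receives everything through the final-stage adapter regardless of how many adapters are in the chain.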

なお、本シーケンス図ではカメラアダプタ１２０ａ及びカメラアダプタ１２０ｄのキャリブレーション処理、前景背景分離処理、圧縮処理、及び三次元モデル情報作成処理については記載を省略している。しかし実際には、カメラアダプタ１２０ａ及びカメラアダプタ１２０ｄも、カメラアダプタ１２０ｂやカメラアダプタ１２０ｃと同様の処理を行い、前景画像、背景画像及び三次元モデル情報を作成している。また、ここでは４台のカメラアダプタ１２０間のデータ転送シーケンスについて説明したが、カメラアダプタ１２０の数が増えても同様の処理が行われる。 In this sequence diagram, the calibration process, the foreground background separation process, the compression process, and the three-dimensional model information creation process of the camera adapter 120a and the camera adapter 120d are omitted. However, in reality, the camera adapter 120a and the camera adapter 120d also perform the same processing as the camera adapter 120b and the camera adapter 120c to create a foreground image, a background image, and three-dimensional model information. Further, although the data transfer sequence between the four camera adapters 120 has been described here, the same processing is performed even if the number of camera adapters 120 increases.

ここまで説明したように、複数のカメラアダプタ１２０のうち、予め定められた順序における最終段のカメラアダプタ１２０以外のカメラアダプタ１２０は、対応するカメラ１１２による撮像画像から所定領域を抽出する。そしてその抽出結果に基づく画像データは、上記の予め定められた順序において次のカメラアダプタ１２０へ出力される。一方、上記の予め定められた順序における最終段のカメラアダプタ１２０は、抽出結果に基づく画像データをフロントエンドサーバ１３１へ出力する。すなわち、複数のカメラアダプタ１２０はデイジーチェーンで接続され、各々のカメラアダプタ１２０が撮像画像から所定領域を抽出した結果に基づく画像データは、予め定められた最終段のカメラアダプタ１２０によってフロントエンドサーバ１３１へ入力される。このようなデータの伝送方式を用いることで、画像処理システム１００内におけるセンサシステム１１０の数が変動した場合の、フロントエンドサーバ１３１における処理負荷やネットワークの伝送負荷の変動を抑制することができる。 As described so far, among the plurality of camera adapters 120, each camera adapter 120 other than the final-stage camera adapter 120 in a predetermined order extracts a predetermined region from the image captured by its corresponding camera 112. The image data based on the extraction result is then output to the next camera adapter 120 in the predetermined order. The final-stage camera adapter 120 in the predetermined order, on the other hand, outputs the image data based on the extraction results to the front-end server 131. That is, the plurality of camera adapters 120 are connected in a daisy chain, and the image data based on the result of each camera adapter 120 extracting a predetermined region from its captured image is input to the front-end server 131 by the predetermined final-stage camera adapter 120. Using such a data transmission method makes it possible to suppress fluctuations in the processing load on the front-end server 131 and in the transmission load on the network when the number of sensor systems 110 in the image processing system 100 changes.

また、カメラアダプタ１２０が出力する画像データは、上記の抽出結果に基づく画像データと、予め定められた順序において前のカメラアダプタ１２０による所定領域の抽出結果に基づく画像データとを用いて生成されるデータであってもよい。例えば、各々のカメラアダプタ１２０が自身による抽出結果と前のカメラアダプタ１２０による抽出結果の差分に基づく画像データを出力することで、システム内の伝送データ量を低減することができる。上記の順序において最終段のカメラアダプタ１２０は、他のカメラ１１２による撮像画像から他のカメラアダプタ１２０により抽出された所定領域の画像データに基づく抽出画像データを上記の他のカメラアダプタ１２０から取得する。そして、自身が抽出した所定領域の抽出結果と、他のカメラアダプタ１２０から取得した抽出画像データとに応じた画像データを、仮想視点画像を生成するための画像コンピューティングサーバ１３０に対して出力する。 Further, the image data output by a camera adapter 120 may be data generated using both the image data based on its own extraction result and the image data based on the extraction result of the predetermined region by the preceding camera adapter 120 in the predetermined order. For example, the amount of data transmitted in the system can be reduced by having each camera adapter 120 output image data based on the difference between its own extraction result and the extraction result of the preceding camera adapter 120. The final-stage camera adapter 120 in the above order acquires, from the other camera adapters 120, extracted image data based on the image data of the predetermined regions that those camera adapters 120 extracted from the images captured by the other cameras 112. It then outputs, to the image computing server 130 for generating a virtual viewpoint image, image data corresponding to the extraction result of the predetermined region it extracted itself and the extracted image data acquired from the other camera adapters 120.

また、カメラアダプタ１２０は、カメラ１１２が撮像した画像を前景部分と背景部分に分け、例えばそれぞれの重要度に応じて圧縮率や伝送するフレームレートを変える。このことにより、カメラ１１２が撮像したデータの全てをフロントエンドサーバ１３１に伝送する場合よりも伝送量を低減する事ができる。また、三次元モデル生成に必要な三次元モデル情報を各々のカメラアダプタ１２０が逐次作成する。この事により、全てのデータをフロントエンドサーバ１３１に集結させ、サーバで全ての三次元モデル生成処理を行う場合と比較し、サーバの処理負荷を低減させる事ができ、よりリアルタイムに三次元モデル生成を可能とする事ができる。 Further, the camera adapter 120 divides the image captured by the camera 112 into a foreground portion and a background portion, and changes, for example, the compression ratio and the transmission frame rate according to the importance of each. This reduces the amount of transmission compared with the case where all the data captured by the camera 112 is transmitted to the front-end server 131. In addition, each camera adapter 120 sequentially creates the three-dimensional model information necessary for generating the three-dimensional model. This reduces the processing load on the server compared with the case where all the data is gathered at the front-end server 131 and the server performs all of the three-dimensional model generation processing, and it enables three-dimensional model generation closer to real time.

次に図１０〜１２を用いて、カメラアダプタ１２０から出力される画像データを、複数のフロントエンドサーバ１３１に振り分ける処理について説明する。本実施形態では、カメラアダプタが３台、フロントエンドサーバが２台の場合について説明するが、カメラアダプタとフロントエンドサーバの台数はこれに限定されるものではない。 Next, the process of distributing the image data output from the camera adapter 120 to the plurality of front-end servers 131 will be described with reference to FIGS. 10 to 12. In the present embodiment, the case where the number of camera adapters is three and the number of front-end servers is two will be described, but the number of camera adapters and front-end servers is not limited to this.

図１１に、カメラアダプタ１２０が出力するデータ形式の一例を示す。１１０１は宛先ＭＡＣアドレスであり、画像データの送信先となるフロントエンドサーバ１３１ａまたは１３１ｂのＭＡＣアドレスを格納する。これらのＭＡＣアドレスは、図６のＳ６０１において制御ステーション１４１からカメラアダプタ１２０に対して通知されたＭＡＣアドレスである。本実施形態では、全てのフロントエンドサーバ１３１ａ〜１３１ｂのＭＡＣアドレスが、全てのカメラアダプタ１２０ａ〜１２０ｚに通知される。１１０２は送信元ＭＡＣアドレスであり、画像データの送信元であるカメラアダプタ１２０のＭＡＣアドレスを格納する。１１０３はＩＰヘッダであり、送信元と宛先のＩＰアドレスを格納する。 FIG. 11 shows an example of the data format output by the camera adapter 120. Reference numeral 1101 is a destination MAC address, which stores the MAC address of the front-end server 131a or 131b to which the image data is transmitted. These MAC addresses are the MAC addresses notified from the control station 141 to the camera adapter 120 in S601 of FIG. In this embodiment, the MAC addresses of all the front-end servers 131a to 131b are notified to all the camera adapters 120a to 120z. 1102 is the source MAC address, and stores the MAC address of the camera adapter 120, which is the source of the image data. 1103 is an IP header, and stores the IP addresses of the source and the destination.

１１０４は画像データのフレーム番号であり、図７のＳ７１４において指示された動画撮像開始の始点に基づいて、フレーム毎に加算したフレーム番号を格納する。１１０５はカメラ番号であり、カメラ１１２ａ〜１１２ｚの各々に付与された番号である。カメラ番号は、それぞれのカメラ１１２の製造シリアル番号等あらかじめ付与された番号でも、図６のＳ６０１において制御ステーション１４１から入力した番号でも構わない。さらには、カメラ１１２とカメラアダプタ１２０が一対一で接続されている場合は、カメラ１１２の番号でなくカメラアダプタ１２０の番号でも構わない。１１０６は画像データのデータ実体（前景画像、背景画像、及び三次元モデル情報）である。なお図１１では、特徴的なデータのみ記載し説明しているが、上記以外の情報が含まれる場合もある。 Reference numeral 1104 is the frame number of the image data; it stores a frame number incremented for each frame, counted from the start point of moving image capture instructed in S714 of FIG. 7. Reference numeral 1105 is a camera number, which is a number assigned to each of the cameras 112a to 112z. The camera number may be a number assigned in advance, such as the manufacturing serial number of each camera 112, or a number input from the control station 141 in S601 of FIG. 6. Further, when the cameras 112 and the camera adapters 120 are connected one-to-one, the number of the camera adapter 120 may be used instead of the number of the camera 112. Reference numeral 1106 is the data body of the image data (foreground image, background image, and three-dimensional model information). Although FIG. 11 shows and describes only the characteristic data, information other than the above may also be included.
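The field order of FIG. 11 (1101 to 1106) can be illustrated with a minimal Python sketch. The description does not state field widths, so the sizes used here are assumptions: 6-byte MAC addresses, a 20-byte IP header, a 4-byte frame number, and a 2-byte camera number. The class name `ImagePacket` and its methods are hypothetical and are not taken from the patent.

```python
import struct
from dataclasses import dataclass

MAC_LEN = 6          # assumption: standard 48-bit Ethernet address
IP_HEADER_LEN = 20   # assumption: IPv4 header without options
NUMBERS_FMT = "!IH"  # assumption: 4-byte frame number, 2-byte camera number


@dataclass
class ImagePacket:
    dst_mac: bytes      # 1101: MAC address of front-end server 131a or 131b
    src_mac: bytes      # 1102: MAC address of the sending camera adapter 120
    ip_header: bytes    # 1103: carries source and destination IP addresses
    frame_number: int   # 1104: incremented per frame from the capture start
    camera_number: int  # 1105: camera (or camera adapter) number
    payload: bytes      # 1106: foreground, background, 3D model information

    def pack(self) -> bytes:
        """Serialize the fields in the order 1101..1106."""
        if len(self.dst_mac) != MAC_LEN or len(self.src_mac) != MAC_LEN:
            raise ValueError("MAC addresses must be 6 bytes in this sketch")
        if len(self.ip_header) != IP_HEADER_LEN:
            raise ValueError("IP header must be 20 bytes in this sketch")
        numbers = struct.pack(NUMBERS_FMT, self.frame_number, self.camera_number)
        return self.dst_mac + self.src_mac + self.ip_header + numbers + self.payload

    @classmethod
    def unpack(cls, raw: bytes) -> "ImagePacket":
        """Parse bytes produced by pack() back into an ImagePacket."""
        dst = raw[0:MAC_LEN]
        src = raw[MAC_LEN:2 * MAC_LEN]
        ip_end = 2 * MAC_LEN + IP_HEADER_LEN
        ip_header = raw[2 * MAC_LEN:ip_end]
        num_end = ip_end + struct.calcsize(NUMBERS_FMT)
        frame_number, camera_number = struct.unpack(NUMBERS_FMT, raw[ip_end:num_end])
        return cls(dst, src, ip_header, frame_number, camera_number, raw[num_end:])
```

Because the frame number sits in a fixed position ahead of the payload, a receiver can read it without decoding any image data.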

次に、カメラアダプタ１２０の動作を、図１２のフローチャートと図１０のシーケンス図を用いて説明する。特に図１０のシーケンス図においては、カメラアダプタ１２０ｂに着目して説明する。なお、図１２の説明において、カメラアダプタ１２０が送受信するデータとは、前景画像、背景画像、三次元モデル情報を総称しているものとする。 Next, the operation of the camera adapter 120 will be described with reference to the flowchart of FIG. 12 and the sequence diagram of FIG. In particular, in the sequence diagram of FIG. 10, the camera adapter 120b will be focused on. In the description of FIG. 12, the data transmitted and received by the camera adapter 120 is a general term for the foreground image, the background image, and the three-dimensional model information.

Ｓ１２０１において、カメラアダプタ１２０はプロミスキャスモードに移行する。この移行は、典型的にはユーザ操作を起点として行われるが、電源投入時にカメラアダプタ１２０がプロミスキャスモードに自動設定される構成であっても良い。Ｓ１２０２において、カメラアダプタ１２０（外部機器制御部２４０）は、接続されているカメラ１１２から撮像画像を入力する。これは、図１０のＳ１００１に相当する処理である。Ｓ１２０３において、カメラアダプタ１２０（ネットワークアダプタ２１０）は、隣接するカメラアダプタ１２０から画像データを受信する。これは、図１０のＳ１０１０、Ｓ１０１８に相当する処理である。Ｓ１２０４において、カメラアダプタ１２０（画像処理部２３０）は受信した画像データに対して画像処理を行う。これは、図１０のＳ１００３、Ｓ１００５、Ｓ１００７、Ｓ１０１３に相当する処理である。なお、Ｓ１２０２〜Ｓ１２０４の処理は、図１２に示される処理が繰り返される間に適宜実行される処理である。すなわち、Ｓ１２０２とＳ１２０４の処理はカメラ１１２による同期撮像が実行されてその撮像画像を受信した場合に、Ｓ１２０３の処理は上流のカメラアダプタからデータが転送された場合に、適宜実行される処理である。 In S1201, the camera adapter 120 shifts to promiscuous mode. This transition is typically triggered by a user operation, but the camera adapter 120 may instead be configured to enter promiscuous mode automatically at power-on. In S1202, the camera adapter 120 (external device control unit 240) inputs a captured image from the connected camera 112. This corresponds to S1001 in FIG. 10. In S1203, the camera adapter 120 (network adapter 210) receives image data from the adjacent camera adapter 120. This corresponds to S1010 and S1018 in FIG. 10. In S1204, the camera adapter 120 (image processing unit 230) performs image processing on the received image data. This corresponds to S1003, S1005, S1007, and S1013 in FIG. 10. The processes of S1202 to S1204 are executed as needed while the processing shown in FIG. 12 is repeated. That is, S1202 and S1204 are executed when synchronous imaging by the camera 112 has been performed and the captured image has been received, and S1203 is executed when data has been transferred from the upstream camera adapter.

Ｓ１２０５において、カメラアダプタ１２０（伝送部２２０）は、送信する画像データが、Ｓ１２０４で処理した画像データであるか否かを判別する。Ｓ１２０４で処理した画像データを送信する場合は（Ｓ１２０５でＹＥＳ）、処理はＳ１２０６へ進み、それ以外の場合は（Ｓ１２０５でＮＯ）、処理はＳ１２０９へ進む。Ｓ１２０９では、カメラアダプタ１２０（伝送部２２０）は、Ｓ１２０３で受信した画像データを送信する。これは、図１０のＳ１０１５、Ｓ１０２１に相当する処理である。 In S1205, the camera adapter 120 (transmission unit 220) determines whether or not the image data to be transmitted is the image data processed in S1204. If the image data processed in S1204 is to be transmitted (YES in S1205), the process proceeds to S1206; otherwise (NO in S1205), the process proceeds to S1209. In S1209, the camera adapter 120 (transmission unit 220) transmits the image data received in S1203. This corresponds to S1015 and S1021 in FIG. 10.

Ｓ１２０６において、カメラアダプタ１２０（伝送部２２０）は、図６のＳ６０１において制御ステーション１４１より入力されたスケジューリング方法に基づいて、データを振り分けるフロントエンドサーバ１３１を決定する。フロントエンドサーバ１３１は、複数のカメラ１１２で撮像された画像を合成して１枚の画像を生成するため、同じフレームの画像を、同じサーバに送信し合成処理する必要がある。従って、カメラアダプタ１２０は、例えば、撮像タイミング情報が第１の撮像タイミング情報であることに基づいて複数のフロントエンドサーバ１３１のうちの第１のフロントエンドサーバへ、撮像タイミング情報が第２の撮像タイミング情報であることに基づいて複数のフロントエンドサーバ１３１のうちの第２のフロントエンドサーバへ、画像データを送信するように制御する。スケジューリング方法の一例としては、画像データのフレーム番号をフロントエンドサーバの数で割った余りに応じて、画像データを振り分ける先のフロントエンドサーバ１３１を決定する方法がある。例えば、フロントエンドサーバが１３１ａと１３１ｂの２台の場合は、奇数フレームはフロントエンドサーバ１３１ａ、偶数フレームはフロントエンドサーバ１３１ｂと、振り分け先のフロントエンドサーバ１３１が決定される。スケジューリング方法の他の例として、撮像タイミング情報（例えば、フレーム番号）と送信先のフロントエンドサーバ１３１との対応を定めたスケジュール表を制御ステーション１４１で作成し、カメラアダプタ１２０は、そのスケジュール表に従って画像データを振り分けても良い。スケジュール表には、例えば、第１の撮像タイミング情報（奇数のフレーム番号）と第１のフロントエンドサーバ（１３１ａ）の対応、第２の撮像タイミング情報（偶数のフレーム番号）と第２のフロントエンドサーバ（１３１ｂ）の対応などが記述されている。 In S1206, the camera adapter 120 (transmission unit 220) determines the front-end server 131 to which the data is to be distributed, based on the scheduling method input from the control station 141 in S601 of FIG. 6. Because the front-end server 131 combines the images captured by the plurality of cameras 112 to generate a single image, images of the same frame must be transmitted to the same server for the combining process. Therefore, the camera adapter 120 performs control so that, for example, image data is transmitted to a first front-end server among the plurality of front-end servers 131 when its imaging timing information is first imaging timing information, and to a second front-end server among the plurality of front-end servers 131 when its imaging timing information is second imaging timing information. One example of a scheduling method determines the destination front-end server 131 according to the remainder obtained by dividing the frame number of the image data by the number of front-end servers. For example, with two front-end servers 131a and 131b, odd-numbered frames are assigned to the front-end server 131a and even-numbered frames to the front-end server 131b. As another example of a scheduling method, the control station 141 may create a schedule table that defines the correspondence between imaging timing information (for example, frame numbers) and destination front-end servers 131, and the camera adapter 120 may distribute image data according to that table. The schedule table describes, for example, the correspondence between the first imaging timing information (odd frame numbers) and the first front-end server (131a), and between the second imaging timing information (even frame numbers) and the second front-end server (131b).
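The two scheduling methods for S1206 can be sketched as follows. This is an illustrative Python sketch, not the patented implementation: the function names are hypothetical, and the `- 1` offset is an assumption chosen so that, with two servers, odd frames map to 131a and even frames to 131b as in the example in the text.

```python
from typing import Dict, Sequence


def select_by_remainder(frame_number: int, servers: Sequence[str]) -> str:
    """Pick a server from the remainder of frame_number / number of servers.

    The -1 offset (an assumption) makes frame 1 map to servers[0].
    """
    return servers[(frame_number - 1) % len(servers)]


def build_schedule_table(servers: Sequence[str]) -> Dict[int, str]:
    """Build a table (remainder -> server), as the control station 141 might."""
    n = len(servers)
    return {remainder: servers[(remainder - 1) % n] for remainder in range(n)}


def select_by_table(frame_number: int, table: Dict[int, str]) -> str:
    """Look up the destination server for a frame in a schedule table."""
    return table[frame_number % len(table)]


if __name__ == "__main__":
    servers = ["131a", "131b"]
    table = build_schedule_table(servers)
    for frame in range(1, 5):
        print(frame, select_by_remainder(frame, servers), select_by_table(frame, table))
```

Because the destination depends only on the frame number, every camera adapter that evaluates the rule for the same frame reaches the same server without coordinating with the other adapters.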

Ｓ１２０７において、カメラアダプタ１２０（伝送部２２０）は、宛先ＭＡＣアドレスとしてＳ１２０６で決定した振り分け先のフロントエンドサーバ１３１ａまたは１３１ｂのＭＡＣアドレスを設定したデータを生成する。Ｓ１２０８において、カメラアダプタ１２０（伝送部２２０）は、Ｓ１２０７で生成したデータを送信する。これは、図１０のＳ１０１１、Ｓ１０１９に相当する処理である。なお、Ｓ１２０５〜Ｓ１２０９の処理は、図１２に示された処理が繰り返される間に、送信対象のデータが生成されたことに応じて適宜実行される処理である。 In S1207, the camera adapter 120 (transmission unit 220) generates data in which the MAC address of the distribution destination front-end server 131a or 131b determined in S1206 is set as the destination MAC address. In S1208, the camera adapter 120 (transmission unit 220) transmits the data generated in S1207. This corresponds to S1011 and S1019 in FIG. 10. The processes of S1205 to S1209 are executed as needed, whenever data to be transmitted is generated, while the processing shown in FIG. 12 is repeated.
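The branch from S1205 to S1209 can be summarized in a short Python sketch. The dictionary-based packet, the function names, and the odd/even mapping are assumptions for illustration only. The two server address strings are copied as opaque labels from the description of FIGS. 13A and 13B.

```python
from typing import Dict, Sequence

# Address labels as they appear in the description of FIGS. 13A and 13B.
SERVER_MACS = ["00:00:00:20:01", "00:00:00:20:02"]  # 131a, 131b


def choose_server_mac(frame_number: int, server_macs: Sequence[str]) -> str:
    """S1206: pick the destination server from the frame number (assumed rule)."""
    return server_macs[(frame_number - 1) % len(server_macs)]


def prepare_for_transmission(packet: Dict, processed_locally: bool,
                             server_macs: Sequence[str] = SERVER_MACS) -> Dict:
    """Return the packet to put on the wire.

    S1205: is this the data the adapter itself processed in S1204?
    YES -> S1206/S1207: set the destination MAC of the chosen server (sent in S1208).
    NO  -> S1209: forward the data received in S1203 without rewriting it.
    """
    if processed_locally:
        outgoing = dict(packet)  # copy so the caller's packet is not mutated
        outgoing["dst_mac"] = choose_server_mac(packet["frame_number"], server_macs)
        return outgoing
    return packet
```

Data received from an upstream adapter already carries the front-end server's address, so the sketch leaves it untouched, matching the statement that forwarded data is transferred without being rewritten.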

次に、図１２のフローチャートの動作に沿った、カメラアダプタ１２０とフロントエンドサーバ１３１のデータの流れについて図１３Ａ、１３Ｂを参照して説明する。 Next, the data flow of the camera adapter 120 and the front-end server 131 according to the operation of the flowchart of FIG. 12 will be described with reference to FIGS. 13A and 13B.

図１３Ａでは、それぞれのカメラ１１２がフレーム番号「ｍ」の画像を撮像し、それぞれのカメラアダプタ１２０から次のカメラアダプタに画像データを送信する様子を表している。 FIG. 13A shows how each camera 112 captures an image having a frame number “m” and transmits image data from each camera adapter 120 to the next camera adapter.

１３０１、１３０２、１３０３、１３０４、１３０５は、カメラアダプタ１２０ａ〜１２０ｃ、フロントエンドサーバ１３１ａ〜１３１ｂそれぞれに付与されたＭＡＣアドレスを表している。 1301, 1302, 1303, 1304, and 1305 represent MAC addresses assigned to the camera adapters 120a to 120c and the front-end servers 131a to 131b, respectively.

１３１０、１３２０、１３３０は、それぞれカメラアダプタ１２０ａ、１２０ｂ、１２０ｃが画像処理を行い送信するデータを表しており、データ形式は、図１１を用いて説明した形式に則っている。また、データ１３１０、１３２０、１３３０は、図１２のフローチャートのＳ１２０８で送信する画像データに相当する。 Reference numerals 1310, 1320, and 1330 represent data to be transmitted by the camera adapters 120a, 120b, and 120c, respectively, and the data format conforms to the format described with reference to FIG. Further, the data 1310, 1320, 1330 correspond to the image data transmitted in S1208 of the flowchart of FIG.

また、１３１１、１３２１、１３３１は、画像データの宛先となるフロントエンドサーバ１３１のＭＡＣアドレスである。図１３Ａでは、Ｓ１２０６〜Ｓ１２０７の処理に基づいて、フレーム番号「ｍ」の画像データをフロントエンドサーバ１３１ａに送信すべく、フロントエンドサーバ１３１ａのＭＡＣアドレス「００：００：００：２０：０１」が設定されている。これにより、フレーム番号「ｍ」の画像データは全てフロントエンドサーバ１３１ａに送信される。１３１２、１３２２、１３３２は、画像データの送信元となるカメラアダプタ１２０のＭＡＣアドレスであり、画像を撮像し処理を行ったカメラアダプタ１２０のＭＡＣアドレスが設定される。 Further, 1311, 1321, 1331 are MAC addresses of the front-end server 131 that is the destination of the image data. In FIG. 13A, the MAC address “00:00:00:20:01” of the front-end server 131a is set to transmit the image data of the frame number “m” to the front-end server 131a based on the processes of S1206 to S1207. It is set. As a result, all the image data of the frame number "m" is transmitted to the front-end server 131a. Reference numerals 1312, 1322, and 1332 are MAC addresses of the camera adapter 120 that is the source of image data, and the MAC address of the camera adapter 120 that has captured and processed the image is set.

図１３Ｂは、それぞれのカメラ１１２が撮像したフレーム番号「ｍ＋１」の画像データと、隣接するカメラアダプタ１２０から受信したフレーム番号「ｍ」の画像データとを、次のカメラアダプタに送信する様子を表している。図１３Ａに示したデータ１３１０、１３２０は、それぞれ下流のカメラアダプタに転送されている。１３４０、１３５０、１３６０は、それぞれカメラアダプタ１２０ａ、１２０ｂ、１２０ｃが画像処理を行い送信するフレーム番号「ｍ＋１」のデータであり、図１２のフローチャートのＳ１２０８で送信する画像データに相当する。１３４１、１３５１、１３６１は、画像データの宛先となるフロントエンドサーバ１３１のＭＡＣアドレスである。Ｓ１２０６〜Ｓ１２０７の処理に基づいて、フレーム番号「ｍ＋１」の画像データをフロントエンドサーバ１３１ｂに送信すべく、フロントエンドサーバ１３１ｂのＭＡＣアドレス「００：００：００：２０：０２」が設定されている。 FIG. 13B shows how the image data of frame number "m + 1" captured by each camera 112 and the image data of frame number "m" received from the adjacent camera adapter 120 are transmitted to the next camera adapter. The data 1310 and 1320 shown in FIG. 13A have each been transferred to the downstream camera adapter. Reference numerals 1340, 1350, and 1360 are the data of frame number "m + 1" that the camera adapters 120a, 120b, and 120c, respectively, process and transmit; they correspond to the image data transmitted in S1208 of the flowchart of FIG. 12. Reference numerals 1341, 1351, and 1361 are the MAC addresses of the front-end server 131 that is the destination of the image data. Based on the processing of S1206 to S1207, the MAC address "00:00:00:20:02" of the front-end server 131b is set so that the image data of frame number "m + 1" is transmitted to the front-end server 131b.

１３６０、１３７０は、隣接するカメラアダプタ１２０から受信した画像データであり、内容は書き換えずにそのまま転送している。図１２のフローチャートのＳ１２０９で送信するデータに相当する。図１３Ｂのデータ１３５０，１３６０は、それぞれ図１３Ａのデータ１３１０、１３２０と対応しており、それぞれのデータ内容は同じである。 Reference numerals 1360 and 1370 are image data received from the adjacent camera adapter 120, and the contents are transferred as they are without being rewritten. This corresponds to the data transmitted in S1209 of the flowchart of FIG. The data 1350 and 1360 in FIG. 13B correspond to the data 1310 and 1320 in FIG. 13A, respectively, and the data contents are the same.

以上により、フレーム番号「ｍ」の画像データは全てフロントエンドサーバ１３１ａに送信され、フレーム番号「ｍ＋１」の画像データは全てフロントエンドサーバ１３１ｂに送信される。以降は、同様の処理を繰り返すことにより、１フレーム毎にフロントエンドサーバ１３１ａと１３１ｂに交互に画像データを送信することが可能となり、複数のフロントエンドサーバ１３１による処理の振り分けが可能となる。 As described above, all the image data of the frame number "m" is transmitted to the front-end server 131a, and all the image data of the frame number "m + 1" is transmitted to the front-end server 131b. After that, by repeating the same processing, the image data can be alternately transmitted to the front-end servers 131a and 131b for each frame, and the processing can be distributed by the plurality of front-end servers 131.

次に図１４のフローチャートに従って、撮像時処理（Ｓ５０３）におけるフロントエンドサーバ１３１の動作について説明する。 Next, the operation of the front-end server 131 in the imaging process (S503) will be described with reference to the flowchart of FIG.

制御部３１０は、制御ステーション１４１から撮像モードに切り替える指示を受信し、撮像モードに切り替える（Ｓ１４００）。撮像が開始されると、データ入力制御部３２０はカメラアダプタ１２０からの撮像データの受信を開始する（Ｓ１４０１）。データ同期部３３０は、ファイル作成に必要な撮像データが全て揃うまで撮像データをバッファする（Ｓ１４０２）。なお、フローチャート上は明記していないが、Ｓ１４０２では撮像データに付与されている時間情報が一致するかどうかや、所定台数のカメラが充足しているかどうかなどが判定される。 The control unit 310 receives an instruction to switch to the imaging mode from the control station 141 and switches to the imaging mode (S1400). When the imaging is started, the data input control unit 320 starts receiving the imaging data from the camera adapter 120 (S1401). The data synchronization unit 330 buffers the imaging data until all the imaging data required for file creation are prepared (S1402). Although not specified in the flowchart, in S1402, it is determined whether or not the time information given to the imaging data matches, and whether or not a predetermined number of cameras are satisfied.

またカメラ１１２の状態によっては、キャリブレーション中やエラー処理中であることによって画像データが送られない場合がある。この場合は、所定のカメラ番号の画像が抜けていることが後段（Ｓ１４０８）のデータベース１３２へのデータ転送において通知される。ここで、所定カメラ台数の充足を判定するために、撮像データの到着を所定時間待つ方法がある。しかし本実施形態では、システムの一連の処理の遅延を抑制するために、各々のカメラアダプタ１２０がデイジーチェーンによってデータを伝送する際に、各カメラ番号に対応する画像データの有無を示す情報を付与する。これにより、フロントエンドサーバ１３１の制御部３１０において、画像データの有無を直ちに判断することが可能となる。これによって、撮像データの到着待ち時間を設定する必要がなくなる効果が得られることをここに明記しておく。 Further, depending on the state of the camera 112, image data may not be sent because the camera is being calibrated or is handling an error. In this case, the fact that the image of a given camera number is missing is reported in the subsequent data transfer to the database 132 (S1408). One way to determine whether the predetermined number of cameras is satisfied is to wait a predetermined time for the imaging data to arrive. In the present embodiment, however, to suppress delay in the system's series of processes, each camera adapter 120 attaches information indicating the presence or absence of image data for each camera number when it transmits data over the daisy chain. This allows the control unit 310 of the front-end server 131 to determine immediately whether image data is present. It is noted here that this eliminates the need to set a waiting time for the arrival of imaging data.
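The buffering in S1402, combined with the presence/absence information described above, can be sketched in Python as follows. This is a simplified model under stated assumptions: the class `FrameSynchronizer` and its methods are hypothetical, and a payload of `None` stands in for "this camera is flagged as having no image data for this frame". The patent does not specify how the flag is encoded.

```python
from typing import Dict, Iterable, List, Optional


class FrameSynchronizer:
    """Buffer per-frame data until every expected camera is accounted for."""

    def __init__(self, expected_cameras: Iterable[int]) -> None:
        self.expected = set(expected_cameras)
        # frame_number -> {camera_number: payload, or None if flagged absent}
        self.buffers: Dict[int, Dict[int, Optional[bytes]]] = {}

    def add(self, frame_number: int, camera_number: int,
            payload: Optional[bytes]) -> Optional[Dict[int, Optional[bytes]]]:
        """Record one camera's data (or absence flag) for a frame.

        Returns the complete frame once all expected cameras have either
        delivered data or been flagged absent; otherwise returns None.
        """
        frame = self.buffers.setdefault(frame_number, {})
        frame[camera_number] = payload
        if self.expected.issubset(frame):
            return self.buffers.pop(frame_number)
        return None


def missing_cameras(frame: Dict[int, Optional[bytes]]) -> List[int]:
    """Camera numbers flagged absent, e.g. to report at the S1408 transfer."""
    return sorted(cam for cam, payload in frame.items() if payload is None)
```

With an explicit absence flag, a frame can be released as soon as every camera number is accounted for, so no arrival timeout is needed in this model.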

データ同期部３３０によってファイル作成に必要なデータがバッファリングされた後、画像処理部３５０は、ＲＡＷ画像データの現像処理やレンズ歪み補正、前景画像及び背景画像の各カメラで撮像された画像間の色や輝度値を合わせるなどの各種変換処理を行う（Ｓ１４０３）。データ同期部３３０によってバッファリングされたデータが背景画像を含む場合（Ｓ１４０４でＹＥＳ）、画像結合部３７０は、背景画像を結合する処理を行う（Ｓ１４０５）。画像結合部３７０は、Ｓ１４０３において画像処理部３５０が処理した背景画像を取得し、ＣＡＤデータ記憶部３３５が保存したスタジアム形状データの座標に合わせて背景画像をつなぎ合わせることにより結合した背景画像を得る。結合された背景画像は、撮像データファイル生成部３８０に送られる。 After the data required for file creation has been buffered by the data synchronization unit 330, the image processing unit 350 performs various conversion processes, such as development of the RAW image data, lens distortion correction, and matching of color and luminance values among the foreground and background images captured by the respective cameras (S1403). When the data buffered by the data synchronization unit 330 includes a background image (YES in S1404), the image combining unit 370 combines the background images (S1405). The image combining unit 370 acquires the background images processed by the image processing unit 350 in S1403 and joins them according to the coordinates of the stadium shape data stored in the CAD data storage unit 335 to obtain a combined background image. The combined background image is sent to the imaging data file generation unit 380.

他方、バッファリングされたデータが背景画像を含まない場合は（Ｓ１４０４でＮＯ）、背景画像の結合（Ｓ１４０５）がスキップされる。 On the other hand, if the buffered data does not include the background image (NO in S1404), the combination of the background images (S1405) is skipped.

次に、三次元モデル結合部３６０が三次元モデルを結合する処理を行う。データ同期部３３０から三次元モデルを取得した三次元モデル結合部３６０は、三次元モデルデータとカメラパラメータを使って前景画像の三次元モデルを生成、結合する（Ｓ１４０６）。Ｓ１４０６までの処理によって作成された撮像データを受け取った撮像データファイル生成部３８０は、撮像データをファイル形式に応じて成形してからパッキングし、撮像データファイルとしてＤＢアクセス制御部３９０に送る（Ｓ１４０７）。ＤＢアクセス制御部３９０は、Ｓ１４０７で撮像データファイル生成部３８０から受け取った撮像データファイルを、データベース１３２に送信する（Ｓ１４０８）。 Next, the three-dimensional model combining unit 360 combines the three-dimensional models. Having acquired the three-dimensional models from the data synchronization unit 330, the three-dimensional model combining unit 360 generates and combines the three-dimensional model of the foreground image using the three-dimensional model data and the camera parameters (S1406). The imaging data file generation unit 380, having received the imaging data created by the processes up to S1406, shapes the imaging data according to the file format, packs it, and sends it to the DB access control unit 390 as an imaging data file (S1407). The DB access control unit 390 transmits the imaging data file received from the imaging data file generation unit 380 in S1407 to the database 132 (S1408).

続いて、本実施形態を構成する各装置のハードウェア構成について、より詳細に説明する。上述の通り、本実施形態では、カメラアダプタ１２０がＦＰＧＡ及び／又はＡＳＩＣなどのハードウェアを実装し、これらのハードウェアによって、上述した各処理を実行する場合の例を中心に説明した。それはセンサシステム１１０内の各種装置や、フロントエンドサーバ１３１、データベース１３２、バックエンドサーバ１３３、及びコントローラ１４０についても同様である。しかしながら、上記装置のうち、少なくとも何れかが、例えばＣＰＵ、ＧＰＵ、ＤＳＰなどを用い、ソフトウェア処理によって本実施形態の処理を実行するようにしても良い。 Subsequently, the hardware configuration of each device constituting the present embodiment will be described in more detail. As described above, in the present embodiment, an example in which the camera adapter 120 implements hardware such as FPGA and / or ASIC and executes each of the above-described processes by these hardware has been mainly described. The same applies to the various devices in the sensor system 110, the front-end server 131, the database 132, the back-end server 133, and the controller 140. However, at least one of the above devices may execute the processing of the present embodiment by software processing using, for example, a CPU, GPU, DSP, or the like.

図１５は、図２に示した機能構成をソフトウェア処理によって実現するための、カメラアダプタ１２０のハードウェア構成例を示すブロック図である。なお、フロントエンドサーバ１３１、データベース１３２、バックエンドサーバ１３３、制御ステーション１４１、仮想カメラ操作ＵＩ１４２、及びエンドユーザ端末１５０などの装置も、図１５のハードウェア構成となりうる。カメラアダプタ１２０は、ＣＰＵ１５０１、ＲＯＭ１５０２、ＲＡＭ１５０３、補助記憶装置１５０４、表示部１５０５、操作部１５０６、通信部１５０７、及びバス１５０８を有する。 FIG. 15 is a block diagram showing a hardware configuration example of the camera adapter 120 for realizing the functional configuration shown in FIG. 2 by software processing. Devices such as the front-end server 131, the database 132, the back-end server 133, the control station 141, the virtual camera operation UI 142, and the end-user terminal 150 may also have the hardware configuration shown in FIG. The camera adapter 120 includes a CPU 1501, a ROM 1502, a RAM 1503, an auxiliary storage device 1504, a display unit 1505, an operation unit 1506, a communication unit 1507, and a bus 1508.

ＣＰＵ１５０１は、ＲＯＭ１５０２やＲＡＭ１５０３に格納されているコンピュータプログラムやデータを用いてカメラアダプタ１２０の全体を制御する。ＲＯＭ１５０２は、変更を必要としないプログラムやパラメータを格納する。ＲＡＭ１５０３は、補助記憶装置１５０４から供給されるプログラムやデータ、及び通信部１５０７を介して外部から供給されるデータなどを一時記憶する。補助記憶装置１５０４は、例えばハードディスクドライブ等で構成され、静止画や動画などのコンテンツデータを記憶する。 The CPU 1501 controls the entire camera adapter 120 by using computer programs and data stored in the ROM 1502 and the RAM 1503. The ROM 1502 stores programs and parameters that do not need to be changed. The RAM 1503 temporarily stores programs and data supplied from the auxiliary storage device 1504, data supplied from the outside via the communication unit 1507, and the like. The auxiliary storage device 1504 is composed of, for example, a hard disk drive or the like, and stores content data such as still images and moving images.

表示部１５０５は、例えば液晶ディスプレイ等で構成され、ユーザがカメラアダプタ１２０を操作するためのＧＵＩ（ＧｒａｐｈｉｃａｌＵｓｅｒＩｎｔｅｒｆａｃｅ）などを表示する。操作部１５０６は、例えばキーボードやマウス等で構成され、ユーザによる操作を受けて各種の指示をＣＰＵ１５０１に入力する。通信部１５０７は、カメラ１１２やフロントエンドサーバ１３１などの外部の装置と通信を行う。例えば、カメラアダプタ１２０が外部の装置と有線で接続される場合には、ＬＡＮケーブル等が通信部１５０７に接続される。なお、カメラアダプタ１２０が外部の装置と無線通信する機能を有する場合、通信部１５０７はアンテナを備える。バス１５０８は、カメラアダプタ１２０の各部を繋いで情報を伝達する。 The display unit 1505 is composed of, for example, a liquid crystal display or the like, and displays a GUI (Graphical User Interface) or the like for the user to operate the camera adapter 120. The operation unit 1506 is composed of, for example, a keyboard, a mouse, or the like, and inputs various instructions to the CPU 1501 in response to an operation by the user. The communication unit 1507 communicates with an external device such as the camera 112 or the front-end server 131. For example, when the camera adapter 120 is connected to an external device by wire, a LAN cable or the like is connected to the communication unit 1507. When the camera adapter 120 has a function of wirelessly communicating with an external device, the communication unit 1507 includes an antenna. The bus 1508 connects each part of the camera adapter 120 to transmit information.

なお、例えばカメラアダプタ１２０の処理のうち一部をＦＰＧＡで行い、別の一部の処理を、ＣＰＵを用いたソフトウェア処理によって実現するようにしても良い。また、本実施形態では表示部１５０５と操作部１５０６はカメラアダプタ１２０の内部に存在するが、カメラアダプタ１２０は表示部１５０５及び操作部１５０６の少なくとも一方を備えていなくてもよい。また、表示部１５０５及び操作部１５０６の少なくとも一方がカメラアダプタ１２０の外部に別の装置として存在していて、ＣＰＵ１５０１が、表示部１５０５を制御する表示制御部、及び操作部１５０６を制御する操作制御部として動作してもよい。 Note that, for example, part of the processing of the camera adapter 120 may be performed by an FPGA while another part is realized by software processing using the CPU. Further, in the present embodiment the display unit 1505 and the operation unit 1506 are inside the camera adapter 120, but the camera adapter 120 need not include at least one of the display unit 1505 and the operation unit 1506. Alternatively, at least one of the display unit 1505 and the operation unit 1506 may exist as a separate device outside the camera adapter 120, with the CPU 1501 operating as a display control unit that controls the display unit 1505 and as an operation control unit that controls the operation unit 1506.

また、上述の実施形態は、画像処理システム１００が競技場やコンサートホールなどの施設に設置される場合の例を中心に説明した。施設の他の例としては、例えば、遊園地、公園、競馬場、競輪場、カジノ、プール、スケートリンク、スキー場、ライブハウスなどがある。また、各種施設で行われるイベントは、屋内で行われるものであっても屋外で行われるものであっても良い。また、本実施形態における施設は、一時的に（期間限定で）建設される施設も含む。 Further, the above-described embodiment has been described focusing on an example in which the image processing system 100 is installed in a facility such as a stadium or a concert hall. Other examples of facilities include, for example, amusement parks, parks, racetracks, keirin racetracks, casinos, pools, skating rinks, ski resorts, live houses, and the like. In addition, the events held at various facilities may be held indoors or outdoors. The facilities in this embodiment also include facilities that are temporarily constructed (for a limited time).

＜第２実施形態＞ <Second Embodiment>
第１実施形態では、それぞれのカメラアダプタ１２０がプロミスキャスモードで動作し、宛先となるフロントエンドサーバ１３１のＭＡＣアドレスを直接指定して画像データを送信する方法について説明した。したがって、第１実施形態では、スケジューリング方法により規定される複数のカメラアダプタの接続の順番は、デイジーチェーンの接続順により実現される。第２実施形態では、プロミスキャスモードを用いない構成を説明する。この場合、第２実施形態では、スケジューリング方法により規定されるカメラアダプタ間の上流から下流への順番は、それぞれのアダプタが「宛先ＭＡＣ」に設定するアドレスにより実現される。したがって、デイジーチェーンの接続形態は必須ではない。また、第２実施形態では、最も下流、すなわち最終段のカメラアダプタ１２０のみがフロントエンドサーバ１３１のＭＡＣアドレスを指定する。第２実施形態では、複数のカメラアダプタ１２０がデイジーチェーンの接続形態で接続されていなくてもよい。 In the first embodiment, a method was described in which each camera adapter 120 operates in promiscuous mode and transmits image data by directly specifying the MAC address of the destination front-end server 131. In the first embodiment, therefore, the order of the plurality of camera adapters defined by the scheduling method is realized by the connection order of the daisy chain. The second embodiment describes a configuration that does not use promiscuous mode. In this case, the upstream-to-downstream order among the camera adapters defined by the scheduling method is realized by the address that each adapter sets as the "destination MAC". A daisy-chain connection topology is therefore not essential. Further, in the second embodiment, only the most downstream camera adapter 120, that is, the final-stage camera adapter 120, specifies the MAC address of the front-end server 131. In the second embodiment, the plurality of camera adapters 120 need not be connected in a daisy-chain topology.

第２実施形態では、図６のＳ６０１において、フロントエンドサーバ１３１ａ、１３１ｂのＭＡＣアドレスが、フロントエンドサーバ１３１にデータを送信する、最終段のカメラアダプタ１２０ｚのみに通知される。最終段のカメラアダプタ以外のカメラアダプタには、スケジューリング方法により規定される順番に従って、データの送信先となるアドレスが通知される。 In the second embodiment, in S601 of FIG. 6, the MAC addresses of the front-end servers 131a and 131b are notified only to the final stage camera adapter 120z that transmits data to the front-end server 131. Camera adapters other than the camera adapter in the final stage are notified of addresses to which data is transmitted in the order specified by the scheduling method.

図１６は、第２実施形態におけるカメラアダプタ１２０の動作を示すフローチャートである。なお、第１実施形態（図１２）と同様の処理には同一の符号を付してある。また、図１２と同様に、図１６の説明においてカメラアダプタ１２０が送受信するデータを画像データと記載するが、前景画像、背景画像、三次元モデル情報を総称しているものとする。 FIG. 16 is a flowchart showing the operation of the camera adapter 120 in the second embodiment. The same reference numerals are given to the same processes as in the first embodiment (FIG. 12). Further, similarly to FIG. 12, in the description of FIG. 16, the data transmitted and received by the camera adapter 120 is described as image data, but the foreground image, the background image, and the three-dimensional model information are generically referred to.

Ｓ１２０２〜Ｓ１２０４を実行した後、Ｓ１６０１において、カメラアダプタ１２０は、自身が最終段のカメラアダプタであるか否かを判別する。最終段のカメラアダプタであるか否かの情報は、図６のＳ６０２における配置情報の入力時に、制御ステーション１４１からカメラアダプタ１２０に対して設定される。例えば、図１に示すシステム構成において、カメラアダプタのデータが、カメラアダプタ１２０ａ→カメラアダプタ１２０ｂ→...→カメラアダプタ１２０ｚと伝送される場合、カメラアダプタ１２０ｚが最終段のカメラアダプタであるという設定がなされる。 After executing S1202 to S1204, in S1601 the camera adapter 120 determines whether or not it is the final-stage camera adapter. Whether an adapter is the final-stage camera adapter is set from the control station 141 to the camera adapter 120 when the arrangement information is input in S602 of FIG. 6. For example, in the system configuration shown in FIG. 1, when camera adapter data is transmitted in the order camera adapter 120a → camera adapter 120b → ... → camera adapter 120z, the camera adapter 120z is set as the final-stage camera adapter.

自身が最終段のカメラアダプタではないと判定された場合（Ｓ１６０１でＮＯ）、Ｓ１６０２において、カメラアダプタ１２０は、画像データの宛先ＭＡＣアドレスに、スケジューリング方法に従った下流のカメラアダプタを設定する。例えば、カメラアダプタ１２０ａは、カメラアダプタ１２０ｂのＭＡＣアドレスを画像データの宛先に設定する。そして、Ｓ１６０３において、カメラアダプタ１２０は、画像データを送信する（Ｓ１６０３）。ここで送信する画像データの形式は、第１実施形態（図１１）と同様である。 When it is determined that the camera adapter is not the final stage camera adapter (NO in S1601), in S1602, the camera adapter 120 sets the downstream camera adapter according to the scheduling method to the destination MAC address of the image data. For example, the camera adapter 120a sets the MAC address of the camera adapter 120b as the destination of the image data. Then, in S1603, the camera adapter 120 transmits image data (S1603). The format of the image data transmitted here is the same as that of the first embodiment (FIG. 11).

一方、自身が最終段のカメラアダプタであると判定された場合は（Ｓ１６０１でＹＥＳ）、設定されたスケジューリング方法に従って送信先に決定されたフロントエンドサーバに画像データを送信する（Ｓ１６０４〜Ｓ１６０６）。Ｓ１６０４〜Ｓ１６０５の処理は、図１２のＳ１２０６〜Ｓ１２０７と同様である。 On the other hand, if the camera adapter determines that it is the final-stage camera adapter (YES in S1601), it transmits the image data to the front-end server determined as the destination according to the set scheduling method (S1604 to S1606). The processing of S1604 to S1605 is the same as that of S1206 to S1207 of FIG. 12.

次に、図１６のフローチャートの動作に沿った、カメラアダプタ１２０とフロントエンドサーバ１３１のデータの流れについて図１７を用いて説明する。図１７は、それぞれのカメラ１１２が撮像したフレーム番号「ｎ＋１」の画像データと、隣接するカメラアダプタ１２０から受信したフレーム番号「ｎ」の画像データとを、次のカメラアダプタに送信する様子を表している。なお第１実施形態（図１３Ａ、図１３Ｂ）と同じ構成、同じデータには同一の符号を付与している。 Next, the data flow of the camera adapter 120 and the front-end server 131 according to the operation of the flowchart of FIG. 16 will be described with reference to FIG. FIG. 17 shows how the image data of the frame number “n + 1” captured by each camera 112 and the image data of the frame number “n” received from the adjacent camera adapter 120 are transmitted to the next camera adapter. ing. The same configuration and the same data as those in the first embodiment (FIGS. 13A and 13B) are given the same reference numerals.

１７０１は、カメラアダプタ１２０ｚのＭＡＣアドレスである。１７１０、１７２０、１７３０は、カメラアダプタ１２０ｚ以外のカメラアダプタ１２０が下流のカメラアダプタ１２０に送信するデータである。これらは、図１６のフローチャートのＳ１６０３で送信するデータに相当する。データ１７１０の宛先ＭＡＣアドレス１７１１には、次のカメラアダプタ１２０ｂのＭＡＣアドレスが格納される。同様に、データ１７２０，１７３０の宛先ＭＡＣアドレス１７２１、１７３１には、それぞれ次のカメラアダプタ１２０ｚのＭＡＣアドレスが格納される。 1701 is the MAC address of the camera adapter 120z. 1710, 1720, and 1730 are data transmitted by the camera adapter 120 other than the camera adapter 120z to the downstream camera adapter 120. These correspond to the data transmitted in S1603 of the flowchart of FIG. The MAC address of the next camera adapter 120b is stored in the destination MAC address 1711 of the data 1710. Similarly, the MAC addresses of the next camera adapter 120z are stored in the destination MAC addresses 1721 and 1731 of the data 1720 and 1730, respectively.

データ１７４０、１７５０は、最終段のカメラアダプタ１２０ｚが、フロントエンドサーバ１３１に送信するデータであり、図１６のフローチャートのＳ１６０６で送信するデータに相当する。カメラアダプタ１２０ｚでは、フロントエンドサーバ１３１への振り分け処理を行い、フレーム番号「ｎ」のデータ１７４０をフロントエンドサーバ１３１ａに、フレーム番号「ｎ＋１」のデータ１７５０をフロントエンドサーバ１３１ｂに振り分ける。そのため、データ１７４０の宛先ＭＡＣアドレス１７４１にはフロントエンドサーバ１３１ａのＭＡＣアドレスが、データ１７５０の宛先ＭＡＣアドレス１７５１にはフロントエンドサーバ１３１ｂのＭＡＣアドレスが格納される。 The data 1740 and 1750 are the data that the final-stage camera adapter 120z transmits to the front-end server 131, and correspond to the data transmitted in S1606 of the flowchart of FIG. 16. The camera adapter 120z performs the distribution processing to the front-end servers 131, distributing the data 1740 of frame number "n" to the front-end server 131a and the data 1750 of frame number "n + 1" to the front-end server 131b. Accordingly, the MAC address of the front-end server 131a is stored in the destination MAC address 1741 of the data 1740, and the MAC address of the front-end server 131b is stored in the destination MAC address 1751 of the data 1750.

以上により、プロミスキャスモードを用いず、スケジューリング方法により規定されるカメラアダプタ１２０の順番において最終段のカメラアダプタ１２０ｚがフロントエンドサーバ１３１のＭＡＣアドレスを指定し、データの振り分けを行うことができる。 As described above, the camera adapter 120z in the final stage can specify the MAC address of the front-end server 131 in the order of the camera adapters 120 defined by the scheduling method without using the promiscuous mode, and data can be distributed.
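The destination selection of the second embodiment (S1601 to S1605) can be sketched in Python as follows. The function name, the list-based representation of the adapter order, and the address labels are hypothetical. The odd/even mapping to the two servers is the same assumption used for the first embodiment's example.

```python
from typing import Sequence


def next_destination(adapter_index: int, adapter_macs: Sequence[str],
                     frame_number: int, server_macs: Sequence[str]) -> str:
    """Return the destination MAC an adapter sets on outgoing image data.

    adapter_macs lists the adapters in the order defined by the scheduling
    method (upstream first); the last entry is the final-stage adapter.
    """
    is_final_stage = adapter_index == len(adapter_macs) - 1  # S1601
    if not is_final_stage:
        # S1602: address the next adapter downstream.
        return adapter_macs[adapter_index + 1]
    # S1604-S1605: only the final-stage adapter picks a front-end server.
    return server_macs[(frame_number - 1) % len(server_macs)]


if __name__ == "__main__":
    adapters = ["mac-120a", "mac-120b", "mac-120z"]  # hypothetical labels
    servers = ["mac-131a", "mac-131b"]               # hypothetical labels
    for index in range(len(adapters)):
        print(adapters[index], "->", next_destination(index, adapters, 5, servers))
```

In this arrangement only the final-stage adapter needs the front-end server addresses, which matches the statement that those MAC addresses are notified only to the camera adapter 120z.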

以上、上述した実施形態によれば、カメラ１１２の台数などのシステムを構成する装置の規模、及び撮像画像の出力解像度や出力フレームレートなどに依らず、仮想視点画像を簡便に生成することが出来る。 As described above, according to the above-described embodiments, a virtual viewpoint image can be generated easily regardless of the scale of the devices constituting the system, such as the number of cameras 112, and regardless of the output resolution, output frame rate, and the like of the captured images.

As described above, according to each of the above embodiments, in an image processing system including a plurality of image pickup devices (cameras 112) and a plurality of image processing devices (front-end servers 131), captured images captured at the same timing can be distributed to a single image processing device. As a result, the efficiency of frame-by-frame processing in the image processing system 100 improves.

<Other Embodiments>
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The invention is not limited to the above embodiments, and various changes and modifications can be made without departing from the spirit and scope of the invention. Accordingly, the claims are appended to make the scope of the invention public.

100 ... image processing system, 110 ... sensor system, 111 ... microphone, 112 ... camera, 113 ... pan head, 120 ... camera adapter, 180 ... switching hub, 150 ... end-user terminal, 131 ... front-end server, 132 ... database, 133 ... back-end server, 134 ... time server, 141 ... control station, 142 ... virtual camera operation UI

Claims

An image processing system comprising:
a plurality of control devices that correspond to a plurality of image pickup devices and acquire a plurality of images captured by the plurality of image pickup devices; and
a plurality of image processing devices that acquire the plurality of images transmitted from the plurality of control devices,
wherein the plurality of control devices transmit an image to a first image processing device included in the plurality of image processing devices based on the imaging timing information given to the image being first imaging timing information, and transmit the image to a second image processing device included in the plurality of image processing devices based on the imaging timing information given to the image being second imaging timing information.

The image processing system according to claim 1, wherein the image processing device to which an image given the imaging timing information is transmitted is determined based on the imaging timing information given to the image and the number of the plurality of image processing devices.

The image processing system according to claim 1, wherein the plurality of control devices transmit, based on the imaging timing information given to an image and a schedule table, the image to the image processing device, among the plurality of image processing devices, that corresponds to the imaging timing information, and
the schedule table includes a correspondence between the first imaging timing information and the first image processing device, and a correspondence between the second imaging timing information and the second image processing device.

The image processing system according to any one of claims 1 to 3, wherein the image pickup timing information is a time code or a frame number assigned to the image.

The image processing system according to any one of claims 1 to 4, wherein the control device sets, as the destination of the image acquired from the corresponding image pickup device, the address of the image processing device that is the transmission destination.

The image processing system according to claim 5, wherein the plurality of control devices are connected by a daisy chain.

The image processing system according to claim 6, wherein the control device transmits an image received from an upstream control device in the daisy chain to a downstream control device without changing the destination set for the image.

The image processing system according to any one of claims 1 to 4, wherein the control devices transmit images from an upstream control device to a downstream control device in a predetermined order, and
the final-stage control device in the predetermined order transmits the images received from the upstream control device and the image acquired from the corresponding image pickup device to the first image processing device based on the imaging timing information given to those images being the first imaging timing information, and to the second image processing device based on that imaging timing information being the second imaging timing information.

A control device comprising:
an acquisition means for acquiring a captured image based on imaging by an image pickup device;
a determination means for determining the destination of the captured image such that the captured image is transmitted to a first image processing device included in a plurality of image processing devices based on the imaging timing information of the captured image being first imaging timing information, and to a second image processing device included in the plurality of image processing devices based on the imaging timing information being second imaging timing information; and
an output means for setting the destination determined by the determination means in data including the captured image and outputting the data.

The control device according to claim 9, wherein the acquisition means acquires the captured image to which the imaging timing information is added from the imaging device by notifying the imaging device of a time code.

The control device according to claim 9, further comprising an imparting means for imparting the imaging timing information to the captured image acquired by the acquiring means.

The control device according to any one of claims 9 to 11, wherein the control device is connected to other control devices in a daisy chain, and
transmits data including the captured image received from a control device upstream in the daisy chain directly to a control device downstream in the daisy chain.

The control device according to any one of claims 9 to 11, further comprising a first connecting means for connecting to another control device and a second connecting means for connecting to the plurality of image processing devices,
wherein the determination means further determines the destination of data received via the first connecting means such that the data is transmitted to the first image processing device based on the imaging timing information given to the captured image included in the received data being the first imaging timing information, and to the second image processing device based on that imaging timing information being the second imaging timing information.

An image transmission method comprising:
an acquisition step of acquiring, in correspondence with a plurality of image pickup devices, a plurality of images captured by the plurality of image pickup devices;
a first transmission step of transmitting an image acquired in the acquisition step to a first image processing device included in a plurality of image processing devices based on the imaging timing information given to the image being first imaging timing information; and
a second transmission step of transmitting an image acquired in the acquisition step to a second image processing device included in the plurality of image processing devices based on the imaging timing information given to the image being second imaging timing information.

A control method for a control device, comprising:
an acquisition step of acquiring a captured image based on imaging by an image pickup device;
a determination step of determining the destination of the captured image such that the captured image is transmitted to a first image processing device included in a plurality of image processing devices based on the imaging timing information of the captured image being first imaging timing information, and to a second image processing device included in the plurality of image processing devices based on the imaging timing information being second imaging timing information; and
an output step of setting the destination determined in the determination step in data including the captured image and outputting the data.

A program for causing a computer to function as each means of the control device according to any one of claims 9 to 13.