JP2019050463A - Image processing system and method for controlling image processing system

Info

Publication number: JP2019050463A
Application number: JP2017172735A
Authority: JP
Inventors: 智昭新井; Tomoaki Arai
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2017-09-08
Filing date: 2017-09-08
Publication date: 2019-03-28

Abstract

To read out, in a short time, a frame photographed at a specific time from a plurality of images photographed by a plurality of imaging means. SOLUTION: An image processing system comprises: a plurality of imaging means (112) that respectively photograph a plurality of images of a subject from a plurality of directions; control information generation means (120) that generates control information to be added to frames of the plurality of images photographed by the plurality of imaging means, on the basis of time information superimposed on the frames of the plurality of images and recording instructions; writing control means (230) that writes, in storage means, the frames of the plurality of images photographed by the plurality of imaging means on the basis of the control information; read-out control means (270) that reads out the frames of the plurality of images from the storage means on the basis of the control information; and virtual viewpoint image creation means (270) that creates a virtual viewpoint image by using the frames of the plurality of images read out by the read-out control means. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing system and a method for controlling the image processing system.

In recent years, attention has focused on techniques in which multiple cameras are installed at different positions to perform synchronized multi-viewpoint shooting, and virtual viewpoint content is generated from the resulting multi-viewpoint images. With such a technique, a specific scene in, for example, soccer or basketball (such as a goal scene) can be viewed from various angles, which gives the user a greater sense of presence than ordinary images. The cameras capture images. An image processing unit such as a server aggregates the images captured by the cameras, performs processing such as three-dimensional model generation and rendering, and transmits the result to a user terminal. This enables virtual viewpoint content based on multi-viewpoint images to be generated and viewed.

In this case, the number of images aggregated at the server is enormous. The server takes a long time to extract images captured at the same time from the group of images captured by the cameras, and that extraction time affects how long it takes before the virtual viewpoint content can be viewed. For example, with 100 cameras and a video frame rate of 60 fps, shooting a 90-minute soccer match aggregates approximately 32.4 million images at the server. If an image analysis unit additionally separates the foreground (players and the like) from the background and manages the two kinds of image separately, the number of images grows further. Under these conditions, it is important that virtual viewpoint content can be generated and made viewable in a short time.
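The figure of roughly 32.4 million images follows directly from the numbers in the example. The short calculation below is only an illustration; the function name `frames_collected` is hypothetical and does not appear in the specification.

```python
def frames_collected(num_cameras: int, fps: int, minutes: int) -> int:
    """Total number of frames that reach the server during one recording."""
    seconds = minutes * 60
    return num_cameras * fps * seconds


# Values from the example in the text: 100 cameras, 60 fps, 90 minutes.
total = frames_collected(num_cameras=100, fps=60, minutes=90)
print(f"{total:,}")  # 32,400,000
```

If foreground and background images are stored separately, as the text notes, the count per captured frame is multiplied accordingly.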

Patent Document 1 discloses a technique in which a time code is added to images captured by multiple cameras, and images with the same time code are extracted and stitched together to generate a panoramic image.

Patent Document 1: JP 2002-209208 A

However, because the technique of Patent Document 1 searches for images by comparing the time code of each image one at a time, the processing time becomes very long.

An object of the present invention is to provide a method by which a frame captured at a specific time can be read out in a short time from among a plurality of images captured by a plurality of imaging means.

An image processing system according to the present invention includes: a plurality of imaging means that each capture one of a plurality of images of a subject from a plurality of directions; control information generation means that generates control information to be added to each frame of the plurality of images captured by the plurality of imaging means, based on a recording instruction and on time information superimposed on each frame of the plurality of images; write control means that writes each frame of the plurality of images captured by the plurality of imaging means to storage means based on the control information; read control means that reads frames of the plurality of images from the storage means based on the control information; and virtual viewpoint image generation means that generates a virtual viewpoint image using the frames of the plurality of images read by the read control means.

According to the present invention, a frame captured at a specific time can be read out in a short time from among a plurality of images captured by a plurality of imaging means.

A diagram showing a configuration example of the image processing system.
A diagram showing a configuration example of the camera adapter.
A diagram showing a configuration example of the image computing server.
A flowchart showing an operation example of the time code extraction unit.
A diagram showing a structural example of the image additional information.
A diagram for explaining a method of generating the image additional information based on a recording instruction.
A diagram for explaining the memory structure parameters.
A flowchart showing an operation example of the write control unit.
A diagram showing a structural example of the SQ table information.
A flowchart showing an operation example of the read control unit.

FIG. 1 is a diagram showing a configuration example of an image processing system 100 according to an embodiment of the present invention. The image processing system 100 performs shooting and sound collection using a plurality of cameras and microphones installed in a facility such as a stadium or a concert hall. The image processing system 100 includes a plurality of sensor systems 110a to 110z, an image computing server 200, a controller 300, a switching hub 180, and an end user terminal 190.

The controller 300 includes a control station 310 and a virtual camera operation UI 330. The control station 310 manages the operating state of, and controls parameter settings for, each block constituting the image processing system 100 via networks 310a to 310c, 180a, 180b, and 170a to 170y. The networks may be IEEE-compliant GbE (Gigabit Ethernet) or 10 GbE, or may combine an interconnect such as InfiniBand, industrial Ethernet, and the like. The networks are not limited to these and may be of other types.

The sensor systems 110a to 110z are connected in a daisy chain by networks 170a to 170y. The sensor system 110z transmits the images and audio of all 26 sensor systems 110a to 110z to the image computing server 200. In the present embodiment, unless otherwise specified, the 26 sensor systems 110a to 110z are not distinguished and are referred to as the sensor system 110. The sensor system 110a includes a microphone 111a, a camera 112a, a camera platform 113a, an external sensor 114a, and a camera adapter 120a. The sensor systems 110b to 110z have the same configuration as the sensor system 110a. Likewise, the devices within each sensor system 110 are not distinguished unless otherwise specified. The number of sensor systems 110 is not limited to 26. Various modifications are possible; for example, the camera 112 and the camera adapter 120 may be configured as a single integrated device, or the external sensor 114 may be omitted.

In the present embodiment, unless otherwise noted, the term "image" covers both moving images and still images. That is, the image processing system 100 can process both still images and moving images. The description below uses an example in which the virtual viewpoint content provided by the image processing system 100 includes a virtual viewpoint image and virtual viewpoint audio, but the content is not limited to this. For example, the virtual viewpoint content need not include audio. Alternatively, the audio included in the virtual viewpoint content may be the audio collected by the microphone closest to the virtual viewpoint. To simplify the description, mention of audio is partly omitted in the present embodiment, but images and audio are basically processed together.

The sensor systems 110a to 110z each have one camera, 112a to 112z respectively. That is, the image processing system 100 has a plurality of cameras (imaging means) 112a to 112z that each capture one of a plurality of images of a subject from a plurality of directions. The sensor systems 110 are connected to one another in a daisy chain by networks 170a to 170y. Alternatively, a star network configuration may be used in which each of the sensor systems 110a to 110z is connected to the switching hub 180 and the sensor systems exchange data with one another via the switching hub 180.

The microphone 111a collects audio. The camera 112a captures images. The camera adapter 120a processes the audio collected by the microphone 111a and the images captured by the camera 112a. The camera adapter 120a then transmits the processed audio and images to the camera adapter 120b of the sensor system 110b via the daisy-chain network 170a. Similarly, the sensor system 110b transmits its own collected audio and captured images to the sensor system 110c together with the images and audio acquired from the sensor system 110a. By continuing this operation, the sensor system 110z transmits the images and audio acquired by the sensor systems 110a to 110z to the image computing server 200 via the network 180b and the switching hub 180.
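The daisy-chain forwarding described above can be pictured with a small simulation. This is a minimal sketch under simplifying assumptions: the names `Payload`, `relay`, and `run_chain` are hypothetical, payload bytes are left empty, and real transmission would be packetized and continuous rather than a single list hand-off.

```python
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class Payload:
    source: str  # sensor system that captured the data, e.g. "110a"
    data: bytes  # processed image/audio bytes (empty placeholder here)


def relay(upstream: list, own: Payload) -> list:
    """Forward everything received from upstream plus this node's own data."""
    return upstream + [own]


def run_chain(sensor_ids: list) -> list:
    """Pass data node by node; the result is what the last node sends on."""
    forwarded = []
    for sensor_id in sensor_ids:
        forwarded = relay(forwarded, Payload(sensor_id, b""))
    return forwarded


# 26 sensor systems, 110a to 110z, as in the embodiment.
sensor_ids = ["110" + letter for letter in string.ascii_lowercase]
delivered = run_chain(sensor_ids)
print(len(delivered), delivered[0].source, delivered[-1].source)  # 26 110a 110z
```

The last node in the chain ends up holding the data of every sensor system, which matches the role of the sensor system 110z as the single point that sends to the image computing server 200.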

Next, the configuration and operation of the image computing server 200 will be described. The image computing server 200 processes the data acquired from the sensor system 110z. The image computing server 200 includes a front-end server 230, a database (hereinafter also referred to as DB) 250, a back-end server 270, and a time server 290.

The time server 290 synchronizes the shooting timing of the cameras 112. Because the image processing system 100 can thereby generate a virtual viewpoint image from captured images taken at the same timing, degradation of virtual viewpoint image quality caused by timing mismatches can be suppressed. Time synchronization of the cameras 112 need not be managed by the time server 290; each camera 112 or each camera adapter 120 may perform time synchronization processing independently.

The front-end server 230 reconstructs the segmented transmission packets of the images and audio acquired from the sensor system 110z, converts the data format, and then writes the data to the database 250 according to the camera identifier, data type, and frame number.

Next, the back-end server 270 accepts a viewpoint designation from the virtual camera operation UI 330, reads the corresponding image and audio data from the database 250 based on the accepted viewpoint, and performs rendering to generate a virtual viewpoint image. More specifically, the back-end server 270 generates virtual viewpoint content based on, for example, image data of foreground areas extracted from the images captured by the cameras 112 and a viewpoint designated by a user operation. The method by which the camera adapter 120 extracts the foreground area will be described later.

The back-end server 270 outputs the rendered images and audio to the end user terminal 190. On the end user terminal 190, the user can view images and listen to audio corresponding to the viewpoint designated by the user's operation.

The configuration of the image computing server 200 is not limited to this. For example, at least two of the front-end server 230, the database 250, and the back-end server 270 may be configured as a single unit. The image computing server 200 may also include more than one of at least any of the front-end server 230, the database 250, and the back-end server 270. Devices other than those described above may be included at any position within the image computing server 200. Furthermore, the end user terminal 190 or the virtual camera operation UI 330 may have at least part of the functions of the image computing server 200.

The configuration of the image processing system 100 is likewise not limited to the above. For example, the virtual camera operation UI 330 may acquire images directly from the sensor systems 110a to 110z, or may access the database 250 directly. The image processing system 100 is also not limited to the physical configuration described above and may be configured logically.

A virtual viewpoint image is the image that would be obtained if the subject were photographed from a virtual viewpoint. In other words, a virtual viewpoint image represents the view from a designated viewpoint. The virtual viewpoint may be designated by the user or may be designated automatically based on, for example, the result of image analysis.

FIG. 2 is a diagram showing a configuration example of the camera adapter 120 of FIG. 1. The camera adapter 120 includes a camera control unit 121, a microphone control unit 122, a data transmission/reception unit 123, a time synchronization control unit 124, a foreground/background separation unit 125, a three-dimensional model information generation unit 126, a time code extraction unit 127, and an image additional information generation unit 128.

The camera control unit 121 is connected to the camera 112 and has functions for controlling the camera 112, acquiring captured images, providing a synchronization signal, and setting the time. Control of the camera 112 includes, for example, setting and referencing shooting parameters (such as the number of pixels, color depth, frame rate, and white balance), acquiring the state of the camera 112 (such as shooting, stopped, synchronizing, or error), starting and stopping shooting, and adjusting focus. Focus adjustment need not be performed by the camera control unit 121 on the camera 112. If a removable lens is attached to the camera 112, the camera adapter 120 may be connected to the lens and adjust it directly. The camera adapter 120 may also perform lens adjustments such as zoom via the camera 112. The synchronization signal is provided by supplying the shooting timing (control clock) to the camera 112, using the time that the time synchronization control unit 124 has synchronized with the time server 290. The time is set by providing the time that the time synchronization control unit 124 has synchronized with the time server 290 as a time code (time information) conforming to, for example, the SMPTE 12M format. The time synchronization control unit 124 generates, by external synchronization, the time code to be superimposed on each frame of the image data received from the camera 112. The time code format is not limited to SMPTE 12M and may be another format. The camera control unit 121 may also, instead of providing a time code to the camera 112, attach a time code itself to the image data received from the camera 112.

The microphone control unit 122 is connected to the microphone 111 and has functions for controlling the microphone 111, starting and stopping sound collection, and acquiring the collected audio data. Control of the microphone 111 includes, for example, gain adjustment and state acquisition. Like the camera control unit 121, the microphone control unit 122 provides the microphone 111 with audio sampling timing and a time code. As the clock information that sets the audio sampling timing, the time information from the time server 290 is converted into, for example, a 48 kHz word clock and supplied to the microphone 111.
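As a rough illustration of how a 48 kHz word clock relates to video frames, the number of audio samples per frame can be computed as below. The 60 fps rate is taken from the earlier example in the background section; the 60000/1001 rate is an added assumption, included only to show that the ratio is not always an integer. The function name `samples_per_frame` is hypothetical.

```python
from fractions import Fraction


def samples_per_frame(word_clock_hz: int, frame_rate: Fraction) -> Fraction:
    """Audio samples that fall within the duration of one video frame."""
    return Fraction(word_clock_hz) / frame_rate


print(samples_per_frame(48_000, Fraction(60)))            # 800
print(samples_per_frame(48_000, Fraction(60_000, 1001)))  # 4004/5
```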

The data transmission/reception unit 123 performs data communication with the other camera adapters 120, the front-end server 230, the time server 290, and the control station 310 via the networks 170a to 170y, 180a, 180b, 291, and 310a.

The time synchronization control unit 124 has a function of performing processing related to time synchronization with the time server 290 in compliance with PTP (Precision Time Protocol) of the IEEE 1588 standard. The time synchronization control unit 124 is not limited to PTP and may perform time synchronization using another similar protocol.

The foreground/background separation unit 125 separates the image data captured by the camera 112 into a foreground image and a background image. That is, the foreground/background separation unit 125 in each camera adapter 120 extracts a predetermined area from the image captured by the corresponding one of the cameras 112. The predetermined area is, for example, a foreground image obtained as a result of object detection on the captured image. Through this extraction, the foreground/background separation unit 125 separates the captured image into a foreground image and a background image. The object is, for example, a person. However, the object may be a specific person (such as a player, a manager, and/or a referee), or an object with a predetermined image pattern, such as a ball or a goal. A moving body may also be detected as the object. The foreground/background separation unit 125 separates, and processes separately, the foreground image that contains important objects such as people and the background area that contains no such objects.

The three-dimensional model information generation unit 126 generates image information relating to a three-dimensional model, using the foreground image separated by the foreground/background separation unit 125 and foreground images input from other camera adapters 120, based on, for example, the principle of a stereo camera.

The time code extraction unit 127 extracts the ancillary data superimposed on the image data captured by the camera 112. The ancillary data includes camera parameters such as zoom ratio, exposure, and color temperature, as well as a time code (hereinafter referred to as the external time code). The time code extraction unit 127 determines whether the external time code is corrupted and then transmits a valid time code to the image additional information generation unit 128. If the external time code is corrupted, the time code extraction unit 127 instead transmits a predicted time code, which it generates internally at all times, to the image additional information generation unit 128. The predicted time code is calculated from the immediately preceding external time code or predicted time code. In this calculation, the time code extraction unit 127 should preferably take drop frames into account.
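One way to picture the predicted time code is as "the previous time code plus one frame". The sketch below is an assumption-laden illustration, not the procedure of the specification: it assumes a nominal 30 frames per second and the usual SMPTE 12M drop-frame convention for that rate (frame numbers 00 and 01 are skipped at the start of every minute except minutes divisible by 10). The names `Timecode`, `next_timecode`, and `select_timecode` are hypothetical, and how the external time code is judged unusable is left to the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Timecode:
    hours: int
    minutes: int
    seconds: int
    frames: int


def next_timecode(tc: Timecode, fps: int = 30, drop_frame: bool = False) -> Timecode:
    """Predict the time code of the frame that follows tc."""
    h, m, s, f = tc.hours, tc.minutes, tc.seconds, tc.frames + 1
    if f >= fps:
        f, s = 0, s + 1
    if s >= 60:
        s, m = 0, m + 1
    if m >= 60:
        m, h = 0, h + 1
    h %= 24
    # Drop-frame counting: skip frame numbers 00 and 01 at the start of each
    # minute, except every tenth minute (assumed 30 fps nominal rate).
    if drop_frame and s == 0 and f == 0 and m % 10 != 0:
        f = 2
    return Timecode(h, m, s, f)


def select_timecode(external: Optional[Timecode], external_ok: bool,
                    previous: Timecode, drop_frame: bool = False) -> Timecode:
    """Use the external time code when usable, otherwise the prediction."""
    if external_ok and external is not None:
        return external
    return next_timecode(previous, drop_frame=drop_frame)


print(next_timecode(Timecode(0, 0, 59, 29), drop_frame=True))
# Timecode(hours=0, minutes=1, seconds=0, frames=2)
```

Because the prediction depends only on the previous value, it can be kept running on every frame and substituted whenever the external time code fails its check.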

The image additional information generation unit 128 generates information to be added to image data (data such as foreground images, background images, and three-dimensional models) or audio data (hereinafter referred to as image additional information). The image additional information includes the time code superimposed on the image data, a recording sequence number, a recording flag, a frame number, a camera adapter number, a data identifier, and the like. Some of the parameters included in the image additional information are generated based on the recording instruction input from the control station 310. The image additional information added to the image data or audio data is transferred to another camera adapter 120 or to the front-end server 230 via the data transmission/reception unit 123 and is used as control information.
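The fields listed above can be pictured as a small record attached to each frame. The following is a minimal sketch: the field list comes from the text, but the class name `ImageAdditionalInfo`, the Python types, and the example values are assumptions, and the actual layout and bit widths are not specified in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageAdditionalInfo:
    time_code: str                  # e.g. "12:34:56:07" (representation assumed)
    recording_sequence_number: int  # identifies one recording sequence
    recording_flag: int             # 1 = recording, 0 = recording stopped
    frame_number: int
    camera_adapter_number: int
    data_identifier: str            # e.g. "foreground", "background" (values assumed)


info = ImageAdditionalInfo(
    time_code="12:34:56:07",
    recording_sequence_number=3,
    recording_flag=1,
    frame_number=120,
    camera_adapter_number=5,
    data_identifier="foreground",
)
print(info.recording_flag == 1)  # True
```

Keeping the record immutable reflects its role as control information that travels unchanged with the data from the camera adapter to the server.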

FIG. 3 is a diagram showing a configuration example of the image computing server 200 of FIG. 1. The image computing server 200 includes the front-end server 230, the database 250, and the back-end server 270. The front-end server 230 includes a data input control unit 202 and a write control unit 203. The database 250 includes a communication unit 201, an SQ information management unit 204, and a storage unit 205. The back-end server 270 includes a read control unit 206 and a virtual viewpoint image generation unit 207.

The communication unit 201 is connected to the control station 310 via a communication path and transfers control instructions and the like received from the control station 310 to one or more functional blocks of the image computing server 200.

The data input control unit 202 is connected to the camera adapter 120 via a network and the switching hub 180. The data input control unit 202 acquires foreground images, background images, three-dimensional models, and audio data from the camera adapter 120, and transfers the data, as it arrives, to the write control unit 203 in the subsequent stage. The data input control unit 202 also transfers the image additional information attached to the image data or audio data to the write control unit 203.

The write control unit 203 writes the data received from the data input control unit 202 in the preceding stage to the storage unit 205. The write control unit 203 generates a write address based on parameters preset by the control station 310 and on the image additional information, and writes the data to the area of the storage unit 205 indicated by the write address. In doing so, the write control unit 203 decides whether to write according to the state of the recording flag included in the image additional information. That is, the write control unit 203 writes the image data to the storage unit 205 when the recording flag is 1 (recording) and does not write it when the recording flag is 0 (recording stopped). The write control unit 203 also monitors the image additional information and, when it detects a recording start position, transfers to the SQ information management unit 204 a set of recording start information consisting of the time code at that point, the recording sequence number, the camera adapter number, the data identifier, and the write address.
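The flag-gated write and the emission of recording start information can be sketched as follows. Several points are assumptions made only for illustration: the recording start position is taken to be a change of the recording flag from 0 to 1 within a single stream, the storage unit is modelled as a dictionary, and the address function passed as `make_address` is hypothetical (this passage does not give the actual address calculation).

```python
def write_frames(frames, storage, make_address):
    """Write frames whose recording flag is 1 and collect recording start info.

    frames: iterable of (info, payload), where info is a dict of image
            additional information.
    storage: dict standing in for the storage unit 205.
    make_address: hypothetical callable mapping info to a write address.
    """
    start_infos = []
    previous_flag = 0
    for info, payload in frames:
        flag = info["recording_flag"]
        if flag == 1:
            address = make_address(info)
            storage[address] = payload
            if previous_flag == 0:  # assumed meaning of "recording start position"
                start_infos.append({
                    "time_code": info["time_code"],
                    "recording_sequence_number": info["recording_sequence_number"],
                    "camera_adapter_number": info["camera_adapter_number"],
                    "data_identifier": info["data_identifier"],
                    "write_address": address,
                })
        previous_flag = flag
    return start_infos


def make_info(seq, frame, flag):
    """Build example image additional information (values are made up)."""
    return {"time_code": f"00:00:00:{frame:02d}", "recording_sequence_number": seq,
            "recording_flag": flag, "frame_number": frame,
            "camera_adapter_number": 1, "data_identifier": "foreground"}


frames = [(make_info(0, 0, 0), b"skipped"), (make_info(1, 1, 1), b"a"),
          (make_info(1, 2, 1), b"b"), (make_info(1, 3, 0), b"skipped")]
storage = {}
starts = write_frames(
    frames, storage,
    make_address=lambda i: (i["recording_sequence_number"], i["frame_number"]))
print(len(storage), starts[0]["write_address"])  # 2 (1, 1)
```

Only the two frames flagged as recording are stored, and a single recording start record is produced, carrying the address of the first stored frame.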

The SQ information management unit 204 manages the recording start information received from the write control unit 203 by sorting it into SQ tables, one of which is prepared for each recording sequence number. When the SQ information management unit 204 receives an SQ table read request from the read control unit 206, it reads all the recording start information (SQ table information) from the SQ table corresponding to the recording sequence number received with the request and transmits it to the read control unit 206.
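The per-sequence bookkeeping described above maps naturally onto a dictionary keyed by recording sequence number. This is a minimal sketch, assuming recording start information is passed around as plain dictionaries; the class name `SQInfoManager` and its method names are hypothetical.

```python
from collections import defaultdict


class SQInfoManager:
    """Keeps one table of recording start information per recording sequence number."""

    def __init__(self):
        self._tables = defaultdict(list)

    def register(self, start_info: dict) -> None:
        """Sort incoming recording start information into its SQ table."""
        self._tables[start_info["recording_sequence_number"]].append(start_info)

    def read_table(self, recording_sequence_number: int) -> list:
        """Return all recording start information for one sequence (a copy)."""
        return list(self._tables.get(recording_sequence_number, []))


manager = SQInfoManager()
manager.register({"recording_sequence_number": 1, "camera_adapter_number": 1,
                  "write_address": 0})
manager.register({"recording_sequence_number": 1, "camera_adapter_number": 2,
                  "write_address": 4096})
manager.register({"recording_sequence_number": 2, "camera_adapter_number": 1,
                  "write_address": 8192})
print(len(manager.read_table(1)), len(manager.read_table(2)))  # 2 1
```

Looking up one table by sequence number returns the start addresses for every camera adapter at once, which avoids comparing time codes image by image.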

The storage unit 205 is a storage medium such as an SSD (Solid State Drive). The storage unit 205 is configured with SSDs connected in parallel, which speeds up reading and writing of large amounts of data.

読み出し制御部２０６は、前景画像、背景画像、三次元モデル、及び音声データを記憶部２０５から読み出して仮想視点画像生成部２０７へ転送する。読み出し制御部２０６は、通信部２０１を介して制御ステーション３１０から受信した指示及びパラメータとＳＱ情報管理部２０４から受信したＳＱテーブル情報に基づいて読み出しアドレスを生成する。そして、読み出し制御部２０６は、読み出しアドレスが示す記憶部２０５の領域から画像データ又は音声データを読み出す。 The read control unit 206 reads the foreground image, the background image, the three-dimensional model, and the audio data from the storage unit 205 and transfers the read data to the virtual viewpoint image generation unit 207. The read control unit 206 generates a read address based on the instructions and parameters received from the control station 310 via the communication unit 201 and the SQ table information received from the SQ information management unit 204. Then, the read control unit 206 reads image data or audio data from the area of the storage unit 205 indicated by the read address.

仮想視点画像生成部２０７は、読み出し制御部２０６から受信した前景画像、背景画像、三次元モデル、及び音声データを用いて、仮想視点画像を生成する。その際、仮想視点画像生成部２０７は、通信部２０１を介して制御ステーション３１０から受信した仮想視点の座標や視点方向などのパラメータに基づいて、仮想視点画像を生成する。仮想視点画像生成部２０７は、生成した仮想視点画像をエンドユーザ端末１９０へ出力する。以下、画像処理システム１００の制御方法を説明する。 The virtual viewpoint image generation unit 207 generates a virtual viewpoint image using the foreground image, the background image, the three-dimensional model, and the audio data received from the read control unit 206. At this time, the virtual viewpoint image generation unit 207 generates a virtual viewpoint image based on parameters such as coordinates of the virtual viewpoint and the viewpoint direction received from the control station 310 through the communication unit 201. The virtual viewpoint image generation unit 207 outputs the generated virtual viewpoint image to the end user terminal 190. Hereinafter, a control method of the image processing system 100 will be described.

FIG. 4 is a flowchart showing an operation example of the time code extraction unit 127 of FIG. 2. The time code extraction unit 127 performs the process of FIG. 4 when it extracts a data packet containing a time code from the auxiliary data superimposed on the image data. The data packet is a VITC (Vertical Interval Time Code) data packet superimposed on the HANC (Horizontal Ancillary Data) of SDI (Serial Digital Interface).

ステップＳ４０１では、タイムコード抽出部１２７は、データパケットに含まれる誤り検出符号を検出する。誤り検出符号は、例えばパリティビット（ＰａｒｉｔｙＢｉｔ）又はチェックサム（ＣｈｅｃｋＳｕｍ）などである。また、ステップＳ４０１では、タイムコード抽出部１２７は、受信したデータパケットに含まれるデータ値に基づいて誤り検出符号を算出する。 In step S401, the time code extraction unit 127 detects an error detection code included in the data packet. The error detection code is, for example, a parity bit or a check sum. In step S401, the time code extraction unit 127 calculates an error detection code based on the data value included in the received data packet.

Next, in step S402, the time code extraction unit 127 compares the error detection code detected in step S401 with the calculated error detection code to determine whether the auxiliary data is corrupted. If the two error detection codes match (Yes in S402), no error is detected in the data packet, so the time code extraction unit 127 proceeds to step S403, which uses the external time code. If the two error detection codes do not match (No in S402), an error is detected in the data packet, so the time code extraction unit 127 proceeds to step S404, which uses the predicted time code.

ステップＳ４０３では、タイムコード抽出部１２７は、データパケットから外部タイムコードを抽出し、その外部タイムコードを画像付加情報生成部１２８へ出力するタイムコード（以後、出力タイムコードと呼ぶ）として決定する。その後、タイムコード抽出部１２７は、ステップＳ４０５に処理を進める。 In step S403, the time code extraction unit 127 extracts an external time code from the data packet, and determines the external time code as a time code (hereinafter referred to as an output time code) to be output to the image additional information generation unit 128. Thereafter, the time code extraction unit 127 proceeds with the process to step S405.

In step S404, the time code extraction unit 127 determines the predicted time code, which it constantly generates internally, as the output time code, and proceeds to step S405. The predicted time code is the value obtained by adding 1 to the frame count of the immediately preceding output time code. However, the time code extraction unit 127 needs to take drop frames into account when calculating the predicted time code. For example, when the frame rate is 59.94 fps, the time code extraction unit 127 needs to skip four frames each time the minute value increments, so that the predicted time code skips from 00:00:00:59 to 00:00:01:04. The skip is not needed, however, when the minute value increments to 0, 10, 20, 30, 40, or 50, so that the predicted time code advances from 00:00:09:59 to 00:00:10:00.
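The drop-frame rule for the predicted time code can be illustrated with a minimal Python sketch. The function name `next_timecode`, the tuple representation, and the default nominal rate of 60 frames per second are assumptions made for illustration and do not appear in the embodiment. The sketch reads the time code as HH:MM:SS:FF and applies the four-frame skip when the minute field rolls over, except when the new minute value is a multiple of 10.

```python
def next_timecode(tc, fps=60, drop=4):
    """Return the time code following tc = (hh, mm, ss, ff).

    fps is the nominal frame count per second (60 for 59.94 fps) and
    drop is the number of frame numbers skipped at a minute boundary.
    Both defaults are assumptions for this sketch.
    """
    hh, mm, ss, ff = tc
    ff += 1                      # add 1 to the previous frame count
    if ff == fps:                # frame field rolls over
        ff = 0
        ss += 1
    if ss == 60:                 # second field rolls over: a new minute begins
        ss = 0
        mm += 1
        if mm == 60:             # minute field rolls over
            mm = 0
            hh = (hh + 1) % 24
        if mm % 10 != 0:         # no skip for minutes 0, 10, 20, 30, 40, 50
            ff = drop            # skip frame numbers 0 .. drop-1
    return (hh, mm, ss, ff)
```

Under these assumptions, the time code after (0, 0, 59, 59) is (0, 1, 0, 4), whereas the time code after (0, 9, 59, 59) is (0, 10, 0, 0).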

ステップＳ４０５では、タイムコード抽出部１２７は、決定された出力タイムコードを画像付加情報生成部１２８へ出力する。 In step S405, the time code extraction unit 127 outputs the determined output time code to the image additional information generation unit 128.
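The flow of steps S401 to S405 can be sketched as follows. The packet layout (four payload bytes HH, MM, SS, FF followed by one checksum byte) and the checksum rule (sum of the payload modulo 256) are hypothetical choices made only so that the example runs; the embodiment states only that the error detection code is, for example, a parity bit or a checksum.

```python
def checksum(payload: bytes) -> int:
    # Hypothetical error detection code: sum of payload bytes modulo 256.
    return sum(payload) % 256


def select_output_timecode(packet: bytes, predicted):
    """Return the output time code for one received data packet."""
    payload, received = packet[:-1], packet[-1]  # S401: code carried in the packet
    computed = checksum(payload)                 # S401: code calculated from the data
    if received == computed:                     # S402: codes match, no error detected
        return tuple(payload[:4])                # S403: use the external time code
    return predicted                             # S404: use the predicted time code


# S405 corresponds to passing the returned value to the next stage.
```

For example, a packet whose payload was altered after the checksum was attached falls back to the predicted time code supplied by the caller.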

FIG. 5 is a diagram showing an example of the structure of the image additional information 500 generated by the image additional information generation unit 128 of FIG. 2. The image additional information 500 includes a time code 501, a recording sequence number 502, a recording flag 503, a frame number 504, a camera adapter number 505, a data identifier 506, and the like.

タイムコード５０１は、ＨＨ（時間）：ＭＭ（分）：ＳＳ（秒）：ＦＦ（フレーム）で構成された時間情報である。録画シーケンス番号５０２は、録画回数を示す情報であり、０以上の整数である。なお、録画シーケンス番号５０２が０の場合、カメラアダプタ１２０は、制御ステーション３１０から録画開始指示を一度も受け付けていないことを意味する。 The time code 501 is time information composed of HH (hour): MM (minute): SS (second): FF (frame). The recording sequence number 502 is information indicating the number of times of recording, and is an integer of 0 or more. When the recording sequence number 502 is 0, it means that the camera adapter 120 has not received a recording start instruction from the control station 310 even once.

録画フラグ５０３は、録画状態を示す情報である。録画フラグ５０３が１の場合、録画中を示し、録画フラグ５０３が０の場合、録画停止中を示す。フレーム番号５０４は、録画中のフレーム数を示す情報であり、０以上の整数である。 The recording flag 503 is information indicating a recording state. When the recording flag 503 is 1, it indicates that recording is in progress, and when the recording flag 503 is 0, it indicates that recording is stopped. The frame number 504 is information indicating the number of frames being recorded, and is an integer of 0 or more.

カメラアダプタ番号５０５は、カメラアダプタ１２０又はカメラ１１２を識別するための識別子であり、０以上の整数である。例えば、カメラアダプタ番号５０５は、すべてのカメラアダプタ１２０に対して０から始まる連番が割り振られている。 The camera adapter number 505 is an identifier for identifying the camera adapter 120 or the camera 112, and is an integer of 0 or more. For example, camera adapter numbers 505 are assigned serial numbers starting from 0 for all camera adapters 120.

データ識別子５０６は、前景画像、背景画像、三次元データ、及び音声データを識別するための情報である。データ識別子５０６は、アルファベット一文字（前景画像＝Ｆ、背景画像＝Ｂ、３次元モデル＝Ｍ、音声データ＝Ａ）である。なお、データ識別子５０６の構成は、これに限定されるものではなく、別の構成でもよい。 The data identifier 506 is information for identifying a foreground image, a background image, three-dimensional data, and audio data. The data identifier 506 is one alphabetic character (foreground image = F, background image = B, three-dimensional model = M, audio data = A). The configuration of the data identifier 506 is not limited to this, and may be another configuration.
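As an illustration, the fields 501 to 506 of the image additional information 500 could be represented by a simple record type. The class name, field names, and validation below are assumptions for this sketch; the embodiment does not prescribe a particular encoding.

```python
from dataclasses import dataclass

# One-letter data identifiers 506 as given in the description.
DATA_IDENTIFIERS = {
    "F": "foreground image",
    "B": "background image",
    "M": "three-dimensional model",
    "A": "audio data",
}


@dataclass
class ImageAdditionalInfo:
    time_code: tuple                 # 501: (HH, MM, SS, FF)
    recording_sequence_number: int   # 502: 0 means no recording start received yet
    recording_flag: int              # 503: 1 = recording, 0 = recording stopped
    frame_number: int                # 504: integer of 0 or more
    camera_adapter_number: int       # 505: serial number starting from 0
    data_identifier: str             # 506: one of "F", "B", "M", "A"

    def __post_init__(self):
        if self.data_identifier not in DATA_IDENTIFIERS:
            raise ValueError(f"unknown data identifier: {self.data_identifier!r}")
        if self.recording_flag not in (0, 1):
            raise ValueError("recording flag must be 0 or 1")
```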

FIG. 6 is a diagram for explaining how the image additional information generation unit 128 of FIG. 2 generates the image additional information 500 based on recording instructions. The image additional information generation unit 128 generates the recording sequence number 502, the recording flag 503, and the frame number 504 included in the image additional information 500 based on the recording instructions and parameters. For the time code 501, the time code received from the time code extraction unit 127 is used as is, regardless of the recording instructions. The image additional information generation unit 128 is a control information generation means, and generates the image additional information (control information) 500 to be added to each frame of the plurality of images captured by the plurality of cameras 112.

まず、時刻Ｔ１では、画像付加情報生成部１２８は、制御ステーション３１０から録画開始指示と録画開始時刻（０９：２０：４５：００）を受け付ける。画像付加情報生成部１２８は、これを起点として、タイムコード抽出部１２７から受信するタイムコードのモニタリングを開始し、録画開始時刻と一致するタイムコードを検索する。なお、録画開始時指示によって、録画シーケンス番号５０２、録画フラグ５０３、及びフレーム番号５０４の状態が変化することはない。 First, at time T1, the image additional information generation unit 128 receives a recording start instruction and a recording start time (09: 20: 45: 00) from the control station 310. Starting from this, the image additional information generation unit 128 starts monitoring of the time code received from the time code extraction unit 127, and searches for a time code that matches the recording start time. Note that the state of the recording sequence number 502, the recording flag 503, and the frame number 504 is not changed by the recording start instruction.

Next, at time T2, the image additional information generation unit 128 detects, among the time codes received from the time code extraction unit 127, a time code that matches the recording start time. When it detects a matching time code, the image additional information generation unit 128 increments the recording sequence number 502, changes the recording flag 503 from 0 (Low) to 1 (High), and initializes the frame number 504 to 0. Thereafter, the image additional information generation unit 128 holds the values of the recording sequence number 502 and the recording flag 503 until it receives a recording stop instruction from the control station 310. In addition, the image additional information generation unit 128 increments the frame number 504 each time it receives a time code from the time code extraction unit 127.

次に、時刻Ｔ３では、画像付加情報生成部１２８は、制御ステーション３１０から録画停止指示を受け付ける。画像付加情報生成部１２８は、録画停止指示を受け付けた場合、録画フラグ５０３を１（Ｈｉｇｈ）から０（Ｌｏｗ）へ変更し、録画シーケンス番号５０２とフレーム番号５０４の値を保持する。その後、画像付加情報生成部１２８は、次の録画開始指示を受け付けるまで、録画シーケンス番号５０２、録画フラグ５０３、及びフレーム番号５０４の値を保持する。 Next, at time T3, the image additional information generation unit 128 receives a recording stop instruction from the control station 310. When receiving the recording stop instruction, the image additional information generation unit 128 changes the recording flag 503 from 1 (High) to 0 (Low), and holds the values of the recording sequence number 502 and the frame number 504. Thereafter, the image additional information generation unit 128 holds the values of the recording sequence number 502, the recording flag 503, and the frame number 504 until the next recording start instruction is received.

The image additional information generation unit 128 sets to 1 the recording flag 503 of the frame to which the time code 501 matching the recording start time is added and of the subsequent frames. When a recording stop instruction is input, the image additional information generation unit 128 sets to 0 the recording flag 503 added to the frames of the plurality of cameras 112.
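The behavior at times T1, T2, and T3 can be sketched as a small state machine. The class and method names are hypothetical, and the sketch assumes that the frame number 504 is incremented only while the recording flag 503 is 1, which is how the description of times T2 and T3 is read here.

```python
class AdditionalInfoGenerator:
    """Tracks fields 502, 503 and 504 in response to recording instructions."""

    def __init__(self):
        self.sequence_number = 0   # 502
        self.recording_flag = 0    # 503
        self.frame_number = 0      # 504
        self.pending_start = None  # recording start time awaiting a match

    def start(self, start_time):
        # T1: accept the instruction; 502, 503 and 504 do not change yet.
        self.pending_start = start_time

    def stop(self):
        # T3: clear the flag and hold 502 and 504.
        self.recording_flag = 0

    def on_timecode(self, tc):
        # Called once per time code received from the time code extraction unit.
        if self.pending_start is not None and tc == self.pending_start:
            # T2: the received time code matches the recording start time.
            self.sequence_number += 1
            self.recording_flag = 1
            self.frame_number = 0
            self.pending_start = None
        elif self.recording_flag == 1:
            self.frame_number += 1
        return (tc, self.sequence_number, self.recording_flag, self.frame_number)
```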

図７は、メモリ構造パラメータ７００を説明するための図である。制御ステーション３１０は、通信部２０１を介して、書き込み制御部２０３又は読み出し制御部２０６へメモリ構造パラメータ７００を送信する。メモリ構造パラメータ７００は、ＢＡＳＥ＿ＡＤＤＲ７０１、ＣＡＭ＿ＯＦＳＴ＿ＡＤＤＲ７０２、ＢＧ＿ＯＦＳＴ＿ＡＤＤＲ７０３、ＭＬ＿ＯＦＳＴ＿ＡＤＤＲ７０４、及びＡＤ＿ＯＦＳＴ＿ＡＤＤＲ７０５を有する。さらに、メモリ構造パラメータ７００は、ＰＲＩＭ＿ＦＧ＿ＯＦＳＴ＿ＡＤＤＲ７０６、ＰＲＩＭ＿ＢＧ＿ＯＦＳＴ＿ＡＤＤＲ７０７、ＰＲＩＭ＿ＭＬ＿ＯＦＳＴ＿ＡＤＤＲ７０８、及びＰＲＩＭ＿ＡＤ＿ＯＦＳＴ＿ＡＤＤＲ７０９等を有する。メモリ構造パラメータ７００は、画像データ又は音声データを記憶部２０５の予め決められた領域へ書き込むために使用される。 FIG. 7 is a diagram for explaining the memory structure parameter 700. The control station 310 transmits the memory structure parameter 700 to the write control unit 203 or the read control unit 206 via the communication unit 201. The memory structure parameter 700 includes BASE_ADDR 701, CAM_OFST_ADDR 702, BG_OFST_ADDR 703, ML_OFST_ADDR 704, and AD_OFST_ADDR 705. Furthermore, the memory structure parameter 700 includes PRIM_FG_OFST_ADDR 706, PRIM_BG_OFST_ADDR 707, PRIM_ML_OFST_ADDR 708, PRIM_AD_OFST_ADDR 709, and the like. The memory structure parameter 700 is used to write image data or audio data to a predetermined area of the storage unit 205.

ＢＡＳＥ＿ＡＤＤＲ７０１は、すべてのカメラ１１２が撮影した画像データ及びすべてのマイク１１１が収音した音声データを記録するために確保された領域の先頭アドレスである。 BASE_ADDR 701 is the head address of the area secured for recording the image data captured by all the cameras 112 and the audio data collected by all the microphones 111.

ＣＡＭ＿ＯＦＳＴ＿ＡＤＤＲ７０２は、格納領域Ａ（７２０ａ〜７２０ｚ）の各々の先頭位置を特定するための相対アドレスである。ＣＡＭ＿ＯＦＳＴ＿ＡＤＤＲ７０２は、記憶部２０５の全領域をカメラ１１２の台数で均等に分割し、それぞれの格納領域Ａ（７２０ａ〜７２０ｚ）の先頭位置を特定するために用いられる。また、ＣＡＭ＿ＯＦＳＴ＿ＡＤＤＲ７０２は、前景画像を格納する格納領域Ｂ（７３０）の先頭位置を特定するためにも用いられる。 The CAM_OFST_ADDR 702 is a relative address for specifying the start position of each of the storage areas A (720 a to 720 z). The CAM_OFST_ADDR 702 is used to equally divide the entire area of the storage unit 205 by the number of cameras 112 and to specify the head position of each storage area A (720a to 720z). The CAM_OFST_ADDR 702 is also used to specify the start position of the storage area B (730) storing the foreground image.

BG_OFST_ADDR 703 is a relative address for specifying the start position of the storage area C (740), which stores background images. Each of the storage areas A (720a to 720z) is divided by data type (foreground image, background image, three-dimensional model, and audio data), and BG_OFST_ADDR 703 is used to specify, within that division, the start position of the storage area C (740) storing background images.

ML_OFST_ADDR 704 is a relative address for specifying the start position of the storage area D (750), which stores three-dimensional models. Each of the storage areas A (720a to 720z) is divided by data type (foreground image, background image, three-dimensional model, and audio data), and ML_OFST_ADDR 704 is used to specify, within that division, the start position of the storage area D (750) storing three-dimensional models.

AD_OFST_ADDR 705 is a relative address for specifying the start position of the storage area E (760), which stores audio data. Each of the storage areas A (720a to 720z) is divided by data type (foreground image, background image, three-dimensional model, and audio data), and AD_OFST_ADDR 705 is used to specify, within that division, the start position of the storage area E (760) storing audio data.

ＰＲＩＭ＿ＦＧ＿ＯＦＳＴ＿ＡＤＤＲ７０６は、１フレームの前景画像を格納する格納領域Ｆ（７３１ａ〜７３１ｚ）の各々の先頭位置を特定するための相対アドレスである。ＰＲＩＭ＿ＦＧ＿ＯＦＳＴ＿ＡＤＤＲ７０６は、格納領域Ｂ（７３０）の先頭位置から順次書き込まれた各前景画像フレームの先頭位置を特定するために用いられる。 The PRIM_FG_OFST_ADDR 706 is a relative address for specifying the head position of each of the storage areas F (731a to 731z) storing the foreground image of one frame. PRIM_FG_OFST_ADDR 706 is used to specify the start position of each foreground image frame sequentially written from the start position of storage area B (730).

ＰＲＩＭ＿ＢＧ＿ＯＦＳＴ＿ＡＤＤＲ７０７は、１フレームの背景画像を格納する格納領域Ｇ（７４１ａ〜７４１ｚ）の各々の先頭位置を特定するための相対アドレスである。ＰＲＩＭ＿ＢＧ＿ＯＦＳＴ＿ＡＤＤＲ７０７は、格納領域Ｃ（７４０）の先頭位置から順次書き込まれた各背景画像フレームの先頭位置を特定するために用いられる。 PRIM_BG_OFST_ADDR 707 is a relative address for specifying the head position of each of the storage areas G (741a to 741z) storing the background image of one frame. The PRIM_BG_OFST_ADDR 707 is used to specify the start position of each background image frame sequentially written from the start position of the storage area C (740).

ＰＲＩＭ＿ＭＬ＿ＯＦＳＴ＿ＡＤＤＲ７０８は、１フレームの三次元モデルを格納する格納領域Ｈ（７５１ａ〜７５１ｚ）の各々の先頭位置を特定するための相対アドレスである。ＰＲＩＭ＿ＭＬ＿ＯＦＳＴ＿ＡＤＤＲ７０８は、格納領域Ｄ（７５０）の先頭位置から順次書き込まれた各三次元モデルフレームの先頭位置を特定するために用いられる。 The PRIM_ML_OFST_ADDR 708 is a relative address for specifying the head position of each of the storage areas H (751 a to 751 z) storing the three-dimensional model of one frame. PRIM_ML_OFST_ADDR 708 is used to specify the start position of each three-dimensional model frame sequentially written from the start position of storage area D (750).

ＰＲＩＭ＿ＡＤ＿ＯＦＳＴ＿ＡＤＤＲ７０９は、１フレームの音声データを格納する格納領域Ｉ（７６１ａ〜７６１ｚ）の各々の先頭位置を特定するための相対アドレスである。ＰＲＩＭ＿ＡＤ＿ＯＦＳＴ＿ＡＤＤＲ７０９は、格納領域Ｅ（７６０）の先頭位置から順次書き込まれた各音声データフレームの先頭位置を特定するために用いられる。 PRIM_AD_OFST_ADDR 709 is a relative address for specifying the head position of each of the storage areas I (761a to 761z) storing audio data of one frame. PRIM_AD_OFST_ADDR 709 is used to specify the start position of each audio data frame sequentially written from the start position of storage area E (760).

フレームは、複数のデータ種別のデータを含み、例えば、前景画像、背景画像、三次元モデル及び音声データのうちの２以上のデータである。画像付加情報生成部１２８は、各フレームの複数のデータ種別のデータ毎に、画像付加情報５００を付加する。 A frame includes data of a plurality of data types, and is, for example, two or more data of a foreground image, a background image, a three-dimensional model, and audio data. The image additional information generation unit 128 adds the image additional information 500 to each of the plurality of data types of each frame.

FIG. 8 is a flowchart showing an operation example of the write control unit 203 of FIG. 3. The write control unit 203 performs the process of FIG. 8 each time it receives one frame of image data or audio data from the preceding data input control unit 202. The process is not limited to this example, however. For example, the process of FIG. 8 may be performed each time image data corresponding to a GOP (Group Of Pictures) is input, or each time one second of image data or audio data is input.

ステップＳ８０１では、書き込み制御部２０３は、データ入力制御部２０２から受信した画像データ又は音声データに付加されている画像付加情報５００を取得する。次に、ステップＳ８０２では、書き込み制御部２０３は、画像付加情報５００に含まれる録画フラグ５０３の状態に応じて、画像データを記憶部２０５へ書き込むか否かを判断する。書き込み制御部２０３は、録画フラグ５０３が１（録画中）の場合（Ｓ８０２のＹｅｓ）、画像データ及び音声データを記憶部２０５へ書き込む必要があるため、ステップＳ８０３に処理を進める。書き込み制御部２０３は、録画フラグ５０３が０（録画停止中）の場合（Ｓ８０２のＮｏ）、画像データ及び音声データを記憶部２０５へ書き込む必要がないため、処理を終了する。 In step S801, the writing control unit 203 acquires the image additional information 500 added to the image data or audio data received from the data input control unit 202. Next, in step S802, the write control unit 203 determines whether to write the image data to the storage unit 205 according to the state of the recording flag 503 included in the image additional information 500. When the recording flag 503 is 1 (during recording) (Yes in S802), the writing control unit 203 needs to write the image data and the audio data to the storage unit 205, and thus the processing proceeds to step S803. When the recording flag 503 is 0 (during recording stop) (No in S802), the writing control unit 203 ends the processing because it is not necessary to write the image data and the audio data to the storage unit 205.

ステップＳ８０３では、書き込み制御部２０３は、メモリ構造パラメータ７００と画像付加情報５００に基づいて、書き込みアドレスＷＲＩＴＥ＿ＡＤＤＲを生成する。すなわち、書き込み制御部２０３は、入力された画像データに付加された録画フラグ５０３が１である場合、当該画像データを格納するための記憶部２０５の記憶領域を示すアドレスＷＲＩＴＥ＿ＡＤＤＲを生成する。書き込み制御部２０３は、式（１）、（２）、（３）を用いて、書き込みアドレスＷＲＩＴＥ＿ＡＤＤＲを生成する。 In step S 803, the write control unit 203 generates a write address WRITE_ADDR based on the memory structure parameter 700 and the image additional information 500. That is, when the recording flag 503 added to the input image data is 1, the write control unit 203 generates an address WRITE_ADDR indicating a storage area of the storage unit 205 for storing the image data. The write control unit 203 generates a write address WRITE_ADDR using Expressions (1), (2), and (3).

まず、式（１）のＯＦＳＴ＿ＡＤＤＲを説明する。書き込み制御部２０３は、データ識別子５０６がＦである場合、ＯＦＳＴ＿ＡＤＤＲとして０を設定する。書き込み制御部２０３は、データ識別子５０６がＢである場合、ＯＦＳＴ＿ＡＤＤＲとしてＢＧ＿ＯＦＳＴ＿ＡＤＤＲ７０３を設定する。書き込み制御部２０３は、データ識別子５０６がＭである場合、ＯＦＳＴ＿ＡＤＤＲとしてＭＬ＿ＯＦＳＴ＿ＡＤＤＲ７０４を設定する。書き込み制御部２０３は、データ識別子５０６がＡである場合、ＯＦＳＴ＿ＡＤＤＲとしてＡＤ＿ＯＦＳＴ＿ＡＤＤＲ７０５を設定する。 First, OFST_ADDR in equation (1) will be described. When the data identifier 506 is F, the write control unit 203 sets 0 as OFST_ADDR. When the data identifier 506 is B, the write control unit 203 sets BG_OFST_ADDR 703 as OFST_ADDR. When the data identifier 506 is M, the write control unit 203 sets ML_OFST_ADDR 704 as OFST_ADDR. When the data identifier 506 is A, the write control unit 203 sets AD_OFST_ADDR 705 as OFST_ADDR.

Next, PRIM_OFST_ADDR in Expression (2) will be described. When the data identifier 506 is F, the write control unit 203 sets PRIM_FG_OFST_ADDR 706 as PRIM_OFST_ADDR. When the data identifier 506 is B, the write control unit 203 sets PRIM_BG_OFST_ADDR 707 as PRIM_OFST_ADDR. When the data identifier 506 is M, the write control unit 203 sets PRIM_ML_OFST_ADDR 708 as PRIM_OFST_ADDR. When the data identifier 506 is A, the write control unit 203 sets PRIM_AD_OFST_ADDR 709 as PRIM_OFST_ADDR.

その後、書き込み制御部２０３は、式（３）により、書き込みアドレスＷＲＩＴＥ＿ＡＤＤＲを生成する。 Thereafter, the write control unit 203 generates a write address WRITE_ADDR according to Expression (3).
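Expressions (1) to (3) themselves are not reproduced in this text, so the following Python sketch shows only one plausible reading of the address generation, based on the definitions of the memory structure parameter 700. The selection of OFST_ADDR and PRIM_OFST_ADDR follows the description; the final sum and the `frame_index` argument are assumptions, since this section does not state which index is multiplied by the per-frame offset. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemoryStructure:
    base_addr: int           # BASE_ADDR 701
    cam_ofst_addr: int       # CAM_OFST_ADDR 702
    bg_ofst_addr: int        # BG_OFST_ADDR 703
    ml_ofst_addr: int        # ML_OFST_ADDR 704
    ad_ofst_addr: int        # AD_OFST_ADDR 705
    prim_fg_ofst_addr: int   # PRIM_FG_OFST_ADDR 706
    prim_bg_ofst_addr: int   # PRIM_BG_OFST_ADDR 707
    prim_ml_ofst_addr: int   # PRIM_ML_OFST_ADDR 708
    prim_ad_ofst_addr: int   # PRIM_AD_OFST_ADDR 709


def write_addr(p, camera_adapter_number, data_identifier, frame_index):
    # Expression (1) as described: per-data-type offset inside storage area A.
    ofst = {"F": 0, "B": p.bg_ofst_addr,
            "M": p.ml_ofst_addr, "A": p.ad_ofst_addr}[data_identifier]
    # Expression (2) as described: per-frame offset for the data type.
    prim = {"F": p.prim_fg_ofst_addr, "B": p.prim_bg_ofst_addr,
            "M": p.prim_ml_ofst_addr, "A": p.prim_ad_ofst_addr}[data_identifier]
    # Expression (3), ASSUMED form: base + camera area + type area + frame slot.
    return (p.base_addr
            + p.cam_ofst_addr * camera_adapter_number
            + ofst
            + prim * frame_index)
```

With illustrative values such as a base address of 0x1000 and a camera offset of 0x100000, the foreground image of camera adapter 0 at index 0 maps to the base address itself, because the foreground offset is 0.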

次に、ステップＳ８０４では、書き込み制御部２０３は、画像付加情報５００をモニタリングし、対象フレームの画像データ又は音声データが録画開始位置であるか否かを判定する。書き込み制御部２０３は、対象フレームが録画開始位置である場合（Ｓ８０４のＹｅｓ）には、録画開始情報をＳＱ情報管理部２０４へ送る必要があるため、ステップＳ８０５に処理を進める。書き込み制御部２０３は、対象フレームが録画開始位置でない場合（Ｓ８０４のＮｏ）には、録画開始情報をＳＱ情報管理部２０４へ送る必要がないため、ステップＳ８０６に処理を進める。書き込み制御部２０３は、録画フラグ５０３が１（録画中）であり、かつフレーム番号５０４が０のときに、対象フレームが録画開始位置であると判定する。ただし、書き込み制御部２０３は、録画フラグ５０３を用いた判定については、ステップＳ８０２で既に行っているため、ステップＳ８０４では、フレーム番号５０４のみを用いて判定すればよい。 Next, in step S804, the write control unit 203 monitors the image additional information 500, and determines whether the image data or audio data of the target frame is the recording start position. When the target frame is the recording start position (Yes in S804), the writing control unit 203 needs to send the recording start information to the SQ information management unit 204, and therefore the process proceeds to step S805. If the target frame is not the recording start position (No in S804), the writing control unit 203 does not need to send the recording start information to the SQ information management unit 204, and advances the process to step S806. When the recording flag 503 is 1 (during recording) and the frame number 504 is 0, the writing control unit 203 determines that the target frame is the recording start position. However, since the determination using the recording flag 503 has already been performed in step S802, the writing control unit 203 may perform the determination using only the frame number 504 in step S804.

In step S805, the write control unit 203 transmits, to the SQ information management unit 204, recording start information that bundles the time code 501 at which the recording start position was detected, the recording sequence number 502, the camera adapter number 505, the data identifier 506, and the write address. The content of the recording start information that the write control unit 203 transmits to the SQ information management unit 204 differs depending on, for example, the memory structure described with reference to FIG. 7. Also, for example, even if the recording start information does not include the data identifier 506, the effect of enabling faster data readout than in the prior art is still obtained.

次に、ステップＳ８０６では、書き込み制御部２０３は、ステップＳ８０３において生成された書き込みアドレスＷＲＩＴＥ＿ＡＤＤＲが示す記憶部２０５の領域へ、画像データ又は音声データのフレームを書き込む。書き込み制御部２０３は、複数のカメラ１１２の各フレームの複数のデータ種別のデータ（前景画像、背景画像、三次元モデル及び音声データ）を書き込む。 Next, in step S806, the write control unit 203 writes a frame of image data or audio data in the area of the storage unit 205 indicated by the write address WRITE_ADDR generated in step S803. The writing control unit 203 writes data (foreground image, background image, three-dimensional model, and audio data) of a plurality of data types of each frame of the plurality of cameras 112.
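Steps S801 to S806 can be summarized in a short Python sketch. The dictionary-based storage, the `sq_entries` list standing in for the SQ information management unit 204, and the `make_addr` callback that supplies the write address are all hypothetical stand-ins chosen to keep the example self-contained.

```python
def handle_frame(info, data, storage, sq_entries, make_addr):
    """Process one received frame; return True if it was written.

    info is the image additional information obtained in S801, given
    here as a dict keyed by hypothetical field names.
    """
    if info["recording_flag"] != 1:      # S802: recording stopped, do not write
        return False
    addr = make_addr(info)               # S803: generate WRITE_ADDR
    if info["frame_number"] == 0:        # S804: recording start position
        sq_entries.append({              # S805: send recording start information
            "time_code": info["time_code"],
            "recording_sequence_number": info["recording_sequence_number"],
            "camera_adapter_number": info["camera_adapter_number"],
            "data_identifier": info["data_identifier"],
            "write_addr": addr,
        })
    storage[addr] = data                 # S806: write the frame
    return True
```

Because the recording flag is already checked in S802, the start-position test in S804 uses only the frame number, as noted in the description.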

FIG. 9 is a diagram showing the structure of the SQ table information 900 managed in the SQ tables of the SQ information management unit 204 of FIG. 3. The SQ table information 900 includes a camera number 901, a recording start time 902, and a base address 903.

カメラ番号９０１は、カメラアダプタ１２０に割り振られた番号である。カメラ番号９０１には、書き込み制御部２０３から受信した録画開始情報に含まれるカメラアダプタ番号５０５が設定される。 The camera number 901 is a number assigned to the camera adapter 120. In the camera number 901, a camera adapter number 505 included in the recording start information received from the writing control unit 203 is set.

録画開始時刻９０２は、ＨＨ（時間）：ＭＭ（分）：ＳＳ（秒）：ＦＦ（フレーム）で構成された時間情報である。録画開始時刻９０２には、書き込み制御部２０３から受信した録画開始情報に含まれる録画開始フレームのタイムコード５０１が設定される。 The recording start time 902 is time information composed of HH (hour): MM (minute): SS (second): FF (frame). In the recording start time 902, a time code 501 of a recording start frame included in the recording start information received from the writing control unit 203 is set.

ベースアドレス９０３は、記憶部２０５に書き込まれた録画シーケンスの先頭画像のアドレスである。ベースアドレス９０３には、書き込み制御部２０３から受信した録画開始情報に含まれる録画開始フレームの書き込みアドレスが設定される。 The base address 903 is the address of the leading image of the recording sequence written in the storage unit 205. In the base address 903, the write address of the recording start frame included in the recording start information received from the write control unit 203 is set.

For each recording sequence number 502, separate SQ table information 900 is generated for the foreground image, the background image, the three-dimensional model, and the audio data. Therefore, the number of pieces of SQ table information 900 corresponds to the combinations of recording sequences and data types. When entering recording start information received from the write control unit 203 into the SQ table information 900, the SQ information management unit 204 selects the destination SQ table information 900 based on the recording sequence number 502 and the data identifier 506 included in the recording start information. Based on the recording sequence number 502 and the data identifier 506, the SQ information management unit 204 registers the recording start time 902 and the base address 903 for each camera number 901 (each camera identifier).
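The table selection and registration described above can be sketched as a nested mapping. The class name, method names, and dictionary keys are hypothetical; the sketch only illustrates that one table exists per combination of recording sequence number and data identifier, and that each table holds one row per camera number.

```python
from collections import defaultdict


class SQInfoManager:
    def __init__(self):
        # (recording sequence number, data identifier)
        #   -> {camera number 901: (recording start time 902, base address 903)}
        self.tables = defaultdict(dict)

    def register(self, start_info):
        # Select the destination table from the sequence number and identifier.
        key = (start_info["recording_sequence_number"],
               start_info["data_identifier"])
        # Register the start time and base address for the camera number.
        self.tables[key][start_info["camera_adapter_number"]] = (
            start_info["time_code"], start_info["write_addr"])

    def read_tables(self, sequence_number):
        # Return every table of the requested recording sequence, by data type.
        return {data_id: dict(rows)
                for (seq, data_id), rows in self.tables.items()
                if seq == sequence_number}
```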

FIG. 10 is a flowchart showing an operation example of the read control unit 206 of FIG. 3. The read control unit 206 performs the process of FIG. 10 when it receives a read instruction from the control station 310 via the communication unit 201.

ステップＳ１００１では、読み出し制御部２０６は、読み出し指示と共に受信した読み出しパラメータとメモリ構造パラメータ７００を取得する。読み出しパラメータは、特定シーン（例えば、サッカーのゴールシーン）に該当する画像データ又は音声データを記憶部２０５から読み出すためのパラメータである。読み出しパラメータには、特定シーンに該当する画像データ又は音声データが含まれる録画シーケンス番号、特定シーンの開始時刻及び終了時刻などのうち、少なくとも何れか１つが含まれる。つまり読み出しパラメータは、録画シーケンス番号を特定するための情報であれば良い。 In step S1001, the read control unit 206 acquires the read parameter and the memory structure parameter 700 received together with the read instruction. The read parameter is a parameter for reading out from the storage unit 205 image data or audio data corresponding to a specific scene (for example, a goal scene of soccer). The readout parameter includes at least one of a recording sequence number including image data or audio data corresponding to a specific scene, a start time and an end time of the specific scene, and the like. That is, the read parameter may be information for specifying a recording sequence number.

Next, in step S1002, the read control unit 206 acquires the SQ table information 900 from the SQ information management unit 204 based on the recording sequence number specified from the read parameters. By transmitting the recording sequence number to the SQ information management unit 204, the read control unit 206 sequentially acquires the SQ table information 900 for each of the foreground image, the background image, the three-dimensional model, and the audio data corresponding to that recording sequence number.

次に、ステップＳ１００３では、読み出し制御部２０６は、読み出しパラメータに含まれる特定シーンの開始時刻、メモリ構造パラメータ７００、及びＳＱテーブル情報９００に基づいて、読み出しベースアドレスを生成する。読み出しベースアドレスは、記憶部２０５上のカメラごとに設けられた前景画像、背景画像、三次元モデル、及び音声データのそれぞれの格納領域に対して生成される。したがって、データ種別（前景画像、背景画像、三次元モデル、音声データ）の数とカメラアダプタ１２０の台数を掛け合わせた数の読み出しベースアドレスが生成される。例えば、カメラアダプタ１２０が１００台設置されている場合、４００個の読み出しベースアドレスが生成される。読み出し制御部２０６は、式（４）、（５）、（６）により、読み出しベースアドレスＲＥＡＤ＿ＢＡＳＥ＿ＡＤＤＲを算出する。式（５）及び（６）中の録画開始時刻９０２とベースアドレス９０３は、処理対象となるカメラ番号９０１に応じて、ＳＱテーブル情報９００から該当するデータが読み出される。 Next, in step S1003, the read control unit 206 generates a read base address based on the start time of the specific scene included in the read parameter, the memory structure parameter 700, and the SQ table information 900. The read base address is generated for each storage area of the foreground image, the background image, the three-dimensional model, and the audio data provided for each camera on the storage unit 205. Therefore, the number of read base addresses is generated by multiplying the number of data types (foreground image, background image, three-dimensional model, audio data) by the number of camera adapters 120. For example, when 100 camera adapters 120 are installed, 400 read base addresses are generated. The read control unit 206 calculates the read base address READ_BASE_ADDR according to Expressions (4), (5), and (6). As for the recording start time 902 and the base address 903 in the equations (5) and (6), corresponding data is read out from the SQ table information 900 according to the camera number 901 to be processed.

First, PRIM_OFST_ADDR in equation (4) is described. The read control unit 206 sets PRIM_OFST_ADDR according to the data type: PRIM_FR_OFST_ADDR 706 for a foreground image, PRIM_BG_OFST_ADDR 707 for a background image, PRIM_ML_OFST_ADDR 708 for a three-dimensional model, and PRIM_AD_OFST_ADDR 709 for audio data.
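
The selection of PRIM_OFST_ADDR by data type amounts to a table lookup. The sketch below assumes the memory structure parameters are available as a dictionary keyed by parameter name; that representation, the data type labels, and the numeric values are hypothetical and chosen only for illustration.

```python
# Parameter names (and reference numerals 706-709) come from the description.
PRIM_OFFSET_PARAM_BY_TYPE = {
    "foreground": "PRIM_FR_OFST_ADDR",  # 706
    "background": "PRIM_BG_OFST_ADDR",  # 707
    "model": "PRIM_ML_OFST_ADDR",       # 708
    "audio": "PRIM_AD_OFST_ADDR",       # 709
}


def select_prim_offset(data_type, memory_structure_params):
    """Pick the PRIM_OFST_ADDR value that matches the given data type."""
    return memory_structure_params[PRIM_OFFSET_PARAM_BY_TYPE[data_type]]


# Hypothetical parameter values, for illustration only.
params = {
    "PRIM_FR_OFST_ADDR": 0x0000,
    "PRIM_BG_OFST_ADDR": 0x4000,
    "PRIM_ML_OFST_ADDR": 0x8000,
    "PRIM_AD_OFST_ADDR": 0xC000,
}
prim_ofst_addr = select_prim_offset("background", params)
```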

The read control unit 206 also uses equation (5) to calculate the difference between the start time of the specific scene and the recording start time 902 as the difference frame count DIFF_FRAME_NUM, taking drop frames into account. The read control unit 206 then calculates the read base address READ_BASE_ADDR using equation (6).

Next, in step S1004, the read control unit 206 calculates the total number of frames in the specific scene. Specifically, the read control unit 206 subtracts the start time of the specific scene from its end time, taking drop frames into account, to obtain the total number of frames.
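
A time difference that accounts for drop frames can be sketched as follows. The description does not state the frame rate or time code format, so this example assumes the common 29.97 fps drop-frame time code, in which frame numbers 0 and 1 are skipped at the start of every minute except every tenth minute. It also assumes the end time is exclusive. The function names are illustrative.

```python
def drop_frame_to_frame_number(hours, minutes, seconds, frames):
    """Convert a 29.97 fps drop-frame time code to a frame count from 00:00:00;00."""
    total_minutes = 60 * hours + minutes
    # Two frame numbers are dropped each minute, except every tenth minute.
    dropped = 2 * (total_minutes - total_minutes // 10)
    nominal = (3600 * hours + 60 * minutes + seconds) * 30 + frames
    return nominal - dropped


def total_frames(start_tc, end_tc):
    """Number of frames between two time codes (end time assumed exclusive)."""
    return drop_frame_to_frame_number(*end_tc) - drop_frame_to_frame_number(*start_tc)


# 00:00:59;29 is followed directly by 00:01:00;02, so the two are one frame apart.
scene_frames = total_frames((0, 0, 59, 29), (0, 1, 0, 2))
```

Under the same assumptions, the same conversion could serve for the difference between the scene start time and the recording start time 902.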

Next, in step S1005, the read control unit 206 initializes a variable N to 0. The variable N indicates the number of the frame being processed.

Next, in step S1006, the read control unit 206 generates a read offset address READ_OFST_ADDR based on the memory structure parameters 700 and the variable N. A read offset address is generated for each of the foreground image, background image, three-dimensional model, and audio data storage areas provided in the storage unit 205, so as many read offset addresses are generated as there are data types (foreground image, background image, three-dimensional model, audio data). The read control unit 206 calculates the read offset address READ_OFST_ADDR using equation (7).

Next, in step S1007, the read control unit 206 generates read addresses using the read base address READ_BASE_ADDR of equation (6) and the read offset address READ_OFST_ADDR of equation (7). A read address is generated by adding the read offset address READ_OFST_ADDR to each read base address READ_BASE_ADDR generated in step S1003, so as many read addresses are generated as there are read base addresses. For each read base address READ_BASE_ADDR, the read offset address READ_OFST_ADDR of the same data type is selected. The read control unit 206 calculates the read address READ_ADDR using equation (8).

Next, in step S1008, the read control unit 206 reads image data or audio data from the area of the storage unit 205 indicated by the read address READ_ADDR. Using the read address READ_ADDR, the read control unit 206 reads one frame of data from each of the foreground image, background image, three-dimensional model, and audio data storage areas provided for each camera 112 in the storage unit 205.

Next, in step S1009, the read control unit 206 increments the variable N. Then, in step S1010, the read control unit 206 determines whether the variable N is smaller than the total number of frames. If N is smaller than the total number of frames, not all frames have been read from the storage unit 205, so the read control unit 206 returns to step S1006 and processes the next frame. If N is not smaller than the total number of frames, all frames have been read from the storage unit 205, so the read control unit 206 ends the process of FIG. 10.
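
The loop of steps S1005 to S1010 can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: `read_offset` stands in for equation (7), which is not reproduced in this passage, and `read_frame` stands in for access to the storage unit 205. Both callbacks and the function name `read_scene` are hypothetical. The only relation taken from the text is that the read address is the sum of the read base address and the read offset address of the same data type.

```python
def read_scene(base_addrs, total_frames, read_offset, read_frame):
    """Read every frame of a scene for each (camera, data_type) storage area.

    base_addrs:  dict mapping (camera, data_type) to READ_BASE_ADDR
    read_offset: callable (data_type, n) -> READ_OFST_ADDR (placeholder for eq. (7))
    read_frame:  callable (address) -> one frame of data from storage
    """
    frames = []
    n = 0  # S1005: initialize the frame counter
    while True:
        frame_data = {}
        for (camera, data_type), base in base_addrs.items():
            offset = read_offset(data_type, n)       # S1006
            read_addr = base + offset                # S1007: base plus offset
            frame_data[(camera, data_type)] = read_frame(read_addr)  # S1008
        frames.append(frame_data)
        n += 1                                       # S1009
        if not n < total_frames:                     # S1010: test after the read
            break
    return frames


# Hypothetical layout: fixed 0x10-byte frames, and "reading" returns the address.
bases = {(0, "foreground"): 0x1000, (0, "audio"): 0x2000}
result = read_scene(bases, 3, lambda data_type, n: n * 0x10, lambda addr: addr)
```

Because the flowchart tests N only after the increment, the loop reads at least one frame, which the sketch mirrors.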

The read control unit 206 reads, from the storage unit 205, the frames of the plurality of cameras 112 from the start time of the specific scene onward. For each frame of each camera 112, the read control unit 206 reads the data of the plurality of data types (foreground image, background image, three-dimensional model, and audio data). The virtual viewpoint image generation unit 207 generates a virtual viewpoint image using the frames of the plurality of cameras 112 read by the read control unit 206.

As described above, the write control unit 203 and the read control unit 206 perform write control and read control of the storage unit 205 based on the image additional information 500 generated by the camera adapter 120. This allows the image computing server 200 to extract, in a short time, the images captured at the same time from the group of images captured by the plurality of cameras 112, and thus shortens the time until the virtual viewpoint content can be viewed.

(Other embodiments)
The present invention can also be realized by a process in which a program implementing one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The embodiments described above are merely specific examples of implementing the present invention, and the technical scope of the present invention must not be construed as limited by them. In other words, the present invention can be implemented in various forms without departing from its technical concept or its main features.

121 camera control unit, 122 microphone control unit, 123 data transmission/reception unit, 124 time synchronization control unit, 125 foreground/background separation unit, 126 three-dimensional model information generation unit, 127 time code extraction unit, 128 image additional information generation unit, 202 data input unit, 203 write control unit, 204 SQ information management unit, 205 storage unit, 206 read control unit, 207 virtual viewpoint image generation unit

Claims

1. An image processing system comprising:
a plurality of imaging means for respectively capturing a plurality of images of a subject from a plurality of directions;
control information generation means for generating control information to be added to each frame of the plurality of images captured by the plurality of imaging means, based on time information and a recording instruction superimposed on each frame of the plurality of images captured by the plurality of imaging means;
write control means for writing each frame of the plurality of images captured by the plurality of imaging means into storage means based on the control information;
read control means for reading the frames of the plurality of images from the storage means based on the control information; and
virtual viewpoint image generation means for generating a virtual viewpoint image using the frames of the plurality of images read by the read control means.

2. The image processing system according to claim 1, wherein
the control information includes a recording state, a frame number, and an identifier of the imaging means,
the write control means writes each frame whose recording state indicates that recording is in progress to an address of the storage means corresponding to the frame number and the identifier of the imaging means, and registers the write address of the recording start frame for each identifier of the imaging means, and
the read control means reads the frames of the plurality of images from the storage means based on the write address registered for each identifier of the imaging means.

3. The image processing system according to claim 2, wherein
the control information includes the time information, and
the control information generation means sets, to a recording-in-progress state, the recording state of frames subsequent to the frame to which time information coinciding with the recording start time is added.

4. The image processing system according to claim 3, wherein, when a recording stop instruction is input, the control information generation means sets the recording state to be added to the frames of the plurality of images to a recording stopped state.

5. The image processing system according to claim 3 or 4, wherein
the write control means registers, for each identifier of the imaging means, the time information added to the recording start frame, and
the read control means reads, from the storage means, the frames of the plurality of images after the start time of a read instruction, based on an address corresponding to the difference between the time information added to the recording start frame registered for each identifier of the imaging means and the start time of the read instruction.

6. The image processing system according to any one of claims 3 to 5, further comprising:
synchronization control means for generating, by external synchronization, the time information to be superimposed on each frame; and
time information generation means for outputting internally generated time information to the control information generation means when an error in the time information generated by the synchronization control means is detected.

7. The image processing system according to any one of claims 2 to 6, wherein
each frame includes data of a plurality of data types,
the control information includes the data type,
the control information generation means generates the control information to be added to each item of data of the plurality of data types in each frame,
the write control means writes the data of the plurality of data types of the frame to addresses of the storage means corresponding to the data types, and
the read control means reads the data of the plurality of data types of the frame from the storage means based on the write address registered for each identifier of the imaging means and an address corresponding to the data type.

8. The image processing system according to claim 7, wherein the data of the plurality of data types are two or more of a foreground image, a background image, a three-dimensional model, and audio data.

9. The image processing system according to any one of claims 2 to 8, wherein
the control information includes a recording sequence number,
the write control means registers, based on the recording sequence number, the write address of the recording start frame for each identifier of the imaging means, and
the read control means reads the frames of the plurality of images from the storage means based on the write address registered for each identifier of the imaging means in association with the recording sequence number of the read instruction.

10. A control method for an image processing system having a plurality of imaging means for respectively capturing a plurality of images of a subject from a plurality of directions, the method comprising:
a control information generation step in which control information generation means generates control information to be added to each frame of the plurality of images captured by the plurality of imaging means, based on time information and a recording instruction superimposed on each frame of the plurality of images captured by the plurality of imaging means;
a writing step in which write control means writes each frame of the plurality of images captured by the plurality of imaging means into storage means based on the control information;
a reading step in which read control means reads the frames of the plurality of images from the storage means based on the control information; and
a virtual viewpoint image generation step in which virtual viewpoint image generation means generates a virtual viewpoint image using the read frames of the plurality of images.