JP4161773B2 - Video editing apparatus and processing method of video editing apparatus

Info

Publication number: JP4161773B2
Application number: JP2003101884A
Authority: JP
Inventors: 弘美星野; 史夫中島
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2003-04-04
Filing date: 2003-04-04
Publication date: 2008-10-08
Anticipated expiration: 2023-04-04
Also published as: JP2004312281A

Description

【０００１】
【発明の属する技術分野】
本発明は，映像信号の映像編集装置，映像編集装置の処理方法に関する。
【０００２】
【従来の技術】
近年，映画，ＴＶ番組などの映像作品の制作分野では，撮影した映像素材に関するメタデータの有効活用が進められている。この映像素材に関するメタデータは，例えば，映像作品のタイトル名，撮影日時，シーン番号などの映像素材の属性を表す情報や，撮影時における撮像装置やレンズ等の設定情報などである（例えば，特許文献１参照）。これらのメタデータは，撮影された映像素材を識別・管理する上で有用な情報であるとともに，映像素材の後処理段階におけるＣＧ（ＣｏｍｐｕｔｅｒＧｒａｐｈｉｃｓ）合成処理やコンポジット処理などにも有効利用されている。
【０００３】
従来では，このようなメタデータは，磁気テープ等に記録された映像素材とは別に，パーソナルコンピュータ等の端末上で記録・管理されており，かかる映像素材とメタデータとのリンクは，映像撮影時のタイムコード等を双方に付与することによってなされていた。このため，かかるメタデータを利用して映像素材を後処理する場合には，当該端末等からタイムコード等を介して，対応するメタデータ（カメラやレンズの設定情報等）を読み出し，かかるメタデータを当該映像素材の画質，被写体の動き等を表す情報として後処理に利用していた。
【０００４】
【特許文献１】
特開平９−４６６２７号公報
【０００５】
【発明が解決しようとする課題】
しかしながら，上記従来のようなメタデータを利用した後処理方法では，映像素材とは別に記録・管理され，タイムコード等を介して映像素材に間接的にリンクされているメタデータを利用していた。
【０００６】
このため，メタデータを抽出する際には，タイムコード等により時間合わせをする必要があり，必要なメタデータの抽出・表示等が非効率的であるという問題があった。また，映像素材とは別に，メタデータを取り扱う必要があり，不便であるという問題もあった。
【０００７】
さらに，映像素材またはメタデータのいずれか一方が編集されている場合には，メタデータを映像素材と同期させて連続的に抽出・表示することができないという問題があった。また，映像素材が可変速撮影（フレームレートを変化させて撮影）されている場合には，映像素材のフレーム数とメタデータの記録数との間にずれが生じているので，映像素材に対応するメタデータを好適に抽出できないという問題があった。
【０００８】
本発明は，上記問題点に鑑みてなされたものであり，本発明の目的は，後処理に必要なメタデータを，容易かつ迅速に抽出して，映像素材と同期させて表示できるとともに，映像素材とメタデータを一体的に取り扱うことができ，編集処理や可変速撮影された映像素材にも柔軟に対応することが可能な，新規かつ改良された映像編集装置およびその処理方法を提供することにある。
【０００９】
【課題を解決するための手段】
上記課題を解決するため，本発明の第１の観点によれば，記憶媒体に記録されている映像信号を後処理する映像編集装置が提供される。この映像編集装置は，前記記憶媒体から，前記映像信号の画質に関する撮像装置の設定情報である第１のメタデータ，撮影時におけるレンズ装置の設定情報である第２のメタデータ，又は，撮影時における前記撮像装置の位置又は動きに関する設定情報である第３のメタデータの少なくともいずれかがフレーム単位で付加されている前記映像信号を再生する映像信号再生装置と；前記再生された映像信号から，前記第１のメタデータ，前記第２のメタデータ又は前記第３のメタデータのうち少なくとも一つをフレーム単位で抽出するメタデータ抽出部と；前記再生された映像信号と，前記抽出されたメタデータとを，フレーム単位で同期させて表示部に表示する表示制御部と；を備える。
【００１０】
かかる構成により，映像編集装置の映像信号再生装置が再生する記憶媒体には，映像素材のコンテンツである映像信号と，この映像信号にフレーム単位で付加されたメタデータとがともに記録されている。従って，映像編集装置は，映像信号とメタデータとを一体的に取り扱うことができる。また，映像信号再生装置は，この記録媒体に記録されている映像信号を再生して，メタデータ抽出部に出力することができる。メタデータ抽出部は，再生された映像信号のフレーム毎に，当該映像信号に付加されているメタデータを順次，抽出することができる。このため，抽出されたメタデータと映像信号との整合性をとる必要がないので，メタデータ抽出部は，当該映像信号の各フレームに対応するメタデータを容易かつ迅速に抽出できる。また，メタデータ抽出部は，当該映像信号に付加されている全てのメタデータを抽出するのではなく，後処理の内容に応じて必要なメタデータのみを抽出できる。また，表示部は，当該映像信号のフレーム毎に，上記抽出されたメタデータを表示できる。これにより，映像編集装置のオペレータは，表示部に当該映像信号とともに表示されたメタデータを閲覧しながら，当該映像信号の後処理を好適に行うことができる。
【００１１】
また，上記後処理は，再生された映像信号と，合成用映像信号とを合成する映像合成処理である，ように構成してもよい。また，上記合成用映像信号は，コンピュータグラフィックス映像信号である，ように構成してもよい。
【００１２】
また，上記表示部は，再生された映像信号および抽出されたメタデータとともに，合成用映像信号および合成用映像信号に関連するメタデータを表示する，ように構成してもよい。かかる構成により，オペレータは，記憶媒体から再生された映像信号と，合成用映像信号の映像自体をフレーム毎に見比べることができるだけでなく，双方の映像信号に対応するメタデータをフレーム毎に見比べることができる。
【００１３】
また，上記後処理は，再生された映像信号を補正する映像補正処理である，ように構成してもよい。
【００１４】
また，上記映像信号に付加されているメタデータは，メタデータの利用目的に応じて，１または２以上のメタデータグループにグループ化されている，ように構成してもよい。かかる構成により，メタデータ抽出部は，メタデータグループ単位でメタデータを抽出することができる。
【００１５】
さらに，上記メタデータ抽出部は，後処理の内容に応じた１または２以上のメタデータグループを抽出する，ように構成してもよい。かかる構成により，メタデータ抽出部は，後処理の内容に応じて必要なメタデータグループのみを抽出して，表示できる。
【００１６】
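The metadata groups may include at least one of: a camera setting group containing setting information of the imaging device that generated the video signal; a lens setting group containing setting information of the lens device of the imaging device; or a dolly setting group containing setting information of the dolly device of the imaging device. With this configuration, the metadata of the camera setting group can serve, for example, as information representing the image quality and the like of the captured video. The metadata of the lens setting group and the dolly setting group can serve, for example, as information representing the movement, distance, and the like of subjects appearing in the captured video.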
【００１７】
また，上記映像信号に付加されるメタデータグループには，固有のグループ識別情報が付与されており，メタデータ抽出部は，グループ識別情報に基づいて，後処理の内容に応じた１または２以上のメタデータグループを抽出する，ように構成してもよい。かかる構成により，メタデータ抽出部は，グループ識別情報に基づいて，いずれのメタデータグループであるかを識別することができるので，メタデータグループ単位での抽出処理を迅速に行うことができる。
【００１８】
また，上記映像信号に付加されるメタデータグループには，メタデータグループのデータ量情報が付与されており，メタデータ抽出部は，データ量情報に基づいて，後処理の内容に応じた１または２以上のメタデータグループを抽出する，ように構成してもよい。かかる構成により，メタデータ抽出部は，あるメタデータグループのメタデータの抽出処理を実行するに際し，データ量情報に基づいて，予め，当該メタデータグループ内のメタデータ量を把握することができる。このため，メタデータ抽出部は，メタデータグループ単位での抽出処理を迅速に行うことができる。
【００１９】
また，上記記憶媒体に記録されている映像信号は，フレームレートが変化している，ように構成してもよい。かかる構成により，記憶媒体に記録されている映像信号のフレームレートが変化している場合であっても，当該映像信号のフレーム毎にメタデータが付加されているので，映像信号とメタデータの対応関係が崩れることがない。このため，メタデータ抽出部は，映像信号とメタデータの整合性をとらずとも，当該映像信号からメタデータを好適にフレーム単位で抽出できる。
【００２０】
また，上記課題を解決するため，本発明の別の観点によれば，記憶媒体に記録されている映像信号を後処理する映像編集装置の処理方法が提供される。この映像編集装置の処理方法は，前記記憶媒体から，前記映像信号の画質に関する撮像装置の設定情報である第１のメタデータ，撮影時におけるレンズ装置に関する設定情報である第２のメタデータ，又は，撮影時における前記撮像装置の位置又は動きに関する設定情報である第３のメタデータの少なくともいずれかがフレーム単位で付加されている前記映像信号を再生する映像信号再生段階と；前記再生された映像信号から，前記第１のメタデータ，前記第２のメタデータ又は前記第３のメタデータのうち少なくとも一つをフレーム単位で抽出するメタデータ抽出段階と；前記再生された映像信号と，前記抽出されたメタデータとを，フレーム単位で同期させて表示部に表示する表示制御段階と；を含む。
【００２１】
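The post-processing may be a video compositing process that composites the reproduced video signal with a compositing video signal. The compositing video signal may be a computer graphics video signal.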
【００２２】
また，上記表示段階では，再生された映像信号および抽出されたメタデータとともに，合成用映像信号および合成用映像信号に関するメタデータを同期させて表示する，ようにしてもよい。
【００２３】
また，上記後処理は，再生された映像信号を補正する映像補正処理である，ようにしてもよい。
【００２４】
また，上記映像信号に付加されているメタデータは，メタデータの利用目的に応じて，１または２以上のメタデータグループにグループ化されている，ようにしてもよい。また，上記メタデータ抽出段階では，後処理の内容に応じた１または２以上のメタデータグループを抽出する，ようにしてもよい。
【００２５】
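The metadata groups may include at least one of: a camera setting group containing setting information of the imaging device that generated the video signal; a lens setting group containing setting information of the lens device of the imaging device; or a dolly setting group containing setting information of the dolly device of the imaging device.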
【００２６】
【発明の実施の形態】
以下に添付図面を参照しながら，本発明の好適な実施の形態について詳細に説明する。なお，本明細書及び図面において，実質的に同一の機能構成を有する構成要素については，同一の符号を付することにより重複説明を省略する。
【００２７】
（第１の実施の形態）
以下に，本発明の第１の実施の形態にかかる映像編集装置およびその処理方法について説明する。以下では，まず，本実施形態にかかる映像編集装置が取り扱う映像信号を記憶媒体に記録する映像記録システム等について説明し，次いで，本実施形態にかかる映像編集装置について詳細に説明することとする。
【００２８】
＜１映像記録システム＞
まず，本実施形態にかかる映像記録システムおよび映像記録方法について説明する。
【００２９】
＜１．１システム構成＞
まず，本実施形態にかかる映像記録システムの概要について説明する。本実施形態にかかる映像記録システムは，例えば，テレビ放送局や，ビデオコンテンツ，映画等の制作会社などが，ＴＶ番組，ビデオコンテンツ，映画などの映像作品を制作するためのシステムである。この映像記録システムは，例えば，撮影現場（撮影スタジオ，ロケ現場等）に設けられ，映像作品を構成する映像素材の映像コンテンツデータを撮影・収録することができる。この映像コンテンツデータは，例えば，映像データ及び／又は音声データから構成されるコンテンツデータである。このうち映像データは，一般的には，例えば，動画像データであるが，図画，写真または絵画などの静止画像データを含むようにしてもよい。
【００３０】
さらに，この映像記録システムは，例えば，撮影した映像素材に関連する各種のメタデータを生成することができる。さらに，映像記録システムは，かかるメタデータをグループ化した上で，映像素材を構成する映像信号に対してフレームごとに付加して，映像信号とともに記憶媒体に記録することができる。なお，このメタデータは，例えば，上記映像素材の概要，属性または撮影機器の設定等を表す上位データであり，映像素材のインデックス情報や，撮影条件等を特定する情報などとして機能するが，詳細については後述する。
【００３１】
次に，図１に基づいて，本実施形態にかかる映像記録システムの全体構成について説明する。なお，図１は，本実施形態にかかる映像記録システム１の概略的な構成を示すブロック図である。
【００３２】
図１に示すように，本実施形態にかかる映像記録システム１は，例えば，撮像装置１０と，集音装置１８と，カメラコントロールユニット（以下では，ＣＣＵという。）２０と，メタデータ入力用端末装置３０と，メタデータ付加装置４０と，ビデオテープレコーダ（以下では，ＶＴＲという。）５０と，メタデータ合成装置６０と，表示装置７０と，から主に構成されている。
【００３３】
撮像装置１０は，例えば，レンズ装置１２に入射した光学像を電気信号に変換するビデオカメラなどであり，被写体を撮像して映像信号を生成・出力することができる。この撮像装置１０は，映像作品を構成する各場面（シーン）を撮影し，生成した映像信号を，例えばＣＣＵ２０に出力することができる。この映像信号は，例えば，プログレッシブ方式またはインターレース方式のいずれの方式で生成されてもよい。
【００３４】
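In this embodiment, the video signal is transmitted from the imaging device 10 to the CCU 20 as an optical signal, for example over an optical fiber cable. Transmitting the video signal optically allows longer transmission distances (for example, about 1 km) than transmission in HD SDI (High Definition Serial Digital Interface) format (for example, about 50 m). The imaging device 10 can therefore be placed well away from the CCU 20, the VTR 50, and the like, which increases freedom in shooting. The invention is not limited to this example, however; the imaging device 10 may transmit the video signal over, for example, an HD SDI cable. In that case, the CCU 20 may be omitted, and the video signal may be transmitted directly from the imaging device 10 to the metadata adding device 40.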
【００３５】
また，撮像装置１０は，例えば，上記撮影時における，撮像装置１０内の各種の設定情報（シャッタスピード，ゲイン等の撮影条件情報）を収集し，これらの設定情報を基にカメラ設定メタデータを生成することができる。さらに，撮像装置１０は，例えば，このカメラ設定メタデータをカメラ設定グループとしてグループ化してパッキングした上で，上記映像信号の１フレーム毎に付加することができるが，詳細については後述する。
【００３６】
また，かかる撮像装置１０は，例えば，レンズ装置１２と，ドーリ装置１４とを具備している。
【００３７】
レンズ装置１２は，例えば，複数枚のレンズと，これらレンズの距離，絞り等を調整する駆動装置とから構成されており，ズーム，アイリス，フォーカス等を調整して，撮像装置１０本体に好適な光学像を入射させることができる。このレンズ装置１２は，例えば，撮影時におけるレンズ装置１２内の各種の設定情報（ズーム，アイリス，フォーカス等の撮影条件情報）を，レンズ設定メタデータとして１フレーム毎に生成することができる。
【００３８】
ドーリ（ｄｏｌｌｙ）装置１４は，撮像装置１０本体を載置して移動させるための台車であり，例えば，撮像装置１０を被写体に接近させたり遠ざけたりして撮影する場合や，移動する被写体とともに撮像装置１０を移動させて撮影する場合などに用いられる。このドーリ装置１４は，例えば，その下部に設けられた滑車をレール上に載置することにより，被写体等に沿って高速移動することができる。かかるドーリ装置１４は，例えば，撮影時におけるドーリ装置１４内の各種の設定情報（ドーリの位置，カメラの向き等の撮影条件情報）を，ドーリ設定メタデータとして１フレーム毎に生成することができる。なお，このドーリ装置１４は，必ずしも設けられなくてもよく，例えば，上方から撮影するためクレーン等に撮像装置１０を設置する場合や，カメラマンが撮像装置１０を担いで撮影する場合などには，不要である。
【００３９】
上記のようにして生成されたレンズ設定メタデータおよびドーリ設定メタデータは，例えば，ＲＳ−２３２Ｃケーブルなどを介してメタデータ付加装置４０に出力される。
【００４０】
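The sound collecting device 18 consists of, for example, a microphone, and can generate and output an audio signal. More specifically, the sound collecting device 18 collects audio such as background sound and actors' voices while the imaging device 10 is shooting, and generates an audio signal. This audio signal is output to, for example, the VTR 50. The sound collecting device 18 may also be built into the imaging device 10.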
【００４１】
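The CCU 20 can, for example, convert the video signal input as an optical signal from the imaging device 10 into an HD SDI signal and output it to the metadata adding device 40 over an HD SDI cable. The CCU 20 can also, for example, obtain the camera setting metadata from that video signal, which it receives over the optical fiber cable or the like. The CCU 20 does not have to be a device separate from the imaging device 10; it may, for example, be built into the imaging device 10. In particular, when the imaging device 10 is configured to output the video signal in, for example, HD SDI format, the CCU 20 is not an essential device.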
【００４２】
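The metadata input terminal device 30 consists of, for example, an information processing device such as a personal computer and its peripherals, and can generate scene information metadata based on user input. Scene information metadata is, for example, metadata about the scene that the imaging device 10 shoots, such as the information conventionally written on an electronic clapperboard (scene number, take number, and so on). When, for example, a director or other user enters the scene number of the scene about to be shot, the metadata input terminal device 30 generates the corresponding scene information metadata and outputs it to the metadata adding device 40 over an RS-232C cable or the like. A cameraman, director, or other user can also use the metadata input terminal device 30 to enter, for example, comments made while recording the video material (notes on shooting conditions and the like) as additional scene information metadata.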
【００４３】
メタデータ付加装置４０は，本実施形態にかかる特徴的な装置であり，例えば，映像信号に対してフレーム単位で上記メタデータを付加することができる。より詳細には，メタデータ付加装置４０には，例えば，上記レンズ装置１２，ドーリ装置１４およびメタデータ入力用端末装置３０などから，それぞれ，レンズ設定メタデータ，ドーリ設定メタデータ，シーン情報メタデータなどが入力される。メタデータ付加装置４０は，例えば，これらのメタデータを，その利用目的ごとに，レンズ設定グループ，ドーリ設定グループ，シーン情報グループなどといった複数のメタデータグループにグループ化して，パッキングする。さらに，メタデータ付加装置４０は，例えば，このようにグループ化したレンズ設定グループ，ドーリ設定グループおよびシーン情報グループのメタデータを，ＣＣＵ２０から入力された映像信号のブランキング領域に１フレーム毎に順次，挿入して付加することができる。このようにして，全てのメタデータが付加された映像信号は，例えば，ＨＤＳＤＩケーブル等を介してＶＴＲ５０に出力される。
【００４４】
なお，このメタデータ付加装置４０には，リファレンス信号生成装置７２からリファレンス信号（基準同期信号）が入力され，タイムコード信号生成装置７４からタイムコード信号（ＬＴＣ：ｌｉｎｅａｒＴｉｍｅＣｏｄｅ）が入力されている。また，かかるＬＴＣをＶＴＲ５０に出力することもできる。
【００４５】
ＶＴＲ５０は，本実施形態にかかる映像信号記録装置として構成されており，例えば，上記メタデータ付加装置４０から入力された映像信号や，集音装置１８から入力された音声信号を，ビデオテープ５２等の記憶媒体に記録することができる。また，このＶＴＲ５０は，ビデオテープ５２に記録されている映像信号等を再生することもできる。また，このＶＴＲ５０は，例えば，メタデータ付加装置４０から入力された映像信号をそのままメタデータ合成装置６０に出力する，あるいは，ビデオテープ５２から再生した映像信号をメタデータ合成装置６０に出力することができる。
【００４６】
なお，本実施形態では，記憶媒体としてビデオテープ５２を用いているが，かかる例に限定されず，例えば，各種の磁気テープ，磁気ディスク，光ディスク，メモリーカード等の任意の記憶媒体であってもよい。また，映像信号記録装置は，上記ＶＴＲ５０の例に限定されず，このような各種の記憶媒体に対応した装置（ディスク装置，各種リーダライタ等）に変更することもできる。
【００４７】
メタデータ合成装置６０は，例えば，上記のように映像信号に付加されているメタデータを抽出，デコードして，当該映像信号に合成するデコーダ装置である。より詳細には，このメタデータ合成装置６０は，例えば，ＶＴＲ５０から入力された映像信号に付加されているメタデータの全部または一部を，フレーム単位で抽出することができる。さらに，このメタデータ合成装置６０は，抽出したメタデータをデコードして映像データに書き換えた上で，当該映像信号にフレーム単位で合成することができる。この合成とは，例えば，上記映像信号と，メタデータの映像データとを，例えばフレーム単位で多重（スーパーインポーズ等）することをいう。
【００４８】
表示装置７０は，例えば，ＬＣＤ（ＬｉｑｕｉｄＣｒｙｓｔａｌＤｉｓｐｌａｙ），ＣＲＴ（ＣａｔｈｏｄｅＲａｙＴｕｂｅ）などのディスプレイ装置である。この表示装置７０は，上記メタデータ合成装置６０から上記メタデータが合成された映像信号が入力されると，当該メタデータがスーパーインポーズ等された映像を表示することができる。
【００４９】
＜１．２メタデータの内容＞
次に，本実施形態にかかるグループ化されたメタデータについて詳細に説明する。本実施形態では，例えば，上記のような映像素材に関連する多様なメタデータを，その利用目的等に応じて，例えば４つのメタデータグループにグループ分けして，伝送，記録，管理している。以下では，これら４つのメタデータグループごとに，そのメタデータグループに含まれるメタデータの内容について詳細に説明する。
【００５０】
＜１．２．１シーン情報グループ＞
まず，図２に基づいて，シーン情報グループに含まれるシーン情報メタデータについて，具体例を挙げながら詳細に説明する。なお，図２は，本実施形態にかかるシーン情報グループに含まれるシーン情報メタデータの具体例を示す説明図である。
【００５１】
図２に示すように，シーン情報グループに含まれるシーン情報メタデータは，例えば，従来，電子カチンコ（スレート）等に表示されていた「タイムコード」，「シーン番号」，「テイク番号」などの情報をはじめとする，撮像装置１０が撮影するシーンに関連する各種のメタデータである。
【００５２】
・「タイムコード」は，ＬＴＣなどに代表される時間，分，秒，フレーム番号等からなる時間情報である。従来では，この「タイムコード」は，例えば，ビデオテープ５２の音声トラックなどの長手方向に記録されていた。本実施形態では，この「タイムコード」は，タイムコード信号生成装置７４によって生成され，例えば，上記メタデータ付加装置によって映像信号のブランキング領域に１フレーム毎に付される。このタイムコードによって，映像信号の位置を特定することができる。この「タイムコード」のデータ量は例えば１６バイトである。
・「日付」は，撮影が行われた日付を表すテキスト情報であり，そのデータ量は例えば４バイトである。
・「映像作品題名」は，映像作品のタイトルを表すテキスト情報であり，そのデータ量は例えば３０バイトである。
・「撮影チーム番号」は，当該撮影を担当している撮影チーム（クルー）を特定するためのＩＤ番号などであり，そのデータ量は例えば２バイトである。
・「シーン番号」は，映像作品を構成する複数のシーン（Ｓｃｅｎｅ；撮影場面）のうち，撮影が行われているシーンを特定するための番号などであり，そのデータ量は例えば２バイトである。この「シーン番号」を参照することにより，撮影された映像素材が，映像作品中のいかなるシーンに相当するものであるかを識別できる。なお，例えば，シーンをさらに細分化したカットの番号を，シーン情報メタデータとして追加することもできる。
・「テイク番号」は，撮像装置１０による１回の記録開始から記録終了に至るまでの連続した映像単位であるテイク（Ｔａｋｅ）を特定するための番号であり，そのデータ量は例えば２バイトである。この「テイク番号」を参照することにより，記録されている映像信号が，いかなるシーンに属するいかなるテイクに相当するものであるかを識別できる。
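・"Roll number" is a number identifying a roll, a video unit that further subdivides a take; its data size is, for example, 2 bytes.
・"Cameraman", "Director", and "Producer" are text information giving the names of the cameraman, director, and producer responsible for the shooting; the data size of each is, for example, 16 bytes.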
【００５３】
このように，シーン情報グループには，例えば，収録された映像の属性情報やインデックス情報となりうるメタデータが集められている。このシーン情報メタデータは，例えば，映像収録段階，後処理段階および編集段階などで，その映像素材のコンテンツを把握し，映像素材を識別，管理する上で有用な情報となる。
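As a rough illustration of how the scene information group described above could be laid out, the following Python sketch packs and unpacks the listed fields using the example byte sizes from the text (16 + 4 + 30 + 2 + 2 + 2 + 2 + 16 + 16 + 16 = 106 bytes, within the 124-byte example limit). The field order, the big-endian integer encoding, the NUL padding of text fields, and all names such as `SceneInfo` are assumptions made for illustration; the description gives only the example sizes.

```python
import struct
from dataclasses import dataclass

# Assumed layout: fields in the order listed in the text, no padding bytes.
# Sizes follow the example values in the description.
_SCENE_STRUCT = struct.Struct(">16s4s30sHHHH16s16s16s")


@dataclass
class SceneInfo:
    timecode: str    # 16 bytes
    date: str        # 4 bytes
    title: str       # 30 bytes
    team: int        # 2 bytes
    scene: int       # 2 bytes
    take: int        # 2 bytes
    roll: int        # 2 bytes
    cameraman: str   # 16 bytes
    director: str    # 16 bytes
    producer: str    # 16 bytes


def _text(raw: bytes) -> str:
    """Strip the NUL padding that struct adds to short text fields."""
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


def pack_scene_info(info: SceneInfo) -> bytes:
    """Serialize one frame's scene information into a fixed 106-byte value."""
    return _SCENE_STRUCT.pack(
        info.timecode.encode("ascii"), info.date.encode("ascii"),
        info.title.encode("ascii"),
        info.team, info.scene, info.take, info.roll,
        info.cameraman.encode("ascii"), info.director.encode("ascii"),
        info.producer.encode("ascii"),
    )


def unpack_scene_info(raw: bytes) -> SceneInfo:
    """Inverse of pack_scene_info."""
    tc, date, title, team, scene, take, roll, cam, director, prod = (
        _SCENE_STRUCT.unpack(raw)
    )
    return SceneInfo(_text(tc), _text(date), _text(title), team, scene, take,
                     roll, _text(cam), _text(director), _text(prod))
```

A fixed-width layout like this keeps the group size constant from frame to frame, which fits the per-frame insertion described in the text, although a real implementation may encode the fields differently.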
【００５４】
＜１．２．２カメラ設定グループ＞
次に，図３に基づいて，カメラ設定グループに含まれるカメラ設定メタデータについて，具体例を挙げながら詳細に説明する。なお，図３は，本実施形態にかかるカメラ設定グループに含まれるカメラ設定メタデータの具体例を示す説明図である。
【００５５】
図３に示すように，カメラ設定グループに含まれるカメラ設定メタデータは，例えば，映像を撮影したときの撮像装置１０の設定情報をメインとする各種の撮影条件等を表すメタデータである。
【００５６】
・「カメラＩＤ」は，撮影処理を行った撮像装置１０を特定するためのシリアル番号（機器番号）であり，そのデータ量は例えば４バイトである。
・「ＣＨＵスイッチＯＮ／ＯＦＦ」は，以下に説明するような，撮像装置１０の設定を標準設定から変化させているか否かを表すビット情報であり，そのデータ量は例えば１バイトである。
・「ＣＣＵＩＤ」は，撮影処理を行ったＣＣＵ２０を特定するためのシリアル番号（機器番号）であり，そのデータ量は例えば４バイトである。
・「フィルタ設定」は，撮影時における撮像装置１０のフィルタの設定を表す情報であり，そのデータ量は例えば２バイトである。本実施形態では，例えば，撮像装置１０が５種類のフィルタを２重に備えており，このうち，どのフィルタを２つ組み合わせて撮影したかを表している。
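・"Shutter speed" is information representing the shutter speed setting value of the imaging device 10 at the time of shooting; its data size is, for example, 1 byte. In this embodiment, the shutter speed can be set in, for example, six steps between 1/100 and 1/2000 second.
・"Gain" is information representing the gain setting value of the imaging device 10 at the time of shooting; its data size is, for example, 1 byte.
・"ECS" is information representing whether the ECS (Extended Clear Scan) function of the imaging device 10 was ON or OFF at the time of shooting; its data size is, for example, 2 bytes.
・"Gamma (master)" is information representing the gamma characteristic setting (gamma curve and the like) of the imaging device 10 at the time of shooting; its data size is, for example, 2 bytes.
・"Gamma (user setting)" is information representing the gamma characteristic setting when the gamma curve or the like has been changed by user settings; its data size is, for example, 1 byte.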
・「バリアブルフレームレート」は，可変速撮影可能な撮像装置１０によって撮影された映像信号のフレームレート設定値を表す情報であり，そのデータ量は例えば１バイトである。本実施形態にかかる撮像装置１０は，例えば，２３．９８〜３０Ｐでフレームレートを変化させて撮影可能であるが，かかる例に限定されず，例えば１〜６０Ｐで可変速撮影できるように構成してもよい。
・「映像信号白レベル」は，撮影時における撮像装置１０のホワイトバランス調整処理による映像信号の白レベル設定値を表す情報であり，そのデータ量は例えば６バイトである。
・「映像信号黒レベル」は，撮影時における撮像装置１０のブラックバランス調整処理による映像信号の黒レベルの設定値を表す情報であり，そのデータ量は例えば８バイトである。
・「ディテールレベル」は，撮影時における撮像装置１０のディテール調整処理によるディテールレベルの設定値を表す情報であり，そのデータ量は例えば２バイトである。
・「ニーポイント」は，撮影時における撮像装置１０のニー回路で圧縮される映像信号のニーポイントの設定値を表す情報であり，そのデータ量は例えば２バイトである。
・「ニースロープ」は，撮影時において撮像装置１０のニー回路で圧縮される映像信号のニースロープの設定値を表す情報であり，そのデータ量は例えば２バイトである。
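・"Recorder status" is information representing the frame rate setting value at which a video signal recording/reproducing device such as the VTR 50 records the video signal; its data size is, for example, 1 byte. The "Recorder status" is determined according to the "Variable frame rate" above.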
【００５７】
このように，カメラ設定グループには，例えば，撮影時における撮像装置１０の設定情報などの撮影条件に関するメタデータが集められている。このカメラ設定メタデータは，例えば，映像素材の後処理段階などで，その映像素材の画質（明度，色合い，質感等）などを把握する上で有用な情報となる。
【００５８】
＜１．２．３レンズ設定グループ＞
次に，図４に基づいて，レンズ設定グループに含まれるレンズ設定メタデータについて，具体例を挙げながら詳細に説明する。なお，図４は，本実施形態にかかるレンズ設定グループに含まれるレンズ設定メタデータの具体例を示す説明図である。
【００５９】
図４に示すように，レンズ設定グループに含まれるレンズ設定メタデータは，例えば，映像撮影時におけるレンズ装置１２の設定情報をメインとする各種の撮影条件等を表すメタデータである。
【００６０】
・「ズーム」は，撮影時におけるレンズ装置１２の撮影倍率調整処理によるズーム設定値を表す情報であり，そのデータ量は例えば２バイトである。
・「フォーカス」は，撮影時におけるレンズ装置１２の焦点距離調整処理によるフォーカス設定値を表す情報であり，そのデータ量は例えば２バイトである。
・「アイリス」は，撮影時におけるレンズ装置１２の露光調整処理によるアイリス（絞り）設定値を表す情報であり，そのデータ量は例えば２バイトである。
・「レンズＩＤ」は，撮影に使われたレンズ装置１２を特定するためのシリアル番号（機器番号）であり，そのデータ量は例えば４バイトである。
【００６１】
このように，レンズ設定グループには，例えば，撮影時におけるレンズ装置１２の設定情報などの撮影条件に関するメタデータが集められている。このレンズ設定メタデータは，例えば，映像素材の後処理段階などで，その映像素材で撮影されている被写体の動き，撮像装置１０からの距離等を把握する上で有用な情報となる。
【００６２】
＜１．２．４ドーリ設定グループ＞
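Next, the dolly setting metadata contained in the dolly setting group is described in detail with reference to Fig. 5, using concrete examples. Fig. 5 is an explanatory diagram showing concrete examples of the dolly setting metadata contained in the dolly setting group according to this embodiment.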
【００６３】
図５に示すように，ドーリ設定グループに含まれるドーリ設定メタデータは，例えば，映像撮影時におけるドーリ装置１４の設定情報をメインとする各種の撮影条件等を表すメタデータである。
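・"GPS" is latitude and longitude information (Global Positioning System information) for identifying the position of the dolly device 14 (that is, the position of the imaging device 10) at the time of shooting; its data size is, for example, 12 bytes.
・"Moving direction" is information representing, as an angle, the moving direction of the dolly device 14 (that is, the moving direction of the imaging device 10) at the time of shooting; its data size is, for example, 4 bytes.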
・「移動スピード」は，撮影時におけるドーリ装置１４の移動スピード（即ち，撮像装置１０の移動スピード）を表す情報であり，そのデータ量は例えば４バイトである。
・「カメラ方向」は，撮像装置１０の撮影方向を表す情報であり，固定されたドーリ装置１４を基準として，撮像装置１０の回転角度（首を振った角度）で表現される。具体的には，例えば，撮像装置１０の撮像方向を「パン（ｐａｎ）」（Ｚ軸方向），「チルト（ｔｉｌｔ）」（Ｙ軸方向），「ロール（ｒｏｌｌ）」（Ｘ軸方向）の３方向の回転角度で表す。これら３つのデータ量はそれぞれ例えば２バイトである。
・「ドーリ高さ」は，ドーリ装置１４の高さを表す情報であり，そのデータ量は例えば２バイトである。この情報により，撮像装置１０の垂直方向の位置が特定できる。
・「ドーリＩＤ」は，撮影に使われたドーリ装置１４を特定するためのシリアル番号（機器番号）であり，そのデータ量は例えば４バイトである。
【００６４】
このように，ドーリ設定グループには，例えば，撮影時におけるドーリ装置１４の位置，動き等の設定情報からなる撮影条件に関するメタデータが集められている。このドーリ設定メタデータも，例えば，上記レンズ設定メタデータと同様に，映像素材の後処理段階などで，その映像素材に現れている被写体の動き，距離等を把握する上で有用な情報となる。
【００６５】
以上，本実施形態にかかる例えば４つのメタデータグループの内容について説明した。このようにメタデータをグループ化することにより，メタデータの利用目的に応じて，必要なメタデータのみをグループ単位で好適に抽出して，利用，書き換えなどすることができる。
【００６６】
例えば，映像の収録段階では，収録中あるいは収録完了した映像を識別，把握するなどの目的で，シーン番号，タイムコード等を含む上記シーン情報グループのメタデータが抽出されて活用される。また，映像素材の後処理段階では，実写の映像に対してＣＧ映像を合成処理する場合などに，上記カメラ，レンズおよびドーリ設定グループのメタデータが有用である。具体的には，当該映像素材の画質を把握するなどの目的で，上記カメラ設定グループのメタデータが抽出されて活用される。また，当該映像素材内の被写体の動きを把握するなどの目的で，上記レンズ設定グループおよびドーリ設定グループのメタデータが抽出されて活用される。
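The purpose-driven selection described above can be sketched as a simple lookup from a purpose to the metadata groups it needs. This is only an illustrative Python sketch: the purpose names, the group names, and the `select_groups` helper are hypothetical, and the mapping merely restates the examples given in the text (scene information for identifying recorded video, camera settings for image quality, lens and dolly settings for subject motion, and all three setting groups for CG compositing).

```python
SCENE_INFO, CAMERA, LENS, DOLLY = "scene_info", "camera", "lens", "dolly"

# Hypothetical purpose names mapped to the groups the text says are useful.
GROUPS_BY_PURPOSE = {
    "recording_review": {SCENE_INFO},      # identify scene, take, timecode
    "image_quality": {CAMERA},             # brightness, color, texture
    "subject_motion": {LENS, DOLLY},       # movement and distance
    "cg_compositing": {CAMERA, LENS, DOLLY},
}


def select_groups(frame_metadata: dict, purpose: str) -> dict:
    """Return only the metadata groups of one frame needed for a purpose."""
    wanted = GROUPS_BY_PURPOSE[purpose]
    return {name: value for name, value in frame_metadata.items()
            if name in wanted}
```

For example, calling `select_groups(frame, "subject_motion")` on a frame that carries all four groups would return only its lens and dolly entries.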
【００６７】
なお，このようにレンズ設定グループおよびドーリ設定グループのメタデータの利用目的には共通性がある。このため，本実施形態のように，レンズ設定グループおよびドーリ設定グループを別グループとして構成するのではなく，例えば，１つのレンズ・ドーリ設定グループとして構成し，レンズ設定メタデータおよびドーリ設定メタデータを１つにまとめてグループ化するなどしてもよい。
【００６８】
＜１．３メタデータフォーマット＞
次に，図６に基づいて，本実施形態にかかるメタデータフォーマットについて説明する。なお，図６は，本実施形態にかかるメタデータフォーマットを説明するための説明図である。
【００６９】
上記のように，本実施形態にかかるメタデータは，例えば４つのメタデータグループにグループ化されている。このようにグループ化されたメタデータは，例えば，上記撮像装置１０およびメタデータ付加装置４０等によって，所定のフォーマットで映像信号にフレーム単位で付加される。
【００７０】
より詳細には，図６（ａ）に示すように，上記メタデータは，例えば，映像信号の垂直ブランキング領域内のアンシラリデータ領域等に，アンシラリデータとしてパッケージ化されて１フレーム毎に挿入される。このパッケージ化されたメタデータの例えば伝送時におけるフォーマットを，図６（ｂ）に示す。
【００７１】
図６（ｂ）に示すように，メタデータは，例えば，シーン情報グループ，カメラ設定グループ，レンズ設定グループおよびドーリ設定グループという４つのメタデータグループにグループ化され，この４つのメタデータグループが連続して直列的に配列されたフォーマットを有する。各メタデータグループは，例えば，ＳＭＰＴＥ（ＳｏｃｉｅｔｙｏｆＭｏｔｉｏｎＰｉｃｔｕｒｅａｎｄＴｅｌｅｖｉｓｉｏｎＥｎｇｉｎｅｅｒｓ）規格（ＳＭＰＴＥ２９１Ｍ等）に基づいて，ＫＬＶ（ＫｅｙＬｅｎｇｔｈＶａｌｕｅ）符号化されている。
【００７２】
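"K (Key)" is, for example, a 1-byte key ID (reserved word) placed at the head of each metadata group. This "K" code constitutes the group identification information of this embodiment and serves as a code for identifying each metadata group. For example, in every frame of the video signal, the "K" code may always be "01" for the scene information group, "02" for the camera setting group, "03" for the lens setting group, and "04" for the dolly setting group, so that each metadata group carries a consistent, unique identification code. Because each metadata group is given a "K" code as its unique group identification information, a specific metadata group can easily be extracted from among the multiple metadata groups in each frame based on that information.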
【００７３】
「Ｌ（Ｌｅｎｇｔｈ）」は，例えば，上記「Ｋ」符号の次に付与される例えば１バイトのレングス符号である。この「Ｌ」符号は，本実施形態にかかるデータ量情報として構成されており，後続のメタデータグループのデータ量を表す符号として機能する。例えば，あるフレームのシーン情報グループに付された「Ｌ」が「１２４」であれば，当該フレームにおけるシーン情報グループのデータ量が例えば１２４バイトであることを表す。このように，各メタデータグループのコンテンツの前に，データ量情報である「Ｌ」符号を付与することにより，メタデータの抽出或いは書き換え処理の処理効率が向上する。つまり，メタデータ付加装置４０およびＶＴＲ５０等のメタデータを処理する装置は，上記データ量情報である「Ｌ」符号を参照することにより，これから抽出或いは書き換えしようとするメタデータのデータ量を予め把握できる。このため，当該抽出或いは書き換え処理の処理効率が向上する。
【００７４】
「Ｅｌｅｍｅｎｔ」は，例えば，実際の各メタデータグループのメタデータが格納されるユーザデータ領域（Ｖａｌｕｅ領域）であり，可変長である。
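To make the K/L/Element layout concrete, the following Python sketch walks a byte string of consecutively arranged groups and extracts only the requested ones. It assumes the example encoding from the description (a 1-byte key, a 1-byte length, then the value) and the example key values 01 to 04, with 04 taken to denote the dolly setting group; the function names `iter_klv` and `extract_groups` are hypothetical, and the ancillary packet header and CRC are assumed to have been stripped already.

```python
from typing import Dict, Iterator, Set, Tuple

# Example key values from the description; an actual system may use others.
GROUP_KEYS = {0x01: "scene_info", 0x02: "camera", 0x03: "lens", 0x04: "dolly"}


def iter_klv(payload: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (key, value) for each group in a run of 1-byte-K, 1-byte-L records."""
    pos = 0
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated KLV header")
        key, length = payload[pos], payload[pos + 1]
        start, end = pos + 2, pos + 2 + length
        if end > len(payload):
            raise ValueError("truncated KLV value")
        yield key, payload[start:end]
        pos = end  # L lets us jump straight to the next group


def extract_groups(payload: bytes, wanted: Set[int]) -> Dict[int, bytes]:
    """Return only the groups whose key is in `wanted`."""
    return {key: value for key, value in iter_klv(payload) if key in wanted}
```

Because the length precedes the value, groups that are not wanted are skipped without inspecting their contents, which is the efficiency benefit the text attributes to the "L" code.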
【００７５】
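In front of the KLV-encoded metadata groups are attached flags that define and identify the type of metadata being transmitted: the Ancillary Data Flag, the DID (Data Identification) data ID, the SDID (Secondary Data Identification) secondary data ID, the DC (Data Counter) data counter, and so on. After the metadata groups are attached codes for error detection during transmission, such as a CRC (Cyclic Redundancy Check) and a CHECKSUM.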
【００７６】
ところで，上記ＳＭＰＴＥ規格では，ＫＬＶ符号化したメタデータを映像信号のアンシラリデータ領域にパッキングして挿入する場合には，アンシラリデータの１パケットサイズが２５５バイトとなるように規格化されている。そこで，本実施形態にかかるメタデータフォーマットでは，この規格に適合するように，グループ化されたメタデータのデータ総量が，例えば２５５バイト以下となるように調整されている。具体的には，例えば，シーン情報グループのメタデータ量が例えば１２４バイト以下，カメラ設定グループのメタデータ量が例えば４０バイト以下，レンズ設定グループのメタデータ量が例えば１０バイト以下，ドーリ設定グループのメタデータ量が例えば５２バイト以下，となるように調整されている。このため，アンシラリデータの１つのパケットサイズが，メタデータ総量で例えば約２２６バイト以下となるように設定されている。
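The size budget above can be checked with simple arithmetic. The Python sketch below uses only the example byte counts given in this description: the per-field sizes listed for the camera setting group sum to 40 bytes and those for the lens setting group sum to 10 bytes, and the four group limits sum to 124 + 40 + 10 + 52 = 226 bytes, under the 255-byte packet size. The dictionary and function names are hypothetical, and whether the 226-byte figure includes the 1-byte K and L codes of each group is not stated, so that overhead is not modeled here.

```python
ANC_PACKET_MAX = 255  # one ancillary data packet, per the description

# Example upper limits per metadata group (bytes), from the description.
GROUP_LIMITS = {"scene_info": 124, "camera": 40, "lens": 10, "dolly": 52}

# Example field sizes (bytes) listed for the camera and lens setting groups.
CAMERA_FIELD_SIZES = {
    "camera_id": 4, "chu_switch": 1, "ccu_id": 4, "filter": 2,
    "shutter_speed": 1, "gain": 1, "ecs": 2, "gamma_master": 2,
    "gamma_user": 1, "variable_frame_rate": 1, "white_level": 6,
    "black_level": 8, "detail_level": 2, "knee_point": 2,
    "knee_slope": 2, "recorder_status": 1,
}
LENS_FIELD_SIZES = {"zoom": 2, "focus": 2, "iris": 2, "lens_id": 4}


def check_budget(sizes: dict) -> int:
    """Validate per-group sizes against the limits; return the total size."""
    for name, size in sizes.items():
        if size > GROUP_LIMITS[name]:
            raise ValueError(f"{name} exceeds its {GROUP_LIMITS[name]}-byte limit")
    total = sum(sizes.values())
    if total > ANC_PACKET_MAX:
        raise ValueError("metadata does not fit in one ancillary packet")
    return total
```

Calling `check_budget(GROUP_LIMITS)` returns 226, matching the total quoted in the text.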
【００７７】
このように，本実施形態にかかるメタデータフォーマットでは，全てのメタデータがアンシラリデータの１パケットサイズ（２５５バイト）内に収まるように設定されている。しかし，かかる例に限定されず，例えば，複数のアンシラリデータパケットを連結させて，これら複数のパケットにメタデータを分割してパッキングするようにしてもよい。
【００７８】
以上説明したように，本実施形態にかかるメタデータフォーマットは，例えば，メタデータに割り当てられた領域を，メタデータグループ数に応じて分割し，各分割領域に，各メタデータグループのメタデータを挿入するような構成である。さらに，各メタデータグループの先頭には，上記グループ識別情報およびデータ量情報がそれぞれ付与されている。かかる構成により，メタデータの利用目的に応じて必要なメタデータを，メタデータグループ毎に，迅速かつ容易に検出，抽出または書き換えることができる。例えば，上記グループ識別情報を，映像作品の収録部署と編集部署との間で共通の識別ＩＤとして共用することで，映像作品の制作過程においてメタデータを好適に利用することができる。
【００７９】
＜１．４各装置の構成＞
次に，上記のような映像記録システム１を構成する主要な装置について詳細に説明する。
【００８０】
＜１．４．１撮像装置＞
まず，図７に基づいて，本実施形態にかかる撮像装置１０について詳細に説明する。なお，図７は，本実施形態にかかる撮像装置１０の構成を示すブロック図である。
【００８１】
図７に示すように，撮像装置１０は，例えば，ＣＰＵ１００と，メモリ部１０２と，撮像部１０４と，信号処理部１０６と，表示部１０８と，カメラ設定メタデータ生成部１１０と，メタデータ付加部１１２と，送受信部１１４と，レンズ装置１２と，ドーリ装置１４とを備える。
【００８２】
ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）１００は，演算処理装置および制御装置として機能し，撮像装置１０の各部の処理を制御することができる。また，メモリ部１０２は，例えば，各種のＲＡＭ，ＲＯＭ，フラッシュメモリ，ハードディスクなどの記憶装置などで構成されており，ＣＰＵ１００の処理に関する各種データ，およびＣＰＵ１００の動作プログラム等を記憶または一時記憶する機能を有する。
【００８３】
撮像部１０４は，例えば，ＯＨＢ（Ｏｐｔｉｃａｌｈｅａｄｂａｓｅ）などで構成されており，被写体を撮像して映像信号を生成する機能を有する。詳細には，この撮像部１０４は，例えば，レンズ装置１２から入射された光学像を，プリズム（図示せず。）によりＲ・Ｇ・Ｂに分光し，各種のフィルタ（図示せず。）を透過させた上で，ＣＣＤ（ＣｈａｒｇｅＣｏｕｐｌｅｄＤｅｖｉｃｅ）等の撮像デバイス（図示せず。）により所定のシャッタスピードで光電変換して，アナログ電気信号である映像信号を生成する。
【００８４】
信号処理部１０６は，撮像部１０４から入力された微弱なアナログ電気信号である映像信号に対して，ゲイン調整（ＡＧＣ）処理，相関２重サンプリング処理，Ａ／Ｄ変換処理，エラー補正処理，ホワイトバランス調整処理，ダイナミックレンジ圧縮処理，ガンマ補正処理，シェーディング補正処理，ディテール調整処理，ニー処理などを施して，デジタル映像信号を出力することができる。なお，本実施形態では，例えば，ＨＤ（ＨｉｇｈＤｅｆｉｎｉｔｉｏｎ）デジタル映像信号を生成・出力するよう構成されている。また，この信号処理部１０６は，例えば，上記デジタル映像信号をアナログ映像信号に変換して，表示部１０８に出力することもできる。また，この信号処理部１０６は，例えば，予め設定された条件に基づいて，或いはカメラマンの入力操作に基づいて，出力する映像信号のフレームレートを変化（例えば２３．９８〜３０Ｐ）させることができる。
【００８５】
また，表示部１０８は，例えば，カメラマンが被写体を見るためのビューファインダーなどであり，ＣＲＴモニタなどで構成されている。この表示部１０８は，上記信号処理部１０６から入力された例えばアナログ映像信号を表示出力することができる。なお，この表示部１０８は，例えば，ＬＣＤモニタなどの各種のディスプレイ装置などで構成されてもよい。
【００８６】
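The camera setting metadata generation unit 110 acquires and manages parameters such as the setting information of the imaging unit 104 and the setting information for signal processing in the signal processing unit 106 (gamma, knee, detail, and so on). Based on these parameters, the camera setting metadata generation unit 110 generates the camera setting metadata, for example for each frame of the video signal, and outputs it to the metadata adding unit 112.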
【００８７】
メタデータ付加部１１２は，本実施形態にかかるメタデータ付加装置の１つとして構成されている。このメタデータ付加部１１２は，例えば，撮像装置１０の外部へ映像信号を出力するタイミングにあわせて，カメラ設定メタデータを当該映像信号に１フレーム毎に付加することができる。具体的には，このメタデータ付加部１１２は，例えば，カメラ設定メタデータ生成部１１０から入力されたカメラ設定メタデータを，ＫＬＶ符号化してパッキングする。さらに，メタデータ付加部１１２は，このパッキングしたカメラ設定メタデータを，図８（ａ）に示すように，映像信号のブランキング領域のうちカメラ設定グループに割り当てられている領域に，１フレーム毎に順次，挿入する。
【００８８】
このとき，メタデータ付加部１１２は，図８（ａ）に示すように，例えば，カメラ設定グループ以外の，シーン情報グループ，レンズ設定グループおよびドーリ設定グループに対応する領域には，ダミーデータを挿入しておくことができる。
【００８９】
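The camera setting metadata generation unit 110 and the metadata adding unit 112 described above may be implemented as hardware, or as software that realizes the above processing functions, in which case the program is stored in the memory unit 102 and the CPU 100 performs the actual processing.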
【００９０】
送受信部１１４は，例えば，上記のようにしてカメラ設定メタデータが付加された映像信号を，光ファイバケーブルを介してＣＣＵ２０に送信する。
【００９１】
The lens device 12 includes, for example, an optical system block 122, a drive system block 124, and a lens setting metadata generation unit 126.
【００９２】
光学系ブロック１２２は，例えば，複数枚のレンズ，絞りなどからなり，被写体からの光学像を撮像部１０４に入射させることができる。駆動系ブロック１２４は，例えば，光学系ブロック１２２のレンズ間距離や絞りを調整するなどして，ズーム，アイリス，フォーカスなどを調整することができる。
【００９３】
レンズ設定メタデータ生成部１２６は，例えば，上記駆動系ブロック１２４のレンズ設定情報等のパラメータを取得して管理している。さらに，レンズ設定メタデータ生成部１２６は，かかるパラメータに基づいて，上記レンズ設定メタデータを例えば１フレーム毎に生成する。このようにして生成されたレンズ設定メタデータは，例えば，ＲＳ−２３２Ｃケーブルを介して，メタデータ付加装置４０に出力される。
【００９４】
ドーリ装置１４は，例えば，ドーリ計測部１４２と，ドーリ設定メタデータ生成部１４４とを備える。
【００９５】
ドーリ計測部１４２は，例えば，ＧＰＳ情報，ドーリ装置１４の移動速度や向き，撮像装置１０のアングルなどといった，ドーリ装置１４に関する各種の設定情報を計測して，ドーリ設定メタデータ生成部１４４に出力する。
【００９６】
ドーリ設定メタデータ生成部１４４は，例えば，上記ドーリ計測部１４２からの計測情報に基づいて，上記ドーリ設定メタデータを，例えば１フレーム毎に生成する。このようにして生成されたドーリ設定メタデータは，例えば，ＲＳ−２３２Ｃケーブルを介して，メタデータ付加装置４０に出力される。
【００９７】
＜１．４．２カメラコントロールユニット＞
次に，図９に基づいて，本実施形態にかかるＣＣＵ２０について詳細に説明する。なお，図９は，本実施形態にかかるＣＣＵ２０の構成を示すブロック図である。
【００９８】
図９に示すように，ＣＣＵ２０は，例えば，ＣＰＵ２００と，メモリ部２０２と，送受信部２０４と，信号処理部２０６と，シリアライザ２０８と，を備える。
【００９９】
ＣＰＵ２００は，演算処理装置および制御装置として機能し，ＣＣＵ２０の各部の処理を制御することができる。このＣＰＵ２００には，リファレンス信号が入力されており，映像記録システム１内の他の装置との間で，映像信号の同期をとることができる。また，メモリ部２０２は，例えば，各種のＲＡＭ，ＲＯＭ，フラッシュメモリ，ハードディスクなどの記憶装置などで構成されており，ＣＰＵ２００の処理に関する各種データ，およびＣＰＵ２００の動作プログラム等を記憶または一時記憶する機能を有する。
【０１００】
送受信部２０４は，例えば，撮像装置１０からカメラ設定メタデータが付加された映像信号を受信し，信号処理部２０６に送信する。
【０１０１】
信号処理部２０６は，例えば，光信号として入力された映像信号を，ＨＤＳＤＩ信号に変換処理して，シリアライザ２０８に出力する。なお，この信号処理部２０６は，上記撮像装置１０の信号処理部１０６の処理機能を具備するように構成することもできる。
【０１０２】
シリアライザ２０８は，例えば，信号処理部２０６から受け取った映像信号をパラレル−シリアル変換して，ＨＤＳＤＩケーブルを介してメタデータ付加装置４０に送信する。なお，このＣＣＵ２０が出力する映像信号のブランキング領域には，図８（ａ）に示したように，例えば，カメラ設定グループに対応する領域にのみ，実際のメタデータが挿入されており，その他のメタデータグループの領域にはダミーデータが挿入されている。
【０１０３】
＜１．４．３メタデータ付加装置＞
次に，図１０に基づいて，本実施形態にかかるメタデータ付加装置４０について詳細に説明する。なお，図１０は，本実施形態にかかるメタデータ付加装置４０の構成を示すブロック図である。
【０１０４】
図１０に示すように，メタデータ付加装置４０は，例えば，ＣＰＵ４００と，メモリ部４０２と，メタデータパッキング部４０４と，メタデータエンコーダ４０６と，デシリアライザ４０８と，メタデータ挿入部４１０と，シリアライザ４１２と，を備える。
【０１０５】
ＣＰＵ４００は，演算処理装置および制御装置として機能し，メタデータ付加装置４０の各部の処理を制御することができる。このＣＰＵ４００には，リファレンス信号が入力されており，映像記録システム１内の他の装置との間で，映像信号の同期をとることができる。また，このＣＰＵ４００には，タイムコード信号（ＬＴＣ）が入力されており，このＬＴＣに基づいてシーン情報メタデータの１つであるタイムコード情報を生成して，メモリ部４０２に記憶させることができる。また，かかるＬＴＣをＶＴＲ５０に出力することもできる。
【０１０６】
また，メモリ部４０２は，例えば，各種のＲＡＭ，ＲＯＭ，フラッシュメモリ，ハードディスクなどの記憶装置などで構成されており，ＣＰＵ４００の処理に関する各種データ，およびＣＰＵ４００の動作プログラム等を記憶または一時記憶する機能を有する。また，このメモリ部４０２は，例えば，各装置から送信されてきたメタデータを一時的に記憶するためのメタデータバッファメモリ４０３を具備している。
【０１０７】
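The metadata buffer memory 403 stores, for example, the lens setting metadata sent sequentially from the lens device 12 after shooting starts, the dolly setting metadata sent sequentially from the dolly device 14 after shooting starts, the scene information metadata acquired in advance from the metadata input terminal device 30 before shooting starts, and the timecode information input from the CPU 400.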
【０１０８】
メタデータパッキング部４０４は，例えば，上記メタデータバッファメモリ４０３に蓄えられている各種のメタデータの中から必要なメタデータを抽出し，その利用目的ごとに，レンズ設定グループ，ドーリ設定グループ，シーン情報グループなどといった複数のメタデータグループにグループ化して，上記ＫＬＶの構造にパッキングし直す。メタデータパッキング部４０４は，このようにパッキングしたメタデータをメタデータエンコーダ４０６に出力する。
【０１０９】
メタデータエンコーダ４０６は，上記メタデータパッキング部４０４からのメタデータをエンコードする。上記のようにしてメタデータ付加装置４０に入力されてくるメタデータは，例えばＲＳ−２３２Ｃのプロトコル形式のデータである。このため，メタデータエンコーダ４０６は，例えば，このメタデータを，ＨＤＳＤＩ形式の映像信号へ挿入できるように，アンシラリデータパケット形式にフォーマット変換して符号化する（図６参照）。この符号化により，例えば，メタデータの前後には，上記説明したような各種のフラグやＣＲＣなどが付される。
【０１１０】
デシリアライザ４０８は，ＣＣＵ２０から入力された映像信号をシリアル−パラレル変換して，メタデータ挿入部４１０に出力する。
【０１１１】
メタデータ挿入部４１０は，上記メタデータエンコーダ４０６から入力されたメタデータを，上記デシリアライザ４０８から入力されてくる映像信号のブランキング領域に，１フレーム毎に順次挿入していく。
【０１１２】
このとき，メタデータ挿入部４１０に入力されてくる映像信号において，例えば，ブランキング領域のうちカメラ設定グループに対応する領域には，図８（ａ）に示すように，上記撮像装置１０によって予めカメラ設定グループのカメラ設定メタデータが挿入された状態となっている。
【０１１３】
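The regions corresponding to the scene information group, the lens setting group, and the dolly setting group, on the other hand, contain dummy data. As shown in Fig. 8(b), the metadata insertion unit 410 therefore inserts the metadata into the video signal by overwriting this dummy data with the actual scene information metadata, lens setting metadata, dolly setting metadata, and so on. During this rewriting, the metadata insertion unit 410 locates and rewrites each region based on, for example, the group identification information "K" and the data amount information "L" attached to the region of each metadata group, so the rewriting can be performed efficiently. When inserting the metadata in this way, the metadata insertion unit 410 can also, for example, align the phase of the delay between the metadata being inserted and the video signal.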
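The dummy-data rewriting described above might look like the following Python sketch, which locates a group by its "K" code, uses its "L" code to find the extent of the region, and overwrites the value in place. It assumes the 1-byte K and 1-byte L example encoding, and it additionally assumes that each region keeps its pre-allocated length, with shorter values padded with zero bytes; the description does not specify padding, and `replace_group` is a hypothetical name.

```python
def replace_group(payload: bytes, key: int, new_value: bytes) -> bytes:
    """Overwrite the value of the group identified by `key`, keeping its length."""
    buf = bytearray(payload)
    pos = 0
    while pos + 2 <= len(buf):
        group_key, length = buf[pos], buf[pos + 1]
        start, end = pos + 2, pos + 2 + length
        if end > len(buf):
            raise ValueError("truncated KLV value")
        if group_key == key:
            if len(new_value) > length:
                raise ValueError("new value does not fit the reserved region")
            # Assumption: pad with zeros so the region size (and L) is unchanged.
            buf[start:end] = new_value.ljust(length, b"\x00")
            return bytes(buf)
        pos = end  # skip this group using its L code
    raise KeyError(f"group key {key:#04x} not found")
```

Keeping the region size fixed means the groups that follow do not move, so the camera setting metadata already inserted by the imaging device is left untouched.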
【０１１４】
シリアライザ４１２は，上記のようにしてメタデータ挿入部４１０によってメタデータが１フレーム毎に付加された映像信号を，パラレル−シリアル変換して，ＶＴＲ５０に送信する。
【０１１５】
このように，本実施形態にかかるメタデータ付加装置４０は，例えば，予めカメラ設定メタデータが付加されている映像信号に対し，さらに，シーン情報メタデータ，レンズ設定メタデータおよびドーリ設定メタデータを追加して付加することができる。
【０１１６】
＜１．４．４ビデオテープレコーダ＞
次に，図１１に基づいて，本実施形態にかかるＶＴＲ５０について詳細に説明する。なお，図１１は，本実施形態にかかるＶＴＲ５０の構成を示すブロック図である。
【０１１７】
図１１に示すように，ＶＴＲ５０は，例えば，ＣＰＵ５００と，メモリ部５０２と，デシリアライザ５０４と，信号処理部５０６と，メタデータデコーダ５０８と，記録再生ブロック５１０と，ＥＣＣブロック５１２と，メタデータエンコーダ５１４と，シリアライザ５１６と，を備える。
【０１１８】
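The CPU 500 functions as an arithmetic processing unit and a control unit, and can control the processing of each part of the VTR 50. A timecode signal (LTC) is input to the CPU 500. The memory unit 502 consists of storage devices such as various kinds of RAM, ROM, flash memory, and hard disks, and stores, permanently or temporarily, various data related to the processing of the CPU 500 as well as the operating programs of the CPU 500.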
【０１１９】
デシリアライザ５０４は，メタデータ付加装置４０から入力された映像信号をシリアル−パラレル変換して，信号処理部５０６に出力する。
【０１２０】
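The signal processing unit 506 can, for example, apply various kinds of processing to the video signal so that the video signal and the like can be suitably recorded on and reproduced from the video tape 52. For example, the signal processing unit 506 can compress and decompress the video signal as needed, based on MPEG (Moving Picture Experts Group phase) 1, MPEG 2, MPEG 4, a DCT (Discrete Cosine Transform) scheme, or the like. The signal processing unit 506 can also, for example, align the recording/reproduction timing of each signal, or separate the video signal and the audio signal and attach an ECC (Error Correcting Code) to them. In addition, the signal processing unit 506 can, for example, extract the metadata added to the video signal frame by frame or, conversely, insert decoded metadata into the video signal frame by frame.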
【０１２１】
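The signal processing unit 506 can, for example, output the video signal input from the metadata adding device 40 to the serializer 516 as is, or output the video signal reproduced from the video tape 52 to the serializer 516.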
【０１２２】
メタデータデコーダ５０８は，例えば，映像信号から取り出されたメタデータをデコードする。具体的には，メタデータデコーダ５０８は，例えば，記録する上で不要な，当該メタデータに付与されているフラグ（Ｆｌａｇ，ＤＩＤ，ＳＤＩＤ等）およびＣＲＣ等を取り除いて，ＣＰＵ５００に出力する。ＣＰＵ５００は，例えば，このメタデータに上記映像信号と同様にＥＣＣを付与して，記録再生ブロック５１０に出力する。
【０１２３】
記録再生ブロック５１０は，例えば，ビデオヘッドおよび駆動メカニズム（いずれも図示せず。）等から構成されている。この記録再生ブロック５１０は，メタデータが付加された映像信号をビデオテープ５２に対して実際に記録／再生することができる。より詳細には，記録再生ブロック５１０は，例えば，映像信号，音声信号およびメタデータを１フレーム単位でセットにして，ビデオテープ５２の記録エリアに順次，記録していくことができる。また，この記録再生ブロック５１０は，例えば，ビデオテープ５２の記録エリアに記録されている映像信号，音声信号およびメタデータを１フレーム単位でセットにして，順次，再生することができる。
【０１２４】
ＥＣＣブロック５１２は，例えば，上記ＥＣＣに基づいて，記録再生ブロック５１０によってビデオテープ５２から再生された映像信号等の誤り検出を行う。このＥＣＣブロック５１２は，誤り検出完了後に，例えば，再生されたメタデータをＣＰＵ５００に，映像信号及び音声信号を信号処理部５０６に出力する。
【０１２５】
メタデータエンコーダ５１４は，再生されたメタデータを伝送用のフォーマットにエンコード（上記フラグ，ＣＲＣ等を付与）して，信号処理部５０６に出力する。信号処理部５０６は，例えば，ＥＣＣブロック５１２から入力された映像信号及び音声信号と，上記メタデータエンコーダ５１４によってエンコードされたメタデータとを合わせて，シリアライザ５１６に出力する。
【０１２６】
シリアライザ５１６は，信号処理部５０６から入力された映像信号等を，パラレル−シリアル変換して，メタデータ合成装置６０に送信する。
【０１２７】
なお，上記のように，信号処理部５０６，メタデータデコーダ５０８，ＣＰＵ５００および記録再生ブロック５１０などは，本実施形態にかかる記録部として構成されており，メタデータが付加された映像信号を記憶媒体に記録することができる。
【０１２８】
＜１．４．５メタデータ合成装置＞
次に，図１２に基づいて，本実施形態にかかるメタデータ合成装置６０について詳細に説明する。なお，図１２は，本実施形態にかかるメタデータ合成装置６０の構成を示すブロック図である。
【０１２９】
図１２に示すように，メタデータ合成装置６０は，例えば，ＣＰＵ６００と，メモリ部６０２と，デシリアライザ６０４と，メタデータ抽出部６０６と，メタデータデコーダ６０８と，メタデータ映像生成部６１０と，メタデータ映像合成部６１２と，シリアライザ６１４と，を備える。
【０１３０】
ＣＰＵ６００は，演算処理装置および制御装置として機能し，メタデータ合成装置６０の各部の処理を制御することができる。また，メモリ部６０２は，例えば，各種のＲＡＭ，ＲＯＭ，フラッシュメモリ，ハードディスクなどの記憶装置などで構成されており，ＣＰＵ６００の処理に関する各種データ，およびＣＰＵ６００の動作プログラム等を記憶または一時記憶する機能を有する。
【０１３１】
デシリアライザ６０４は，ＶＴＲ５０から入力された映像信号をシリアル−パラレル変換して，メタデータ抽出部６０６に出力する。
【０１３２】
メタデータ抽出部６０６は，例えば，映像信号のブランキング領域に挿入されているメタデータを１フレーム毎に抽出する。このとき，メタデータ抽出部６０６は，例えば，ブランキング領域に挿入されている全てのメタデータを抽出するのではなく，例えば，特定のメタデータグループ（例えばシーン情報グループ）のメタデータのみを抽出したり，さらに，当該メタデータグループ内の特定のメタデータ（例えば，タイムコード，シーン番号，テイク番号）のみを抽出したりするようにしてもよい。なお，かかるメタデータの抽出処理時には，メタデータ抽出部６０６は，各メタデータグループに付与されているグループ識別情報「Ｋ」およびデータ量情報「Ｌ」に基づいて，抽出しようとするメタデータグループの位置およびデータ量を把握できるので，必要なメタデータの抽出処理を効率的に行うことができる。
【０１３３】
メタデータ抽出部６０６は，例えば，このようにして抽出したメタデータをメタデータデコーダ６０８に出力する一方，映像信号はそのままの状態でメタデータ映像合成部６１２に出力する。
【０１３４】
メタデータデコーダ６０８は，例えば，メタデータ抽出部６０６から入力されたメタデータをデコードし，メタデータ映像生成部６１０に出力する。
【０１３５】
メタデータ映像生成部６１０は，例えば，メタデータデコーダ６０８から入力されたメタデータを，スーパーインポーズするために映像データに書き換えることができる。即ち，メタデータデコーダ６０８でデコードされたメタデータは，例えばテキストデータ形式のメタデータであるので，メタデータ映像生成部６１０は，このメタデータを映像データ形式に変換する。
【０１３６】
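The metadata video combining unit 612 can, for example, combine the metadata that the metadata video generation unit 610 has converted into video data with the video signal input from the metadata extraction unit 606, sequentially and frame by frame. In other words, the metadata video combining unit 612 can, for example, multiplex the metadata converted into video data onto the video signal frame by frame and superimpose it.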
【０１３７】
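The serializer 614 performs parallel-to-serial conversion on the video signal and the like input from the metadata video combining unit 612 and transmits the result to the display device 70.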
【０１３８】
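In this way, the metadata combining device 60 can take the metadata inserted in the blanking region out of a video signal that is being shot by the imaging device 10, or out of a video signal reproduced by the VTR 50, and superimpose it on that video signal. As a result, the display device 70 that receives the video signal can display video on which the metadata has been superimposed or otherwise combined.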
【０１３９】
これにより，ディレクタ等は，例えば，撮像装置１０によって収録中の映像，あるいは収録後にＶＴＲ５０で再生された映像を，当該映像に関するメタデータとともに閲覧することができる。このため，例えば，タイムコード，シーン番号，テイク番号等がスーパーインポーズ表示されている場合には，ディレクタ等は，当該映像がいかなるシーンの，いかなるテイクの，いかなる時間のものであるかなどを，映像を見ながら容易に識別，確認することができる。
【０１４０】
＜１．５映像記録方法＞
次に，図１３に基づいて，上記のような映像記録システム１を用いた本実施形態にかかる映像記録方法について説明する。なお，図１３は，本実施形態にかかる映像記録方法を説明するためのタイミングチャートである。
【０１４１】
図１３（ａ）に示すように，撮影が開始されると，まず，撮像装置１０には，生の映像が，順次，入射される。すると，撮像装置１０は，０フレーム，１フレーム，２フレーム，…とフレーム単位で映像信号を順次生成していく。このとき，撮像装置１０のＣＣＤ等は，例えば，当該映像を例えばプログレッシブ方式でスキャンする。このため，撮像装置１０の出力する映像信号は，撮像装置１０に入射された生の映像に対して，例えば１フレーム程度の遅延が生ずる。この結果，図１３（ｂ）に示すように，ＣＣＵ２０の出力も例えば１フレーム程度遅延する。
【０１４２】
また，上記映像信号の生成と略同時に，撮像装置１０は，カメラ設定メタデータを１フレーム毎に生成し，図１３（ｂ）に示すように，対応するフレームの映像信号のブランキング領域に１フレーム毎に順次挿入していく。これにより，撮像装置１０は，撮像処理を実行して映像信号を生成しながら，当該映像信号に対して，カメラ設定グループのメタデータをフレーム単位で付加することができる。
【０１４３】
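In parallel with this shooting by the imaging device 10, the lens device 12 and the dolly device 14 collect the setting information in effect during shooting, generate the lens setting metadata and the dolly setting metadata, for example for each frame, and output them sequentially to the metadata adding device 40.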
【０１４４】
さらに，ＣＣＵ２０には，上記撮像装置１０によって生成され，カメラ設定メタデータが１フレーム毎に付加された映像信号が，順次，入力されてくる。ＣＣＵ２０は，図１３（ｂ）に示すように，この映像信号をメタデータ付加装置４０に順次出力していく。
【０１４５】
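As shown in Fig. 13(c), the metadata adding device 40 sequentially inserts the scene information metadata, the lens setting metadata, and the dolly setting metadata, frame by frame, into the blanking region of the video signal input from the CCU 20. The metadata adding device 40 also adds, for example, timecode information to each frame of the video signal as one item of scene information metadata. In this way, in parallel with the shooting by the imaging device 10, the metadata adding device 40 can add metadata grouped by purpose of use to the video signal frame by frame.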
【０１４６】
さらに，ＶＴＲ５０には，図１３（ｄ）に示すように，例えば，メタデータ付加装置４０から，メタデータが付加された映像信号が順次入力されるとともに，集音装置１８から音声信号が順次入力されてくる。この音声信号は，例えば，一旦メモリ部５０２に貯蔵され，当該映像信号の遅延に合わせて映像信号に同期して記録される。ＶＴＲ５０は，当該映像信号のメタデータをデコードした上で，当該映像信号および同期された音声信号とともに，ビデオテープ５２等にフレーム単位で記録していく。
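The buffering of the audio signal to match the delayed video, as described above, can be pictured with a small queue. The following Python sketch is a simplified model under stated assumptions: one audio chunk arrives per frame period, the video delay is constant, and a missing video frame at a given tick is represented by `None`. The generator name `record_in_sync` and this tick-based model are illustrative only and are not taken from the description.

```python
from collections import deque
from typing import Iterable, Iterator, Optional, Tuple


def record_in_sync(
    ticks: Iterable[Tuple[str, Optional[str]]]
) -> Iterator[Tuple[str, str]]:
    """Pair each delayed video frame with the audio chunk captured with it.

    `ticks` yields (audio_chunk, video_frame_or_None) once per frame period.
    Audio is held in a FIFO (standing in for the memory unit) until the
    corresponding video frame arrives.
    """
    pending = deque()
    for audio, video in ticks:
        pending.append(audio)
        if video is not None:
            # The oldest buffered audio belongs to the frame that just arrived.
            yield video, pending.popleft()
```

With a one-frame video delay, the first tick produces no output, and each later tick emits a video frame together with the audio captured one frame period earlier.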
【０１４７】
以上のように，本実施形態にかかる映像記録方法では，例えば，撮像装置１０による撮影処理を実行しながら，各種のメタデータを生成してグループ化し，当該撮影処理によって生成された映像信号に対して上記グループ化されたメタデータをフレーム単位で付加して，記憶媒体に記録することができる。
【０１４８】
以上説明したように，上記映像記録システム１及びこれを用いた映像記録方法によれば，撮像装置１０によって生成された映像信号に対して，撮影処理中にリアルタイムで，映像信号に関連するメタデータをフレーム単位で付加して，同一記憶媒体に記録することができる。このため，従来のように，ＰＣ等の端末装置内に記録されたメタデータと，記憶媒体中に記録された映像素材とを，タイムコード等で間接的にリンクする必要がなくなり，映像素材とその映像素材に関するメタデータとを，直接的にリンクさせて記録することができる。従って，映像素材とメタデータを一体的に管理することができるので便利である。また，メタデータの抽出時に映像素材とメタデータの整合性をとる必要が無いので，必要なメタデータを効率的に抽出して利用したり，書き換えたりできるようになる。
【０１４９】
＜２映像編集装置＞
次に，本実施形態にかかる映像編集装置およびその処理方法について説明する。
【０１５０】
＜２．１映像編集装置＞
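First, an overview of the video editing apparatus according to this embodiment is given. The video editing apparatus is an apparatus for editing the video material shot and recorded as described above, and is installed in, for example, a studio that houses an editing department. The editing department is, for example, the department that edits the video material shot and recorded by the video recording system 1 and completes a complete package program (hereinafter, a "complete package"). Editing includes, for example, rough editing, post-processing, and final editing of the video material.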
【０１５１】
粗編集処理は，多様な映像素材の中から，映像作品を構成する映像を切り出す処理である。具体的には，まず，上記映像記録システム１によって収録された複数の映像素材の中から必要な映像素材を収集し，次いで，収集された映像素材の中から必要な映像部分を選択して，開始位置（Ｉｎ点）および終了位置（Ｏｕｔ点）のタイムコード等を決定して，必要な映像部分を特定する処理である。
【０１５２】
後処理は，撮影・収録された実写の映像素材に対して，映像合成処理または映像補正処理等を施して，実写の映像素材の内容を変更する処理である。
【０１５３】
また，本編集処理は，粗編集や後処理を経た映像素材のオリジナルデータを繋ぎ合わせ，最終的な画質調整等を施し，番組などで放映するための完全パッケージデータを作成する処理である。
【０１５４】
本実施形態にかかる映像編集装置３は，上記編集処理のうち，例えば，後処理を行うための装置として構成されている。この映像編集装置３が行う後処理は，例えば，上記映像記録システム１によって撮影・収録された実写の映像素材に対して，映像合成処理若しくは映像補正処理等を施す処理である。
【０１５５】
より詳細には，映像合成処理は，例えば，フォアグラウンドの映像とバックグラウンドの映像とを合成する処理である。このように合成する映像のうち，少なくともいずれかの映像としてコンピュータグラフィックス映像（以下では，ＣＧ映像という。）を用いるのが，ＣＧ映像合成処理である。ＣＧ映像合成処理の具体例としては，例えば，フォアグラウンドである人物の実写映像と，バックグラウンドである架空背景のＣＧ映像とを合成する処理や，逆に，フォアグラウンドである架空人物のＣＧ映像と，バックグラウンドである背景の実写映像とを合成する処理などが挙げられる。
【０１５６】
また，映像補正処理は，実写の映像素材に対して各種の補正を施す処理である。具体的には，例えば，映像素材の明るさを変えるなどして，昼間の映像を夕方の映像にしたり，春の映像を秋の季節の映像にしたりする処理や，実写の映像には無いものを付加したり，実写の映像にあるものを変更／削除したりする処理（例えば，人物の髪型を変更する）などが挙げられる。
【０１５７】
なお，映像編集装置３は，このような後処理以外にも，例えば，粗編集処理及び／又は本編集処理などを実行できるように構成されてもよい。
【０１５８】
次に，図１４に基づいて，上記のような後処理等を実行するための映像編集装置の全体構成について説明する。なお，図１４は，本実施形態にかかる映像編集装置３の概略的な構成を示すブロック図である。
【０１５９】
図１４に示すように，本実施形態にかかる映像編集装置３は，例えば，再生用ＶＴＲ９０と，合成用映像サーバ９２と，編集用端末装置８０と，から主に構成されている。
【０１６０】
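The playback VTR 90 is, for example, a video tape recorder capable of recording video signals on and reproducing them from a storage medium such as the video tape 52, and constitutes the video signal reproducing device of this embodiment. The internal configuration of the playback VTR 90 is substantially the same as that of the VTR 50 described with reference to Fig. 11, so its description is omitted.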
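[0161]
A video tape 52 holding live-action video material shot and recorded by the video recording system 1 is loaded into the playback VTR 90, for example. As described above, grouped metadata is added in units of frames to the video signal recorded on this video tape 52. The playback VTR 90 can play back the live-action video material recorded on the video tape 52 and output it to the editing terminal device 80. More specifically, the playback VTR 90 can play back from the video tape 52 the video signal (live-action video material) to which the various metadata has been added in units of frames as described above, and/or the audio signal. The playback VTR 90 can then output the played-back live-action video signal and the like to the editing terminal device 80 via an HD SDI cable or the like.
[0162]
The main function of the playback VTR 90 is thus to play back video signals from the video tape 52 and supply them to the editing terminal device 80, but it is not limited to this example; it may of course also be configured so that video signals and the like input from the editing terminal device 80 or elsewhere can be recorded on the video tape 52.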
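[0163]
The composition video server 92 is, for example, a server for storing composition video signals, and includes a recording device such as a hard disk drive capable of storing a large volume of video signals. The composition video server 92 can output the stored composition video signals to the editing terminal device 80 as needed. It may be configured as a server device consisting of a personal computer, a recording device, and the like separate from the editing terminal device 80, or as a disk device or the like built into the editing terminal device 80.
[0164]
The composition video signal stored in the composition video server 92 is, for example, a video signal to be composited with the video signal played back by the playback VTR 90. In the present embodiment, it is, for example, a CG video signal of moving or still images consisting of fictional charts, figures, line drawings, pictures, and the like created using a computer. Such a CG video signal is generated by, for example, a computer on which CG creation software is installed, and is stored in the composition video server 92 in advance. Various metadata related to the CG video signal (for example, the gamma, detail, and knee level of the CG image) is also added to the CG video signal in units of frames when the CG is created.
[0165]
The composition video signal is not limited to the CG video signal example and may of course be, for example, a live-action video signal. This makes it possible to composite live-action videos with each other.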
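[0166]
The editing terminal device 80 is composed of, for example, an information processing device such as a personal computer and its peripheral devices. It can acquire the played-back live-action video signal from the playback VTR 90 and the CG video signal from the composition video server 92.
[0167]
The editing terminal device 80 can also extract the metadata added to the live-action video signal. In doing so, it can not only extract metadata one whole metadata group at a time, but also extract only specified metadata within a metadata group based on preset extraction conditions or the like.
[0168]
Furthermore, the editing terminal device 80 can display the extracted metadata in frame-by-frame synchronization with the video signal. The operator of the editing terminal device 80 can thus operate the device to apply various kinds of post-processing to the live-action video while viewing the metadata displayed together with the video. Specifically, the editing terminal device 80 can perform, for example, video composition processing such as compositing the live-action video signal with the CG video signal, and video correction processing that corrects the live-action video signal.
[0169]
The video signal post-processed by the editing terminal device 80 in this way is recorded in, for example, a recording device inside the editing terminal device 80. It is not limited to this example, however; the post-processed video signal may instead be recorded on a storage medium such as a new video tape by a recording VTR (not shown) connected to the editing terminal device 80, or recorded on a post-processing video server (not shown) connected to the editing terminal device 80.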
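[0170]
<2.2 Configuration of editing terminal device>
Next, the configuration of the editing terminal device 80 according to the present embodiment will be described in detail with reference to FIG. 15. FIG. 15 is a block diagram showing a schematic configuration of the editing terminal device 80 according to the present embodiment.
[0171]
As shown in FIG. 15, the editing terminal device 80 includes a CPU 800, a memory unit 802, an input unit 804, a display unit 806, an audio output unit 808, an external interface 810, a recording device 811, a metadata extraction unit 812, a display control unit 814, a video composition processing unit 816, and a video correction processing unit 818.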
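[0172]
The CPU 800 functions as an arithmetic processing unit and a control unit, and can control the processing of each part of the editing terminal device 80. The memory unit 802 is composed of, for example, RAM, ROM, and flash memory, and stores various data related to the processing of the CPU 800, the operating programs of the CPU 800, and the like.
[0173]
The input unit 804 is composed of, for example, general PC input devices (not shown) such as a mouse, a keyboard, and a touch panel, and a video editing input device (not shown). The video editing input device includes, for example, various editing buttons such as play, stop, rewind, and fast-forward buttons, and a jog dial, lever, and the like for adjusting the playback speed and selecting the video material to be played. The operator of the editing terminal device 80 can, for example, operate the video editing input device to play back and display video signals in various ways and to post-process them.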
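[0174]
The display unit 806 is a display device, such as a CRT monitor or an LCD monitor. It can display, for example, video signals and the metadata corresponding to them. The display unit 806 can display the video signal played back by the playback VTR 90 and the metadata extracted from that video signal in frame-by-frame synchronization; details will be described later.
[0175]
The audio output unit 808 is composed of, for example, a sounding device such as a speaker, an audio signal processing device, and the like, and can output audio based on the audio signal played back by the playback VTR 90.
[0176]
The external interface 810 is the part that exchanges data with peripheral devices connected to the editing terminal device 80 via an interface such as HD SDI, RS-232C, USB, or SCSI. These peripheral devices are, for example, the playback VTR 90 and the composition video server 92.
[0177]
The recording device 811 is a storage device composed of, for example, a hard disk drive, and can store post-processed video signals, metadata, various programs, and the like.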
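[0178]
The metadata extraction unit 812 can extract, from the live-action video signal played back by the playback VTR 90, the metadata inserted frame by frame into the blanking area of that signal. In doing so, rather than extracting all the metadata inserted in the blanking area, it can extract only the metadata of a specific metadata group, or only specific metadata within that group. During extraction, the metadata extraction unit 812 can determine the position and data amount of the metadata group to be extracted from the group identification information "K" and the data amount information "L" attached to each metadata group, so the necessary metadata can be extracted efficiently.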
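The role of the group identification information "K" and the data amount information "L" in skipping unwanted groups can be sketched as follows. This is only an illustration: the one-byte widths of K and L, the group ID values, and the function name `extract_groups` are assumptions made for the example and are not taken from the specification.

```python
from typing import Dict, Iterable, Optional


def extract_groups(payload: bytes,
                   wanted: Optional[Iterable[int]] = None) -> Dict[int, bytes]:
    """Walk the K/L/data records of one frame and return the wanted groups.

    Assumed layout (hypothetical): 1-byte K, 1-byte L, then L bytes of data.
    """
    wanted_set = None if wanted is None else set(wanted)
    found: Dict[int, bytes] = {}
    pos = 0
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated group header")
        key = payload[pos]         # K: which metadata group this is
        length = payload[pos + 1]  # L: how many data bytes follow
        start, end = pos + 2, pos + 2 + length
        if end > len(payload):
            raise ValueError("truncated group data")
        if wanted_set is None or key in wanted_set:
            found[key] = payload[start:end]
        pos = end  # L lets us jump over a group without decoding it
    return found


# Hypothetical group IDs and one frame's worth of packed metadata.
SCENE, CAMERA, LENS = 0x01, 0x02, 0x03
frame_payload = bytes([SCENE, 2, 0x10, 0x11,
                       CAMERA, 3, 0x20, 0x21, 0x22,
                       LENS, 1, 0x30])
camera_only = extract_groups(frame_payload, wanted=[CAMERA])
```

Because each record carries its own length, the walk never has to interpret the contents of a group it does not need, which is the efficiency the paragraph above describes.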
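[0179]
In this extraction processing, the metadata extraction unit 812 is set to extract only the necessary metadata according to the content of the post-processing.
[0180]
Specifically, when video composition processing is performed and the color, brightness, texture, and the like of the video are adjusted between the live-action video material and the CG video, the necessary camera setting metadata in the camera setting group (setting information such as the shutter speed, gain, gamma, and detail of the imaging device 10 at the time of shooting) is extracted, for example. When video composition processing is performed and the motion and the like of the video are adjusted between the live-action video material and the CG video, the necessary lens setting metadata in the lens setting group (setting information such as the zoom, focus, and iris of the lens device 12 at the time of shooting) and the necessary dolly setting metadata in the dolly setting group (information such as the movement speed of the dolly device 14 and the camera direction (Pan, Tilt, Roll) at the time of shooting) are extracted, for example.
[0181]
When video correction processing is performed on live-action video material, the necessary camera setting metadata in the camera setting group (setting information such as the shutter speed, gain, gamma, and detail of the imaging device 10 at the time of shooting) is extracted, for example.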
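The correspondence between the kind of post-processing and the metadata groups to extract can be written down as a small lookup table. The sketch below assumes string group names and an enum `PostProcess`; both are hypothetical names chosen for illustration, and the sample values are made up.

```python
from enum import Enum, auto
from typing import Dict, Mapping, Tuple


class PostProcess(Enum):
    COMPOSITE_IMAGE_QUALITY = auto()  # matching color, brightness, texture
    COMPOSITE_MOTION = auto()         # matching motion of live action and CG
    CORRECTION = auto()               # video correction of live-action material


# Which groups each kind of post-processing needs, following the text above.
GROUPS_FOR: Dict[PostProcess, Tuple[str, ...]] = {
    PostProcess.COMPOSITE_IMAGE_QUALITY: ("camera",),
    PostProcess.COMPOSITE_MOTION: ("lens", "dolly"),
    PostProcess.CORRECTION: ("camera",),
}


def select_groups(frame_metadata: Mapping[str, Mapping[str, object]],
                  task: PostProcess) -> Dict[str, Mapping[str, object]]:
    """Keep only the metadata groups that the given post-processing needs."""
    return {name: frame_metadata[name]
            for name in GROUPS_FOR[task] if name in frame_metadata}


# Made-up metadata for one frame.
frame_metadata = {
    "scene": {"scene_number": 12},
    "camera": {"shutter_speed": "1/60", "gain_db": 0, "gamma": 0.45},
    "lens": {"zoom": 35, "focus": 4.2, "iris": 2.8},
    "dolly": {"pan": 10, "tilt": -2, "roll": 0},
}
for_motion = select_groups(frame_metadata, PostProcess.COMPOSITE_MOTION)
```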
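[0182]
The metadata extraction unit 812 can also extract, frame by frame, the CG-related metadata that is added in units of frames to the CG video signal input from the composition video server 92. In doing so, it is set to extract from the CG video signal the metadata items (for example, the gain and gamma of the CG video) that correspond to the metadata items extracted from the live-action video signal (for example, the gain and gamma of the live-action video).
[0183]
The metadata extraction unit 812 outputs the metadata extracted in this way to, for example, the display control unit 814.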
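[0184]
The display control unit 814 can control what the display unit 806 displays (video signals, metadata, and so on). Specifically, it converts the metadata extracted by the metadata extraction unit 812 into display data, and then controls the display unit 806 so that the live-action video signal and the converted metadata are displayed in frame-by-frame synchronization.
[0185]
To describe this metadata conversion in more detail, the display control unit 814 first decodes and interprets the metadata extracted for each frame by the metadata extraction unit 812. It then converts the interpreted metadata into an easy-to-read form such as a table.
[0186]
The display control unit 814 can control the display unit 806 so that the converted metadata is displayed together with the live-action video signal while being updated every frame in step with that signal. The display unit 806 can thereby display the live-action video signal and the metadata in frame-by-frame synchronization.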
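The decode, tabulate, and update-per-frame behavior of the display control can be sketched as below. This is a minimal illustration, assuming that each frame arrives as an (image, metadata) pair whose metadata has already been decoded into a dictionary; `to_table` and `synced_views` are hypothetical helper names, and real display hardware is replaced by plain strings.

```python
from typing import Iterable, Iterator, Mapping, Tuple


def to_table(metadata: Mapping[str, object]) -> str:
    """Render decoded metadata as a simple two-column text table."""
    if not metadata:
        return ""
    width = max(len(name) for name in metadata)
    return "\n".join(f"{name.ljust(width)} | {value}"
                     for name, value in metadata.items())


def synced_views(frames: Iterable[Tuple[str, Mapping[str, object]]]
                 ) -> Iterator[Tuple[int, str, str]]:
    """Yield (frame index, image, table) so both are refreshed together."""
    for index, (image, metadata) in enumerate(frames):
        yield index, image, to_table(metadata)


# Made-up two-frame clip; the strings stand in for picture data.
clip = [
    ("frame-0", {"gain": "0 dB", "gamma": 0.45}),
    ("frame-1", {"gain": "3 dB", "gamma": 0.45}),
]
views = list(synced_views(clip))
```

Since the table is rebuilt from the metadata carried by the same frame, the picture and its table cannot drift apart.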
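[0187]
The operator of the editing terminal device 80 can thus view the metadata organized in table form alongside the live-action video, and can therefore grasp objectively and accurately the image quality and motion of the video material to be post-processed.
[0188]
Furthermore, when CG composition processing is performed, for example, the display control unit 814 can control the display so that, in addition to the live-action video signal and its corresponding metadata, the CG video signal and its corresponding metadata (CG data) are also displayed in frame-by-frame synchronization.
[0189]
When video correction processing is performed, for example, the display control unit 814 can control the display so that, in addition to the live-action video signal and its corresponding metadata, the corrected video signal and its corresponding metadata are also displayed.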
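[0190]
The video composition processing unit 816 can perform, based on operator input, video composition processing that composites the played-back live-action video signal with the composition video signal. More specifically, based on the operator's operation, it can composite the live-action video signal input from the playback VTR 90 with the CG video signal input from the composition video server 92 using, for example, a key signal. The live-action video signal being composited may be either the foreground or the background.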
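The text only states that a key signal is used for composition. One common way to apply a key is a per-pixel linear mix, shown below as an assumption rather than as the method of the specification; `key_mix` and the sample pixel values are hypothetical.

```python
from typing import List

Plane = List[List[float]]  # one row-major plane of pixel values


def key_mix(foreground: Plane, background: Plane, key: Plane) -> Plane:
    """Per-pixel linear key: out = key * fg + (1 - key) * bg, key in [0, 1]."""
    return [
        [k * f + (1.0 - k) * b for f, b, k in zip(fg_row, bg_row, key_row)]
        for fg_row, bg_row, key_row in zip(foreground, background, key)
    ]


# A 1x3 example: live action as foreground, CG as background.
live = [[200.0, 200.0, 200.0]]
cg = [[50.0, 50.0, 50.0]]
key = [[1.0, 0.5, 0.0]]  # 1 keeps the foreground, 0 keeps the background
mixed = key_mix(live, cg, key)
```

Swapping the `foreground` and `background` arguments covers the case in which the live-action video is the background instead.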
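[0191]
When video composition processing is executed by the video composition processing unit 816, the live-action video and the CG video are each displayed with their corresponding metadata in an easy-to-view layout on the display screen of the display unit 806, as described above. The operator can therefore compare the live-action video with the CG video and, in addition, compare the metadata of the live-action video with the CG data. Accordingly, the image quality and motion of the CG video can be adjusted so that the metadata values of the two match before it is composited with the live-action video, so CG composition processing can be carried out objectively and suitably.
[0192]
The video correction processing unit 818 can perform, based on operator input, video correction processing that corrects the played-back live-action video signal. More specifically, based on the operator's operation, it can correct the live-action video signal input from the playback VTR 90 by adding, deleting, or changing part of the video, changing the background, or adjusting the image quality and the like.
[0193]
When video correction processing is executed by the video correction processing unit 818, the live-action video and the corrected video are each displayed with their corresponding metadata in an easy-to-view layout on the display screen of the display unit 806, as described above. The operator can therefore compare the live-action video with the corrected video and, in addition, compare the metadata of the live-action video with the metadata of the corrected video. Accordingly, the live-action video can be corrected with reference to the level differences and the like between the two sets of metadata, so video correction processing can be carried out objectively and suitably.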
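[0194]
Each part of the editing terminal device 80 has been described above. Provided that the processing functions described above can be realized, the metadata extraction unit 812, the display control unit 814, the video composition processing unit 816, the video correction processing unit 818, and the like may be configured as dedicated devices (hardware), or as software by installing on the editing terminal device 80, such as a computer, an application program that causes it to execute the above processing.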
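[0195]
<2.3 Processing method of video editing apparatus>
Next, the processing method of the video editing apparatus 3 described above will be described with reference to FIG. 16. FIG. 16 is a flowchart showing the processing method of the video editing apparatus 3 according to the present embodiment. In the following, a specific example of the processing flow for CG composition processing is described, but the method is not limited to this example.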
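[0196]
As shown in FIG. 16, first, in step S100, the live-action video is played back by the playback VTR 90 and the CG video is read out by the composition video server 92 (step S100). Based on, for example, an instruction from the editing terminal device 80, the playback VTR 90 plays back a specified live-action video signal from the video tape 52 on which the video material was recorded by the video recording system 1, and outputs it to the editing terminal device 80. Meanwhile, based on, for example, an instruction from the editing terminal device 80, the composition video server 92 reads out, from among the CG video signals it stores, the specified CG video signal to be composited with the played-back live-action video signal, and outputs it to the editing terminal device 80.
Next, in step S102, at least part of the metadata added to the video signal is extracted by the editing terminal device 80 (step S102). The metadata extraction unit 812 of the editing terminal device 80 extracts, frame by frame, the metadata inserted in the blanking area of the played-back live-action video signal, and decodes it. At extraction time, based on, for example, extraction conditions specified by the operator, the metadata extraction unit 812 can select and extract, from the various metadata added to the live-action video signal, only the metadata needed for the CG composition processing at hand, either by metadata group or by individual metadata item.
[0197]
At the same time as this extraction, the metadata extraction unit 812 also extracts, frame by frame, the CG metadata added in units of frames to the CG video signal. In doing so, it extracts the CG video signal metadata that corresponds to the metadata extracted from the live-action video signal.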
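[0198]
Then, in step S104, the video signals and the metadata are displayed in frame-by-frame synchronization (step S104). The display control unit 814 causes the display unit 806 to display the live-action video signal and the metadata extracted as described above, synchronized frame by frame. At the same time, it causes the display unit 806 to display the CG video signal and the CG metadata extracted as described above, synchronized frame by frame.
[0199]
After that, in step S106, CG composition processing is executed (step S106). While viewing the live-action video and its metadata and the CG video and its CG metadata displayed on the display unit 806 as described above, the operator of the editing terminal device 80 corrects the image quality and motion of the CG video to be composited and performs the operation to composite the two videos. Based on that operation, the video composition processing unit 816 corrects the CG video signal and then generates a video signal in which the live-action video signal and the corrected CG video signal are composited.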
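Steps S100 to S106 can be lined up as one loop over paired frames. The sketch below is a simplification under stated assumptions: frames are (image, metadata) pairs, the operator-driven correction and composition of step S106 is reduced to a caller-supplied `composite` function, and all names and sample values are hypothetical.

```python
from typing import Callable, Dict, List, Mapping, Optional, Sequence, Tuple

Frame = Tuple[str, Mapping[str, float]]                 # (image, metadata)
Comparison = Dict[str, Tuple[float, Optional[float]]]   # item -> (live, CG)


def cg_composite_flow(live_frames: Sequence[Frame],
                      cg_frames: Sequence[Frame],
                      wanted: Sequence[str],
                      composite: Callable[[str, str], str]
                      ) -> List[Tuple[str, Comparison]]:
    """Run the four steps once per frame pair and collect the results."""
    output = []
    # S100: live-action frames and CG frames arrive together, frame by frame.
    for (live_img, live_md), (cg_img, cg_md) in zip(live_frames, cg_frames):
        # S102: extract the wanted live-action items and the matching CG items.
        live_sel = {k: live_md[k] for k in wanted if k in live_md}
        cg_sel = {k: cg_md[k] for k in live_sel if k in cg_md}
        # S104: pair the values so they can be shown side by side per frame.
        side_by_side = {k: (live_sel[k], cg_sel.get(k)) for k in live_sel}
        # S106: composition (operator adjustments are out of scope here).
        output.append((composite(live_img, cg_img), side_by_side))
    return output


live_clip = [("L0", {"gain": 0.0, "gamma": 0.45, "zoom": 35.0})]
cg_clip = [("C0", {"gain": 3.0, "gamma": 0.50})]
result = cg_composite_flow(live_clip, cg_clip, ["gain", "gamma"],
                           lambda live, cg: f"{live}+{cg}")
```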
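[0200]
Here, specific examples of the video and metadata displayed on the display unit 806 of the editing terminal device 80 during the CG composition processing described above will be described with reference to FIGS. 17 and 18. FIG. 17 is an explanatory diagram showing a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 when the image quality of the CG video is adjusted in CG composition processing. FIG. 18 is an explanatory diagram showing a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 when the motion of the CG video is adjusted in CG composition processing.
[0201]
As shown in FIG. 17, the display screen 807 of the display unit 806 shows, for example, a live-action video of "a lion walking on the savanna" as the foreground. Below this live-action video, the camera setting metadata corresponding to the current frame of that video is displayed. More specifically, the values of items in the camera setting group such as "recorder status", "shutter speed", "gain", "gamma", "detail level", and "knee point" are displayed in table form.
[0202]
The display screen 807 also shows, for example, a CG video of "an outer space scene" as the background. Below this CG video, the CG metadata corresponding to the current frame of that CG video is displayed. The items of the CG metadata correspond to the items of the camera setting metadata of the live-action video.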
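[0203]
To composite the live-action video and the CG video into a video of "a lion walking in outer space", the operator first corrects the image quality (color, brightness, texture, and so on) of the CG video. Specifically, the operator corrects values such as the gamma, detail, and knee point of the CG video so that its image quality matches that of the live-action video. Because the operator can make these corrections while comparing the camera setting metadata of the live-action video with the CG metadata, the degree of correction for each parameter can be judged objectively and the image quality adjusted with high precision.
[0204]
Next, to correct the motion of the CG video, the operator switches the metadata group displayed on the display unit 806. As a result, as shown in FIG. 18, the display screen 807 of the display unit 806 shows, below the same live-action video as before, the lens setting metadata and dolly setting metadata corresponding to the current frame of that video. More specifically, "zoom", "focus", and "iris" of the lens setting group and "camera direction (Pan, Tilt, Roll)" of the dolly setting group are displayed, for example. Below the CG video, the CG metadata for the items corresponding to the lens setting metadata and dolly setting metadata of the live-action video is displayed.
[0205]
Once this metadata is displayed, the operator corrects the motion of the CG video. Specifically, the operator corrects the zoom, focus, iris, camera direction, and so on of the CG video so that its motion matches that of the live-action video. Because the operator can make these corrections while comparing the lens and dolly setting metadata of the live-action video with the CG metadata, the parameters related to the motion of both videos can be judged objectively and the motion of the CG video adjusted with high precision.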
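[0206]
Next, a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 during video correction processing will be described with reference to FIG. 19. FIG. 19 is an explanatory diagram showing a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 during video correction processing. In the following, as a specific example of video correction processing, processing that corrects a live-action video shot in the daytime into a night video by changing its image quality is described.
[0207]
As shown in FIG. 19, the display screen 807 of the display unit 806 shows, for example, a live-action video of "a lion walking on the savanna in the daytime" as the video before correction. Below this live-action video, the camera setting metadata corresponding to the current frame of that video is displayed. More specifically, the values of items in the camera setting group such as "shutter speed", "gain", "gamma", "detail level", "knee point", and "knee slope" are displayed in table form.
[0208]
The display screen 807 also shows, for example, a video of "a lion walking on the savanna at night" as the corrected video. Below this corrected video, the camera setting metadata corresponding to the current frame of the corrected video is displayed.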
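[0209]
By displaying the live-action video before correction and the corrected video side by side, together with the camera setting metadata of each, the operator can use the level difference in each camera setting metadata item between the two videos as a basis for carrying out video correction processing suitably. Specifically, the operator adjusts values such as the gamma, detail, and knee point of the live-action video so that the corrected video becomes a suitable night video. Because the operator can make these corrections while comparing each item of the camera setting metadata before and after correction, the degree to which each item should be raised or lowered can be judged objectively and the image quality adjusted with high precision.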
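The "level difference" between the camera setting metadata before and after correction amounts to an item-by-item subtraction. The following sketch uses made-up numeric values and the hypothetical function name `level_differences`; it only illustrates the comparison the operator is reading off the screen.

```python
from typing import Dict, Mapping


def level_differences(before: Mapping[str, float],
                      after: Mapping[str, float]) -> Dict[str, float]:
    """Return after - before for every item present in both metadata sets."""
    return {name: after[name] - before[name]
            for name in before if name in after}


# Illustrative values only: a daytime shot and its night-time correction.
before = {"gain_db": 0.0, "gamma": 0.45, "knee_point": 85.0}
after = {"gain_db": -6.0, "gamma": 0.35, "knee_point": 80.0}
diff = level_differences(before, after)
```

A negative entry means the item was lowered by the correction, a positive entry that it was raised.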
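[0210]
In the video correction processing shown in FIG. 19, correction is also performed that deletes the "sun" portion of the live-action video and inserts a video of the "moon". In this way, video correction processing can also delete something that was in the live-action video or add something that was not.
[0211]
As described above, according to the video editing apparatus 3 and its processing method, the metadata needed for post-processing a video signal can be suitably extracted and displayed by using a video signal to which metadata is added in units of frames. Video material can therefore be post-processed suitably by using the video editing apparatus 3.
[0212]
That is, on the storage medium 52 on which video material has been recorded by the video recording system 1, the video material and the metadata about it are recorded directly linked in units of frames. The video editing apparatus 3 can play back such video material and extract the metadata directly attached to it. There is therefore no need, as in the conventional approach, to access and read out metadata recorded separately from the video material and then align the video material and the metadata in time using time codes. Since the video and the metadata can thus be handled as a single unit, the metadata needed for post-processing the video material can be extracted quickly and easily.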
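[0213]
Furthermore, the extracted metadata can be displayed suitably in frame-by-frame synchronization with the video material. The operator can therefore view the metadata corresponding to the video frame by frame together with the video, and can grasp the image quality, motion, and so on of the video material objectively and accurately.
[0214]
In addition, because the video material and the metadata are recorded linked in units of frames, even if the video material is cut out at In and Out points in rough editing, for example, the corresponding portion of the metadata is cut out along with it. The metadata can therefore be extracted and displayed continuously in synchronization with the video material without reconciling the two. Even for video material edited in this way, the metadata can be extracted quickly and easily and used for post-processing.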
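Why a cut at the In and Out points keeps the metadata aligned can be seen by modeling each frame as a record that carries its own metadata. In the sketch below, the `Frame` class, frame-index In and Out points, and the inclusive Out point are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Frame:
    image: str                  # stands in for the picture data
    metadata: Dict[str, float]  # metadata attached to this very frame


def cut(clip: List[Frame], in_point: int, out_point: int) -> List[Frame]:
    """Cut out frames in_point..out_point (inclusive, by frame index)."""
    if not 0 <= in_point <= out_point < len(clip):
        raise ValueError("In/Out points out of range")
    return clip[in_point:out_point + 1]


clip = [Frame(f"img{i}", {"zoom": 10.0 + i}) for i in range(6)]
trimmed = cut(clip, 2, 4)
```

No separate metadata file has to be trimmed or re-indexed: every surviving frame still holds the values recorded with it.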
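[0215]
Also, even when the frame rate of the recorded video signal changes because the imaging device 10 performs variable-speed shooting, metadata is added to the video signal in units of frames, so no discrepancy arises between the number of frames of the video signal per unit time and the number of metadata records. Hence, even when such a variable-speed-shot video signal is post-processed, the metadata can be extracted from the video material frame by frame and displayed. As described above, the video editing apparatus 3 can flexibly handle video material that has been edited or shot at variable speed.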
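The frame-count argument can be checked with a little arithmetic. The sketch below assumes, purely for illustration, a shoot of 2 seconds at 24 frames per second followed by 1 second at 48 frames per second, and a conventional separate log that writes metadata at a fixed 24 records per second; these numbers and function names are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (frames per second, duration in seconds)


def total_frames(segments: List[Segment]) -> int:
    """Number of video frames produced by a variable-speed shoot."""
    return sum(fps * seconds for fps, seconds in segments)


def fixed_rate_records(segments: List[Segment], records_per_second: int) -> int:
    """Records written by a log that ignores the camera's frame rate."""
    return records_per_second * sum(seconds for _, seconds in segments)


shoot = [(24, 2), (48, 1)]
frames = total_frames(shoot)
separate_log = fixed_rate_records(shoot, 24)
per_frame_records = frames  # one metadata record is attached to each frame
```

With these numbers the separate log holds 72 records for 96 frames, while per-frame attachment gives 96 records for 96 frames by construction.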
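[0216]
As described above, the various metadata is grouped according to its purpose of use and added to the video signal. The necessary metadata can therefore be extracted and displayed quickly and easily by metadata group.
[0217]
When video composition processing such as CG composition processing is performed, extracting and displaying the camera setting metadata, for example, lets the operator accurately grasp the image quality (brightness, color tone, texture, and so on) of the video material, while extracting and displaying the lens and dolly setting metadata, for example, lets the operator accurately grasp the motion and the like of the imaging device 10 and the subject at the time of shooting. Furthermore, by displaying the metadata of the CG video to be composited at the same time as the metadata of the live-action video, the image quality and motion of the CG video can be grasped objectively and the CG video corrected accurately.
[0218]
When video correction processing is performed, extracting and displaying the camera setting metadata, for example, lets the operator accurately grasp the image quality (brightness, color tone, texture, and so on) of the video material and correct the video suitably.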
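[0219]
Preferred embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to these examples. It is clear that a person skilled in the art could conceive of various changes or modifications within the scope of the technical idea described in the claims, and these are naturally understood to belong to the technical scope of the present invention.
[0220]
For example, in the video recording system 1 according to the above embodiment, the imaging device 10 added the camera setting metadata to the video signal, but the invention is not limited to this example. For example, as shown in FIG. 20, the video recording system 1 may be configured so that the CCU 20 outputs the camera setting metadata acquired from the imaging device 10 to the metadata adding device 40 via an RS-232C cable or the like, on a path separate from the video signal, so that all of the metadata generated in the system (for example, the scene information metadata, lens setting metadata, dolly setting metadata, and camera setting metadata) is gathered in the metadata adding device 40, and the metadata adding device 40 adds all of it to the video signal together. With this configuration, all the metadata can be added to the video signal in units of frames even when, for example, an imaging device 10 without a metadata-adding function is used.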
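[0221]
In the video recording system 1 according to the above embodiment, the metadata adding device 40 was configured as hardware separate from the imaging device 10, the CCU 20, the VTR 50, and so on, but the present invention is not limited to this example. For example, the metadata adding device 40 may be built into one or more of the imaging device 10, the CCU 20, the VTR 50, and so on. The metadata composition device 60 may likewise be built into the VTR 50 or the like. Building the metadata adding device 40 or the metadata composition device 60 into the VTR 50, the imaging device 10, or the like in this way reduces the number of devices in the system and saves the labor of connecting them.
[0222]
The imaging device 10 may also be configured as an imaging device with the function of recording video signals on a storage medium (a camcorder or the like). The imaging device 10 can thereby be configured to have all the functions of, for example, the CCU 20, the metadata adding device 40, and the VTR 50.
[0223]
In the above embodiment, the lens setting metadata generated by the lens device 12 was output via an RS-232C cable or the like and added to the video signal by the metadata adding device 40, but the present invention is not limited to this example. For example, a lens device 12 capable of communicating lens setting information and the like with the body of the imaging device 10 may be adopted, so that the lens setting metadata and the like is input directly from the lens device 12 to the body of the imaging device 10. The metadata adding unit 112 of the imaging device 10 can thereby be configured to add to the video signal not only the camera setting metadata but also the lens setting metadata acquired from the lens device 12.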
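[0224]
In the above embodiment, RS-232C, HD SDI, and the like were adopted as the interfaces for communicating the various metadata between devices, but the invention is not limited to this example; various interfaces such as USB (Universal Serial Bus), SCSI (Small Computer System Interface), serial SCSI, and GP-IB (General Purpose Interface Bus) may be used. Communication between the devices is also not limited to wired communication; metadata and/or video signals and the like may be transmitted wirelessly, for example.
[0225]
In the above embodiment, the various metadata generated in the video recording system was divided into four metadata groups, namely the scene information group, the camera setting group, the lens setting group, and the dolly setting group, but the present invention is not limited to this example. For example, the four metadata groups may be combined arbitrarily according to their purpose of use, such as merging the lens setting group and the dolly setting group into a single lens/dolly setting group. Nor is it necessary to provide all four metadata groups; for example, one or more of them may be provided.
[0226]
New metadata groups other than the above may also be provided. Specifically, for example, an audio information group may be provided, in which audio-related metadata such as recording method information (stereo, monaural, surround, and so on) and recording content information (for example, that microphone 1 records the background sound and microphone 2 records the actors' voices) is grouped.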
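[0227]
The video recording system 1 according to the above embodiment included the metadata composition device 60 and the display device 70, but the invention is not limited to this example, and these devices need not necessarily be provided.
[0228]
The video editing apparatus 3 according to the above embodiment had one playback VTR 90, but it is not limited to this example and may have, for example, multiple playback VTRs 90. By playing back multiple kinds of live-action video signals on the multiple playback VTRs 90 and outputting them to the editing terminal device 80, the editing terminal device 80 can also composite live-action video signals with each other. In that case, the metadata related to each live-action video signal may be displayed in synchronization with the respective video signal.
[0229]
In the above embodiment, the video editing apparatus 3 consisted of the editing terminal device 80, the playback VTR 90, and the composition video server 92, but the present invention is not limited to this example. For example, the video editing apparatus 3 may be configured as hardware in which these three devices are integrated. The video editing apparatus 3 may also additionally include a recording VTR, a recording server, or the like.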
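[0230]
The video editing apparatus 3 may also be configured as a comprehensive editing apparatus capable of executing not only the post-processing described above but also various editing processes such as rough editing and final editing.
[0231]
In the above embodiment, video composition processing such as CG composition processing and video correction processing were described as examples of post-processing of video material, but the invention is not limited to these examples; the editing terminal device 80 may be configured to execute various other kinds of post-processing. Specifically, the editing terminal device 80 may have, for example, a function for deleting abnormal portions of the video material (for example, portions containing excessive noise or portions with abnormal luminance or color tone), a function for superimposing subtitle data, CG still image data, or the like on the video material, and a function for fading the video material in and out.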
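[0232]
[Effects of the Invention]
As described above, the video editing apparatus according to the present invention can easily and quickly extract the metadata needed for post-processing from video material to which metadata is directly added in units of frames, and can display that metadata and the video material in frame-by-frame synchronization. The metadata and the video material can also be handled as a single unit. Furthermore, metadata can be extracted and displayed flexibly even when the video material has been cut by editing or has been shot at variable speed. The operator can therefore suitably post-process video material using this video editing apparatus.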
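[Brief Description of the Drawings]
[FIG. 1] A block diagram showing a schematic configuration of the video recording system according to the first embodiment.
[FIG. 2] An explanatory diagram showing a specific example of the scene information metadata included in the scene information group according to the first embodiment.
[FIG. 3] An explanatory diagram showing a specific example of the camera setting metadata included in the camera setting group according to the first embodiment.
[FIG. 4] An explanatory diagram showing a specific example of the lens setting metadata included in the lens setting group according to the first embodiment.
[FIG. 5] An explanatory diagram showing a specific example of the dolly setting metadata included in the dolly setting group according to the first embodiment.
[FIG. 6] An explanatory diagram for explaining the metadata format according to the first embodiment.
[FIG. 7] A block diagram showing the configuration of the imaging device according to the first embodiment.
[FIG. 8] An explanatory diagram for explaining how metadata is added to the video signal according to the first embodiment.
[FIG. 9] A block diagram showing the configuration of the camera control unit according to the first embodiment.
[FIG. 10] A block diagram showing the configuration of the metadata adding device according to the first embodiment.
[FIG. 11] A block diagram showing the configuration of the video tape recorder according to the first embodiment.
[FIG. 12] A block diagram showing the configuration of the metadata composition device according to the first embodiment.
[FIG. 13] A timing chart for explaining the video recording method according to the first embodiment.
[FIG. 14] A block diagram showing a schematic configuration of the video editing apparatus according to the first embodiment.
[FIG. 15] A block diagram showing a schematic configuration of the editing terminal device according to the first embodiment.
[FIG. 16] A flowchart showing the processing method of the video editing apparatus according to the first embodiment.
[FIG. 17] An explanatory diagram showing a specific example of the video and metadata displayed on the display unit of the editing terminal device when the image quality of the CG video is adjusted in CG composition processing according to the first embodiment.
[FIG. 18] An explanatory diagram showing a specific example of the video and metadata displayed on the display unit of the editing terminal device when the motion of the CG video is adjusted in CG composition processing according to the first embodiment.
[FIG. 19] An explanatory diagram showing a specific example of the video and metadata displayed on the display unit of the editing terminal device during video correction processing according to the first embodiment.
[FIG. 20] A block diagram showing a schematic configuration of the video recording system according to a modification.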
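[Explanation of Reference Numerals]
1: Video recording system
3: Video editing apparatus
10: Imaging device
12: Lens device
14: Dolly device
18: Sound collecting device
20: Camera control unit
30: Metadata input terminal device
40: Metadata adding device
50: Video tape recorder
52: Video tape
60: Metadata composition device
70: Display device
80: Editing terminal device
90: Playback video tape recorder
92: Composition video server
104: Imaging unit
108: Display unit
110: Camera setting metadata generation unit
112: Metadata adding unit
126: Lens setting metadata generation unit
144: Dolly setting metadata generation unit
403: Metadata buffer memory
406: Metadata packing unit
408: Metadata encoder
410: Metadata insertion unit
506: Signal processing unit
508: Metadata decoder
514: Metadata encoder
606: Metadata extraction unit
608: Metadata decoder
610: Metadata video generation unit
612: Metadata video composition unit
806: Display unit
807: Display screen
812: Metadata extraction unit
814: Display control unit
816: Video composition processing unit
818: Video correction processing unit
[0001]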
[Technical Field of the Invention]
The present invention relates to a video editing apparatus for video signals and a processing method for the video editing apparatus.
[0002]
[Prior art]
In recent years, in the field of producing video works such as movies and TV programs, effective use of metadata about shot video material has been advancing. Such metadata includes, for example, information representing attributes of the video material, such as the title of the video work, the shooting date and time, and the scene number, as well as setting information of the imaging device, lens, and the like at the time of shooting (see, for example, Patent Document 1). This metadata is useful for identifying and managing shot video material, and is also put to effective use in CG (Computer Graphics) composition processing, compositing, and the like at the post-processing stage of the video material.
[0003]
Conventionally, such metadata has been recorded and managed on a terminal such as a personal computer, separately from the video material recorded on magnetic tape or the like, and the video material and the metadata have been linked by attaching the time code from the time of shooting, or the like, to both. Therefore, when video material was post-processed using such metadata, the corresponding metadata (camera and lens setting information, etc.) was read from the terminal via the time code or the like and used in post-processing as information representing the image quality of the video material, the movement of the subject, and so on.
[0004]
[Patent Document 1]
JP-A-9-46627
[0005]
[Problems to be solved by the invention]
However, the conventional post-processing method described above uses metadata that is recorded and managed separately from the video material and is linked to it only indirectly via a time code or the like.
[0006]
For this reason, when extracting metadata, it is necessary to adjust the time by a time code or the like, and there is a problem that extraction / display of necessary metadata is inefficient. Another problem is that it is inconvenient to handle metadata separately from video material.
[0007]
Furthermore, when either the video material or the metadata has been edited, there is a problem that the metadata cannot be extracted and displayed continuously in synchronization with the video material. Also, when the video material has been shot at variable speed (shot while changing the frame rate), a discrepancy arises between the number of frames of the video material and the number of metadata records, so there is a problem that the metadata corresponding to the video material cannot be suitably extracted.
[0008]
The present invention has been made in view of the above problems, and its object is to provide a new and improved video editing apparatus and processing method that can easily and quickly extract the metadata needed for post-processing and display it in synchronization with the video material, can handle the video material and the metadata as a single unit, and can flexibly accommodate video material that has been edited or shot at variable speed.
[0009]
[Means for Solving the Problems]
In order to solve the above problems, according to a first aspect of the present invention, there is provided a video editing apparatus for post-processing a video signal recorded on a storage medium. This video editing apparatus comprises: a video signal reproducing device that reproduces, from the storage medium, the video signal to which at least one of first metadata that is setting information of the imaging device relating to the image quality of the video signal, second metadata that is setting information of the lens device at the time of shooting, or third metadata that is setting information relating to the position or movement of the imaging device at the time of shooting is added in units of frames; a metadata extraction unit that extracts at least one of the first metadata, the second metadata, or the third metadata in units of frames from the reproduced video signal; and a display control unit that causes a display unit to display the reproduced video signal and the extracted metadata in synchronization in units of frames.
[0010]
With this configuration, the video signal that is the content of the video material and the metadata added to the video signal in units of frames are recorded on the storage medium that is played back by the video signal playback device of the video editing device. Therefore, the video editing apparatus can handle the video signal and the metadata integrally. Further, the video signal reproduction device can reproduce the video signal recorded on the recording medium and output it to the metadata extraction unit. The metadata extraction unit can sequentially extract metadata added to the video signal for each frame of the reproduced video signal. For this reason, it is not necessary to maintain consistency between the extracted metadata and the video signal, so that the metadata extraction unit can easily and quickly extract metadata corresponding to each frame of the video signal. Further, the metadata extraction unit can extract only necessary metadata according to the contents of post-processing, instead of extracting all the metadata added to the video signal. Further, the display unit can display the extracted metadata for each frame of the video signal. Thereby, the operator of the video editing apparatus can suitably perform the post-processing of the video signal while browsing the metadata displayed together with the video signal on the display unit.
[0011]
Further, the post-processing may be configured to be a video synthesis process for synthesizing the reproduced video signal and the synthesis video signal. The composition video signal may be configured as a computer graphics video signal.
[0012]
The display unit may be configured to display the composite video signal and metadata related to the composite video signal together with the reproduced video signal and the extracted metadata. With this configuration, the operator can not only compare, frame by frame, the video of the video signal reproduced from the storage medium with the video of the composite video signal, but can also compare, frame by frame, the metadata corresponding to both video signals.
[0013]
The post-processing may be configured to be video correction processing for correcting the reproduced video signal.
[0014]
Further, the metadata added to the video signal may be configured to be grouped into one or two or more metadata groups according to the purpose of use of the metadata. With this configuration, the metadata extraction unit can extract metadata in units of metadata groups.
[0015]
Further, the metadata extraction unit may be configured to extract one or more metadata groups according to the contents of post-processing. With this configuration, the metadata extraction unit can extract and display only necessary metadata groups according to the contents of post-processing.
[0016]
Further, the metadata groups may be configured to include at least one of a camera setting group including setting information of the imaging apparatus that generated the video signal, a lens setting group including setting information of a lens apparatus provided on the imaging apparatus, or a dolly setting group including setting information of a dolly apparatus provided on the imaging apparatus. With this configuration, the metadata of the camera setting group can function as information representing the image quality of the captured video, for example. Also, the metadata of the lens setting group and the dolly setting group can function as information representing the movement, distance, etc. of the subject appearing in the captured video, for example.
[0017]
Further, unique group identification information may be given to each metadata group added to the video signal, and the metadata extraction unit may be configured to extract, based on the group identification information, one or more metadata groups according to the contents of the post-processing. With this configuration, the metadata extraction unit can identify each metadata group from its group identification information, and can therefore perform extraction in units of metadata groups quickly.
[0018]
Further, data amount information for the metadata group may be given to each metadata group added to the video signal, and the metadata extraction unit may be configured to extract, based on the data amount information, one or more metadata groups according to the contents of the post-processing. With this configuration, when extracting the metadata of a given metadata group, the metadata extraction unit can know in advance the amount of metadata in that group from the data amount information. Therefore, the metadata extraction unit can quickly perform extraction in units of metadata groups.
[0019]
Further, the video signal recorded in the storage medium may be one whose frame rate has been changed. With this configuration, even when the frame rate of the video signal recorded on the storage medium changes, metadata is added for each frame of the video signal, so the correspondence between the two is never lost. For this reason, the metadata extraction unit can suitably extract metadata from the video signal in units of frames without having to reconcile the video signal with the metadata.
[0020]
In order to solve the above problem, according to another aspect of the present invention, there is provided a processing method of a video editing apparatus for post-processing a video signal recorded on a storage medium. The processing method of this video editing apparatus includes: a video signal reproduction step of reproducing, from the storage medium, the video signal to which at least one of first metadata that is setting information of the imaging device relating to the image quality of the video signal, second metadata that is setting information relating to the lens device at the time of shooting, or third metadata that is setting information relating to the position or movement of the imaging device at the time of shooting is added in units of frames; a metadata extraction step of extracting at least one of the first metadata, the second metadata, or the third metadata in units of frames from the reproduced video signal; and a display control step of causing a display unit to display the reproduced video signal and the extracted metadata in synchronization in units of frames.
[0021]
Further, the post-processing may be video synthesis processing for synthesizing the reproduced video signal and the synthesis video signal. Further, the synthesis video signal may be a computer graphic video signal.
[0022]
In the display step, the video signal for synthesis and the metadata related to the video signal for synthesis may be displayed in synchronization with the reproduced video signal and the extracted metadata.
[0023]
The post-processing may be video correction processing for correcting the reproduced video signal.
[0024]
Further, the metadata added to the video signal may be grouped into one or two or more metadata groups according to the purpose of use of the metadata. In the metadata extraction stage, one or more metadata groups corresponding to the contents of post-processing may be extracted.
[0025]
Further, the metadata groups may include at least one of a camera setting group including setting information of the imaging apparatus that generated the video signal, a lens setting group including setting information of a lens apparatus provided on the imaging apparatus, or a dolly setting group including setting information of a dolly apparatus provided on the imaging apparatus.
[0026]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the present specification and drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.
[0027]
(First embodiment)
The video editing apparatus and the processing method thereof according to the first embodiment of the present invention will be described below. In the following, a video recording system that records video signals handled by the video editing apparatus according to the present embodiment on a storage medium will be described first, and then the video editing apparatus according to the present embodiment will be described in detail.
[0028]
<1 Video recording system>
First, a video recording system and a video recording method according to the present embodiment will be described.
[0029]
<1.1 System configuration>
First, an outline of the video recording system according to the present embodiment will be described. The video recording system according to the present embodiment is, for example, a system used by television broadcasting stations and by production companies for video content, movies, and the like to produce video works such as TV programs, video content, and movies. This video recording system is provided, for example, at a shooting site (a shooting studio, a location, etc.), and can shoot and record the video content data of the video material constituting a video work. This video content data is content data composed of video data and/or audio data, for example. Of these, the video data is generally, for example, moving image data, but may include still image data such as drawings, photographs, or paintings.
[0030]
Furthermore, this video recording system can generate, for example, various types of metadata related to the shot video material. The video recording system can also group such metadata, add it for each frame to the video signal constituting the video material, and record it together with the video signal on a storage medium. Note that this metadata is, for example, higher-level data representing the outline and attributes of the video material or the settings of the imaging device, and functions as index information for the video material, information for identifying shooting conditions, and the like; details will be described later.
[0031]
Next, the overall configuration of the video recording system according to the present embodiment will be described with reference to FIG. FIG. 1 is a block diagram showing a schematic configuration of the video recording system 1 according to the present embodiment.
[0032]
As shown in FIG. 1, the video recording system 1 according to the present embodiment mainly includes, for example, an imaging device 10, a sound collecting device 18, a camera control unit (hereinafter referred to as CCU) 20, a metadata input terminal device 30, a metadata adding device 40, a video tape recorder (hereinafter referred to as VTR) 50, a metadata composition device 60, and a display device 70.
[0033]
The imaging device 10 is, for example, a video camera that converts an optical image incident on the lens device 12 into an electrical signal, and can shoot a subject to generate and output a video signal. The imaging device 10 can shoot each scene constituting a video work and output the generated video signal to, for example, the CCU 20. This video signal may be generated by either a progressive method or an interlace method, for example.
[0034]
In the present embodiment, the video signal is transmitted from the imaging device 10 to the CCU 20 as an optical signal via, for example, an optical fiber cable. By transmitting the video signal as an optical signal in this way, transmission over a longer distance (for example, about 1 km) is possible than when transmitting in the HD SDI (High Definition Serial Digital Interface) format (for example, about 50 m). Because the imaging device 10 can therefore be placed sufficiently far from the CCU 20, the VTR 50, and the like, the degree of freedom in shooting is increased. However, the present invention is not limited to this example, and the imaging device 10 may transmit the video signal using, for example, an HD SDI cable. In this case, for example, the video signal may be transmitted directly from the imaging device 10 to the metadata adding device 40 without providing the CCU 20.
[0035]
The imaging device 10 can, for example, collect various setting information in the imaging device 10 at the time of shooting (shooting condition information such as shutter speed and gain) and generate camera setting metadata based on this setting information. Further, for example, the imaging device 10 can group and pack the camera setting metadata as a camera setting group and add it to each frame of the video signal. Details will be described later.
[0036]
In addition, the imaging device 10 includes, for example, a lens device 12 and a dolly device 14.
[0037]
The lens device 12 includes, for example, a plurality of lenses and a driving device that adjusts the distance between these lenses, the aperture, and the like. By adjusting zoom, iris, focus, and the like, the lens device 12 can make a suitable optical image incident on the main body of the imaging device 10. For example, the lens device 12 can generate various setting information in the lens device 12 at the time of shooting (shooting condition information such as zoom, iris, and focus) as lens setting metadata for each frame.
[0038]
The dolly device 14 is a carriage on which the main body of the imaging device 10 is placed and moved. It is used, for example, when shooting while moving the imaging device 10 toward or away from the subject, or when shooting while moving the imaging device 10 along with a moving subject. The dolly device 14 can move at high speed alongside the subject or the like, for example, by placing pulleys provided on its lower part on rails. The dolly device 14 can generate, for example, various setting information in the dolly device 14 at the time of shooting (shooting condition information such as the position of the dolly and the camera orientation) as dolly setting metadata for each frame. The dolly device 14 does not necessarily have to be provided; it is unnecessary, for example, when the imaging device 10 is installed on a crane or the like for shooting from above, or when a cameraman shoots while carrying the imaging device 10.
[0039]
The lens setting metadata and the dolly setting metadata generated as described above are output to the metadata adding device 40 via, for example, an RS-232C cable.
[0040]
The sound collecting device 18 is composed of, for example, a microphone and can generate and output an audio signal. More specifically, the sound collecting device 18 collects audio information such as background sounds and the voices of actors during shooting by the imaging device 10 and generates an audio signal. This audio signal is output to the VTR 50, for example. The sound collecting device 18 may be included in the imaging device 10.
[0041]
For example, the CCU 20 can convert the video signal input as an optical signal from the imaging device 10 into an HD SDI signal and output the signal to the metadata adding device 40 via an HD SDI cable. The CCU 20 can also acquire camera setting metadata from the video signal via an optical fiber cable or the like, for example. Note that the CCU 20 does not necessarily have to be provided as a device separate from the imaging device 10, and may be incorporated in the imaging device 10, for example. In particular, when the imaging device 10 is configured to output the video signal in, for example, the HD SDI format, the CCU 20 is not an essential device.
[0042]
The metadata input terminal device 30 includes, for example, an information processing device such as a personal computer and its peripheral devices, and can generate scene information metadata based on user input. This scene information metadata is, for example, metadata relating to a scene shot by the imaging device 10, and is the kind of information (scene number, take number, etc.) that is written on an electronic clapperboard or the like in conventional shooting. For example, when a director or the like inputs the scene number of a scene about to be shot, the metadata input terminal device 30 generates scene information metadata corresponding to the scene number and the like, and outputs it to the metadata adding device 40 via an RS-232C cable or the like. Note that a cameraman, director, or the like can also use the metadata input terminal device 30 to additionally input, for example, a comment made when the video material was recorded (such as a note on the shooting situation) as scene information metadata.
[0043]
The metadata adding device 40 is a characteristic device according to the present embodiment. For example, the metadata adding device 40 can add the metadata to the video signal in units of frames. More specifically, lens setting metadata, dolly setting metadata, scene information metadata, and the like are input to the metadata adding device 40 from, for example, the lens device 12, the dolly device 14, and the metadata input terminal device 30, respectively. The metadata adding device 40 packs this metadata by grouping it, according to purpose of use, into a plurality of metadata groups such as a lens setting group, a dolly setting group, and a scene information group. Further, for example, the metadata adding device 40 can sequentially insert the metadata of the lens setting group, the dolly setting group, and the scene information group grouped in this manner into the blanking area of the video signal input from the CCU 20, frame by frame. The video signal to which all of the metadata has been added in this way is output to the VTR 50 via, for example, an HD SDI cable.
[0044]
The metadata adding device 40 is supplied with a reference signal (reference synchronization signal) from the reference signal generating device 72 and a time code signal (LTC: Linear Time Code) from the time code signal generating device 74. The LTC can also be output to the VTR 50.
[0045]
The VTR 50 is configured as a video signal recording device according to the present embodiment. For example, the VTR 50 can record the video signal input from the metadata adding device 40 and the audio signal input from the sound collecting device 18 on a storage medium such as a video tape 52. The VTR 50 can also reproduce video signals and the like recorded on the video tape 52. For example, the VTR 50 can output the video signal input from the metadata adding device 40 to the metadata synthesizing device 60 as it is, or output the video signal reproduced from the video tape 52 to the metadata synthesizing device 60.
[0046]
In this embodiment, the video tape 52 is used as the storage medium. However, the present invention is not limited to this example. For example, any storage medium such as various magnetic tapes, magnetic disks, optical disks, and memory cards may be used. Further, the video signal recording device is not limited to the example of the VTR 50, and can be replaced with a device (a disk device, various readers/writers, etc.) corresponding to such storage media.
[0047]
The metadata synthesizing device 60 is, for example, a decoder device that extracts and decodes the metadata added to the video signal as described above and synthesizes it with the video signal. More specifically, the metadata synthesizing device 60 can extract, for example, all or part of the metadata added to the video signal input from the VTR 50 in units of frames. Further, the metadata synthesizing device 60 can decode the extracted metadata, convert it into video data, and then synthesize it with the video signal in units of frames. Here, synthesizing means, for example, multiplexing (for example, superimposing) the video signal and the video data of the metadata in units of frames.
[0048]
The display device 70 is a display device such as an LCD (Liquid Crystal Display) or a CRT (Cathode Ray Tube). When the video signal obtained by synthesizing the metadata is input from the metadata synthesizing device 60, the display device 70 can display a video in which the metadata is superimposed.
[0049]
<1.2 Contents of metadata>
Next, the grouped metadata according to the present embodiment will be described in detail. In the present embodiment, the various metadata related to the video material as described above is grouped into, for example, four metadata groups according to the purpose of use, and is transmitted, recorded, and managed in that form. Hereinafter, the metadata included in each of these four metadata groups will be described in detail.
[0050]
<1.2.1 Scene information group>
First, scene information metadata included in a scene information group will be described in detail with reference to FIG. 2. FIG. 2 is an explanatory diagram showing a specific example of scene information metadata included in the scene information group according to the present embodiment.
[0051]
As shown in FIG. 2, the scene information metadata included in the scene information group is various kinds of metadata related to a scene shot by the imaging device 10, including information conventionally displayed on an electronic clapperboard (slate) or the like, such as “time code”, “scene number”, and “take number”.
[0052]
“Time code” is time information consisting of hours, minutes, seconds, a frame number, and the like, as represented by LTC. Conventionally, this “time code” has been recorded in the longitudinal direction of an audio track of the video tape 52, for example. In the present embodiment, this “time code” is generated by the time code signal generating device 74 and is added to the blanking area of the video signal for each frame by, for example, the metadata adding device 40. The position of the video signal can be specified by this time code. The data amount of this “time code” is, for example, 16 bytes.
“Date” is text information indicating the date on which the image was taken, and the amount of data is, for example, 4 bytes.
“Video work title” is text information representing the title of the video work, and the amount of data is, for example, 30 bytes.
“Photographing team number” is an ID number or the like for identifying the photographing team (crew) in charge of the photographing, and the data amount is, for example, 2 bytes.
The “scene number” is a number for specifying the scene being shot among the plurality of scenes constituting the video work, and the data amount is, for example, 2 bytes. By referring to the “scene number”, it is possible to identify which scene in the video work the shot video material corresponds to. Note that, for example, a cut number obtained by further subdividing a scene can also be added as scene information metadata.
The “take number” is a number for specifying a take, which is a continuous video unit from one start of recording to the end of that recording by the imaging device 10, and the data amount is, for example, 2 bytes. By referring to the “take number”, it is possible to identify which take of which scene the recorded video signal corresponds to.
“Roll number” is a number for specifying a roll (Roll), which is a video unit obtained by further subdividing the take, and its data amount is, for example, 2 bytes. “Cameraman”, “Director”, and “Producer” are text information representing the name of the cameraman, director, and producer who are in charge of shooting, respectively, and the amount of these data is, for example, 16 bytes.
[0053]
In this way, in the scene information group, for example, metadata that can be attribute information and index information of the recorded video is collected. This scene information metadata is useful information for grasping the content of the video material, identifying and managing the video material, for example, in the video recording stage, the post-processing stage, and the editing stage.
[0054]
<1.2.2 Camera setting group>
Next, the camera setting metadata included in the camera setting group will be described in detail with reference to FIG. 3. FIG. 3 is an explanatory diagram showing a specific example of camera setting metadata included in the camera setting group according to the present embodiment.
[0055]
As shown in FIG. 3, the camera setting metadata included in the camera setting group is, for example, metadata representing various shooting conditions mainly including setting information of the imaging device 10 when a video is shot.
[0056]
“Camera ID” is a serial number (device number) for specifying the imaging device 10 that has performed the imaging process, and the data amount is, for example, 4 bytes.
“CHU switch ON / OFF” is bit information indicating whether or not the setting of the imaging apparatus 10 is changed from the standard setting as described below, and the data amount is, for example, 1 byte.
“CCU ID” is a serial number (device number) for specifying the CCU 20 that has performed the imaging process, and its data amount is, for example, 4 bytes.
“Filter setting” is information indicating the filter setting of the imaging device 10 at the time of shooting, and the data amount is, for example, 2 bytes. In the present embodiment, for example, the imaging device 10 is provided with two sets of five types of filters, and this information represents which two of these filters were combined for shooting.
“Shutter speed” is information indicating the setting value of the shutter speed of the imaging apparatus 10 at the time of shooting, and the data amount is, for example, 1 byte. In the present embodiment, this “shutter speed” can be set in six stages, for example, between “1/100” and “1/2000” seconds.
“Gain” is information representing the setting value of the gain of the imaging device 10 at the time of shooting, and the data amount is, for example, 1 byte.
“ECS” is information indicating ON / OFF of the ECS (Extended Clear Scan) function of the imaging apparatus 10 at the time of shooting, and the data amount is, for example, 2 bytes.
“Gamma (master)” is information indicating the setting of the gamma characteristic (gamma curve or the like) of the imaging apparatus 10 at the time of shooting, and the data amount is, for example, 2 bytes. “Gamma (user setting)” is information indicating the setting of the gamma characteristic when the gamma curve or the like is changed by the user setting, and the data amount is, for example, 1 byte.
“Variable frame rate” is information indicating the frame rate setting value of the video signal shot by the imaging device 10 capable of variable speed shooting, and the data amount is, for example, 1 byte. The imaging device 10 according to the present embodiment can shoot while varying the frame rate over, for example, 23.98 to 30P, but is not limited to this example, and may be configured to be capable of variable speed shooting at, for example, 1 to 60P.
“Video signal white level” is information representing the white level setting value of the video signal by the white balance adjustment processing of the imaging apparatus 10 at the time of shooting, and the data amount is, for example, 6 bytes.
“Video signal black level” is information indicating the set value of the black level of the video signal by the black balance adjustment processing of the imaging apparatus 10 at the time of shooting, and the data amount is, for example, 8 bytes.
“Detail level” is information representing a setting value of the detail level by the detail adjustment processing of the imaging apparatus 10 at the time of shooting, and the data amount is, for example, 2 bytes.
“Knee point” is information indicating the set value of the knee point of the video signal compressed by the knee circuit of the imaging apparatus 10 at the time of shooting, and the amount of data is, for example, 2 bytes.
“Knee slope” is information indicating the set value of the knee slope of the video signal compressed by the knee circuit of the imaging device 10 at the time of shooting, and the amount of data is, for example, 2 bytes.
“Recorder status” is information indicating the setting value of the frame rate when the video signal recording / reproducing apparatus such as the VTR 50 records the video signal, and the data amount is, for example, 1 byte. The “recorder status” is determined in correspondence with the “variable frame rate”.
[0057]
In this way, in the camera setting group, for example, metadata regarding shooting conditions such as setting information of the imaging device 10 at the time of shooting is collected. The camera setting metadata is useful information for grasping the image quality (brightness, hue, texture, etc.) of the video material at the post-processing stage of the video material, for example.
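As a quick consistency check, the example byte sizes listed above for the camera setting group can be tallied. The following is a minimal sketch: the dictionary and key names are hypothetical labels introduced for illustration, and only the byte counts come from the text.

```python
# Example byte sizes quoted in the text for the camera setting group.
# The key names are illustrative only; the source defines no identifiers.
CAMERA_SETTING_FIELD_SIZES = {
    "camera_id": 4,
    "chu_switch": 1,
    "ccu_id": 4,
    "filter_setting": 2,
    "shutter_speed": 1,
    "gain": 1,
    "ecs": 2,
    "gamma_master": 2,
    "gamma_user": 1,
    "variable_frame_rate": 1,
    "video_signal_white_level": 6,
    "video_signal_black_level": 8,
    "detail_level": 2,
    "knee_point": 2,
    "knee_slope": 2,
    "recorder_status": 1,
}

# Total number of bytes occupied by the listed camera setting fields.
camera_group_bytes = sum(CAMERA_SETTING_FIELD_SIZES.values())
print(camera_group_bytes)
```

The total equals the 40-byte upper limit that the metadata format section gives for the camera setting group.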
[0058]
<1.2.3 Lens setting group>
Next, based on FIG. 4, the lens setting metadata included in the lens setting group will be described in detail with a specific example. FIG. 4 is an explanatory diagram showing a specific example of lens setting metadata included in the lens setting group according to the present embodiment.
[0059]
As shown in FIG. 4, the lens setting metadata included in the lens setting group is, for example, metadata representing various shooting conditions mainly including setting information of the lens device 12 at the time of video shooting.
[0060]
“Zoom” is information representing a zoom setting value obtained by the photographing magnification adjustment processing of the lens apparatus 12 at the time of photographing, and the data amount is, for example, 2 bytes.
“Focus” is information representing a focus setting value by the focal length adjustment processing of the lens device 12 at the time of photographing, and the data amount is, for example, 2 bytes.
“Iris” is information representing an iris (aperture) setting value by exposure adjustment processing of the lens device 12 at the time of photographing, and the data amount is, for example, 2 bytes.
“Lens ID” is a serial number (device number) for specifying the lens device 12 used for photographing, and the data amount is, for example, 4 bytes.
[0061]
In this way, in the lens setting group, for example, metadata relating to shooting conditions such as setting information of the lens device 12 at the time of shooting is collected. The lens setting metadata is useful information for grasping, for example, the movement of the subject photographed with the video material, the distance from the imaging device 10 and the like in the post-processing stage of the video material.
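The four lens setting items above add up to 10 bytes, which can be illustrated by packing them into a fixed-width record. This is only a sketch under assumptions: the source gives the byte sizes but not the field order, the byte order, or the numeric encoding, so the big-endian unsigned-integer layout and the function names below are hypothetical.

```python
import struct

# Assumed layout (not specified in the source): zoom, focus, and iris as
# 2-byte unsigned integers and the lens ID as a 4-byte unsigned integer,
# big-endian, in the order the items are listed in the text.
LENS_LAYOUT = struct.Struct(">HHHI")


def pack_lens_settings(zoom, focus, iris, lens_id):
    """Pack one frame's lens settings into a 10-byte record."""
    return LENS_LAYOUT.pack(zoom, focus, iris, lens_id)


def unpack_lens_settings(payload):
    """Recover the lens settings from a 10-byte record."""
    zoom, focus, iris, lens_id = LENS_LAYOUT.unpack(payload)
    return {"zoom": zoom, "focus": focus, "iris": iris, "lens_id": lens_id}


# Arbitrary illustrative values, not taken from the source.
payload = pack_lens_settings(zoom=1200, focus=350, iris=56, lens_id=0x0001A2B3)
settings = unpack_lens_settings(payload)
```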
[0062]
<1.2.4 Dolly setting group>
Next, the dolly setting metadata included in the dolly setting group will be described in detail with reference to FIG. 5. FIG. 5 is an explanatory diagram showing a specific example of the dolly setting metadata included in the dolly setting group according to the present embodiment.
[0063]
As shown in FIG. 5, the dolly setting metadata included in the dolly setting group is, for example, metadata representing various shooting conditions and the like mainly including setting information of the dolly device 14 at the time of video shooting.
“GPS” is latitude and longitude information (Global Positioning System information) for specifying the position of the dolly device 14 (that is, the position of the imaging device 10) at the time of shooting, and the data amount is, for example, 12 bytes.
“Moving direction” is information indicating, as an angle, the moving direction of the dolly device 14 at the time of shooting (that is, the moving direction of the imaging device 10), and the data amount is, for example, 4 bytes.
“Movement speed” is information indicating the movement speed of the dolly device 14 at the time of photographing (that is, the movement speed of the imaging device 10), and the data amount is, for example, 4 bytes.
“Camera direction” is information indicating the shooting direction of the imaging device 10, and is expressed by the rotation angle (swing angle) of the imaging device 10 with the fixed dolly device 14 as a reference. Specifically, for example, the shooting direction of the imaging device 10 is expressed as rotation angles in three directions: “pan” (Z-axis direction), “tilt” (Y-axis direction), and “roll” (X-axis direction). The data amount of each of these three items is, for example, 2 bytes.
“Dolly height” is information indicating the height of the dolly device 14, and its data amount is, for example, 2 bytes. With this information, the position of the imaging device 10 in the vertical direction can be specified.
“Dolly ID” is a serial number (device number) for specifying the dolly device 14 used for shooting, and its data amount is, for example, 4 bytes.
[0064]
In this way, the dolly setting group collects, for example, metadata regarding shooting conditions, including setting information such as the position and movement of the dolly device 14 at the time of shooting. Like the lens setting metadata, this dolly setting metadata is useful information for grasping the movement, distance, and the like of the subject appearing in the video material in, for example, the post-processing stage of the video material.
[0065]
The contents of the metadata groups (for example, four) according to this embodiment have been described above. By grouping metadata in this way, only the necessary metadata can be suitably extracted in units of groups according to the purpose of use of the metadata, and then used, rewritten, and so on.
[0066]
For example, in the video recording stage, the metadata of the scene information group including the scene number and time code is extracted and used for the purpose of identifying and grasping the video that is being recorded or has been recorded. Further, in the post-processing stage of the video material, the metadata of the camera, lens, and dolly setting group is useful when a CG video is synthesized with a live-action video. Specifically, the metadata of the camera setting group is extracted and used for the purpose of grasping the image quality of the video material. Further, the metadata of the lens setting group and the dolly setting group is extracted and used for the purpose of grasping the movement of the subject in the video material.
[0067]
As described above, the lens setting group and the dolly setting group share a common purpose of use. For this reason, the lens setting group and the dolly setting group need not be configured as separate groups as in the present embodiment. For example, the lens setting metadata and the dolly setting metadata may be combined into a single lens/dolly setting group.
[0068]
<1.3 Metadata format>
Next, a metadata format according to this embodiment will be described with reference to FIG. 6. FIG. 6 is an explanatory diagram for explaining a metadata format according to the present embodiment.
[0069]
As described above, the metadata according to the present embodiment is grouped into, for example, four metadata groups. The metadata grouped in this way is added to the video signal in a predetermined format by the imaging device 10 and the metadata adding device 40, for example.
[0070]
More specifically, as shown in FIG. 6A, the metadata is packaged as ancillary data and inserted frame by frame into, for example, an ancillary data area in the vertical blanking area of the video signal. The format of the packaged metadata, for example at the time of transmission, is shown in FIG. 6B.
[0071]
As shown in FIG. 6B, the metadata is grouped into four metadata groups, for example, a scene information group, a camera setting group, a lens setting group, and a dolly setting group, and has a format in which these four metadata groups are arranged consecutively in series. Each metadata group is KLV (Key Length Value) encoded based on, for example, an SMPTE (Society of Motion Picture and Television Engineers) standard (SMPTE 291M or the like).
[0072]
“K (Key)” is, for example, a 1-byte key ID (reserved word) placed at the top of each metadata group. This “K” code is configured as group identification information according to the present embodiment, and functions as a code for identifying each metadata group. For example, in every frame of the video signal, “01” is always given to the scene information group, “02” is always given to the camera setting group, “03” is always given to the lens setting group, and “04” is always given to the dolly setting group, so that each metadata group has a unique, uniform identification code. By assigning a “K” code that is unique group identification information to each metadata group in this way, only a specific metadata group can be easily extracted from the plurality of metadata groups in each frame based on the group identification information.
[0073]
“L (Length)” is, for example, a 1-byte length code placed after the “K” code. This “L” code is configured as data amount information according to the present embodiment, and functions as a code representing the data amount of the metadata group that follows. For example, if the “L” attached to the scene information group of a certain frame is “124”, this indicates that the data amount of the scene information group in that frame is 124 bytes. By placing an “L” code, which is data amount information, before the contents of each metadata group in this way, metadata processing devices such as the metadata adding device 40 and the VTR 50 can grasp in advance the data amount of the metadata to be extracted or rewritten by referring to the “L” code. For this reason, the processing efficiency of the metadata extraction or rewriting process is improved.
[0074]
“Element” is, for example, a user data area (Value area) in which metadata of each actual metadata group is stored, and has a variable length.
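The K, L, and Element layout described above can be illustrated with a minimal encoder and parser. This is a sketch under assumptions rather than the format actually used by the devices: the key values follow the example codes in the text (with “04” assumed for the dolly setting group), and all function and variable names are hypothetical.

```python
# Example group identification codes; "dolly" = 0x04 is an assumption.
GROUP_KEYS = {"scene": 0x01, "camera": 0x02, "lens": 0x03, "dolly": 0x04}


def encode_group(key, value):
    """Prefix a group's Element bytes with a 1-byte K and a 1-byte L."""
    if len(value) > 255:
        raise ValueError("a 1-byte L code cannot describe more than 255 bytes")
    return bytes([key, len(value)]) + value


def iter_groups(packed):
    """Yield (key, value) pairs from consecutively arranged KLV groups."""
    pos = 0
    while pos + 2 <= len(packed):
        key, length = packed[pos], packed[pos + 1]
        start = pos + 2
        end = start + length
        if end > len(packed):
            raise ValueError("truncated metadata group")
        yield key, packed[start:end]
        # The L code lets the parser jump straight to the next group.
        pos = end


def extract_group(packed, wanted_key):
    """Return the Element bytes of one group, or None if it is absent."""
    for key, value in iter_groups(packed):
        if key == wanted_key:
            return value
    return None


# One frame's metadata with placeholder contents for three groups.
frame_metadata = (
    encode_group(GROUP_KEYS["scene"], b"S" * 5)
    + encode_group(GROUP_KEYS["camera"], b"C" * 3)
    + encode_group(GROUP_KEYS["lens"], b"\x01\x02")
)
lens_bytes = extract_group(frame_metadata, GROUP_KEYS["lens"])
```

Because each group carries its own length, a reader that wants only one group can skip the others without interpreting their contents.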
[0075]
In addition, before the metadata groups encoded in this way, flags and codes for defining and identifying the type of metadata to be transmitted are attached, such as an ancillary data flag (“Ancillary Data Flag”), a data ID (“DID: Data Identification”), a secondary data ID (“SDID: Secondary Data Identification”), and a data counter (“DC: Data Counter”). On the other hand, codes for error detection at the time of transmission, such as “CRC: Cyclic Redundancy Check” and “CHECK SUM”, are attached after the metadata groups.
[0076]
Incidentally, the SMPTE standard specifies that, when KLV-encoded metadata is packed and inserted into an ancillary data area of a video signal, the size of one ancillary data packet is 255 bytes. Therefore, in the metadata format according to the present embodiment, the total data amount of the grouped metadata is adjusted to be, for example, 255 bytes or less so as to conform to this standard. Specifically, for example, the metadata amount of the scene information group is adjusted to 124 bytes or less, that of the camera setting group to 40 bytes or less, that of the lens setting group to 10 bytes or less, and that of the dolly setting group to 52 bytes or less. As a result, the total amount of metadata in one ancillary data packet is set to, for example, about 226 bytes or less.
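The arithmetic in the paragraph above can be checked with a short sketch. The per-group limits and the 255-byte packet size come from the text; the variable names are hypothetical, and whether the 1-byte K and L codes are counted inside or outside the per-group limits is not stated, so the sketch shows the total both ways.

```python
PACKET_LIMIT = 255  # packet size stated in the text

# Per-group upper limits stated in the text, in bytes.
GROUP_LIMITS = {"scene": 124, "camera": 40, "lens": 10, "dolly": 52}

# Assumption: one K byte and one L byte per group, counted separately.
KL_OVERHEAD_PER_GROUP = 2

metadata_total = sum(GROUP_LIMITS.values())
total_with_kl = metadata_total + KL_OVERHEAD_PER_GROUP * len(GROUP_LIMITS)

print(metadata_total, total_with_kl, total_with_kl <= PACKET_LIMIT)
```

Under either reading, the four groups fit in a single 255-byte packet.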
[0077]
As described above, in the metadata format according to the present embodiment, all the metadata is set to fit within one packet size (255 bytes) of the ancillary data. However, the present invention is not limited to this example. For example, a plurality of ancillary data packets may be concatenated, and metadata may be divided and packed into the plurality of packets.
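The alternative of dividing metadata across several ancillary data packets can be sketched as simple sequential chunking. The source does not describe how concatenated packets would be ordered or linked, so the fixed-size split below and its function names are assumptions for illustration only.

```python
def split_into_packets(data, max_payload=255):
    """Divide metadata bytes into consecutive chunks of at most max_payload."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


def join_packets(packets):
    """Reassemble the metadata by concatenating the chunks in order."""
    return b"".join(packets)


# 600 bytes of placeholder metadata, larger than one 255-byte packet.
oversized = bytes(range(200)) * 3
packets = split_into_packets(oversized)
restored = join_packets(packets)
```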
[0078]
As described above, in the metadata format according to the present embodiment, for example, the area allocated to the metadata is divided according to the number of metadata groups, and the metadata of each metadata group is inserted into its own divided area. Further, the group identification information and the data amount information are placed at the top of each metadata group. With this configuration, the metadata needed for a given purpose of use can be quickly and easily detected, extracted, or rewritten for each metadata group. For example, by sharing the group identification information as a common identification ID between the recording department and the editing department of the video work, the metadata can be suitably used in the production process of the video work.
[0079]
<1.4 Configuration of each device>
Next, main devices constituting the video recording system 1 as described above will be described in detail.
[0080]
<1.4.1 Imaging device>
First, the imaging device 10 according to the present embodiment will be described in detail with reference to FIG. 7. FIG. 7 is a block diagram illustrating a configuration of the imaging device 10 according to the present embodiment.
[0081]
As illustrated in FIG. 7, the imaging device 10 includes, for example, a CPU 100, a memory unit 102, an imaging unit 104, a signal processing unit 106, a display unit 108, a camera setting metadata generation unit 110, a metadata adding unit 112, a transmission/reception unit 114, the lens device 12, and the dolly device 14.
[0082]
A CPU (Central Processing Unit) 100 functions as an arithmetic processing device and a control device, and can control processing of each unit of the imaging device 10. The memory unit 102 includes, for example, storage devices such as various RAMs, ROMs, flash memories, and hard disks, and has a function of storing or temporarily storing various data related to the processing of the CPU 100, operation programs of the CPU 100, and the like.
[0083]
The imaging unit 104 is configured by, for example, an OHB (Optical Head Base), and has a function of imaging a subject and generating a video signal. More specifically, the imaging unit 104, for example, splits the optical image incident from the lens device 12 into R, G, and B with a prism (not shown), passes it through various filters (not shown), and then photoelectrically converts it at a predetermined shutter speed with an imaging element (not shown) such as a CCD (Charge Coupled Device) to generate a video signal that is an analog electrical signal.
[0084]
The signal processing unit 106 performs, on the video signal that is a weak analog electrical signal input from the imaging unit 104, gain adjustment (AGC) processing, correlated double sampling processing, A/D conversion processing, error correction processing, white balance adjustment processing, dynamic range compression processing, gamma correction processing, shading correction processing, detail adjustment processing, knee processing, and the like, and can thereby output a digital video signal. In the present embodiment, for example, an HD (High Definition) digital video signal is generated and output. The signal processing unit 106 can also convert the digital video signal into an analog video signal and output the analog video signal to the display unit 108, for example. In addition, the signal processing unit 106 can change the frame rate of the video signal to be output (for example, within 23.98 to 30P) based on, for example, preset conditions or an input operation by the cameraman.
[0085]
The display unit 108 is, for example, a viewfinder for a cameraman to view a subject, and includes a CRT monitor. The display unit 108 can display and output, for example, an analog video signal input from the signal processing unit 106. The display unit 108 may be composed of various display devices such as an LCD monitor, for example.
[0086]
The camera setting metadata generation unit 110 acquires and manages parameters such as setting information of the imaging unit 104 and signal processing setting information such as gamma, knee, and detail in the signal processing unit 106, for example. Further, the camera setting metadata generation unit 110 generates the camera setting metadata, for example, for each frame of the video signal based on these parameters, and outputs the generated camera setting metadata to the metadata adding unit 112.
[0087]
The metadata adding unit 112 is configured as one of the metadata adding devices according to the present embodiment. For example, the metadata adding unit 112 can add the camera setting metadata to the video signal for each frame in accordance with the timing at which the video signal is output to the outside of the imaging device 10. Specifically, the metadata adding unit 112, for example, KLV-encodes and packs the camera setting metadata input from the camera setting metadata generation unit 110. Further, the metadata adding unit 112 sequentially inserts the packed camera setting metadata, frame by frame, into the area assigned to the camera setting group in the blanking area of the video signal, as shown in FIG.
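The KLV packing described above can be illustrated with a minimal sketch. The text only states that each metadata group carries group identification information "K" and data amount information "L"; the one-byte key, the one-byte length, the key value, and the three camera fields below are assumptions made purely for illustration and are not taken from this document.

```python
import struct
from typing import Tuple

# Hypothetical group key; the document does not give actual key values.
CAMERA_SETTING_KEY = 0x02


def klv_pack(key: int, value: bytes) -> bytes:
    """Pack one metadata group as K (1 byte), L (1 byte), V (payload).

    The one-byte K and L fields are an assumption for this sketch.
    """
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in one byte")
    if len(value) > 0xFF:
        raise ValueError("value too long for a one-byte length field")
    return bytes([key, len(value)]) + value


def klv_unpack(packet: bytes) -> Tuple[int, bytes]:
    """Inverse of klv_pack for a single group."""
    key, length = packet[0], packet[1]
    value = packet[2:2 + length]
    if len(value) != length:
        raise ValueError("truncated KLV packet")
    return key, value


# Hypothetical per-frame camera settings (gamma, knee, detail).
settings = {"gamma": 0.45, "knee": 85, "detail": 3}
value = struct.pack(">fHH", settings["gamma"], settings["knee"], settings["detail"])
packet = klv_pack(CAMERA_SETTING_KEY, value)
```

In this sketch one such packet would be produced per frame and written into the camera setting group's area of the blanking interval.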
[0088]
At this time, the metadata adding unit 112 can insert dummy data into the areas corresponding to the groups other than the camera setting group, that is, the scene information group, the lens setting group, and the dolly setting group, for example, as shown in FIG.
[0089]
The camera setting metadata generation unit 110 and the metadata adding unit 112 described above may be configured as hardware, for example, or may be configured as software that realizes the above processing functions, in which case the program is stored in the memory unit 102 and the CPU 100 performs the actual processing.
[0090]
For example, the transmission / reception unit 114 transmits the video signal to which the camera setting metadata is added as described above to the CCU 20 via the optical fiber cable.
[0091]
The lens device 12 includes, for example, an optical system block 122, a drive system block 124, and a lens setting metadata generation unit 126.
[0092]
The optical system block 122 includes, for example, a plurality of lenses, a diaphragm, and the like, and can make an optical image from a subject incident on the imaging unit 104. The drive system block 124 can adjust zoom, iris, focus, and the like by adjusting the distance between the lenses and the aperture of the optical system block 122, for example.
[0093]
The lens setting metadata generation unit 126 acquires and manages parameters such as lens setting information of the drive system block 124, for example. Further, the lens setting metadata generation unit 126 generates the lens setting metadata, for example, for each frame based on such parameters. The lens setting metadata generated in this way is output to the metadata adding device 40 via, for example, an RS-232C cable.
[0094]
The dolly device 14 includes, for example, a dolly measurement unit 142 and a dolly setting metadata generation unit 144.
[0095]
The dolly measurement unit 142 measures various setting information related to the dolly device 14, such as GPS information, the moving speed and direction of the dolly device 14, and the angle of the imaging device 10, and outputs it to the dolly setting metadata generation unit 144.
[0096]
The dolly setting metadata generation unit 144 generates the dolly setting metadata, for example, for each frame based on the measurement information from the dolly measurement unit 142. The dolly setting metadata generated in this way is output to the metadata adding device 40 via, for example, an RS-232C cable.
[0097]
<1.4.2 Camera control unit>
Next, the CCU 20 according to the present embodiment will be described in detail with reference to FIG. 9, which is a block diagram showing the configuration of the CCU 20 according to the present embodiment.
[0098]
As illustrated in FIG. 9, the CCU 20 includes, for example, a CPU 200, a memory unit 202, a transmission / reception unit 204, a signal processing unit 206, and a serializer 208.
[0099]
The CPU 200 functions as an arithmetic processing unit and a control unit, and can control the processing of each unit of the CCU 20. A reference signal is input to the CPU 200, so that the video signal can be synchronized with other devices in the video recording system 1. The memory unit 202 includes, for example, storage devices such as various RAMs, ROMs, flash memories, and hard disks, and has a function of storing or temporarily storing various data related to the processing of the CPU 200, operation programs of the CPU 200, and the like.
[0100]
For example, the transmission / reception unit 204 receives a video signal to which camera setting metadata is added from the imaging device 10 and transmits the video signal to the signal processing unit 206.
[0101]
The signal processing unit 206 converts, for example, a video signal input as an optical signal into an HD SDI signal and outputs it to the serializer 208. The signal processing unit 206 can also be configured to have the processing function of the signal processing unit 106 of the imaging apparatus 10.
[0102]
For example, the serializer 208 performs parallel-serial conversion on the video signal received from the signal processing unit 206 and transmits the video signal to the metadata adding device 40 via the HD SDI cable. In the blanking area of the video signal output from the CCU 20, actual metadata is inserted only in the area corresponding to the camera setting group, and dummy data is inserted in the areas of the other metadata groups, for example, as shown in FIG.
[0103]
<1.4.3 Metadata addition device>
Next, the metadata adding device 40 according to the present embodiment will be described in detail with reference to FIG. 10, which is a block diagram showing the configuration of the metadata adding device 40 according to this embodiment.
[0104]
As shown in FIG. 10, the metadata adding device 40 includes, for example, a CPU 400, a memory unit 402, a metadata packing unit 404, a metadata encoder 406, a deserializer 408, a metadata insertion unit 410, and a serializer 412.
[0105]
The CPU 400 functions as an arithmetic processing device and a control device, and can control the processing of each unit of the metadata adding device 40. A reference signal is input to the CPU 400, so that the video signal can be synchronized with other devices in the video recording system 1. Further, a time code signal (LTC) is input to the CPU 400, and time code information, which is one item of scene information metadata, can be generated based on the LTC and stored in the memory unit 402. The LTC can also be output to the VTR 50.
[0106]
The memory unit 402 includes, for example, storage devices such as various RAMs, ROMs, flash memories, and hard disks, and has a function of storing or temporarily storing various data related to the processing of the CPU 400, operation programs of the CPU 400, and the like. The memory unit 402 also includes, for example, a metadata buffer memory 403 for temporarily storing metadata transmitted from each device.
[0107]
The metadata buffer memory 403 stores, for example, lens setting metadata sequentially transmitted from the lens device 12 after the start of shooting, dolly setting metadata sequentially transmitted from the dolly device 14 after the start of shooting, scene information metadata acquired in advance from the metadata input terminal device 30 before the start of shooting, time code information input from the CPU 400, and the like.
[0108]
The metadata packing unit 404, for example, extracts the necessary metadata from the various kinds of metadata stored in the metadata buffer memory 403, groups it by purpose of use into a plurality of metadata groups such as a lens setting group, a dolly setting group, and a scene information group, and repacks it into the KLV structure. The metadata packing unit 404 outputs the metadata packed in this way to the metadata encoder 406.
[0109]
The metadata encoder 406 encodes the metadata from the metadata packing unit 404. The metadata input to the metadata adding device 40 as described above is, for example, data in the RS-232C protocol format. Therefore, the metadata encoder 406, for example, converts the metadata into an ancillary data packet format and encodes it so that it can be inserted into an HD SDI video signal (see FIG. 6). Through this encoding, for example, the various flags and the CRC described above are added before and after the metadata.
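The wrapping step, in which flags are placed before the metadata and a CRC after it, can be sketched as follows. This is a simplified illustration only: the flag bytes, the DID and SDID values, and the use of a 32-bit CRC from Python's `zlib` module are all assumptions and do not reproduce the actual ancillary data packet format.

```python
import zlib

# Hypothetical placeholder values; not the real ancillary packet fields.
ANC_FLAG = b"\x00\xff\xff"
DID, SDID = 0x44, 0x04
HEADER = ANC_FLAG + bytes([DID, SDID])
CRC_LEN = 4


def encode_for_transmission(payload: bytes) -> bytes:
    """Prefix the KLV payload with flags and append a CRC (zlib.crc32 here)."""
    crc = zlib.crc32(HEADER + payload).to_bytes(CRC_LEN, "big")
    return HEADER + payload + crc


def decode_for_recording(packet: bytes) -> bytes:
    """Verify and strip the flags and CRC, returning only the payload."""
    header = packet[:len(HEADER)]
    payload = packet[len(HEADER):-CRC_LEN]
    crc = packet[-CRC_LEN:]
    if header != HEADER:
        raise ValueError("unexpected header")
    if zlib.crc32(header + payload).to_bytes(CRC_LEN, "big") != crc:
        raise ValueError("CRC mismatch")
    return payload


payload = bytes([0x03, 0x02, 0x12, 0x34])  # hypothetical KLV-packed lens group
packet = encode_for_transmission(payload)
```

The decoding half corresponds to what a recorder would do before writing the metadata to tape: the transmission-only fields are checked and then discarded.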
[0110]
The deserializer 408 performs serial-parallel conversion on the video signal input from the CCU 20 and outputs the video signal to the metadata insertion unit 410.
[0111]
The metadata insertion unit 410 sequentially inserts the metadata input from the metadata encoder 406 into the blanking area of the video signal input from the deserializer 408 for each frame.
[0112]
At this time, in the video signal input to the metadata insertion unit 410, the camera setting metadata of the camera setting group has already been inserted by the imaging device 10 into the area of the blanking area corresponding to the camera setting group, for example, as shown in FIG.
[0113]
On the other hand, dummy data has been inserted in the areas corresponding to the groups other than the camera setting group, that is, the scene information group, the lens setting group, and the dolly setting group. For this reason, as shown in FIG. 8B, the metadata insertion unit 410 inserts the metadata into the video signal by, for example, rewriting the dummy data with the actual scene information metadata, lens setting metadata, dolly setting metadata, and the like. During this rewriting process, the metadata insertion unit 410 detects the area of each metadata group based on, for example, the group identification information "K" and the data amount information "L" assigned to that area, so the rewriting can be performed efficiently. Further, when inserting metadata in this way, the metadata insertion unit 410 can also, for example, align the phase of the metadata to be inserted with the video signal to compensate for the delay between them.
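The rewriting of dummy data using "K" and "L" can be sketched as a scan over the blanking-area bytes that jumps from group to group by the length field. The one-byte K and L fields, the key values, and the requirement that the replacement have exactly the reserved length are assumptions of this sketch, not details given in the document.

```python
# Hypothetical group keys for illustration only.
SCENE_INFO, CAMERA, LENS, DOLLY = 0x01, 0x02, 0x03, 0x04


def rewrite_group(area: bytearray, key: int, new_value: bytes) -> bool:
    """Overwrite the value of the group whose K matches `key`, in place.

    Groups that do not match are skipped using their L field, so their
    contents are never inspected. Returns True if the group was found.
    """
    pos = 0
    while pos + 2 <= len(area):
        k, length = area[pos], area[pos + 1]
        if k == key:
            if len(new_value) != length:
                raise ValueError("replacement must match the reserved length")
            area[pos + 2:pos + 2 + length] = new_value
            return True
        pos += 2 + length  # jump to the next group
    return False


# One frame's blanking area: real camera data, dummy (zero) lens data.
area = bytearray(
    bytes([CAMERA, 2]) + b"\x10\x20" + bytes([LENS, 3]) + b"\x00\x00\x00"
)
rewrite_group(area, LENS, b"\x0a\x0b\x0c")
```

Because the replacement has the same length as the dummy data, the layout of the blanking area and the camera setting group inserted upstream are left untouched.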
[0114]
The serializer 412 performs parallel-serial conversion on the video signal to which the metadata is added for each frame by the metadata insertion unit 410 as described above, and transmits the video signal to the VTR 50.
[0115]
As described above, the metadata adding device 40 according to the present embodiment can additionally add scene information metadata, lens setting metadata, and dolly setting metadata to a video signal to which camera setting metadata has already been added.
[0116]
<1.4.4 Video tape recorder>
Next, the VTR 50 according to the present embodiment will be described in detail with reference to FIG. 11, which is a block diagram showing the configuration of the VTR 50 according to the present embodiment.
[0117]
As shown in FIG. 11, the VTR 50 includes, for example, a CPU 500, a memory unit 502, a deserializer 504, a signal processing unit 506, a metadata decoder 508, a recording / reproducing block 510, an ECC block 512, a metadata encoder 514, and a serializer 516.
[0118]
The CPU 500 functions as an arithmetic processing unit and a control unit, and can control the processing of each unit of the VTR 50. A time code signal (LTC) is input to the CPU 500. The memory unit 502 includes, for example, storage devices such as various RAMs, ROMs, flash memories, and hard disks, and has a function of storing or temporarily storing various data related to the processing of the CPU 500, operation programs of the CPU 500, and the like.
[0119]
The deserializer 504 performs serial-parallel conversion on the video signal input from the metadata adding device 40 and outputs the video signal to the signal processing unit 506.
[0120]
For example, the signal processing unit 506 can perform various processes on the video signal in order to suitably record / reproduce the video signal and the like on the video tape 52. For example, the signal processing unit 506 can compress / decompress the video signal based on MPEG (Moving Picture Experts Group) 1, MPEG2, MPEG4, or a DCT (Discrete Cosine Transform) system, as necessary. In addition, the signal processing unit 506 can adjust the recording / reproduction timing of each of the above signals, or can separate the video signal and the audio signal and add an ECC (Error Correcting Code). Further, the signal processing unit 506 can, for example, extract the metadata added to the video signal in units of frames or, conversely, insert decoded metadata into the video signal in units of frames.
[0121]
For example, the signal processing unit 506 can output the video signal input from the metadata adding device 40 directly to the serializer 516, or output the video signal reproduced from the video tape 52 to the serializer 516.
[0122]
The metadata decoder 508 decodes, for example, metadata extracted from the video signal. Specifically, the metadata decoder 508 removes, for example, the flags (Flag, DID, SDID, etc.) and the CRC attached to the metadata, which are unnecessary for recording, and outputs the resulting metadata to the CPU 500. For example, the CPU 500 adds an ECC to the metadata in the same manner as for the video signal and outputs the metadata to the recording / reproducing block 510.
[0123]
The recording / reproducing block 510 is composed of, for example, a video head and a driving mechanism (both not shown). The recording / reproducing block 510 can actually record / reproduce the video signal to which the metadata is added to the video tape 52. More specifically, the recording / reproducing block 510 can record, for example, a video signal, an audio signal, and metadata in units of one frame and sequentially record them in the recording area of the video tape 52. In addition, the recording / reproducing block 510 can sequentially reproduce, for example, a set of video signals, audio signals, and metadata recorded in the recording area of the video tape 52 in units of one frame.
[0124]
For example, the ECC block 512 performs error detection on a video signal or the like reproduced from the video tape 52 by the recording / reproducing block 510 based on the ECC. After the error detection is completed, the ECC block 512 outputs the reproduced metadata to the CPU 500 and the video signal and the audio signal to the signal processing unit 506, for example.
[0125]
The metadata encoder 514 encodes the reproduced metadata in a transmission format (with the above flag, CRC, etc.) and outputs the encoded metadata to the signal processing unit 506. For example, the signal processing unit 506 combines the video signal and audio signal input from the ECC block 512 with the metadata encoded by the metadata encoder 514 and outputs the combined signal to the serializer 516.
[0126]
The serializer 516 performs parallel-serial conversion on the video signal or the like input from the signal processing unit 506 and transmits the converted video signal to the metadata synthesis device 60.
[0127]
As described above, the signal processing unit 506, the metadata decoder 508, the CPU 500, the recording / reproducing block 510, and the like are configured as a recording unit according to the present embodiment, and the video signal to which the metadata is added is stored in the storage medium. Can be recorded.
[0128]
<1.4.5 Metadata Synthesizer>
Next, the metadata synthesis device 60 according to the present embodiment will be described in detail with reference to FIG. 12, which is a block diagram showing the configuration of the metadata synthesis device 60 according to this embodiment.
[0129]
As shown in FIG. 12, the metadata synthesis device 60 includes, for example, a CPU 600, a memory unit 602, a deserializer 604, a metadata extraction unit 606, a metadata decoder 608, a metadata video generation unit 610, a metadata video synthesis unit 612, and a serializer 614.
[0130]
The CPU 600 functions as an arithmetic processing device and a control device, and can control the processing of each unit of the metadata synthesis device 60. The memory unit 602 includes, for example, storage devices such as various RAMs, ROMs, flash memories, and hard disks, and has a function of storing or temporarily storing various data related to the processing of the CPU 600, operation programs of the CPU 600, and the like.
[0131]
The deserializer 604 performs serial-parallel conversion on the video signal input from the VTR 50 and outputs it to the metadata extraction unit 606.
[0132]
For example, the metadata extraction unit 606 extracts the metadata inserted in the blanking area of the video signal for each frame. At this time, the metadata extraction unit 606 need not extract all of the metadata inserted in the blanking area; it may extract only the metadata of a specific metadata group (for example, the scene information group), or only specific metadata within that group (for example, the time code, scene number, and take number). During this extraction process, the metadata extraction unit 606 locates the metadata group to be extracted based on the group identification information "K" and the data amount information "L" given to each metadata group, so the required metadata can be extracted efficiently.
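Selective, per-frame extraction of a single metadata group can be sketched as below. As before, the one-byte K and L fields, the key values, and the sample payloads are hypothetical and serve only to show how the length field lets unwanted groups be skipped.

```python
from typing import Dict, Iterator, Set, Tuple

# Hypothetical group keys for illustration only.
SCENE_INFO, CAMERA, LENS, DOLLY = 0x01, 0x02, 0x03, 0x04


def iter_groups(area: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (K, V) for each group, advancing by L without parsing V."""
    pos = 0
    while pos + 2 <= len(area):
        key, length = area[pos], area[pos + 1]
        yield key, area[pos + 2:pos + 2 + length]
        pos += 2 + length


def extract_groups(area: bytes, wanted: Set[int]) -> Dict[int, bytes]:
    """Return only the groups whose K is in `wanted`."""
    return {key: value for key, value in iter_groups(area) if key in wanted}


# Blanking areas of two consecutive frames (hypothetical contents).
frames = [
    bytes([SCENE_INFO, 3]) + b"S01" + bytes([CAMERA, 2]) + b"\x10\x20",
    bytes([SCENE_INFO, 3]) + b"S02" + bytes([CAMERA, 2]) + b"\x11\x21",
]
per_frame = [extract_groups(area, {SCENE_INFO}) for area in frames]
```

Since the result is indexed by frame, each extracted value stays aligned with the frame it was attached to, with no separate time-code matching step.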
[0133]
For example, the metadata extraction unit 606 outputs the metadata extracted in this way to the metadata decoder 608, while outputting the video signal to the metadata video synthesis unit 612 as it is.
[0134]
For example, the metadata decoder 608 decodes the metadata input from the metadata extraction unit 606 and outputs the decoded metadata to the metadata video generation unit 610.
[0135]
For example, the metadata video generation unit 610 can rewrite the metadata input from the metadata decoder 608 into video data for superimposition. That is, since the metadata decoded by the metadata decoder 608 is, for example, metadata in a text data format, the metadata video generation unit 610 converts this metadata into a video data format.
[0136]
For example, the metadata video synthesis unit 612 can sequentially synthesize, in units of frames, the metadata converted into video data by the metadata video generation unit 610 with the video signal input from the metadata extraction unit 606. In other words, the metadata video synthesis unit 612 can superimpose the metadata on the video signal by multiplexing the metadata, converted into video data, with the video signal in units of frames.
[0137]
The serializer 614 performs parallel-serial conversion on the video signal or the like input from the metadata video synthesis unit 612 and transmits the converted video signal to the display device 70.
[0138]
In this way, the metadata synthesis device 60 can take out the metadata inserted in the blanking area from the video signal being captured by the imaging device 10 or the video signal reproduced by the VTR 50, and superimpose it on that video signal. As a result, the display device 70 to which the video signal is input can display a video on which the metadata is superimposed.
[0139]
As a result, the director or the like can view, for example, the video being recorded by the imaging device 10, or the video reproduced by the VTR 50 after recording, together with the metadata relating to that video. For example, when the time code, scene number, take number, and the like are superimposed on the display, the director or the like can easily identify and confirm which scene, which take, and which time the video corresponds to while viewing it.
[0140]
<1.5 Video recording method>
Next, a video recording method according to the present embodiment using the video recording system 1 described above will be described with reference to FIG. 13, which is a timing chart for explaining the video recording method according to the present embodiment.
[0141]
As shown in FIG. 13A, when shooting is started, raw images are first sequentially incident on the imaging device 10. The imaging device 10 then sequentially generates video signals in units of frames: frame 0, frame 1, frame 2, and so on. At this time, the CCD or the like of the imaging device 10 scans the video by, for example, a progressive method. For this reason, the video signal output from the imaging device 10 has a delay of, for example, about one frame with respect to the raw video incident on the imaging device 10. As a result, as shown in FIG. 13B, the output of the CCU 20 is also delayed by, for example, about one frame.
[0142]
At substantially the same time as the generation of the video signal, the imaging device 10 generates camera setting metadata for each frame and, as shown in FIG. 13B, sequentially inserts it, frame by frame, into the blanking area of the video signal of the corresponding frame. Thereby, the imaging device 10 can add the metadata of the camera setting group to the video signal in units of frames while executing the imaging process that generates the video signal.
[0143]
In parallel with the imaging process of the imaging device 10, the lens device 12 and the dolly device 14 collect setting information at the time of the imaging process, generate the lens setting metadata and the dolly setting metadata, for example, for each frame, and sequentially output them to the metadata adding device 40.
[0144]
Further, video signals generated by the imaging device 10 and having camera setting metadata added for each frame are sequentially input to the CCU 20. The CCU 20 sequentially outputs the video signal to the metadata adding device 40 as shown in FIG.
[0145]
Further, as shown in FIG. 13C, the metadata adding device 40 sequentially inserts scene information metadata, lens setting metadata, and dolly setting metadata, frame by frame, into the blanking area of the video signal input from the CCU 20. In addition, the metadata adding device 40 adds, for example, time code information to the video signal for each frame as one item of scene information metadata. In this way, the metadata adding device 40 can add the metadata, grouped according to purpose of use, to the video signal in units of frames in parallel with the shooting process by the imaging device 10.
[0146]
Further, as shown in FIG. 13D, for example, video signals to which metadata has been added are sequentially input to the VTR 50 from the metadata adding device 40, and audio signals are sequentially input to the VTR 50 from the sound collecting device 18. The audio signal is, for example, temporarily stored in the memory unit 502 and recorded in synchronization with the video signal, in accordance with the delay of the video signal. The VTR 50 decodes the metadata of the video signal and then records it, together with the video signal and the synchronized audio signal, in units of frames on the video tape 52 or the like.
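The buffering of audio to match the delayed video can be sketched as a fixed-length delay line. The one-frame delay follows the "about one frame" figure mentioned earlier; the class name, the string frame labels, and the tick-based loop are assumptions of this sketch rather than the VTR 50's actual mechanism.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class DelayLine:
    """Holds items for a fixed number of frame periods before releasing them."""

    def __init__(self, delay_frames: int) -> None:
        self._buf: Deque[Optional[str]] = deque([None] * delay_frames)

    def push(self, item: Optional[str]) -> Optional[str]:
        """Store the newest item and return the one pushed delay_frames ago."""
        self._buf.append(item)
        return self._buf.popleft()


VIDEO_DELAY = 1  # assumed one-frame delay of the video path
audio_delay = DelayLine(VIDEO_DELAY)
recorded: List[Tuple[str, Optional[str]]] = []

for tick in range(4):
    # Audio for frame `tick` arrives immediately; video arrives one tick late.
    audio_in = f"a{tick}" if tick < 3 else None
    video_in = f"v{tick - VIDEO_DELAY}" if tick >= VIDEO_DELAY else None
    audio_out = audio_delay.push(audio_in)
    if video_in is not None:
        recorded.append((video_in, audio_out))
```

Each recorded pair then carries the video frame and the audio that was captured at the same moment.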
[0147]
As described above, in the video recording method according to the present embodiment, for example, various types of metadata are generated and grouped while the shooting process is performed by the imaging device 10, and the grouped metadata can be added, frame by frame, to the video signal generated by the shooting process and recorded on a storage medium.
[0148]
As described above, according to the video recording system 1 and the video recording method using it, the metadata related to the video signal generated by the imaging device 10 can be generated in real time during the shooting process, added to the video signal for each frame, and recorded on the same storage medium. For this reason, it is not necessary to indirectly link metadata recorded in a terminal device such as a PC with video material recorded on a storage medium by means of a time code or the like, as in the past. Instead, the video material and the metadata related to it can be directly linked and recorded. This is convenient because the video material and the metadata can be managed integrally. In addition, since it is not necessary to time-align the video material and the metadata when extracting the metadata, the necessary metadata can be efficiently extracted and used or rewritten.
[0149]
<2 Video editing device>
Next, the video editing apparatus and its processing method according to this embodiment will be described.
[0150]
<2.1 Video editing device>
First, an outline of the video editing apparatus according to the present embodiment will be described. The video editing apparatus is an apparatus for editing the video material shot and recorded as described above, and is provided in a studio or the like where the editing department is located. The editing department is, for example, a department that produces a complete package program (hereinafter referred to as a complete package) by editing the video material shot and recorded by the video recording system 1. The editing process includes, for example, rough editing of the video material, post-processing, main editing, and the like.
[0151]
The rough editing process is a process of cutting out the video that will constitute a video work from various video materials. Specifically, the necessary video materials are first collected from the plurality of video materials recorded by the video recording system 1, and then the necessary video portion is selected from the collected video materials. In this process, the time codes of the start position (In point) and the end position (Out point) are determined to specify the necessary video portion.
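Specifying a video portion by In and Out time codes amounts to simple frame arithmetic, sketched below. The `HH:MM:SS:FF` notation, the integer rate of 24 frames per second, non-drop-frame counting, and treating the Out point as exclusive are all assumptions of this sketch; the document does not state which conventions are used.

```python
def timecode_to_frames(tc: str, fps: int = 24) -> int:
    """Convert a non-drop-frame 'HH:MM:SS:FF' time code to a frame index."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    if not (0 <= mm < 60 and 0 <= ss < 60 and 0 <= ff < fps):
        raise ValueError(f"invalid time code: {tc}")
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def clip_length(in_tc: str, out_tc: str, fps: int = 24) -> int:
    """Number of frames from the In point up to (not including) the Out point."""
    start = timecode_to_frames(in_tc, fps)
    end = timecode_to_frames(out_tc, fps)
    if end < start:
        raise ValueError("Out point precedes In point")
    return end - start


in_point, out_point = "01:00:00:00", "01:00:10:00"  # hypothetical selection
frames_in_clip = clip_length(in_point, out_point)
```

With per-frame metadata attached to the video itself, the same frame range also selects the metadata belonging to the clip.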
[0152]
The post-processing is a process of changing the content of the live-action video material by performing a video composition process or a video correction process on the live-action video material shot and recorded.
[0153]
The main editing process is a process of creating complete package data for broadcast in a program or the like by joining the original data of the video materials that have undergone rough editing and post-processing, and performing final image quality adjustment.
[0154]
The video editing apparatus 3 according to the present embodiment is configured as an apparatus for performing post-processing, for example, among the editing processes. The post-processing performed by the video editing device 3 is, for example, processing for performing video composition processing, video correction processing, or the like on a live-action video material photographed and recorded by the video recording system 1.
[0155]
More specifically, the video composition process is, for example, a process of synthesizing a foreground video and a background video. A computer graphics video (hereinafter referred to as CG video) is used as at least one of the videos to be synthesized in this way. Specific examples of the CG video composition process include a process of synthesizing a live-action video of a person as the foreground with a CG video of a fictitious scene as the background and, conversely, a process of synthesizing a CG video as the foreground with a live-action video as the background.
[0156]
The video correction process is a process of performing various corrections on a live-action video material. Specific examples include changing the brightness or color of the video material to turn a daytime video into an evening video or a spring video into an autumn video, adding something that is not in the live-action video, and changing or deleting an image in the live-action video (for example, changing a person's hairstyle).
[0157]
Note that the video editing apparatus 3 may be configured to execute, for example, rough editing processing and / or main editing processing in addition to such post-processing.
[0158]
Next, the overall configuration of the video editing apparatus for executing the post-processing and the like described above will be described with reference to FIG. 14, which is a block diagram showing a schematic configuration of the video editing apparatus 3 according to the present embodiment.
[0159]
As shown in FIG. 14, the video editing apparatus 3 according to the present embodiment mainly includes, for example, a playback VTR 90, a composition video server 92, and an editing terminal device 80.
[0160]
The playback VTR 90 is a video tape recorder capable of recording / reproducing a video signal on a storage medium such as the video tape 52, for example, and is configured as the video signal reproduction device according to the present embodiment. The internal structure of the playback VTR 90 is substantially the same as the internal structure of the VTR 50 described with reference to FIG.
[0161]
The playback VTR 90 is loaded with, for example, a video tape 52 on which live-action video material shot and recorded by the video recording system 1 is recorded. As described above, the grouped metadata is added in units of frames to the video signal recorded on the video tape 52. The playback VTR 90 can play back the live-action video material recorded on the video tape 52 and output it to the editing terminal device 80. More specifically, the playback VTR 90 can reproduce from the video tape 52, for example, a video signal (live-action video material) to which various metadata has been added in units of frames as described above, and / or an audio signal. In addition, the playback VTR 90 can output the live-action video signal and the like reproduced in this way to the editing terminal device 80 via the HD SDI cable or the like.
[0162]
As described above, the main function of the playback VTR 90 is, for example, to play back a video signal from the video tape 52 and provide it to the editing terminal device 80, but the playback VTR 90 is not limited to this example. For example, it may of course also record a video signal input from the editing terminal device 80 or the like on the video tape 52.
[0163]
The composition video server 92 is a server for storing a composition video signal, for example, and includes a recording device such as a hard disk drive capable of storing large-capacity video signals. Further, the composition video server 92 can output the stored composition video signal to the editing terminal device 80 as necessary, for example. The composition video server 92 may be configured as a server device, including a personal computer and a recording device, that is separate from the editing terminal device 80, or may be configured as a disk device built into the editing terminal device 80.
[0164]
The composition video signal stored in the composition video server 92 is, for example, a video signal to be synthesized with the video signal reproduced by the playback VTR 90. In the present embodiment, the composition video signal is, for example, a CG video signal of a moving image or a still image made up of fictitious charts, figures, line drawings, pictures, and the like created using a computer. Such a CG video signal is generated by, for example, a computer on which software for creating CG video or the like is installed, and is stored in advance in the composition video server 92. In addition, various metadata related to the CG video signal at the time the CG was created (for example, the gamma, detail, and knee level of the CG image) are added to the CG video signal in units of frames.
[0165]
Note that the synthesis video signal is not limited to the above-described example of the CG video signal, and may be, for example, a live-action video signal. This makes it possible to synthesize live-action video images.
[0166]
The editing terminal device 80 comprises, for example, an information processing device such as a personal computer and its peripheral devices. The editing terminal device 80 can, for example, acquire the played-back live-action video signal from the playback VTR 90 and the CG video signal from the composition video server 92.
[0167]
Further, the editing terminal device 80 can extract, for example, the metadata added to a live-action video signal. At this time, based on preset extraction conditions and the like, the editing terminal device 80 can extract metadata not only in units of metadata groups but also as individual, predetermined items within a metadata group.
[0168]
Further, the editing terminal device 80 can display the extracted metadata in synchronization with the video signal in units of frames. The operator of the editing terminal device 80 can therefore operate the device while browsing the metadata displayed together with the video, and can apply various kinds of post-processing to the live-action video. Specifically, the editing terminal device 80 can perform, for example, video composition processing, such as compositing the live-action video signal with a CG video signal, and video correction processing, which corrects the live-action video signal.
[0169]
The video signal post-processed by the editing terminal device 80 in this manner is recorded, for example, in a recording device in the editing terminal device 80. However, the present invention is not limited to this example, and the post-processed video signal is recorded on a new storage medium such as a video tape by a recording VTR (not shown) connected to the editing terminal device 80, for example. Alternatively, it may be recorded in a post-processing video server (not shown) connected to the editing terminal device 80.
[0170]
<2.2 Configuration of editing terminal device>
Next, the configuration of the editing terminal device 80 according to the present embodiment will be described in detail with reference to FIG. 15. FIG. 15 is a block diagram showing a schematic configuration of the editing terminal device 80 according to the present embodiment.
[0171]
As shown in FIG. 15, the editing terminal device 80 includes a CPU 800, a memory unit 802, an input unit 804, a display unit 806, an audio output unit 808, an external interface 810, a recording device 811, a metadata extraction unit 812, a display control unit 814, a video composition processing unit 816, and a video correction processing unit 818.
[0172]
The CPU 800 functions as an arithmetic processing device and a control device, and can control the processing of each part of the editing terminal device 80. The memory unit 802 includes, for example, a RAM, a ROM, a flash memory, and the like, and stores various data related to the processing of the CPU 800, the operation program of the CPU 800, and the like.
[0173]
The input unit 804 includes, for example, general PC input devices (not shown) such as a mouse, a keyboard, and a touch panel, and a video editing input device (not shown). The video editing input device is equipped with, for example, various editing buttons such as a video playback button, a stop button, a rewind button, and a fast-forward button, as well as a jog dial and a lever used to adjust the video playback speed and to select the video material to be played back. By operating the video editing input device, the operator of the editing terminal device 80 can, for example, play back and display video signals in various ways and post-process them.
[0174]
The display unit 806 is a display device composed of, for example, a CRT monitor or an LCD monitor. The display unit 806 can display, for example, a video signal and the metadata corresponding to that video signal. As will be described in detail later, the display unit 806 can display the video signal played back by the playback VTR 90 and the metadata extracted from that video signal in synchronization in units of frames.
[0175]
The audio output unit 808 includes, for example, a sound generation device such as a speaker, an audio signal processing device, and the like, and can output audio based on the audio signal played back by the playback VTR 90.
[0176]
The external interface 810 transmits and receives data to and from peripheral devices connected to the editing terminal device 80 through an interface such as HD SDI, RS-232C, USB, or SCSI. The peripheral devices are, for example, the playback VTR 90 and the composition video server 92.
[0177]
The recording device 811 is a storage device composed of, for example, a hard disk drive or the like, and can store post-processed video signals, metadata, various programs, and the like.
[0178]
The metadata extraction unit 812 can extract, from the live-action video signal played back by the playback VTR 90, the metadata inserted for each frame in the blanking area of the video signal. At this time, the metadata extraction unit 812 need not extract all of the metadata inserted in the blanking area; it can, for example, extract only the metadata of a specific metadata group, or only specific metadata items within a metadata group. During this extraction processing, the metadata extraction unit 812 can determine the position and the data amount of the metadata group to be extracted from the group identification information "K" and the data amount information "L" given to each metadata group, so the necessary metadata can be extracted efficiently.
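The following is a minimal sketch of how such K/L-based extraction could work. It is not the implementation described in this specification: the byte layout (a one-byte group identifier "K" followed by a one-byte data amount "L"), the numeric group identifiers, and the function name `extract_groups` are all assumptions made for illustration.

```python
import struct
from typing import Dict, Iterable, Optional

# Hypothetical group identifiers ("K" values); the specification does not
# give numeric codes for the metadata groups.
SCENE, CAMERA, LENS, DOLLY = 0x01, 0x02, 0x03, 0x04


def extract_groups(payload: bytes,
                   wanted: Optional[Iterable[int]] = None) -> Dict[int, bytes]:
    """Walk K/L-prefixed groups in one frame's metadata payload.

    Assumed layout per group: 1 byte K, 1 byte L, then L bytes of data.
    Groups whose K is not in `wanted` are skipped using L alone, without
    touching their contents.
    """
    wanted_set = None if wanted is None else set(wanted)
    groups: Dict[int, bytes] = {}
    pos = 0
    while pos + 2 <= len(payload):
        key, length = struct.unpack_from("BB", payload, pos)
        start = pos + 2
        end = start + length
        if end > len(payload):
            raise ValueError("truncated metadata group")
        if wanted_set is None or key in wanted_set:
            groups[key] = payload[start:end]
        pos = end  # jump straight to the next group's K
    return groups


# One frame's payload with three groups (values are placeholders).
frame_payload = bytes([SCENE, 2, 0x10, 0x11,
                       CAMERA, 3, 0xA0, 0xA1, 0xA2,
                       LENS, 1, 0x55])
camera_only = extract_groups(frame_payload, wanted=[CAMERA])
```

Because each group states its own length, an unwanted group costs only a two-byte header read, which is one way to read the efficiency claim above.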
[0179]
In such an extraction process, the metadata extraction unit 812 is set to extract only necessary metadata according to the contents of post-processing, for example.
[0180]
Specifically, when video composition processing is performed and the color, brightness, texture, and the like of the video are to be matched between the live-action video material and the CG video, the necessary camera setting metadata in the camera setting group (setting information such as the shutter speed, gain, gamma, and detail of the imaging device 10 at the time of shooting) is extracted, for example. When video composition processing is performed and the motion of the video is to be matched between the live-action video material and the CG video, the necessary lens setting metadata in the lens setting group (setting information such as the zoom, focus, and iris of the lens device 12 at the time of shooting) and the necessary dolly setting metadata in the dolly setting group (the moving speed of the dolly device 14 during shooting, information on the camera direction (Pan, Tilt, Roll), and the like) are extracted, for example.
[0181]
In addition, when video correction processing is performed on live-action video material, the necessary camera setting metadata in the camera setting group (setting information such as the shutter speed, gain, gamma, and detail of the imaging device 10 at the time of shooting) is extracted, for example.
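The task-dependent selection described above can be pictured as a lookup table of extraction conditions. The sketch below assumes the metadata has already been decoded into nested dictionaries; the task names, item names, and the function `select_metadata` are hypothetical and serve only to illustrate the idea.

```python
from typing import Any, Dict, List

# Hypothetical extraction conditions: post-processing task -> group -> items.
EXTRACTION_CONDITIONS: Dict[str, Dict[str, List[str]]] = {
    "cg_image_quality": {"camera": ["shutter_speed", "gain", "gamma", "detail"]},
    "cg_motion": {"lens": ["zoom", "focus", "iris"],
                  "dolly": ["speed", "pan", "tilt", "roll"]},
    "video_correction": {"camera": ["shutter_speed", "gain", "gamma", "detail"]},
}


def select_metadata(frame_metadata: Dict[str, Dict[str, Any]],
                    task: str) -> Dict[str, Dict[str, Any]]:
    """Return only the groups and items that the given task needs."""
    selected: Dict[str, Dict[str, Any]] = {}
    for group, items in EXTRACTION_CONDITIONS[task].items():
        available = frame_metadata.get(group, {})
        selected[group] = {name: available[name]
                           for name in items if name in available}
    return selected


# Decoded metadata for one frame (placeholder values).
frame = {
    "scene": {"title": "Savanna", "scene_number": 12},
    "camera": {"shutter_speed": "1/60", "gain": 0, "gamma": 0.45,
               "detail": 3, "knee_point": 85},
    "lens": {"zoom": 35, "focus": 2.5, "iris": 4.0},
    "dolly": {"speed": 0.5, "pan": 10, "tilt": 0, "roll": 0},
}
motion_metadata = select_metadata(frame, "cg_motion")
```

Keeping the conditions in data rather than in code means an operator-facing tool could change what is extracted without altering the extraction routine itself.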
[0182]
The metadata extraction unit 812 can also extract, in units of frames, the metadata relating to the CG video that is added to the CG video signal input from the composition video server 92. At this time, the metadata extraction unit 812 is set, for example, to extract from the CG video signal the metadata items (for example, the gain and gamma of the CG video) that correspond to the metadata items (for example, the gain and gamma of the live-action video) extracted from the live-action video signal described above.
[0183]
The metadata extraction unit 812 outputs the metadata extracted as described above to the display control unit 814, for example.
[0184]
The display control unit 814 can control the display contents (video signal, metadata, and the like) of the display unit 806. Specifically, the display control unit 814 can, for example, convert the metadata extracted by the metadata extraction unit 812 into display data and then control the display unit 806 so that the live-action video signal and the converted metadata are displayed in synchronization in units of frames.
[0185]
The metadata conversion process performed by the display control unit 814 will now be described in more detail. First, the display control unit 814 decodes and interprets, for example, the metadata extracted for each frame by the metadata extraction unit 812. Next, the display control unit 814 converts the interpreted metadata into a form that is easy to view, for example, a table format.
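As a rough illustration of the second step, the sketch below lays out already-interpreted metadata items as a two-column text table. The function name `to_table` and the column layout are assumptions; the specification does not define the actual display format.

```python
from typing import Dict


def to_table(items: Dict[str, str]) -> str:
    """Lay out item names and values as an aligned two-column table."""
    if not items:
        return ""
    width = max(len(name) for name in items)
    # Pad each name to the widest one so the separator column lines up.
    return "\n".join(f"{name.ljust(width)} | {value}"
                     for name, value in items.items())


# Interpreted camera setting metadata for one frame (placeholder values).
camera_items = {"gain": "0 dB", "gamma": "0.45"}
table_text = to_table(camera_items)
```

In a display loop, such a table would be regenerated for each frame so that the values shown always belong to the frame on screen.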
[0186]
The display control unit 814 can then control the display unit 806 so that, for example, the metadata converted in this way is displayed together with the live-action video signal and is updated frame by frame in step with that signal. As a result, the display unit 806 can display the live-action video signal and its metadata in synchronization in units of frames.
[0187]
For this reason, the operator of the editing terminal device 80 can browse, for example, the metadata arranged in a table format together with the live-action video, and can therefore grasp the image quality, motion, and the like of the video material to be post-processed objectively and accurately.
[0188]
Further, when CG composition processing is performed, for example, the display control unit 814 can also control the display so that, in addition to the live-action video signal and its corresponding metadata described above, the CG video signal and its corresponding metadata (CG metadata) are displayed in synchronization in units of frames.
[0189]
Likewise, when video correction processing is performed, for example, the display control unit 814 can also control the display so that, in addition to the live-action video signal and its corresponding metadata described above, the corrected video signal and its corresponding metadata are displayed.
[0190]
The video composition processing unit 816 can, for example, perform video composition processing that composites the played-back live-action video signal with the composition video signal based on operator input. More specifically, based on the operator's operation, the video composition processing unit 816 can composite the live-action video signal input from the playback VTR 90 with the CG video signal input from the composition video server 92 using, for example, a key signal. In this case, the live-action video signal may be used as either the foreground or the background.
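The specification does not say how the key signal is applied. One common approach, shown here only as an assumed example, is a per-pixel linear mix in which the key value selects between foreground and background. The function name `key_mix` and the flat list-of-samples representation are illustrative choices.

```python
from typing import List, Sequence


def key_mix(foreground: Sequence[int],
            background: Sequence[int],
            key: Sequence[float]) -> List[int]:
    """Mix two equally sized sample rows: out = k * fg + (1 - k) * bg.

    A key value of 1.0 shows the foreground, 0.0 shows the background,
    and values in between blend the two (soft edges).
    """
    if not (len(foreground) == len(background) == len(key)):
        raise ValueError("foreground, background and key must be the same length")
    return [round(k * f + (1.0 - k) * b)
            for f, b, k in zip(foreground, background, key)]


# Three luminance samples: live-action foreground over a CG background.
mixed = key_mix([200, 200, 200], [100, 100, 100], [1.0, 0.0, 0.5])
```

Swapping the first two arguments corresponds to using the live-action video as the background instead of the foreground.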
[0191]
When video composition processing is executed by the video composition processing unit 816, the live-action video and the CG video are displayed on the display screen of the display unit 806, as described above, in an easy-to-view layout together with their corresponding metadata. The operator can therefore compare the metadata of the live-action video with the CG metadata while comparing the live-action video with the CG video. The image quality and motion of the CG video can thus be adjusted so that the two sets of metadata values match before the CG video is composited with the live-action video, which allows the CG composition processing to be executed objectively and suitably.
[0192]
The video correction processing unit 818 can, for example, perform video correction processing that corrects the played-back live-action video signal based on operator input. More specifically, based on the operator's operation, the video correction processing unit 818 can correct the live-action video signal input from the playback VTR 90 by, for example, adding, deleting, or changing part of the video, changing the background, or adjusting the image quality.
[0193]
When video correction processing is executed by the video correction processing unit 818, the live-action video and the corrected video are displayed on the display screen of the display unit 806, as described above, in an easy-to-view layout together with their corresponding metadata. The operator can therefore compare the metadata of the live-action video with the metadata of the corrected video while comparing the two videos. Because the live-action video can be corrected while referring to the level difference between the two sets of metadata, the video correction processing can be executed objectively and suitably.
[0194]
Each part of the editing terminal device 80 has been described above. Note that the metadata extraction unit 812, the display control unit 814, the video composition processing unit 816, the video correction processing unit 818, and the like described above may be configured, for example, as dedicated devices (hardware), as long as they can realize the processing functions described above, or may be configured as software by installing, on the editing terminal device 80 such as a computer, an application program that causes it to execute the above processing.
[0195]
<2.3 Processing Method of Video Editing Device>
Next, a processing method of the video editing apparatus 3 described above will be described with reference to FIG. 16. FIG. 16 is a flowchart showing the processing method of the video editing apparatus 3 according to the present embodiment. In the following, the processing flow of CG composition processing is described as a specific example, but the present invention is not limited to this example.
[0196]
As shown in FIG. 16, first, in step S100, a live-action video is played back by the playback VTR 90, and a CG video is read out by the composition video server 92. For example, based on an instruction from the editing terminal device 80, the playback VTR 90 plays back a predetermined live-action video signal from the video tape 52, on which the video material was recorded by the video recording system 1, and outputs it to the editing terminal device 80. Meanwhile, based on an instruction from the editing terminal device 80, the composition video server 92 reads out, from among the stored CG video signals, a predetermined CG video signal to be composited with the played-back live-action video signal, and outputs it to the editing terminal device 80.
Next, in step S102, the editing terminal device 80 extracts at least part of the metadata added to the video signals. The metadata extraction unit 812 of the editing terminal device 80 extracts, for each frame, the metadata inserted in the blanking area of the played-back live-action video signal and decodes it. During this extraction, the metadata extraction unit 812 can, based on extraction conditions specified by the operator, select and extract the metadata required for the CG composition processing from among the various metadata added to the live-action video signal, in units of metadata groups or of individual metadata items.
[0197]
Also, the metadata extraction unit 812 extracts, for example, CG metadata added to the CG video signal in units of frames at the same time as the above extraction processing for each frame. At this time, the metadata extraction unit 812 extracts, for example, metadata of the CG video signal corresponding to the metadata extracted from the live-action video signal.
[0198]
Next, in step S104, the video signals and the metadata are displayed in synchronization in units of frames. For example, the display control unit 814 causes the display unit 806 to display, frame by frame, the live-action video signal and the metadata extracted from it as described above. At the same time, the display control unit 814 causes the display unit 806 to display, frame by frame, the CG video signal and the CG metadata extracted as described above.
[0199]
Thereafter, in step S106, the CG composition processing is executed. While browsing the live-action video and its metadata and the CG video and its CG metadata displayed on the display unit 806 as described above, the operator of the editing terminal device 80 performs operations to correct, for example, the image quality and motion of the CG video to be composited and to composite the two videos. Based on the operator's operations, the video composition processing unit 816 corrects the CG video signal, for example, and then generates a video signal by compositing the live-action video signal with the corrected CG video signal.
[0200]
Specific examples of the video and metadata displayed on the display unit 806 of the editing terminal device 80 during the CG composition processing described above will now be described with reference to FIGS. 17 and 18. FIG. 17 is an explanatory diagram showing a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 when the image quality of the CG video is adjusted in the CG composition processing. FIG. 18 is an explanatory diagram showing a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 when the motion of the CG video is adjusted in the CG composition processing.
[0201]
As shown in FIG. 17, the display screen 807 of the display unit 806 shows, for example, a live-action video of "a lion walking on the savanna" as the foreground. Below the live-action video, the camera setting metadata corresponding to the displayed frame of the live-action video is shown. More specifically, the values of items in the camera setting group such as "recorder status", "shutter speed", "gain", "gamma", "detail level", and "knee point" are displayed in a table format.
[0202]
The display screen 807 also shows, for example, a CG video of "a space landscape" as the background. Below the CG video, the CG metadata corresponding to the displayed frame of the CG video is shown. The items of the CG metadata correspond to the items of the camera setting metadata of the live-action video.
[0203]
To composite such a live-action video and CG video into a video of a lion walking in outer space, the operator first corrects the image quality (color, brightness, texture, and the like) of the CG video. Specifically, the operator corrects, for example, the gamma, detail, and knee point values of the CG video so that the image quality of the CG video matches that of the live-action video. Because the operator can make these corrections while comparing the camera setting metadata of the live-action video with the CG metadata, the operator can judge the degree of correction for each parameter objectively and adjust the image quality with high accuracy.
[0204]
Next, the operator switches the metadata group displayed on the display unit 806 in order to correct the motion of the CG video. As a result of this switching, the lens setting metadata and the dolly setting metadata corresponding to the displayed frame of the live-action video are shown on the display screen 807 of the display unit 806, as shown in FIG. 18. More specifically, "zoom", "focus", and "iris" of the lens setting group and "camera direction (Pan, Tilt, Roll)" of the dolly setting group are displayed, for example. In addition, the CG metadata items corresponding to the lens setting metadata and the dolly setting metadata of the live-action video are displayed below the CG video.
[0205]
With this metadata displayed, the operator corrects the motion of the CG video. Specifically, the operator corrects, for example, the zoom, focus, iris, and camera direction of the CG video so that the motion of the CG video matches the motion of the live-action video. Because the operator can make these corrections while comparing the lens and dolly setting metadata of the live-action video with the CG metadata, the motion of the CG video can be adjusted with high accuracy.
[0206]
Next, a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 during video correction processing will be described with reference to FIG. 19. FIG. 19 is an explanatory diagram showing a specific example of the video and metadata displayed on the display unit 806 of the editing terminal device 80 during video correction processing. In the following, as a specific example of video correction processing, a process that turns a live-action video shot in the daytime into a night-time video by changing its image quality will be described.
[0207]
As shown in FIG. 19, the display screen 807 of the display unit 806 shows, for example, a live-action video of "a lion walking on the savanna in the daytime" as the video before correction. Below the live-action video, the camera setting metadata corresponding to the displayed frame of the live-action video is shown. More specifically, the values of items in the camera setting group such as "shutter speed", "gain", "gamma", "detail level", "knee point", and "knee slope" are displayed in a table format.
[0208]
Further, on the display screen 807, for example, an image of “a lion walking on a savanna at night” is displayed as a corrected image. In addition, camera setting metadata corresponding to the frame of the corrected video is displayed at the bottom of the corrected video.
[0209]
In this way, the live-action video before correction and the corrected video are displayed side by side, as are the camera setting metadata of the live-action video before correction and the camera setting metadata of the corrected video. The operator can therefore carry out the video correction processing suitably, using the level difference in each item of the camera setting metadata between the two videos as a basis for judgment. Specifically, the operator corrects the live-action video by adjusting, for example, its gamma, detail, and knee point values so that the corrected video becomes a suitable night-time video. Because the operator can make these corrections while comparing each item of the camera setting metadata before and after correction, the operator can judge the degree of increase or decrease of each item objectively and adjust the video with high accuracy.
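The "level difference" used as a basis for judgment can be computed item by item once both sets of camera setting metadata are available as numbers. The following sketch assumes numeric values and uses hypothetical item names; `level_differences` is an illustrative helper, not a function defined in this specification.

```python
from typing import Dict


def level_differences(before: Dict[str, float],
                      after: Dict[str, float]) -> Dict[str, float]:
    """Return after - before for every item present in both metadata sets."""
    # Rounding keeps binary floating-point noise out of the displayed value.
    return {name: round(after[name] - before[name], 3)
            for name in before if name in after}


# Camera setting metadata for the same frame (placeholder values).
daytime = {"gain": 0.0, "gamma": 0.45, "knee_point": 85.0}
night = {"gain": -6.0, "gamma": 0.40, "knee_point": 85.0}
diffs = level_differences(daytime, night)
```

The same comparison applies to CG composition, with the live-action metadata as `before` and the CG metadata as `after`.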
[0210]
Further, in the video correction processing shown in FIG. 19, a correction that deletes the "sun" portion of the live-action video and inserts a "moon" image is also performed, for example. In this way, video correction processing can delete something that was in the live-action video or add something that was not in it.
[0211]
As described above, the video editing apparatus 3 and its processing method use a video signal to which metadata has been added in units of frames, so the metadata required for post-processing of the video signal can be suitably extracted and displayed. Using the video editing apparatus 3, the video material can therefore be suitably post-processed.
[0212]
That is, on the storage medium 52 on which the video material is recorded by the video recording system 1, the video material and metadata relating to the video material are directly linked and recorded in units of frames. The video editing apparatus 3 can reproduce such video material and extract metadata directly attached to the reproduced video material. Therefore, unlike the prior art, it is not necessary to access and read metadata recorded separately from the video material, and to perform time alignment between the video material and the metadata using a time code. Therefore, since the video and metadata can be handled in an integrated manner, the metadata necessary for the post-processing of the video material can be extracted quickly and easily.
[0213]
Furthermore, the extracted metadata can be suitably displayed in synchronization with the video material in units of frames. For this reason, the operator can browse the video and the metadata corresponding to the video in units of frames, so that the image quality and movement of the video material can be objectively and accurately grasped.
[0214]
In addition, because the video material and the metadata are linked and recorded in units of frames, the metadata is cut out together with the video material even when the video material is cut at an in point and an out point, for example, in rough editing. The metadata can therefore be extracted and displayed continuously in synchronization with the video material, without any need to re-establish consistency between the two. Consequently, metadata can be extracted quickly and easily from edited video material as well and used for post-processing.
[0215]
Further, even when the frame rate of the recorded video signal is changed by the imaging device 10 performing variable speed shooting processing, metadata is added to the video signal in units of frames. Therefore, there is no deviation between the number of video signal frames per unit time and the number of recorded metadata. Therefore, even when post-processing such a variable speed video signal, the metadata can be extracted from the video material in units of frames and displayed. As described above, the video editing apparatus 3 can flexibly cope with video materials that have been subjected to editing processing or variable speed shooting.
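The two preceding points, robustness to cutting and to variable speed shooting, both follow from carrying the metadata inside each frame. The sketch below models this with a hypothetical `Frame` record and a `cut` helper; the field names and the inclusive in/out convention are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Frame:
    image: bytes                 # stand-in for the picture data
    metadata: Dict[str, float]   # metadata carried with this frame


def cut(frames: List[Frame], in_point: int, out_point: int) -> List[Frame]:
    """Keep frames from in_point to out_point (both inclusive)."""
    return frames[in_point:out_point + 1]


# Ten recorded frames; each frame's zoom value is tied to its image byte.
recorded = [Frame(bytes([i]), {"zoom": float(i)}) for i in range(10)]
clip = cut(recorded, 3, 6)

# However many frames were recorded per second, there is exactly one
# metadata record per frame, so the counts cannot drift apart.
metadata_count = sum(1 for f in recorded if f.metadata)
```

After the cut, every remaining frame still holds its own metadata, so no time-code lookup is needed to pair them again.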
[0216]
Also, as described above, various types of metadata are grouped according to the purpose of use and added to the video signal. Therefore, necessary metadata can be extracted and displayed quickly and easily in units of metadata groups.
[0217]
Also, when performing video composition processing such as CG composition processing, for example, by extracting and displaying camera setting metadata, the operator can accurately determine the image quality (brightness, hue, texture, etc.) of the video material. On the other hand, for example, by extracting and displaying the lens and dolly setting metadata, it is possible to accurately grasp the movement of the imaging device 10 and the subject during photographing. Furthermore, by displaying the metadata of the synthesized CG video at the same time as the metadata of the live-action video, it is possible to objectively grasp the image quality and movement of the CG video and correct the CG video accurately.
[0218]
Also, when video correction processing is performed, extracting and displaying, for example, the camera setting metadata allows the operator to grasp the image quality (brightness, hue, texture, and the like) of the video material accurately and correct the video suitably.
[0219]
Although a preferred embodiment of the present invention has been described above with reference to the accompanying drawings, the present invention is not limited to this example. It will be obvious to those skilled in the art that various changes or modifications can be conceived within the scope of the technical idea described in the claims, and it is understood that these naturally belong to the technical scope of the present invention.
[0220]
For example, in the video recording system 1 according to the above embodiment, the imaging device 10 adds the camera setting metadata to the video signal, but the present invention is not limited to this example. As shown in FIG. 20, for example, the camera setting metadata acquired by the CCU 20 from the imaging device 10 may be output to the metadata adding device 40 via an RS-232C cable or the like, on a path separate from the video signal. In this configuration, the scene information metadata, lens setting metadata, dolly setting metadata, and camera setting metadata generated in the system are all gathered in the metadata adding device 40, and the metadata adding device 40 adds them to the video signal collectively. With this configuration, all of the metadata can be added to the video signal in units of frames even when an imaging device 10 that has no metadata adding function is used.
[0221]
In the video recording system 1 according to the above embodiment, the metadata adding device 40 is configured as hardware separate from the imaging device 10, the CCU 20, the VTR 50, and the like, but the present invention is not limited to this example. For example, the metadata adding device 40 may be incorporated in any one or more of the imaging device 10, the CCU 20, the VTR 50, and the like. Further, the metadata synthesizing device 60 may be incorporated in, for example, the VTR 50 or the like. Incorporating the metadata adding device 40 and the metadata synthesizing device 60 in the VTR 50, the imaging device 10, or the like in this way reduces the number of devices in the system and saves the labor and time needed to connect the devices.
[0222]
Further, the imaging device 10 may be configured as an imaging device (camcorder or the like) having a function of recording a video signal in a storage medium, for example. Thereby, the imaging device 10 can be configured to have all the functions of the CCU 20, the metadata adding device 40, the VTR 50, and the like, for example.
[0223]
In the above embodiment, the lens setting metadata generated by the lens device 12 is output via an RS-232C cable or the like and added to the video signal by the metadata adding device 40, but the present invention is not limited to this example. For example, a lens device 12 capable of communicating lens setting information and the like with the main body of the imaging device 10 may be employed, so that the lens setting metadata and the like are input directly from the lens device 12 to the main body of the imaging device 10. The metadata adding unit 112 of the imaging device 10 can then be configured to add to the video signal not only the camera setting metadata but also, for example, the lens setting metadata acquired from the lens device 12.
[0224]
In the above embodiment, RS-232C, HD SDI, and the like are employed as interfaces for communicating the various metadata between devices, but the present invention is not limited to this example. For example, various interfaces such as USB (Universal Serial Bus), SCSI (Small Computer System Interface), serial SCSI, and GP-IB (General Purpose Interface Bus) may be used. In addition, communication between the above devices is not limited to wired communication; for example, the metadata and/or the video signal may be transmitted by wireless communication.
[0225]
In the above embodiment, the various metadata generated in the video recording system are grouped into four metadata groups, namely a scene information group, a camera setting group, a lens setting group, and a dolly setting group, but the present invention is not limited to this example. For example, the above four metadata groups may be combined arbitrarily according to the purpose of use, such as combining the lens setting group and the dolly setting group into a single lens and dolly setting group. Further, it is not necessary to provide all four of the above metadata groups; it is sufficient to provide one or more of them.
[0226]
A new metadata group other than the above may also be provided. Specifically, for example, an audio information group may be provided, and audio-related metadata such as recording method information (stereo, monaural, surround, and the like) and recording content information (for example, microphone 1 records the background sound and microphone 2 records the actor's voice) may be grouped in this audio information group.
[0227]
Further, the video recording system 1 according to the above embodiment includes the metadata composition device 60 and the display device 70. However, the present invention is not limited to this example, and these devices need not necessarily be provided.
[0228]
Further, the video editing apparatus 3 according to the above embodiment includes one playback VTR 90. However, the video editing apparatus 3 is not limited to this example and may, for example, include a plurality of playback VTRs 90. In that case, a plurality of types of live-action video signals are reproduced by the plurality of playback VTRs 90 and output to the editing terminal device 80, so that the editing terminal device 80 can combine the live-action video signals with one another. At this time, the metadata relating to each live-action video signal may be displayed in synchronization with that video signal.
[0229]
In the above embodiment, the video editing device 3 is composed of the editing terminal device 80, the playback VTR 90, and the composition video server 92. However, the present invention is not limited to such an example. For example, the video editing device 3 may be configured as hardware in which these three devices are integrated. In addition to the above, the video editing device 3 may include a recording VTR or a recording server.
[0230]
The video editing apparatus 3 may be configured as a comprehensive editing apparatus capable of executing various editing processes such as a rough editing process and a main editing process as well as the post-processing as described above.
[0231]
In the above-described embodiment, video composition processing such as CG composition processing and image correction processing have been described as examples of post-processing of the video material. However, the present invention is not limited to this example, and the apparatus may be configured so that various post-processes other than the above can be performed. Specifically, the editing terminal device 80 may have, for example, a function of deleting an abnormal video portion (for example, a portion containing excessive noise or a portion with abnormal luminance or hue), a function of superimposing subtitle data, CG still image data, or the like on the video material, and a function of fading the video material in or out.
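Two of the additional post-processing functions mentioned here, deleting portions with abnormal luminance and fading in, can be sketched as follows. This is only an illustrative sketch under assumptions: frames are represented as dictionaries with a hypothetical `mean_luma` field, and the function names and thresholds are not taken from the patent.

```python
def drop_abnormal_frames(frames, luma_min, luma_max):
    """Keep only frames whose mean luminance lies within [luma_min, luma_max]."""
    return [f for f in frames if luma_min <= f["mean_luma"] <= luma_max]


def fade_in_gains(num_frames):
    """Return a linear gain per frame, rising from 0.0 to 1.0."""
    if num_frames < 2:
        return [1.0] * num_frames
    return [i / (num_frames - 1) for i in range(num_frames)]


# Hypothetical frames; "mean_luma" is an assumed per-frame statistic.
frames = [
    {"index": 0, "mean_luma": 120},
    {"index": 1, "mean_luma": 250},  # treated as abnormally bright here
    {"index": 2, "mean_luma": 118},
]
kept = drop_abnormal_frames(frames, luma_min=16, luma_max=235)
gains = fade_in_gains(5)
```

Each gain would be multiplied into the pixel values of the corresponding frame; a fade-out uses the same list reversed.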
[0232]
[Effect of the invention]
As described above, the video editing apparatus according to the present invention can easily and quickly extract the metadata necessary for post-processing of video material from video material to which the metadata is directly added in units of frames, and can display the metadata and the video material in synchronization on a frame basis. Metadata and video material can also be handled in an integrated manner. Furthermore, metadata can be extracted and displayed flexibly even when the video material has been cut by editing or was shot at a variable speed. For these reasons, the operator can suitably perform post-processing of the video material using the video editing apparatus.
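The reason frame-embedded metadata tolerates cut editing can be shown with a minimal sketch. It assumes a hypothetical `Frame` record that carries its own metadata, with made-up field names and values; it models the idea only and is not the apparatus's actual signal format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One video frame with the metadata added to it (hypothetical model)."""
    image: str
    metadata: dict


def extract_metadata(frames):
    """Yield (image, metadata) pairs frame by frame, ready for synchronized display."""
    for frame in frames:
        yield frame.image, frame.metadata


# Six recorded frames; "zoom_mm" is an illustrative lens setting.
recorded = [Frame(f"img{i}", {"zoom_mm": 35 + i}) for i in range(6)]

# Cut editing removes frames 2 and 3. No time code alignment is needed
# afterwards, because each remaining frame still carries its own metadata.
edited = recorded[:2] + recorded[4:]

for image, metadata in extract_metadata(edited):
    print(image, metadata)
```

With a separate, time-code-linked metadata store, the same cut would require re-aligning the two records; here the pairing is preserved by construction, and the same holds when the number of frames per second varies.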
[Brief description of the drawings]
FIG. 1 is a block diagram showing a schematic configuration of a video recording system according to a first embodiment.
FIG. 2 is an explanatory diagram illustrating a specific example of scene information metadata included in a scene information group according to the first embodiment.
FIG. 3 is an explanatory diagram illustrating a specific example of camera setting metadata included in a camera setting group according to the first embodiment.
FIG. 4 is an explanatory diagram illustrating a specific example of lens setting metadata included in a lens setting group according to the first embodiment.
FIG. 5 is an explanatory diagram illustrating a specific example of dolly setting metadata included in a dolly setting group according to the first embodiment.
FIG. 6 is an explanatory diagram for explaining a metadata format according to the first embodiment.
FIG. 7 is a block diagram illustrating a configuration of the imaging apparatus according to the first embodiment.
FIG. 8 is an explanatory diagram for explaining a mode of adding metadata to a video signal according to the first embodiment.
FIG. 9 is a block diagram illustrating a configuration of a camera control unit according to the first embodiment.
FIG. 10 is a block diagram illustrating a configuration of a metadata adding apparatus according to the first embodiment.
FIG. 11 is a block diagram showing a configuration of a video tape recorder according to the first embodiment.
FIG. 12 is a block diagram illustrating a configuration of a metadata composition device according to the first embodiment.
FIG. 13 is a timing chart for explaining a video recording method according to the first embodiment.
FIG. 14 is a block diagram illustrating a schematic configuration of the video editing apparatus according to the first embodiment.
FIG. 15 is a block diagram illustrating a schematic configuration of an editing terminal device according to the first embodiment.
FIG. 16 is a flowchart illustrating a processing method of the video editing apparatus according to the first embodiment.
FIG. 17 is an explanatory diagram illustrating a specific example of video and metadata displayed on the display unit of the editing terminal device when adjusting the image quality of the CG video in the CG composition processing according to the first embodiment.
FIG. 18 is an explanatory diagram showing a specific example of video and metadata displayed on the display unit of the editing terminal device when adjusting the motion of the CG video in the CG composition processing according to the first embodiment.
FIG. 19 is an explanatory diagram illustrating a specific example of video and metadata displayed on the display unit of the editing terminal device during the video correction processing according to the first embodiment.
FIG. 20 is a block diagram illustrating a schematic configuration of a video recording system according to a modified example.
[Explanation of symbols]
1: Video recording system
3: Video editing device
10: Imaging device
12: Lens device
14: Dolly device
18: Sound collector
20: Camera control unit
30: Terminal device for metadata input
40: Metadata adding device
50: Video tape recorder
52: Video tape
60: Metadata synthesizer
70: Display device
80: Terminal device for editing
90: Video tape recorder for playback
92: Video server for composition
104: Imaging unit
108: Display unit
110: Camera setting metadata generation unit
112: Metadata addition unit
126: Lens setting metadata generation unit
144: Dolly setting metadata generation unit
403: Metadata buffer memory
406: Metadata packing unit
408: Metadata encoder
410: Metadata insertion unit
506: Signal processor
508: Metadata decoder
514: Metadata encoder
606: Metadata extraction unit
608: Metadata decoder
610: Metadata video generation unit
612: Metadata video composition unit
806: Display unit
807: Display screen
812: Metadata extraction unit
814: Display control unit
816: Video composition processing unit
818: Image correction processing unit

Claims

A video editing apparatus for post-processing a video signal recorded on a storage medium, comprising:
a video signal playback device that plays back, from the storage medium, the video signal to which at least one of first metadata that is setting information of an imaging device relating to the image quality of the video signal, second metadata that is setting information of a lens device at the time of shooting, or third metadata that is setting information relating to the position or motion of the imaging device at the time of shooting is added in units of frames;
a metadata extraction unit that extracts at least one of the first metadata, the second metadata, and the third metadata from the reproduced video signal in units of frames; and
a display control unit that displays the reproduced video signal and the extracted metadata on a display unit in synchronization in units of frames.

The video editing apparatus according to claim 1, wherein the post-processing is video composition processing that combines the reproduced video signal with a composite video signal.

The video editing apparatus according to claim 2, wherein the composite video signal is a computer graphics video signal.

The video editing apparatus according to claim 2, wherein:
metadata relating to the composite video signal is added to the composite video signal in units of frames;
the metadata extraction unit extracts the metadata relating to the composite video signal from the composite video signal in units of frames; and
the display control unit displays the reproduced video signal and the extracted metadata on the display unit in synchronization in units of frames, and displays the composite video signal and the extracted metadata relating to the composite video signal on the display unit in synchronization in units of frames.

The video editing apparatus according to claim 1, wherein the post-processing is image correction processing that corrects the reproduced video signal.

The video editing apparatus according to claim 5, wherein the display control unit displays the video signal before correction and the metadata relating to the video signal before correction on the display unit in synchronization in units of frames, and displays the corrected video signal and the metadata relating to the corrected video signal on the display unit in synchronization in units of frames.

The video editing apparatus according to claim 1, wherein the metadata added to the video signal is grouped into one or more metadata groups according to the intended use of the metadata.

The video editing apparatus according to claim 7, wherein the metadata groups include at least one of a camera setting group including setting information of the imaging device that generated the video signal, a lens setting group including setting information of a lens device included in the imaging device, or a dolly setting group including setting information of a dolly device included in the imaging device.

A processing method of a video editing apparatus for post-processing a video signal recorded on a storage medium, comprising:
a video signal playback step of playing back, from the storage medium, the video signal to which at least one of first metadata that is setting information of an imaging device relating to the image quality of the video signal, second metadata that is setting information of a lens device at the time of shooting, or third metadata that is setting information relating to the position or motion of the imaging device at the time of shooting is added in units of frames;
a metadata extraction step of extracting at least one of the first metadata, the second metadata, and the third metadata from the reproduced video signal in units of frames; and
a display control step of displaying the reproduced video signal and the extracted metadata on a display unit in synchronization in units of frames.

The processing method of the video editing apparatus according to claim 9, wherein the post-processing is video composition processing that combines the reproduced video signal with a composite video signal.

The processing method of the video editing apparatus according to claim 10, wherein the composite video signal is a computer graphics video signal.

The processing method of the video editing apparatus according to claim 10, wherein:
metadata relating to the composite video signal is added to the composite video signal in units of frames;
in the metadata extraction step, the metadata relating to the composite video signal is extracted from the composite video signal in units of frames; and
in the display control step, the reproduced video signal and the extracted metadata are displayed on the display unit in synchronization in units of frames, and the composite video signal and the extracted metadata relating to the composite video signal are displayed on the display unit in synchronization in units of frames.

The processing method of the video editing apparatus according to claim 9, wherein the post-processing is image correction processing that corrects the reproduced video signal.

The processing method of the video editing apparatus according to claim 13, wherein, in the display control step, the video signal before correction and the metadata relating to the video signal before correction are displayed on the display unit in synchronization in units of frames, and the corrected video signal and the metadata relating to the corrected video signal are displayed on the display unit in synchronization in units of frames.

The processing method of the video editing apparatus according to claim 9, wherein the metadata added to the video signal is grouped into one or more metadata groups according to the intended use of the metadata.

The processing method of the video editing apparatus according to claim 15, wherein the metadata groups include at least one of a camera setting group including setting information of the imaging device that generated the video signal, a lens setting group including setting information of a lens device included in the imaging device, or a dolly setting group including setting information of a dolly device included in the imaging device.