JP4185333B2

JP4185333B2 - Video distribution device and video reception device

Info

Publication number: JP4185333B2
Application number: JP2002251831A
Authority: JP
Inventors: 亮上崎; 忠司小林; 利紀樋尻; 義幸望月
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2001-09-07
Filing date: 2002-08-29
Publication date: 2008-11-26
Anticipated expiration: 2022-08-29
Also published as: JP2003179908A

Description

【０００１】
【発明の属する技術分野】
本発明は、スポーツ番組などの映像を配信したり、受信したりする映像配信装置および映像受信装置に関する。
【０００２】
【従来の技術】
通信ネットワークのインフラ整備の進展に伴い、スポーツ番組などの映像の配信、受信に関する技術が開発されつつある。このような映像の配信、受信に関する従来技術として、特開平７−９５３２２号公報（第１の公報）に開示されたビデオ情報配信システムと、特開平２−５４６４６号公報（第２の公報）に開示された番組配信装置がある。
【０００３】
第１の公報に開示されたビデオ情報配信システムは、ビデオセンタ、ビデオ・ダイヤルトーントランク、利用者端末から構成される。利用者がビデオセンタを呼び出すと、利用者が所望する番組が、ビデオセンタより伝送路を介して伝送される。ビデオ・ダイヤルトーントランクはビデオセンタより高速転送されるビデオ情報を受信し、それを通常の速度のビデオ情報に再生して、低速伝送路を介して利用者端末へ伝送する。
【０００４】
第２の公報に開示された番組配信装置は、複数の動画番組を保持する記憶装置と、ネットワークを介して端末装置から番組配信要求と広告挿入要求とを受信し、動画番組および指定された広告要求を情報ブロックに分割してネットワークを介して配信する配信装置と、上述した広告の挿入要求で指定された広告を挿入するタイミングに応じて課金を異ならしめるよう制御する制御装置より構成される。
【０００５】
【発明が解決しようとする課題】
しかしながら、上記の従来技術では、視聴者に配信される映像は、ある特定の視点から制作者の意図のみで撮影された映像であり、視聴者が自らの好みに応じた映像を視聴することや、視点を変更するといった操作は不可能である。例えば、あるサッカー等のスポーツ観戦の番組等において、視聴者は、自分の好きな特定の選手をじっくり視聴したいという要求をもっていても、その選手がわずかのシーンでしか登場せず、他の選手ばかり登場するような映像作品であっても、これを視聴せざるをえない。
【０００６】
また、上記の従来技術では、あらかじめビデオセンタや記憶装置に番組が記録されている必要があり、リアルタイムの映像を配信する仕組みにはなっていないという問題がある。
そこで、本発明は、このような状況に鑑みてなされたもので、視聴者の嗜好が反映された映像の配信が可能な映像配信装置および映像受信装置を提供することを目的とする。
【０００７】
さらに、本発明は、蓄積された映像の配信だけでなく、リアルタイム（ライブ）の映像についても、視聴者の嗜好を反映した配信が可能な映像配信装置および映像受信装置を提供することをも目的とする。
【０００８】
【課題を解決するための手段】
上記目的を達成するために、本発明に係る映像配信装置は、通信ネットワークを介して映像受信装置と通信する映像配信装置であって、異なる視点からの複数の映像を取得する映像取得手段と、前記映像ごとに、その映像に含まれる内容を解析し、解析結果を内容情報として生成する映像解析手段と、前記各内容情報と、視聴者より通知された嗜好情報との適合度を判定し、配信する映像を決定し、決定した映像を配信する配信映像マッチング手段とを備えることを特徴とする。つまり、異なる視点からの複数の映像の中から各映像ごとに生成された内容情報と視聴者の嗜好情報との適合度で決定し、視聴者の嗜好に合致した１つの映像を視聴者の映像受信装置に対して配信する。
【０００９】
ここで、内容情報には、被写体を同定する情報や、被写体の表示位置または表示領域を表す情報を含めてもよい。また、嗜好情報を得るための入れ物を映像受信装置側に配信し、この入れ物に被写体に対する嗜好の度合いを入力させることにより嗜好情報を取得してもよい。また、配信した映像について視聴者から画面上の位置が指定されると、その位置の被写体を特定し、この被写体に関する付加情報を送信するようにしてもよい。
【００１０】
さらに、本発明は、通信ネットワークを介して映像受信装置と通信する映像配信装置であって、異なる視点からの複数の映像を取得する映像取得手段と、前記映像ごとに、その映像に含まれる内容を解析し、解析結果を内容情報として生成する映像解析手段と、前記各映像および前記各内容情報を多重化して配信する映像多重化手段とを備えることを特徴とする映像配信装置とすることもできる。この場合には、映像受信装置の側において映像配信装置から配信されてきた各内容情報と、視聴者より通知された嗜好情報との適合度を判定し、映像配信装置から配信されてきた複数の映像の中から再生する１つの映像を決定し、決定した映像を再生するようにすればよい。
【００１１】
また、本発明は、このような特徴的な手段をコンピュータに機能させるプログラムとして実現したり、そのプログラムを記録した記録媒体として実現したりすることもできる。そして、本発明に係るプログラムをインターネット等の通信網や記録媒体等を介して流通させることもできる。
【００１２】
【発明の実施の形態】
（実施の形態１）
以下、本発明の実施の形態１に係る映像配信システムを、図面に基づいて説明する。なお、この実施の形態では、限定された空間の撮影対象として、サッカーなどのスポーツ中継の場合の選手を中心とした映像を例に挙げて説明するが、本発明は任意の撮影空間および撮影対象に対して適用可能である。
【００１３】
図１は、本発明の実施の形態１における映像配信システム１の機能構成を示すブロック図である。
本発明の実施の形態１に係る映像配信システム１は、利用者の嗜好に応じた映像等のコンテンツをストリーム配信する通信システムであり、映像配信装置１０と、映像受信装置２０と、これらを接続する通信ネットワーク３０とから構成される。
【００１４】
映像配信装置１０は、複数の映像（多視点映像）の中からユーザの嗜好や嗜好履歴に合致した１つの映像を数フレームごとに切換・選択するような編集を行った映像コンテンツをリアルタイムに構築し、映像受信装置２０に向けてストリーム配信するコンピュータ等からなる配信サーバであり、映像取得部１１０と、映像解析部１２０と、配信映像マッチング部１３０と、映像記録部１４０と、付加情報提供部１５０と、映像情報配信部１６０等とからなる。
【００１５】
映像取得部１１０は、所定の撮影空間（例えば、サッカー場）に分散配置され、限定された撮影空間内の複数の被写体を様々な視点および角度からそれぞれ撮影した複数の映像（多視点映像）を取得する複数台の撮影機器（ビデオカメラ等）である。この映像取得部１１０により取得された多視点映像は、ケーブルや無線通信により、映像解析部１２０に伝送される。
【００１６】
映像解析部１２０は、各映像の内容（具体的には、画面のどの位置に何の被写体（例えば、選手）が写っているか）をフレームごとにそれぞれ取得し、取得結果をＭＰＥＧ７などのマルチメディアコンテンツの記述子（Ｄｅｓｃｒｉｐｔｏｒ）で記述した内容情報として各映像のフレームごとに生成する。
【００１７】
配信映像マッチング部１３０は、映像取得部１１０により取得されたライブコンテンツや、映像記録部１４０に保持されているストレージコンテンツについて、映像受信装置２０から送られてきたユーザの嗜好や嗜好の履歴と各映像の内容情報とを比較し、複数の映像（多視点映像）の中からユーザの嗜好や嗜好履歴に合致した１つの映像を数フレームごとに切換・選択するような編集を行った映像コンテンツをリアルタイムに構築したり、内容情報が付加された多視点映像を映像記録部１４０のコンテンツデータベース１４１に格納したり、嗜好値入力ダイアログ１４６を生成して嗜好データベース１４５に格納したりする。
【００１８】
映像記録部１４０は、配信するストレージコンテンツなどを保持するコンテンツデータベース１４１と、ユーザごとの嗜好を取得するための嗜好データベース１４５とを保持するハードディスク等である。コンテンツデータベース１４１は、ライブ（生放送）やストレージ（録画による放送）のモードを選択するモード選択ダイアログ１４２、ライブ中継中のコンテンツや、保持しているストレージコンテンツのコンテンツ一覧１４３およびコンテンツ１４４自体を記憶する。また、嗜好データベース１４５は、被写体に対する嗜好値（嗜好度）を入力するためのコンテンツごとの嗜好値入力ダイアログ１４６およびユーザが入力した嗜好履歴を格納するユーザごとの嗜好履歴テーブル１４７を記憶する。
【００１９】
付加情報提供部１５０は、ライブおよびストレージのコンテンツごとに視聴者に提供される配信映像に関連した情報（被写体（対象物）のプロフィール等の付加情報、例えば、サッカー中継のコンテンツであれば、サッカー選手の生年月日等のプロフィール）をあらかじめ格納した付属情報テーブル１５１を保持するハードディスク等である。この付属情報テーブル１５１には、例えば、個々の選手について「生年月日」、「主な経歴」、「特徴」、「選手のコメント」の情報があらかじめ記録されており、配信映像マッチング部１３０から選手名等を特定した通知があると、特定された選手に関する付加情報を映像受信装置２０に送信する。
【００２０】
映像情報配信部１６０は、通信ネットワーク３０を介して映像受信装置２０と通信するための双方向の通信インタフェースやドライバソフト等である。
【００２１】
映像受信装置２０は、ライブやストレージのモード選択や、嗜好値の入力等についてユーザと対話したり、映像配信装置１０から配信されてくる映像コンテンツをユーザに提示するパーソナルコンピュータ、携帯電話機、携帯情報端末、デジタル放送用ＴＶ等であり、操作部２１０と、映像出力部２２０と、送受信部２３０等とからなる。
【００２２】
操作部２１０は、リモートコントローラや、キーボード、マウスなどのポインティングデバイスなどのデバイスであって、ユーザとの対話によってユーザが希望するコンテンツを指定したり、嗜好値を入力して嗜好値情報として送受信部２３０に送信したり、映像出力部２２０に表示されている被写体の位置情報を送受信部２３０に送信したりする。
【００２３】
送受信部２３０は、通信ネットワーク３０を介して映像配信装置１０とシリアル通信するための送受信回路やドライバソフト等である。
【００２４】
通信ネットワーク３０は、映像配信装置１０と映像受信装置２０とを接続する双方向の伝送路であり、ＣＡＴＶ等の放送・通信網、電話網、データ通信網等によるインターネット等の通信ネットワークである。
【００２５】
以上のように構成された映像配信システム１の動作について、図２に示されたシーケンス（本システムの主な処理の流れ）に沿って順に説明する。なお、本図のシーケンスにおいては、ある一時点における多視点映像についての流れを示している。
【００２６】
映像配信装置１０の映像取得部１１０は、映像を取得することが可能なビデオカメラなどの撮影機器が複数台で構成されており、限定された撮影空間内の複数の被写体を様々な視点および角度からそれぞれ撮影した複数の映像（多視点映像）を取得する（Ｓ１１）。本実施の形態の映像配信装置１０では、限定された空間を様々な視点および角度から撮影した映像が必要となるため、できる限り多くの撮影機器を分散させて撮影空間に配置することが望ましいが、本発明は機器の台数や配置位置などには限定されない。映像取得部１１０により取得された多視点映像は、ケーブルや無線通信を利用することにより、映像解析部１２０に伝送される。本実施の形態では、各々の映像取得部１１０により取得された映像はすべて１台の映像解析部１２０に伝送され、集中的に管理されるものとするが、映像解析部１２０は、撮影機器ごとに備えられていてもよい。
【００２７】
映像解析部１２０は、映像取得部１１０により取得された各々の映像を解析したりして、各映像の内容（画面のどの位置に何の被写体（例えば、選手）が写っているか）をフレームごとにそれぞれ取得し、取得結果をＭＰＥＧ７などのマルチメディアコンテンツの記述子（Ｄｅｓｃｒｉｐｔｏｒ）で記述した内容情報として各映像のフレームごとに生成する（Ｓ１２）。内容情報の生成には、（１）内容情報の抽出と、（２）内容情報の記述との２段階のステップが必要となる。内容情報は、撮影されている映像の内容に大きく依存するが、例えばサッカーなどのスポーツ中継であれば、映像の大部分は競技中の選手の映像であると考えられる。そこで、本実施の形態では、映像を解析することによって、映像に含まれている選手を同定し、選手名とその選手が映像中で表示されている位置を内容情報として生成することを考える。以下ではまず、内容情報の抽出の例として、映像中の選手の同定（誰が写っているか）および、その表示位置の取得を実現するための２通りの方法（計測器を用いた方法、画像処理を用いた方法）について述べる。
【００２８】
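1. Method using measuring instruments
In the method using measuring instruments, a position sensor (for example, GPS; hereinafter "position sensor") is attached to each object to be identified. The sensor can measure a three-dimensional position in a coordinate system whose reference point is an arbitrary point in space (hereinafter "global coordinate system"), and a unique ID number is assigned to it. This makes it possible to identify each object and to obtain its three-dimensional position. Next, cameras for acquiring video are installed at various positions and angles.
【００２９】
In Embodiment 1, the installed cameras are fixed and are neither panned nor tilted. Enough cameras must therefore be provided to cover the entire shooting space in their fixed state. For every camera whose fixed position has been determined, the position in the global coordinate system and the line-of-sight (collimation) direction vector are obtained and notified to the video analysis unit 120 in advance. As shown in FIG. 3(a), for the cameras used in this embodiment the projection direction coincides with the camera's line-of-sight direction (Z axis) when expressed in the coordinate system fixed to the camera (hereinafter "camera coordinate system"), the projection center is at Z = 0 on the Z axis, and the projection plane is at Z = d. From the position sensors attached to the objects, the ID number assigned to each sensor and its three-dimensional position coordinates are input to the video analysis unit 120 in time series. The ID number is needed to identify the object.
【００３０】
Next, a method is described for identifying where in the video (on the screen) an object is displayed, using the information from the position sensors and the camera position information.
First, the three-dimensional position coordinates of the position sensor in the global coordinate system are converted to their representation in the camera coordinate system. Let Mvi be the matrix that converts the global coordinate system to the camera coordinate system of the i-th camera, and vw the output of the position sensor in the global coordinate system. The output (coordinates) vc of the position sensor in the camera coordinate system is then obtained as vc = Mvi・vw, where "・" denotes the matrix-vector product. Written out in terms of the matrix and vector components, this expression is as follows.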
【数１】
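The equation image for [Equation 1] is not reproduced in this text. As a hedged reconstruction only, assuming Mvi is a 4x4 homogeneous transformation matrix (the source states only vc = Mvi・vw; the matrix size and the element names m_jk are assumptions made here for illustration), the component form could be written as:

```latex
\begin{pmatrix} x_c \\ y_c \\ z_c \\ 1 \end{pmatrix}
=
\begin{pmatrix}
m_{11} & m_{12} & m_{13} & m_{14} \\
m_{21} & m_{22} & m_{23} & m_{24} \\
m_{31} & m_{32} & m_{33} & m_{34} \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x_w \\ y_w \\ z_w \\ 1 \end{pmatrix}
```

Here (x_w, y_w, z_w) are the components of vw and (x_c, y_c, z_c) those of vc; the upper-left 3x3 block would carry the rotation and the fourth column the translation of camera i.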

【００３１】
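Next, the two-dimensional coordinates of the position sensor on the camera's projection plane are obtained by projective transformation. From FIG. 3(b), which shows FIG. 3(a) viewed from above along the projection plane, and FIG. 3(c), which shows FIG. 3(a) viewed from the side along the projection plane, the coordinates on the projection plane vp = (xp, yp) are xp = xc / (zc / d) and yp = yc / (zc / d). It is then determined whether the computed xp and yp fall within the projection plane (screen) of that camera; if they do, those coordinates are taken as the display position. Applying this processing to all cameras and all objects determines, for each camera, which objects are currently displayed at which positions.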
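The two steps above (global-to-camera conversion, then projection onto the plane Z = d with an on-screen check) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 4x4 homogeneous form of Mvi, the rule that points with zc <= 0 are invisible, and a screen centered on the Z axis with half extents `half_w` and `half_h` are all assumptions introduced here.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_camera(m_vi: Sequence[Sequence[float]], v_w: Vec3) -> Vec3:
    """Compute v_c = M_vi . v_w, assuming M_vi is a 4x4 homogeneous matrix."""
    h = (v_w[0], v_w[1], v_w[2], 1.0)
    x_c, y_c, z_c = (sum(row[k] * h[k] for k in range(4)) for row in m_vi[:3])
    return (x_c, y_c, z_c)


def project(v_c: Vec3, d: float, half_w: float,
            half_h: float) -> Optional[Tuple[float, float]]:
    """Return (x_p, y_p) on the plane Z = d, or None if the point is off screen."""
    x_c, y_c, z_c = v_c
    if z_c <= 0:
        # Assumption: points at or behind the projection center are never visible.
        return None
    x_p = x_c / (z_c / d)
    y_p = y_c / (z_c / d)
    # Assumption: the screen is centered on the Z axis with the given half extents.
    if abs(x_p) <= half_w and abs(y_p) <= half_h:
        return (x_p, y_p)
    return None
```

Running `project(to_camera(m_vi, v_w), d, half_w, half_h)` for every camera and every sensor ID would give, per camera, the set of objects currently on screen and their display positions.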
【００３２】
２．画像処理を用いた方法
画像処理を用いた方法では、位置センサなどは利用せずカメラから取得される映像のみから内容情報の抽出を行うため、計測器を用いた場合のようにカメラは固定されている必要はない。映像から対象物を同定するためには、映像から対象物のみを切り出し、さらにその対象物を同定する必要がある。対象物を映像から切り出す方法に関しては特に限定しないが、上述したスポーツ中継の例では、基本的に背景が単一色であること（例えばサッカーやアメリカンフットボール中継であれば背景は芝生の色であることが殆どである。）が多いため、色情報を用いて背景と対象物を分離することが可能である。以下では、映像から抽出された複数の対象物を同定するための技術について述べる。
【００３３】
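(1) Template matching
A large number of template images are prepared for each player, the object separated from the background is matched against the template images, and the player is identified from the image judged to match best. Specifically, attention is first focused on one player in the video, and the smallest rectangle enclosing that player (hereinafter the "target rectangle") is obtained. Next, a template (assumed to be rectangular) is downsampled if it is larger than the target rectangle or upsampled if it is smaller, so that the rectangle sizes match. The difference between the pixel value at a given position in the target rectangle and the pixel value at the same position in the template image is then taken. This is done for every pixel and the sum S is computed. The above processing is carried out for all template images, and the player of the template image for which S is smallest is taken to be the player being identified.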
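A minimal sketch of this template matching on grayscale images stored as nested lists follows. The description only says that pixel differences are summed into S; taking the absolute difference and resizing by nearest-neighbor sampling are assumptions made here, and all function names are hypothetical.

```python
from typing import Dict, List, Optional

Image = List[List[int]]  # grayscale pixel values, row-major


def resize_nearest(img: Image, height: int, width: int) -> Image:
    """Up- or downsample img to height x width by nearest-neighbor sampling."""
    src_h, src_w = len(img), len(img[0])
    return [[img[r * src_h // height][c * src_w // width] for c in range(width)]
            for r in range(height)]


def difference_sum(target: Image, template: Image) -> int:
    """Sum S of per-pixel absolute differences after matching the sizes."""
    resized = resize_nearest(template, len(target), len(target[0]))
    return sum(abs(t - s)
               for t_row, s_row in zip(target, resized)
               for t, s in zip(t_row, s_row))


def identify(target: Image, templates: Dict[str, List[Image]]) -> Optional[str]:
    """Return the player whose template gives the smallest S (None if no templates)."""
    best_name, best_s = None, None
    for name, images in templates.items():
        for template in images:
            s = difference_sum(target, template)
            if best_s is None or s < best_s:
                best_name, best_s = name, s
    return best_name
```

In practice the target rectangle would come from the background separation step described earlier, and each player would have many templates covering different poses.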
【００３４】
（２）動き予測
スポーツ中継映像では、選手の動きは連続であるため、フレーム間で劇的に変化することはない。また、移動する方向や速度に関しても制限されているため、現在のフレームにおける選手の位置が既知であれば、次のフレームにおける位置をある程度予測することができる。したがって、現在のフレームにおける選手の位置から次のフレームにおける選手の位置の取り得る値の範囲を予測し、その範囲に対してのみテンプレートマッチングを用いることができる。また、着目している選手の周りの選手との位置関係も、劇的に変化することはないため、動き予測のための情報として利用できる。例えば、１フレーム前の画像で隣に表示されていた選手の現在のフレームにおける位置が既知ならば、同定の対象となっている選手もその周辺に存在する可能性が高く、現在のフレームにおける位置を予測することができる。
【００３５】
（３）事前取得情報の利用
スポーツ中継であれば、対戦するチーム同士は異なった色のユニホームを着用していることが多い。ユニホームの色は事前に取得できるため、その色情報を用いてチームを判別することが可能である。また、ユニホームには背番号が付与されており、背番号は重複して用いられることはないため、個々の選手を同定する上で、非常に有効である。
【００３６】
対象物の同定および、対象物が表示されている位置の取得は、上述した方法を組み合わせることで達成される。例えば、まず対象物の色情報とユニホームの色情報のマッチングを取ることによりチームの判別を行う。次に、ユニホームの背番号の部分のみを切り出したテンプレート画像を数多く用意しておき、テンプレートマッチングを用いて背番号を識別する。背番号まで識別できた選手は同定が完了する。同定できなかった選手に関しては、前フレームの映像や、既に同定が完了した周辺の選手との位置関係を利用して動き予測を行い、予測範囲に対して選手の全身画像をテンプレート画像としたテンプレートマッチングを行う。位置は、主走査方向、および副走査方向における対象矩形の左上の位置および右下の位置で特定される。
【００３７】
次に取得された内容情報の記述（Ｄｅｓｃｒｉｐｔｉｏｎ）について述べる。内容情報の記述には、ＭＰＥＧ−７などのマルチメディアコンテンツの記述様式を用いる。本実施の形態では、上記の手順で抽出された選手名および、画像内における表示位置を、内容情報として記述する。例えば、図４に示すように映像中にＡ（例えば、Ａｎｎｄｏ），Ｂ（例えば、Ｎｉｙａｍｏｔｏ）の２人の選手が含まれている場合には、内容情報の記述形式の一例は図５に示されるようになる。
【００３８】
本図において、＜Ｉｎｆｏｒｍａｔｉｏｎ＞は内容情報の開始および終了を示す記述子（タグ）であり、＜ＩＤ＞は個々の選手を識別する記述子であり、この記述子の中には選手の氏名を同定する＜ＩＤＮａｍｅ＞記述子および所属を同定する＜ＩＤＯｒｇａｎｉｚａｔｉｏｎ＞記述子が含まれている。＜ＲｅｇｉｏｎＬｏｃａｔｏｒ＞記述子は、画像中における選手の表示されている位置を示し、上述の方法によって取得されたものである。＜ＲｅｇｉｏｎＬｏｃａｔｏｒ＞記述子内にある＜Ｐｏｓｉｔｉｏｎ＞記述子に囲まれた値は順に、選手を包含する矩形の左上のＸ座標、Ｙ座標、右下のＸ座標、Ｙ座標を表す。なお、画像処理を用いた方法であれば選手を包含する矩形を取得することができるが、計測器（位置センサ・ＧＰＳ）のみを用いる方法では、不可能である。したがって、計測器のみを用いた場合には、左上座標と右下座標には同一の値、すなわち一点の座標位置が記述される。映像解析部１２０は、複数台のカメラから入力されたすべての映像に関してそれぞれ上記の内容情報を生成する。また、内容情報はフレームごとに生成されるため、映像と内容情報は１対１に対応する。
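Since FIG. 5 is not reproduced in this text, the following sketch only illustrates how content information with the descriptors named above (<Information>, <ID>, <IDName>, <IDOrganization>, <RegionLocator>, <Position>) might be assembled. The nesting of <RegionLocator> inside <ID>, the space-separated coordinate format, and the sample team name and coordinates are assumptions, not taken from the figure.

```python
import xml.etree.ElementTree as ET


def build_content_info(players):
    """Build an <Information> element from a list of player dictionaries.

    Each dictionary holds "name", "organization" and "rect", where rect is
    (upper-left X, upper-left Y, lower-right X, lower-right Y).
    """
    root = ET.Element("Information")
    for player in players:
        ident = ET.SubElement(root, "ID")
        ET.SubElement(ident, "IDName").text = player["name"]
        ET.SubElement(ident, "IDOrganization").text = player["organization"]
        # Assumption: the locator is nested under the ID it belongs to.
        locator = ET.SubElement(ident, "RegionLocator")
        ET.SubElement(locator, "Position").text = " ".join(
            str(value) for value in player["rect"])
    return root


# Hypothetical values for the two players A and B of FIG. 4.
info = build_content_info([
    {"name": "Anndo", "organization": "Team X", "rect": (40, 60, 140, 300)},
    {"name": "Niyamoto", "organization": "Team X", "rect": (360, 80, 470, 310)},
])
xml_text = ET.tostring(info, encoding="unicode")
```

When only position sensors are used, the same structure would carry identical upper-left and lower-right values, for example `(400, 200, 400, 200)`, representing a single point.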
【００３９】
次に、配信映像マッチング部１３０、映像情報配信部１６０および映像受信装置２０の映像出力部２２０に関して説明する。視聴者は、映像情報配信部１６０を介して映像出力部２２０に伝送されてくる映像を視聴することができるが、逆に自身の嗜好情報を配信映像マッチング部１３０に通知することが可能である。スポーツ中継の場合、映像の中心は、競技に出場する選手であり、どの選手が出場するのかは事前に確定している。そこで、本実施の形態では、嗜好度の設定が可能な対象を、競技に出場する選手であるとする。
【００４０】
映像解析部１２０によって各内容情報が生成されると、配信映像マッチング部１３０は、ライブコンテンツに係る多視点映像とその内容情報とをコンテンツデータベース１４１に格納する（Ｓ１３）。
そして、配信映像マッチング部１３０は、上記テンプレートマッチング法で用いられたテンプレート画像や名前、背番号により嗜好値入力ダイアログ１４６を生成して、嗜好データベース１４５に格納した後、コンテンツデータベース１４１からライブやストレージのいずれかのモード選択するためのモード選択ダイアログ１４２を読み出して送信する（Ｓ１４）。映像受信装置２０のユーザがモード選択ダイアログ１４２のスイッチボタンを操作部２１０のマウスなどによりクリック操作していずれかのモードを指定すると（Ｓ１５）、いずれのモードが指定されたかを表すモード指定情報が映像受信装置２０から映像配信装置１０に送信される（Ｓ１６）。
【００４１】
モード指定情報が送信されてくると、配信映像マッチング部１３０は、ユーザが指定したモードのコンテンツ一覧１４３をコンテンツデータベース１４１から読み出して映像受信装置２０に送信すると共に（Ｓ１７）、ライブコンテンツと映像記録部１４０に格納されたストレージコンテンツとを切換配信するための図示しないスイッチを指定側に切り換える。
【００４２】
映像受信装置２０のユーザが操作部２１０のマウスなどに所望のコンテンツをクリック操作してコンテンツを指定すると、映像受信装置２０から映像配信装置１０にユーザが指定したコンテンツ名が送信される（Ｓ１８）。
【００４３】
コンテンツが指定されると、配信映像マッチング部１３０は、内容情報に基づき指定されたコンテンツに関する嗜好情報を設定するためのテーブル、嗜好値入力ダイアログ１４６を嗜好データベース１４５から読み出し、エディトプログラムなどと共に映像受信装置２０に送信する（Ｓ１９）。この嗜好値入力ダイアログ１４６は、例えば、エディット画像、スクリプト（氏名、背番号等）からなり、テンプレートマッチング法に用いるテンプレート画像や、氏名、背番号等に基づいて配信映像マッチング部１３０により生成され、映像記録部１４０の嗜好データベース１４５に格納される。なお、この嗜好値入力ダイアログ１４６の送信は、ライブコンテンツの中継途中であってもよいが、中継が開始される以前の方が好ましい。この理由は、最新の嗜好情報が取得されるまでの間は例えば嗜好履歴テーブル１４７に格納されている前回行われた同一カードの際に取得した嗜好履歴で映像を選択する方策しかないため、できるだけ早く最新の嗜好で映像を選択した方が、嗜好により合致するからである。
【００４４】
図６に、嗜好値入力ダイアログ１４６のＧＵＩインタフェースの一例を示す。図６のインタフェースは、出場する選手の「顔画像」、「氏名」、「背番号」および、嗜好度を入力する「エディットボックス」（スピンボックス）より構成される。視聴者は、操作部２１０のリモートコントローラや、キーボードなどのデバイスを用いて、嗜好度を決定したい選手のエディットボックス位置にカーソルを合わせ、嗜好度を入力する。または、エディットボックスの横にある上下の矢印アイコンにカーソルを合わせて、クリックして嗜好度の値を上下させて決定する方法でもよい。本実施の形態では、嗜好度「０」の場合に最も低く、嗜好度「１００」の場合に最も高いとする。なお、上述の方法は絶対評価を用いた方法であるが、出場する選手に順序付けを行うなどの相対評価の方法でもよい。以上の方法により取得された嗜好情報は、映像配信装置１０に送信される（Ｓ２０）。図７に嗜好情報の一例を示す。本図に示されるように嗜好情報は、内容情報と同様にＭＰＥＧ−７などのマルチメディアコンテンツの記述様式を用いて記述されており、個々の選手を識別する記述子＜ＩＤ＞と、この記述子の中には選手の氏名を同定する記述子＜ＩＤＮａｍｅ＞と、嗜好度を同定する記述子＜Ｐｒｅｆｅｒｅｎｃｅ＞とが含まれている。この嗜好情報は、映像情報配信部１６０を介して、配信映像マッチング部１３０に通知され、嗜好履歴テーブル１４７に更新記憶される（Ｓ２１）。
【００４５】
嗜好情報を取得すると、配信映像マッチング部１３０は、映像解析部１２０より生成された内容情報の付与された複数の映像と、視聴者より通知された嗜好情報やその履歴とに基づき、その視聴者にどの映像を配信するべきかを決定するマッチング処理を実行する（Ｓ２２）。以下、そのマッチング処理について、２通りの方法（最も嗜好度の高い対象物を利用して決定する方法、個々の嗜好度から総合的に決定する方法）を具体的に説明する。
【００４６】
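1. Method of deciding using the object with the highest preference degree
To distribute the video showing the player with the highest preference degree, the procedure of the flowchart shown in FIG. 8, for example, is followed.
【００４７】
(1) The preference information notified by the viewer is analyzed and the player with the highest preference degree (hereinafter also called the distribution target player) is determined (S2201).
【００４８】
(2) The content information sent from the video analysis means is analyzed and the number of videos showing the distribution target player is determined (S2202). Among the videos from the multiple viewpoints, those in which the distribution target player determined in (1) is displayed become candidates for the distribution video. If the distribution target player is displayed in only one video, the video from that camera is selected (S2203) and distributed to the viewer.
【００４９】
(3) If the distribution target player is displayed in multiple videos, the one considered most suitable among them is distributed; the method of deciding is not particularly limited. For example, if rectangle information is available in the <RegionLocator> descriptor of the content information (Yes in S2204), the area of the rectangle enclosing the distribution target player is computed and the video with the largest area is selected (S2205) as the distribution video.
【００５０】
If rectangle information is not available (No in S2204), one possible method is to obtain the position at which the distribution target player is displayed and select the video in which that position is closest to the screen center (S2206). If the number of videos showing the distribution target player is 0, the next-ranked player is determined and steps S2202 to S2206 are executed for that player to decide the distribution video (S2207).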
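The flow of FIG. 8 (S2201 to S2207) can be sketched roughly as below. The data layout is an assumption made for illustration: `preferences` maps player names to preference degrees, `views` maps a camera ID to the rectangles of the players visible in that camera's current frame, and a rectangle whose two corners coincide is treated as "no rectangle information" (the sensor-only case).

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2): upper-left, lower-right


def has_area(r: Rect) -> bool:
    return r[2] > r[0] and r[3] > r[1]


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def dist2_to_center(r: Rect, center: Tuple[float, float]) -> float:
    cx, cy = (r[0] + r[2]) / 2, (r[1] + r[3]) / 2
    return (cx - center[0]) ** 2 + (cy - center[1]) ** 2


def select_by_top_preference(preferences: Dict[str, float],
                             views: Dict[str, Dict[str, Rect]],
                             center: Tuple[float, float]) -> Optional[str]:
    # S2201 / S2207: try players in descending order of preference degree.
    for name in sorted(preferences, key=preferences.get, reverse=True):
        # S2202: collect the cameras in which this player is displayed.
        candidates = {cam: objs[name] for cam, objs in views.items() if name in objs}
        if not candidates:
            continue  # player not shown anywhere: fall back to the next-ranked player
        if len(candidates) == 1:
            return next(iter(candidates))  # S2203
        if all(has_area(r) for r in candidates.values()):
            return max(candidates, key=lambda cam: area(candidates[cam]))  # S2205
        # S2206: choose the view where the player is closest to the screen center.
        return min(candidates, key=lambda cam: dist2_to_center(candidates[cam], center))
    return None
```

The function returns a camera ID, or `None` when none of the rated players appears in any view; how that last case should be handled is not specified in the description.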
【００５１】
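2. Method of deciding comprehensively from the individual preference degrees
To decide the distribution video by comprehensive judgment based on the preference degrees of the individual players, the procedure of the flowchart shown in FIG. 9, for example, is followed.
【００５２】
(1) For the video from every camera, it is determined whether rectangle information is available in the <RegionLocator> descriptor of the content information (S2211). If it is (Yes in S2211), the area of the rectangle enclosing each player is computed (S2212). If it is not (No in S2211), a function that takes its maximum at the screen center and its minimum at the screen edges is defined, and its value is obtained for each player's position (S2215). For example, f(x, y) = sin(π*x/(2*x_mid)) * sin(π*y/(2*y_mid)) satisfies this condition, where x and y are the pixel position, x_mid and y_mid are the coordinates of the screen center, and * denotes multiplication.
【００５３】
(2) The product of the value obtained in (1) and the corresponding player's preference degree is computed, and the sum over the players displayed on the screen is taken as the value of the objective function for that image (S2213, S2216).
【００５４】
(3) The video from the viewpoint for which the value in (2) is largest is selected as the distribution video (S2214, S2217).
【００５５】
If the above processing were performed every frame, the video could switch in rapid succession; the distribution video matching unit 130 therefore applies the above method every several frames to decide the video distributed to the viewer.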
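A rough sketch of the objective function of FIG. 9 and of re-evaluating the choice only every few frames is given below. The data layout matches no particular figure: `views` maps camera IDs to the rectangles of visible players, a rectangle with coincident corners stands for "no rectangle information", and the `interval` default of 5 frames is a hypothetical value for "several frames". All names are illustrative.

```python
import math
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def center_weight(x: float, y: float, x_mid: float, y_mid: float) -> float:
    """f(x, y) = sin(pi*x/(2*x_mid)) * sin(pi*y/(2*y_mid)); 1 at center, 0 at edges."""
    return math.sin(math.pi * x / (2 * x_mid)) * math.sin(math.pi * y / (2 * y_mid))


def view_score(objs: Dict[str, Rect], preferences: Dict[str, float],
               x_mid: float, y_mid: float) -> float:
    # S2211: use areas only if every object in this view has a real rectangle.
    use_area = all(x2 > x1 and y2 > y1 for (x1, y1, x2, y2) in objs.values())
    total = 0.0
    for name, (x1, y1, x2, y2) in objs.items():
        value = (x2 - x1) * (y2 - y1) if use_area else center_weight(x1, y1, x_mid, y_mid)
        total += value * preferences.get(name, 0)  # S2213 / S2216
    return total


def select_views(frames: List[Dict[str, Dict[str, Rect]]],
                 preferences: Dict[str, float],
                 x_mid: float, y_mid: float, interval: int = 5) -> List[str]:
    """Pick a camera per frame, re-running the matching only every `interval` frames."""
    chosen, current = [], None
    for i, views in enumerate(frames):
        if i % interval == 0 or current not in views:
            current = max(views, key=lambda cam: view_score(views[cam], preferences,
                                                            x_mid, y_mid))  # S2214 / S2217
        chosen.append(current)
    return chosen
```

Holding the selection between evaluation points is one simple way to avoid the rapid switching mentioned above; the description does not say how many frames to wait.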
【００５６】
以上のようにして配信映像の決定が終わると、配信映像マッチング部１３０は、決定した映像をストリーム配信する（図２のＳ２３）。そして、映像受信装置２０の映像出力部２２０は、送受信部２３０を介して配信されてきた映像をその画面上に再生する（図２のＳ２４）。
【００５７】
このようにして、実施の形態１に係る映像配信システム１によれば、映像配信装置１０において多視点映像の中から各ユーザの嗜好に合致した映像が数フレームごとに選択されて映像受信装置２０に配信され、これが映像受信装置２０の映像出力部２２０において再生される。
【００５８】
続いて、視聴者は、配信されてくる映像に対して働きかけを行うことにより、付加情報を取得することが可能である（図２のステップＳ２５〜Ｓ２９）。以下では、例えば、操作部２１０のマウスのようなポインティングデバイスを用いて付加情報を取得する方法について述べる。
【００５９】
例えば、図４に示されるように映像中にＡ，Ｂの２人の選手が含まれている場合において、例えば右側の選手Ｂ（Ｎｉｙａｍｏｔｏ）の付加情報を取得したいとき、ユーザは、ポインティングデバイスのカーソルを対象Ｂ上に合わせてクリックする（図２のＳ２５）。クリックされると、その画面上での位置情報が映像配信装置１０の映像情報配信部１６０を介して配信映像マッチング部１３０に通知される（図２のＳ２６）。そして、配信映像マッチング部１３０は、配信映像に付与されている内容情報からどの対象が選択されたのかを特定し、その結果を付加情報提供部１５０に通知する（図２のＳ２７）。例えば、図４に示される画像が表示されている場合に、右側の画像上の位置がクリックされた場合、配信映像マッチング部１３０は、図５に示される内容情報に基づいてＮｉｙａｍｏｔｏだけを通知する。付加情報提供部１５０は、選択された対象であるＮｉｙａｍｏｔｏに関する付加情報を付属情報テーブル１５１から読み出し、付加情報を配信映像マッチング部１３０および、映像情報配信部１６０を介して、映像受信装置２０の映像出力部２２０に送信する（図２のＳ２８）。この付加情報は、図１０に示されるように、上記ＭＰＥＧ７にしたがう記述子で記述されており、個々の選手を識別する記述子＜ＩＤ＞と、この記述子の中には選手の氏名を同定する記述子＜ＩＤＮａｍｅ＞と、生年月日を表す記述子＜ＤａｔｅＯｆＢｉｒｔｈ＞と、主な経歴を表す記述子＜Ｃａｒｅｅｒ＞と、特徴を表す記述子＜ＳｐｅｃｉａｌＡｂｉｌｉｔｙ＞と、選手のコメントを表す記述子＜Ｃｏｍｍｅｎｔ＞とが含まれている。
【００６０】
なお、選択された対象に関連する情報が記録されていない場合には、情報が存在しないことを通知するメッセージを送信する。
最後に、映像出力部２２０は、送受信部２３０を介して配信されてきた付加情報をその画面上に再生する（図２のＳ２９）。
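Steps S26 to S28 amount to a hit test of the clicked position against the content information of the distributed frame, followed by a table lookup. A minimal sketch follows, with hypothetical function names, hypothetical rectangle coordinates for the two players of FIG. 4, and a `tolerance` parameter added here (an assumption) so that sensor-only point positions can still be hit.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

NO_INFO_MESSAGE = "No information is available for the selected subject."


def find_clicked_subject(click: Tuple[float, float], objects: Dict[str, Rect],
                         tolerance: float = 0.0) -> Optional[str]:
    """Return the name of the first object whose region contains the click."""
    x, y = click
    for name, (x1, y1, x2, y2) in objects.items():
        if x1 - tolerance <= x <= x2 + tolerance and y1 - tolerance <= y <= y2 + tolerance:
            return name
    return None


def lookup_additional_info(name: str, table: Dict[str, Dict[str, str]]):
    """Return the stored profile, or a message when nothing is recorded."""
    return table.get(name, NO_INFO_MESSAGE)


# Hypothetical content information for the frame of FIG. 4.
frame_objects = {"Anndo": (40, 60, 140, 300), "Niyamoto": (360, 80, 470, 310)}
selected = find_clicked_subject((400, 200), frame_objects)
```

With these sample values `selected` is "Niyamoto", so only that player's entry would be read from the attached information table and sent back to the receiver.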
【００６１】
このように、実施の形態１に係る映像配信システム１によれば、視聴者は複数の視点から撮影された映像の中から、好みに合致した映像を視聴することができるだけでなく、さらに、配信される映像に働きかけを行うことによって、興味をもっている対象に関連する情報（付加情報）を取得することが可能となる。
【００６２】
（実施の形態２）
次いで、本発明の実施の形態２に係る映像配信システムを、図面に基づいて説明する。なお、この実施の形態２においても、限定された空間の撮影対象として、サッカーなどのスポーツ中継の場合の選手を中心とした映像を例に挙げて説明するが、本発明は任意の撮影空間および撮影対象に対して適用可能である。
【００６３】
図１１は、本発明の実施の形態２における映像配信システム２の機能構成を示すブロック図である。実施の形態１の映像配信システム１と対応する機能構成については同じ番号を付し、その詳細な説明を省略する。
この映像配信システム２は、映像配信装置４０と、映像受信装置５０と、これらを接続する通信ネットワーク３０とから構成され、多視点映像の中からユーザの嗜好に合致した映像を再生するシステムである点で実施の形態１の映像配信システム１と同様であるが、実施の形態１では、映像配信装置１０が利用者の嗜好に応じた映像等のコンテンツを決定しストリーム配信したのに対して、この映像配信システム２では、映像配信装置４０は多視点映像のコンテンツ等のすべて（選択される可能性のあるすべてのコンテンツ）をストリーム配信しておき、映像受信装置５０が利用者の嗜好に応じた映像等を選択決定し再生するようにした点で異なっている。
【００６４】
この映像配信システム２の映像配信装置４０は、内容情報および付加情報を付加した複数の映像（多視点映像）の映像コンテンツ等を映像受信装置５０に向けてストリーム配信するコンピュータ等からなる配信サーバであり、映像取得部１１０と、映像解析部１２０と、付加情報提供部４１０と、映像記録部４２０と、映像多重化部４３０と、多重化映像情報配信部４４０とを備えている。
【００６５】
付加情報提供部４１０は、映像解析部１２０によって生成された内容情報をサーチし、内容情報に含まれる被写体（対象物）の付加情報を付属情報テーブル１５１に基づいて生成したり、内容情報および付加情報が付加された映像を映像記録部４２０のコンテンツデータベース４２１に格納したり、嗜好値入力ダイアログ１４６を生成して嗜好データベース１４５に格納したりする。
【００６６】
映像記録部４２０は、入力側が付加情報提供部４１０に接続されると共に出力側が映像多重化部４３０に接続されており、内部にコンテンツデータベース４２１と、嗜好データベース１４５とを備えている。コンテンツデータベース４２１には、内容情報および付加情報が付加された映像コンテンツ４２４自体が格納される。なお、嗜好データベース１４５から嗜好履歴テーブル１４７が削除されている。これは、映像受信装置５０において、利用者の嗜好に応じた映像を選択するので、映像配信装置４０で嗜好履歴テーブル１４７を保持しておく必要がないからである。
【００６７】
映像多重化部４３０は、付加情報提供部４１０から出力される内容情報および付加情報が付加されたライブの多視点映像と、コンテンツデータベース４２１に格納されたストレージの映像コンテンツ４２４とをユーザのモード指定に応じて選択し、映像と内容情報と付加情報とをカメラごとに多重化し、さらにそれらの情報を多重化することにより、１つのビットストリームを生成したりする（図１３参照）。また、映像多重化部４３０は、嗜好値入力ダイアログ１４６を映像受信装置５０にストリーム配信したりする。
【００６８】
多重化映像情報配信部４４０は、通信ネットワーク３０を介して映像受信装置５０と通信するための双方向の通信インタフェースやドライバソフト等である。
【００６９】
映像受信装置５０は、ライブやストレージのモード選択や、嗜好値の入力等についてユーザと対話したり、映像配信装置４０からストリーム配信されてくる映像と内容情報と付加情報とを分離したり、複数の映像（多視点映像）の中からユーザの嗜好や嗜好履歴に合致した１つの映像を数フレームごとに切換・選択するような編集を行った映像コンテンツをリアルタイムに構築し、ユーザに提示したりするパーソナルコンピュータ、携帯電話機、携帯情報端末、デジタル放送用ＴＶ等であり、操作部２１０と、映像出力部２２０と、送受信部２３０と、表示映像マッチング部５１０と、映像記録部５２０とを備える。
【００７０】
表示映像マッチング部５１０は、映像配信装置４０からストリーム配信されてくる映像、内容情報および付加情報をカメラごとに分離し（図１３参照）、これらを映像記録部５２０に格納したり、映像配信装置４０から配信されてくる嗜好値入力ダイアログ１４６を映像記録部５２０に格納したり、操作部２１０から送られてきたユーザの嗜好等と映像配信装置４０から送られてくる各映像の内容情報とを比較し、複数の映像（多視点映像）の中からユーザの嗜好や嗜好履歴に合致した１つの映像を数フレームごとに切換・選択するような編集を行った映像コンテンツをリアルタイムに構築したりする。
【００７１】
映像記録部５２０は、映像配信装置４０から配信されてくるライブあるいはストレージのコンテンツなどを保持するコンテンツデータベース５２１と、ユーザごとの嗜好を取得するための嗜好データベース５２５とを保持するハードディスク等である。コンテンツデータベース５２１は、保持しているストレージコンテンツのコンテンツ一覧５２３およびコンテンツ５２４自体を記憶する。また、嗜好データベース５２５は、映像配信装置４０から送られてきたコンテンツごとの嗜好値入力ダイアログ１４６およびユーザが入力した嗜好履歴を格納する嗜好履歴テーブル１４７を記憶する。
【００７２】
以上のように構成された本実施の形態の映像配信システム２の動作について、図１２に示されたシーケンス（本システムの主な処理の流れ）に沿って順に説明する。なお、本図のシーケンスにおいても、ある一時点における多視点映像についての流れを示しており、実施の形態１のシーケンスと対応する処理については、詳細な説明を省略する。
【００７３】
映像取得部１１０による複数の映像（多視点映像）の取得が終わると（Ｓ１１）、映像解析部１２０は多視点映像を解析して映像ごとに内容情報を生成し、付加情報提供部４１０は内容情報をサーチし、内容情報に含まれる被写体（対象物）の付加情報を生成する（Ｓ３２）。例えば、映像中にＡ，Ｂの２人写っている場合には、このＡ，Ｂ２人の付加情報を生成する。付加情報の生成が終わると、付加情報提供部４１０は、内容情報および付加情報が付加された映像を映像記録部４２０のコンテンツデータベース４２１に格納する（Ｓ３３）。
そして、実施の形態１の場合と同様に、モード選択ダイアログの送信（Ｓ１４）や、映像受信装置５０におけるモード指定（Ｓ１５）、モード指定情報の送信（Ｓ１６）、コンテンツ一覧情報の送信（Ｓ１７）、コンテンツ指定の送信（Ｓ１８）が順次行われる。
【００７４】
コンテンツの指定が行われると、映像多重化部４３０は、指定されたライブあるいはストレージのコンテンツの多視点映像（複数の映像）と各映像ごとの内容情報と各映像ごとの付加情報とを多重化して送信した後（Ｓ３９）、このコンテンツの嗜好値入力ダイアログ１４６を送信する。
【００７５】
表示映像マッチング部５１０は、映像配信装置４０から送られてきた多視点映像と各映像ごとの内容情報と各映像ごとの付加情報とを各カメラごとに分離してコンテンツデータベース５２１に格納し（Ｓ４０）、さらに嗜好値入力ダイアログ１４６を嗜好データベース５２５に格納する。
次いで、表示映像マッチング部５１０は、嗜好値入力ダイアログ１４６を嗜好データベース５２５から読み出して映像出力部２２０に送り表示させ（Ｓ４１）、ユーザが入力した嗜好情報を嗜好履歴テーブル１４７に格納した後（Ｓ４２）、嗜好情報と内容情報とを比較し、多視点映像の中からユーザの嗜好に合致した１つの視点の映像を決定する（Ｓ４３）。なお、この映像の決定方法は、本実施の形態１と同様である。そして、表示映像マッチング部５１０は、決定した映像を映像出力部２２０に送りその画面上に再生させる（Ｓ４４）。
【００７６】
このように、実施の形態２に係る映像配信システム２によれば、映像配信装置４０は複数の映像（多視点映像）を映像受信装置５０に送信しておき、映像受信装置５０において多視点映像の中からユーザの嗜好に合致した１つの映像が数フレームごとに選択決定され、再生される。
【００７７】
続いて、ユーザは、配信されてきた映像に対して働きかけを行うことにより、付加情報を取得することが可能である（図１２のステップＳ４５〜Ｓ４７）。
例えば、ユーザの嗜好に合致した映像が再生され、配信されている映像に付加情報を取得したい対象が表示されている状態において、ユーザが操作部２１０のポインティングデバイスのカーソルを画面に映し出された対象の上に合わせてクリックすると、その画面上での位置情報が表示映像マッチング部５１０に通知される（Ｓ４５）。そして、表示映像マッチング部５１０は、映像に付与されている内容情報からどの対象が選択されたのかを特定し（Ｓ４６）、対応する付加情報の中からその特定した付加情報だけを映像出力部２２０に送る。例えば、図４に示される対象Ａ，Ｂが表示されている場合に、右側の対象Ｂ上の位置がクリックされた場合、表示映像マッチング部５１０は、まず、図５に示される内容情報に基づいてＮｉｙａｍｏｔｏを特定する。すると、表示映像マッチング部５１０は、２人についての付加情報の中からＮｉｙａｍｏｔｏに関する付加情報だけを読み出し、映像出力部２２０に送る。これによって、映像出力部２２０には、取得したい対象の付加情報だけが表示される（Ｓ４７）。
【００７８】
このように、実施の形態２に係る多視点映像配信システム２によれば、視聴者は複数の視点から撮影された映像の中から、好みに合致した映像を視聴することができるだけでなく、さらに、配信される映像に働きかけを行うことによって、興味をもっている対象に関連する情報（付加情報）を取得することが可能となる。
【００７９】
ところで、映像記録部５２０のコンテンツデータベース５２１には、映像配信装置４０から送られてきた多視点映像と、各映像ごとの内容情報と、各映像ごとの付加情報とがすべて揃ったコンテンツ５２４が格納されている。したがって、このコンテンツについては、映像配信装置４０から再配信を受けるまでもなく、映像受信装置５０において、繰り返し再生することができる。
【００８０】
また、繰り返し再生の際に、表示映像マッチング部５１０が、映像記録部５２０の嗜好データベース５２５から嗜好値入力ダイアログ１４６を読み出して、ユーザが入力した前回と異なる嗜好情報に基づいて複数の視点から撮影された映像の中から、この嗜好に合致した映像を再生することもでき、この場合には、ユーザは前回とは異なる対象（選手）を中心とした別の編集の映像を視聴することができる。
【００８１】
以上、本発明に係る映像配信システムを実施形態に基づいて説明したが、本発明は実施の形態に限定されるものでなく、以下に述べる変形例についても適用される。
【００８２】
上記実施の形態では、映像コンテンツの配信ごとに、嗜好値入力ダイアログ１４６を表示し、視聴者の嗜好情報を取得するようにしたが、このようなタイミングではなく、嗜好の履歴を用いて多視点映像の中から１つの映像を選択するようにしてもよい。例えば、過去に取得された視聴者の嗜好情報等を映像配信装置４０に蓄積しておき、その情報を参照することで、映像コンテンツの配信の度に視聴者から嗜好情報を取得するという手間を省くことができる。
【００８３】
また、上記実施の形態１では、付加情報提供部１５０は、映像受信装置２０において位置指定がされた場合にだけ、付加情報が映像配信装置１０から映像受信装置２０に送信されたが、視聴者の指定を待たずに、配信が決定された映像についての付加情報を映像コンテンツとともにあらかじめ配信しておいてもよい。これによって、視聴者が指示を発してから付加情報を取得するまでの時間が短縮されるので、早い応答性を有する映像配信システムが実現できる。
【００８４】
さらに、これとは逆に、上記実施の形態２では、付加情報提供部４１０が多視点映像のそれぞれについて付加情報が添付されたが、映像受信装置５０において位置指定がされた場合にだけ付加情報を配信するようにしてもよい。これによって、最終的に選択されるか否か不明な映像コンテンツの付加情報についても配信しておくことに起因する通信ネットワーク３０における通信負荷が軽減される。
【００８５】
また、上記実施の形態１，２では、サッカーのライブ中継を例に説明したが、野球等、屋外で行われるスポーツ等のライブ中継や、屋内で行われる音楽会、芝居等のライブ中継にも勿論適用できる。
【００８６】
さらに、上記実施の形態１，２では、嗜好のほか、映像中のオブジェクトごとの大きさや、位置だけを映像選択の際における評価の対象としたが、この評価の対象にオブジェクトの動きを加えるようにしてもよい。
【００８７】
すなわち、屋内でのライブ中継の場合、この施設にモーションキャプチャシステムを設置することにより、対象（歌手等）がステージ上を走り回るような激しい動きを検出することもできる。一方、例えば、ライブステージでは、複数の被写体が混在する中で主役（注目される人）がリアルタイムに入れ替わるような演出が行われたりする。このような場合、じっとしている人を見るよりは、その時点でステージ上を走り回っているような激しい動きをしている人（活躍している人）を見たいというのが、視聴者の心理であり、嗜好に合致する。したがって、モーションキャプチャシステムの機器を用いて得られる映像中で表示されている対象の動き量を映像解析部１２０で解析し、動き量を内容情報に含め、動きの激しい被写体ほど、注目度や、関心度が高いとして、この映像を選択するようにしてもよい。
【００８８】
（実施の形態３）
図１４は、あるグループ「スペード」のライブコンサートのステージの様子を示す図である。
同図に示されるように、ステージの周囲には、複数台（図示、４台）のカメラＣ１〜Ｃ４が固定配設され、スペードのメンバー（図１４の左から古垣、下原、前井、陸袋）の肢体には、複数のマーカＭがそれぞれ取着されている。
【００８９】
各カメラＣ１〜Ｃ４は、Ｒ，Ｇ，Ｂの各色画像を取得するほか、赤外光を射出する発光部と、マーカＭで反射された赤外光を受光する受光部とを備えており、フレームごとにマーカで反射された映像を受光部で取り込むように構成されている。このフレームごとのマーカ映像は、例えば図１に示される映像解析部１２０に送られて、対象の動き量が解析される。
【００９０】
図１５は、２つのマーカ画像（Ｐ１，Ｐ２）から動き量を解析する様子を示す図である。なお、ここでは、図１４に示されるメンバー下原だけが映っている２つのマーカ画像から動き量を解析する場合が示されている。
【００９１】
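The video analysis unit 120 compares the corresponding markers M in the two marker images P1 and P2 and measures the motion amounts Δv1, Δv2, Δv3, Δv4, …, Δv(n−1), Δvn of the respective body parts such as the shoulder, elbow, wrist, …, and toes. When all parts have been measured, the video analysis unit 120 computes the sum of these measurements, takes the result as the motion amount of the object (the singer) displayed in the video at that time, and includes it in the content information. The motion amounts may instead be computed in order, starting from a reference such as the waist or shoulders and proceeding to the arms and wrists. Marker images obtained from multiple viewpoints may also be combined to measure three-dimensional motion vectors. In that case, even when markers overlap in a single marker image, each marker can be distinguished, miscalculation of the motion amount is avoided, and a highly accurate motion amount can be obtained.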
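The summation of per-marker motion amounts can be sketched as follows. Measuring each Δv as the Euclidean distance between a marker's positions in P1 and P2, and skipping markers missing from either image, are assumptions made here; the description does not define the metric.

```python
import math
from typing import Dict, Sequence


def motion_amount(markers_p1: Dict[str, Sequence[float]],
                  markers_p2: Dict[str, Sequence[float]]) -> float:
    """Sum the displacement of every marker present in both marker images."""
    total = 0.0
    for marker_id, pos1 in markers_p1.items():
        pos2 = markers_p2.get(marker_id)
        if pos2 is None:
            continue  # assumption: a marker hidden in one image contributes nothing
        total += math.dist(pos1, pos2)  # works for 2-D and 3-D positions alike
    return total
```

Because `math.dist` accepts points of any matching dimension, the same function would cover the three-dimensional motion vectors obtained by combining several viewpoints.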
【００９２】
図１６は、映像解析部１２０により生成される内容情報の一例を示す図である。
この例では、＜ＲｅｇｉｏｎＬｏｃａｔｏｒ＞記述子に、画像中における歌手の表示されている大きさを有する位置＜Ｐｏｓｉｔｉｏｎ＞と、計測器（位置センサ・ＧＰＳ）等によって取得された大きさを有さないポイントの位置＜Ｌｏｃａｔｉｏｎ＞とが合わせて記述されており、対象の画面上の大きさと、中央等の位置と両者でオブジェクト単位での評価を行うことができるようになっている。さらに、この内容情報では、＜ｍｏｔｉｏｎ＞記述子により、動き量についてオブジェクト単位での評価を行うことができるようになっている。
【００９３】
このように内容情報がオブジェクトの大きさ、位置のほか、動き量が含められて構成されている場合、個々の歌手の嗜好度や、画面上における対象の大きさ、位置、動き等に基づき、個々の対象をオブジェクトごとに評価し、総合的に判断して配信映像を決定する場合には、例えば図１７に示されるフローチャートの手順にしたがう。
【００９４】
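The distribution video matching unit 130 first refers, for the video from every camera, to the rectangle information in the <RegionLocator> descriptor of the content information and computes the area of the rectangle enclosing each object (singer) (S2221). After computing the rectangle areas, the distribution video matching unit 130 uses a function that takes its maximum at the screen center and its minimum at the screen edges (for example, f(x, y) = sin(π*x/(2*x_mid)) * sin(π*y/(2*y_mid))) to compute the function value for each singer's position (S2222). After computing the function values, the distribution video matching unit 130 refers, for the video from every camera, to the <motion> descriptor of the content information and reads out the motion amount (S2223).
【００９５】
When the area computation, function-value computation, and motion-amount readout are complete, the distribution video matching unit 130 obtains the value of the objective function for the video from every camera as follows (S2224): it computes the product of each area and the corresponding singer's preference degree and sums it over the singers displayed on the screen; it computes the product of each position value and the corresponding singer's preference degree and sums it over the singers displayed on the screen; and it sums the motion amounts of the singers displayed on the screen.
Once the objective function values have been obtained for the video from every camera, the video from the viewpoint with the largest objective function value is selected as the distribution video (S2225).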
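The evaluation of S2221 to S2225 can be sketched as below. The description lists three sums but does not say how they are combined; adding them without weights is an assumption made here, as are the dictionary keys "rect", "location" and "motion" used to mirror the <Position>, <Location> and <motion> descriptors.

```python
import math
from typing import Dict


def objective(objs: Dict[str, dict], preferences: Dict[str, float],
              x_mid: float, y_mid: float) -> float:
    """Area term + position term + motion term for one camera's frame."""
    area_term = position_term = motion_term = 0.0
    for name, info in objs.items():
        pref = preferences.get(name, 0)
        x1, y1, x2, y2 = info["rect"]
        area_term += (x2 - x1) * (y2 - y1) * pref            # S2221
        x, y = info["location"]
        weight = (math.sin(math.pi * x / (2 * x_mid))
                  * math.sin(math.pi * y / (2 * y_mid)))     # S2222
        position_term += weight * pref
        motion_term += info["motion"]                        # S2223
    # Assumption: the three sums are simply added (S2224).
    return area_term + position_term + motion_term


def select_view(views: Dict[str, Dict[str, dict]], preferences: Dict[str, float],
                x_mid: float, y_mid: float) -> str:
    """S2225: return the camera whose frame maximizes the objective function."""
    return max(views, key=lambda cam: objective(views[cam], preferences, x_mid, y_mid))
```

Because areas, position weights, and motion amounts are on very different scales, a real system would probably normalize or weight the three terms; that choice lies outside what the description states.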
【００９６】
このようにして、動き量を評価値の中に含めると、じっとしているよりも活躍しているであろう動きの多い歌手の映像が高く評価され、高く評価された映像が数フレームごとに選択されることになる。この結果、映像配信装置１０において多視点映像の中から各ユーザの嗜好に合致した映像が配信されることになる。
【００９７】
【発明の効果】
以上の説明から明らかなように、本発明に係る映像配信装置は、通信ネットワークを介して映像受信装置と通信する映像配信装置であって、異なる視点からの複数の映像を取得する映像取得手段と、前記映像ごとに、その映像に含まれる内容を解析し、解析結果を内容情報として生成する映像解析手段と、前記各内容情報と、視聴者より通知された嗜好情報との適合度を判定し、配信する映像を決定し、決定した映像を配信する配信映像マッチング手段とを備えることを特徴とする。
つまり、異なる視点からの複数の映像の中から各映像ごとに生成された内容情報と視聴者の嗜好情報との適合度で決定し、視聴者の嗜好に合致した１つの映像を視聴者の映像受信装置に対して配信する。
【００９８】
これにより、視聴者は、自己の嗜好に合致した映像を選択的に視聴することができる。したがって、映像の選択に関して視聴者の要求を満足させることができる。しかも、映像取得手段、映像解析手段および配信映像マッチング手段による処理を高速に繰り返して行うことで、リアルタイム映像に関しても配信の対象とすることができる。
【００９９】
ここで、内容情報には、被写体を同定する情報や、被写体の表示位置または表示領域を表す情報を含めてもよい。また、嗜好情報を得るための入れ物を映像受信装置側に配信し、この入れ物に被写体に対する嗜好の度合いを入力させることにより嗜好情報を取得してもよい。また、配信した映像について視聴者から画面上の位置が指定されると、その位置の被写体を特定し、この被写体に関する付加情報を送信するようにしてもよい。
【０１００】
さらに、本発明は、通信ネットワークを介して映像受信装置と通信する映像配信装置であって、異なる視点からの複数の映像を取得する映像取得手段と、前記映像ごとに、その映像に含まれる内容を解析し、解析結果を内容情報として生成する映像解析手段と、前記各映像および前記各内容情報を多重化して配信する映像多重化手段とを備えることを特徴とする映像配信装置とすることもできる。この場合には、映像受信装置の側において映像配信装置から配信されてきた各内容情報と、視聴者より通知された嗜好情報との適合度を判定し、映像配信装置から配信されてきた複数の映像の中から再生する１つの映像を決定し、決定した映像を再生するようにすればよい。
【０１０１】
これによって、このような映像配信装置から配信した各映像および各内容情報を受信する映像受信装置において、各内容情報と視聴者より通知された嗜好情報との適合度を判定し、再生する映像を決定し、決定した映像を再生することにすれば、視聴者は、自己の嗜好に合致した映像を選択的に視聴することができる。
【０１０２】
また、本発明は、このような特徴的な手段をコンピュータに機能させるプログラムとして実現したり、そのプログラムを記録した記録媒体として実現したりすることもできる。そして、本発明に係るプログラムをインターネット等の通信網や記録媒体等を介して流通させることもできる。
【０１０３】
このように、本発明により、視聴者は、例えば、スポーツ観戦の番組において、自分がひいきにしている選手が頻繁に登場する映像を選択的に視聴することができ、楽しい時間を過ごすことができる。よって、本発明は、映像配信システムが提供するサービスの価値を飛躍的に向上させるものであり、その実用的価値は極めて高い。
【図面の簡単な説明】
【図１】本発明の実施の形態１における映像配信システム１の機能構成を示すブロック図である。
【図２】映像配信システム１の動作を示すシーケンス図である。
【図３】図３（ａ）は本発明の実施の形態１で用いるカメラ座標系における位置と、投影面上における位置の関係を示す斜視図であり、図３（ｂ）は図３（ａ）を投影面に沿って上方から見た図であり、図３（ｃ）は図３（ａ）を投影面に沿って側方から見た図である。
【図４】図１に示される映像取得部１１０により取得された映像の一例を示す図である。
【図５】図１に示される映像解析部１２０により生成される内容情報の一例を示す図である。
【図６】図１に示される配信映像マッチング部１３０により生成される嗜好値入力ダイアログの一例を示す図である。
【図７】図１に示される映像受信装置２０から送られてくる嗜好情報の一例を示す図である。
【図８】配信映像マッチング部１３０が最も嗜好度が高い対象物を利用して配信する映像を決定する際に実行するフローチャートである。
【図９】配信映像マッチング部１３０が個々の嗜好度から総合的に判断して配信する映像を決定する際に実行するフローチャートである。
【図１０】図１に示される付加情報提供部１５０から送られる付加情報の一例を示す図である。
【図１１】本発明の実施の形態２における映像配信システム２の機能構成を示すブロック図である。
【図１２】映像配信システム２の動作を示すシーケンス図である。
【図１３】映像、内容情報および付加情報の多重化・分離方法の一例を示す図である。
【図１４】あるグループ「スペード」のライブコンサートのステージの様子を示す図である。
【図１５】２つのマーカ画像（Ｐ１，Ｐ２）から動き量を解析する様子を示す図である。
【図１６】映像解析部１２０により生成される内容情報の一例を示す図である。
【図１７】配信映像マッチング部１３０が個々の嗜好度等から総合的に判断して配信する映像を決定する際に実行するフローチャートである。
【符号の説明】
１，２映像配信システム
１０，４０映像配信装置
２０，５０映像受信装置
３０通信ネットワーク
１１０映像取得部
１２０映像解析部
１３０配信映像マッチング部
１４０，４２０，５２０映像記録部
１４１，４２１，５２１コンテンツデータベース
１４４，４２４，５２４コンテンツ
１４５，５２５嗜好データベース
１４６嗜好値入力ダイアログ
１４７嗜好履歴テーブル
１５０，４１０付加情報提供部
２１０操作部
２２０映像出力部
２３０送受信部
４３０映像多重化部
４４０多重化映像情報配信部
５１０表示映像マッチング部

[0001]
[Technical field of the invention]
The present invention relates to a video distribution device and a video reception device that distribute or receive a video such as a sports program.
[0002]
[Prior art]
With the advancement of communication network infrastructure, technologies for distributing and receiving video such as sports programs are being developed. Conventional techniques for such video distribution and reception include the video information distribution system disclosed in JP-A-7-95322 (first publication) and the program distribution device disclosed in JP-A-2-54646 (second publication).
[0003]
The video information distribution system disclosed in the first publication includes a video center, a video dial tone trunk, and a user terminal. When the user calls the video center, a program desired by the user is transmitted from the video center via a transmission path. The video dial tone trunk receives video information transferred at high speed from the video center, reproduces it into normal speed video information, and transmits it to the user terminal via a low-speed transmission path.
[0004]
The program distribution device disclosed in the second publication comprises: a storage device that holds a plurality of moving-image programs; a distribution device that receives a program distribution request and an advertisement insertion request from a terminal device via a network, divides the moving-image program and the designated advertisement into information blocks, and distributes them via the network; and a control device that varies the charge according to the timing at which the advertisement designated in the advertisement insertion request is inserted.
[0005]
[Problems to be solved by the invention]
However, in the above conventional techniques, the video delivered to the viewer is shot from one specific viewpoint solely according to the producer's intention; the viewer can neither watch video suited to his or her own preferences nor change the viewpoint. For example, in a program for watching a sport such as soccer, a viewer who wants to watch a particular favorite player closely has no choice but to watch the video as produced, even if that player appears in only a few scenes and other players appear most of the time.
[0006]
In addition, the above-described conventional technique has a problem in that a program must be recorded in advance in a video center or a storage device, and a mechanism for distributing real-time video is not provided.
Therefore, the present invention has been made in view of such a situation, and an object thereof is to provide a video distribution device and a video reception device capable of distributing a video reflecting viewer's preference.
[0007]
A further object of the present invention is to provide a video distribution device and a video reception device capable of distributing not only stored video but also real-time (live) video in a way that reflects the viewer's preferences.
[0008]
[Means for Solving the Problems]
In order to achieve the above objects, a video distribution device according to the present invention is a video distribution device that communicates with a video reception device via a communication network, comprising: video acquisition means for acquiring a plurality of videos from different viewpoints; video analysis means for analyzing, for each video, the content included in that video and generating the analysis result as content information; and distribution video matching means for determining the degree of matching between each piece of content information and preference information notified by the viewer, deciding the video to be distributed, and distributing the decided video. In other words, a video is decided from among the plurality of videos from different viewpoints according to the degree of matching between the content information generated for each video and the viewer's preference information, and the one video that matches the viewer's preferences is distributed to the viewer's video reception device.
[0009]
Here, the content information may include information identifying a subject and information indicating the display position or display area of the subject. A container for obtaining the preference information may also be distributed to the video reception device, and the preference information may be acquired by having the viewer enter the degree of preference for each subject into this container. Further, when the viewer designates a position on the screen of the distributed video, the subject at that position may be identified and additional information about that subject transmitted.
[0010]
Furthermore, the present invention can also be a video distribution device that communicates with a video reception device via a communication network, comprising: video acquisition means for acquiring a plurality of videos from different viewpoints; video analysis means for analyzing, for each video, the content included in that video and generating the analysis result as content information; and video multiplexing means for multiplexing and distributing the videos and the pieces of content information. In this case, the video reception device determines the degree of matching between each piece of content information distributed from the video distribution device and the preference information notified by the viewer, decides one video to be played back from among the plurality of videos distributed from the video distribution device, and plays back the decided video.
[0011]
The present invention can also be realized as a program that causes a computer to function as these characteristic means, or as a recording medium on which the program is recorded. The program according to the present invention can also be distributed via a communication network such as the Internet or via a recording medium.
[0012]
DETAILED DESCRIPTION OF THE INVENTION
(Embodiment 1)
Hereinafter, a video distribution system according to Embodiment 1 of the present invention will be described with reference to the drawings. In this embodiment, video centered on the players in a sports broadcast such as soccer is used as an example of shooting targets in a limited space, but the present invention is applicable to any shooting space and any shooting target.
[0013]
FIG. 1 is a block diagram showing a functional configuration of a video distribution system 1 according to Embodiment 1 of the present invention.
The video distribution system 1 according to Embodiment 1 of the present invention is a communication system that stream-distributes content such as video according to the user's preferences, and comprises a video distribution device 10, a video reception device 20, and a communication network 30 that connects them.
[0014]
The video distribution device 10 is a distribution server, such as a computer, that constructs in real time video content edited by switching to and selecting, every several frames, the one video from among a plurality of videos (multi-view video) that matches the user's preferences and preference history, and stream-distributes it to the video reception device 20. It comprises a video acquisition unit 110, a video analysis unit 120, a distribution video matching unit 130, a video recording unit 140, an additional information providing unit 150, a video information distribution unit 160, and the like.
[0015]
The video acquisition unit 110 consists of a plurality of photographing devices (video cameras, etc.) that are distributed within a predetermined shooting space (for example, a soccer field) and acquire a plurality of videos (multi-view video) by shooting a plurality of subjects in the limited shooting space from various viewpoints and angles. The multi-view video acquired by the video acquisition unit 110 is transmitted to the video analysis unit 120 by cable or wireless communication.
[0016]
The video analysis unit 120 acquires the contents of each video (specifically, which subject (for example, a player) appears at which position on the screen) for each frame, and generates the result for each frame of each video as content information described by multimedia content descriptors (Descriptors) such as those of MPEG-7.
[0017]
The distribution video matching unit 130 compares the user's preference and preference history sent from the video receiving device 20 with the content information of each video of the live content acquired by the video acquisition unit 110 and of the storage content held in the video recording unit 140, and constructs in real time video content edited by selecting, every several frames, the one video that matches the user's preference and preference history from among the plurality of videos (multi-view video). It also stores the multi-view video, with its content information added, in the content database 141 of the video recording unit 140, and generates the preference value input dialog 146 and stores it in the preference database 145.
[0018]
The video recording unit 140 is a hard disk or the like that holds a content database 141, which holds the storage content to be distributed, and a preference database 145 for acquiring the preferences of each user. The content database 141 stores a mode selection dialog 142 for selecting the live (live broadcast) or storage (recorded broadcast) mode, a content list 143 of the content being relayed live and the stored storage content, and the content 144 itself. In addition, the preference database 145 stores, for each content, a preference value input dialog 146 for inputting a preference value (preference degree) for each subject and, for each user, a preference history table 147 that stores the preference history input by that user.
[0019]
The additional information providing unit 150 is a hard disk or the like that holds an attached information table 151 in which additional information related to the distribution video provided to the viewer (a profile of each subject (object)) is stored in advance for each live and storage content; for soccer relay content, for example, this is each player's date of birth and other profile details. In this attached information table 151, for example, "birth date", "main career", "feature", and "comment of a player" are recorded in advance for each player. When there is a notification specifying a player name or the like, the additional information related to the specified player is transmitted to the video receiving device 20.
[0020]
The video information distribution unit 160 is a bidirectional communication interface or driver software for communicating with the video receiving device 20 via the communication network 30.
[0021]
The video receiving device 20 interacts with the user regarding live or storage mode selection, preference value input, and the like, and presents the video content distributed from the video distribution device 10 to the user. It is a terminal, a digital broadcasting TV, or the like, and includes an operation unit 210, a video output unit 220, and a transmission / reception unit 230.
[0022]
The operation unit 210 is a device such as a remote controller, a keyboard, or a pointing device such as a mouse. Through interaction with the user, it specifies the content desired by the user, accepts a preference value and passes it to the transmission / reception unit 230 as preference value information, and passes the position information of a subject displayed on the video output unit 220 to the transmission / reception unit 230.
[0023]
The transmission / reception unit 230 is a transmission / reception circuit or driver software for serial communication with the video distribution apparatus 10 via the communication network 30.
[0024]
The communication network 30 is a bidirectional transmission path that connects the video distribution device 10 and the video reception device 20, and is a communication network such as the Internet carried over a broadcasting / communication network such as CATV, a telephone network, a data communication network, or the like.
[0025]
The operation of the video distribution system 1 configured as described above will be described in order along the sequence shown in FIG. 2 (main processing flow of this system). Note that the sequence in this figure shows the flow for the multi-view video at a certain point in time.
[0026]
The video acquisition unit 110 of the video distribution apparatus 10 consists of a plurality of photographing devices, such as video cameras, capable of acquiring video, and acquires a plurality of videos (multi-view video) in which a plurality of subjects in a limited shooting space are captured from various viewpoints and angles (S11). Because the video distribution apparatus 10 according to the present embodiment needs to capture the limited space from various viewpoints and angles, it is desirable to distribute as many photographing devices as possible within the shooting space; however, the present invention is not limited by the number or arrangement of the devices. The multi-view video acquired by the video acquisition unit 110 is transmitted to the video analysis unit 120 by cable or wireless communication. In the present embodiment, all the videos acquired by the video acquisition unit 110 are transmitted to one video analysis unit 120 and managed centrally, but a video analysis unit may instead be provided for each photographing device.
[0027]
The video analysis unit 120 analyzes each video acquired by the video acquisition unit 110, acquires the contents of each video (which subject (for example, a player) appears at which position on the screen) for each frame, and generates the result for each frame of each video as content information described by multimedia content descriptors (Descriptors) such as those of MPEG-7 (S12). Generation of content information requires two steps: (1) content information extraction and (2) content information description. The content information depends largely on the content of the video being shot. For example, in the case of a sports broadcast such as soccer, most of the video is considered to show players in competition. Therefore, in the present embodiment, the players included in the video are identified by analyzing the video, and the player names and the positions at which the players are displayed in the video are generated as the content information. In the following, as examples of content information extraction, two methods for identifying a player in a video (who is shown) and obtaining the display position are described: a method using a measuring instrument and a method using image processing.
[0028]
1. Method using measuring instrument
In the method using a measuring instrument, a position sensor (for example, GPS) that can measure a three-dimensional position in a coordinate system whose reference point is an arbitrary point in space (hereinafter referred to as the global coordinate system), and that is assigned a unique ID number, is attached to each object to be identified. As a result, each object can be identified and its three-dimensional position can be acquired. Next, cameras for acquiring video are installed at various positions and angles.
[0029]
In the first embodiment, the installed cameras are fixed, and panning and tilting are not performed. Therefore, enough cameras must be prepared to cover the entire shooting space in a fixed state. For all cameras whose fixed positions have been determined, the position in the global coordinate system and the line-of-sight (collimation) direction vector are obtained and notified to the video analysis unit 120 in advance. As shown in FIG. 3 (a), the camera used in the present embodiment is assumed, when expressed in a coordinate system fixed to the camera (hereinafter referred to as the camera coordinate system), to have its line-of-sight (projection) direction along the Z axis, its projection center at Z = 0 on the Z axis, and its projection plane at Z = d. From the position sensor attached to each object, the ID number assigned to that sensor and its three-dimensional position coordinates are input to the video analysis unit 120 in time series. The ID number is necessary for identifying the object.
[0030]
Next, a method for identifying the position on the video (on the screen) where the object is displayed using the information from the position sensor and the position information of the camera will be described.
First, the three-dimensional position coordinates of the position sensor in the global coordinate system are converted into an expression in the camera coordinate system. Let Mvi be the matrix that converts the global coordinate system to the camera coordinate system of the i-th camera, and let vw be the output of the position sensor in the global coordinate system. The output (coordinates) vc of the position sensor in the camera coordinate system is then calculated as vc = Mvi · vw. Here, "·" represents the product of a matrix and a vector. This expression is written as follows using matrix and vector components.
[Expression 1]
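As a sketch of the component form of vc = Mvi · vw, assuming that Mvi is the usual 4 × 4 homogeneous transformation matrix (the element names m_{jk} and the homogeneous fourth component are assumptions for illustration, not taken from the source), the expression could be written as:

```latex
\begin{pmatrix} x_c \\ y_c \\ z_c \\ 1 \end{pmatrix}
=
\begin{pmatrix}
m_{11} & m_{12} & m_{13} & m_{14} \\
m_{21} & m_{22} & m_{23} & m_{24} \\
m_{31} & m_{32} & m_{33} & m_{34} \\
m_{41} & m_{42} & m_{43} & m_{44}
\end{pmatrix}
\begin{pmatrix} x_w \\ y_w \\ z_w \\ 1 \end{pmatrix}
```

Here (x_w, y_w, z_w) are the components of vw and (x_c, y_c, z_c) are the components of vc used in the projection that follows.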

[0031]
Next, the two-dimensional coordinates of the position sensor on the projection plane of the camera are obtained by projection transformation. From FIG. 3 (b), which shows FIG. 3 (a) viewed from above along the projection plane, and FIG. 3 (c), which shows FIG. 3 (a) viewed from the side along the projection plane, the coordinates vp = (xp, yp) on the projection plane are xp = xc / (zc / d) and yp = yc / (zc / d). It is then determined whether the calculated xp and yp lie within the projection plane (screen) of the camera, and if so, the coordinates are acquired as the display position. By applying the above processing to all cameras and all objects, it is determined, for each camera, which object is currently displayed at which position.
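The coordinate conversion and projection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names, the 4 × 4 homogeneous form of Mvi, a screen centered on the optical axis with the given half width and half height, and the rejection of points with zc <= 0 are all assumptions introduced here.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_camera(m_vi: Sequence[Sequence[float]], v_w: Vec3) -> Vec3:
    """Compute vc = Mvi . vw, assuming Mvi is a 4x4 homogeneous matrix."""
    hw = (v_w[0], v_w[1], v_w[2], 1.0)
    xc, yc, zc, _ = (sum(row[k] * hw[k] for k in range(4)) for row in m_vi)
    return xc, yc, zc


def project(v_c: Vec3, d: float, half_width: float,
            half_height: float) -> Optional[Tuple[float, float]]:
    """Project onto the plane Z = d; return None if the point is off screen."""
    xc, yc, zc = v_c
    if zc <= 0:
        # Assumption: points at or behind the projection center are not shown.
        return None
    xp = xc / (zc / d)
    yp = yc / (zc / d)
    # Assumption: the screen is a rectangle centered on the optical axis.
    if abs(xp) <= half_width and abs(yp) <= half_height:
        return xp, yp
    return None


if __name__ == "__main__":
    # Hypothetical camera: global origin shifted by (-1, 0, +2), no rotation.
    m = [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 2], [0, 0, 0, 1]]
    print(project(to_camera(m, (2.0, 2.0, 2.0)), d=2.0,
                  half_width=1.0, half_height=1.0))
```

Running `to_camera` and `project` for every camera and every sensor ID gives, per camera, the list of objects that are on screen and their display positions.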
[0032]
2. Method using image processing
In the method using image processing, content information is extracted from the video acquired by the cameras alone, without using position sensors or the like, so the cameras do not need to be fixed as they do in the method using a measuring instrument. To identify an object from the video, it is necessary first to cut out only the object from the video and then to identify it. There is no particular limitation on the method of extracting the target object from the video, but in the sports broadcast example described above the background is basically a single color (for example, the color of the lawn in soccer or American football relays), so the background and the object can be separated using color information. Techniques for identifying the objects extracted from the video are described below.
[0033]
(1) Template matching
For each player, a number of template images are prepared, the object separated from the background is matched against the template images, and the player is identified from the image considered to be the best match. Specifically, first, focusing on a certain player included in the video, the minimum rectangle surrounding the player (hereinafter referred to as the "target rectangle") is obtained. Next, a given template (assumed to be a rectangle) is resized to the target rectangle, by down-sampling when the template is larger than the target rectangle and by up-sampling when it is smaller. Then, the difference between the pixel value at a given position in the target rectangle and the pixel value at the same position in the template image is taken. This is done for all pixels, and the sum S of the differences is calculated. The above processing is performed for all template images, and the player of the template image with the smallest S is taken to be the player being identified.
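The matching procedure above can be sketched in a few lines. The source only says that "the difference" of pixel values is summed; the sketch below assumes the absolute difference on grayscale pixels and nearest-neighbor resampling, and all function names are hypothetical.

```python
from typing import Dict, List, Optional

Image = List[List[int]]  # grayscale image as rows of pixel values


def resize_nearest(img: Image, height: int, width: int) -> Image:
    """Up- or down-sample img to height x width by nearest-neighbor sampling."""
    src_h, src_w = len(img), len(img[0])
    return [[img[r * src_h // height][c * src_w // width]
             for c in range(width)] for r in range(height)]


def difference_sum(target: Image, template: Image) -> int:
    """Sum S of per-pixel differences (assumed absolute) after resizing."""
    h, w = len(target), len(target[0])
    scaled = resize_nearest(template, h, w)
    return sum(abs(target[r][c] - scaled[r][c])
               for r in range(h) for c in range(w))


def identify_player(target: Image,
                    templates: Dict[str, List[Image]]) -> Optional[str]:
    """Return the player whose template gives the smallest S."""
    best_name, best_s = None, None
    for name, images in templates.items():
        for tpl in images:
            s = difference_sum(target, tpl)
            if best_s is None or s < best_s:
                best_name, best_s = name, s
    return best_name


if __name__ == "__main__":
    target = [[10, 10], [200, 200]]
    templates = {"A": [[[10], [200]]], "B": [[[200], [10]]]}
    print(identify_player(target, templates))
```

A production system would more likely use normalized correlation on color images, but the structure (resize, compare, keep the minimum) is the same.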
[0034]
(2) Motion prediction
In sports broadcast video, the movement of a player is continuous, so there is no dramatic change between frames. In addition, since the moving direction and speed are limited, if the position of the player in the current frame is known, the position in the next frame can be predicted to some extent. It is therefore possible to predict, from the position of the player in the current frame, the range of positions the player can take in the next frame, and to apply template matching only within that range. Moreover, since the positional relationship between the player of interest and the surrounding players does not change dramatically either, it can also be used as information for motion prediction. For example, if a player who was displayed next to the player of interest one frame earlier has a known position in the current frame, the player of interest is likely to be nearby, so the position in the current frame can be predicted.
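One simple way to restrict the search range is to grow the previous target rectangle by the largest displacement expected in one frame. The sketch below assumes such a per-frame bound (`max_move`, a hypothetical parameter not given in the source) and clamps the window to the frame.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom in pixels


def search_window(prev_rect: Rect, max_move: int,
                  frame_w: int, frame_h: int) -> Rect:
    """Range in the next frame where the player may appear.

    max_move is an assumed upper bound, in pixels, on how far a player
    can move between two consecutive frames.
    """
    left, top, right, bottom = prev_rect
    return (max(0, left - max_move),
            max(0, top - max_move),
            min(frame_w - 1, right + max_move),
            min(frame_h - 1, bottom + max_move))


if __name__ == "__main__":
    print(search_window((10, 20, 30, 60), max_move=15, frame_w=100, frame_h=70))
```

Template matching is then run only inside the returned window rather than over the whole frame.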
[0035]
(3) Use of pre-acquired information
In the case of sports broadcasts, the opposing teams often wear different colored uniforms. Since the color of the uniform can be acquired in advance, the team can be identified using the color information. In addition, the uniform is given a back number, and the back number is not used redundantly, which is very effective in identifying individual players.
[0036]
The identification of an object and the acquisition of the position at which it is displayed are achieved by combining the above-described methods. For example, the team is first determined by matching the color information of the object against the color information of the uniforms. Next, a large number of template images obtained by cutting out only the number portion of the uniform are prepared, and the back number is identified using template matching. Identification is complete for any player whose back number could be identified. For a player who could not be identified, motion prediction is performed using the image of the previous frame and the positional relationship with surrounding players that have already been identified, and template matching is performed over the predicted range with whole-body images of the player as the template images. The position is specified by the upper left position and the lower right position of the target rectangle in the main scanning direction and the sub scanning direction.
[0037]
Next, the description (Description) of the acquired content information will be explained. The content information is described using a description format for multimedia content such as MPEG-7. In the present embodiment, the player names extracted by the above procedure and their display positions in the video are described as content information. For example, when two players A (for example, Anno) and B (for example, Niyamato) are included in the video as shown in FIG. 4, the content information is described, for example, as shown in FIG. 5.
[0038]
In this figure, <Information> is a descriptor (tag) indicating the start and end of the content information, and <ID> is a descriptor that identifies each player. The <ID> descriptor contains an <IDName> descriptor for identifying the player's name and an <IDOrganization> descriptor for identifying the player's affiliation. The <RegionLocator> descriptor indicates the position at which the player is displayed in the video, and is acquired by the above-described methods. The values enclosed in the <Position> descriptor within the <RegionLocator> descriptor represent, in order, the upper left X coordinate, the upper left Y coordinate, the lower right X coordinate, and the lower right Y coordinate of the rectangle containing the player. Note that the rectangle containing a player can be acquired by the method using image processing, but not by the method using only a measuring instrument (position sensor / GPS). Therefore, when only the measuring instrument is used, the same value, that is, the coordinate position of a single point, is described for both the upper left and lower right coordinates. The video analysis unit 120 generates the above content information for all videos input from the plurality of cameras. Further, since the content information is generated for each frame, the video and the content information have a one-to-one correspondence.
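As an illustration, content information using the descriptor names above could be generated with Python's standard `xml.etree.ElementTree` module. FIG. 5 is not reproduced in this text, so the nesting (for example, placing <RegionLocator> inside <ID>), the space-separated <Position> value, and the team names are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_content_information(players: List[Dict]) -> str:
    """Serialize one frame's content information.

    Each entry of players is a dict with keys 'name', 'organization' and
    'rect' (upper-left x, y, lower-right x, y). When only a position sensor
    is used, the same point is given for both corners.
    """
    root = ET.Element("Information")
    for p in players:
        id_el = ET.SubElement(root, "ID")
        ET.SubElement(id_el, "IDName").text = p["name"]
        ET.SubElement(id_el, "IDOrganization").text = p["organization"]
        # Assumed layout: the locator is nested inside the player's <ID>.
        locator = ET.SubElement(id_el, "RegionLocator")
        ET.SubElement(locator, "Position").text = " ".join(
            str(v) for v in p["rect"])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    frame = [
        {"name": "Anno", "organization": "Team X", "rect": (10, 20, 110, 220)},
        # Sensor-only case: a single point repeated for both corners.
        {"name": "Niyamato", "organization": "Team Y",
         "rect": (250, 120, 250, 120)},
    ]
    print(build_content_information(frame))
```

One such document would be produced per camera and per frame, which gives the one-to-one correspondence between video and content information.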
[0039]
Next, the distribution video matching unit 130, the video information distribution unit 160, and the video output unit 220 of the video reception device 20 will be described. The viewer can view the video transmitted to the video output unit 220 via the video information distribution unit 160 and, conversely, can notify the distribution video matching unit 130 of his / her preference information. In the case of sports broadcasts, the video centers on the players who participate in the competition, and which players will participate is determined in advance. Therefore, in the present embodiment, the targets for which a preference level can be set are the players who participate in the competition.
[0040]
When each content information is generated by the video analysis unit 120, the distribution video matching unit 130 stores the multi-view video related to the live content and its content information in the content database 141 (S13).
The distribution video matching unit 130 generates a preference value input dialog 146 using the template images, names, and back numbers used in the template matching method, stores the preference value input dialog 146 in the preference database 145, and then reads from the content database 141 the mode selection dialog 142 for selecting either the live or the storage mode and transmits it (S14). When the user of the video receiving apparatus 20 clicks the switch button of the mode selection dialog 142 with the mouse of the operation unit 210 to designate one of the modes (S15), mode designation information indicating which mode was designated is transmitted from the video receiving device 20 to the video distribution device 10 (S16).
[0041]
When the mode designation information is transmitted, the distribution video matching unit 130 reads the content list 143 for the mode designated by the user from the content database 141 and transmits it to the video reception device 20 (S17). It also sets a switch (not shown), which switches between distributing the live content and distributing the storage content stored in the video recording unit 140, to the designated side.
[0042]
When the user of the video reception device 20 designates the desired content by clicking it with the mouse of the operation unit 210, the content name designated by the user is transmitted from the video reception device 20 to the video distribution device 10 (S18).
[0043]
When the content is specified, the distribution video matching unit 130 reads from the preference database 145 the preference value input dialog 146, which is a table based on the content information for setting preference information about the specified content, and transmits it together with an edit program to the video receiving device 20 (S19). The preference value input dialog 146 includes, for example, an edit image and a script (name, back number, etc.); it is generated by the distribution video matching unit 130 from the template images used for the template matching method, the names, the back numbers, and the like, and is stored in the preference database 145 of the video recording unit 140. The preference value input dialog 146 may be transmitted while the live content is being relayed, but it is preferably transmitted before the relay starts. Until the latest preference information is acquired, the only option is to select video based on, for example, the preference history stored in the preference history table 147 from the last time the same match-up was viewed, and selecting video based on the latest preference as early as possible matches the user's preference better.
[0044]
FIG. 6 shows an example of the GUI of the preference value input dialog 146. The interface of FIG. 6 includes, for each participating player, a "face image", "name", "number", and an "edit box" (spin box) for inputting the preference level. The viewer uses the remote controller of the operation unit 210 or a device such as a keyboard to move the cursor to the edit box of the player whose preference level is to be set, and inputs the preference level. Alternatively, the viewer may place the cursor on the up and down arrow icons next to the edit box and click to increase or decrease the preference value. In the present embodiment, the preference level is lowest at "0" and highest at "100". The method described above uses absolute evaluation, but a relative evaluation method, such as ranking the participating players, may also be used. The preference information acquired in this way is transmitted to the video distribution apparatus 10 (S20). FIG. 7 shows an example of preference information. As shown in the figure, the preference information, like the content information, is described using a description format for multimedia content such as MPEG-7. It consists of a descriptor <ID> for identifying each player, which contains a descriptor <IDName> for identifying the player's name and a descriptor <Preference> for identifying the preference level. This preference information is notified to the distribution video matching unit 130 via the video information distribution unit 160, and the preference history table 147 is updated with it (S21).
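On the distribution side, preference information of this form could be read back into a name-to-level table as sketched below. FIG. 7 is not reproduced in this text, so the wrapper element `<Preferences>` in the sample and the function name are hypothetical; only <ID>, <IDName>, and <Preference> come from the description, along with the 0 to 100 range.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def parse_preferences(xml_text: str) -> Dict[str, int]:
    """Return {player name: preference level} from preference information."""
    root = ET.fromstring(xml_text)
    prefs: Dict[str, int] = {}
    for id_el in root.iter("ID"):
        name = id_el.findtext("IDName")
        level = int(id_el.findtext("Preference"))
        # The embodiment defines 0 as the lowest and 100 as the highest level.
        if not 0 <= level <= 100:
            raise ValueError(f"preference level out of range: {level}")
        prefs[name] = level
    return prefs


if __name__ == "__main__":
    # Hypothetical sample; the root element name is an assumption.
    sample = (
        "<Preferences>"
        "<ID><IDName>Anno</IDName><Preference>80</Preference></ID>"
        "<ID><IDName>Niyamato</IDName><Preference>30</Preference></ID>"
        "</Preferences>"
    )
    print(parse_preferences(sample))
```

The resulting table is the input to the matching process and is also what would be written to the preference history table 147.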
[0045]
When the preference information is acquired, the distribution video matching unit 130 executes a matching process that determines which video is to be distributed to the viewer, based on the plurality of videos with the content information generated by the video analysis unit 120, the preference information notified by the viewer, and its history (S22). Hereinafter, two approaches to the matching process are described specifically: a method that decides using the object with the highest preference level, and a method that decides comprehensively from the individual preference levels.
[0046]
1. How to make the decision using the object with the highest preference
When distributing a video in which a player with the highest preference is displayed, for example, the procedure of the flowchart shown in FIG. 8 is followed.
[0047]
(1) The preference information notified from the viewer is analyzed, and the player with the highest preference level (hereinafter also referred to as a distribution target player) is determined (S2201).
[0048]
(2) The content information transmitted from the video analysis unit 120 is analyzed. Among the videos from the plurality of viewpoints, those displaying the distribution target player determined in (1) become distribution video candidates, and the number of such videos is determined (S2202). If only one video displays the distribution target player, the video from that camera is selected (S2203) and distributed to the viewer.
[0049]
(3) When the distribution target player is displayed in a plurality of videos, the video considered most appropriate among them is distributed; the method of deciding is not particularly limited. For example, when rectangle information has been acquired in the <RegionLocator> descriptor (Descriptor) of the content information (Yes in S2204), the area of the rectangle containing the distribution target player is calculated, the video in which this area is largest is selected (S2205), and this video is used as the distribution video.
[0050]
If rectangle information has not been acquired (No in S2204), the position at which the distribution target player is displayed is acquired, and the video in which that position is closest to the center of the screen is used as the distribution video (S2206). If the number of videos in which the distribution target player is shown is "0", the player with the next highest preference level is selected, and the distribution video is determined by executing steps S2202 to S2206 for that player (S2207).
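The flow of FIG. 8 as described in steps (1) to (3) can be sketched as follows. The data layout (a dict from camera ID to the rectangles of the players visible in that camera) and the function names are assumptions for illustration; a rectangle with zero area is treated here as the sensor-only case in which no rectangle information is available.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # upper-left x, y, lower-right x, y
Frame = Dict[str, Rect]                   # player name -> rectangle


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def select_by_top_preference(prefs: Dict[str, int],
                             frames: Dict[str, Frame],
                             screen_center: Tuple[float, float]
                             ) -> Optional[str]:
    """Return the camera ID to distribute, or None if no player is visible."""
    # S2201 / S2207: try players in descending order of preference level.
    for player in sorted(prefs, key=prefs.get, reverse=True):
        # S2202: videos in which this player is displayed.
        candidates = {cam: f[player] for cam, f in frames.items()
                      if player in f}
        if not candidates:
            continue
        if len(candidates) == 1:                       # S2203
            return next(iter(candidates))
        # S2204: assumed test for "rectangle information acquired".
        if any(area(r) > 0 for r in candidates.values()):
            return max(candidates, key=lambda c: area(candidates[c]))  # S2205
        cx, cy = screen_center                         # S2206
        return min(candidates,
                   key=lambda c: (candidates[c][0] - cx) ** 2
                   + (candidates[c][1] - cy) ** 2)
    return None


if __name__ == "__main__":
    prefs = {"Anno": 80, "Niyamato": 30}
    frames = {"cam1": {"Anno": (0, 0, 10, 10)},
              "cam2": {"Anno": (0, 0, 20, 20), "Niyamato": (5, 5, 6, 6)}}
    print(select_by_top_preference(prefs, frames, (50, 50)))
```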
[0051]
2. A method for comprehensive determination from individual preference levels
When determining the distribution video based on the overall preference based on the preference level of each player, for example, the procedure of the flowchart shown in FIG. 9 is followed.
[0052]
(1) From the <RegionLocator> descriptor (Descriptor) of the content information, it is determined whether rectangle information has been acquired for all the videos (S2211). If it has (Yes in S2211), the area of the rectangle containing each player is calculated (S2212). If it has not (No in S2211), the position of each player is input to a function that takes its maximum value at the center of the screen and its minimum value at the edge of the screen, and the value of the function is obtained (S2215). For example, f (x, y) = sin (π * x / (2 * x_mid)) * sin (π * y / (2 * y_mid)) satisfies these conditions, where x and y are pixel positions, x_mid and y_mid are the coordinates of the screen center, and * denotes a product.
[0053]
(2) The product of the value obtained in (1) and the preference level of the corresponding player is calculated, and the sum of these products over the players displayed on the screen is taken as the value of the objective function for that video (S2213, S2216).
[0054]
(3) The video from the viewpoint that maximizes the value of (2) is determined as the distribution video (S2214, S2217).
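Steps (1) to (3) above amount to maximizing a preference-weighted score over the cameras. The sketch below uses the same assumed data layout as a dict from camera ID to player rectangles; the `use_area` flag stands in for the S2211 decision, and the function names are hypothetical. The weighting function f is the one given in step (1).

```python
import math
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # upper-left x, y, lower-right x, y
Frame = Dict[str, Rect]                   # player name -> rectangle


def center_weight(x: float, y: float, x_mid: float, y_mid: float) -> float:
    """f(x, y) = sin(pi*x / (2*x_mid)) * sin(pi*y / (2*y_mid))."""
    return (math.sin(math.pi * x / (2 * x_mid))
            * math.sin(math.pi * y / (2 * y_mid)))


def objective(frame: Frame, prefs: Dict[str, int],
              x_mid: float, y_mid: float, use_area: bool) -> float:
    """Sum over displayed players of (area or f(position)) * preference."""
    total = 0.0
    for player, (x1, y1, x2, y2) in frame.items():
        if use_area:
            value = (x2 - x1) * (y2 - y1)                 # S2212
        else:
            value = center_weight(x1, y1, x_mid, y_mid)   # S2215
        total += value * prefs.get(player, 0)             # S2213 / S2216
    return total


def select_by_overall_preference(frames: Dict[str, Frame],
                                 prefs: Dict[str, int],
                                 x_mid: float, y_mid: float,
                                 use_area: bool) -> str:
    """S2214 / S2217: camera whose video maximizes the objective function."""
    return max(frames, key=lambda cam: objective(
        frames[cam], prefs, x_mid, y_mid, use_area))


if __name__ == "__main__":
    prefs = {"Anno": 80, "Niyamato": 30}
    frames = {"cam1": {"Anno": (0, 0, 10, 10), "Niyamato": (0, 0, 10, 10)},
              "cam2": {"Anno": (0, 0, 12, 10)}}
    print(select_by_overall_preference(frames, prefs, 320, 240, use_area=True))
```

In this example the camera showing both players wins even though the other camera shows the favorite player slightly larger, which is the practical difference from the first method.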
[0055]
Here, if the above processing were performed for every frame, the video could switch from one camera to another in rapid succession. Therefore, the distribution video matching unit 130 applies the above method only once every several frames to determine the video to be distributed to the viewer.
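The effect of re-evaluating only every several frames can be sketched with a small generator. The interval value is an assumption, since the source only says "every few frames", and `choose` stands for whichever of the two matching methods is in use.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def hold_selection(frames_info: Iterable[T],
                   choose: Callable[[T], str],
                   interval: int) -> Iterator[str]:
    """Yield the camera to distribute for each frame.

    The matching function choose is called only on every interval-th frame;
    in between, the previously selected camera is kept.
    """
    current = ""
    for index, info in enumerate(frames_info):
        if index % interval == 0:
            current = choose(info)
        yield current


if __name__ == "__main__":
    # Per-frame best camera if matching were run on every frame (made up).
    per_frame_best = ["cam1", "cam2", "cam2", "cam2", "cam1", "cam1"]
    print(list(hold_selection(per_frame_best, lambda best: best, interval=3)))
```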
[0056]
When the determination of the distribution video is completed as described above, the distribution video matching unit 130 distributes the determined video as a stream (S23 in FIG. 2). Then, the video output unit 220 of the video reception device 20 reproduces the video distributed via the transmission / reception unit 230 on the screen (S24 in FIG. 2).
[0057]
In this way, in the video distribution system 1 according to the first embodiment, the video distribution apparatus 10 selects, every several frames, the video that matches each user's preference from among the multi-view videos and distributes it to the video reception apparatus 20, where it is reproduced by the video output unit 220.
[0058]
Subsequently, the viewer can acquire additional information by acting on the distributed video (steps S25 to S29 in FIG. 2). Hereinafter, for example, a method of acquiring additional information using a pointing device such as a mouse of the operation unit 210 will be described.
[0059]
For example, when two players A and B are included in the video as shown in FIG. 4 and the user wants to acquire additional information about the player B on the right (Niyamato), the user places the cursor on the target B and clicks (S25 in FIG. 2). When the click occurs, the position information on the screen is notified to the distribution video matching unit 130 via the video information distribution unit 160 of the video distribution device 10 (S26 in FIG. 2). The distribution video matching unit 130 then identifies which target was selected from the content information given to the distribution video, and notifies the result to the additional information providing unit 150 (S27 in FIG. 2). For example, when the video shown in FIG. 4 is displayed and a position on the right-hand player is clicked, the distribution video matching unit 130 notifies only Niyamato, based on the content information shown in FIG. 5. The additional information providing unit 150 reads the additional information related to the selected target Niyamato from the attached information table 151, and this additional information is transmitted to the video output unit 220 of the video reception device 20 via the distribution video matching unit 130 and the video information distribution unit 160 (S28 in FIG. 2). As shown in FIG. 10, this additional information is described by descriptors conforming to the above-mentioned MPEG-7. It consists of a descriptor <ID> for identifying each player, which contains a descriptor <IDName> for identifying the player's name, a descriptor <DateOfBirth> representing the date of birth, a descriptor <Carrier> representing the main career, a descriptor <SpecialAbility> representing the feature, and a descriptor <Comment> representing the player's comment.
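Identifying the clicked target from the content information is a point-in-rectangle test against each player's <RegionLocator> rectangle. The sketch below assumes the content information has already been parsed into a dict from player name to rectangle; the coordinates are made-up sample values and the function name is hypothetical.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[int, int, int, int]  # upper-left x, y, lower-right x, y


def find_clicked_player(content_info: Dict[str, Rect],
                        x: int, y: int) -> Optional[str]:
    """Return the player whose rectangle contains the clicked position."""
    for name, (x1, y1, x2, y2) in content_info.items():
        if x1 <= x <= x2 and y1 <= y <= y2:
            return name
    # No subject at this position: nothing to look up in the table.
    return None


if __name__ == "__main__":
    # Made-up rectangles for the two players of the FIG. 4 example.
    content_info = {"Anno": (10, 20, 110, 220),
                    "Niyamato": (200, 30, 300, 230)}
    print(find_clicked_player(content_info, 250, 100))
```

The returned name is what would be passed to the additional information providing unit 150 to look up the attached information table 151. With sensor-only content information, where each rectangle is a single point, a tolerance radius around the point would be needed instead of an exact containment test.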
[0060]
In addition, when the information relevant to the selected object is not recorded, a message notifying that there is no information is transmitted.
Finally, the video output unit 220 reproduces the additional information distributed via the transmission / reception unit 230 on the screen (S29 in FIG. 2).
[0061]
As described above, with the video distribution system 1 according to the first embodiment, the viewer can not only view a video that matches his / her preference from among videos taken from a plurality of viewpoints, but can also acquire information (additional information) related to an object of interest by acting on the distributed video.
[0062]
(Embodiment 2)
Next, a video distribution system according to Embodiment 2 of the present invention will be described with reference to the drawings. In the second embodiment as well, video centered on the players in a sports broadcast such as soccer is used as an example of a subject captured in a limited space, but the invention can be applied to other shooting targets.
[0063]
FIG. 11 is a block diagram showing a functional configuration of the video distribution system 2 according to Embodiment 2 of the present invention. Functional configurations corresponding to those of the video distribution system 1 of the first embodiment are given the same numbers, and detailed descriptions thereof are omitted.
This video distribution system 2 is composed of a video distribution device 40, a video reception device 50, and a communication network 30 connecting them. It is the same as the video distribution system 1 of the first embodiment in that a video matching the user's preference is reproduced from among the multi-view video. The difference is that, whereas in the first embodiment the video distribution apparatus 10 determines and distributes content such as video according to the user's preference, in this video distribution system 2 the video distribution device 40 streams all of the multi-view video content and the like (all content that may be selected), and the video reception device 50 selects and plays back the video that matches the user's preference.
[0064]
The video distribution device 40 of the video distribution system 2 is a distribution server composed of a computer or the like that stream-distributes video content of a plurality of videos (multi-view video) with content information and additional information added to the video receiver 50. The video acquisition unit 110, the video analysis unit 120, the additional information providing unit 410, the video recording unit 420, the video multiplexing unit 430, and the multiplexed video information distribution unit 440 are provided.
[0065]
The additional information providing unit 410 searches the content information generated by the video analysis unit 120 and, based on the attached information table 151, generates additional information on each subject (target object) included in the content information. It also stores the video, with the content information and additional information added, in the content database 421 of the video recording unit 420, and generates the preference value input dialog 146 and stores it in the preference database 145.
[0066]
The video recording unit 420 has an input side connected to the additional information providing unit 410 and an output side connected to the video multiplexing unit 430, and includes a content database 421 and a preference database 145 inside. The content database 421 stores the video content 424 itself, to which content information and additional information have been added. Note that the preference history table 147 is omitted from the preference database 145 here. This is because the video receiving device 50 selects the video according to the user's preference, so the video distribution device 40 does not need to hold the preference history table 147.
[0067]
According to the user's mode designation, the video multiplexing unit 430 selects either the live multi-view video output from the additional information providing unit 410, to which content information and additional information have been added, or the storage video content 424 stored in the content database 421. It multiplexes the video, content information, and additional information for each camera, and further multiplexes the per-camera results to generate one bit stream (see FIG. 13). In addition, the video multiplexing unit 430 stream-distributes the preference value input dialog 146 to the video receiving device 50.
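The two-stage multiplexing described above can be illustrated with a minimal sketch. The actual bit stream layout is the one shown in FIG. 13 and is not reproduced here; the length-prefixed record format, the field widths, and the names `multiplex` and `demultiplex` below are all assumptions made purely for illustration.

```python
import struct

# Hypothetical record header: camera id (u16), payload kind (u8), payload length (u32).
_HEADER = struct.Struct(">HBI")
KINDS = {"video": 0, "content_info": 1, "additional_info": 2}
KIND_NAMES = {code: name for name, code in KINDS.items()}


def multiplex(cameras):
    """Pack each camera's video, content info and additional info into one byte stream."""
    out = bytearray()
    for camera_id, parts in sorted(cameras.items()):
        for kind in ("video", "content_info", "additional_info"):
            payload = parts[kind]
            out += _HEADER.pack(camera_id, KINDS[kind], len(payload))
            out += payload
    return bytes(out)


def demultiplex(stream):
    """Split the byte stream back into a per-camera dictionary (receiver side)."""
    cameras = {}
    offset = 0
    while offset < len(stream):
        camera_id, kind, length = _HEADER.unpack_from(stream, offset)
        offset += _HEADER.size
        cameras.setdefault(camera_id, {})[KIND_NAMES[kind]] = stream[offset:offset + length]
        offset += length
    return cameras
```

In this sketch the receiver recovers exactly the per-camera triples that were packed, which corresponds to the separation performed by the display video matching unit 510.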
[0068]
The multiplexed video information distribution unit 440 is a bidirectional communication interface or driver software for communicating with the video reception device 50 via the communication network 30.
[0069]
The video receiving device 50 interacts with the user regarding live/storage mode selection, preference value input, and the like; separates the video, content information, and additional information stream-distributed from the video distribution device 40; and, in real time, builds edited video content by switching to and selecting, every several frames, the one video that matches the user's preference and preference history from among the plurality of videos (multi-view video), and presents it to the user. It is a personal computer, a mobile phone, a portable information terminal, a digital broadcasting TV, or the like, and includes an operation unit 210, a video output unit 220, a transmission / reception unit 230, a display video matching unit 510, and a video recording unit 520.
[0070]
The display video matching unit 510 separates the video, content information, and additional information stream-distributed from the video distribution device 40 for each camera (see FIG. 13) and stores them in the video recording unit 520. It also stores the preference value input dialog 146 distributed from the video distribution device 40 in the video recording unit 520. Further, it compares the user's preference sent from the operation unit 210 with the content information of each video sent from the video distribution device 40, and builds edited video content in real time by switching to and selecting, every several frames, the one video that matches the user's preference and preference history from among the plurality of videos (multi-view video).
[0071]
The video recording unit 520 is a hard disk or the like that holds a content database 521, which holds live or storage content delivered from the video distribution device 40, and a preference database 525 for obtaining the preferences of each user. The content database 521 stores a content list 523 of the stored content and the content 524 itself. The preference database 525 stores the preference value input dialog 146 for each content sent from the video distribution device 40 and a preference history table 147 that stores the history of preferences input by the user.
[0072]
The operation of the video distribution system 2 of the present embodiment configured as described above will be described in order along the sequence shown in FIG. 12 (main processing flow of the system). Note that the sequence in this figure also shows the flow for the multi-view video at one particular point in time, and detailed description of the processing that corresponds to the sequence of Embodiment 1 is omitted.
[0073]
When the video acquisition unit 110 finishes acquiring a plurality of videos (multi-view video) (S11), the video analysis unit 120 analyzes the multi-view video and generates content information for each video, and the additional information providing unit 410 searches the content information and generates additional information on each subject (object) included in the content information (S32). For example, when two persons A and B are shown in the video, additional information on the two persons A and B is generated. When the generation of the additional information is completed, the additional information providing unit 410 stores the video, with the content information and additional information added, in the content database 421 of the video recording unit 420 (S33).
As in the first embodiment, transmission of the mode selection dialog (S14), mode designation in the video receiving device 50 (S15), transmission of the mode designation information (S16), transmission of the content list information (S17), and transmission of the content designation (S18) are then performed in sequence.
[0074]
When the content is specified, the video multiplexing unit 430 multiplexes and transmits the multi-view video (a plurality of videos) of the specified live or storage content, the content information for each video, and the additional information for each video (S39), and also transmits the preference value input dialog 146 for that content.
[0075]
The display video matching unit 510 separates the multi-view video transmitted from the video distribution device 40, the content information for each video, and the additional information for each video for each camera and stores them in the content database 521 (S40). It also stores the preference value input dialog 146 in the preference database 525.
Next, the display video matching unit 510 reads the preference value input dialog 146 from the preference database 525 and sends it to the video output unit 220 for display (S41), stores the preference information input by the user in the preference history table 147 (S42), compares the preference information with the content information, and determines the one viewpoint video that matches the user's preference from among the multi-view videos (S43). This video determination method is the same as in the first embodiment. The display video matching unit 510 then sends the determined video to the video output unit 220 and reproduces it on the screen (S44).
[0076]
As described above, according to the video distribution system 2 according to the second embodiment, the video distribution device 40 transmits a plurality of videos (multi-view video) to the video receiving device 50, and the video receiving device 50 selects and determines, every several frames, the one video from the multi-view video that matches the user's preference and reproduces it.
[0077]
Subsequently, the user can acquire additional information by acting on the distributed video (steps S45 to S47 in FIG. 12).
For example, while a video that matches the user's preference is being played and a target for which additional information is desired is displayed in the distributed video, the user moves the pointing-device cursor of the operation unit 210 onto that target on the screen and clicks. The clicked position on the screen is then notified to the display video matching unit 510 (S45). The display video matching unit 510 identifies which target was selected from the content information attached to the video (S46), extracts only the additional information for the identified target from the corresponding additional information, and sends it to the video output unit 220. For example, when the targets A and B shown in FIG. 4 are displayed and a position on the right-hand target B is clicked, the display video matching unit 510 first identifies the target as Niyamato based on the content information. It then reads only the additional information related to Niyamato from the additional information about the two people and sends it to the video output unit 220. As a result, only the additional information for the desired target is displayed on the video output unit 220 (S47).
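Steps S45 to S47 amount to a point-in-rectangle lookup followed by a filtered read of the additional information. The sketch below is one possible way to express that, assuming the content information provides one bounding rectangle per target. The `Region` type, the function names, the sample coordinates, and the rule of preferring the smallest rectangle when rectangles overlap are illustrative assumptions, not details given in this description.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical stand-in for one target's rectangle in the content information."""
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        return self.x <= px < self.x + self.width and self.y <= py < self.y + self.height


def identify_target(regions, px, py):
    """S46: return the name of the target whose rectangle contains the clicked point."""
    hits = [r for r in regions if r.contains(px, py)]
    if not hits:
        return None
    # Assumption: when rectangles overlap, prefer the smallest one.
    return min(hits, key=lambda r: r.width * r.height).name


def additional_info_for_click(regions, additional_info, px, py):
    """S47: return only the additional information of the clicked target, if any."""
    name = identify_target(regions, px, py)
    return None if name is None else additional_info.get(name)


regions = [Region("Player A", 50, 80, 100, 200), Region("Niyamato", 300, 60, 120, 220)]
info = {"Player A": "profile of player A", "Niyamato": "profile of Niyamato"}
clicked = additional_info_for_click(regions, info, 350, 100)
```

Here a click at (350, 100) falls inside the right-hand rectangle, so only the entry for Niyamato is returned; a click outside every rectangle returns nothing.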
[0078]
As described above, according to the multi-view video distribution system 2 according to the second embodiment, the viewer can not only view a video that matches his or her preference from among videos shot from a plurality of viewpoints, but can also acquire information (additional information) related to an object of interest by acting on the distributed video.
[0079]
Incidentally, the content database 521 of the video recording unit 520 stores content 524 containing all of the multi-view video transmitted from the video distribution device 40, the content information for each video, and the additional information for each video. Therefore, this content can be reproduced repeatedly in the video receiving device 50 without being redistributed from the video distribution device 40.
[0080]
Also, during repeated playback, the display video matching unit 510 can read the preference value input dialog 146 from the preference database 525 of the video recording unit 520 and, based on preference information that the user inputs and that differs from the previous input, play the video that matches this preference from among the videos shot from a plurality of viewpoints. In this case, the user can view a differently edited video centered on a different target (player) from the previous time.
[0081]
The video distribution system according to the present invention has been described above based on the embodiments. However, the present invention is not limited to these embodiments, and the modifications described below are also possible.
[0082]
In the above-described embodiments, the preference value input dialog 146 is displayed and the viewer's preference information is acquired each time video content is distributed. However, one video may instead be selected from the plurality of videos using preference information acquired in the past. For example, if the viewer's preference information acquired in the past is stored in the video distribution device 40 and referred to, the trouble of acquiring preference information from the viewer every time video content is distributed can be saved.
[0083]
In the first embodiment, the additional information providing unit 150 transmits the additional information from the video distribution device 10 to the video receiving device 20 only when a position is specified in the video receiving device 20. However, without waiting for the designation, the additional information about the video determined to be distributed may be distributed in advance together with the video content. This shortens the time from when the viewer issues an instruction until the additional information is acquired, so that a video distribution system with quick response can be realized.
[0084]
Conversely, in the second embodiment, the additional information providing unit 410 attaches and distributes the additional information for each of the multi-view videos. However, the additional information may instead be distributed only when a position is specified in the video receiving device 50. This reduces the load on the communication network 30 caused by distributing additional information for video content that may not be selected in the end.
[0085]
In the first and second embodiments, a live broadcast of soccer has been described as an example. However, the invention can of course also be applied to live broadcasts of other outdoor sports such as baseball, and to live broadcasts of concerts, plays, and the like performed indoors.
[0086]
Furthermore, in the first and second embodiments, only the size and position of each object in the video, in addition to the preference, are evaluated when selecting the video. However, the movement of the object may also be added to the evaluation.
[0087]
That is, in the case of a live broadcast held indoors, installing a motion capture system in the facility also makes it possible to detect vigorous movement, such as a target (a singer, for example) running around the stage. In a live stage performance, for example, there is a kind of staging in which the leading role (the person to be noticed) switches in real time among a plurality of performers. In such a case, it is natural, and in keeping with the viewer's taste, for the viewer to want to watch the person who is moving about the stage at that moment (the active person) rather than a person who is standing still. Therefore, the video analysis unit 120 may analyze the amount of movement of each target displayed in the video, obtained using the motion capture system, and include the amount of movement in the content information, and a video showing a target with a large amount of movement may be selected on the grounds that the degree of interest in it is high.
[0088]
(Embodiment 3)
FIG. 14 is a diagram illustrating a state of a live concert stage of a certain group “SPADE”.
As shown in the figure, a plurality of cameras C1 to C4 (four in the figure) are fixedly arranged around the stage, and a plurality of markers M are attached to the limbs of each member of "SPADE" (from the left in FIG. 14: Furugaki, Shimohara, Maei).
[0089]
Each of the cameras C1 to C4 acquires R, G, and B color images and also includes a light emitting unit that emits infrared light and a light receiving unit that receives the infrared light reflected by the markers M, so that the light receiving unit captures, for each frame, an image of the light reflected by the markers (a marker image). The marker image for each frame is sent to, for example, the video analysis unit 120 shown in FIG. 1, and the amount of motion of the target is analyzed.
[0090]
FIG. 15 is a diagram illustrating a state in which a motion amount is analyzed from two marker images (P1, P2). Here, a case is shown in which the amount of motion is analyzed from two marker images in which only the member Shimohara shown in FIG. 14 is shown.
[0091]
The video analysis unit 120 compares the corresponding markers M in the two marker images P1 and P2 and measures the motion amounts Δv1, Δv2, Δv3, Δv4, ..., Δv(n-1), Δvn of the respective parts such as the shoulder, elbow, and wrist. When the measurement of each part is completed, the video analysis unit 120 calculates the sum of these measured values, takes the result as the amount of movement of the target (the singer) displayed in the video at that time, and includes the acquired amount of movement in the content information. The amount of movement may also be calculated in order, starting from the waist and shoulders as a reference and proceeding to the arms, wrists, and so on. Further, marker images obtained from a plurality of viewpoints may be combined to measure three-dimensional motion vectors. In this case, even when markers overlap in one marker image, each marker can be distinguished, so that erroneous calculation of the motion amount can be avoided and a highly accurate motion amount can be obtained.
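The summation of per-marker motion amounts can be sketched as follows. The text does not say how each Δv is measured, so this sketch assumes that each marker has a 2D position in both marker images and that Δv is the Euclidean distance between the two positions. The function name, the dictionary layout, and the sample coordinates are hypothetical.

```python
import math


def motion_amount(markers_p1, markers_p2):
    """Sum the displacement of every marker that appears in both marker images.

    markers_p1 and markers_p2 map a marker id (e.g. "shoulder") to an (x, y) position.
    Assumption: each per-part motion amount is the Euclidean distance moved.
    """
    total = 0.0
    for marker_id, (x1, y1) in markers_p1.items():
        if marker_id in markers_p2:
            x2, y2 = markers_p2[marker_id]
            total += math.hypot(x2 - x1, y2 - y1)
    return total


p1 = {"shoulder": (0.0, 0.0), "elbow": (1.0, 1.0), "wrist": (2.0, 2.0)}
p2 = {"shoulder": (3.0, 4.0), "elbow": (1.0, 1.0), "wrist": (2.0, 5.0)}
amount = motion_amount(p1, p2)  # shoulder moved 5, elbow 0, wrist 3
```

A marker that is missing from one of the two images is simply skipped here; the three-dimensional variant mentioned in the text would replace the 2D positions with 3D ones reconstructed from several cameras.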
[0092]
FIG. 16 is a diagram illustrating an example of content information generated by the video analysis unit 120.
In this example, the <RegionLocator> descriptor describes both a position <Position> that has extent, namely the region in which the singer is displayed in the image, and a position <Location> that is a point without extent, acquired by a measuring instrument (position sensor, GPS) or the like. Each object can therefore be evaluated both on the size of the target on the screen and on a position such as its center. Furthermore, in this content information, the <motion> descriptor allows the motion amount to be evaluated for each object.
[0093]
When the content information is configured in this way to include the amount of movement in addition to the size and position of each object, and the distribution video is determined by comprehensive judgment, evaluating each object on the basis of the preference level for each singer and the size, position, movement, and the like of the target on the screen, the procedure of the flowchart shown in FIG. 17 is followed, for example.
[0094]
First, the distribution video matching unit 130 refers to the rectangle information in the <RegionLocator> descriptor of the content information for the videos from all cameras, and calculates the area of the rectangle enclosing each object, that is, each singer (S2221). When the calculation of the rectangle areas is finished, the distribution video matching unit 130 calculates the value of a function related to the position of each singer (S2222), using a function that takes its maximum value at the center of the screen and its minimum value at the edges of the screen (for example, f(x, y) = sin(π * x / (2 * x_mid)) * sin(π * y / (2 * y_mid)), where x_mid and y_mid are taken to be the coordinates of the screen center). The distribution video matching unit 130 then refers to the <motion> descriptor of the content information for the videos from all cameras and reads the motion amount (S2223).
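As a quick check of the example position function, the following sketch evaluates it at the center and at the edges of a screen. It assumes that x runs from 0 to 2 * x_mid and y from 0 to 2 * y_mid; the function name `position_weight` and the 640 x 480 screen size are illustrative choices only.

```python
import math


def position_weight(x, y, x_mid, y_mid):
    """Example position function: 1 at the screen center, falling to 0 at the edges."""
    return math.sin(math.pi * x / (2 * x_mid)) * math.sin(math.pi * y / (2 * y_mid))


# Hypothetical 640 x 480 screen, so the center is at (320, 240).
center = position_weight(320, 240, 320, 240)
left_edge = position_weight(0, 240, 320, 240)
halfway = position_weight(160, 240, 320, 240)
```

With these inputs the center evaluates to 1, the left edge to 0, and a point halfway between them to about 0.71, which matches the stated behavior of a maximum at the center and a minimum at the edges.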
[0095]
When the calculation of the areas, the calculation of the function values, and the reading of the motion amounts are finished, the distribution video matching unit 130 obtains the value of the objective function for the video from each camera (S2224). To do so, it calculates the product of each area and the corresponding singer's preference level and sums these values over the singers displayed on the screen, calculates the product of each position value and the corresponding singer's preference level and sums these values over the singers displayed on the screen, and calculates the sum of the motion amounts of the singers displayed on the screen.
Then, when the value of the objective function is obtained for the videos from all cameras, the video from the viewpoint that maximizes the value of the objective function is determined as the distribution video (S2225).
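Steps S2224 and S2225 can be sketched as below. The description lists the three sums but does not state how they are combined into a single objective value, so this sketch simply adds them without weights; that choice, the dictionary layout, the treatment of an unlisted singer as preference 0, and the sample numbers are all assumptions for illustration.

```python
def objective(objects, preferences):
    """S2224: objective value of one camera's video.

    objects: list of dicts with "name", "area", "position_value" and "motion".
    preferences: maps a singer's name to the viewer's preference level.
    """
    area_term = sum(o["area"] * preferences.get(o["name"], 0.0) for o in objects)
    position_term = sum(o["position_value"] * preferences.get(o["name"], 0.0) for o in objects)
    motion_term = sum(o["motion"] for o in objects)
    # Assumption: the three sums are combined by plain addition.
    return area_term + position_term + motion_term


def select_video(videos, preferences):
    """S2225: choose the camera whose video maximizes the objective function."""
    return max(videos, key=lambda camera: objective(videos[camera], preferences))


videos = {
    "C1": [{"name": "Shimohara", "area": 0.2, "position_value": 0.5, "motion": 0.1}],
    "C2": [
        {"name": "Shimohara", "area": 0.4, "position_value": 0.9, "motion": 0.3},
        {"name": "Furugaki", "area": 0.3, "position_value": 0.4, "motion": 0.2},
    ],
}
preferences = {"Shimohara": 1.0, "Furugaki": 0.0}
chosen = select_video(videos, preferences)
```

In this example camera C2 is chosen because the preferred singer appears larger, closer to the center, and with more movement. In practice the three terms would probably need to be scaled to comparable ranges, which the text does not specify.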
[0096]
In this way, when the amount of movement is included in the evaluation value, a video of a singer who is moving a lot and is therefore active, rather than standing still, is evaluated highly, and the highly evaluated video is selected every few frames. As a result, the video distribution apparatus 10 distributes video that matches each user's preference from among the multi-view video.
[0097]
[Effects of the invention]
As is apparent from the above description, the video distribution apparatus according to the present invention is a video distribution apparatus that communicates with a video receiving apparatus via a communication network, and comprises: video acquisition means for acquiring a plurality of videos from different viewpoints; video analysis means for analyzing, for each video, the content included in that video and generating the analysis result as content information; and distribution video matching means for determining the degree of matching between each piece of content information and the preference information notified from the viewer, determining the video to be distributed, and distributing the determined video.
In other words, from among a plurality of videos from different viewpoints, one video that matches the viewer's preference is determined by the degree of matching between the content information generated for each video and the viewer's preference information, and is delivered to the viewer's video receiving apparatus.
[0098]
Thereby, the viewer can selectively view the video that matches his / her preference. Therefore, it is possible to satisfy the viewer's request regarding the selection of video. In addition, real-time video can be targeted for distribution by repeatedly performing the processing by the video acquisition unit, the video analysis unit, and the distribution video matching unit at high speed.
[0099]
Here, the content information may include information for identifying the subject and information indicating the display position or display area of the subject. Also, a container for obtaining the preference information may be distributed to the video receiving device side, and the preference information may be acquired by having the viewer enter the degree of preference for each subject into this container. Further, when a position on the screen is designated by the viewer for the distributed video, the subject at that position may be identified, and additional information regarding this subject may be transmitted.
[0100]
Furthermore, the present invention can also be configured as a video distribution device that communicates with a video receiving device via a communication network, comprising: video acquisition means for acquiring a plurality of videos from different viewpoints; video analysis means for analyzing, for each of the videos, the content included in the video and generating the analysis result as content information; and video multiplexing means for multiplexing and distributing each video and each piece of content information. In this case, the video receiving device side may determine the degree of matching between each piece of content information distributed from the video distribution device and the preference information notified from the viewer, determine one video to be played back from among the plurality of videos distributed from the video distribution device, and play back the determined video.
[0101]
Thus, in a video receiving device that receives each video and each content information distributed from such a video distribution device, the degree of matching between each content information and the preference information notified from the viewer is determined, and the video to be played back is determined. If it is determined and the determined video is reproduced, the viewer can selectively view the video that matches his / her preference.
[0102]
In addition, the present invention can be realized as a program for causing a computer to function as such characteristic means, or as a recording medium on which the program is recorded. The program according to the present invention can be distributed via a communication network such as the Internet, a recording medium, or the like.
[0103]
Thus, according to the present invention, in a sports program for example, viewers can selectively watch videos in which the players they like appear frequently and can enjoy themselves. The present invention therefore dramatically improves the value of the service provided by a video distribution system, and its practical value is extremely high.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a functional configuration of a video distribution system 1 according to Embodiment 1 of the present invention.
FIG. 2 is a sequence diagram showing an operation of the video distribution system 1;
FIG. 3A is a perspective view showing the relationship between the position in the camera coordinate system used in the first embodiment of the present invention and the position on the projection plane, FIG. 3B is a view of FIG. 3A seen from above along the projection plane, and FIG. 3C is a view of FIG. 3A seen from the side along the projection plane.
FIG. 4 is a diagram illustrating an example of an image acquired by the video acquisition unit 110 illustrated in FIG. 1.
FIG. 5 is a diagram showing an example of content information generated by the video analysis unit 120.
FIG. 6 is a diagram illustrating an example of a preference value input dialog generated by the distribution video matching unit 130 illustrated in FIG. 1.
FIG. 7 is a diagram showing an example of preference information sent from the video receiving device 20 shown in FIG. 1.
FIG. 8 is a flowchart executed when the distribution video matching unit 130 determines an image to be distributed using an object having the highest degree of preference.
FIG. 9 is a flowchart executed when the distribution video matching unit 130 determines a video to be distributed by comprehensively judging from individual preference levels.
FIG. 10 is a diagram showing an example of additional information sent from the additional information providing unit 150.
FIG. 11 is a block diagram showing a functional configuration of a video distribution system 2 according to Embodiment 2 of the present invention.
FIG. 12 is a sequence diagram showing an operation of the video distribution system 2.
FIG. 13 is a diagram illustrating an example of a method for multiplexing / separating video, content information, and additional information.
FIG. 14 is a view showing a state of a live concert stage of a certain group “SPADE”.
FIG. 15 is a diagram illustrating a state in which a motion amount is analyzed from two marker images (P1, P2).
FIG. 16 is a diagram illustrating an example of content information generated by the video analysis unit 120.
FIG. 17 is a flowchart executed when the distribution video matching unit 130 determines a video to be distributed by comprehensively judging from individual preference levels and the like.
[Explanation of symbols]
1, 2 Video distribution system
10, 40 Video distribution device
20, 50 Video receiving device
30 Communication network
110 Video acquisition unit
120 Video analysis unit
130 Distribution video matching unit
140, 420, 520 Video recording unit
141, 421, 521 Content database
144, 424, 524 Content
145, 525 Preference database
146 Preference value input dialog
147 Preference history table
150, 410 Additional information providing unit
210 Operation unit
220 Video output unit
230 Transmission / reception unit
430 Video multiplexing unit
440 Multiplexed video information distribution unit
510 Display video matching unit

Claims

A video distribution device for distributing video via a communication network, comprising:
video acquisition means for acquiring a plurality of videos from different viewpoints;
video analysis means for analyzing, for each of the videos, the subject included in the video and generating, for each of the videos, content information including rectangle information, which is information about a rectangle enclosing the subject;
storage means for storing preference information including a preference level, which is a value indicating the degree of the viewer's preference for the subject; and
distribution video matching means for obtaining the area of the rectangle from the rectangle information corresponding to the subject, taking the value obtained by multiplying the area of the rectangle by the preference level as the degree of matching between the subject and the viewer's preference, using the degree of matching of the subject to obtain the degree of matching between the video including the subject and the viewer's preference, and further using the degree of matching of the video to select and determine, from the plurality of videos, a video having a high degree of matching with the viewer's preference, and distributing the determined video.

The video distribution device according to claim 1, further comprising additional information storage means for storing in advance additional information corresponding to each of the plurality of videos,
wherein the distribution video matching means reads, from the additional information storage means, the additional information corresponding to the video selected and determined from the plurality of videos, and distributes the additional information together with the video.

A video distribution method for distributing video via a communication network, comprising:
a video acquisition step of acquiring a plurality of videos from different viewpoints;
a video analysis step of analyzing, for each of the videos, the subject included in the video and generating, for each of the videos, content information including rectangle information, which is information about a rectangle enclosing the subject;
a storage step of storing preference information including a preference level, which is a value indicating the degree of the viewer's preference for the subject; and
a distribution video matching step of obtaining the area of the rectangle from the rectangle information corresponding to the subject, taking the value obtained by multiplying the area of the rectangle by the preference level as the degree of matching between the subject and the viewer's preference, using the degree of matching of the subject to obtain the degree of matching between the video including the subject and the viewer's preference, and further using the degree of matching of the video to select and determine, from the plurality of videos, a video having a high degree of matching with the viewer's preference, and distributing the determined video.

A program used in a video distribution device that distributes video via a communication network,
the program causing a computer to execute the steps according to claim 3.

A video distribution system for distributing video via a communication network,
the system comprising a video distribution device and a video receiving device, wherein
the video distribution device includes:
video acquisition means for acquiring a plurality of videos from different viewpoints;
video analysis means for analyzing, for each of the videos, the subject included in the video and generating, for each of the videos, content information including rectangle information, which is information about a rectangle enclosing the subject;
storage means for storing preference information including a preference level, which is a value indicating the degree of the viewer's preference for the subject; and
distribution video matching means for obtaining the area of the rectangle from the rectangle information corresponding to the subject, taking the value obtained by multiplying the area of the rectangle by the preference level as the degree of matching between the subject and the viewer's preference, using the degree of matching of the subject to obtain the degree of matching between the video including the subject and the viewer's preference, and further using the degree of matching of the video to select and determine, from the plurality of videos, a video having a high degree of matching with the viewer's preference, and distributing the determined video, and
the video receiving device includes:
transmitting means for transmitting the preference information to the video distribution device;
receiving means for receiving the video having a high degree of matching distributed from the video distribution device; and
display means for displaying the received video.