JP2004320092A

JP2004320092A - Digital contents summary reproducing method and system

Info

Publication number: JP2004320092A
Application number: JP2003107195A
Authority: JP
Inventors: Koichi Terada; 光一寺田; Yukio Fujii; 藤井　　由紀夫
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2003-04-11
Filing date: 2003-04-11
Publication date: 2004-11-11
Anticipated expiration: 2023-04-11
Also published as: JP4356343B2

Abstract

PROBLEM TO BE SOLVED: To provide a convenient and highly efficient method of reproducing a summary of digital contents.

SOLUTION: All frames in a stream are rearranged in descending order of importance, and the resulting sequence is used as a digest index. An arbitrary number of frames is extracted from the head of the index and reproduced in temporal order, so that a summary of arbitrary length can be displayed. When the contents are stored at a location remote from the reproduction location, data are transmitted sequentially starting from the frame at the head of the index, so that the group of important frames can be viewed quickly.

COPYRIGHT: (C)2005, JPO & NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、多量のコンテンツを視聴する際に、コンテンツ本体とは別に、コンテンツ内容に関するダイジェスト情報を供給する技術に関するものであり、特にデジタルビデオコンテンツを配信するためのビデオサーバ装置、該配信を受けるビデオ再生端末装置、及びこれらを用いたビデオ配信システムに関する。
【０００２】
【従来の技術】
近年のデジタル映像処理技術の発達に伴い、多量のデジタル映像コンテンツが流通し始めている。これらのコンテンツを効率的に流通させるため、また、視聴者の限られた時間を有効に利用させるため、コンテンツのダイジェスト情報を提供するための技術の発達が著しい。
【０００３】
コンテンツのダイジェスト情報は、例えば対象コンテンツが映画であれば、その映画に関する各種の属性、例えば製作年や製作者に関わる情報などのほか、映画配給直前に使用された予告ＣＭ映像などが利用されることがある。対象コンテンツがスポーツ中継映像のようなものであれば、コンテンツ配信を開始する前に人間がコンテンツを視聴し、重要と思われるシーンにタグを打つなどして部分的に抜き出し、抜き出した映像をダイジェスト情報として利用することもある。
【０００４】
このような技術としては、例えば、特開平１１−１９６３８５号公報に示されているようなものがある。
【０００５】
しかし、以上述べたような従来技術には、以下に述べるような課題があった。
【０００６】
映像を含むコンテンツを販売する場合、映像情報を含むダイジェスト情報の有無は、宣伝効果に大きな差を生むであろうことが予想される。このような映像ダイジェスト情報は、近年配給された映画のようなコンテンツであれば予告ＣＭ映像などを流用することが可能であるものの、一般的にはコンテンツ本体を元にして生成するほかないのが現状である。
【０００７】
映像コンテンツ本体を元にして映像ダイジェスト情報を生成する場合、コンテンツのうちのどの部分がダイジェストに適当であるかを判断し、映像を部分的に抜き出す必要が生じる。この抜き出すための作業は、従来から人間がコンテンツを視聴してタグ打ちするなどの方法により行ってきた。しかし近年の研究により、対象コンテンツが限られてはいるものの、計算機による自動作業が行えるようになってきている。このような研究としては、例えば、非特許文献１に示されるようなものがある。
【０００８】
しかしいずれの方法においても、予めダイジェスト映像情報を生成しておく方式であることから、生成されるダイジェスト映像情報の長さは、生成時に一意に決定されてしまう。つまり、例えば２時間分のコンテンツのダイジェストを生成し、これが１分の映像となった場合、ダイジェスト映像を見たい視聴者にとっては、必ず１分間の視聴時間が必要となる。仮にこの視聴者がもっと短いダイジェストを欲していたとしても、１分間のダイジェスト映像をもっと短縮して視聴することができないことになる。同様に、もう少し詳しいダイジェスト映像がほしいと思ったとしても、そのような視聴方法をとることはできない。
【０００９】
このような視聴者向けに、複数のダイジェスト映像、例えば１分間版のほかに、１５秒版や３分間版を用意したりすることは可能である。しかし、このような点に関しては視聴者の要求は細分化する傾向にあり、早晩３０秒版や２分間版を用意せざるを得なくなる可能性が高い。また、このような要求にこたえるためには、前記のような人間によるタグ打ちによる方法では、タグ打ちの際に複数の重み付けを持ったタグを打っていく必要が生じるため、タグ打ちのコストが大幅に上昇することが予想される。
【００１０】
【特許文献１】
特開平１１−１９６３８５号公報
【非特許文献１】
益満健他「映像重要度を用いたパーソナライズ要約映像作成手法」、電子情報通信学会論文誌Ｄ−ＩＩＶｏｌ．Ｊ８４−Ｄ−ＩＩＮｏ．８、ｐｐ．１８４８−１８５５、２００１
【非特許文献２】
ＤａｎｉｅｌＤｅＭｅｎｔｈｏｎ，ＶｉｋｒａｎｔＫｏｂｌａ，ＤａｖｉｄＤｏｅｒｍａｎｎ，“ＶｉｄｅｏＳｕｍｍａｒｉｚａｔｉｏｎｂｙＣｕｒｖｅＳｉｍｐｌｉｆｉｃａｔｉｏｎ”，ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅｓｉｘｔｈＡＣＭｉｎｔｅｒｎａｔｉｏｎａｌｃｏｎｆｅｒｅｎｃｅｏｎＭｕｌｔｉｍｅｄｉａ１９９８，Ｂｒｉｓｔｏｌ，ＵｎｉｔｅｄＫｉｎｇｄｏｍ，ｐｐ．２１１−２１８，１９９８
【００１１】
【発明が解決しようとする課題】
本発明の目的は、利便性が高く効率の良いデジタルコンテンツ要約再生システムを得ることにある。
【００１２】
【課題を解決するための手段】
上記課題は、
デジタルコンテンツを要約して再生時間を短縮する方法において、
該要約する方法は、デジタルコンテンツを時間軸で分割し、分割した要素それぞれについてその重要度を評価し、評価値の高い要素のみを再生する方法であって、
該再生時間の短縮方法は、指定された再生時間となるように評価値の高い要素から順に選択する方法であって、
再生時間の短縮率はデジタルコンテンツ再生時に指定される、
とすることによって解決される。
【００１３】
【発明の実施の形態】
以下、本発明の実施例について、図面を用いて説明する。
〔第一実施例の説明〕
図１に本発明の第一の実施例を示す。図１において、１はビデオサーバシステム、２はビデオ再生端末システムである。また、１０１はビデオエンコード手段、１０２はビデオストリーム保持手段、１０３はダイジェスト生成手段、１０４はダイジェストインデックス情報送出手段、１０５はビデオストリーム送出手段である。また、２０１はダイジェストインデックス情報要求手段、２０２は部分ビデオストリーム要求手段、２０３は部分ビデオストリーム一時保持手段、２０４はビデオストリーム一時保持手段、２０５はダイジェストインデックス選択整形手段、２０６はビデオデコード手段、２０７はビデオ表示手段である。
【００１４】
次に図１を用いて、本実施例の動作について説明する。
【００１５】
ビデオサーバシステム１に対してコンテンツ画像が入力されると、ビデオエンコード手段１０１がこれを例えばＭＰＥＧ方式などによりエンコードし、エンコード結果をビデオストリーム保持手段１０２に保存する。もちろん、既にエンコードされたビデオストリームを直接にビデオストリーム保持手段１０２に入力しても良いが、本明細書では説明の簡単のため省略する。
【００１６】
ユーザはビデオ再生端末システム２を操作し、コンテンツの選択を行い、選択の結果をビデオサーバシステムに伝達する。ここで選択や伝達に用いる手段は、例えばコンテンツタイトル名をリストから選択しても良いし、画像の一部を示した一覧から選択しても良い。このような選択手段等については、従来から利用されている技術を用いればよいため、本明細書では説明を省略する。
【００１７】
選択されたコンテンツの情報がビデオサーバシステムに伝えられると、ビデオサーバシステムは必要なビデオストリームを取り出し、ダイジェスト生成手段１０３によって該ビデオストリームに対応するダイジェストインデックスを生成する。
【００１８】
ここでダイジェストインデックスは、対応するビデオストリームの全フレームに整数の番号を振り、各フレームを重要と思われる順に並べ替え、並べ替えた結果のフレーム番号を列挙したものである。図２にダイジェストインデックスの例について示す。この例では、１０３４番目のフレームが全コンテンツ中で最も重要と思われるフレームであり、以下順に、２６番目、５４４番目、３９番目、１６６７番目のフレームが重要であると判断されたことを示す。仮に、対応するビデオストリームが１０万フレームの長さを持つとすると、これを例えば１０００倍速で視聴するためには、ダイジェストインデックスの先頭１００個のフレーム番号を取得し、これを昇順にソートし、対応するフレームを順に表示することで、所望の速度でのダイジェスト視聴が可能となる。
【００１９】
ダイジェストインデックス生成のための演算方式自体は、本発明では特に触れないが、例えば、前記非特許文献２に示されているような方法を利用することを想定している。もちろん、ダイジェストインデックス生成のための演算方式については、上記文献によるものに限ることなく、任意の方式を利用しても良い。
【００２０】
さて、本実施例では、ビデオサーバシステムによって生成されたダイジェストインデックス情報が、ダイジェストインデックス送出手段１０４によってビデオ再生端末システムに伝達される。ビデオ再生端末システムがダイジェストインデックス情報を取得すると、ダイジェストインデックス選択整形手段２０５は、別途ユーザによって指定された再生速度に基づいて、ダイジェストインデックスの先頭部分を必要個数だけ取り出す。これを昇順にソートし、ソート結果をビデオサーバシステムへ伝達する。なお、このソート処理自体は、ビデオ再生端末システム側で行う必要はなく、構成によってはビデオサーバシステム側で行っても良い。また、ダイジェストインデックスの先頭部分を必要個数だけ取り出す処理自体についても、これをビデオ再生端末システム側で行う必要はなく、必要な再生速度情報をビデオ再生端末システムからビデオサーバシステムに伝達し、ビデオサーバシステム側でダイジェストインデックスの切り出しを行っても構わない。
【００２１】
次に、ビデオサーバシステムが再生すべきダイジェストインデックスを取得すると、ストリーム送出手段１０５は、該インデックス情報に基づいて対応するビデオストリームを送出し、ビデオ再生端末システムはこれを受け取ってダイジェスト再生を行う。
【００２２】
ダイジェスト再生前、再生中もしくは再生終了後のしかるべきタイミングで、ビデオサーバシステムは主ストリームを送出し、ビデオ再生端末システムはこれを受け取って主ストリーム保持手段２０４に一時保存し、ユーザにコンテンツを視聴させるための処理を行う。
【００２３】
なお、主コンテンツの視聴が有料であるならば、ビデオ再生端末システムがビデオストリーム本体を取得した段階で課金することが想定される。また、もしダイジェスト再生のみを無料とするならば、再生速度に下限を設定したり、ダイジェストインデックス生成対象をビデオストリームの先頭など一部分に限るといった制限が必要となる。
〔第二実施例の説明〕
次に、本発明の別の実施例について説明する。
【００２４】
図３に本発明の第二の実施例を示す。図３において、１はビデオサーバシステム、２はビデオ再生端末システムである。また、１０１はビデオエンコード手段、１０２はビデオストリーム保持手段、１０３はダイジェスト生成手段、１０４はダイジェストインデックス情報送出手段、１０５はビデオストリーム送出手段、１０６は部分ストリーム読み出し手段、１０７はビデオデコード手段、１０８はビデオエンコード手段、１０９は部分ストリーム送出手段である。また、２０１はダイジェストインデックス情報要求手段、２０２は部分ビデオストリーム要求手段、２０３は部分ビデオストリーム一時保持手段、２０４はビデオストリーム一時保持手段、２０５はダイジェストインデックス選択整形手段、２０６はビデオデコード手段、２０７はビデオ表示手段である。
【００２５】
なお、ビデオエンコード手段１０８及びビデオデコード手段２０６は、それぞれ複数の方式によるエンコード手段、及びデコード手段を保持していても良い。
【００２６】
次に図３を用いて、本実施例の動作について説明する。
【００２７】
ビデオサーバシステム１に対してコンテンツ画像が入力されると、ビデオエンコード手段１０１がこれを例えばＭＰＥＧ方式などによりエンコードし、エンコード結果をビデオストリーム保持手段１０２に保存する。
【００２８】
ユーザはビデオ再生端末システム２を操作し、コンテンツの選択を行い、選択の結果をビデオサーバシステムに伝達する。
【００２９】
選択されたコンテンツの情報がビデオサーバシステムに伝えられると、ビデオサーバシステムは必要なビデオストリームを取り出し、ダイジェスト生成手段１０３によって該ビデオストリームに対応するダイジェストインデックスを生成する。
【００３０】
ビデオサーバシステムによって生成されたダイジェストインデックス情報は、ダイジェストインデックス送出手段１０４によってビデオ再生端末システムに伝達される。ビデオ再生端末システムが受け取ったダイジェストインデックス情報は、ダイジェストインデックス選択整形手段２０７に送られ、ここで別途ユーザによって指定された再生速度に基づいて、ダイジェストインデックスの先頭部分が必要個数だけ取り出される。これを昇順にソートし、部分ストリーム要求としてビデオサーバシステムへ伝達する。
【００３１】
ビデオサーバシステムがダイジェストインデックス情報を含んだ部分ストリーム要求を受け取ると、部分ストリーム読み出し手段１０６は、該インデックス情報に基づいてビデオストリームを部分的に読み出す。これを元に、ビデオデコード手段１０７及びビデオエンコード手段１０８を介し、部分ストリームを生成する。生成結果を部分ストリーム送出手段１０９が受け取り、ビデオ再生端末システムへと送出する。
【００３２】
ここでビデオサーバシステムが送出する部分ストリームは、元々ビデオストリーム保持手段が持っているＭＰＥＧ形式のデータの一部分でも良いし、エンコーダによってＪＰＥＧ形式やＭＰＥＧ形式でエンコードしなおしたりしたものでもよく、その形式には依存しない。例えば、ユーザが要求したビデオ再生速度が十分高速である場合は、再生すべきフレーム相互間の相関性が低いことが予想されるため、ＪＰＥＧ形式による再エンコードが最も高い処理効率を得る、といったことが期待できる。このような条件においては、ＪＰＥＧエンコーダを選択し、部分ストリームとしてＭｏｔｉｏｎＪＰＥＧ形式のデータを送出してもよい。
【００３３】
ビデオ再生端末システムはビデオサーバシステムから送出される部分ストリームを受け取り、部分ストリーム一時保持手段２０３にこれを一時保存し、ダイジェスト再生を行う。部分ストリームとして複数の形式によるストリームが送られてくるため、ビデオ再生端末システム内のデコード手段２０６はこれら複数の形式に対応したデコード手段を持ち、必要に応じて切り替えて動作するよう制御される。
【００３４】
ダイジェスト再生前、再生中もしくは再生終了後のしかるべきタイミングで、ビデオサーバシステムは主ストリームを送出し、ビデオ再生端末システムはこれを受け取って主ストリーム保持手段２０４に一時保存し、ユーザにコンテンツを視聴させるための処理を行う。
【００３５】
ここで、複数の形式に対応したビデオエンコード手段及びビデオデコード手段については、複数のビデオエンコーダ及びビデオデコーダを個別に搭載することで実現しても良いし、単一の演算装置に複数のソフトウエアを組み込みことで複数の形式に対応するのでも良い。
【００３６】
また、ビデオ再生端末システムが、自らがサポートするビデオデコード手段の一覧をビデオサーバシステムへ伝達しておき、ビデオサーバシステムは、一覧の中から最も効率よく処理可能なエンコード形式を選択するような構成としても良い。なお、サポート可能なビデオデコード手段の一覧を伝達するのは、ビデオサーバシステムへ部分ストリームを要求する時点で行うのが効率が良いと考えられる。
〔第三実施例の説明〕
次に、本発明のさらに別の実施例について説明する。
【００３７】
図４に本発明の第三の実施例を示す。図４において、１はビデオサーバシステム、２はビデオ再生端末システムである。また、１０１はビデオエンコード手段、１０２はビデオストリーム保持手段、１０３はダイジェスト生成手段、１０４はダイジェストインデックス情報送出手段、１０５はビデオストリーム送出手段である。また、２０１はダイジェストインデックス情報要求手段、２０２は部分ビデオストリーム要求手段、２０３は部分ビデオストリーム一時保持手段、２０４はビデオストリーム一時保持手段、２０５はダイジェストインデックス選択整形手段、２０６はビデオデコード手段、２０７はビデオ表示手段である。また、４０１は伝送路性能検出手段である。
【００３８】
次に図４を用いて、本実施例の動作について説明する。本実施例の動作は、前述の第一実施例の動作と一点を除き同一である。第一実施例と異なる動作となるのは、次の点である。第一実施例において、ビデオ再生端末システムに対する再生速度入力はユーザによって外部から行われていた。本実施例においては、ビデオサーバシステム及びビデオ再生端末システムの双方が具備する伝送路性能検出手段４０１によって、ビデオ再生端末システムにおける再生速度が自動的に決定される。
【００３９】
これは例えば、ビデオサーバシステムとビデオ再生端末システムの間が高速の伝送路で接続されている場合は、比較的多くのデータを短時間に伝送することができることから、低い速度での再生が可能であるが、両者の間の伝送路が低速である場合には、伝送すべきデータ量を削減し、見かけ上の再生速度を速くすることが必要になるためである。
【００４０】
もちろんこのような構成であっても、再生速度を全て機械が決定するのではなく、制限を設けてその範囲内で人間が所望の再生速度を設定する、といった方式でも良い。例えば、伝送路速度が十分でない場合には、通常選択可能な１００倍速はメニューに現れず、最小でも５００倍速になる、といった制限が想定できる。
〔第四実施例の説明〕
次に、本発明のさらに別の実施例について説明する。
【００４１】
図５に本発明の第四の実施例を示す。図５において、１１は画像監視サーバシステム、１２は監視センタ制御端末システム、１３は監視画像再生端末システムである。また、１０１はビデオエンコード手段、１０２はビデオストリーム保持手段、１０３はダイジェスト生成手段、１０４はダイジェストインデックス情報送出手段、１０５はビデオストリーム送出手段、１１０は警報制御手段である。また、２０１はダイジェストインデックス情報取得手段、２０２は部分ビデオストリーム要求手段、２０３は部分ビデオストリーム一時保持手段、２０４はビデオストリーム一時保持手段、２０５はダイジェストインデックス選択整形手段、２０６はビデオデコード手段、２０７はビデオ表示手段である。また、３０１は部分ビデオストリーム一時保持手段、３０２はビデオデコード手段、３０３はビデオ表示手段である。
【００４２】
次に図５を用いて、本実施例の動作について説明する。
【００４３】
ここで、画像監視サーバシステム１１は、映像による監視を行おうとしている監視対象サイト近傍に設置されることを想定し、また、監視センタ制御端末システム１２は、該監視システムを統括する中央監視センタのような場所に設置されることを想定し、また、監視画像再生端末１３は、移動体や、各種公的機関（警察、消防等）に設置されることを想定している。
【００４４】
画像監視サーバシステム１１は、監視対象を映し出すカメラに接続されていることを想定する。監視カメラから入力される画像は、エンコード手段１０１によってエンコードされ、ビデオストリーム保持手段１０２に保存される。カメラからの入力画像はダイジェスト生成手段１０３にも供給され、リアルタイムもしくは一定時間間隔でダイジェストインデックスを生成する。
【００４５】
次に、監視対象に関する何らかの異常を操作スイッチやセンサなどにより検出した場合、警報入力が警報制御手段１１０に入力される。この入力をトリガとして、ダイジェストインデックス送出手段１０４は、その時点で求められていたダイジェストインデックス情報を、監視センタ制御端末システムに向けて送出する。なお、このとき同時に、どこに設置されている画像監視サーバシステムであるかを示す情報や、どのようなトリガによって発生した警報かを示す情報を、同時に送出するという方法も考えられる。
【００４６】
監視センタ制御端末システムがこの警報を受け取ると、予め設定してある再生速度情報に基づき、ダイジェストインデックスの選択及びソート処理などを行い、部分ストリーム要求を画像監視サーバシステムに向けて送出する。この際、必要であればオペレータ等が介在して、再生速度を変更しても良い。
【００４７】
画像監視サーバシステム１１が部分ストリーム要求を受け取ると、これに含まれるダイジェストインデックス情報に基づいてビデオストリーム保持手段１０２に格納されているビデオストリームの一部分を抜き出し、部分ストリームとして送出を行う。
【００４８】
監視センタ制御端末システム１２は、この部分ストリームを受け取ると、部分ストリーム一時保持手段２０３にこれを一時保存し、ストリームの表示処理を行う。オペレータはこれを視聴し、もし必要であれば再生速度の調整を行い、再度、部分ストリーム要求を送出しても良い。
【００４９】
画像監視サーバシステム１１から送出される部分ストリームは、監視センタ制御端末システム１２だけに送られるのではなく、監視画像再生端末１３にも送られる。この端末は比較的単純な構成であり、部分ストリームを再生する機能だけを持つ。画像監視サーバシステムから受け取った部分ストリームは、部分ビデオストリーム保持手段３０１にいったん保存され、次に新しい部分ストリームを受け取るまで繰り返し再生表示を行う。
【００５０】
このような動作により、監視画像中の重要フレームだけを先に送出することで、たとえ移動体への伝送路が低速であったとしても、問題解決に役立つであろう高解像度の画像を各端末に対して送ることができる。例えば、火災による高温をセンサが検知して警報を発した場合、火元の場所はどこで何が燃え始めたのか、人間は残っていないか、周辺に危険物はないか、といった消火に役立つと思われる重要な情報を、迅速に関係各所に送付することができる。
【００５１】
なお、本実施例では、監視センタ制御端末にいったんダイジェストインデックスを送り、これを送付しなおすことで部分ストリームの送出が始まるようになっているが、これに制限されるものではなく、監視センタ制御端末システムとの通信を行う前に、画像監視サーバシステムが自律的に予め指定されていた再生速度情報によって部分ストリームを生成し、これを自動的に送出する、といった構成でも構わない。
【００５２】
また、必ずしも全ての監視画像再生端末に向けて部分ストリームを送出する必要はなく、監視センタ制御端末システムにおいて何らかの制約条件を送付していても構わない。例えば、監視対象サイトの近隣に位置している監視画像再生端末に向けてだけ、部分ストリームを送付する、といった制約をかけてもよい。
〔実施例の別方式の説明〕
次に、以上述べた実施例においては、以下に述べる特徴を併せ持つことができる。この特徴について、図６を用いて説明する。
【００５３】
図６は、上記実施例において、ビデオサーバシステム１からストリームが送出され、これをビデオ再生端末システム２が受け取り、いったんストリーム一時保持手段に格納したものを再生表示する動作の部分について、説明のため部分的に抜き出したものである。
【００５４】
図６において、１０２はビデオストリーム保持手段、１０５はストリーム送出手段、２０４はビデオストリーム一時保持手段、２０６はビデオデコード手段、２０７はビデオ表示手段である。
【００５５】
始めに、ｓｔｅｐ１００１において、部分ストリームの送出が行われる。ストリーム送出手段１０５は、別途指定されたビデオストリームの一部分を送出する。これを受け取ったビデオ再生端末システム２は、単一のビデオストリーム一時保持手段中の、当該部分ストリームが本来占めるべき位置にこれを格納していく。この結果、図６に模式的に示すように、部分ストリームは、ビデオストリーム一時保持手段２０４をストライプ状に占有するように格納される。
【００５６】
次に、ｓｔｅｐ１００２において、部分ストリームの再生表示と、これに平行して主ストリーム全体の送出が行われる。ビデオデコード手段２０６によってストライプ状に格納された部分ストリームが読み出され、これがデコードされて表示処理される。一方、ストリーム送出手段１０５は、ビデオストリーム１０２に格納されているストリームのうち、先ほど部分ストリームとして送出しなかった部分だけを、順に送出する。これを受け取ったビデオ再生端末システム２は、先ほど部分ストリームを格納しなかった残りの部分に対して、受け取ったストリームを格納していく。この処理によって最終的には、ビデオストリーム一時保持手段２０４は、先頭から順々にビデオストリーム全体が格納されたのと同じ状態となる。
【００５７】
次に、ｓｔｅｐ１００３において、ストリーム全体の再生表示が行われる。前のステップが完了した時点で、ビデオストリーム一時保持手段２０４にはストリーム全体が正しい順序で格納されていることになるため、これを先頭から順にデコードすることで、ストリーム全体の再生を行うことができる。
【００５８】
以上のような処理を行うことによって、部分ストリーム用の専用ストリーム保持手段を用意することなく、単一のストリーム保持手段によってビデオ再生端末システムを構築することができる。これにより構成の単純化を図ることができる。
〔第五実施例の説明〕
次に、以上述べた実施例は、何れも部分ストリームと主ストリームの２段階に分けた伝送を行うものであるが、ダイジェストインデックスによって重要なフレームと判断されたものから順に伝送を行う方法がある。
【００５９】
この方法の例として、さらに別に実施例について説明する。
【００６０】
図７に本発明の第五の実施例を示す。図７において、１はビデオサーバシステム、２はビデオ再生端末システムである。また、１０１はビデオエンコード手段、１０２はビデオストリーム保持手段、１０３はダイジェスト生成手段、１０５はビデオストリーム送出手段、１１１はビデオストリーム構造情報保持手段、１１２は部分ストリーム選択手段、１１３はシステム制御手段である。また、２０４はビデオストリーム一時保持手段、２０５はダイジェストインデックス選択整形手段、２０６はビデオデコード手段、２０７はビデオ表示手段、２０８はストリーム要求手段、２０９はダイジェストインデックス一時保持手段である。
【００６１】
次に図７を用いて、本実施例の動作について説明する。
【００６２】
ビデオサーバシステム１に画像が入力されると、ビデオエンコード手段１０１によってＭＰＥＧ等のエンコードが行われ、ビデオストリーム保持手段１０２に保存される。またこのとき、ビデオエンコード手段１０１は、エンコードの際に使用したストリーム構造情報を出力し、ストリーム構造情報保持手段１１１がこれを格納する。ここでストリーム構造情報は、例えばＭＰＥＧによるエンコードを仮定すると、フレームデータがフレーム間相関を利用してエンコードされているかどうか、どのフレームとの間の相関を用いたエンコードか、といった情報である。
【００６３】
次に、ユーザがコンテンツの視聴を要求すると、ビデオ再生端末システム２のストリーム要求手段２０８は、ビデオサーバシステムに対してストリームの要求を行う。制御手段１１３がこれを受け、対応するビデオストリームをビデオストリーム保持手段１１２から取り出し、これをもとにしてダイジェスト生成手段１０３がダイジェストインデックスを生成する。ダイジェストインデックス情報は、ビデオ再生端末システム２へ送られ、ダイジェストインデックス一時保持手段２０９に格納される。制御手段１１３はまた、ストリーム構造情報保持手段１１１から、対応するストリームの構造情報を取り出し、先ほど生成されたダイジェストインデックスと共に部分ストリーム選択手段１１２へ送る。部分ストリーム選択手段１１２は、受け取った情報から、ダイジェストインデックスの先頭に指定されているフレームをデコードするために必要な部分ストリームを求め、ストリーム送出手段１０５は、この情報をもとにして、必要とされている部分ストリームだけをビデオ再生端末システム２へ送出する。例えば、指定フレームがフレーム間相関を利用したエンコードによるものであれば、そのデコードに際しては参照先のフレームデータが場合によっては複数必要となるため、これらをまとめて部分ストリームとして送出する、といった動作になる。
【００６４】
部分ストリーム選択手段１１２は、ダイジェストインデックスの先頭に指定されているフレームの処理を終えると、２番目に指定されているフレーム、３番目のフレーム、といったように順次処理を進め、最終的に全てのフレームに対応するストリームを送出する。なお、ＭＰＥＧのようにフレーム間相関を用いるエンコード方式の場合、既に参照先フレームとして送出したフレームが、ｎ番目に指定されているフレームとして再度送出対象となることになるが、この場合は既に送出済みであるとして実際には送出しないことができる。
【００６５】
次に、ビデオ再生端末システム２は、ダイジェストインデックス情報とビデオストリームとを、ビデオサーバシステム１から受け取って内部に一時的に保持する。このうちビデオストリームは、ダイジェストインデックスの先頭に現れるフレームから順に受け取ることになる。ユーザが指定した再生速度に基づき、ダイジェストインデックスを先頭から部分的に選択及びソートし、これを元にビデオストリームを順次デコードし、ユーザに視聴させる。
【００６６】
ここで、ビデオストリームは重要なフレームから順に送られてくるため、ビデオストリーム受信開始直後には最も重要なフレームが少しだけビデオ再生端末システム内に保持されている状態となる。この状態では、例えば２０％分のダイジェスト再生を行おうとしても、対応するビデオストリームが手元にないため再生できない。よって、ビデオストリームが送信されている途中の状態においては、ユーザが指定できる再生速度に制限を設けることになる。例えば、ビデオストリーム一時保持手段２０４に３％分のダイジェストに対応するビデオストリームが入っているものとすれば、再生速度入力において３％より多い値が設定できないように制限する。もちろん、ビデオストリーム一時保持手段２０４に格納されているビデオストリームの量が順次増えることで、３％より大きい値を選択できるようにする。
【００６７】
なお、本実施例において用いるビデオストリーム一時保持手段２０４では、図６において模式的に示したストライプ状のデータ格納方法を採ることが適当であることはいうまでもない。
【００６８】
以上述べた何れの実施例においても、ストリームに含まれるデータとしてビデオデータについてしか触れていないが、ビデオに付随する音声データもしくはそれ以外のデータについても、同様の取扱いによって処理することができる。
【００６９】
また、ビデオ再生端末システムは、部分ストリームと主ストリームを両方とも受け取るように記述しているところがあるが、これは必須ではなく、例えばユーザの指示によって、主ストリームは受け取らずに部分ストリームだけを受け取るような動作をしても良い。
【００７０】
また、ダイジェストインデックス情報の生成は、ビデオ再生端末側からの要求をトリガとして処理されているところがあるが、ビデオサーバシステム内にビデオストリームが取り込まれた時点でこれを生成し保持しておいても良い。また、必ずしもビデオサーバシステム内においてダイジェストインデックス情報を生成する必要はなく、これを外部から別途供給しても構わない。
【００７１】
【発明の効果】
以上述べたように本発明によれば、
コンテンツのダイジェスト視聴時に、どれだけの長さのダイジェストを視聴するかを端末側で予め決定することが可能となり、
ダイジェストを視聴する際には、コンテンツの一部だけをサーバからダウンロードすることにより、視聴を開始するまでに要する時間を短縮することが可能となり、
また、ダイジェストだけを再生する場合には、サーバから受信するデータの総量を低減することが可能となり、
以上により、利便性が高く効率が良いシステムを提供することができるという効果がある。
【図面の簡単な説明】
【図１】本発明の第一実施例の構成図。
【図２】本発明のダイジェストインデックスの例。
【図３】本発明の第二実施例の構成図。
【図４】本発明の第三実施例の構成図。
【図５】本発明の第四実施例の構成図。
【図６】本発明におけるストリーム一時保持方法の別方式。
【図７】本発明の第五実施例の構成図。
【符号の説明】
１…ビデオサーバシステム、２…ビデオ再生端末システム、１１…画像監視サーバシステム、１２…監視センタ制御端末システム、１３…監視画像再生端末システム、１０１…ビデオエンコード手段、１０２…ビデオストリーム保持手段、１０３…ダイジェスト生成手段、１０４…ダイジェストインデックス情報送出手段、１０５…ビデオストリーム送出手段、１０６…部分ストリーム読み出し手段、１０７…ビデオデコード手段、１０８…ビデオエンコード手段、１０９…部分ストリーム送出手段、１１０…警報制御手段、１１１…ビデオストリーム構造情報保持手段、１１２…部分ストリーム選択手段、１１３…システム制御手段、２０１…ダイジェストインデックス情報要求手段、２０２…部分ビデオストリーム要求手段、２０３…部分ビデオストリーム一時保持手段、２０４…ビデオストリーム一時保持手段、２０５…ダイジェストインデックス選択整形手段、２０６…ビデオデコード手段、２０７…ビデオ表示手段、２０８…ストリーム要求手段、２０９…ダイジェストインデックス一時保持手段、３０１…部分ビデオストリーム一時保持手段、３０２…ビデオデコード手段、３０３…表示手段、４０１…伝送路性能検出手段。
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a technique for supplying digest information about content, separately from the content itself, when a large amount of content is viewed. In particular, it relates to a video server device for distributing digital video content, a video playback terminal device that receives the distribution, and a video distribution system using them.
[0002]
[Prior art]
With the development of digital video processing technology in recent years, a large amount of digital video content has begun to be distributed. In order to efficiently distribute these contents and to effectively use the limited time of the viewer, the technology for providing digest information of the contents has been remarkably developed.
[0003]
As digest information, if the target content is a movie, for example, various attributes of the movie such as the production year and information about the producers may be used, as well as the trailer video used immediately before the movie's release. If the target content is something like sports broadcast footage, a person may watch the content before distribution starts, tag the scenes considered important, extract those portions, and use the extracted video as digest information.
[0004]
As such a technique, for example, there is a technique disclosed in Japanese Patent Application Laid-Open No. H11-196385.
[0005]
However, the conventional techniques described above have the following problems.
[0006]
When content including video is sold, the presence or absence of digest information that includes video is expected to make a significant difference in advertising effectiveness. For content such as recently released movies, a trailer can be reused as video digest information, but in general the digest currently has to be generated from the content itself.
[0007]
When video digest information is generated from the video content itself, it is necessary to determine which parts of the content are suitable for the digest and to extract those parts of the video. This extraction has conventionally been done by having a person watch the content and tag it. Recent research, however, has made it possible for a computer to do this work automatically, although only for limited kinds of content. An example of such research is shown in Non-Patent Document 1.
[0008]
However, in either method the digest video is generated in advance, so its length is fixed at generation time. For example, if a digest of two hours of content is generated as a one-minute video, a viewer who wants to see the digest must always spend one minute watching it. Even if the viewer wants a shorter digest, the one-minute digest cannot be shortened further for viewing. Likewise, a viewer who wants a somewhat more detailed digest has no way to obtain one.
[0009]
For such viewers, it is possible to prepare several digest videos, for example a 15-second version and a three-minute version in addition to the one-minute version. However, viewer demands in this respect tend to fragment, and it is highly likely that a 30-second version and a two-minute version will have to be prepared sooner or later. Moreover, to meet such demands with the manual tagging method described above, tags carrying multiple weights must be applied during tagging, so the cost of tagging is expected to rise significantly.
[0010]
[Patent Document 1]
JP-A-11-196385
[Non-patent document 1]
Ken Masumitsu et al., "Personalized summary video creation method using video importance", IEICE Transactions D-II, Vol. J84-D-II, No. 8, pp. 1848-1855, 2001
[Non-patent document 2]
Daniel DeMenthon, Vikrant Kobla, David Doermann, "Video Summarization by Curve Simplification", Proceedings of the sixth ACM international conference on Multimedia 1998, Bristol, United Kingdom, pp. 211-218, 1998
[0011]
[Problems to be solved by the invention]
An object of the present invention is to provide a highly convenient and efficient digital content summarizing and reproducing system.
[0012]
[Means for Solving the Problems]
The above problem is solved by
a method of shortening playback time by summarizing digital content, in which
the summarizing method divides the digital content along the time axis, evaluates the importance of each divided element, and plays back only the elements with high evaluation values,
the playback time is shortened by selecting elements in descending order of evaluation value until the specified playback time is reached, and
the playback time reduction rate is specified when the digital content is played back.
[0013]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[Description of First Embodiment]
FIG. 1 shows a first embodiment of the present invention. In FIG. 1, reference numeral 1 denotes a video server system, and 2 denotes a video playback terminal system. Also, 101 is a video encoding unit, 102 is a video stream holding unit, 103 is a digest generation unit, 104 is a digest index information transmission unit, and 105 is a video stream transmission unit. Further, 201 is a digest index information requesting unit, 202 is a partial video stream requesting unit, 203 is a partial video stream temporary holding unit, 204 is a video stream temporary holding unit, 205 is a digest index selection and shaping unit, 206 is a video decoding unit, and 207 is a video display means.
[0014]
Next, the operation of the present embodiment will be described with reference to FIG.
[0015]
When a content image is input to the video server system 1, the video encoding unit 101 encodes the content image by, for example, the MPEG method, and stores the encoding result in the video stream holding unit 102. Of course, an already encoded video stream may be directly input to the video stream holding unit 102, but is omitted in this specification for simplicity of description.
[0016]
The user operates the video playback terminal system 2 to select content, and the selection result is transmitted to the video server system. As the means of selection and transmission, for example, a content title may be selected from a list, or a selection may be made from a list showing parts of images. Conventional techniques may be used for such selection means, so their description is omitted in this specification.
[0017]
When the information of the selected content is transmitted to the video server system, the video server system extracts a required video stream, and generates a digest index corresponding to the video stream by the digest generation unit 103.
[0018]
Here, the digest index is obtained by assigning an integer number to every frame of the corresponding video stream, rearranging the frames in descending order of estimated importance, and listing the resulting frame numbers. FIG. 2 shows an example of the digest index. In this example, the 1034th frame is judged to be the most important frame in the entire content, followed in order by the 26th, 544th, 39th, and 1667th frames. Assuming that the corresponding video stream is 100,000 frames long, viewing it at, for example, 1000 times speed is achieved by taking the first 100 frame numbers of the digest index, sorting them in ascending order, and displaying the corresponding frames in that order. This enables digest viewing at the desired speed.
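The selection step described above can be sketched as follows. This is only an illustrative sketch: the function name `select_digest_frames` and its parameters are hypothetical, and the number of frames to keep is assumed to be the stream length divided by the speed factor, as in the 100,000-frame example.

```python
def select_digest_frames(digest_index, total_frames, speed):
    """Return the frame numbers to display, in temporal (ascending) order.

    digest_index: frame numbers sorted by descending importance.
    total_frames: length of the full video stream in frames.
    speed: requested speed-up factor (e.g. 1000 for 1000 times speed).
    """
    if speed < 1:
        raise ValueError("speed must be at least 1")
    # Assumed rule: keep total_frames / speed frames (at least one).
    count = max(1, round(total_frames / speed))
    # The head of the index holds the most important frames.
    head = digest_index[:count]
    # Restore temporal order before display.
    return sorted(head)


# First five entries of the FIG. 2 example; a 15-frame stream at 5x keeps 3 frames.
fig2_head = [1034, 26, 544, 39, 1667]
print(select_digest_frames(fig2_head, total_frames=15, speed=5))
```

Because the index is ordered once by importance, changing the requested speed only changes how many entries are taken from its head; the index itself does not need to be regenerated.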
[0019]
Although the arithmetic method itself for generating the digest index is not particularly described in the present invention, it is assumed that, for example, a method as described in Non-Patent Document 2 is used. Of course, the arithmetic method for generating the digest index is not limited to the method described in the above document, and an arbitrary method may be used.
[0020]
In this embodiment, the digest index information generated by the video server system is transmitted to the video playback terminal system by the digest index sending means 104. When the video playback terminal system obtains the digest index information, the digest index selection and shaping means 205 extracts the required number of entries from the head of the digest index, based on the playback speed separately specified by the user. These entries are sorted in ascending order, and the sorted result is transmitted to the video server system. Note that the sort itself need not be performed on the video playback terminal system side and, depending on the configuration, may be performed on the video server system side. Likewise, the extraction of the required number of entries from the head of the digest index need not be performed on the video playback terminal system side: the necessary playback speed information may be transmitted from the video playback terminal system to the video server system, and the digest index may be cut out on the video server system side.
[0021]
Next, when the video server system obtains a digest index to be reproduced, the stream transmitting means 105 transmits a corresponding video stream based on the index information, and the video reproduction terminal system receives this and performs digest reproduction.
[0022]
At an appropriate time before, during, or after digest playback, the video server system sends out the main stream; the video playback terminal system receives it, temporarily stores it in the main stream holding means 204, and performs processing to let the user view the content.
[0023]
If viewing of the main content is charged for, billing is assumed to occur at the point when the video playback terminal system acquires the video stream itself. If only digest playback is to be free of charge, restrictions are needed, such as setting a lower limit on the playback speed or limiting the digest index generation target to a part of the video stream, for example its beginning.
[Description of Second Embodiment]
Next, another embodiment of the present invention will be described.
[0024]
FIG. 3 shows a second embodiment of the present invention. In FIG. 3, 1 is a video server system, and 2 is a video playback terminal system. Also, 101 is a video encoding unit, 102 is a video stream holding unit, 103 is a digest generation unit, 104 is a digest index information sending unit, 105 is a video stream sending unit, 106 is a partial stream reading unit, 107 is a video decoding unit, 108 is a video encoding means, and 109 is a partial stream sending means. Further, 201 is a digest index information requesting unit, 202 is a partial video stream requesting unit, 203 is a partial video stream temporary holding unit, 204 is a video stream temporary holding unit, 205 is a digest index selection and shaping unit, 206 is a video decoding unit, and 207 is a video display means.
[0025]
Note that the video encoding means 108 and the video decoding means 206 may respectively hold encoding means and decoding means using a plurality of methods.
[0026]
Next, the operation of this embodiment will be described with reference to FIG.
[0027]
When a content image is input to the video server system 1, the video encoding unit 101 encodes the content image by, for example, the MPEG method, and stores the encoding result in the video stream holding unit 102.
[0028]
The user operates the video playback terminal system 2 to select content, and transmits the selection result to the video server system.
[0029]
When the information of the selected content is transmitted to the video server system, the video server system extracts a required video stream, and generates a digest index corresponding to the video stream by the digest generation unit 103.
[0030]
The digest index information generated by the video server system is transmitted to the video playback terminal system by the digest index sending means 104. The digest index information received by the video playback terminal system is sent to the digest index selection and shaping means 205, where the required number of entries is extracted from the head of the digest index based on the playback speed separately specified by the user. These entries are sorted in ascending order and transmitted to the video server system as a partial stream request.
[0031]
When the video server system receives the partial stream request including the digest index information, the partial stream reading unit 106 partially reads the video stream based on the index information. Based on this, a partial stream is generated via a video decoding unit 107 and a video encoding unit 108. The generation result is received by the partial stream sending means 109 and sent to the video playback terminal system.
[0032]
Here, the partial stream sent out by the video server system may be a part of the MPEG-format data originally held by the video stream holding means, or may be data re-encoded by an encoder in JPEG or MPEG format; the scheme does not depend on the format. For example, if the video playback speed requested by the user is sufficiently high, the correlation between the frames to be played back is expected to be low, so re-encoding in JPEG format can be expected to give the highest processing efficiency. Under such conditions, a JPEG encoder may be selected and Motion JPEG data may be sent as the partial stream.
[0033]
The video playback terminal system receives the partial stream sent from the video server system, temporarily stores the partial stream in the partial stream temporary holding unit 203, and performs digest playback. Since streams in a plurality of formats are sent as partial streams, the decoding means 206 in the video playback terminal system has decoding means corresponding to the plurality of formats, and is controlled to switch and operate as necessary.
[0034]
At an appropriate time before, during, or after digest playback, the video server system sends out the main stream; the video playback terminal system receives it, temporarily stores it in the main stream holding means 204, and performs processing to let the user view the content.
[0035]
Here, the video encoding means and video decoding means supporting a plurality of formats may be realized by mounting a plurality of video encoders and video decoders individually, or by installing a plurality of software programs on a single arithmetic unit to support the multiple formats.
[0036]
Alternatively, the video playback terminal system may transmit a list of the video decoding means it supports to the video server system in advance, and the video server system may select from the list the encoding format that can be processed most efficiently. It is considered efficient to transmit the list of supported video decoding means at the time a partial stream is requested from the video server system.
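The format negotiation in this embodiment might look like the following sketch. Everything here is an assumption made for illustration: the function name, the format labels `"mjpeg"` and `"mpeg"`, and the speed threshold of 100 at which intra-frame coding is preferred are not given in the specification, which only says that JPEG re-encoding is expected to be efficient when the playback speed is sufficiently high.

```python
def choose_partial_stream_format(client_decoders, speed, intra_only_threshold=100):
    """Pick an encoding for the partial stream from formats the terminal can decode.

    client_decoders: list of format labels reported by the terminal (hypothetical labels).
    speed: requested speed-up factor.
    intra_only_threshold: assumed speed above which frames are treated as uncorrelated.
    """
    if not client_decoders:
        raise ValueError("terminal reported no supported decoders")
    if speed >= intra_only_threshold:
        # Selected frames are far apart, so inter-frame prediction gains little.
        preference = ["mjpeg", "mpeg"]
    else:
        # Neighbouring frames are still correlated; inter-frame coding is preferred.
        preference = ["mpeg", "mjpeg"]
    for fmt in preference:
        if fmt in client_decoders:
            return fmt
    # Fall back to the first format the terminal listed.
    return client_decoders[0]


print(choose_partial_stream_format(["mpeg", "mjpeg"], speed=1000))
```

Sending the decoder list together with the partial stream request, as the text suggests, lets the server make this choice without an extra round trip.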
[Description of Third Embodiment]
Next, still another embodiment of the present invention will be described.
[0037]
FIG. 4 shows a third embodiment of the present invention. In FIG. 4, 1 is a video server system, and 2 is a video playback terminal system. Also, 101 is a video encoding unit, 102 is a video stream holding unit, 103 is a digest generation unit, 104 is a digest index information sending unit, and 105 is a video stream sending unit. Further, 201 is a digest index information requesting unit, 202 is a partial video stream requesting unit, 203 is a partial video stream temporary holding unit, 204 is a video stream temporary holding unit, 205 is a digest index selection and shaping unit, 206 is a video decoding unit, and 207 is a video display means. Reference numeral 401 denotes a transmission path performance detection unit.
[0038]
Next, the operation of this embodiment will be described with reference to FIG. The operation of this embodiment is the same as the operation of the above-described first embodiment except for one point. The operation different from that of the first embodiment is as follows. In the first embodiment, the input of the playback speed to the video playback terminal system is performed externally by the user. In this embodiment, the playback speed in the video playback terminal system is automatically determined by the transmission path performance detection means 401 provided in both the video server system and the video playback terminal system.
[0039]
For example, if the video server system and the video playback terminal system are connected by a high-speed transmission line, a relatively large amount of data can be transmitted in a short time, so playback at a low speed is possible. However, if the transmission path between the two is slow, it is necessary to reduce the amount of data to be transmitted and increase the apparent playback speed.
[0040]
Of course, even with such a configuration, the machine need not determine the playback speed entirely by itself; instead, it may set a limit, and a person may set the desired playback speed within that range. For example, when the transmission path speed is insufficient, the normally selectable 100× speed may be hidden from the menu so that the lowest selectable speed becomes 500×.
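As a rough sketch of how the transmission path performance could bound the playback speed, the example below derives a minimum speed-up factor from the measured bandwidth and filters the speed menu accordingly. The formula, the waiting-time parameter `max_wait_s`, and the assumption that all frames are of equal size are illustrative simplifications, not taken from the specification.

```python
def min_playback_speed(total_bytes, bandwidth_bytes_per_s, max_wait_s):
    """Lowest speed-up factor whose summary can be transferred within max_wait_s.

    Assumes equal-sized frames, so a summary played at N x speed needs roughly
    1/N of the total data.
    """
    transferable = bandwidth_bytes_per_s * max_wait_s
    if transferable <= 0:
        raise ValueError("bandwidth and waiting time must be positive")
    if transferable >= total_bytes:
        return 1.0  # the whole stream fits; no shortening is forced
    return total_bytes / transferable


def selectable_speeds(menu, min_speed):
    """Keep only the menu entries the transmission path can sustain."""
    return [s for s in menu if s >= min_speed]


# Hypothetical numbers: 1 GB of content, 100 kB/s link, 20 s acceptable wait.
limit = min_playback_speed(1_000_000_000, 100_000, 20)
menu = selectable_speeds([100, 500, 1000], limit)
print(limit, menu)
```

With these numbers the 100× entry drops out of the menu and 500× becomes the lowest selectable speed, mirroring the example in the text.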
[Explanation of the fourth embodiment]
Next, still another embodiment of the present invention will be described.
[0041]
FIG. 5 shows a fourth embodiment of the present invention. In FIG. 5, reference numeral 11 denotes an image monitoring server system, 12 denotes a monitoring center control terminal system, and 13 denotes a monitoring image reproduction terminal system. Also, 101 is a video encoding unit, 102 is a video stream holding unit, 103 is a digest generation unit, 104 is a digest index information transmission unit, 105 is a video stream transmission unit, and 110 is an alarm control unit. 201 is a digest index information acquisition unit, 202 is a partial video stream requesting unit, 203 is a partial video stream temporary holding unit, 204 is a video stream temporary holding unit, 205 is a digest index selection and shaping unit, 206 is a video decoding unit, and 207 is a video display unit. Reference numeral 301 denotes a partial video stream temporary holding unit, 302 denotes a video decoding unit, and 303 denotes a video display unit.
[0042]
Next, the operation of this embodiment will be described with reference to FIG.
[0043]
Here, it is assumed that the image monitoring server system 11 is installed in the vicinity of a site that is to be monitored by video, and that the monitoring center control terminal system 12 is installed in a central monitoring center that controls the monitoring system. The surveillance image reproducing terminal 13 is assumed to be installed in a mobile body or in various public institutions (police, fire department, etc.).
[0044]
It is assumed that the image monitoring server system 11 is connected to a camera that captures the monitoring target. The image input from the surveillance camera is encoded by the encoding unit 101 and stored in the video stream holding unit 102. The input image from the camera is also supplied to the digest generation unit 103, which generates a digest index in real time or at fixed time intervals.
[0045]
Next, when any abnormality related to the monitoring target is detected by an operation switch, a sensor, or the like, an alarm signal is input to the alarm control unit 110. With this input as a trigger, the digest index transmitting means 104 transmits the digest index information obtained up to that time to the monitoring center control terminal system. At this time, it is also conceivable to simultaneously transmit information indicating where the image monitoring server system is installed and information indicating what kind of trigger caused the alarm.
[0046]
When the monitoring center control terminal system receives this alarm, it selects entries from the digest index and sorts them based on preset playback speed information, and sends a partial stream request to the image monitoring server system. At this time, if necessary, the playback speed may be changed by an operator or the like.
[0047]
When the image monitoring server system 11 receives the partial stream request, it extracts a part of the video stream stored in the video stream holding means 102 based on the digest index information included in the request and sends it as a partial stream.
[0048]
Upon receiving the partial stream, the monitoring center control terminal system 12 temporarily stores the partial stream in the partial stream temporary holding unit 203, and performs a stream display process. The operator may watch this, adjust the playback speed if necessary, and send a partial stream request again.
[0049]
The partial stream sent from the image monitoring server system 11 is sent not only to the monitoring center control terminal system 12 but also to the monitoring image reproduction terminal 13. This terminal has a relatively simple configuration, and has only a function of reproducing a partial stream. The partial stream received from the image monitoring server system is temporarily stored in the partial video stream holding unit 301, and is repeatedly reproduced and displayed until the next new partial stream is received.
[0050]
By transmitting only the important frames of the monitoring image first through such an operation, high-resolution images that help solve the problem can be sent to each terminal even if the transmission path to a mobile body is slow. For example, if a sensor detects a high temperature caused by a fire and issues an alarm, important information that may help extinguish the fire, such as where the fire started, what began to burn, whether anyone remains at the site, and whether dangerous materials are nearby, can be quickly sent to the relevant parties.
[0051]
In this embodiment, the digest index is first sent to the monitoring center control terminal, and the transmission of the partial stream is started when the digest index is sent back. However, the present invention is not limited to this. Before communicating with the terminal system, the image monitoring server system may autonomously generate a partial stream based on playback speed information specified in advance, and automatically transmit the partial stream.
[0052]
Further, it is not always necessary to transmit the partial stream to all the monitoring image reproduction terminals, and some restriction conditions may be set in the monitoring center control terminal system. For example, a restriction may be imposed such that a partial stream is sent only to monitoring image reproducing terminals located near the monitoring target site.
[Explanation of another method of the embodiment]
Next, the embodiments described above can have the following features. This feature will be described with reference to FIG.
[0053]
FIG. 6 illustrates, as an extract from the operation of the above embodiments, the part in which a stream is transmitted from the video server system 1, received by the video playback terminal system 2, stored once in the stream temporary holding means, and then reproduced and displayed.
[0054]
In FIG. 6, reference numeral 102 denotes a video stream holding unit, 105 denotes a stream sending unit, 204 denotes a video stream temporary holding unit, 206 denotes a video decoding unit, and 207 denotes a video display unit.
[0055]
First, in step 1001, a partial stream is transmitted. The stream sending means 105 sends out a separately specified part of the video stream. The video playback terminal system 2, on receiving this, stores it in the single video stream temporary holding means 204 at the positions that the partial stream occupies within the full stream. As a result, as schematically shown in FIG. 6, the partial stream is stored so as to occupy the video stream temporary holding unit 204 in a stripe shape.
[0056]
Next, in step 1002, the reproduction and display of the partial stream and the transmission of the entire main stream are performed in parallel. The video decoding means 206 reads out the partial stream stored in a stripe shape, decodes it, and performs display processing. Meanwhile, the stream sending means 105 sequentially transmits only those portions of the stream stored in the video stream holding unit 102 that were not transmitted earlier as the partial stream. Upon receiving this, the video playback terminal system 2 stores the received stream in the remaining positions where the partial stream was not stored. By this processing, the video stream temporary holding unit 204 finally reaches the same state as if the entire video stream had been stored sequentially from the beginning.
[0057]
Next, in step 1003, reproduction and display of the entire stream is performed. When the previous step is completed, the entire stream is stored in the video stream temporary holding means 204 in the correct order, so the entire stream can be reproduced by decoding it in order from the beginning.
[0058]
By performing the above processing, a video playback terminal system can be constructed by a single stream holding unit without preparing a dedicated stream holding unit for a partial stream. Thereby, the configuration can be simplified.
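Steps 1001 to 1003 can be illustrated with a small model of the single holding unit. This is a minimal sketch, assuming one slot per frame position; the class `StripedStreamBuffer` and the sample frame numbers are hypothetical.

```python
class StripedStreamBuffer:
    """Single buffer with one slot per frame position (hypothetical model of unit 204)."""

    def __init__(self, frame_count):
        self.slots = [None] * frame_count

    def store(self, frame_no, data):
        # Each frame goes to the position it occupies in the full stream.
        self.slots[frame_no] = data

    def stored_frame_numbers(self):
        # Frames currently held, listed in temporal order.
        return [i for i, s in enumerate(self.slots) if s is not None]

    def is_complete(self):
        return all(s is not None for s in self.slots)


server_stream = [f"frame{i}" for i in range(6)]  # stands in for holding unit 102
partial = [4, 1]                                  # frames chosen from the digest index

buf = StripedStreamBuffer(len(server_stream))

# Step 1001: the partial stream lands in scattered ("striped") slots.
for n in partial:
    buf.store(n, server_stream[n])
digest_playback_order = buf.stored_frame_numbers()

# Step 1002: the server sends only what was not sent as the partial stream.
already_sent = set(partial)
for n in range(len(server_stream)):
    if n not in already_sent:
        buf.store(n, server_stream[n])

# Step 1003: the buffer now matches the full stream stored from the beginning.
full_playback = buf.slots
```

Because every frame is written to its final position, no copy from a separate partial-stream buffer is needed once the remaining frames arrive.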
[Description of Fifth Embodiment]
In each of the above-described embodiments, the transmission is performed in two stages, the partial stream and the main stream. However, there is also a method in which frames are transmitted in order, starting from the frame that the digest index ranks as most important.
[0059]
Another example will be described as an example of this method.
[0060]
FIG. 7 shows a fifth embodiment of the present invention. In FIG. 7, 1 is a video server system, and 2 is a video playback terminal system. Reference numeral 101 denotes a video encoding unit, 102 denotes a video stream holding unit, 103 denotes a digest generation unit, 105 denotes a video stream sending unit, 111 denotes a video stream structure information holding unit, 112 denotes a partial stream selection unit, and 113 denotes a system control unit. Reference numeral 204 denotes a video stream temporary holding unit, 205 denotes a digest index selection and shaping unit, 206 denotes a video decoding unit, 207 denotes a video display unit, 208 denotes a stream requesting unit, and 209 denotes a digest index temporary holding unit.
[0061]
Next, the operation of this embodiment will be described with reference to FIG.
[0062]
When an image is input to the video server system 1, the image is encoded by the video encoding unit 101 using a scheme such as MPEG and stored in the video stream holding unit 102. At this time, the video encoding unit 101 outputs the stream structure information used at the time of encoding, and the stream structure information holding unit 111 stores it. Here, assuming MPEG encoding for example, the stream structure information indicates whether each frame's data is encoded using inter-frame correlation and, if so, which frames it is correlated with.
[0063]
Next, when the user requests viewing of the content, the stream request means 208 of the video playback terminal system 2 requests a stream from the video server system 1. The control means 113 receives this request and extracts the corresponding video stream from the video stream holding means 102, and the digest generation means 103 generates a digest index based on it. The digest index information is sent to the video playback terminal system 2 and stored in the digest index temporary holding unit 209. The control unit 113 also extracts the structure information of the corresponding stream from the stream structure information holding unit 111 and sends it to the partial stream selection unit 112 together with the digest index generated earlier. From the received information, the partial stream selection unit 112 determines the partial stream necessary for decoding the frame specified at the head of the digest index, and the stream sending unit 105 sends only the partial stream determined to be necessary to the video playback terminal system 2. For example, if the specified frame is encoded using inter-frame correlation, decoding it may also require the data of several reference frames.
[0064]
After finishing the processing of the frame specified at the head of the digest index, the partial stream selection unit 112 proceeds in order to the second specified frame, the third, and so on, and the stream corresponding to each frame is sent. With an encoding method that uses inter-frame correlation, such as MPEG, a frame already transmitted as a reference frame may later come up as the n-th designated frame. In that case it may be treated as already sent and not actually transmitted again.
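The selection performed by the partial stream selection unit 112 can be sketched as a walk over the digest index that pulls in reference frames and skips anything already sent. This is an illustrative sketch: the function `frames_to_send`, the `references` mapping (standing in for the stream structure information), and the sample frame numbers are assumptions.

```python
def frames_to_send(digest_index, references):
    """Return frame numbers in transmission order.

    digest_index: frame numbers sorted from most to least important.
    references:   dict mapping a frame to the frames it needs for decoding
                  (a stand-in for the stream structure information).
    """
    sent = set()
    order = []

    def send(frame):
        if frame in sent:
            return  # already transmitted, e.g. as a reference frame
        sent.add(frame)  # mark first so a malformed cyclic reference cannot loop
        for ref in references.get(frame, ()):
            send(ref)  # reference frames go out before the frame needing them
        order.append(frame)

    for frame in digest_index:
        send(frame)
    return order


# Hypothetical structure: frame 7 depends on 4, frame 4 on 0, frame 2 on 0 and 4.
references = {7: [4], 4: [0], 2: [0, 4]}
transmission_order = frames_to_send([7, 2, 4, 9], references)
print(transmission_order)
```

Frame 4 appears in the digest index as the third entry, but it is not sent a second time because it already went out as a reference for frame 7.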
[0065]
Next, the video playback terminal system 2 receives the digest index information and the video stream from the video server system 1 and temporarily stores them. Of these, the video stream arrives in order starting from the frame at the head of the digest index. Based on the playback speed designated by the user, entries are selected from the beginning of the digest index and sorted into temporal order, and the video stream is decoded sequentially on this basis and presented to the user.
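The selection and sorting step on the terminal side can be sketched as follows. The function name `summary_frames` and the rule that a speed of N keeps roughly 1/N of the frames are assumptions made for illustration.

```python
import math


def summary_frames(digest_index, total_frames, speed):
    """Pick the most important frames for the given speed-up and order them in time.

    digest_index: frame numbers sorted from most to least important.
    speed:        speed-up factor; a factor of N keeps about 1/N of the frames.
    """
    count = max(1, math.ceil(total_frames / speed))
    selected = digest_index[:count]  # take entries from the head of the index
    return sorted(selected)          # play them back in temporal order


digest_index = [5, 1, 8, 3, 7, 0, 2, 4, 6, 9]
playlist = summary_frames(digest_index, total_frames=10, speed=2.5)
print(playlist)
```

Changing `speed` only changes how many entries are taken from the head of the same index, so the summary length can be chosen at playback time without regenerating the index.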
[0066]
Here, since the video stream is sent in order from the most important frames, immediately after reception begins the video playback terminal system holds only a few of the most important frames. In this state, even if a 20% digest reproduction is attempted, for example, the reproduction cannot be performed because the corresponding video stream is not yet at hand. Therefore, while the video stream is being transmitted, a limit is imposed on the reproduction speed that can be specified by the user. For example, assuming that the video stream temporary holding unit 204 contains a video stream corresponding to a digest of 3%, the reproduction speed input is limited so that a value larger than 3% cannot be set. Of course, values larger than 3% become selectable as the amount of video stream stored in the video stream temporary holding means 204 increases.
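The limit on the selectable digest length can be sketched as a simple clamp. This assumes that the limit is the share of digest-index frames already held in the temporary holding means; the function names are hypothetical.

```python
def max_selectable_percent(received_frames, total_frames):
    """Largest digest length (in percent) that the frames on hand can cover."""
    return 100.0 * received_frames / total_frames


def clamp_digest_percent(requested_percent, received_frames, total_frames):
    """Limit the user's request to what has already arrived."""
    return min(requested_percent, max_selectable_percent(received_frames, total_frames))


# Hypothetical state: 30 of 1000 frames have arrived, the user asks for 20%.
allowed = clamp_digest_percent(20, received_frames=30, total_frames=1000)
print(allowed)
```

As more frames arrive, `max_selectable_percent` grows, so the same request is eventually granted in full.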
[0067]
It is needless to say that the video stream temporary holding means 204 used in the present embodiment suitably employs a stripe-shaped data storage method schematically shown in FIG.
[0068]
In each of the embodiments described above, only video data is described as data included in a stream. However, audio data accompanying video or other data can be processed in the same manner.
[0069]
In addition, the video playback terminal system has been described as receiving both the partial stream and the main stream, but this is not essential. For example, in response to a user instruction, it may receive only the partial stream and not the main stream.
[0070]
In some of the embodiments, the digest index information is generated with a request from the video playback terminal as a trigger, but it may instead be generated and stored when the video stream is captured into the video server system. In addition, it is not always necessary to generate the digest index information in the video server system, and it may be separately supplied from outside.
[0071]
[Effect of the Invention]
According to the present invention as described above:
When viewing the digest of the content, the terminal can determine in advance how long a digest is to be viewed.
By downloading only a part of the content from the server when viewing the digest, the time required to start viewing can be reduced.
Also, when only the digest is reproduced, the total amount of data received from the server can be reduced.
As a result, a highly convenient and efficient system can be provided.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of a first embodiment of the present invention.
FIG. 2 is an example of a digest index according to the present invention.
FIG. 3 is a configuration diagram of a second embodiment of the present invention.
FIG. 4 is a configuration diagram of a third embodiment of the present invention.
FIG. 5 is a configuration diagram of a fourth embodiment of the present invention.
FIG. 6 shows another method of the stream temporary holding method according to the present invention.
FIG. 7 is a configuration diagram of a fifth embodiment of the present invention.
[Explanation of symbols]
1 ... Video server system, 2 ... Video reproduction terminal system, 11 ... Image monitoring server system, 12 ... Monitoring center control terminal system, 13 ... Surveillance image reproduction terminal system, 101 ... Video encoding means, 102 ... Video stream holding means, 103 ... Digest generating means, 104 ... Digest index information sending means, 105 ... Video stream sending means, 106 ... Partial stream reading means, 107 ... Video decoding means, 108 ... Video encoding means, 109 ... Partial stream sending means, 110 ... Alarm control means, 111 ... Video stream structure information holding means, 112 ... Partial stream selecting means, 113 ... System control means, 201 ... Digest index information requesting means, 202 ... Partial video stream requesting means, 203 ... Partial video stream temporary holding means, 204 ... Video stream temporary holding means, 205 ... Digest index selection and shaping means, 206 ... Video decoding means, 207 ... Video display means, 208 ... Stream request means, 209 ... Digest index temporary holding means, 301 ... Partial video stream temporary holding means, 302 ... Video decoding means, 303 ... Display means, 401 ... Transmission path performance detection means.

Claims

A method of shortening playback time by summarizing digital content, wherein
the summarizing method divides the digital content on a time axis, evaluates the importance of each of the divided elements, and reproduces only the elements having a high evaluation value,
the playback time is shortened by sequentially selecting elements in descending order of evaluation value so that the specified playback time is obtained, and
the playback time reduction rate is specified at the time of digital content playback.
Digital content summary playback method.

The digital content is digital video content,
The method of dividing the digital content on a time axis is a method of dividing the digital video content on a frame basis,
The method of claim 1 for summarizing and reproducing digital contents.

Having the features of claims 1 and 2,
and comprising input means for acquiring digital content, summarizing means for summarizing the digital content, playback speed designating means for designating a reduction rate of playback time, and output means for outputting the digital content, characterized in that the digital content is summarized and output at the specified shortening rate,
Digital content summarization system.

Having the features of claims 1 and 2,
and comprising storage means for holding the digital content, summarizing means for summarizing the digital content, playback speed designating means for designating a reduction rate of the playback time, and output means for outputting the digital content, characterized in that the digital content is summarized and output at the specified shortening rate,
Digital content summarization system.

Having the features of claims 1 and 2, and comprising:
a content summarization device including storage means for holding the digital content, summarizing means for summarizing the digital content, evaluation value output means for outputting a list of evaluation values used for the summarization, and a content element output unit that outputs content elements in descending order of evaluation in the evaluation value list;
a content reproduction device including playback speed designating means for designating a reduction rate of playback time, playback output means for playing back and outputting digital content, evaluation value input means for obtaining the evaluation value list, content element input means for obtaining the content elements, and a content element temporary storage unit for temporarily holding the content elements;
and a communication path connecting the two devices,
Digital content summary playback system.

Having the features of claims 1 and 2, and comprising:
a content summarization device including storage means for holding digital content, summarizing means for summarizing the digital content, evaluation value output means for outputting a list of evaluation values used for the summarization, and means for selectively outputting the content elements requested to be output;
a content reproduction device including playback speed designating means for designating a shortening rate of playback time, evaluation value input means for acquiring the evaluation value list, content element output requesting means for requesting output of content elements in descending order of evaluation in the evaluation value list, a content element input unit for acquiring the content elements, a content element temporary storage unit for temporarily holding the content elements, and a reproduction output unit for reproducing and outputting the digital content;
and a communication path connecting the two devices,
Digital content summary playback system.

The digital content summary playback system according to claim 6, wherein
the content element temporary storage means includes summary partial content element temporary storage means for temporarily holding the content elements selected for the playback time designated by the playback speed designation means, and non-summary partial content element temporary storage means for temporarily holding the content elements other than the selected content elements,
Digital content summary playback system.

The digital content summary playback system according to claim 7, wherein
The content summarizing apparatus includes transcodec means for changing a data format of a content element,
The content reproducing apparatus includes a multi-decoding unit for reproducing a content element whose data format has been changed by the transcodec unit,
The content element stored in the summary partial content element temporary storage unit and the content element stored in the non-summary partial content element temporary storage unit have different data formats.
Digital content summary playback system.

The digital content summarizing and reproducing system according to claim 5, 6, 7, or 8,
The playback speed designation means is a playback speed input means for designating a shortening rate from outside the system,
Digital content summary playback system.

The digital content summarizing and reproducing system according to claim 5, 6, 7, or 8,
The playback speed designation unit is a playback speed determination unit that obtains a shortening rate based on attribute information on a communication path that connects the content summarization device and the content playback device.
Digital content summary playback system.

The digital content summary playback system according to claim 10, wherein
The attribute information on the communication path is information on a transmission speed of the communication path,
Digital content summary playback system.

The digital content summarizing and reproducing system according to claim 5, 6, 7, 8, 9, 10, 11, or 12,
The communication between the content summarizing device and the content reproducing device is started by a communication start unit provided in the content summarizing device,
Digital content summary playback system.

The digital content summary playback system according to claim 12, wherein
The communication start unit is an input device connected to the content summarization device,
Digital content summary playback system.

The digital content summary playback system according to claim 5, 6, 7, 8, 9, 10, 11, 12, or 13,
The system comprises one or a plurality of the content summarizing devices, a plurality of the content reproducing devices, and a communication path interconnecting them,
The plurality of content playback devices are composed of one or more partial content playback devices that receive and play only the summarized partial content, and one or more total content playback devices that receive and play the entire content,
Digital content summary playback system.