JP4045768B2

JP4045768B2 - Video processing device

Info

Publication number: JP4045768B2
Application number: JP2001308282A
Authority: JP
Inventors: 宏樹吉村; 和貴平田
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2001-10-04
Filing date: 2001-10-04
Publication date: 2008-02-13
Anticipated expiration: 2021-10-04
Also published as: JP2003116095A

Description

【０００１】
【発明の属する技術分野】
本発明は、映像処理装置における映像データに対するリンクの提示に関する装置または方法に関し、特に、作業領域を確保して、映像データから部分映像データを抽出し、当該部分映像データに対して、テキスト・データ、音声データ、画像データ、関連資料ファイルデータ、映像データなどをリンク・データの内容として、リンク・データを利用者または利用者間で容易かつ適切に関連付けさせることが可能な映像処理装置および映像処理方法に関する。
【０００２】
【従来の技術】
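In recent years, communication tools that convey ideas using multimedia data over the Internet, as well as conferencing systems and information sharing, have spread among individuals and companies. Against this background, systems have been proposed that attach text annotations to digital documents and video footage, in the same way that highlighter marks and notes are written on conventional printed matter. Japanese Unexamined Patent Publication No. H8-272989, 「映像仕様による資料作成支援システム」, makes it possible to associate text information with video information and to handle the result as a document. This technique is hereinafter referred to as the first prior art.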
【０００３】
次に、特開２０００−２５０８６４号公報「協調作業支援システム」では、さまざまな形式の注釈が可能な技術として、プレゼンテーション資料などストリーミング・データに対して、メモや質問などのテキスト・データが付加可能でかつ複数のクライアント間で共有可能である。以下、この技術を第２の従来技術と呼ぶことにする。
【０００４】
また、特開平６−２７４５５２号公報「マルチメディアデータリンク方式」では、画面に表示されている動画像中の任意エリアまたは一連の動画像データ中の任意画面を指定することにより、当該画面にデータを表示可能である。以下、この技術を第３の従来技術と呼ぶことにする。
【０００５】
さらに、Y.Yamamoto,CHI2001「Time-ART」では、ビデオや音声データを視聴中に自由にクリッピングできるユーザ・インターフェイスを備え、テキストによる注釈機能を持ったツールが提案されている。以下、この技術を第４の従来技術と呼ぶことにする。
一方、特開平１０−２１０２９号公報「テロップ表示装置」では、テロップを簡単に利用者が作成でき、音声情報や画像情報を付加情報として簡単に追加できる表示装置が存在する。以下、この技術を第５の従来技術と呼ぶことにする。
【０００６】
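Furthermore, when a home page is viewed with a web browser for the World Wide Web, link information is sometimes embedded in the home page as a so-called image map. By moving the mouse over a region that makes up the image map of the home page presented in the web browser and clicking, the user can access the linked information. This technique is hereinafter referred to as the sixth prior art.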
【０００７】
特開平８−３２９０９６号公報「画像データ検索装置」では、画像データに、付加情報として、その画像の特徴を簡潔に表すアイコンを設定する手段を有し、そのアイコンを１次元以上の軸を有するマップ上の所定の位置に配置し、そのアイコンを用いて係るアイコンに関連する画像データを検索する画像データ検索装置の技術が公開されている。以下、この技術を第７の従来技術と呼ぶことにする。
【０００８】
さらに、特開平８−３２９０９７号公報「画像データ検索装置」では、画像データに、付加情報として、その画像に対するキーワードを設定する手段を有し、そのキーワードを用いて、画像データを検索する画像データ検索装置の技術が公開されている。以下、この技術を第８の従来技術と呼ぶことにする。
【０００９】
また、特開平８−３２９０９８号公報「画像データ検索装置」では、１次元以上の軸を有する第１のマップ上の画像データと１次元以上の軸を有する第２のマップ上の付加情報とを関連付けて、画像データを検索することができる画像データ検索装置の技術が公開されている。以下、この技術を第９の従来技術と呼ぶことにする。
【００１０】
特開平１１−３９１２０号公報「コンテンツ表示・選択装置およびコンテンツ表示・選択方法、並びにコンテンツ表示・選択方法のプログラムが記録された記録媒体」では、ＨＴＭＬ文書コンテンツを二次元配列に配置することにより、マウス・ポインタなしでブラウジング（内容の一覧）を可能とする技術が公開されている。以下、この技術を第１０の従来技術と呼ぶことにする。
【００１１】
【発明が解決しようとする課題】
しかしながら、従来の技術では、以下に示すような種々な問題があった。
まず、上述した第１から第５までの従来システムの共通な問題点として、利用者は映像データの再生中に別の画面に部分映像データを抽出して、当該映像データの内容に、音声データなどから参照しながらリンク・データを付加することができないという問題があった。
【００１２】
また、部分映像データに付加したリンク・データについて、部分映像データ上の任意の場所にリンク・データを付加することができず、どこに付加したのか分からないという問題があった。例えば、映像データに、人物と資料などのオブジェクトが複数映っているときに、当該部分映像データに対して、リンク・データを付加する場合、従来技術では、リンク・データのコメントが、どのオブジェクトを指しているか判別できないといった問題があった。
さらに、関連するリンク・データの付加情報を部分画像データの任意指定部分に、複数リンク・データの重ね合わせができないという問題点があった。
【００１３】
次に、前記の第６の従来技術では、イメージ・マップを含むＨＴＭＬ文書コンテンツを利用者に提示する場合、利用者がブラウザ内のイメージ・マップを含むＨＴＭＬ文書コンテンツの領域上にマウスを移動しなければ、イメージ・マップの存在を利用者が知ることはできなかった。
【００１４】
次に、第７、第８および第９の従来技術は、画像データに対して、アイコンやテキスト・データあるいは付加情報などを関連付けられるものの、利用者に対してリンクの視覚的なフィードバックを与える技術ではなく、同一の画像データに対して、複数のリンクを付加した場合に、利用者に視覚的なフィードバックを与えて、各リンクを区別し、リンクされた情報を利用できるものではない。
【００１５】
同じく、第１０の従来技術を用いても、利用者はＨＴＭＬ文書コンテンツ中、特に画像データや映像データ中に表現されている人や物など特定領域に関連付けられたイメージ・マップの存在を利用者に提示できないという問題があった。
また、第６から第１０までの従来技術のいずれを用いても、映像データ中に表現されている人や物などの特定領域といわゆる電子掲示板システムあるいは電話など通話・通信システムと連携して利用することはできないという問題があった。
【００１６】
本発明は、このような従来の課題を解決するためになされたもので、映像データから特定される部分映像データに対して当該部分映像データに関連付けられたデータの存在を提示することに関して有効な映像処理装置などを提供することを目的とする。
【００１７】
【課題を解決するための手段】
上記目的を達成するため、本発明に係る映像処理装置では、部分映像データ特定手段が映像データから当該映像データの部分である部分映像データを特定し、データ関連付け手段が特定した部分映像データに対してデータを当該データの存在を提示可能なように関連付ける。
従って、映像データから部分映像データを特定して当該部分映像データに対してデータをその存在が提示可能な態様で関連付けることができ、これにより、当該部分映像データに関連付けられたデータの存在を提示可能とすることができる。
【００１８】
ここで、映像処理装置としては、種々な装置として構成されてもよく、例えばコンピュータを用いて構成することができる。
また、映像データとしては、例えば時間的に連続した映像データが用いられ、具体的には、フレーム内の平面的な画像データが時間的に連続して変化していくようなデータが用いられ、この場合、フレーム内の位置を表す座標（横軸及び縦軸）の値と時間軸の値とで映像データ中の一点を示すことができる。
【００１９】
また、部分映像データとしては、種々なデータが用いられてもよく、例えば、１つのフレームの画像データや、１つのフレームの画像データ中の特定の対象のデータや、時間幅を有したフレームの画像データつまり時間的に連続した複数のフレームの画像データや、時間幅を有した特定の対象のデータなどを用いることができる。
【００２０】
また、部分映像データを特定する仕方としては、種々な仕方が用いられてもよく、例えばユーザからの指定に基づいて特定する仕方や、映像処理装置が予め定められた手順で自動的に特定する仕方や、これら両方を併用する仕方などを用いることができる。
【００２１】
また、部分映像データに対して関連付けるデータとしては、種々なデータが用いられてもよく、テキストのデータや、音声のデータや、画像のデータなどを用いることができる。
また、部分映像データに対して関連付けるデータの数としては、単数であってもよく、複数であってもよい。
【００２２】
また、本発明に係る映像処理装置では、部分映像データ特定手段は、映像データに含まれる同一の対象のデータについての時間幅を有する部分映像データを特定する。
従って、映像データに含まれる時間幅を有する同一の対象のデータに対してデータを関連付けることができる。
【００２３】
ここで、同一の対象のデータとしては、種々な対象のデータが用いられてもよく、例えば人物を対象とするデータや、物を対象とするデータや、フレーム内の所定の領域を対象とするデータなどを用いることができる。なお、同一の対象を特定する仕方としては、種々な仕方が用いられてもよく、例えば静止しているものについては同一の場所に存するものを同一の対象とみなす仕方を用いることができ、動作を行うものについては同一の形状などの特徴を有するものを同一の対象とみなす仕方を用いることができる。
また、時間幅としては、種々な時間幅を用いることができる。
【００２４】
また、本発明に係る映像処理装置では、映像データは音声データと対応している。そして、部分映像データ特定手段は、映像データに含まれる単数又は複数の人物のデータについて、当該人物のデータに対応した音声データが有効である時間幅を有する部分映像データを特定する。
従って、単数又は複数の人物を対象として、当該対象に対応した音声が有効である時間幅を有するデータを部分映像データとして特定することができる。
【００２５】
ここで、音声データとしては、例えば対応する映像データ中の人物などにより発せられる音声のデータが用いられ、例えば当該映像データと時間軸で対応する。
また、単数の人物のデータについては、例えば当該人物が発するとみなされる音声が連続的に続く間の時間幅或いは所定の閾値未満の無音声期間を除いて当該人物が発するとみなされる音声が連続的に続く間の時間幅などを、当該人物のデータに対応した音声データが有効である時間幅として決定することができる。
【００２６】
同様に、複数の人物のデータについては、例えばこれら複数の人物の中の少なくとも一人の人物が音声を発しているとみなされる状態が連続的に続く間の時間幅或いは所定の閾値未満の無音声期間を除いてこのような状態が連続的に続く間の時間幅などを、当該複数の人物のデータに対応した音声データが有効である時間幅として決定することができる。
【００２７】
また、本発明に係る映像処理装置では、部分映像データ特定手段は、映像データのフレーム内で部分映像データが位置する領域を特定するデータを用いて当該部分映像データを特定する。
従って、例えばフレーム内での座標位置のデータなどを用いることにより、部分映像データを構成する各フレーム内の画像領域を特定して当該部分映像データを特定することができる。
【００２８】
また、本発明に係る映像処理装置では、部分映像データ特定手段は、部分映像データ候補特定手段により複数の部分映像データの候補を特定し、部分映像データ指定受付手段により特定した部分映像データ候補に含まれる部分映像データの指定をユーザから受け付け、そして、指定を受け付けた部分映像データを特定した部分映像データとする。
従って、映像処理装置により自動的に部分映像データの候補を複数特定した後に、これら複数の候補の中からユーザにより部分映像データを指定する仕方により、当該指定された部分映像データを最終的に特定した部分映像データとすることができる。
【００２９】
ここで、部分映像データの候補の数としては、種々な数が用いられてもよく、例えば単数である場合があってもよい。
また、部分映像データの候補を特定する仕方としては、種々な仕方が用いられてもよく、例えば映像データのフレーム内に存する各対象毎のデータをそれぞれ部分映像データの候補として特定することができる。
また、部分映像データ指定受付手段としては、例えばユーザにより操作されるキーボードやマウスなどを用いることができる。
【００３０】
また、本発明に係る映像処理装置では、関連部分映像データ特定手段が、部分映像データに関連付けられたデータから当該部分映像データを特定する。
従って、例えば部分映像データに関連付けられたデータがユーザにより指定された場合などに、当該データが関連付けられた当該部分映像データを特定することができる。
【００３１】
また、本発明に係る映像処理装置では、関連データ提示手段が、部分映像データに関連付けられたデータの存在を示すデータを、映像データ中の当該部分映像データと視覚的に関連付けて提示する。
従って、部分映像データに関連付けられたデータの存在を当該部分映像データと視覚的に関連付けて提示することができ、これにより、当該関連付けられたデータの存在や当該関連付けをユーザに対して視覚的に把握可能とすることができる。
【００３２】
ここで、部分映像データに関連付けられたデータの存在を示すデータとしては、例えばアイコンのデータを用いることができ、また、後述するように種々なデータを用いることができる。
また、部分映像データに関連付けられたデータの存在を示すデータと当該部分映像データとを視覚的に関連付ける仕方としては、種々な仕方が用いられてもよく、例えばこれらのデータを近隣に配置する仕方や、これらのデータの一部を重ねて配置する仕方などを用いることができる。
また、提示の仕方としては、例えば画面に表示出力する仕方や、紙面に印刷出力する仕方などを用いることができる。
【００３３】
また、本発明に係る映像処理装置では、関連データ提示手段は、部分映像データに関連付けられたデータの存在を示すデータとして、当該部分映像データの形状に基づく形状を有するデータを提示する。
従って、部分映像データの形状に基づく形状を有するデータを提示することにより、当該データと当該部分映像データとの関連付けをユーザにより視覚的に把握し易くすることができる。
【００３４】
ここで、部分映像データの形状に基づく形状を有するデータとしては、種々なデータが用いられてもよく、例えば部分映像データの形状に基づく形状を有する影のデータなどを用いることができる。
【００３５】
また、本発明に係る映像処理装置では、関連データ提示手段は、部分映像データに関連付けられたデータの存在を示すデータとして、映像データのフレームの外側であって当該フレームの外側に設けられた枠の内側に、当該部分映像データのフレーム内での水平位置を示すデータ及び垂直位置を示すデータを提示する。
従って、映像データのフレーム内ではなくフレーム外に設けられた枠に、部分映像データに関連付けられたデータの存在を示すデータが提示されるため、フレーム内の画像をそのまま見易いものとすることができる。また、提示されるデータにより、部分映像データのフレーム内での水平位置及び垂直位置を示すことができる。
【００３６】
映像データのフレームの外側に設けられた枠としては、種々な枠が用いられてもよく、例えば映像データのフレームと比べて一回り大きいような枠が用いられ、当該フレームの外側であって当該枠の内側には映像データは提示されない。
また、部分映像データは、その水平位置での垂直線とその垂直位置での水平線とが直交する位置に存することとなる。
【００３７】
また、本発明に係る映像処理装置では、部分映像データに関連付けられたデータの存在を示すデータと所定の処理とが対応付けられている。そして、提示データ指定受付手段が提示されたデータ（部分映像データに関連付けられたデータの存在を示すデータ）の指定をユーザから受け付け、提示データ対応処理実行手段が指定を受け付けたデータに対応付けられた処理を実行する。
従って、ユーザは、提示されたデータを指定することにより、当該データに対応付けられた処理を実行させることができる。
【００３８】
ここで、所定の処理としては、種々な処理が用いられてもよく、例えば提示されたデータに関連する文書処理やメールやインターネットなどに関するプログラムを起動する処理や、これにより提示されたデータに関連するデータを表示や送信などする処理などを用いることができ、更に具体的には、例えば提示されたデータに関連するデータを画面上に表示する処理や、当該データを電子メールにより設定されたアドレスに対して送信する処理や、当該データを電話により設定された電話番号に対して音声送信する処理などを用いることができる。
また、提示データ指定受付手段としては、例えばユーザにより操作されるキーボードやマウスなどを用いることができる。
【００３９】
また、本発明に係る映像処理装置では、複数の端末装置により同一の映像データに関する操作を実行することが可能である。
従って、例えば一つの端末装置（例えば一人のユーザ）により同一の映像データに関する操作を実行することばかりでなく、複数の端末装置（例えば複数のユーザ）により同一の映像データに関する操作を実行することができ、これにより、同一の映像データに関する部分映像データや当該部分映像データに関連付けられるデータなどを共有することや共同で編集することなどができる。
【００４０】
ここで、端末装置としては、種々な装置が用いられてもよく、例えばコンピュータを用いることができる。
また、複数の端末装置の数としては、種々な数が用いられてもよい。
また、同一の映像データに関する操作としては、種々な操作が用いられてもよく、例えば映像データから部分映像データを特定する操作や、特定された部分映像データにデータを関連付ける操作などを用いることができる。
【００４１】
また、一構成例として、複数の端末装置は有線や無線のネットワークなどを介して通信可能に接続され、これら複数の端末装置によりアクセス可能な共通の記憶装置が設けられて、当該記憶装置に操作対象となるデータが保存される。
【００４２】
また、本発明に係る映像処理装置では、複数関連データ提示手段が、映像データから特定された当該映像データの部分である部分映像データに関連付けられた複数のデータの存在を示すデータを、当該映像データ中の当該部分映像データと視覚的に関連付けて提示する。
従って、部分映像データに関連付けられた複数のデータの存在を当該部分映像データと視覚的に関連付けて提示することができ、これにより、当該関連付けられた複数のデータの存在や当該関連付けをユーザに対して視覚的に把握可能とすることができる。
【００４３】
ここで、部分映像データに関連付けられた複数のデータの数としては、種々な数が用いられてもよい。
また、部分映像データに関連付けられた複数のデータの存在を示すデータとしては、例えば部分映像データに単数のデータが関連付けられた場合とは異なるデータが用いられ、更に好ましい態様例として、部分映像データに関連付けられた複数のデータの数を表すデータが用いられる。
【００４４】
また、本発明に係る映像処理装置では、複数関連データ提示手段は、部分映像データに関連付けられた複数のデータの存在を示すデータとして、当該関連付けられたデータの数と同数のデータを提示する。
従って、部分映像データに関連付けられたデータの数をユーザにより視覚的に把握可能に提示することができる。
【００４５】
ここで、部分映像データに関連付けられたデータの数と同数のデータとしては、好ましい態様例としてそれぞれが同一又は類似の形状を有するデータを用いることができ、或いは、例えばそれぞれが異なる形状を有するデータが用いられてもよい。
【００４６】
また、本発明に係る映像処理装置では、複数関連データ提示手段は、部分映像データに関連付けられた各データの存在を示すデータを当該関連付けられた各データ毎に識別可能な態様で提示する。
従って、部分映像データに関連付けられた各データ毎に、その存在を示すデータをユーザにより視覚的に識別可能とすることができる。
【００４７】
ここで、部分映像データに関連付けられた各データ毎にその存在を示すデータが識別可能な態様としては、例えば当該各データ毎にその存在を示すデータの形状や色や輝度や配置位置などを異ならせるような態様を用いることができる。
【００４８】
なお、以上に示した本発明と同様に同一の画像のデータに関連付けられた複数のデータの存在を示すデータを当該画像データと視覚的に関連付けて提示する技術や当該複数と同数のデータを提示する技術や当該各データ毎に識別可能とする技術などは、必ずしも映像データから特定された部分映像データに限られずに、種々な画像データに適用することが可能であり、例えば当該画像データとして静止画像のデータに適用することも可能である。
【００４９】
また、本発明では、以上に示したような各種の処理を実現する映像処理方法を提供する。
例えば、本発明に係る映像処理方法では、映像データから当該映像データの部分である部分映像データを特定し、特定した部分映像データに対してデータを当該データの存在を提示可能なように関連付ける。
また、本発明に係る映像処理方法では、映像データから特定された当該映像データの部分である部分映像データに関連付けられた複数のデータの存在を示すデータを、当該映像データ中の当該部分映像データと視覚的に関連付けて提示する。
【００５０】
また、本発明では、以上に示したような各種の処理を実現するプログラムを提供する。なお、本発明では、このようなプログラムを格納した記憶媒体を提供することも可能である。
例えば、本発明に係るプログラムでは、映像データから当該映像データの部分である部分映像データを特定する処理と、特定した部分映像データに対してデータを当該データの存在を提示可能なように関連付ける処理とをコンピュータに実行させる。
また、本発明に係るプログラムでは、映像データから特定された当該映像データの部分である部分映像データに関連付けられた複数のデータの存在を示すデータを当該映像データ中の当該部分映像データと視覚的に関連付けて提示する処理をコンピュータに実行させる。
【００５１】
【発明の実施の形態】
本発明に係る実施例を図面を参照して説明する。
まず、本発明の第１実施例に係る映像処理装置や映像処理方法を説明する。
図１は、本発明に係る映像処理装置の一例を示すブロック図である。映像処理装置１は、記憶部１１と、リンク対象領域指定部１２と、リンク生成部１３と、映像提示部１４と、リンク管理部１５とから構成される。
【００５２】
記憶部１１は、一般的な記憶装置から構成され、リンク（関連付け）される一方の対象となる映像データ（以下、単に映像データと記述することもある）およびリンク・データ（関連付けに関するデータ）並びにリンクされるもう一方の対象となる被リンク・データを保持する。
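The link target region designation unit 12 consists of a coordinate input device such as a mouse or a digitizer. It receives from the user the coordinate data of the region in the video data that is to be linked (hereinafter also called link target region coordinate data) and outputs that data to the link generation unit 13.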
【００５３】
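The link generation unit 13 receives the identifier or name of the linked data, which the user enters through a dialog-style user interface. It then links the link target region coordinate data received from the link target region designation unit 12 with the linked data entered by the user, and outputs the result to the storage unit 11 as link data.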
映像提示部１４は、ディスプレイから構成され、視覚化されたリンク・データおよび映像データを利用者に提示する。
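The link management unit 15 manages and controls the storage unit 11, the link target region designation unit 12, the link generation unit 13, and the video presentation unit 14.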
【００５４】
本例では、映像データは、動画データおよび音声データが組み合わされたデータ、または、動画データまたは音声データのいずれか一方のデータを意味することとして、説明を行う。また、本例では、部分映像データは、映像データ中の時間的または空間的（領域的）な一部分のデータを意味する。
なお、本発明に言う映像データは、例えば画像のみのデータから構成され、また、例えば当該画像データに対して音声などのデータが対応付けられる場合も含む。
【００５５】
図２は、図１の映像処理装置を詳細化したブロック図である。
図２に示すように、記憶部１１は、映像記憶装置２１およびリンク・データ記憶装置２６から構成される。リンク対象領域指定部１２は、（任意）部分映像データ指定装置２３および部分映像データ提示装置２４から構成される。リンク生成部１３は、リンク・データ付加装置２５から構成される。映像提示部１４は、映像データ提示装置２２、部分映像データ提示装置２４、およびリンク・データ提示装置２７から構成される。
【００５６】
映像記憶装置２１は、一般的なメモリで構成され、入力された映像データを保持する。
映像データ提示装置２２は、ディスプレイで構成され、映像記憶装置２１に保持されている映像データを利用者に提示する。
【００５７】
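The partial video data designation device 23 consists of a coordinate input device such as a mouse. It designates an arbitrary part of the video data presented by the video data presentation device 22 and transfers the designated partial video data to the partial video data presentation device 24.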
部分映像データ提示装置２４は、部分映像データ指定装置２３から転送された部分映像データを提示する。
【００５８】
リンク・データ付加装置２５は、部分映像データ提示装置２４によって提示されている部分映像データに対して、リンク・データを付加し、リンク・データ記憶装置２６に転送する。
リンク・データ記憶装置２６は、リンク・データ付加装置２５によって付加されたリンク・データ、及び部分映像データを保持する。
リンク・データ提示装置２７は、リンク・データ付加装置２５によって付加されたリンク・データならびにリンク・データ群を提示する。
【００５９】
ここで、映像データ中の任意の部分映像データの抽出について説明する。
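Partial video data can be extracted from video data in several ways. In one, the user works through the user interface provided by the video processing device 1 and manually specifies the outline (contour) or the bounding rectangle of the partial video data on the image. In another, the user selects from candidates for partial video data that the video processing device 1 has extracted automatically.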
【００６０】
ここでは、映像処理装置１が自動的に部分映像データの候補を抽出する場合における部分映像データの抽出方法について説明する。
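Assume that the video data from which partial video data is to be extracted is as shown in Fig. 3. That is, in certain frames of the video data (Video.mpg) 31 (in this example, the 31 frames numbered 120 to 150), a person who should be extracted as a candidate for partial video data is recorded inside a rectangular region on the frame, expressed in x-y orthogonal coordinates as {(10,30),(10,10),(20,10),(20,30)}. The figure shows the x axis for the horizontal direction, the y axis for the vertical direction, and the t axis for the flow of time.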
【００６１】
この部分映像データ抽出手続きは、図４に示すように、各フレームにおける輪郭抽出処理（ステップＳ１）、各フレームにおける外接矩形計算処理（ステップＳ２）、フレーム間差分算出処理（ステップＳ３）、部分映像データ検出処理（ステップＳ４）、部分映像データ候補提示処理（ステップＳ５）からなる。
【００６２】
具体的には、まず、各フレームにおける輪郭抽出処理では、部分映像データの矩形領域を特定するために、映像処理装置１は、映像データ３１中の各フレームにおいて輪郭抽出処理を行う（ステップＳ１）。輪郭抽出は、通常の画像処理で用いられるいわゆる微分フィルタを用いることにより人の画像のエッジを抽出し、そのエッジを連結することで輪郭を抽出することが可能である。また、輪郭抽出処理によって、人が複数の小領域に分割されている場合でも、従来の領域分割・統合処理によって人単位の領域（輪郭）を抽出することが可能である。
【００６３】
次に、この人単位の輪郭を抽出した後、各フレームにおける外接矩形計算処理において当該輪郭を包含する外接矩形３３を算出する（ステップＳ２）。ここで、この外接矩形算出処理によって、フレーム番号１２０から１５０までの３１フレームにおいて、フレーム上の座標表現で{(10,30),(10,10),(20,10),(20,30)}の外接矩形３３を計算することができる。
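The bounding rectangle calculation of step S2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `bounding_rect` and the sample contour points are assumptions, and the corner order simply mirrors the {(10,30),(10,10),(20,10),(20,30)} notation used in the description.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int]


def bounding_rect(contour: Iterable[Point]) -> Tuple[Point, Point, Point, Point]:
    """Return the corners of the axis-aligned rectangle enclosing a contour.

    Corner order follows the description: (x1,y2), (x1,y1), (x2,y1), (x2,y2).
    """
    pts = list(contour)
    if not pts:
        raise ValueError("contour is empty")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    x1, x2 = min(xs), max(xs)
    y1, y2 = min(ys), max(ys)
    return ((x1, y2), (x1, y1), (x2, y1), (x2, y2))


# Hypothetical contour points of a person in one frame.
contour = [(12, 10), (10, 18), (15, 30), (20, 22), (17, 11)]
rect = bounding_rect(contour)
```

Applying the same function to every frame yields one rectangle per frame, which the later steps compare across frames.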
【００６４】
続いて、フレーム間差分算出処理および部分映像データ検出処理において、各フレームを比較して、同一の部分映像データ中の人を単一のオブジェクト（部分映像データ）として取り扱えるか否かの検査を行う（ステップＳ３、ステップＳ４）。つまり、ＭＰＥＧ２などで用いられているような各フレーム間のフレーム差分を計算することによって、或るフレームと次のフレームで記録されているものが同一か否かを判断する。
【００６５】
具体的には、フレーム間差分算出処理において（ステップＳ３）、フレーム番号１１９のフレームとフレーム番号１２０とのフレーム差分では、フレーム番号１１９のフレームには人が記録されず、フレーム番号１２０のフレームには人が記録されているため、フレーム差分の結果（例えば、各画素の差分の総和）は大きな値を持つことになる。同じように、フレーム番号１５０のフレームとフレーム番号１５１のフレームとのフレーム差分も大きな値を持つことになる。それに対して、フレーム番号１２０から１５０までのフレームでは、先の同一の矩形領域３３に人が記録されているため、そのフレームにおけるフレーム差分は小さな値を持つ。
【００６６】
部分映像データ検出処理において（ステップＳ４）、以上のフレーム差分の値と矩形領域３３が存在するか否かの情報から、フレーム番号１２０から１５０までのフレームには、部分映像データの候補となる人が記録されていることが分かる。
そこで、部分映像データ候補提示処理において、それらのフレームの当該矩形領域３３の部分を単一の部分映像データ３２として映像処理装置１の利用者に提示する。
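Steps S3 and S4 can be illustrated with the following sketch. It assumes that each frame has already been reduced to a bounding rectangle (or `None` when nothing was found) and that the frame difference is the sum of absolute pixel differences, which the description gives only as one example. The names `frame_difference`, `candidate_segments`, and the threshold value are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def frame_difference(a: Sequence[Sequence[int]], b: Sequence[Sequence[int]]) -> int:
    """Sum of absolute pixel differences between two equally sized frames."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def candidate_segments(
    rects: List[Optional[Rect]], diffs: List[int], threshold: int
) -> List[Tuple[int, int, Rect]]:
    """Group consecutive frames that show the same object.

    rects[i] is the bounding rectangle found in frame i (None if absent).
    diffs[i] is the difference between frame i-1 and frame i (diffs[0] unused).
    Returns (first_frame, last_frame, rect) for each candidate.
    """
    segments = []
    start = None
    for i, rect in enumerate(rects):
        boundary = i == 0 or diffs[i] > threshold or rect != rects[i - 1]
        if start is not None and (rect is None or boundary):
            segments.append((start, i - 1, rects[start]))
            start = None
        if rect is not None and start is None:
            start = i
    if start is not None:
        segments.append((start, len(rects) - 1, rects[start]))
    return segments


# Frames 120 to 150 contain the person; large differences occur at 120 and 151.
rects = [(10, 10, 20, 30) if 120 <= i <= 150 else None for i in range(160)]
diffs = [500 if i in (120, 151) else 0 for i in range(160)]
segments = candidate_segments(rects, diffs, threshold=100)
```

With these inputs the sketch reports a single candidate covering frames 120 to 150, matching the example in the description.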
【００６７】
次に、図５に示すように、本例に係る映像処理装置１の処理手順について説明する。
この処理手順は、映像の提示（ステップＳ１１）、部分映像の指定（ステップＳ１２）、部分映像の提示（ステップＳ１３）、リンク・データ付加（ステップＳ１４）、およびリンク・データ保存（ステップＳ１５）からなる。
【００６８】
まず、映像提示においては、映像データ提示装置２２は映像処理装置１の映像記憶装置２１に保持されている映像データを提示する（ステップＳ１１）。
次に、部分映像の指定においては、部分映像データ指定装置２３を使って利用者によって指定される映像データのいわゆるタイム・コードまたはフレーム番号、および座標データを取得する（ステップＳ１２）。
【００６９】
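Next, in the partial video presentation step, the partial video data presentation device 24 presents the partial video data designated by the user (step S13).
In the link data addition step, the user uses the link data addition device 25 to add related data (link data) to the partial video data presented by the partial video data presentation device 24 (step S14).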
最後に、リンク・データ保存においては、リンク・データ記憶装置２６は、利用者によって付加されたリンク・データ、および映像データのいわゆるタイム・コードまたはフレーム番号、および座標データを保持する（ステップＳ１５）。
【００７０】
図６は、リンク・データ記憶装置２６に記憶されるデータのデータ構造を示す。また、図７は、リンク・データ記憶装置２６に記憶されるデータの拡張されたデータ構造を示す。
リンク・データ記憶装置２６は、部分画像データのタイム・コード４１を保持し、リンク・データ付加装置２５によって入力された被リンク対象データ４３、部分映像データ提示装置２４で指定された任意の座標データ４２、記憶装置名４４、部分画像アイコン・データ４５を格納し、拡張されたデータ構造では、更に、協調作業などを行うために利用者データ４６を格納する。
【００７１】
例えば、タイム・コード４１には、或るフレームの静止画像部分からリンク・データが付加される場合は、部分映像データ提示装置２４の映像データのその地点のタイム・コードが記録される。また、「ここからここまで」という指定の場合は、リンク・データを付加する開始点と終了点の情報がタイム・コード４１に記録される。
【００７２】
座標データ４２は、部分映像データ提示装置２４に、リンク対象領域座標として(x1,y1)、(x1,y2)、(x2,y2)、( x2,y1)の２次元座標を与え、マウスなどの入力装置によってプロットされたテキスト・データや部分画像アイコンを、(x1,y1)、(x1,y2)、(x2,y2)、(x2,y1)として、リンク対象領域座標の２次元座標値を保持する。
被リンク対象データ４３は、コメントや電子データファイルの情報や、テキスト・データやファイル格納先の情報を保持する。
【００７３】
次に、図８の本例に係るユーザ・インタフェース例を用いて操作手順について説明する。
利用者は、映像データ提示画面５１に提示されている映像から、リンク・データを付加したい部分映像を指定することにより、部分映像データ提示画面５２に、指定した部分映像が映し出される。
【００７４】
部分映像データ提示画面５２上の利用者が指定した任意の場所において、リンク・データ付加画面５３から、当該提示画像に複数のテキスト・データによるコメントや電子データファイルを部分画像アイコン６１ａ〜６１ｃとして付加することが可能である。この場合、図６や図７が示すとおり、リンク・データが付加される指定時間をタイム・コード４１で保存し、部分映像データ提示画面５２上のどの部分に利用者がリンク・データを付加したかを示す場所情報を座標データ４２で保持し、付加したテキスト・データによるコメントや電子データファイルを被リンク対象データ４３として保持して、これら３つを１つのデータとして保持するデータ構造を持つ。
【００７５】
図９は、部分映像データにリンク・データが付加された後のデータ構造を示す。
図９は、タイム・コード(00:01:00.00)において、３つのリンク・データ（「質問について」ならびに「このコメントがポイント」のコメント(テキスト・データ)と“abc.mpg"という名称の関連映像データ）を保持していることを示す。このように、指定した任意の部分映像データ毎にデータが保持されているので、一時的に付加されたリンク・データを部分映像データ上から消去することも可能な構造を持つ。
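The record layout of Figs. 6, 7, and 9 can be sketched as a small data class. The field names, the coordinate values, and the helper functions below are assumptions made for illustration; only the reference numerals in the comments and the three example items come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LinkRecord:
    time_code: str                       # time code 41 (start point)
    coords: Tuple[int, int]              # coordinate data 42 (hypothetical values below)
    target: str                          # linked data 43: comment text or file name
    storage: str = "Private"             # storage device name 44 (Public / Private)
    icon: Optional[str] = None           # partial image icon data 45
    user: Optional[str] = None           # user data 46 (extended structure only)
    end_time_code: Optional[str] = None  # set for "from here to here" designations


def links_at(records: List[LinkRecord], time_code: str) -> List[LinkRecord]:
    """All link data attached to the partial video at the given time code."""
    return [r for r in records if r.time_code == time_code]


def remove_link(records: List[LinkRecord], time_code: str, target: str) -> List[LinkRecord]:
    """Return a copy of the records without the matching link data."""
    return [r for r in records if not (r.time_code == time_code and r.target == target)]


records = [
    LinkRecord("00:01:00.00", (40, 25), "質問について"),
    LinkRecord("00:01:00.00", (40, 25), "このコメントがポイント"),
    LinkRecord("00:01:00.00", (42, 27), "abc.mpg"),
]
remaining = remove_link(records, "00:01:00.00", "abc.mpg")
```

Because each item is its own record keyed by time code, a temporarily attached item can be dropped without touching the others.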
【００７６】
また、保管先として、記憶装置先を指定できる。これは、リンク・データを付加した映像データを記憶装置名４４のように公開（Public）サーバまたは非公開（Private）サーバに保管する場合、および複数の利用者間で同一の映像データに対して協調してリンク・データの付加作業を行う場合に利用される。
【００７７】
さらに、図８に示されるように、部分映像データ提示画面５２上の指定した任意の領域に同一の関連したデータとしてリンク・データを付加する場合、テキスト・データのコメントや関連電子データなどのリンク・データの重ね合わせが可能である。ここでは、座標位置に対して、アイコンやコメントが重なるだけでなく、関連したリンク・データをグループとして登録することも可能である。図９のリンク・データの“＊"を付したものが、グループ化された情報で保持される。
【００７８】
図１０に示すように、部分映像データ提示画面５２上の人物のオブジェクト７１や場所のオブジェクト７２に対してリンク・データを付加した場合、当該オブジェクト７１、７２にリンク・データを付加するとメッセージを送ることが可能である。
例えば、人物７１に対してリンク・データを付加する場合は、人物用メッセージ送付用リンク・データ７３を用いる。これによると、会議を行った映像データに対して或る参加者に質問するには、利用者は、部分映像データ提示画面上に表示される参加者に対してコメントと電子メールアドレスを付加することにより、当該参加者にメッセージを送ることが可能である。また、当該メッセージを送付する場合、コメントのみならず利用者がリンク・データを付加した当該リンク・データも送付することが可能である。これにより、どのような状況での質問であるかや、指定時間やその場の状況を端的かつ適切に把握することが可能となる。
【００７９】
また、場所のオブジェクトに対してリンク・データを付加する場合は、場所空間用メッセージ送付用リンク・データ７４を用いる。利用方法としては次のような想定をしている。すなわち、指定した任意の部分映像データが重要な人物のコメントを保持していて、その情報を未来の会議で利用したい場合に、会議が行われる場所に対して電子メールのようなメッセージサービスを利用してデータを送付する。実際に利用するときは、当該場所にある端末または利用者の端末を利用して開示する。
【００８０】
被リンク対象データ提示画面５４には、当該映像データの中で利用者によって付加されたリンク・データ付きの部分映像データ提示画面６２ａ〜６２ｅが複数提示されている。また、被リンク対象データ提示画面５４に提示されたリンク・データ群として、当該映像データから抽出したリンク・データのみならず、当該映像データ以外のリンク・データも指定可能である。
【００８１】
次に、複数の利用者間でネットワークを介して映像データやリンク・データを付加する手順について説明する。
図１１は、複数の利用者間で利用する場合に主に利用される装置およびユーザ・インタフェースの一例を示す。
利用者Ａおよび利用者Ｂは、映像記憶装置２１から映像データを取り出して、リンク・データを付加する任意の部分映像データを指定する。
【００８２】
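In Fig. 11, for the same partial video data, user A uses the link data addition input dialog (link data addition screen) 53 to add one item of link data, 「この人はＸ氏」 ("This person is Mr. X"), and user B uses the link data addition input dialog 53 to add two items of link data, 「この会話の関連映像」 ("Related video for this conversation", text data) and "xyz.mpg" (video data). These data are held in the link data storage device 26. Their data structure is shown in Fig. 12, which holds the time codes, coordinate data, and so on of user A and user B separately. Fig. 13 is a conceptual view in which the partial image icons 81a and 83a to 83c, representing the link data added by user A and user B, are presented at the same time. The linked-data presentation screen 54 of each of users A and B shows that user's partial image data with link data, 82a to 82c and 84a to 84d.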
【００８３】
また、事前に利用者Ａがリンク・データを付加する部分映像データを指定して、後で利用者Ｂに対して、当該部分映像データの存在場所を電子メールなどで伝えて、リンク・データを付加するような、非同期的な協調作業が可能である。さらに、利用者単体もしくは複数の利用者間で、リンクデータ記憶装置２６にアクセスして、事前に作成した部分映像データやリンク・データの再編集が可能である。
【００８４】
また、利用者単体または複数の利用者間で、リンク・データを付加、保持ならびに提示可能とするために、図１４のような構成を用いることが可能である。この構成では、利用者Ａの端末装置と、利用者Ｂの端末装置と、利用者間で共有されるリンク・データ記憶装置９８と、利用者間で共有される映像記憶装置９７とがネットワークを介して接続されている。また、各利用者Ａ、Ｂの装置には、それぞれ、映像データ提示装置９１ａ、９１ｂと、（任意）部分映像データ指定装置９２ａ、９２ｂと、部分映像データ提示装置９３ａ、９３ｂと、リンク・データ付加装置９４ａ、９４ｂと、リンク・データ提示装置９５ａ、９５ｂと、リンク・データ記憶装置９６ａ、９６ｂとが備えられている。
【００８５】
図１５および図１６を参照して、被リンク対象データと映像データで構成される事前に作成した映像データ１と映像データ２を再利用して合成された映像データを作成する場合について説明する。
図１５は、映像データ１０１に被リンク対象データ１０２がリンクされている様子を示す。
【００８６】
図１６は、本例の映像処理装置１を用いて、映像データを再利用して編集する例を示す。
或る会議が開催される前に会議の主催者などは、これまでの経緯を短時間のうちに理解ならびに参加者間で共有するために、この会議に関連する事前に作成した映像データ１と映像データ２にアクセスして、個々の被リンク対象データ１１４ａ、１１５ａ、１１５ｂ、１１６ａ〜１１６ｃ、１２４ａ、１２５ａ〜１２５ｃ、１２６ａである会議議事録や資料を閲覧しながら、複数ある映像データ１１１〜１１３、１２１〜１２３の中からもっとも関連する映像データを取り出して並べ替えなどの編集を行って、合成された映像データを制作することが可能である。
【００８７】
続いて、映像データからリンク・データの対象となる映像フレームを自動抽出する処理について説明する。
リンク・データ付加装置２５は、前述のとおり利用者が部分映像データを指定することを可能とする以外に、指定した任意の部分映像データの後の映像データ中の動画データや当該動画データ中の動画オブジェクトならびに音声データを解析して、リンク・データを指定した部分映像データの当該フレーム上で、同一人物の発言と推定される部分音声データならびに複数の人物間の対話で同一の内容であると推定される部分音声データを抜き出して、抜き出した部分音声データに対応した部分の映像データ（部分映像データ）とリンク・データを付加する。
【００８８】
例えば、図１７は、同一人物の発言推測の開始点と終了点を抽出する一例である。
この場合、時間ｔの軸に対して連続した複数のフレームＦ１〜Ｆ７の中で、被リンク対象データを付加したいフレーム（例えば、フレームＦ１或いはフレームＦ４）の音声データが次のフレーム（例えば、フレームＦ２或いはフレームＦ７）をまたがっていてそして音声データが途切れるところまでを発言推測箇所Ｔ１、Ｔ２として提示して、その音声データの開始点と終了点にあたる映像フレームを被リンク対象データを付加するフレームとする。
【００８９】
図１８は、複数の人物間の対話推測の開始点と終了点を抽出する一例である。この場合も、時間ｔの軸に対して連続した複数のフレームＦ１１〜Ｆ１７の中で、図１７の場合と同様に、対話推測箇所Ｔ１１〜Ｔ１４を抽出する。ただし、本例では、この場合には、対話中に生じる会話Ｔ２１、Ｔ２２の間の部分の時間をΔｔとして、図１９に示すとおり、当該Δｔが或る一定の間隔より短ければ、これを同一の対話部分と推測する。
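The Δt rule of Fig. 19 amounts to merging speech intervals whose separating silence is shorter than a fixed limit. The sketch below assumes the voiced intervals have already been detected and are given as (start, end) pairs in seconds; `merge_utterances` and `max_gap` are hypothetical names.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def merge_utterances(intervals: List[Interval], max_gap: float) -> List[Interval]:
    """Merge voiced intervals separated by a silence shorter than max_gap."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] < max_gap:
            # Silence is short enough: treat as the same dialogue part.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Two remarks half a second apart, then a third after a three-second pause.
dialogue = merge_utterances([(0.0, 2.0), (2.5, 4.0), (7.0, 8.0)], max_gap=1.0)
```

The start and end of each merged interval then give the first and last video frames to which the linked data would be attached.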
【００９０】
次に、本発明の第２実施例に係る映像処理装置や映像処理方法を説明する。
なお、本例の映像処理装置１の概略的な構成や動作は、例えば上記した第１実施例で示したものと同様であり、本例では、異なる部分について詳しく説明する。
【００９１】
図２０は、本例に係るリンク・データのデータ構造の一例を示すブロック図である。
本例のリンク・データは、識別子１３１、映像データのファイル名１３２、フレーム開始番号１３３、フレーム終了番号１３４、リンク対象領域座標１３５、被リンク対象データ名（例えばＵＲＬ）１３６、および視覚的フィードバック・データ１３７から構成される。
【００９２】
識別子１３１は、リンク・データ自身を区別するためのデータであり、リンク管理部１５によって、リンク・データ毎に割り当てられる。
映像データのファイル名１３２は、リンクの対象となる映像データを特定する。
フレーム開始番号１３３は、係る映像データのリンク対象となるフレームの開始番号である。
フレーム終了番号１３４は、係る映像データのリンク対象となるフレームの終了番号である。
【００９３】
リンク対象領域座標１３５は、利用者によって指定される映像データ中のリンク対象となる座標データである。
被リンク対象データ名１３６は、係る映像データにリンクされるデータの名前である。
視覚的フィードバック・データ１３７は、映像データに対してリンクが存在することを利用者に視覚的にフィードバックするために利用されるデータである。
【００９４】
ここで、識別子１３１は、リンク管理部１５によって設定される。
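The video data name 132, the frame start number 133, the frame end number 134, and the link target region coordinates 135 are entered by the user through the link target region designation unit 12.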
また、被リンク対象データ名１３６は、リンク生成部１３によって利用者からダイアログ形式のユーザ・インタフェースを使って入力される。
視覚的フィードバック・データ１３７は、リンク生成部１３によって生成される。
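The seven fields of Fig. 20 map naturally onto a record type. The following sketch is one possible representation, not the format used by the device: the attribute names are assumptions, the defaults follow the initialisation described for step S21 only loosely, and the sample values are the ones the description uses in its own worked example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LinkData:
    identifier: str                       # 131, assigned by the link management unit
    video_file: str                       # 132
    start_frame: int = 0                  # 133
    end_frame: int = 0                    # 134
    region: List[Tuple[int, int]] = field(default_factory=list)  # 135
    target_name: str = ""                 # 136, for example a file name or URL
    visual_feedback: str = ""             # 137, generated by the link generation unit

    def frame_count(self) -> int:
        """Number of frames covered, counting both end points."""
        return self.end_frame - self.start_frame + 1


link = LinkData(
    identifier="LINK001",
    video_file="Video.mpg",
    start_frame=120,
    end_frame=150,
    region=[(10, 30), (10, 10), (20, 10), (20, 30)],
    target_name="Annotation.txt",
    visual_feedback="Visual.dat",
)
```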
【００９５】
図２１は、本例に係る主要なユーザ・インタフェースを示す図である。
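The main user interface 141 consists of a video presentation screen 142, a video play button 143, a video stop button 144, a link start button 145, a link end button 146, and a linked-data name input dialog 147.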
【００９６】
映像提示画面１４２は、記憶部１１に保持されている映像データを利用者に対して提示する。
映像再生ボタン１４３は、利用者がマウスなどでクリックすることにより、係る映像データの再生を開始することを可能とする。
The video stop button 144 lets the user stop playback of the video data by clicking it with a mouse or similar device.
【００９７】
リンク開始ボタン１４５は、利用者がマウスなどでクリックすることにより、リンクされるべき係る再生中の映像データの開始フレームを指定することを可能とする。
リンク終了ボタン１４６は、利用者がマウスなどでクリックすることにより、リンクされるべき係る再生中の映像データの終了フレームを指定することを可能とする。
The linked-data name input dialog 147 lets the user enter, through a dialog, the name of the linked data that should be linked to the video data.
【００９８】
図２２は、本例の映像処理装置１のリンク付け処理の一例を示すフローチャート図である。
図２２に示すように、リンク付け処理は、初期化処理（ステップＳ２１）、映像再生検知処理（ステップＳ２２）、リンク開始検知処理（ステップＳ２３）、リンク終了検知処理（ステップＳ２４）、リンク対象領域定義処理（ステップＳ２５）、被リンク対象入力処理（ステップＳ２６）、リンク生成処理（ステップＳ２７）、リンク提示処理（ステップＳ２８）、および映像停止検知処理（ステップＳ２９）から構成される。
【００９９】
続いて、図２２のフローチャート図を用いて、本例の映像処理装置１の処理手順を説明する。
まず、初期化処理においては、映像処理装置１の記憶部１１、リンク対象領域指定部１２、リンク生成部１３、映像提示部１４、およびリンク管理部１５の各部が初期化される（ステップＳ２１）。
【０１００】
つまり、まず、リンク管理部１５によってリンク・データが生成され、初期化される。具体的には、ダイアログ入力を用いるなどして、記憶部１１に保持されている利用する映像データのファイル名を映像データ名１３２の値としてリンク・データに設定する。また、リンク・データの識別子１３１は、リンク管理部１５により、当該映像処理装置１に固有な識別子が設定される。リンク・データのフレーム開始番号１３３およびフレーム終了番号１３４は既定値として０などの値がリンク管理部１５によって設定される。同じく、リンク・データのリンク対象領域座標１３５、被リンク対象データ名１３６、および視覚的フィードバック・データ１３７の既定の値がリンク管理部１５によって設定される。リンク管理部１５によって生成されたリンク・データは記憶部１１によって保持される。
【０１０１】
次に、映像再生検知処理においては、利用者によって指定された映像データに対して、利用者からのマウスまたはタブレットを使用した映像再生ボタン１４３のクリックを検知することによって再生を開始する（ステップＳ２２）。
続いて、リンク開始検知処理においては、リンク管理部１５は、利用者からのリンク開始ボタン１４５のクリックを検知することにより、映像データに対するリンク領域を定義するためのフレーム開始番号を決定し、その値をリンク・データのフレーム開始番号１３３として設定する（ステップＳ２３）。
【０１０２】
続いて、リンク終了検知処理においては、リンク管理部１５は、利用者からのリンク終了ボタン１４６のクリックを検知することにより、映像データに対するリンク領域を定義するためのフレーム終了番号を決定し、その値をリンク・データのフレーム終了番号１３４として設定する（ステップＳ２４）。ここで、リンク管理部１５は、映像データの再生を一時停止する。
【０１０３】
リンク対象領域定義処理においては、リンク管理部１５は、まず、利用者に対して、映像提示画面１４２にリンク対象領域の定義が可能であることを、そのメッセージと映像データを重ね合わせるなどして通知する。また、リンク対象領域指定部１２は、利用者からのマウスによる指定によって映像提示画面１４２に提示されている映像データのリンク対象となる領域の座標データを取得する。ここで、リンク対象領域指定部１２は、利用者の指定した領域を白線で囲むなどの視覚的フィードバックを利用者に与える。リンク対象領域指定部１２は、利用者から取得したリンク対象領域を定義する座標データ（以下、リンク対象領域定義座標データと記述することもある。）を、記憶部１１に保持されているリンク・データのリンク対象領域座標１３５の値として設定する（ステップＳ２５）。
【０１０４】
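In the link target input process, the link management unit 15 obtains the linked-data name that the user specifies in the linked-data name input dialog and sets it as the value of the linked-data name 136 of the link data held in the storage unit 11 (step S26).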
リンク生成処理においては、リンク生成部１３は、記憶部１１に保持されているリンク・データのリンク対象領域座標１３５の値およびフレーム開始番号１３３からフレーム終了番号１３４までに対応する映像データから、利用者に対する視覚的フィードバック用の画像データおよび関連座標データを生成する。当該画像データおよび関連座標データをリンク・データの視覚的フィードバック・データとして設定する（ステップＳ２７）。
【０１０５】
リンク提示処理においては、リンク・データの視覚的フィードバック・データ１３７の関連座標データを用いて、映像データに係る画像データを重ね合わせて映像提示画面１４２に提示する（ステップＳ２８）。
映像停止検知処理においては、利用者が映像停止ボタンをマウスでクリックしたか否かを検知し、クリックした場合においては、映像データの提示を停止し、リンク付け処理を終了する（ステップＳ２９）。一方、クリックをしていない場合においては、再度、リンク開始検知処理以降の処理を行う（ステップＳ２３〜ステップＳ２９）。
【０１０６】
ここで、図２３から図３１を用いて、リンク対象領域定義処理（ステップＳ２５）およびリンク生成処理（ステップＳ２７）を詳述する。
図２３は、部分映像データとしてリンクされる映像オブジェクト（文字「Ｙ」のロゴ）１５１の例を示す。
この図２３は、記憶部１１に保持されているフレーム開始番号１３３からフレーム終了番号１３４までに対応する映像データ（各フレームにおいては、静止画像データ）を示す。
【０１０７】
図２４は、利用者によるマウス操作によって、当該映像オブジェクト１５１が選択されている様子を枠１５２によって映像提示画面１４２を使って利用者に提示している図を示す。
図２５は、当該映像オブジェクト１５１を画像処理により斜めに倒した映像オブジェクト１５３の図を示す。
図２６は、図２５の斜めにされた映像オブジェクト１５３をエッジ抽出（境界抽出）および色変換の画像処理により影データ１５４を生成した図を示す。
【０１０８】
図２７は、図２３のオリジナルの映像オブジェクト１５１と図２６の影データ１５４とを合成した図を示す。
図２８は、映像提示画面１４２を使って利用者に提示すべき領域を図２７のデータから抽出した図を示す。
【０１０９】
リンク対象領域定義処理（ステップＳ２５）では、前述したように、まず、映像オブジェクトに対してリンク付け可能であることを、映像提示画面１４２の枠の色を変更するあるいはリンク開始ボタン１４５の色を変更するなどして、利用者に通知する。
次に、利用者は映像提示画面１４２に提示されている図２３の映像オブジェクト１５１を参照しつつ、マウスを操作して、リンク付けを行うべき映像オブジェクト（ここでは、「Ｙ」のロゴ）１５１を選択する。選択結果は、図２４に示す枠１５２で示される。
【０１１０】
この枠１５２を表現する座標（例えば、左上角および右下角の座標）はリンク対象領域定義座標データとして、記憶部１１に保持されているリンク・データのリンク対象領域座標１３５に設定される。
続いて、リンク生成処理においては（ステップＳ２７）、図２４の画像を射影変換するなどの画像処理を行い図２３のオリジナルの映像オブジェクト１５１と区別可能にする。さらに、斜めにされた映像オブジェクト１５３に対する微分フィルタを用いた輪郭抽出などをして、映像オブジェクト１５３と背景の境界を決定し、映像オブジェクト１５３の領域の色変換を行うことによって、影データ１５４を生成する。
【０１１１】
さらに、図２３のオリジナルの映像オブジェクト１５１と生成した影データ１５４を合成することによって図２７の画像を得る。
最後に、利用者に対して映像提示画面１４２に提示すべき領域のクリッピング処理を行うことによって視覚的フィードバック・データ１３７を生成する。当該クリッピングされた影データ１５５の背景ないしオリジナルの映像オブジェクト１５１との境界の座標値を関連座標データとして、影データ１５４とともに記憶部１１において保持されている視覚的フィードバック・データ１３７として設定する。続いて、リンク生成処理では（ステップＳ２７）、図２８の映像を映像提示画面１４２に提示する。
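The shadow generation of Figs. 25 to 27 can be approximated in a few lines. Note that this sketch swaps in a simple horizontal shear for the projective transform mentioned in the description, works on a binary mask instead of real image data, and skips the edge extraction and clipping steps. `shear_mask` and `composite` are hypothetical names.

```python
from typing import List

Mask = List[List[int]]


def shear_mask(mask: Mask, shear: float) -> Mask:
    """Slant a binary mask: rows higher above the bottom row shift further right."""
    h, w = len(mask), len(mask[0])
    max_shift = int(round(shear * (h - 1)))
    out = [[0] * (w + max_shift) for _ in range(h)]
    for y, row in enumerate(mask):
        shift = int(round(shear * (h - 1 - y)))
        for x, v in enumerate(row):
            if v:
                out[y][x + shift] = 1
    return out


def composite(mask: Mask, shadow: Mask) -> Mask:
    """Overlay the object on its shadow: 2 = object, 1 = shadow only, 0 = background."""
    out = []
    for y, srow in enumerate(shadow):
        row = []
        for x, s in enumerate(srow):
            is_object = x < len(mask[y]) and mask[y][x]
            row.append(2 if is_object else (1 if s else 0))
        out.append(row)
    return out


obj = [[1, 1], [1, 1], [1, 1]]           # stand-in for the selected video object
shadow = shear_mask(obj, shear=1.0)       # slanted copy, as in Fig. 25
combined = composite(obj, shadow)         # object over shadow, as in Fig. 27
```

Using a different `shear` value and a different colour code per link would give the distinguishable shadows that the description uses when one object carries several links.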
【０１１２】
ここで、１つの映像オブジェクトに複数のリンク付けをする場合のリンク生成処理（ステップＳ２７）について説明する。
１つの映像オブジェクトに対してリンク付けを複数行う場合には、図２９に示すように、異なる角度で斜めにした映像オブジェクトを複数生成し、影の色を変更するなどしてそれぞれの影データ１５６ａ、１５６ｂを生成することにより、利用者にそれぞれのリンクを区別可能とする。その影データ１５６ａ、１５６ｂを図３０に示すように図２３のオリジナルの映像データ１５１と重ね合わせ、さらに図３１のようにクリッピング処理を行うことによって、当該クリッピング処理後の映像１５７ａ、１５７ｂを用いて視覚的フィードバック・データ１３７を生成する。
【０１１３】
続いて、利用者が視覚的フィードバックによって提示されている影データをマウスによって指定することでリンク対象提示を指示した場合のユーザ・インタフェースの様子について説明する。
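First, when a link is associated with a video object presented on the video presentation screen 142, the shadow data is displayed superimposed on it as described above. When the user clicks this shadow data with the mouse, the link management unit 15 determines whether the click matches or falls within the identifier 131, the video data name 132, the frame start number 133, the frame end number 134, and the visual feedback data 137 of the link data. If it does, the value of the linked-data name 136 is displayed in the linked-data name input dialog 147 so that the user can access the linked data. Alternatively, the content named by the linked-data name 136 is presented in a separate window or display (for example, by splitting the video presentation screen 142 and using one of the resulting panes).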
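The click test described above can be sketched as a lookup over the stored link data. The sketch assumes that the related coordinate data of the visual feedback is reduced to a rectangle; the description does not fix that representation. The `Link` type, `find_clicked_target`, and the rectangle values are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Link(NamedTuple):
    identifier: str
    video_file: str
    start_frame: int
    end_frame: int
    feedback_rect: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) of the shadow
    target_name: str


def find_clicked_target(
    links: List[Link], video_file: str, frame: int, x: int, y: int
) -> Optional[str]:
    """Return the linked-data name whose shadow was clicked, or None."""
    for link in links:
        x1, y1, x2, y2 = link.feedback_rect
        if (
            link.video_file == video_file
            and link.start_frame <= frame <= link.end_frame
            and x1 <= x <= x2
            and y1 <= y <= y2
        ):
            return link.target_name
    return None


links = [Link("LINK001", "Video.mpg", 120, 150, (20, 10, 28, 30), "Annotation.txt")]
hit = find_clicked_target(links, "Video.mpg", frame=130, x=24, y=15)
miss = find_clicked_target(links, "Video.mpg", frame=151, x=24, y=15)
```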
【０１１４】
以上の説明においては、映像データおよび被リンク対象データが同一の映像処理装置１の記憶部１１に保持されていることを前提に説明したが、映像データまたは被リンク対象データが例えばネットワークを介して係る映像処理装置１に接続されて、映像処理装置１から当該映像データまたは被リンク対象データにアクセスするように構成することも可能である。この場合、図２０の映像データ名１３２または被リンク対象データ名１３６は、それぞれ、映像データのアクセス先を表現するいわゆるＵＲＬまたは被リンク対象データのアクセス先を表現するＵＲＬとして構成することができる。
【０１１５】
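The description so far has also assumed that the video data and the linked data are held in the storage unit 11 of the same video processing device 1. As shown in Fig. 32, however, a client 161 and a server 162 can be connected through a network 163, and the functions of the units of the video processing device 1 described above can be distributed between the client 161 and the server 162 to perform linking. For example, as shown in Fig. 32, the link generation unit 173 can be placed on the server 162, while the other processing units, namely the storage unit 171, the link target region designation unit 172, the video presentation unit 174, and the link management unit 175, are placed on the client 161.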
【０１１６】
図３３は、ネットワークに伝送されるリンク・データの形式の例を示す。
図３３に示すように図２０のリンク・データの構造を例えばいわゆるＸＭＬ形式に変換してネットワークに伝送することによって、図３２のようなクライアント１６１とサーバ１６２をネットワーク１６３を介して接続した場合にリンク・データを転送して利用することが可能となる。
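As an illustration of this conversion, the sketch below serialises a link record to XML with Python's standard `xml.etree.ElementTree` module and parses it back. Only the element names `audiovisual-data` and `resource-name` appear in the description; the root element, the `id` attribute, and the other element names are assumptions, since Fig. 33 is not reproduced here.

```python
import xml.etree.ElementTree as ET


def link_to_xml(link: dict) -> str:
    """Serialise one link record to an XML string (element layout is assumed)."""
    root = ET.Element("link-data", {"id": link["identifier"]})
    ET.SubElement(root, "audiovisual-data").text = link["video_file"]
    ET.SubElement(root, "frame-start").text = str(link["start_frame"])
    ET.SubElement(root, "frame-end").text = str(link["end_frame"])
    ET.SubElement(root, "region").text = " ".join(f"{x},{y}" for x, y in link["region"])
    ET.SubElement(root, "resource-name").text = link["target_name"]
    return ET.tostring(root, encoding="unicode")


def xml_to_link(text: str) -> dict:
    """Parse the XML produced by link_to_xml back into a record."""
    root = ET.fromstring(text)
    region = [tuple(int(v) for v in p.split(",")) for p in root.findtext("region").split()]
    return {
        "identifier": root.get("id"),
        "video_file": root.findtext("audiovisual-data"),
        "start_frame": int(root.findtext("frame-start")),
        "end_frame": int(root.findtext("frame-end")),
        "region": region,
        "target_name": root.findtext("resource-name"),
    }


link = {
    "identifier": "LINK001",
    "video_file": "Video.mpg",
    "start_frame": 120,
    "end_frame": 150,
    "region": [(10, 30), (10, 10), (20, 10), (20, 30)],
    "target_name": "Annotation.txt",
}
xml_text = link_to_xml(link)
restored = xml_to_link(xml_text)
```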
【０１１７】
同じく、図３４は、ネットワークに伝送されるリンク・データの形式の他の例を示す。図３４では、被リンク対象データとして、リンク・データが指定されている様子を示す。
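Specifically, the identifier of the link data LINK001 is set as the <resource-name> element. When link data is specified as the linked data in this way, the link management unit 175 interprets the XML-format link data of Fig. 34 (the link data whose identifier is LINK003) and obtains the link data LINK001. The link management unit 175 then interprets the XML-format link data of Fig. 33 (the link data whose identifier is LINK001) and detects that Video.mpg is set in the <audiovisual-data> element and that Annotation.txt is set in the <resource-name> element.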
【０１１８】
続いて、リンク管理部１７５は、利用者に対してVideo.mpgのデータかAnnotation.txtのデータかのいずれを利用するかを選択させ、選択されたデータを映像提示画面１４２に提示する。仮に〈resource-name〉要素として、さらにリンク・データの識別子が設定されている場合には、同様の動作を繰り返して、リンクをたどっていく。このように、被リンク対象データ名１３６または〈resource-name〉要素にリンク・データの識別子を設定することによって、リンク・データを再利用することができる。また、ＸＭＬ形式化されたリンク・データを電子メールなどで転送し、転送された利用者の映像処理装置１で当該ＸＭＬ形式化されたリンク・データを利用することにより、リンク・データを再利用することも可能である。
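Following links whose resource name is itself a link identifier can be sketched as a simple loop. The dictionary layout, the function name `follow_links`, the video name assumed for LINK003, and the cycle check are illustrative additions; the description does not say how loops between link data are handled.

```python
from typing import Dict, List, Tuple


def follow_links(start_id: str, links: Dict[str, dict]) -> Tuple[List[str], str, str]:
    """Follow resource names that are link identifiers until real data is reached.

    Returns the identifiers visited and the (video, resource) pair of the last
    link, between which the user would then be asked to choose.
    """
    path: List[str] = []
    current = start_id
    while True:
        if current in path:
            raise ValueError(f"link cycle detected at {current}")
        path.append(current)
        entry = links[current]
        if entry["resource"] in links:
            current = entry["resource"]  # the resource is another link: keep going
        else:
            return path, entry["video"], entry["resource"]


links = {
    "LINK001": {"video": "Video.mpg", "resource": "Annotation.txt"},
    # The video name of LINK003 is not given in the text; this value is assumed.
    "LINK003": {"video": "Video.mpg", "resource": "LINK001"},
}
path, video, resource = follow_links("LINK003", links)
```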
【０１１９】
次に、リンクされたテキスト・データ、音声データ、または映像データなどの電子データからリンクされる任意の部分映像データを特定する手段およびステップについて説明する。
ここで、図３５に示すようなリンク・データが記憶部１１に保持されているとする。
That is, assume that the link identifier 131 is set to "LINK001", the video data name 132 to "Video.mpg", the frame start number 133 to "120", the frame end number 134 to "150", the link target region coordinates 135 to "{(10,30),(10,10),(20,10),(20,30)}", the linked-data name 136 to "Annotation.txt", and the visual feedback data 137 to "Visual.dat".
【０１２０】
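When the user wants to identify a piece of partial video data starting from linked electronic data such as text data, audio data, or video data, the user first enters the desired linked-data name (the name of that electronic data) in the linked-data name input dialog 147 of Fig. 21. In this example, the user enters "Annotation.txt" in the linked-data name input dialog 147. Once the name has been entered, the link management unit 15 of the video processing device 1 searches the link data held in the storage unit 11 for link data whose linked-data name matches, and retrieves it.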
【０１２１】
次に、リンク管理部１５は、当該リンク・データの映像データ名１３２を参照し、映像データ名に一致する映像データを記憶部１１から取得する。つまり、リンク管理部１５は、当該リンク・データ中の映像データ名１３２の値“Video.mpg”を参照し、当該“Video.mpg”に一致する映像データを記憶部１１から取得する。
【０１２２】
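Next, the link management unit 15 refers to the values "120" and "150" of the frame start number 133 and the frame end number 134 in the link data and extracts the frames to be taken from the video data, that is, frames 120 to 150. The link management unit 15 then refers to the link target region coordinates 135 of the link data and obtains the value "{(10,30),(10,10),(20,10),(20,30)}" as the link target region coordinates.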
【０１２３】
そこで、リンク管理部１５は、前述の抽出した各フレームのリンク対象領域座標に一致する領域、ここでは(10,30),(10,10),(20,10),(20,30)の各座標に囲まれる領域に対して、リンク・データの視覚的フィードバック・データ１３７に対応するデータを配置し、映像提示画面１４２に提示する。
そこで、利用者は、被リンク対象データ名“Annotation.txt”に対応した、映像データ“Video.mpg”中の視覚的フィードバックで示される領域を特定することができる。
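The reverse lookup walked through above (from a linked-data name back to the frames and region it is attached to) can be sketched as follows. The dictionary keys and the function `locate_partial_video` are hypothetical; the values are those of the example in the description.

```python
from typing import List, Optional


def locate_partial_video(links: List[dict], target_name: str) -> Optional[dict]:
    """Find the partial video data that a given linked-data name is attached to."""
    for link in links:
        if link["target_name"] == target_name:
            return {
                "video_file": link["video_file"],
                # Frame numbers are inclusive at both ends.
                "frames": range(link["start_frame"], link["end_frame"] + 1),
                "region": link["region"],
                "visual_feedback": link["visual_feedback"],
            }
    return None


stored = [
    {
        "identifier": "LINK001",
        "video_file": "Video.mpg",
        "start_frame": 120,
        "end_frame": 150,
        "region": [(10, 30), (10, 10), (20, 10), (20, 30)],
        "target_name": "Annotation.txt",
        "visual_feedback": "Visual.dat",
    }
]
found = locate_partial_video(stored, "Annotation.txt")
```

The presentation step would then draw the visual feedback over `region` on each frame in `frames`.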
【０１２４】
次に、リンクされたテキスト・データ、音声データまたは映像データなどの電子データを電子掲示板システムまたは電話などの通話ないし通信システムに転送し、リンクされた任意の部分映像データに関連する対象に当該電子データを引き渡す手段およびステップについて説明する。
【０１２５】
図３６は、図１の映像処理装置１と同様に記憶部１９１とリンク対象領域指定部１９２とリンク生成部１９３と映像提示部１９４とリンク管理部１９５とを備えた構成に、リンク・データ転送部１９６および電話通話部１９７を付加した拡張した映像処理装置１８１の一例を示すブロック図である。
【０１２６】
リンク・データ転送部１９６は、ＣＰＵおよびバッファ記憶装置から構成され、記憶部１９１から転送すべき被リンク対象データを入力し、電話通話部１９７に転送する。
電話通話部１９７は、通常の電話の通話機能を有するサブシステムであり、リンク・データ転送部１９６から入力された被リンク対象データを外部の電話に送信する。
【０１２７】
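Assume here that link data such as that shown in Fig. 37 is held in the storage unit 191. That is, the link identifier 131 is set to "LINK002", the video data name 132 to "Video.mpg", the frame start number 133 to "120", the frame end number 134 to "150", the link target region coordinates 135 to "{(10,30),(10,10),(20,10),(20,30)}", the linked-data name 136 to "Voice.dat", and the visual feedback data 137 to "Visual2.dat". Assume also that the audio data Voice.dat is held in the storage unit 191 and that, paired with "Voice.dat", the telephone number "0120-123-4567" to be called for that audio data is likewise held in the storage unit 191.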
【０１２８】
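Suppose that the user uses the mouse to select, via the visual feedback presented on the video presentation screen 142, the link data identified by the identifier "LINK002".
The link management unit 195 refers to the link data held in the storage unit 191 and, on determining that the linked data is the audio data "Voice.dat", obtains the telephone number "0120-123-4567" associated with "Voice.dat".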
【０１２９】
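Next, the link management unit 195 transfers "Voice.dat" to the link data transfer unit 196. The telephone call unit 197 then calls the destination using the obtained telephone number "0120-123-4567"; when the call is answered, it plays "Voice.dat" as audio and completes the call.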
【０１３０】
ここでは、通常の公衆電話網の電話に接続することを想定したが、電話通話部１９７に代えて、データ送出機能を準備し、被リンク対象データがテキスト・データである場合には、電子掲示板に転送するように構成することもできる。同じく、通常の電話機能に代えて、いわゆるインターネット電話機能を持つデータ送出機能を準備し、インターネット電話に対して被リンク対象データを送出するように構成することも可能である。
【０１３１】
次に、本発明の第３実施例に係る映像処理装置や映像処理方法を説明する。
図３８および図３９は、本例の映像処理装置１の映像提示画面２０１に映像オブジェクト２０３が１個である映像データおよび枠２０２並びに視覚的フィードバック２０４ａ、２０４ｂ、２０５ａ、２０５ｂ、２０６ａ、２０６ｂが提示されているユーザ・インタフェースの例を示す図である。
図４０は、本例の映像処理装置１の映像提示画面２０１に映像オブジェクト２０３、２０７が２個である映像データおよび枠２０２並びに視覚的フィードバック２０５ａ、２０５ｂ、２０６ａ、２０６ｂ、２０８ａ、２０８ｂが提示されているユーザ・インタフェースの例を示す図である。
【０１３２】
図３８は、映像オブジェクト（“Ｙ”のロゴ）２０３に対して、１個のリンクが設定されている場合を示す。
一方、図３９は、同一の映像オブジェクト（“Ｙ”のロゴ）２０３に対して、２個のリンクが設定されている場合を示す。このように複数のリンクが設定されている場合には、相異なる色の図形を映像提示画面２０１の枠２０２に提示することにより、リンクを区別することが可能となる。
【０１３３】
図３８、図３９、および図４０に示すように、リンクが設定されている映像オブジェクト（“Ｙ”のロゴ）２０３、２０７から距離が短い枠２０２の二辺であって当該映像オブジェクト２０３、２０７の水平位置（例えば、横軸）及び垂直位置（例えば、縦軸）に対応する位置にそれぞれリンクを示す図形２０４ａ、２０４ｂ、２０５ａ、２０５ｂ、２０６ａ、２０６ｂ、２０８ａ、２０８ｂを配置することにより、利用者はリンクの存在を示す視覚的フィードバックを得ることが可能となる。
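The marker placement of the third embodiment can be sketched as follows. The sketch assumes a coordinate system with the origin at the top-left corner of the video frame and y increasing downward, and it picks the nearer edge of each pair by comparing the object's centre with the frame centre. `edge_markers` and the frame size are hypothetical.

```python
from typing import Dict, Tuple


def edge_markers(
    width: int, height: int, rect: Tuple[int, int, int, int]
) -> Dict[str, float]:
    """Place one marker on the nearer horizontal border and one on the nearer
    vertical border, aligned with the object's horizontal and vertical position.

    rect is (x_min, y_min, x_max, y_max) inside a width x height video frame.
    Returns {border_name: coordinate_along_that_border}.
    """
    x1, y1, x2, y2 = rect
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    horizontal_border = "top" if cy < height / 2 else "bottom"
    vertical_border = "left" if cx < width / 2 else "right"
    return {horizontal_border: cx, vertical_border: cy}


markers = edge_markers(100, 80, (10, 10, 20, 30))
```

The object then lies where the vertical line through the first marker crosses the horizontal line through the second; drawing each link's pair of markers in its own colour would distinguish several links on the same object.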
【０１３４】
以上のように、本発明の実施例に係る映像処理装置や映像処理方法では、映像データ中の任意の部分映像データを指定する手段と、映像データと指定した任意の部分映像データを同時に提示する手段と、指定した任意の部分映像データに対してリンク・データを付加する手段と、指定した任意の部分映像データ上にリンクされる対象データを提示する手段を備えた構成において、指定した任意の部分映像データに対して１個以上の関連するリンク・データを付加し、当該リンク・データを示す１個以上の部分画像アイコンを重ねて提示する、または、隣接してもしくは重ね合わせて視覚的にリンクの存在を提示する手段を備えた。
【０１３５】
そして、映像データに対してテキスト・データによるコメントや利用された関連した資料などを付加して映像データを提示し、注釈を付加する任意の部分映像データを指定し、当該映像データと指定した部分映像データを同時に提示し、指定した部分映像データに対してリンク・データを付加し、付加したリンク・データをリンクされる任意の部分映像データに隣接してまたは重ね合わせて視覚的にリンクの存在を提示する。
【０１３６】
従って、リンク・データをリンクされる任意の部分映像データに隣接してまたは重ね合わせて視覚的にリンクの存在を提示することによって、任意の部分映像データにリンク・データが存在していることを利用者に対して視覚的にフィードバックすることができる。
また、利用者は指定したフレームを抽出する部分映像データについて、リンク・データを付加したい当該部分映像データに対して、テキスト・データ、音声データ、画像データ、関連資料ファイルデータ、動画像データなどを容易かつ適切に関連付けることが可能となる。
また、関連付けを行うユーザ・インタフェースは、映像データと任意部分映像データを両方表示して、当該映像データを再生することにより、利用者は抽出した部分映像データを参照しながらリンク・データを付加することが可能となる。
【０１３７】
また、本発明の実施例に係る映像処理装置や映像処理方法では、リンク・データは部分映像データの時間範囲を含み、リンク・データを付加する手段は当該部分映像データの時間範囲を指定する手段を備えた。
従って、リンク・データを付加する部分映像データの指定範囲として、映像データの時間範囲も指定することができる。
【０１３８】
また、本発明の実施例に係る映像処理装置や映像処理方法では、リンク・データ或いは部分映像データは部分映像データの映像データにおける領域情報を含み、リンク・データを付加する手段は、部分映像データ上の領域情報を取得し、リンク・データを構成する。
従って、リンク・データを付加する手段により、部分映像データ上の領域情報を含んだリンク・データを構成することが可能となる。
【０１３９】
また、本発明の実施例に係る映像処理装置や映像処理方法では、リンク・データの提示手段は、１個以上の部分画像アイコンを時間軸上、映像データ上（空間軸上）に提示する。
従って、１個以上の部分画像アイコンを時間軸上、映像データ上（空間軸上）に提示することが可能となる。
【０１４０】
また、本発明の実施例に係る映像処理装置や映像処理方法では、リンクされる任意の部分映像データに隣接してまたは重ね合わせて視覚的にリンクの存在を提示する手段は、係る任意の部分映像データまたは当該領域中の映像対象の影、または類似する形状の輝度変化によって、視覚的にリンクの存在を提示する。
従って、係る任意の部分映像データまたは当該領域中の映像対象の影または類似形状の輝度変化による視覚的フィードバックを利用者に提供することによって、係る任意の部分映像データまたは当該領域中の映像対象の形状に対応した影または類似形状の輝度変化による視覚的フィードバックを利用者に提供することができる。
【０１４１】
また、本発明の実施例に係る映像処理装置や映像処理方法では、映像の任意部分データ中にある任意の場所に対して、複数の関連するデータを示す場合に、部分画像を重ねて付加する手段を備えた。
従って、映像データ中の任意の部分データ中にある任意の映像対象に対して、複数のリンク・データが関連する場合に、部分画像アイコンなどによってリンク・データを重ね合わせて付加することが可能となる。
【０１４２】
また、本発明の実施例に係る映像処理装置や映像処理方法では、任意の部分映像データまたは当該領域中の映像対象の影または類似形状の輝度変化としては、当該任意の部分映像データまたは当該領域中の映像対象の形状から提示すべき影の形状または類似した形状などを生成する。
従って、任意の部分映像データまたは当該領域中の映像対象の影または類似形状の輝度変化として、当該任意の部分映像データまたは当該領域中の映像対象の形状から提示すべき影の形状または類似した形状などを生成することによって、オリジナルの映像データに対して違和感のない視覚的フィードバックを利用者に提供することができる。
【０１４３】
また、本発明の実施例に係る映像処理装置や映像処理方法では、同一の画像データに対して複数のリンクを付加した場合に、利用者に視覚的なフィードバックを与えて、各リンクを区別し、リンクされた情報を有効に利用させることができる。
【０１４４】
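In the video processing apparatus and video processing method according to the embodiments of the present invention, the means for visually presenting the existence of a link adjacent to or superimposed on the linked partial video data includes means for extracting a video object from luminance changes within the partial video data.
Accordingly, the video object of arbitrary partial video data can be extracted from the luminance within that partial video data.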
【０１４５】
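In the video processing apparatus and video processing method according to the embodiments of the present invention, the means for adding link data lets the user select a video object in the video data that has been extracted from, for example, luminance changes in the video data.
Accordingly, the user can select a video object in the video data that has been extracted from luminance changes or the like in the video data.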
【０１４６】
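In the video processing apparatus and video processing method according to the embodiments of the present invention, when two or more links are presented, the means for visually presenting the existence of a link adjacent to or superimposed on the linked partial video data does so by means of mutually different shadows of the partial video data or of the video object in that region, or by mutually different similarly shaped luminance changes or color changes.
Accordingly, when a plurality of link data items are added to the same partial video data, these links can be presented so as to be distinguishable.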
【０１４７】
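In the video processing apparatus and video processing method according to the embodiments of the present invention, designated link data is added to arbitrary partial video data and, at the same time, the link data is presented on the partial video data either alone or superimposed. This makes it possible not only to add related link data to a designated partial image of arbitrary partial video data but also to superimpose the link data on that designated partial image.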
【０１４８】
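In the video processing apparatus and video processing method according to the embodiments of the present invention, means is provided by which added link data generates link data of link data, that is, a link to other arbitrary partial video data or to link data added to other arbitrary partial video data.
Accordingly, added link data can be associated with other arbitrary partial video data and with link data added to other arbitrary partial video data.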
【０１４９】
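Thus, in the video processing apparatus and video processing method according to the embodiments of the present invention, a plurality of links can be associated with the same region in the video data or with the same video object in that region, and designated partial video data to which link data has been added can be associated with other arbitrary partial video data in the video data or with other link data.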
【０１５０】
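In the video processing apparatus and video processing method according to the embodiments of the present invention, the content of the link data describes that electronic data such as text data, audio data, or video data, an electronic file, link data, or the like is the link target.
Accordingly, electronic data such as text data, audio data, or video data can be linked as the content of the data linked to the video data.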
【０１５１】
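Thus, in the video processing apparatus and video processing method according to the embodiments of the present invention, existing electronic documents such as related e-mail, image data used in a meeting, related partial audio data, and electronic files such as video data can be associated with designated arbitrary partial video data.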
【０１５２】
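In the video processing apparatus and video processing method according to the embodiments of the present invention, means is provided by which a single user or a plurality of users add, share, present, or distribute link data.
Accordingly, for example, a user can obtain stored link data by using a portable information terminal together with the means for adding, sharing, presenting, or distributing link data, and can perform various kinds of re-editing, such as adding links to the linked video data. A user can also use that means among a plurality of users to obtain link data together with the video data to which it has been added or the linked data, and can thereby perform various kinds of re-editing, such as combining link data, video data to which link data has been added, or linked data.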
【０１５３】
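In the video processing apparatus and video processing method according to the embodiments of the present invention, the means for adding link data includes means for extracting the time range of partial video data from the moving image data and the audio data while the audio data is valid for a single person or a plurality of persons in the designated partial video data.
Accordingly, the means for adding link data can analyze the audio data and the like in the designated partial video data, estimate and cut out a portion in which the same person is speaking or a portion of a dialogue among several persons (such as a question-and-answer exchange) that concerns the same content, extract the partial video data corresponding to that portion, and add link data to it.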
【０１５４】
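In the video processing apparatus and video processing method according to the embodiments of the present invention, means is provided for identifying the linked partial video data from linked electronic data such as text data, audio data, or video data.
Accordingly, by identifying the linked partial video data from the linked electronic data, that partial video data can be referred to from the electronic data.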
【０１５５】
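In the video processing apparatus and video processing method according to the embodiments of the present invention, means is provided by which, when the user designates partial video data or a video object within partial video data, the linked electronic data such as text data, audio data, or video data is transferred to an electronic bulletin board system or to a call or communication system such as telephone or e-mail, and is handed over to the target related to the linked partial video data.
Accordingly, from arbitrary partial video data, such an electronic bulletin board system or telephone or communication system can be used to refer to data on the other party related to that partial video data, and to notify or transfer the electronic data to that party.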
【０１５６】
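In the video processing apparatus and video processing method according to the embodiments of the present invention, in a configuration that presents video data and holds and processes link data for that video data, means is provided for visually presenting the existence of a link on the outer border of the video data, in correspondence with the partial video data to which the link data is linked.
Accordingly, the existence of a link to partial video data can be presented using the outer border, without obstructing the presented video data.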
【０１５７】
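As described above, in the video processing apparatus and video processing method according to the embodiments of the present invention, when video data to which link data is linked is presented to the user, visual feedback of the link can be given even if, for example, the user does not move the mouse over the region of the video data shown on the video presentation screen of the video processing apparatus, so that the user can be informed of the existence of one or more links. Arbitrary partial video data can also be referred to from linked electronic data such as text data, audio data, or video data. Furthermore, through the visual feedback, a target related to arbitrary partial video data can be referred to or used from that partial video data via an electronic bulletin board system or a telephone or communication system.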
【０１５８】
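In the video processing apparatus and the like according to the embodiments of the present invention, the partial video data specifying means is constituted by functions such as the link target area designation unit 12, which specifies partial video data from video data, and the data association means is constituted by functions such as the link generation unit 13, which associates (links) partial video data with other data.
Within the partial video data specifying means, the partial video data candidate specifying means is constituted by the function of specifying candidates for partial video data, and the partial video data designation receiving means is constituted by the function of accepting from the user a designation of partial video data from among those candidates.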
【０１５９】
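In the video processing apparatus and the like according to the embodiments of the present invention, the related partial video data specifying means is constituted by functions such as the link management unit 15, which identifies partial video data from the data associated with it, and the related data presenting means is constituted by functions such as the video presentation unit 14, which presents data indicating the existence of data associated with partial video data (visual feedback data) in visual association with that partial video data.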
【０１６０】
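In the video processing apparatus and the like according to the embodiments of the present invention, visual feedback data and predetermined processes are associated with each other in the storage unit 11. The presentation data designation receiving means is constituted by functions such as the video presentation unit 14, which accepts from the user a designation of presented visual feedback data, and the presentation data corresponding process execution means is constituted by functions such as the link management unit 15, which executes the process associated with the designated visual feedback data.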
【０１６１】
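In the video processing apparatus and the like according to the embodiments of the present invention, the plurality of related data presenting means is constituted by functions such as the video presentation unit 14, which presents data indicating the existence of a plurality of data items associated with partial video data, for example a plurality of visual feedback data items, in visual association with that partial video data.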
【０１６２】
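Here, the configurations and modes of the video processing apparatus, video processing method, and the like according to the present invention are not necessarily limited to those described above, and various configurations and modes may be used.
The fields of application of the present invention are likewise not limited to those described above; the present invention can be applied to various fields.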
【０１６３】
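The various processes performed in the video processing apparatus, video processing method, and the like according to the present invention may, for example, be controlled by a processor executing a control program stored in a ROM (Read Only Memory) in hardware resources that include a processor, memory, and the like; alternatively, each functional means for executing the processes may be configured as an independent hardware circuit.
The present invention can also be understood as a computer-readable recording medium, such as a floppy (registered trademark) disk or a CD (Compact Disc)-ROM, that stores the above control program, or as the program itself; the processes according to the present invention can be carried out by loading the control program from the recording medium into a computer and having the processor execute it.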
【０１６４】
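[Effects of the Invention]
As described above, in the video processing apparatus, video processing method, and the like according to the present invention, partial video data that is a part of video data is, for example, specified from the video data, and data is associated with the specified partial video data in such a way that the existence of the data can be presented; the existence of data associated with the partial video data can therefore be presented.
In other words, in the video processing apparatus, video processing method, and the like according to the present invention, data indicating the existence of data associated with partial video data is, for example, presented in visual association with that partial video data within the video data, so that the user can visually grasp the existence of the associated data and the association itself.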
【０１６５】
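Further, in the video processing apparatus, video processing method, and the like according to the present invention, data indicating the existence of a plurality of data items associated with partial video data, which is a part of video data specified from that video data, is, for example, presented in visual association with that partial video data within the video data, so that the user can visually grasp the existence of the plurality of associated data items and the association itself.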
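[Brief Description of the Drawings]
[FIG. 1] A diagram showing a configuration example of the video processing apparatus according to the present invention.
[FIG. 2] A diagram showing a detailed configuration example of the video processing apparatus according to the present invention.
[FIG. 3] A diagram showing how partial video data is extracted from video data.
[FIG. 4] A diagram showing an example of the procedure for extracting partial video data.
[FIG. 5] A diagram showing an example of the procedure for adding link data to partial video data.
[FIG. 6] A diagram showing an example of the data structure of the link data addition storage device.
[FIG. 7] A diagram showing an example of the extended data structure of the link data addition storage device.
[FIG. 8] A diagram showing an example of the user interface.
[FIG. 9] A diagram showing an example of the data structure after link data has been added to partial video data.
[FIG. 10] A diagram showing an example of the user interface for presenting added link data.
[FIG. 11] A diagram showing a specific example of the device configuration and user interface in collaborative work.
[FIG. 12] A diagram showing another example of the data structure after link data has been added to partial video data.
[FIG. 13] A diagram showing an example in which partial image icons representing link data added by a plurality of users are presented.
[FIG. 14] A diagram showing a configuration example of a system for editing work.
[FIG. 15] A diagram showing an example of the structure of linked data and video data.
[FIG. 16] A diagram showing an example of how composite video data is generated from parts of a plurality of video data.
[FIG. 17] A diagram showing an example of utterance estimation when link data is added.
[FIG. 18] A diagram showing an example of dialogue estimation when link data is added.
[FIG. 19] A diagram for explaining an example of how dialogue estimation is performed when link data is added.
[FIG. 20] A diagram showing an example of the data structure of link data.
[FIG. 21] A diagram showing an example of the user interface.
[FIG. 22] A diagram showing an example of the procedure of the linking process.
[FIG. 23] A diagram showing an example of a video object.
[FIG. 24] A diagram showing an example of a video object surrounded by a frame.
[FIG. 25] A diagram showing an example of a video object tilted obliquely.
[FIG. 26] A diagram showing an example of shadow data.
[FIG. 27] A diagram showing an example of a video object combined with shadow data.
[FIG. 28] A diagram showing an example in which the region of the video object and shadow data to be presented has been extracted.
[FIG. 29] A diagram showing an example of a plurality of shadow data.
[FIG. 30] A diagram showing an example of a video object combined with a plurality of shadow data.
[FIG. 31] A diagram showing an example in which the region of the video object and the plurality of shadow data to be presented has been extracted.
[FIG. 32] A diagram showing an example of a configuration that performs the linking process via a network.
[FIG. 33] A diagram showing an example of the format of link data transmitted over the network.
[FIG. 34] A diagram showing another example of the format of link data transmitted over the network.
[FIG. 35] A diagram showing an example of link data values.
[FIG. 36] A diagram showing a configuration example of the extended video processing apparatus.
[FIG. 37] A diagram showing an example of link data values.
[FIG. 38] A diagram showing an example of a user interface in which video data containing one video object, a border, and visual feedback are presented on the video presentation screen of the video processing apparatus.
[FIG. 39] A diagram showing an example of a user interface in which video data containing one video object, a border, and visual feedback are presented on the video presentation screen of the video processing apparatus.
[FIG. 40] A diagram showing an example of a user interface in which video data containing two video objects, a border, and visual feedback are presented on the video presentation screen of the video processing apparatus.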
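[Explanation of Reference Signs]
1, 181: video processing apparatus; 11, 171, 191: storage unit;
12, 172, 192: link target area designation unit;
13, 173, 193: link generation unit;
14, 174, 194: video presentation unit;
15, 175, 195: link management unit; 21, 97: video storage device;
22, 91a, 91b: video data presentation device;
23, 92a, 92b: arbitrary partial video data designation device;
24, 93a, 93b: partial video data presentation device;
25, 94a, 94b: link data addition device;
26, 96a, 96b, 98: link data storage device;
27, 95a, 95b: link data presentation device;
31, 101, 111 to 113, 121 to 123, F1 to F7, F11 to F17: video data;
32: partial video data; 33: circumscribed rectangle; 41: time code;
42: coordinate data;
43, 102, 114a, 115a, 115b, 116a, 116b, 116c, 124a, 125a, 125b, 125c, 126a: linked data;
44: storage device name; 45: partial image icon data;
46: user data; 51: video data presentation screen;
52: partial video data presentation screen; 53: link data addition screen;
54: linked data presentation screen;
62a to 62e, 82a to 82c, 84a to 84d: partial video data with link data;
71, 72: object;
73, 74: link data for sending messages;
81a, 83a to 83c: partial image icon;
T1, T2: estimated utterance portion;
T11 to T14, T21, T22: estimated dialogue portion; 131: identifier;
132: video data name; 133: frame start number;
134: frame end number; 135: link target area coordinates;
136: linked data name;
137: visual feedback data;
141: user interface; 142, 201: video presentation screen;
143: video play button; 144: video stop button;
145: link start button; 146: link end button;
147: linked data name input dialog;
151, 203, 207: video object; 152: frame;
153: video object tilted obliquely;
154, 156a, 156b, 157a, 157b: shadow data;
155: extracted region of shadow data to be presented;
161: client; 162: server; 163: network;
196: link data transfer unit; 197: telephone call unit;
202: border of the video presentation screen;
204a, 204b, 205a, 205b, 206a, 206b, 208a, 208b: visual feedback
[0001]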
BACKGROUND OF THE INVENTION
The present invention relates to an apparatus and a method for presenting links to video data in a video processing apparatus. In particular, it relates to a video processing apparatus and a video processing method that secure a work area, extract partial video data from video data, and allow a user, or several users among themselves, to easily and appropriately associate link data with that partial video data, where the content of the link data is text data, audio data, image data, related document file data, video data, or the like.
[0002]
[Prior art]
In recent years, communication tools, conference systems, and information sharing that use multimedia data over the Internet to convey ideas have spread among individuals and companies. Against this background, systems have been proposed that add text annotations to digital documents and video, in the same way that markers and notes are written on conventional printed matter. Japanese Laid-Open Patent Publication No. Hei 8-272989, “Material Creation Support System Based on Video Specifications”, makes it possible to associate text information with video information and handle them together as a document. Hereinafter, this technique will be referred to as a first conventional technique.
[0003]
Next, Japanese Patent Laid-Open No. 2000-250864, “Collaborative Work Support System”, describes a technique that allows annotation in various formats: text data such as memos and questions can be added to streaming data such as presentation materials and can be shared among multiple clients. Hereinafter, this technique will be referred to as a second conventional technique.
[0004]
In Japanese Patent Application Laid-Open No. Hei 6-274552, “Multimedia Data Link System”, data can be displayed on the screen by designating an arbitrary area in a moving image displayed on the screen, or an arbitrary screen in a series of moving image data. Hereinafter, this technique will be referred to as a third conventional technique.
[0005]
Further, Y. Yamamoto, CHI2001, “Time-ART” proposes a tool that provides a user interface for freely clipping video and audio data while viewing it, together with a text annotation function. Hereinafter, this technique will be referred to as a fourth conventional technique.
On the other hand, Japanese Patent Laid-Open No. 10-21029, “Telop Display Device”, describes a display device with which a user can easily create telops and easily add audio information and image information as additional information. Hereinafter, this technique will be referred to as a fifth conventional technique.
[0006]
Conventionally, when a home page is viewed with a web browser for browsing the World Wide Web, link information may be embedded in the home page as a so-called image map. A user can access the linked information by moving the mouse over an area that makes up the image map of the home page presented by the web browser and clicking the mouse. Hereinafter, this technique will be referred to as a sixth conventional technique.
[0007]
Japanese Patent Laid-Open No. 8-329096, “Image Data Retrieval Device”, discloses an image data retrieval device that has means for setting, as additional information for image data, an icon that concisely represents the features of the image; the icon is placed at a predetermined position on a map having one or more axes and is used to retrieve the image data related to it. Hereinafter, this technique will be referred to as a seventh conventional technique.
[0008]
Further, Japanese Patent Laid-Open No. 8-329097, “Image Data Retrieval Device”, discloses an image data retrieval device that has means for setting a keyword for an image as additional information for the image data and retrieves image data using that keyword. Hereinafter, this technique will be referred to as an eighth conventional technique.
[0009]
Japanese Patent Laid-Open No. 8-329098, “Image Data Retrieval Device”, discloses an image data retrieval device that can retrieve image data by associating image data on a first map having one or more axes with additional information on a second map having one or more axes. Hereinafter, this technique will be referred to as a ninth conventional technique.
[0010]
Japanese Patent Laid-Open No. 11-39120, “Content Display / Selection Device and Content Display / Selection Method, and Recording Medium on which Content Display / Selection Method Program is Recorded”, discloses a technique that arranges HTML document content in a two-dimensional array and enables browsing (listing of the content) without a mouse pointer. Hereinafter, this technique will be referred to as a tenth conventional technique.
[0011]
[Problems to be solved by the invention]
However, the conventional techniques have various problems as described below.
First, a problem common to the first to fifth conventional techniques described above is that the user cannot, during playback of video data, extract partial video data to a separate screen and add link data while referring to the content of the video data, including its audio data.
[0012]
Further, link data added to partial video data cannot be added at an arbitrary location on the partial video data, so it is not clear where it has been added. For example, when several objects such as people and documents appear in the video data and link data is added to the partial video data, the prior art offers no way to determine which object a link data comment refers to.
Furthermore, additional information of related link data cannot be superimposed on an arbitrarily designated part of the partial image data.
[0013]
Next, in the sixth prior art, when HTML document content containing an image map is presented to the user, the user cannot know that the image map exists without moving the mouse over the area of that content in the browser.
[0014]
Next, the seventh, eighth, and ninth prior arts can associate an icon, text data, or additional information with image data, but they do not give the user visual feedback about links. When a plurality of links are added to the same image data, they cannot give the user visual feedback that distinguishes each link and allows the linked information to be used effectively.
[0015]
Similarly, even with the tenth prior art, the existence of an image map associated with a specific area, such as a person or object shown in HTML document content (in particular in image data or video data), cannot be presented in a way the user can recognize.
Also, none of the sixth to tenth prior arts allows a specific area, such as a person or an object shown in video data, to be used in cooperation with a so-called electronic bulletin board system or with a call or communication system such as the telephone.
[0016]
The present invention has been made to solve these conventional problems, and its object is to provide a video processing apparatus and the like that are effective for, among other things, presenting the existence of data associated with partial video data specified from video data.
[0017]
[Means for Solving the Problems]
In order to achieve the above object, in the video processing apparatus according to the present invention, the partial video data specifying means specifies, from video data, partial video data that is a part of the video data, and the data association means associates data with the specified partial video data in such a way that the presence of the data can be presented.
Therefore, partial video data can be specified from the video data and data can be associated with it in a manner that allows its presence to be presented, which makes it possible to present the existence of data associated with the partial video data.
[0018]
Here, the video processing apparatus may be configured as various apparatuses, for example, using a computer.
Further, as the video data, for example, temporally continuous video data is used, and specifically, data in which planar image data in a frame continuously changes in time is used. In this case, one point in the video data can be indicated by the value of the coordinates (horizontal axis and vertical axis) representing the position in the frame and the value of the time axis.
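As a minimal sketch of the addressing scheme just described, one point in the video data can be modeled as frame coordinates plus a time value. The class name `VideoPoint`, the use of seconds as the time unit, and the constant frame rate are assumptions made for illustration; the source does not prescribe any particular representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoPoint:
    """One point in video data (hypothetical illustration type)."""
    x: int    # horizontal coordinate within the frame
    y: int    # vertical coordinate within the frame
    t: float  # position on the time axis, in seconds (assumed unit)


def frame_index(point: VideoPoint, fps: float) -> int:
    """Map the time value to a frame number, assuming a constant frame rate."""
    return int(point.t * fps)


p = VideoPoint(x=120, y=80, t=2.5)
print(frame_index(p, 30.0))
```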
[0019]
Various data may be used as the partial video data. For example, image data of a single frame, data of a specific object within the image data of a single frame, image data of frames spanning a time width (that is, image data of a plurality of temporally continuous frames), or data of a specific object spanning a time width can be used.
[0020]
Various methods may be used to specify the partial video data. For example, it may be specified based on a designation from the user, specified automatically by the video processing apparatus according to a predetermined procedure, or specified by a combination of the two.
[0021]
Various data may be used as data associated with the partial video data, and text data, audio data, image data, and the like can be used.
Further, the number of data associated with the partial video data may be singular or plural.
[0022]
In the video processing apparatus according to the present invention, the partial video data specifying unit specifies partial video data having a time width for the same target data included in the video data.
Therefore, the data can be associated with the same target data having the time width included in the video data.
[0023]
Here, various data may be used as the data of the same object, for example data of a person, data of a thing, or data of a predetermined area in a frame. Various methods may also be used to identify the same object. For example, for a stationary object, whatever exists at the same place can be regarded as the same object; for a moving object, whatever has the same features, such as shape, can be regarded as the same object.
Various time widths can be used as the time width.
[0024]
In the video processing apparatus according to the present invention, the video data corresponds to audio data. Then, the partial video data specifying means specifies partial video data having a time width in which audio data corresponding to the data of the person is valid for the data of one or a plurality of persons included in the video data.
Therefore, for a single person or a plurality of persons, data having a time width in which sound corresponding to the target is valid can be specified as partial video data.
[0025]
Here, as the audio data, for example, audio data emitted by a person or the like in the corresponding video data is used, and corresponds to the video data on the time axis, for example.
In addition, for data of a single person, for example, the time width during which voice considered to be emitted by that person continues without interruption, or continues apart from silent periods shorter than a predetermined threshold, can be regarded as the time width in which the audio data corresponding to that person's data is valid.
[0026]
Similarly, for data of a plurality of persons, for example, the time width during which at least one of those persons is considered to be speaking continues without interruption, or continues apart from silent periods shorter than a predetermined threshold, can be regarded as the time width in which the audio data corresponding to the data of those persons is valid.
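The rule in the two preceding paragraphs can be sketched as an interval merge: voiced intervals are joined whenever the silence between them is shorter than a threshold. The function name `valid_speech_spans`, the interval representation, and the assumption that voiced intervals have already been detected per person are all illustrative and not taken from the source.

```python
def valid_speech_spans(voiced, max_silence):
    """Merge voiced intervals separated by silence shorter than max_silence.

    voiced: list of (start, end) times in seconds for which speech was detected
            (detection itself is assumed to happen elsewhere).
    max_silence: the predetermined threshold, in seconds.
    Returns the merged spans in which the audio is treated as valid.
    """
    spans = []
    for start, end in sorted(voiced):
        if spans and start - spans[-1][1] < max_silence:
            # Gap is a short silence: extend the current span.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans


# Single person: the 0.3 s pause is bridged, the 2.0 s pause is not.
print(valid_speech_spans([(0.0, 2.0), (2.3, 4.0), (6.0, 7.0)], 0.5))

# Several persons: pool everyone's intervals so that "at least one person
# is speaking" counts as voiced, then apply the same rule.
person_a = [(0.0, 1.0)]
person_b = [(1.2, 2.0)]
print(valid_speech_spans(person_a + person_b, 0.5))
```

Each merged span then gives a candidate time width for partial video data.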
[0027]
In the video processing apparatus according to the present invention, the partial video data specifying means specifies the partial video data using data for specifying a region where the partial video data is located in the frame of the video data.
Therefore, for example, by using coordinate position data in a frame, it is possible to specify the image area in each frame constituting the partial video data and specify the partial video data.
[0028]
In the video processing apparatus according to the present invention, within the partial video data specifying means, the partial video data candidate specifying means specifies a plurality of partial video data candidates, the partial video data designation receiving means accepts from the user a designation of partial video data included in the specified candidates, and the partial video data so designated is taken as the specified partial video data.
Therefore, the video processing apparatus first specifies a plurality of partial video data candidates automatically, and the user then designates partial video data from among those candidates; the partial video data so designated becomes the finally specified partial video data.
[0029]
Here, as the number of partial video data candidates, various numbers may be used, for example, the number may be singular.
In addition, various methods may be used to specify partial video data candidates. For example, the data of each object present in a frame of the video data can be specified as a candidate.
Further, as the partial video data designation receiving means, for example, a keyboard or a mouse operated by the user can be used.
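The two-stage selection described above (automatic candidates, then a user designation by mouse) might be sketched as a simple hit test. The `Candidate` type, its rectangular regions, and the function `pick_candidate` are hypothetical; the source does not say how candidate regions are represented or how a click is resolved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """A partial video data candidate with a rectangular region (assumed shape)."""
    name: str
    x0: int
    y0: int
    x1: int
    y1: int


def pick_candidate(candidates: List[Candidate], x: int, y: int) -> Optional[Candidate]:
    """Return the first candidate whose rectangle contains the clicked point."""
    for c in candidates:
        if c.x0 <= x <= c.x1 and c.y0 <= y <= c.y1:
            return c
    return None


candidates = [
    Candidate("person", 10, 10, 50, 90),
    Candidate("document", 60, 40, 100, 80),
]
chosen = pick_candidate(candidates, 70, 50)
print(chosen.name if chosen else "no candidate at that point")
```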
[0030]
In the video processing apparatus according to the present invention, the related partial video data specifying unit specifies the partial video data from the data associated with the partial video data.
Therefore, for example, when the data associated with the partial video data is designated by the user, the partial video data associated with the data can be specified.
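One way to picture this reverse lookup is an index keyed by the associated data. The sketch below assumes links are available as pairs of a partial video identifier and a linked data name; both the pair format and the function name `build_reverse_index` are illustrative assumptions.

```python
from collections import defaultdict


def build_reverse_index(links):
    """Build a lookup from linked data name to the partial video data it is attached to.

    links: iterable of (partial_video_id, linked_data_name) pairs (assumed format).
    """
    index = defaultdict(list)
    for part_id, data_name in links:
        index[data_name].append(part_id)
    return index


links = [
    ("clip-1", "memo.txt"),
    ("clip-2", "slides.ppt"),
    ("clip-3", "memo.txt"),
]
index = build_reverse_index(links)
print(index["memo.txt"])
```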
[0031]
In the video processing apparatus according to the present invention, the related data presenting means presents data indicating the presence of data associated with the partial video data in a visual association with the partial video data in the video data.
Therefore, the presence of data associated with the partial video data can be presented in visual association with the partial video data, so that the user can visually grasp both the existence of the associated data and the association itself.
[0032]
Here, as data indicating the existence of data associated with the partial video data, for example, icon data can be used, and various data can be used as described later.
Further, various methods may be used to visually associate the data indicating the presence of associated data with the partial video data. For example, the two may be placed close to each other, or placed so that they partly overlap.
Further, as a presentation method, for example, a method of displaying and outputting on a screen or a method of printing and outputting on a paper surface can be used.
[0033]
In the video processing apparatus according to the present invention, the related data presenting means presents data having a shape based on the shape of the partial video data as data indicating the presence of data associated with the partial video data.
Therefore, by presenting data having a shape based on the shape of the partial video data, it is possible to make it easier for the user to visually grasp the association between the data and the partial video data.
[0034]
Here, various data may be used as the data having a shape based on the shape of the partial video data. For example, shadow data having a shape based on the shape of the partial video data may be used.
[0035]
In the video processing apparatus according to the present invention, the related data presenting means presents, as the data indicating the presence of data associated with the partial video data, data indicating the horizontal position and data indicating the vertical position of the partial video data within the frame, and presents them inside a border provided outside the frame of the video data.
Accordingly, the data indicating the presence of data associated with the partial video data is presented in a border provided outside the frame rather than within the frame of the video data, so that the image in the frame remains easy to view as it is. In addition, the presented data can indicate the horizontal and vertical position of the partial video data within the frame.
[0036]
Various borders may be used as the border provided outside the frame of the video data. For example, a border slightly larger than the frame of the video data is used, and video data is not presented inside the border itself.
In addition, the partial video data lies at the position where the vertical line through the indicated horizontal position intersects the horizontal line through the indicated vertical position.
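The geometry can be illustrated with a small sketch: a marker on the top or bottom border carries the horizontal position, a marker on the left or right border carries the vertical position, and their crossing point locates the partial video data. Using the center of the region's bounding box as the indicated position is an assumption made here; the source only says that horizontal and vertical positions are indicated.

```python
def border_markers(region):
    """Compute where to draw the two border markers for a region.

    region: (x0, y0, x1, y1) bounding box of the partial video data in frame
            coordinates. The center is used as the indicated position (assumption).
    """
    x0, y0, x1, y1 = region
    return {
        "horizontal": (x0 + x1) // 2,  # drawn on the top or bottom border
        "vertical": (y0 + y1) // 2,    # drawn on the left or right border
    }


def locate(markers):
    """Intersect the vertical line through the horizontal marker with the
    horizontal line through the vertical marker."""
    return markers["horizontal"], markers["vertical"]


markers = border_markers((40, 30, 120, 90))
print(markers, locate(markers))
```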
[0037]
In the video processing apparatus according to the present invention, the data indicating the presence of data associated with the partial video data is associated with a predetermined process. The presentation data designation receiving means accepts from the user a designation of the presented data (the data indicating the existence of data associated with the partial video data), and the presentation data corresponding process execution means executes the process associated with the designated data.
Therefore, the user can execute processing associated with the data by designating the presented data.
[0038]
Here, various processes may be used as the predetermined process. For example, a process that starts a program for handling a document related to the presented data, or for e-mail, the Internet, and so on, or a process that displays or transmits data related to the presented data, can be used. More specifically, examples include displaying data related to the presented data on the screen, sending that data by e-mail to a preset address, or conveying that data by voice over the telephone to a preset telephone number.
As the presentation data designation receiving means, for example, a keyboard or a mouse operated by the user can be used.
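The association between presented data and a predetermined process can be sketched as a dispatch table. The handler names, the `process` and `target` keys, and the returned strings are placeholders invented for this illustration; a real system would open a viewer, a mail client, or a telephony function instead of returning text.

```python
def show_document(target):
    return f"display {target}"


def send_mail(target):
    return f"send mail to {target}"


def place_call(target):
    return f"call {target}"


# Hypothetical mapping from a process kind stored with the presented data
# to the function that carries it out.
HANDLERS = {
    "document": show_document,
    "mail": send_mail,
    "phone": place_call,
}


def on_feedback_selected(feedback):
    """Run the process associated with the presented data the user designated."""
    handler = HANDLERS[feedback["process"]]
    return handler(feedback["target"])


print(on_feedback_selected({"process": "mail", "target": "user@example.com"}))
```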
[0039]
Further, in the video processing apparatus according to the present invention, it is possible to execute an operation related to the same video data by a plurality of terminal devices.
Accordingly, for example, not only the operation related to the same video data is performed by one terminal device (for example, one user) but also the operation related to the same video data is performed by a plurality of terminal devices (for example, a plurality of users). Thus, partial video data related to the same video data, data associated with the partial video data, and the like can be shared and edited together.
[0040]
Here, various devices may be used as the terminal device, and for example, a computer can be used.
Various numbers may be used as the number of terminal devices.
Various operations may be used as the operations relating to the same video data. For example, an operation for specifying partial video data from the video data, or an operation for associating data with the specified partial video data, can be used.
[0041]
As one configuration example, a plurality of terminal devices are communicably connected via a wired or wireless network, a common storage device accessible from those terminal devices is provided, and the data to be operated on is stored in that storage device.
[0042]
Further, in the video processing apparatus according to the present invention, the plurality of related data presenting means presents data indicating the existence of a plurality of data items associated with partial video data, which is a part of the video data specified from the video data, in visual association with that partial video data within the video data.
Accordingly, the presence of a plurality of data items associated with the partial video data can be presented in visual association with the partial video data, so that the user can visually grasp both the existence of those data items and the association.
[0043]
Here, various numbers may be used as the number of pieces of data associated with the partial video data.
Further, as the data indicating the presence of a plurality of data items associated with the partial video data, for example, data different from that used when a single data item is associated with the partial video data is used, or data representing the number of associated data items is used.
[0044]
In the video processing apparatus according to the present invention, the plurality of related data presenting means presents the same number of data as the number of the associated data as data indicating the presence of the plurality of data associated with the partial video data.
Therefore, the number of data associated with the partial video data can be presented so as to be visually grasped by the user.
[0045]
Here, as a preferred embodiment, the data items presented in the same number as the data items associated with the partial video data may have the same or similar shapes; alternatively, data items with different shapes may be used.
[0046]
Further, in the video processing apparatus according to the present invention, the plurality of related data presenting means presents data indicating the presence of each data associated with the partial video data in an identifiable manner for each associated data.
Therefore, for each piece of data associated with the partial video data, the data indicating the presence can be visually identified by the user.
[0047]
Here, the data indicating the existence of each data item associated with the partial video data can be made identifiable by, for example, varying the shape, color, brightness, or placement of that data from one associated data item to the next.
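As a rough sketch of making several indicators distinguishable, each associated data item can be given its own color and placement offset. The palette, the 4-pixel offset step, and the function name `feedback_styles` are arbitrary choices made for illustration and do not come from the source.

```python
def feedback_styles(n, palette=("red", "green", "blue", "orange")):
    """Return one style per associated data item.

    Colors cycle through the palette; the offset grows with the index, so
    indicators remain distinguishable by placement even when colors repeat.
    """
    styles = []
    for i in range(n):
        styles.append({
            "color": palette[i % len(palette)],
            "offset": (4 * (i + 1), 4 * (i + 1)),  # shift relative to the object
        })
    return styles


for style in feedback_styles(3):
    print(style)
```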
[0048]
The techniques of the present invention described above, namely presenting data that indicates the presence of a plurality of data items associated with the data of one image in visual association with that image data, presenting the same number of data items as the associated data items, and making them identifiable for each data item, are not necessarily limited to partial video data specified from video data; they can also be applied to various other kinds of image data.
[0049]
In addition, the present invention provides a video processing method for realizing various processes as described above.
For example, in the video processing method according to the present invention, partial video data that is a part of the video data is specified from the video data, and the data is associated with the specified partial video data so that the presence of the data can be presented.
In the video processing method according to the present invention, data indicating the presence of a plurality of data items associated with partial video data, which is a part of the video data specified from the video data, is presented in visual association with that partial video data within the video data.
[0050]
Further, the present invention provides a program that realizes various processes as described above. In the present invention, a storage medium storing such a program can also be provided.
For example, the program according to the present invention causes a computer to execute a process of specifying, from video data, partial video data that is a part of the video data, and a process of associating data with the specified partial video data so that the presence of the data can be presented.
Further, the program according to the present invention causes a computer to execute a process of presenting data indicating the presence of a plurality of data items associated with partial video data, which is a part of the video data specified from the video data, in visual association with that partial video data within the video data.
[0051]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments according to the present invention will be described with reference to the drawings.
First, a video processing apparatus and a video processing method according to the first embodiment of the present invention will be described.
FIG. 1 is a block diagram showing an example of a video processing apparatus according to the present invention. The video processing device 1 includes a storage unit 11, a link target area designation unit 12, a link generation unit 13, a video presentation unit 14, and a link management unit 15.
[0052]
The storage unit 11 is composed of a general storage device and holds the video data that is one target of a link (association) (hereinafter also simply referred to as video data), the link data (association data), and the linked data that is the other target of the link.
The link target area designating unit 12 is composed of a coordinate input device such as a mouse or a digitizer. It receives from the user the coordinate data of the area to be linked in the video data (hereinafter referred to as link target area coordinate data) and outputs the link target area coordinate data to the link generation unit 13.
[0053]
The link generation unit 13 receives the identifier or name of the linked data from the user through a dialog-type user interface. It then links the link target area coordinate data input from the link target area designating unit 12 with the linked data input from the user, and outputs the result to the storage unit 11 as link data.
The video presentation unit 14 includes a display, and presents the visualized link data and video data to the user.
The link management unit 15 manages and controls the storage unit 11, the link target area designating unit 12, the link generation unit 13, and the video presentation unit 14.
[0054]
In this example, video data means data that combines moving image data and audio data, or either one of moving image data and audio data. Partial video data means a temporal or spatial (regional) part of the video data.
Note that the video data referred to in the present invention includes, for example, data consisting only of images, as well as image data with which other data such as audio is associated.
[0055]
FIG. 2 is a detailed block diagram of the video processing apparatus of FIG. 1.
As shown in FIG. 2, the storage unit 11 includes a video storage device 21 and a link data storage device 26. The link target area designating unit 12 includes an (arbitrary) partial video data designating device 23 and a partial video data presentation device 24. The link generation unit 13 includes a link data adding device 25. The video presentation unit 14 includes a video data presentation device 22, the partial video data presentation device 24, and a link data presentation device 27.
[0056]
The video storage device 21 is configured by a general memory and holds input video data.
The video data presentation device 22 includes a display and presents video data held in the video storage device 21 to the user.
[0057]
The partial video data designating device 23 is constituted by a coordinate input device such as a mouse. It designates an arbitrary part of the video data presented by the video data presentation device 22 and transfers the designated partial video data to the partial video data presentation device 24.
The partial video data presentation device 24 presents the partial video data transferred from the partial video data designation device 23.
[0058]
The link data adding device 25 adds link data to the partial video data presented by the partial video data presenting device 24 and transfers it to the link data storage device 26.
The link data storage device 26 holds the link data added by the link data adding device 25 and the partial video data.
The link data presentation device 27 presents the link data added by the link data addition device 25 and the link data group.
[0059]
Here, extraction of arbitrary partial video data from video data will be described.
As forms of extracting partial video data from video data, there are, for example, a method in which the user manually designates the outline (contour) or circumscribed rectangle of the partial video data on the image through the user interface provided by the video processing device 1, and a method in which the user selects from candidates for partial video data that the video processing device 1 has extracted automatically.
[0060]
Here, a method of extracting partial video data when the video processing apparatus 1 automatically extracts partial video data candidates will be described.
Assume that the video data from which the partial video data is to be extracted is as shown in FIG. 3. That is, in certain frames of the video data (Video.mpg) 31 (in this example, the 31 frames with frame numbers 120 to 150), a person to be extracted as a partial video data candidate is recorded in a rectangular area on the frame ({(10, 30), (10, 10), (20, 10), (20, 30)} in xy orthogonal coordinates). The figure shows an x-coordinate axis representing the horizontal direction, a y-coordinate axis representing the vertical direction, and a time t axis representing the flow of time.
[0061]
As shown in FIG. 4, this partial video data extraction procedure consists of contour extraction processing in each frame (step S1), circumscribed rectangle calculation processing in each frame (step S2), inter-frame difference calculation processing (step S3), partial video data detection processing (step S4), and partial video data candidate presentation processing (step S5).
[0062]
Specifically, first, in the contour extraction process, the video processing device 1 extracts contours in each frame of the video data 31 in order to specify the rectangular area of the partial video data (step S1). A contour can be obtained by extracting the edges of the person's image with a so-called differential filter, as used in ordinary image processing, and connecting those edges. Even when the contour extraction divides a person into a plurality of small regions, a region (contour) for each person can be extracted by conventional region division and integration processing.
[0063]
Next, after the contour of each person has been extracted, the circumscribed rectangle 33 containing the contour is calculated in the circumscribed rectangle calculation process for each frame (step S2). In this example, the circumscribed rectangle 33 given by {(10, 30), (10, 10), (20, 10), (20, 30)} is calculated for the 31 frames with frame numbers 120 to 150.
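As a minimal sketch of the circumscribed rectangle calculation of step S2, the following assumes that the contour obtained in step S1 is available as a list of (x, y) points. The function name and the sample contour points are hypothetical and are not taken from the specification.

```python
def circumscribed_rectangle(contour):
    """Return the corners of the axis-aligned rectangle enclosing the contour.

    Corners are ordered as in the text:
    (x_min, y_max), (x_min, y_min), (x_max, y_min), (x_max, y_max).
    """
    if not contour:
        raise ValueError("contour must contain at least one point")
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return [(x_min, y_max), (x_min, y_min), (x_max, y_min), (x_max, y_max)]


# Hypothetical contour points of a person whose extent matches the example.
contour = [(10, 12), (15, 10), (20, 18), (14, 30), (12, 25)]
rect = circumscribed_rectangle(contour)
```

For this sample contour the result is the rectangle {(10, 30), (10, 10), (20, 10), (20, 30)} used in the example above.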
[0064]
Subsequently, in the inter-frame difference calculation process and the partial video data detection process, the frames are compared to check whether the person can be handled as a single object (partial video data) across frames (step S3, step S4). That is, by calculating the difference between successive frames, as is done in MPEG2 and the like, it is determined whether what is recorded in one frame and in the next frame is the same.
[0065]
Specifically, in the inter-frame difference calculation process (step S3), no person is recorded in the frame with frame number 119 whereas a person is recorded in the frame with frame number 120, so the frame difference between these two frames (for example, the sum of the differences of each pixel) has a large value. Similarly, the frame difference between the frame with frame number 150 and the frame with frame number 151 also has a large value. On the other hand, in the frames with frame numbers 120 to 150, a person is recorded in the same rectangular area 33, so the frame differences among those frames have small values.
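The frame difference of step S3 can be illustrated with the sum of absolute per-pixel differences mentioned above. This is only a sketch under the assumption that each frame is a small grayscale image held as a list of rows; the function name and the pixel values are hypothetical.

```python
def frame_difference(frame_a, frame_b):
    """Sum of absolute per-pixel differences between two same-sized frames."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same height")
    total = 0
    for row_a, row_b in zip(frame_a, frame_b):
        if len(row_a) != len(row_b):
            raise ValueError("frames must have the same width")
        total += sum(abs(a - b) for a, b in zip(row_a, row_b))
    return total


# Hypothetical 2x3 grayscale frames: one without a person, one with.
empty_frame = [[0, 0, 0], [0, 0, 0]]
person_frame = [[0, 200, 0], [0, 200, 0]]

appearing = frame_difference(empty_frame, person_frame)   # person appears
unchanged = frame_difference(person_frame, person_frame)  # person stays
```

A large value corresponds to the boundary between frames 119 and 120, and a small value to neighbouring frames within 120 to 150.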
[0066]
In the partial video data detection process (step S4), it is determined, from the frame difference values above and from whether the rectangular area 33 exists, that a candidate for partial video data is recorded in the frames with frame numbers 120 to 150.
Accordingly, in the partial video data candidate presentation process (step S5), the portion of the rectangular area 33 in these frames is presented to the user of the video processing apparatus 1 as a single piece of partial video data 32.
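The detection of step S4 can be sketched as a search for runs of consecutive frames in which the same rectangle exists and the frame difference stays small. The data layout (dictionaries keyed by frame number), the threshold, and the difference values below are assumptions made for illustration only.

```python
def detect_candidates(rects, diffs, threshold):
    """Find runs of frames that share a rectangle and differ little.

    rects: frame number -> rectangle (any comparable value) or None.
    diffs: frame number -> difference between that frame and the previous one.
    Returns a list of (first_frame, last_frame, rectangle).
    """
    candidates = []
    start = None   # first frame of the run being built, if any
    prev = None    # previously visited frame number
    for n in sorted(rects):
        rect = rects[n]
        continues = (
            start is not None
            and rect is not None
            and rect == rects[prev]
            and diffs.get(n, 0) <= threshold
        )
        if not continues:
            if start is not None:
                candidates.append((start, prev, rects[start]))
            start = n if rect is not None else None
        prev = n
    if start is not None:
        candidates.append((start, prev, rects[start]))
    return candidates


# Hypothetical input reproducing the example: a person in frames 120 to 150.
area = ((10, 30), (10, 10), (20, 10), (20, 30))
rects = {n: (area if 120 <= n <= 150 else None) for n in range(118, 153)}
diffs = {n: (4000 if n in (120, 151) else 10) for n in range(119, 153)}
found = detect_candidates(rects, diffs, threshold=100)
```

With this input, a single candidate covering frames 120 to 150 is found, which corresponds to the partial video data 32.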
[0067]
Next, as shown in FIG. 5, the processing procedure of the video processing apparatus 1 according to this example will be described.
This processing procedure consists of video presentation (step S11), partial video designation (step S12), partial video presentation (step S13), link data addition (step S14), and link data storage (step S15).
[0068]
First, in video presentation, the video data presentation device 22 presents video data held in the video storage device 21 of the video processing device 1 (step S11).
Next, in the designation of the partial video, so-called time code or frame number of the video data designated by the user and coordinate data are acquired using the partial video data designation device 23 (step S12).
[0069]
Subsequently, in the presentation of the partial video, the partial video data presentation device 24 presents the partial video data designated by the user (step S13).
In link data addition, the user adds related data (link data) to the partial video data presented by the partial video data presentation device 24, using the link data adding device 25 (step S14).
Finally, in link data storage, the link data storage device 26 holds the link data added by the user, the so-called time code or frame number of the video data, and the coordinate data (step S15).
[0070]
FIG. 6 shows the data structure of data stored in the link data storage device 26. FIG. 7 also shows an expanded data structure of data stored in the link data storage device 26.
The link data storage device 26 stores the time code 41 of the partial image data, the coordinate data 42 of an arbitrary location designated on the partial video data presentation device 24, the linked data 43 input by the link data adding device 25, a storage device name 44, and partial image icon data 45. In the expanded data structure, user data 46 is additionally stored so that collaborative work can be performed.
[0071]
For example, when link data is added to the still image of a certain frame, the time code of that point in the video data presented on the partial video data presentation device 24 is recorded as the time code 41. When a range is designated ("from here to here"), the start point and end point of the range to which the link data is added are recorded as the time code 41.
[0072]
The coordinate data 42 treats the partial video data presentation device 24 as a two-dimensional coordinate space and holds, as the link target area coordinates (x1, y1), (x1, y2), (x2, y2), (x2, y1), the location at which text data or a partial image icon is placed with an input device such as a mouse.
The linked data 43 holds comments, electronic data file information, text data, and file storage location information.
[0073]
Next, an operation procedure will be described using the user interface example of FIG. 8 according to this example.
The user designates a partial video to which link data is to be added from the video presented on the video data presentation screen 51, whereby the designated partial video is displayed on the partial video data presentation screen 52.
[0074]
At any location designated by the user on the partial video data presentation screen 52, comments consisting of text data and electronic data files can be added to the presented image from the link data addition screen 53, and a plurality of them can be shown as partial image icons 61a to 61c. In this case, as shown in FIGS. 6 and 7, the designated time at which the link data is added is stored as the time code 41, the location on the partial video data presentation screen 52 at which the user added the link data is stored as the coordinate data 42, and the added comment (text data) or electronic data file is held as the linked data 43. These three are held together as one data item.
[0075]
FIG. 9 shows a data structure after the link data is added to the partial video data.
FIG. 9 shows three pieces of link data associated at the time code (00:01:00.00): the comments (text data) "Question" and "This comment is a point", and the video data named "abc.mpg". As described above, since the data is held for each designated arbitrary partial video data, link data that has once been added can later be erased from the partial video data.
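The record layout of FIGS. 6, 7 and 9 might be modelled as follows. This is a sketch only: the class name, field names, and the coordinate values are hypothetical, and only the reference numerals in the comments and the three sample link data items come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class LinkRecord:
    time_code: str                       # time code 41 (point or start point)
    coordinates: List[Point]             # coordinate data 42
    linked_data: str                     # linked data 43: comment or file location
    storage_device_name: str = ""        # storage device name 44
    icon: Optional[bytes] = None         # partial image icon data 45
    user: Optional[str] = None           # user data 46 (expanded structure only)
    time_code_end: Optional[str] = None  # end point of a "from here to here" range


def rectangle(x1: int, y1: int, x2: int, y2: int) -> List[Point]:
    """Corners in the order used in the text."""
    return [(x1, y1), (x1, y2), (x2, y2), (x2, y1)]


# The three link data items described for FIG. 9; the coordinates are made up.
area = rectangle(10, 10, 20, 30)
records = [
    LinkRecord("00:01:00.00", area, "Question"),
    LinkRecord("00:01:00.00", area, "This comment is a point"),
    LinkRecord("00:01:00.00", area, "abc.mpg"),
]

# Each item is a separate record, so one can be erased without touching the rest.
records = [r for r in records if r.linked_data != "Question"]
```

Holding one record per link data item is what makes it straightforward to erase a single item later.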
[0076]
In addition, a storage device can be designated as the storage destination. This is used when video data with link data added is stored on a public server or a private server, as indicated by the storage device name 44, and when a plurality of users cooperatively add link data to the same video data.
[0077]
Furthermore, as shown in FIG. 8, when link data is added as related data to the same arbitrary area designated on the partial video data presentation screen 52, link data such as text comments and related electronic data can be superimposed. Here, the icons and comments do not merely overlap at the same coordinate position; the related link data can also be registered as a group. The link data marked with "*" in FIG. 9 is held as grouped information.
[0078]
As shown in FIG. 10, when link data is added to a person object 71 or a place object 72 on the partial video data presentation screen 52, a message can be sent at the time the link data is added to the object 71 or 72.
For example, when link data is added to the person object 71, the person message sending link data 73 is used. With this, in order to ask a certain participant a question about video data of a meeting that has been held, the user adds a comment and an e-mail address to that participant as displayed on the partial video data presentation screen, and a message can then be sent to the participant. When the message is sent, not only the comment but also the partial video data to which the user added the link data can be sent. As a result, the recipient can grasp simply and appropriately in what context the question arose, the designated time, and the situation at that point.
[0079]
When link data is added to a place object, the place space message sending link data 74 is used. The assumed usage is as follows. If designated partial video data holds an important person's comment and the user wants to use that information in a future meeting, the data is sent to the meeting place using a message service such as e-mail. At the time of use, the data is opened on a terminal at that place or on the user's own terminal.
[0080]
On the linked data presentation screen 54, a plurality of partial video data presentation screens 62a to 62e, each with link data added by the user in the video data, are presented. Further, as the link data group presented on the linked data presentation screen 54, not only link data extracted from the video data but also link data from sources other than the video data can be designated.
[0081]
Next, a procedure for adding video data and link data between a plurality of users via a network will be described.
FIG. 11 shows an example of a device and a user interface that are mainly used when used among a plurality of users.
User A and user B take out video data from the video storage device 21 and designate arbitrary partial video data to which link data is added.
[0082]
In FIG. 11, for the same partial video data, user A uses the link data addition input dialog (link data addition screen) 53 to add one piece of link data, "this person is Mr. X", and user B uses the link data addition input dialog 53 to add two pieces of link data, "related video of this conversation" (text data) and "xyz.mpg" (video data). These data are held in the link data storage device 26. The data structure is shown in FIG. 12; the time codes and coordinate data of user A and user B are held separately. FIG. 13 shows an image in which partial image icons 81a and 83a to 83c, representing the link data added by user A and user B, are presented simultaneously. The linked object data presentation screen 53 of each of users A and B shows partial image data 82a to 82c and 84a to 84d with their respective link data.
[0083]
In addition, asynchronous collaborative work is possible: user A designates in advance the partial video data to which link data is added and later informs user B of the location of that partial video data by e-mail or the like. Furthermore, partial video data and link data created in advance can be re-edited by a single user or by a plurality of users accessing the link data storage device 26.
[0084]
Further, so that link data can be added, held, and presented by a single user or among a plurality of users, a configuration such as that shown in FIG. 14 can be used. In this configuration, the terminal device of user A, the terminal device of user B, the link data storage device 98 shared among the users, and the video storage device 97 shared among the users are connected through a network. The devices of users A and B include, respectively, video data presentation devices 91a and 91b, (arbitrary) partial video data designating devices 92a and 92b, partial video data presentation devices 93a and 93b, link data adding devices 94a and 94b, link data presentation devices 95a and 95b, and link data storage devices 96a and 96b.
[0085]
With reference to FIGS. 15 and 16, a case will be described in which synthesized video data is created by reusing previously created video data 1 and video data 2, each composed of video data and linked data.
FIG. 15 shows a state in which the link target data 102 is linked to the video data 101.
[0086]
FIG. 16 shows an example in which video data is reused and edited using the video processing apparatus 1 of this example.
Before a meeting is held, the meeting organizer or the like accesses the previously created video data 1 and video data 2 related to the meeting in order to understand the course of events so far and share it among the participants. While browsing the meeting minutes and materials that constitute the individual linked target data 114a, 115a, 115b, 116a to 116c, 124a, 125a to 125c, and 126a, the organizer can take out the most relevant video data from the plurality of video data 111 to 113 and 121 to 123, edit it (for example, by rearranging it), and produce synthesized video data.
[0087]
Next, a process of automatically extracting a video frame that is a target of link data from video data will be described.
As described above, the link data adding device 25 lets the user designate the partial video data. In addition, it analyzes the video objects and audio data in the video data that follows the designated partial video data. It extracts partial audio data estimated to be an utterance by the same person as in the frame of the partial video data for which the link data was designated, or partial audio data estimated to have the same content in a dialogue among a plurality of persons, and adds the link data to the video data (partial video data) corresponding to the extracted partial audio data.
[0088]
For example, FIG. 17 is an example of extracting the start point and the end point of speech estimation of the same person.
In this case, among a plurality of frames F1 to F7 that are continuous along the time t axis, the span from the frame to which the link target data is to be added (for example, frame F1 or frame F4) up to the following frame at which its audio data is interrupted (for example, frame F2 or frame F7) is presented as the utterance estimation points T1 and T2, and the link target data is added to the video frames corresponding to the start point and end point of the audio data.
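The estimation of the start and end points of an utterance can be sketched as follows, assuming that a per-frame flag indicating whether audio is present has already been obtained by some audio analysis that the description does not specify. The function name and the flag values are hypothetical; the flags are chosen so that frames F1 to F2 and F4 to F7 form utterances, as in the example of FIG. 17.

```python
def utterance_span(voiced, selected):
    """Extend from the selected frame to where the audio is interrupted.

    voiced: list of booleans, one per frame (True if audio is present).
    selected: index of the frame the user wants to add link target data to.
    Returns (start_index, end_index), or None if the frame has no audio.
    """
    if not voiced[selected]:
        return None
    start = selected
    while start > 0 and voiced[start - 1]:
        start -= 1
    end = selected
    while end < len(voiced) - 1 and voiced[end + 1]:
        end += 1
    return start, end


# Index 0 is frame F1, index 6 is frame F7 (assumed flags).
voiced = [True, True, False, True, True, True, True]
span_t1 = utterance_span(voiced, 0)  # frame F1 selected
span_t2 = utterance_span(voiced, 3)  # frame F4 selected
```

Here span_t1 covers F1 to F2 and span_t2 covers F4 to F7, corresponding to T1 and T2.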
[0089]
FIG. 18 is an example of extracting the start point and the end point of the dialogue estimation among a plurality of persons. In this case as well, the dialogue estimation points T11 to T14 are extracted from a plurality of frames F11 to F17 that are continuous along the time t axis, as in the case of FIG. 17. In this example, however, as shown in FIG. 19, if the interval between utterances T21 and T22 occurring during the dialogue is Δt, the utterances are estimated to belong to the same dialogue when Δt is shorter than a certain interval.
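The Δt rule of FIG. 19 amounts to merging utterances whose gap is shorter than a given interval. A minimal sketch, assuming utterances are given as (start, end) times in seconds; the function name, the sample times, and the 1.0 second limit are hypothetical.

```python
def merge_utterances(utterances, max_gap):
    """Merge utterances separated by a gap shorter than max_gap.

    utterances: list of (start, end) times.
    Returns the estimated dialogue parts as a list of (start, end).
    """
    merged = []
    for start, end in sorted(utterances):
        if merged and start - merged[-1][1] < max_gap:
            # Gap (delta t) is short: treat as the same dialogue part.
            last_start, last_end = merged[-1]
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return merged


# Two utterances 0.5 s apart, then a third after a 6 s pause (assumed values).
dialogues = merge_utterances([(0.0, 4.0), (4.5, 9.0), (15.0, 18.0)], max_gap=1.0)
```

The first two utterances are merged into one dialogue part, while the third remains separate.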
[0090]
Next, a video processing apparatus and a video processing method according to the second embodiment of the present invention will be described.
The schematic configuration and operation of the video processing apparatus 1 of this example are, for example, the same as those of the first embodiment; only the parts that differ are described in detail here.
[0091]
FIG. 20 is a block diagram showing an example of the data structure of the link data according to this example.
The link data of this example consists of an identifier 131, a video data file name 132, a frame start number 133, a frame end number 134, link target area coordinates 135, a linked target data name (for example, a URL) 136, and visual feedback data 137.
[0092]
The identifier 131 is data for distinguishing the link data itself, and is assigned by the link management unit 15 for each link data.
The video data file name 132 identifies video data to be linked.
The frame start number 133 is a start number of a frame to be linked with the video data.
The frame end number 134 is the end number of the frame to be linked with the video data.
[0093]
The link target area coordinates 135 are coordinate data to be linked in the video data designated by the user.
The linked target data name 136 is a name of data linked to the video data.
The visual feedback data 137 is data used for visually giving feedback to the user that there is a link to the video data.
[0094]
Here, the identifier 131 is set by the link management unit 15.
The video data name 132, the frame start number 133, the frame end number 134, and the link target area coordinates 135 are input from the user by the link target area specifying unit 12.
The link target data name 136 is input from the user by the link generation unit 13 using a dialog-type user interface.
The visual feedback data 137 is generated by the link generation unit 13.
[0095]
FIG. 21 is a diagram showing a main user interface according to the present example.
The main user interface 141 includes a video presentation screen 142, a video playback button 143, a video stop button 144, a link start button 145, a link end button 146, and a link target data name input dialog 147.
[0096]
The video presentation screen 142 presents video data held in the storage unit 11 to the user.
The video playback button 143 makes it possible to start playback of the video data when the user clicks with the mouse or the like.
The video stop button 144 makes it possible to stop playback of the video data when the user clicks it with the mouse or the like.
[0097]
The link start button 145 allows the user to specify the start frame of the video data being reproduced to be linked by clicking with the mouse or the like.
The link end button 146 allows the user to specify the end frame of the video data being reproduced to be linked by clicking with the mouse or the like.
The linked target data name input dialog 147 allows the user to input the linked target data name to be linked to the video data through the dialog.
[0098]
FIG. 22 is a flowchart showing an example of the linking process of the video processing apparatus 1 of this example.
As shown in FIG. 22, the linking process includes an initialization process (step S21), a video reproduction detection process (step S22), a link start detection process (step S23), a link end detection process (step S24), and a link target area. It consists of definition processing (step S25), linked target input processing (step S26), link generation processing (step S27), link presentation processing (step S28), and video stop detection processing (step S29).
[0099]
Next, the processing procedure of the video processing apparatus 1 of this example will be described using the flowchart of FIG.
First, in the initialization process, the storage unit 11, the link target area designating unit 12, the link generation unit 13, the video presentation unit 14, and the link management unit 15 of the video processing device 1 are initialized (step S21).
[0100]
That is, first, link data is generated and initialized by the link management unit 15. Specifically, the file name of the video data to be used held in the storage unit 11 is set as the value of the video data name 132 in the link data by using dialog input or the like. As the link data identifier 131, an identifier unique to the video processing apparatus 1 is set by the link management unit 15. The link management unit 15 sets values such as 0 as default values for the frame start number 133 and the frame end number 134 of the link data. Similarly, the link management unit 15 sets predetermined values for the link target area coordinates 135 of the link data, the linked target data name 136, and the visual feedback data 137. The link data generated by the link management unit 15 is held by the storage unit 11.
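The link data of FIG. 20 and its initialization in step S21 might be modelled as below. This is a sketch under assumptions: the class and function names are hypothetical, a simple counter stands in for whatever identifier scheme the link management unit 15 uses, and the defaults other than 0 for the frame numbers are placeholders for the unspecified "predetermined values".

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional, Tuple

_next_identifier = count(1)  # assumed: a counter gives device-unique identifiers


@dataclass
class LinkData:
    identifier: int                     # identifier 131
    video_data_file_name: str           # video data file name 132
    frame_start: int = 0                # frame start number 133
    frame_end: int = 0                  # frame end number 134
    link_target_area: Optional[Tuple[int, int, int, int]] = None  # coordinates 135
    linked_target_data_name: str = ""   # linked target data name 136 (e.g. a URL)
    visual_feedback: Optional[dict] = None  # visual feedback data 137


def new_link_data(video_data_file_name: str) -> LinkData:
    """Create link data with a fresh identifier and default values."""
    return LinkData(next(_next_identifier), video_data_file_name)


first = new_link_data("Video.mpg")
second = new_link_data("Video.mpg")
```

The remaining fields are filled in later by steps S23 to S27.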
[0101]
Next, in the video playback detection process, playback of the video data designated by the user is started when a click of the video playback button 143 by the user with the mouse or a tablet is detected (step S22).
Subsequently, in the link start detection process, when a click of the link start button 145 by the user is detected, the link management unit 15 determines the frame start number that defines the link area in the video data and sets it as the value of the frame start number 133 of the link data (step S23).
[0102]
Subsequently, in the link end detection process, when a click of the link end button 146 by the user is detected, the link management unit 15 determines the frame end number that defines the link area in the video data and sets it as the value of the frame end number 134 of the link data (step S24). At this point, the link management unit 15 temporarily stops playback of the video data.
[0103]
In the link target area definition process, the link management unit 15 first notifies the user that a link target area can now be defined, by superimposing a message to that effect on the video data presented on the video presentation screen 142. Next, the link target area designating unit 12 acquires, through the user's designation with the mouse, the coordinate data of the area to be linked in the video data presented on the video presentation screen 142. At this time, the link target area designating unit 12 gives the user visual feedback, for example by surrounding the designated area with a white line. The link target area designating unit 12 then sets the coordinate data defining the link target area acquired from the user (hereinafter also referred to as link target area defining coordinate data) as the value of the link target area coordinates 135 of the link data stored in the storage unit 11 (step S25).
[0104]
In the linked target input process, the link management unit 15 obtains the linked target data name that the user specifies in the linked target data name input dialog 147 and sets it as the value of the linked target data name 136 of the link data stored in the storage unit 11 (step S26).
In the link generation process, the link generation unit 13 uses the value of the link target area coordinates 135 of the link data held in the storage unit 11 and the video data corresponding to the frame start number 133 through the frame end number 134 to generate image data and related coordinate data for giving visual feedback to the user. The image data and related coordinate data are set as the visual feedback data 137 of the link data (step S27).
[0105]
In the link presentation process, the image data related to the video data is superimposed and presented on the video presentation screen 142 using the related coordinate data of the visual feedback data 137 of the link data (step S28).
In the video stop detection process, it is detected whether or not the user has clicked the video stop button with the mouse. If the user has clicked, the presentation of the video data is stopped and the linking process is terminated (step S29). On the other hand, if the user has not clicked, the process after the link start detection process is performed again (steps S23 to S29).
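The flow of steps S22 to S29 can be sketched as a loop over user interface events. Everything here is an illustrative assumption: the event names, the dictionary used for the link data, and the sample URL are hypothetical, and the generation and presentation of visual feedback (steps S27 and S28) are reduced to a comment.

```python
def run_linking(events, video_file):
    """Process (kind, value) events and return the completed link data."""
    links = []
    current = None   # link data currently being defined, if any
    playing = False
    for kind, value in events:
        if kind == "play":                # S22: playback button clicked
            playing = True
        elif not playing:
            continue                      # ignore input until playback starts
        elif kind == "stop":              # S29: stop button ends the process
            break
        elif kind == "link_start":        # S23: value is the current frame number
            current = {"video": video_file, "frame_start": value}
        elif current is None:
            continue                      # no link is being defined
        elif kind == "link_end":          # S24
            current["frame_end"] = value
        elif kind == "area":              # S25: link target area coordinates
            current["area"] = value
        elif kind == "target":            # S26: linked target data name
            current["target"] = value
            # S27 and S28 would generate and present visual feedback here.
            links.append(current)
            current = None
    return links


events = [
    ("play", None),
    ("link_start", 120),
    ("link_end", 150),
    ("area", (10, 10, 20, 30)),
    ("target", "http://example.com/memo.txt"),  # hypothetical URL
    ("stop", None),
    ("link_start", 200),                        # ignored: after stop
]
links = run_linking(events, "Video.mpg")
```

Until the stop event arrives, the loop returns to link start detection, so several links can be defined during one playback.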
[0106]
Here, the link target area definition process (step S25) and the link generation process (step S27) will be described in detail with reference to FIGS. 23 to 28.
FIG. 23 shows an example of a video object (logo “Y”) 151 linked as partial video data.
FIG. 23 shows video data (still image data in each frame) corresponding to the frame start number 133 to the frame end number 134 held in the storage unit 11.
[0107]
FIG. 24 shows the video object 151, presented to the user on the video presentation screen 142, being selected with a frame 152 by the user operating the mouse.
FIG. 25 shows a diagram of a video object 153 in which the video object 151 is tilted obliquely by image processing.
FIG. 26 shows shadow data 154 generated by applying edge extraction (boundary extraction) and color conversion to the slanted video object 153 of FIG. 25.
[0108]
FIG. 27 shows a diagram in which the original video object 151 in FIG. 23 and the shadow data 154 in FIG. 26 are combined.
FIG. 28 shows the area to be presented to the user on the video presentation screen 142, extracted from the data of FIG. 27.
[0109]
In the link target area definition process (step S25), as described above, the user is first notified that a link can be made to the video object, for example by changing the color of the frame of the video presentation screen 142 or the color of the link start button 145.
Next, the user operates the mouse while referring to the video object 151 of FIG. 23 presented on the video presentation screen 142 and selects the video object to be linked (here, the "Y" logo) 151. The selection result is indicated by the frame 152 shown in FIG. 24.
[0110]
The coordinates representing the frame 152 (for example, the coordinates of the upper left corner and the lower right corner) are set as the link target area coordinates 135 of the link data held in the storage unit 11 as the link target area definition coordinate data.
Subsequently, in the link generation process (step S27), image processing such as projective transformation is applied to the image of FIG. 24 so that the result (the slanted video object 153 of FIG. 25) can be distinguished from the original video object 151 of FIG. 23. Furthermore, the shadow data 154 is generated by extracting the contour of the slanted video object 153 with a differential filter, determining the boundary between the video object 153 and the background, and converting the color of the area of the video object 153.
[0111]
Further, the image of FIG. 27 is obtained by combining the original video object 151 of FIG. 23 and the generated shadow data 154.
Finally, the visual feedback data 137 is generated by clipping the region to be presented to the user on the video presentation screen 142. The coordinate values of the boundary between the clipped shadow data 155 and the background or the original video object 151 are set, as related coordinate data, in the visual feedback data 137 held in the storage unit 11, together with the shadow data 154. Subsequently, in the link presentation process (step S28), the video of FIG. 28 is presented on the video presentation screen 142.
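The sequence of FIGS. 25 to 28 (tilt the object, turn it into a shadow, combine it with the original, and clip) can be illustrated on a binary mask. This sketch makes simplifying assumptions that differ from the description: a horizontal shear stands in for the projective transformation, and the object area is given directly as a mask instead of being found with a differential filter. All names and values are hypothetical.

```python
def shear_mask(mask, slope):
    """Shift each row to the right in proportion to its height above the bottom.

    Pixels shifted outside the original width are dropped, which plays the
    role of clipping to the presented area.
    """
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y, row in enumerate(mask):
        shift = int(slope * (height - 1 - y))
        for x, value in enumerate(row):
            if value and 0 <= x + shift < width:
                out[y][x + shift] = 1
    return out


def visible_shadow(mask, slope):
    """Shadow pixels that are not hidden behind the original object."""
    sheared = shear_mask(mask, slope)
    return [
        [1 if s and not m else 0 for s, m in zip(s_row, m_row)]
        for s_row, m_row in zip(sheared, mask)
    ]


# A 3x4 mask with the object occupying the two left columns (assumed data).
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
shadow = visible_shadow(mask, slope=1)
```

The pixels set in shadow would be drawn in the shadow color, and their coordinates would correspond to the related coordinate data.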
[0112]
Here, the link generation processing (step S27) when a plurality of links are made to one video object will be described.
When a plurality of links are made to one video object, a plurality of video objects inclined at different angles are generated as shown in FIG. 29, and shadow data 156a and 156b are generated with different shadow colors so that the user can distinguish each link. The shadow data 156a and 156b are superimposed on the original video object 151 of FIG. 23 as shown in FIG. 30 and then clipped as shown in FIG. 31, and the visual feedback data 137 is generated from the clipped images 157a and 157b.
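One way to give each link on the same video object its own angle and color is sketched below. The palette, the angle values, and the name `shadow_styles` are hypothetical and are not taken from this specification.

```python
from itertools import cycle

SHADOW_COLORS = ("gray", "blue", "green")  # hypothetical palette


def shadow_styles(link_ids, base_angle=15.0, step=15.0):
    """Assign each link on one video object its own slant angle and shadow color."""
    colors = cycle(SHADOW_COLORS)
    return {
        link_id: {"angle_deg": base_angle + i * step, "color": next(colors)}
        for i, link_id in enumerate(link_ids)
    }
```

In this sketch the colors repeat once the palette is exhausted, while the angles keep increasing, so two links never share both attributes.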
[0113]
Next, a description will be given of the state of the user interface when the user instructs the link target presentation by designating the shadow data presented by visual feedback with the mouse.
First, when a link is associated with the video object presented on the video presentation screen 142, the shadow data is superimposed and displayed as described above. When the user clicks the shadow data with the mouse, the link management unit 15 uses the identifier 131, the video data name 132, the frame start number 133, the frame end number 134, and the visual feedback data 137 in the link data to determine whether the clicked position matches or is included in the link data. If it matches or is included, the value of the linked data name 136 is displayed in the linked data name input dialog 147 so that the user can access the link target data. Alternatively, the contents of the data indicated by the linked data name 136 are presented in another window or display (for example, the video presentation screen 142 is split and the contents are displayed on one of the resulting screens).
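The match-or-included test can be sketched as follows. This is a simplified illustration: the class `LinkData`, the function `find_linked_data`, and the rectangular `feedback_box` that stands in for the visual feedback data 137 are assumptions, since the specification does not fix a concrete data layout here.

```python
from dataclasses import dataclass


@dataclass
class LinkData:
    identifier: str        # 131
    video_data_name: str   # 132
    frame_start: int       # 133
    frame_end: int         # 134
    linked_data_name: str  # 136
    feedback_box: tuple    # stands in for 137: (x0, y0, x1, y1), assumed rectangular


def find_linked_data(links, video_name, frame, x, y):
    """Return the linked data name for a click, or None if no shadow was hit."""
    for link in links:
        x0, y0, x1, y1 = link.feedback_box
        if (
            link.video_data_name == video_name
            and link.frame_start <= frame <= link.frame_end
            and x0 <= x <= x1
            and y0 <= y <= y1
        ):
            return link.linked_data_name
    return None
```

A click outside the frame range or outside the shadow region returns `None`, in which case nothing would be shown in the dialog.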
[0114]
In the above description, the video data and the link target data have been assumed to be stored in the storage unit 11 of the same video processing apparatus 1. However, the video data or the link target data may instead reside, for example, at a location connected to the video processing apparatus 1 via a network and be accessed from there. In this case, the video data name 132 or the linked data name 136 in FIG. 20 can be configured as a so-called URL that represents the access destination of the video data or of the link target data, respectively.
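Whether a stored name refers to local data or to a network location could be decided as in the sketch below. The function name `is_remote`, the set of accepted schemes, and the example URL are assumptions made for illustration.

```python
from urllib.parse import urlparse


def is_remote(name):
    """True if a data name carries a network scheme, False for a local name."""
    return urlparse(name).scheme in ("http", "https", "ftp")


print(is_remote("Video.mpg"))
print(is_remote("http://example.com/Video.mpg"))
```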
[0115]
Further, although the description has assumed that the video data and the link target data are held in the storage unit 11 of the same video processing apparatus 1, the functions of the respective units of the video processing apparatus 1 described above can also be distributed between a client 161 and a server 162 connected via a network 163, as shown in FIG. 32, and made to cooperate. For example, as shown in FIG. 32, the link generation unit 173 may be arranged in the server 162, while the other processing units, namely the storage unit 171, the link target designation unit 172, the video presentation unit 174, and the link management unit 175, are arranged in the client 161.
[0116]
FIG. 33 shows an example of the format of link data transmitted to the network.
As shown in FIG. 33, the link data structure of FIG. 20 is converted into, for example, the so-called XML format and transmitted over the network, so that link data can be transferred and used when the client 161 and the server 162 are connected via the network 163 as shown in FIG. 32.
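A conversion of this kind can be sketched with Python's standard `xml.etree.ElementTree` module. Only the element names `<audiovisual-data>` and `<resource-name>` appear in this description; the root element `link`, its `id` attribute, and the `frames` element are hypothetical, as is the function name.

```python
import xml.etree.ElementTree as ET


def link_to_xml(identifier, video_name, frame_start, frame_end, linked_name):
    """Serialize one link data record to an XML string."""
    root = ET.Element("link", {"id": identifier})        # hypothetical root and attribute
    ET.SubElement(root, "audiovisual-data").text = video_name
    frames = ET.SubElement(root, "frames")                # hypothetical element
    frames.set("start", str(frame_start))
    frames.set("end", str(frame_end))
    ET.SubElement(root, "resource-name").text = linked_name
    return ET.tostring(root, encoding="unicode")


print(link_to_xml("LINK001", "Video.mpg", 120, 150, "Annotation.txt"))
```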
[0117]
Similarly, FIG. 34 shows another example of the format of link data transmitted to the network. FIG. 34 shows a state in which link data is designated as linked data.
Specifically, the link data identifier LINK001 is set in the <resource-name> element. When link data is designated as linked data in this way, the link management unit 175 interprets the XML-format link data in FIG. 34 (the link data with the identifier LINK003) and acquires the link data LINK001 as its linked data. The link management unit 175 then interprets the XML-format link data with the identifier LINK001 and detects that Video.mpg is set in the <audiovisual-data> element and that Annotation.txt is set in the <resource-name> element.
[0118]
Subsequently, the link management unit 175 lets the user select whether to use the Video.mpg data or the Annotation.txt data, and presents the selected data on the video presentation screen 142. If a link data identifier is in turn set in that <resource-name> element, the same operation is repeated to follow the link. In this way, link data can be reused by setting a link data identifier in the linked data name 136 or the <resource-name> element. Link data can also be reused by transferring the XML-format link data by e-mail or the like and using it in the video processing apparatus 1 of the user who receives it.
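Following a chain of link data identifiers can be sketched as below. The XML layout, the `link` root element, the placeholder name `Video2.mpg` for LINK003, and the cycle check are assumptions added for illustration; only LINK001, LINK003, Video.mpg, Annotation.txt and the two element names come from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents keyed by link data identifier.
DOCS = {
    "LINK001": "<link><audiovisual-data>Video.mpg</audiovisual-data>"
               "<resource-name>Annotation.txt</resource-name></link>",
    "LINK003": "<link><audiovisual-data>Video2.mpg</audiovisual-data>"
               "<resource-name>LINK001</resource-name></link>",
}


def resolve(link_id, docs, seen=None):
    """Follow <resource-name> while it names another link; return the final choices."""
    seen = set() if seen is None else seen
    if link_id in seen:
        raise ValueError("cyclic link chain")
    seen.add(link_id)
    root = ET.fromstring(docs[link_id])
    target = root.findtext("resource-name")
    if target in docs:
        return resolve(target, docs, seen)
    return [root.findtext("audiovisual-data"), target]


print(resolve("LINK003", DOCS))
```

The returned pair corresponds to the two items between which the user is asked to choose.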
[0119]
Next, means and steps for identifying any partial video data linked from electronic data such as linked text data, audio data, or video data will be described.
Here, link data with the following values is assumed.
That is, the value “LINK001” is set in the link identifier 131, “Video.mpg” in the video data name 132, “120” in the frame start number 133, “150” in the frame end number 134, “{(10,30), (10,10), (20,10), (20,30)}” in the link target area coordinates 135, “Annotation.txt” in the linked data name 136, and “Visual.dat” in the visual feedback data 137.
[0120]
When the user specifies arbitrary partial video data from linked electronic data such as text data, audio data, or video data, the user first enters the desired linked data name (the name of the electronic data) in the linked data name input dialog 147. That is, the user enters “Annotation.txt” in the linked data name input dialog 147. When the linked data name is entered, the link management unit 15 of the video processing apparatus 1 searches the link data held in the storage unit 11 and retrieves the link data whose linked data name 136 matches the entered value.
[0121]
Next, the link management unit 15 refers to the video data name 132 of the link data and acquires video data that matches the video data name from the storage unit 11. That is, the link management unit 15 refers to the value “Video.mpg” of the video data name 132 in the link data, and acquires the video data matching the “Video.mpg” from the storage unit 11.
[0122]
Subsequently, the link management unit 15 refers to the values “120” and “150” of the frame start number 133 and the frame end number 134 in the link data, and extracts the corresponding frames from the video data, that is, frames 120 to 150. Furthermore, the link management unit 15 refers to the link target area coordinates 135 in the link data and acquires the value “{(10,30), (10,10), (20,10), (20,30)}”.
[0123]
The link management unit 15 then places the data corresponding to the visual feedback data 137 of the link data in the area of each extracted frame enclosed by the link target area coordinates, here (10,30), (10,10), (20,10), and (20,30), and presents it on the video presentation screen 142.
The user can therefore identify the area, indicated by the visual feedback, in the video data “Video.mpg” that corresponds to the linked data name “Annotation.txt”.
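The reverse lookup described in paragraphs [0120] to [0123] can be summarized in a short sketch. The dictionary keys and the function name `locate_partial_video` are hypothetical; the values are those of the example link data above.

```python
LINKS = [
    {
        "identifier": "LINK001",
        "video_data_name": "Video.mpg",
        "frame_start": 120,
        "frame_end": 150,
        "area": [(10, 30), (10, 10), (20, 10), (20, 30)],
        "linked_data_name": "Annotation.txt",
        "visual_feedback": "Visual.dat",
    }
]


def locate_partial_video(links, linked_data_name):
    """Yield (video name, frame number, area, feedback) for every linked frame."""
    for link in links:
        if link["linked_data_name"] != linked_data_name:
            continue
        for frame in range(link["frame_start"], link["frame_end"] + 1):
            yield link["video_data_name"], frame, link["area"], link["visual_feedback"]


for video, frame, area, feedback in locate_partial_video(LINKS, "Annotation.txt"):
    pass  # a real system would overlay `feedback` on `area` in each frame
```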
[0124]
Next, the means and steps will be described for transferring linked electronic data, such as text data, audio data, or video data, to an electronic bulletin board system or to a call or communication system such as a telephone, and thereby delivering the electronic data to a target related to the linked partial video data.
[0125]
FIG. 36 is a block diagram showing an example of an extended video processing device 181 that, like the video processing apparatus 1 of FIG. 1, includes a storage unit 191, a link target area designating unit 192, a link generation unit 193, a video presentation unit 194, and a link management unit 195, and to which a link data transfer unit 196 and a telephone call unit 197 are added.
[0126]
The link data transfer unit 196 is composed of a CPU and a buffer storage device. It receives the link target data to be transferred from the storage unit 191 and passes it to the telephone call unit 197.
The telephone call unit 197 is a subsystem having an ordinary telephone call function, and transmits the link target data received from the link data transfer unit 196 to an external telephone.
[0127]
Here, link data with the following values is assumed. That is, the value “LINK002” is set in the link identifier 131, “Video.mpg” in the video data name 132, “120” in the frame start number 133, “150” in the frame end number 134, “{(10,30), (10,10), (20,10), (20,30)}” in the link target area coordinates 135, “Voice.dat” in the linked data name 136, and “Visual2.dat” in the visual feedback data 137. Voice.dat, which is voice data, is held in the storage unit 191, and the telephone number “0120-123-4567” of the call destination corresponding to the voice data “Voice.dat” is likewise held in the storage unit 191 in association with “Voice.dat”.
[0128]
It is assumed that the user uses the mouse to select, via the visual feedback presented on the video presentation screen 142, the link data identified by the identifier “LINK002”.
The link management unit 195 refers to the link data held in the storage unit 191 and, on determining that the linked data is the voice data “Voice.dat”, acquires the telephone number “0120-123-4567” of the call destination corresponding to “Voice.dat”.
[0129]
Next, the link management unit 195 transfers “Voice.dat” to the link data transfer unit 196. Subsequently, the telephone call unit 197 calls the call destination using the acquired telephone number “0120-123-4567” and, when the call is answered, reproduces “Voice.dat” as voice data and then ends the call.
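The control flow of paragraphs [0128] and [0129] can be sketched as follows. The function `deliver` and its `dial` and `play` callables are hypothetical stand-ins for the telephone call unit 197; no real telephony API is implied.

```python
def deliver(link, phone_numbers, dial, play):
    """Call the number registered for the linked voice data and play it.

    `dial` and `play` are caller-supplied callables standing in for the
    telephone call unit; `dial` returns True when the call is answered.
    """
    name = link["linked_data_name"]
    number = phone_numbers.get(name)
    if number is None:
        return False  # no call destination registered for this data
    if not dial(number):
        return False  # the call was not answered
    play(name)
    return True


log = []
answered = deliver(
    {"identifier": "LINK002", "linked_data_name": "Voice.dat"},
    {"Voice.dat": "0120-123-4567"},
    dial=lambda number: log.append(("dial", number)) or True,
    play=lambda name: log.append(("play", name)),
)
```

Passing the dialing and playback functions in as arguments keeps the sketch testable and mirrors the remark below that the telephone function may be replaced by another transmission function.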
[0130]
Here, it is assumed that the call is made to a telephone on the ordinary public telephone network. However, a data transmission function may be provided instead of the telephone call unit 197 so that, when the link target data is text data, it is forwarded to an electronic bulletin board. Similarly, a data transmission function having a so-called Internet telephone function may be provided instead of the ordinary telephone function so that the link target data is transmitted to an Internet telephone.
[0131]
Next, a video processing apparatus and a video processing method according to the third embodiment of the present invention will be described.
FIGS. 38 and 39 are diagrams showing examples of a user interface in which video data containing one video object 203 is presented on the video presentation screen 201 of the video processing apparatus 1 of this example, together with a frame 202 bearing visual feedback 204a, 204b, 205a, 205b, 206a, and 206b.
FIG. 40 is a diagram showing an example of a user interface in which video data containing two video objects 203 and 207 is presented on the video presentation screen 201 of the video processing apparatus 1 of this example, together with a frame 202 bearing visual feedback 205a, 205b, 206a, 206b, 208a, and 208b.
[0132]
FIG. 38 shows a case where one link is set for the video object (“Y” logo) 203.
On the other hand, FIG. 39 shows a case where two links are set for the same video object (“Y” logo) 203. When a plurality of links are set in this way, it is possible to distinguish the links by presenting graphics of different colors on the frame 202 of the video presentation screen 201.
[0133]
As shown in FIGS. 38, 39, and 40, figures 204a, 204b, 205a, 205b, 206a, 206b, 208a, and 208b indicating links are placed on the two sides of the frame 202 that are closest to the video object (“Y” logo) 203 or 207 to which a link is set, at positions corresponding to the horizontal position (for example, along the horizontal axis) and the vertical position (for example, along the vertical axis) of that video object. The user thereby obtains visual feedback indicating the presence of the link.
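The placement rule can be sketched as follows. Using the center of the object's bounding box as its position, the side names, and the function name `edge_markers` are assumptions; the specification only states that the two nearest sides of the frame are used.

```python
def edge_markers(bbox, screen_w, screen_h):
    """Return marker positions on the two frame sides nearest to the object.

    bbox is ((x0, y0), (x1, y1)); y is assumed to increase downward.
    The result maps a side name to the coordinate along that side.
    """
    (x0, y0), (x1, y1) = bbox
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    horizontal_side = "top" if cy <= screen_h - cy else "bottom"
    vertical_side = "left" if cx <= screen_w - cx else "right"
    # The marker on the top/bottom side shares the object's horizontal position;
    # the marker on the left/right side shares its vertical position.
    return {horizontal_side: cx, vertical_side: cy}


print(edge_markers(((10, 10), (20, 30)), 100, 100))
```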
[0134]
As described above, the video processing apparatus and the video processing method according to the embodiment of the present invention comprise means for designating arbitrary partial video data in video data, means for simultaneously presenting the video data and the designated partial video data, means for adding link data to the designated partial video data, and means for presenting the target data linked to the designated partial video data. In this configuration, one or more pieces of related link data are added to the partial video data, and means is provided for presenting one or more partial image icons indicating the link data in a superimposed manner, or for visually presenting the existence of the link adjacent to or superimposed on the partial video data.
[0135]
Then, when comments, related materials, or the like are to be added to the video data as text data or other data, the video data is presented, the partial video data to be annotated is designated, the video data and the designated partial video data are presented simultaneously, link data is added to the designated partial video data, and the presence of the link data is visually presented adjacent to or superimposed on the partial video data to be linked.
[0136]
Therefore, by visually presenting the presence of a link adjacent to or superimposed on the partial video data to be linked, visual feedback that link data exists for that partial video data can be given to the user.
Also, the user can easily and appropriately associate text data, audio data, image data, related material file data, moving image data, and the like with the partial video data to which link data is to be added.
In addition, because the user interface for making the association displays both the video data and the partial video data and plays back the video data, the user can add link data while referring to the extracted partial video data.
[0137]
In the video processing apparatus and the video processing method according to the embodiments of the present invention, the link data includes a time range of the partial video data, and the means for adding the link data includes means for specifying the time range of the partial video data.
Accordingly, the time range of the video data can also be specified as the specified range of the partial video data to which the link data is added.
[0138]
In the video processing apparatus and the video processing method according to the embodiments of the present invention, the link data or the partial video data includes area information of the partial video data within the video data, and the means for adding the link data acquires the area information on the partial video data and composes the link data.
Therefore, the link data including the area information on the partial video data can be configured by the means for adding the link data.
[0139]
Further, in the video processing apparatus and the video processing method according to the embodiments of the present invention, the link data presentation means presents one or more partial image icons on the time axis and on the video data (on the space axis).
Therefore, it is possible to present one or more partial image icons on the time axis and on the video data (on the space axis).
[0140]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, the means for visually presenting the presence of the link adjacent to or superimposed on the partial video data to be linked visually indicates the presence of the link by a luminance change in the shape of a shadow, or of a similar shape, of the partial video data or of the video object in its area.
Therefore, visual feedback can be provided to the user by a luminance change of a shadow, or of a similar shape, corresponding to the shape of the video object in the area.
[0141]
In addition, in the video processing apparatus and the video processing method according to the embodiment of the present invention, means is provided for adding partial images in an overlapping manner when a plurality of pieces of related data are attached to an arbitrary location in the partial video data.
Therefore, when a plurality of pieces of link data are related to a video object in the partial video data, the link data can be added in a superimposed manner using partial image icons or the like.
[0142]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, the shape of the shadow or similar shape that is presented as the luminance change is generated from the shape of the partial video data or of the video object in its area.
Therefore, because the shape of the shadow or similar shape to be presented is generated from the shape of the video object, visual feedback that does not look out of place against the original video data can be provided to the user.
[0143]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, when a plurality of links are added to the same image data, visual feedback that distinguishes each link is given to the user, so that the linked information can be used effectively.
[0144]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, the means for visually presenting the presence of the link adjacent to or overlapping the arbitrary partial video data to be linked is the arbitrary partial video. A means for extracting a video object from a luminance change in the data is provided.
Therefore, the video object of arbitrary partial video data can be extracted from the luminance in the arbitrary partial video data.
[0145]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, the means for adding the link data allows the user to select a video object in the video data extracted from the luminance change of the video data.
Therefore, the user can select a video object in the video data extracted from the luminance change of the video data.
[0146]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, when two or more links are presented, the means for visually presenting the presence of the link adjacent to or superimposed on the partial video data to be linked visually presents the presence of each link by a luminance change or color change of a different shadow, or of a different similar shape, of the partial video data or of the video object in its area.
Therefore, when a plurality of link data is added to the same partial video data, the plurality of links can be presented in an identifiable manner.
[0147]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, the designated link data is added to arbitrary partial video data, and the link data is presented together with the partial video data, both individually and in combination. Thus, in addition to adding related link data to a designated partial image of the partial video data, the link data can be presented superimposed on the designated partial image.
[0148]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, means is provided for generating link data that relates the added link data to other arbitrary partial video data or to link data added to other arbitrary partial video data.
Therefore, the added link data can be associated with other arbitrary partial video data and with link data added to other arbitrary partial video data.
[0149]
As described above, in the video processing apparatus and the video processing method according to the embodiment of the present invention, a plurality of links can be associated with the same area in the video data or the same video object in the area, The specified arbitrary partial video data to which the link data is added can be associated with other arbitrary partial video data and other link data in the video data.
[0150]
In addition, in the video processing apparatus and the video processing method according to the embodiment of the present invention, electronic data such as text data, audio data, or video data, an electronic file, or other link data is described as the content of the link data, that is, as the link destination.
Accordingly, electronic data such as text data, audio data, or video data can be linked as the contents of data linked to the video data.
[0151]
As described above, in the video processing apparatus and the video processing method according to the embodiment of the present invention, existing electronic documents such as related e-mail, image data used in a conference, and electronic files such as partial audio data and video data can be associated.
[0152]
In addition, the video processing apparatus and the video processing method according to the embodiments of the present invention include means for adding, sharing, presenting, or distributing link data by one or a plurality of users.
Thus, for example, using a portable information terminal and the means for adding, sharing, presenting, or distributing link data, the user can obtain stored link data and the video data to which it is linked, and perform various re-editing operations such as adding a link. In addition, using the means for adding, sharing, presenting, or distributing link data among a plurality of users, the user can obtain link data together with the video data to which the link data is added or the data to be linked, and perform various re-editing operations such as combining link data, video data to which link data is added, or data to be linked.
[0153]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, the means for adding the link data includes means for extracting the time range of the partial video data from the moving image data and the audio data when the designated partial video data contains valid audio data of one or more persons.
Therefore, by analyzing the audio data in the designated partial video data, the means for adding link data can estimate and cut out a portion consisting of one person's remarks, or a portion of a dialogue among several persons that concerns the same content, such as a question and its answer, extract the partial video data corresponding to that portion, and add link data to it.
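The specification does not state how the audio is analyzed at this point. As one simple possibility, the sketch below assumes that a per-frame voice-activity flag is already available and extracts each run of active frames as a candidate time range; the function name `speech_ranges` and the `min_len` parameter are hypothetical.

```python
from itertools import groupby


def speech_ranges(active, min_len=1):
    """Return (start_frame, end_frame) for each run of voice-active frames.

    `active` is a sequence of 0/1 flags, one per frame (an assumed input).
    Runs shorter than `min_len` frames are ignored.
    """
    ranges = []
    frame = 0
    for is_active, run in groupby(active):
        length = len(list(run))
        if is_active and length >= min_len:
            ranges.append((frame, frame + length - 1))
        frame += length
    return ranges


print(speech_ranges([0, 1, 1, 1, 0, 0, 1, 1]))
```

Each returned range could then serve as the frame start number and frame end number of a new piece of link data.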
[0154]
The video processing apparatus and the video processing method according to the embodiments of the present invention further include means for identifying the linked partial video data from linked electronic data such as text data, audio data, or video data.
Therefore, by identifying the partial video data linked from linked electronic data such as text data, audio data, or video data, the linked partial video data can be referred to from the linked electronic data.
[0155]
Further, in the video processing apparatus and the video processing method according to the embodiment of the present invention, when the user designates partial video data or a video object in the partial video data, the linked electronic data such as text data, audio data, or video data is transferred to an electronic bulletin board system or to a call or communication system such as a telephone or electronic mail, and the electronic data is thereby delivered to a target related to the linked partial video data.
Therefore, starting from arbitrary partial video data, the target related to that partial video data can be referred to, and the electronic data can be notified or transferred to that target using the electronic bulletin board system or the call or communication system.
[0156]
Further, in the video processing apparatus and the video processing method according to the embodiments of the present invention, in a configuration that presents video data and holds and processes link data for the video data, means is provided for visually presenting, in the outer frame of the video data, the presence of a link corresponding to the partial video data to which link data is linked.
Therefore, the presence of a link to the partial video data can be presented using the outer frame without disturbing the presented video data.
[0157]
As described above, in the video processing apparatus and the video processing method according to the embodiment of the present invention, when video data with link data attached is presented to the user, visual feedback of the link can be given without the user having to, for example, move the mouse over the area of the video data presented on the video presentation screen of the video processing apparatus, so that the user can be informed of the presence of one or more links. In addition, arbitrary partial video data can be referred to from linked electronic data such as text data, audio data, or video data. Furthermore, through the visual feedback, a target related to arbitrary partial video data can be referred to or used from that partial video data via the electronic bulletin board system or the call or communication system.
[0158]
In the video processing apparatus according to the embodiment of the present invention, the partial video data specifying means is constituted by the function of the link target area specifying unit 12, which specifies partial video data from the video data, and the data associating means is constituted by functions such as that of the link generation unit 13, which associates (links) data with the specified partial video data.
Further, within the partial video data specifying means of the video processing apparatus according to the embodiment of the present invention, the partial video data candidate specifying means is constituted by the function of specifying partial video data candidates, and the partial video data designation accepting means is constituted by the function of accepting from the user the designation of partial video data from among the candidates.
[0159]
In the video processing apparatus according to the embodiment of the present invention, the related partial video data specifying means is constituted by the function of the link management unit 15, which specifies partial video data from the data associated with it, and the related data presenting means is constituted by the function of the video presenting unit 14, which visually presents data (visual feedback data) indicating the presence of data associated with the partial video data in association with that partial video data.
[0160]
In the video processing apparatus according to the embodiment of the present invention, visual feedback data and predetermined processing are associated with each other in the storage unit 11. The presentation data designation accepting means is constituted by functions such as that of the video presenting unit 14, which accepts from the user the designation of presented visual feedback data, and the presentation data corresponding process executing means is constituted by the function of the link management unit 15, which executes the processing associated with the designated visual feedback data.
[0161]
In addition, in the video processing apparatus according to the embodiment of the present invention, the plural related data presenting means is constituted by functions such as that of the video presenting unit 14, which presents, in association with the partial video data, a plurality of pieces of visual feedback data indicating the presence of a plurality of pieces of data associated with the partial video data.
[0162]
Here, the configurations and modes of the video processing apparatus and the video processing method according to the present invention are not necessarily limited to those described above, and various configurations and modes may be used.
The application field of the present invention is not necessarily limited to the above-described fields, and the present invention can be applied to various fields.
[0163]
In addition, the various processes performed in the video processing apparatus and the video processing method according to the present invention may be controlled, for example, by a processor executing a control program stored in a ROM (Read Only Memory) in hardware resources including the processor and a memory, or each functional unit for executing the processes may be configured as an independent hardware circuit.
Further, the present invention can also be understood as a computer-readable recording medium, such as a floppy (registered trademark) disk or a CD (Compact Disc)-ROM, storing the above control program, or as the program itself. The processing according to the present invention can be performed by loading the program from the recording medium into a computer and having the processor execute it.
[0164]
[Effects of the Invention]
As described above, in the video processing apparatus and the video processing method according to the present invention, for example, partial video data that is a part of the video data is specified from the video data, and data is associated with the specified partial video data in such a way that the existence of the data is presented, so that the existence of data associated with the partial video data can be presented.
That is, in the video processing apparatus and the video processing method according to the present invention, for example, data indicating the presence of data associated with the partial video data is presented in a visual association with the partial video data in the video data. Therefore, the existence of the associated data and the association can be visually grasped by the user.
[0165]
Further, in the video processing apparatus and the video processing method according to the present invention, for example, data indicating the presence of a plurality of data associated with partial video data that is a part of the video data specified from the video data is stored in the video. Since the partial video data in the data is presented in a visually correlated manner, the presence of the plurality of associated data and the association can be visually recognized by the user.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a configuration example of a video processing apparatus according to the present invention.
FIG. 2 is a diagram showing a detailed configuration example of a video processing apparatus according to the present invention.
FIG. 3 is a diagram illustrating a state in which partial video data is extracted from video data.
FIG. 4 is a diagram illustrating an example of a processing procedure for extracting partial video data.
FIG. 5 is a diagram illustrating an example of a processing procedure for adding link data to partial video data.
FIG. 6 is a diagram illustrating an example of a data structure of a link data additional storage device.
FIG. 7 is a diagram showing an example of an extended data structure of a link data additional storage device.
FIG. 8 is a diagram illustrating an example of a user interface.
FIG. 9 is a diagram illustrating an example of a data structure after link data is added to partial video data.
FIG. 10 is a diagram showing an example of a user interface for link data addition presentation.
FIG. 11 is a diagram illustrating a specific example of a device configuration and a user interface in cooperative work.
FIG. 12 is a diagram showing another example of a data structure after link data is added to partial video data.
FIG. 13 is a diagram showing an example in which partial image icons representing link data added by a plurality of users are presented.
FIG. 14 is a diagram illustrating a configuration example of a system that performs editing work.
FIG. 15 is a diagram illustrating an example of a structure of linked data / target data and video data.
FIG. 16 is a diagram illustrating an example of a state in which video data synthesized from a plurality of video data portions is generated.
FIG. 17 is a diagram showing an example of speech estimation when link data is added.
FIG. 18 is a diagram showing an example of dialog estimation when link data is added.
FIG. 19 is a diagram for explaining an example of a method of guessing a dialog when link data is added.
FIG. 20 is a diagram illustrating an example of a data structure of link data.
FIG. 21 is a diagram illustrating an example of a user interface.
FIG. 22 is a diagram illustrating an example of a procedure of a linking process.
FIG. 23 is a diagram illustrating an example of a video object.
FIG. 24 is a diagram illustrating an example of a video object surrounded by a frame.
FIG. 25 is a diagram showing an example in which a video object is tilted obliquely.
FIG. 26 is a diagram showing an example of shadow data.
FIG. 27 is a diagram illustrating an example of a composition of a video object and shadow data.
FIG. 28 is a diagram illustrating an example of an extracted region where video objects and shadow data should be presented.
FIG. 29 is a diagram illustrating an example of a plurality of shadow data.
FIG. 30 is a diagram illustrating an example of a composite of a video object and a plurality of shadow data.
FIG. 31 is a diagram illustrating an example of an extracted region where video objects and a plurality of shadow data are to be presented.
FIG. 32 is a diagram illustrating an example of a configuration for performing a linking process via a network.
FIG. 33 is a diagram illustrating an example of a format of link data transmitted to a network.
FIG. 34 is a diagram showing another example of the format of link data transmitted to the network.
FIG. 35 is a diagram illustrating an example of link data values.
FIG. 36 is a diagram illustrating a configuration example of an extended video processing apparatus.
FIG. 37 is a diagram illustrating an example of link data values.
FIG. 38 is a diagram illustrating an example of a user interface in which video data including one video object, a frame, and visual feedback are presented on a video presentation screen of the video processing apparatus.
FIG. 39 is a diagram illustrating an example of a user interface in which video data including one video object, a frame, and visual feedback are presented on a video presentation screen of the video processing apparatus.
FIG. 40 is a diagram illustrating an example of a user interface in which video data including two video objects, a frame, and visual feedback are presented on a video presentation screen of the video processing apparatus.
[Explanation of symbols]
1, 181 ... video processing device, 11, 171, 191 ...,
12, 172, 192 ... link target area designating part,
13, 173, 193 ... link generation unit,
14, 174, 194 ... video presentation part,
15, 175, 195 ... link management unit, 21, 97 ... video storage device,
22, 91a, 91b ... video data presentation device,
23, 92a, 92b ... arbitrary partial video data designation device,
24, 93a, 93b ... partial video data presentation device,
25, 94a, 94b ...,
26, 96a, 96b, 98 ... link data storage device,
27, 95a, 95b ... link data presentation device,
31, 101, 111-113, 121-123, F1-F7, F11-F17 ... video data,
32 ... partial video data, 33 ... circumscribed rectangle, 41 ... time code,
42 ... coordinate data,
43, 102, 114a, 115a, 115b, 116a, 116b, 116c, 124a, 125a, 125b, 125c, 126a ...,
44 ... storage device name, 45 ... partial image icon data,
46 ... user data, 51 ... video data presentation screen,
52 ... partial video data presentation screen, 53 ... link data addition screen,
54 ... linked data presentation screen,
62a to 62e, 82a to 82c, 84a to 84d ... partial video data with link data,
71, 72 ... objects,
73, 74 ... link data for sending messages,
81a, 83a to 83c ... partial image icons,
T1, T2 ...,
T11 to T14, T21, T22 ... dialog guessing point, 131 ... identifier,
132 ... video data name, 133 ... frame start number,
134 ... frame end number, 135 ... link target area coordinates,
136 ... linked data name,
137 ... visual feedback data,
141 ... user interface, 142, 201 ... video presentation screen,
143 ... video playback button, 144 ... video stop button,
145 ... link start button, 146 ... link end button,
147 ... link target data name input dialog,
151, 203, 207 ... video object, 152 ... frame,
153 ... video object tilted diagonally,
154, 156a, 156b, 157a, 157b ... shadow data,
155 ... extracted area to be presented with shadow data,
161 ... client, 162 ... server, 163 ... network,
196 ... link and data transfer section, 197 ... telephone call section,
202 ... frame of video presentation screen,
204a, 204b, 205a, 205b, 206a, 206b, 208a, 208b ... visual feedback.

Claims

A video processing apparatus comprising:
partial video data specifying means for specifying, from video data, partial video data that is a part of the video data;
data association means for associating data with the specified partial video data; and
associated data presenting means for presenting data indicating the presence of the data associated with the partial video data, in visual association with the partial video data in the video data,
wherein the associated data presenting means presents, as the data indicating the presence of the data associated with the partial video data, shadow data having a shape based on the shape of the partial video data.
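As an illustration of shadow data whose shape is based on the shape of the partial video data, the following minimal sketch derives a shadow mask from an object mask by shifting it. The binary-mask representation, the function name `make_shadow`, and the offset parameters `dx` and `dy` are assumptions made for this example and do not appear in the specification.

```python
from typing import List

Mask = List[List[int]]  # 1 = pixel belongs to the region, 0 = background


def make_shadow(mask: Mask, dx: int, dy: int) -> Mask:
    """Return a shadow mask: the object mask shifted by (dx, dy),
    with the pixels hidden behind the object itself removed."""
    height, width = len(mask), len(mask[0])
    shadow = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                sx, sy = x + dx, y + dy
                # Keep only shadow pixels that fall inside the frame
                # and are not covered by the object.
                if 0 <= sx < width and 0 <= sy < height and not mask[sy][sx]:
                    shadow[sy][sx] = 1
    return shadow


# Hypothetical 2x2 object in a 5x5 frame; the shadow is cast one pixel
# to the right and one pixel down.
obj = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
shadow = make_shadow(obj, dx=1, dy=1)
```

Calling `make_shadow` again with a larger offset would give a second, further displaced shadow, which is one conceivable way to suggest a plurality of associated data items.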

A video processing apparatus comprising:
partial video data specifying means for specifying, from video data, partial video data that is a part of the video data;
data association means for associating data with the specified partial video data; and
associated data presenting means for presenting data indicating the presence of the data associated with the partial video data, in visual association with the partial video data in the video data,
wherein the associated data presenting means presents, as the data indicating the presence of the data associated with the partial video data, data indicating the horizontal position and data indicating the vertical position of the partial video data within the frame of the video data, at a location outside the frame of the video data and inside a frame provided outside that frame.
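The placement of horizontal and vertical position indicators in a border around the video frame can be sketched as follows. This is only a sketch under stated assumptions: the `Box` type, the function `border_markers`, the `margin` parameter, and the choice of the bottom and left strips for the two markers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def border_markers(region: Box, frame: Box, margin: int) -> Tuple[Box, Box]:
    """Return (horizontal_marker, vertical_marker) for `region`.

    Both markers lie outside `frame` but inside an outer frame that
    extends `margin` pixels beyond it on every side.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    # Clamp the region to the video frame so the markers stay aligned with it.
    left = max(region.left, frame.left)
    right = min(region.right, frame.right)
    top = max(region.top, frame.top)
    bottom = min(region.bottom, frame.bottom)
    # Horizontal position: a bar below the frame spanning the region's x range.
    horizontal = Box(left, frame.bottom, right, frame.bottom + margin)
    # Vertical position: a bar left of the frame spanning the region's y range.
    vertical = Box(frame.left - margin, top, frame.left, bottom)
    return horizontal, vertical


# Hypothetical 640x480 frame with a linked region at x 100..200, y 50..150.
frame = Box(0, 0, 640, 480)
region = Box(100, 50, 200, 150)
h_marker, v_marker = border_markers(region, frame, margin=10)
```

The point where the two bars would cross, if extended into the frame, identifies the linked region without drawing anything over the video itself.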

The video processing apparatus according to claim 1 or 2,
wherein the partial video data specifying means specifies partial video data having a time width for the same target data included in the video data.

The video processing apparatus according to claim 3,
wherein the video data is associated with audio data, and
the partial video data specifying means specifies, for the data of one or more persons included in the video data, partial video data having a time width during which the audio data corresponding to the data of that person is valid.
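One way to picture a time width during which a person's audio data is valid is to collapse per-frame flags into frame intervals. The sketch below assumes, purely for illustration, that validity is already available as one boolean per frame; how validity is determined is not specified here, and the function name `voiced_intervals` is hypothetical.

```python
from typing import List, Tuple


def voiced_intervals(voiced: List[bool]) -> List[Tuple[int, int]]:
    """Return inclusive (start_frame, end_frame) pairs for each run of
    consecutive frames whose flag is True."""
    intervals: List[Tuple[int, int]] = []
    start = None
    for i, flag in enumerate(voiced):
        if flag and start is None:
            start = i                         # a run begins
        elif not flag and start is not None:
            intervals.append((start, i - 1))  # the run ended on the previous frame
            start = None
    if start is not None:
        intervals.append((start, len(voiced) - 1))  # run reaches the last frame
    return intervals


# Hypothetical flags for one person over eight frames.
flags = [False, True, True, False, False, True, True, True]
intervals = voiced_intervals(flags)
```

Each resulting interval could then serve as the start and end frame numbers of one piece of partial video data for that person.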

The video processing apparatus according to any one of claims 1 to 4,
wherein the partial video data specifying means specifies the partial video data using data that specifies the area in which the partial video data is located within a frame of the video data.

The video processing apparatus according to any one of claims 1 to 5,
wherein the partial video data specifying means includes partial video data candidate specifying means for specifying a plurality of partial video data candidates, and partial video data designation receiving means for receiving, from a user, designation of partial video data included in the specified partial video data candidates, and specifies the designated partial video data as the partial video data.

The video processing apparatus according to any one of claims 1 to 6,
further comprising related partial video data specifying means for specifying the partial video data from the data associated with the partial video data.
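Specifying partial video data from its associated data amounts to a reverse lookup over stored link records. The following minimal sketch assumes a record whose fields mirror items 131 to 136 in the explanation of symbols; the class `LinkRecord`, the function `find_partial_video`, the (x, y, width, height) area layout, and the file names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LinkRecord:
    identifier: int
    video_data_name: str
    frame_start: int
    frame_end: int
    area: Tuple[int, int, int, int]  # assumed layout: x, y, width, height
    linked_data_name: str


def find_partial_video(records: List[LinkRecord],
                       linked_data_name: str) -> List[LinkRecord]:
    """Return every record whose linked data matches the given name,
    i.e. the partial video data that the data is associated with."""
    return [r for r in records if r.linked_data_name == linked_data_name]


# Hypothetical records: two regions link to the same memo, one to a slide.
records = [
    LinkRecord(1, "meeting.mpg", 0, 120, (10, 20, 64, 48), "memo.txt"),
    LinkRecord(2, "meeting.mpg", 300, 450, (80, 40, 32, 32), "slide.ppt"),
    LinkRecord(3, "demo.mpg", 15, 90, (0, 0, 100, 100), "memo.txt"),
]
matches = find_partial_video(records, "memo.txt")
```

A linear scan is enough for a sketch; a store holding many links would more likely keep an index keyed by the linked data name.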

The video processing apparatus according to any one of claims 1 to 7,
wherein the data indicating the presence of data associated with the partial video data is associated with a predetermined process,
the apparatus further comprising:
presented data designation accepting means for accepting, from the user, designation of the presented data indicating the presence of data associated with the partial video data; and
presented data corresponding process execution means for executing the process associated with the data for which the designation has been accepted.

The video processing apparatus according to any one of claims 1 to 8,
wherein operations on the same video data can be executed from a plurality of terminal devices.

A video processing method in which:
partial video data specifying means provided in a video processing apparatus specifies, from video data, partial video data that is a part of the video data;
data association means provided in the video processing apparatus associates data with the specified partial video data; and
associated data presenting means provided in the video processing apparatus presents, as data indicating the presence of the data associated with the partial video data, shadow data having a shape based on the shape of the partial video data, in visual association with the partial video data in the video data.

A video processing method in which:
partial video data specifying means provided in a video processing apparatus specifies, from video data, partial video data that is a part of the video data;
data association means provided in the video processing apparatus associates data with the specified partial video data; and
associated data presenting means provided in the video processing apparatus presents, as data indicating the presence of the data associated with the partial video data, data indicating the horizontal position and data indicating the vertical position of the partial video data within the frame of the video data, at a location outside the frame of the video data and inside a frame provided outside that frame, in visual association with the partial video data in the video data.

A program to be executed by a computer constituting a video processing apparatus, the program causing the computer to realize:
a function of specifying, from video data, partial video data that is a part of the video data;
a function of associating data with the specified partial video data; and
a function of presenting, as data indicating the presence of the data associated with the partial video data, shadow data having a shape based on the shape of the partial video data, in visual association with the partial video data in the video data.

A program to be executed by a computer constituting a video processing apparatus, the program causing the computer to realize:
a function of specifying, from video data, partial video data that is a part of the video data;
a function of associating data with the specified partial video data; and
a function of presenting, as data indicating the presence of the data associated with the partial video data, data indicating the horizontal position and data indicating the vertical position of the partial video data within the frame of the video data, at a location outside the frame of the video data and inside a frame provided outside that frame, in visual association with the partial video data in the video data.