JP3625549B2

JP3625549B2 - Multipoint video conferencing system

Info

Publication number: JP3625549B2
Application number: JP29733495A
Authority: JP
Inventors: フアドイスラムファルハド
Original assignee: Hewlett Packard Co
Current assignee: HP Inc
Priority date: 1995-10-20
Filing date: 1995-10-20
Publication date: 2005-03-02
Anticipated expiration: 2015-10-20
Also published as: JPH09139928A

Description

【０００１】
【発明の属する技術分野】
本発明は、会議環境を改善した多地点ビデオ会議システムに関し、複数の参加者画像の中から、（ａ）端末装置の参加者が注目したい他の参加者、および／または（ｂ）当該他の参加者が端末装置の画像表示装置に表示している更に他の参加者を、視覚的なガイド手段を用いて、自由に選択表示することができる多地点ビデオ会議システムに関する。
【０００２】
【技術背景】
従来、多地点ビデオ会議システム（以下、「ビデオ会議システム」と言う）の端末装置においては、参加者画像を、（１）画面分割法、（２）ウィンドウ表示法の２つの方法により表示している。
（１）の方法では、端末装置の画像表示装置（以下、「モニタ」と言う）の画面（あるいは、該モニタに表示された単一のウィンドウ）を複数の領域に分割し、各領域に参加者画像を表示する。また、（２）の方法では、各参加者画像を、独立のウィンドウ内に表示している。
【０００３】
上記（１），（２）の方法は、ビデオ会議の参加者数（すなわち、ビデオ会議システムに接続される端末装置数）が少ない（例えば、４名以下）ときには問題は生じにくいが、参加者数が多いときには以下のような問題を生じる。
【０００４】
（１）の方法では、各参加者に割り当てられた表示領域が狭くなる。
【０００５】
また、（２）の方法では、各参加者をそれぞれ小さなウィンドウ内に表示すると、参加者の表情等を認識しにくくなる。一方、各参加者をそれぞれ大きいウィンドウ内に表示すると、各ウィンドウ同士がオーバーラップする。
通常、これらのウィンドウは、アイコンをカーソルによりクリックすることによりオープンする。ある参加者画像を表示させたい場合、その参加者についてのウィンドウが開いていないときには、アイコンを探し出してウィンドウをオープンする。あるいは、すでにその参加者のウィンドウが開いているときには、ウィンドウを前面に移動させなければならないことも多い。このように、端末装置の参加者（以下、「ローカル参加者」と言う）は、多数のアイコンやオーバーラップしたウィンドウの中から目標とするアイコンやウィンドウを探すことにおいて、大きな負担を強いられている。
【０００６】
さらに、（１）の方法では、前面のウィンドウに表示されている特定の参加者と、他の参加者とが討論や対話をしているような場合、ローカル参加者が前記他の参加者の画像を表示させたい場合が生じる。しかし、この場合にも、ローカル参加者が上記他の参加者についてのアイコンやウィンドウを、多数のアイコンやウィンドウの中から探し出すことは容易ではない。
【０００７】
加えて、（１），（２）の方法では、いわゆるアイコンタクトの問題が生じる。すなわち、ローカル参加者の画像は他の参加者の端末装置のモニタに表示される。このため、ローカル参加者の端末装置のモニタに多数の参加者が表示されていると、ローカル参加者の視点は該モニタ上を移動する。このようなことから、他の参加者は、前記ローカル参加者の表情を奇異に感ずる。
【０００８】
【発明の目的】
本発明の１つの目的は、モニタに所望参加者画像が表示されていなくても、ローカル参加者が、前記所望参加者画像を、視覚的なガイド手段を用いて容易に表示することが可能なビデオ会議システムを提供することである。
本発明の他の目的は、ある参加者が他の参加者と討論や対話をしているときに、ローカル参加者が、上記他の参加者の参加者画像を容易に探し出して、モニタに表示することができる上記システムを提供することである。
本発明のさらに他の目的は、いわゆるアイコンタクトによる不都合を低減することができる上記システムを提供することである。
【０００９】
【発明の概要】
本発明は、モニタ、音響発生装置（以下、「スピーカ」と言う）、撮像装置（以下、「ビデオカメラ」と言う）、および音声入力装置（以下、「マイクロフォン」と言う）を備えた複数の端末装置が、有線または無線の通信回線を介して相互に参加者画像および参加者音声を含むメディアデータを送受信するビデオ会議システムに適用されるもので、端末装置はスクロール操作手段を備えている。
【００１０】
ここで、参加者画像とは、ビデオカメラにより撮像される画像（通常は、ビデオ会議参加者の画像であるが、文書等の参加者画像以外の画像であることもある）である。通常、１つの参加者画像には参加者１人が表示されるが、複数が表示される場合もある。
参加者音声とは、例えばマイクロフォンから入力された音声（通常は、ビデオ会議参加者の発言音声）である。
【００１１】
通信回線には、メディアデータおよびシステム情報が流されており、このメディアデータは、上述した参加者画像、参加者音声からなる。
また、システム情報には、これら画像や音声がどの参加者のものなのかを特定する識別コード（この識別コードを「Ｓ_ｉｄ」と言う）が含まれる。また、システム情報には、後述するように、議長，副議長等、特別の役割を持つ参加者が予め決定されている場合にはその役割を特定するための識別コードが、またメディアデータが発言参加者に係るものである場合には、発言していることを示す識別コードが含まれる（前者および後者の識別コードを「ＲＣ」と言う）。
さらに、システム情報には、ある参加者の端末装置の参加者表示部に表示されている参加者（以下、便宜上「対話相手」と言う）をローカル参加者に知らせるようにする場合には、メディアデータにはその対話相手の識別コード（この識別コードを「Ａ_ｉｄ」と言う）も含まれる。なお、通常、Ａ_ｉｄは、上記対話相手のＳ_ｉｄである。
【００１２】
メディアデータを処理するための装置（メディアデータ処理装置）は、端末装置に設けられることもあるし、マルチポイントコントロールユニットなどの端末装置以外の装置に設けられることもある。
このメディアデータの処理装置は、（１）参加者画像生成手段、（２）位置記憶手段、（３）位置算出手段（後述する特定参加者画像の前記仮想画面上での位置を算出する手段）を有している。
【００１３】
（１）の参加者画像生成手段は、複数の参加者画像が所定位置に配置された仮想画面を作成し、この仮想画面の一部をモニタの参加者表示部に表示させる。通常、参加者表示部には、仮想画面上の参加者画像のうちの一参加者画像のみが表示され、仮想画面は上記スクロール操作手段からの相対位置信号に応じてスクロールされる。なお、参加者表示部に表示される仮想画面は、拡大または縮小することもできる。仮想画面を縮小したときは、参加者表示部には複数の参加者が表示される。
【００１４】
（２）の位置記憶手段は、仮想画面上の各参加者画像の、該仮想画面上での配置を記憶している。本発明では、この位置記憶手段に記憶されている参加者画像の前記仮想画面上での配置の指定や変更を、ローカル参加者が行うことができるようにもでき、この場合には、参加者配置変更手段が設けられる。
【００１５】
（３）の位置算出手段は、位置記憶手段に記憶された仮想画面上での特定参加者画像（各参加者画像の中から選択された特定の参加者画像）の配置情報と、前記仮想画面における参加者表示部の画像の現在配置情報とに基づき、その特定参加者の仮想画面上での位置算出を行う。例えば、ビデオ会議の議長や副議長を特定参加者とすることもできるし、発言参加者を特定参加者とすることもできる。
【００１６】
後述するように、ガイド手段は、上記特定参加者の位置をローカル参加者に知らせるアロー等である。
【００１７】
ガイド手段を、上記特定参加者の対話相手の画像の仮想画面上での位置を、ローカル参加者に知らせるために用いることもできる。もちろん、ガイド手段を、特定参加者ではない参加者（画像位置がアロー等により表示されない参加者）の対話相手の画像の仮想画面上での位置を、ローカル参加者に知らせるために用いることもできる。
すなわち、本発明では、第１の端末装置のモニタに表示されたガイド手段が、第１の端末装置における仮想画面上での、第２の前記端末装置のモニタの参加者表示部に表示されている参加者画像の位置を、前記第１の端末装置の参加者に視覚的に知らせるようにもできる。
ローカル参加者は、例えば、議長が他の参加者と会話をしている場合に、当該他の参加者を知ることができない。すなわち、通常、ローカル参加者は、前記他の参加者がモニタに表示されている場合を除いて、前記他の参加者を知ることができない。このため、ローカル参加者は会議の内容を完全に理解することができないという事態も生じる。
しかし、上記のように、ある参加者の対話相手の仮想画面上での位置をガイド手段で示すことで、ローカル参加者は、該対話相手が誰であるのかを容易に知ることができる。
【００１８】
ローカル参加者がどのような役割を有しているか（すなわち、ローカル参加者がどのような識別コードＲＣを持っているか）、ローカル参加者がどの参加者と対話しているか（すなわち、ある参加者がどのような識別コードＡ_ｉｄを持っているか）を、他の参加者に知らせるために、メディアデータの処理装置は、端末装置のモニタの参加者画像表示部に表示されている参加者画像を特定するための情報を出力するシステム情報出力部を有することができる。
また、逆に、ある参加者がどのような識別コードＲＣを持っているか、ある参加者がどのような識別コードＡ_ｉｄを持っているかを、ローカル参加者が知るために、メディアデータの処理装置は、他の端末装置のモニタの参加者画像表示部に表示されている参加者画像を特定するための情報を入力するシステム情報入力部を有することができる。
【００１９】
画像表示装置にはガイド手段が表示される。このガイド手段は、仮想画面上での特定参加者画像の位置を、ローカル参加者に知らせるために設けられるもので、前記特定参加者画像の位置をローカル参加者に視覚的に知らせることができる。
ガイド手段は、典型的にはモニタの位置表示部に表示されたアローとすることができる。この場合には、前記仮想画面における、前記参加者表示部を基準とする前記特定参加者画像の距離を、前記アローの長さで表示し、前記仮想画面における、前記参加者表示部を基準とする特定参加者画像の方向を、前記アローの方向で表示することができる。
【００２０】
以下、本発明のビデオ会議システムの作用の一例を説明する。
前述したように、参加者表示部に表示される画像は仮想画面の一部である。ローカル参加者はスクロール操作手段を用いて、任意の参加者画像が参加者表示部に表示されるように仮想画面を移動させることができる。
参加者画像の中から選択された少なくとも１つの特定参加者画像の仮想画面上での位置（参加者表示部を基準とする特定参加者画像の方向、あるいは方向と距離）がガイド手段（例えばアロー）により表示される。
この特定参加者は、ガイド手段により示された他の特定参加者の対話相手であることもあるし、ガイド手段によっては示されない他の参加者の対話相手であることもある。
【００２１】
ローカル参加者が、上記の特定の参加者画像を参加者表示部に表示させたいときには、位置表示部の表示を参照してスクロール操作手段を操作する。この操作手段は相対位置信号を生成し、この信号を参加者画像生成手段に出力する。この相対位置信号に基づき、参加者画像生成手段は仮想画面をスクロールする。また、位置算出手段は、このスクロールに応じて、参加者表示部を基準とする特定参加者画像の仮想画面上での上記位置を算出し、これを位置表示部にリアルタイムで表示させる。
【００２２】
ローカル参加者は、位置表示部の表示を参照することにより、特定参加者画像（例えば、ローカル参加者が注目している参加者やその対話相手）を容易に見つけ出して表示させることができる。また、モニタに表示される参加者画像の数は従来の端末装置と比較して極めて少ない（例えば、１人）ので、いわゆるアイコンタクトの不都合も緩和される。
【００２３】
【実施例】
図１は本発明のビデオ会議システムが適用されるシステムの一例を示す図であり、図２は本発明のビデオ会議システムを具体的に示す説明図である。
図１において、通信回線２００に接続された複数の端末装置（以下、「ローカル端末装置」と言う）１００は、メディアデータ処理装置１（図１では図示せず）を内蔵すると共に、この処理装置１に接続されたモニタ１１１、スピーカ１１２、ビデオカメラ１１３、マイクロフォン１１４、スクロール操作手段（同図ではジョイスティック）１１５、キーボード１１６およびコンピュータ筐体１２０を有している。
【００２４】
図２に示すように、通信回線２００上には、図示しない他の端末装置（ローカル端末装置１００と同一構成であるとは限らない）からの、メディアデータＭＤ（ビデオ信号ＶＳと音声信号ＡＳからなる多重化信号）、およびシステム情報ＳＩ（識別コードＳ_ｉｄ，ＲＣ，Ａ_ｉｄ）が流されている。
前述したように、Ｓ_ｉｄは多重化信号ＭＤがどの参加者のものなのかを特定する識別コードである。また、ＲＣは特別の役割を持つ参加者の当該役割を特定するための識別コードおよび／または現在発言していることを示す識別コードである。さらに、Ａ_ｉｄはある参加者の対話相手を示す識別コード（すなわち、該参加者の識別コードＳ_ｉｄ）である。
【００２５】
メディアデータ処理装置１は、多重化信号入力部１２、多重化信号出力部１３、システム情報入力部１４、システム情報出力部１５、参加者画像生成手段１６、位置算出手段１７ａ〜１７ｄ、位置記憶手段１８、特定参加者変更手段１９、参加者配置変更手段２０、およびディスプレイ・デバイス２１を有してなる。なお、メディアデータ処理装置１の上記構成要素の全てが、図１に示したコンピュータ筐体１２０に内蔵されていてもよいし、その一部の構成要素（例えば、ディスプレイ・デバイス２１）がコンピュータ筐体１２０に内蔵され、他の構成要素が図示しない別の筐体に内蔵されていてもよい。
【００２６】
多重化信号出力部１３は、ビデオカメラ１１３からのローカル参加者画像、およびマイクロフォン１１４からのローカル参加者音声を入力し、これらを多重化しメディアデータＭＤ_ｏｕｔ（ビデオ信号ＶＳ，音声信号ＡＳ）を生成する。多重化信号出力部１３は、このＭＤ_ｏｕｔをメディアデータＭＤとしてローカル参加者の識別コードＳ_ｉｄと共に通信回線２００に出力する。
一方、多重化信号入力部１２は、他の端末装置からのメディアデータＭＤ（ビデオ信号ＶＳ，音声信号ＡＳ）をＭＤ_ｉｎとして、識別コードＳ_ｉｄと共に通信回線２００から入力し、ビデオ信号ＶＳと識別コードＳ_ｉｄとを参加者画像生成手段１６に出力する。なお、音声信号ＡＳは、図示しない音声処理回路に出力され、この音声処理回路はスピーカ１１２に参加者音声を出力させる。
【００２７】
参加者画像生成手段１６は、複数の参加者画像が所定位置に配置された仮想画面３１（図３（ａ）参照）を作成する。この仮想画面３１上には、通常は、ビデオ会議に参加しているすべての参加者についての参加者画像が所定の配列で割り振られる。
【００２８】
図３（ａ）は、端末装置の外観および仮想画面３１がモニタ１１１の参加者表示部３２に表示された様子を示している。同図（ａ）では、モニタ１１１はコンピュータ筐体１２０上に載置されている。また、仮想画面３１にはＡ〜Ｐの参加者画像が配置され、仮想画面３１上の一部（通常、参加者画像の少なくとも１画像であり、同図ではＫ）が、モニタ１１１の参加者表示部３２に表示されている。
なお、図３（ａ）に示したような参加者表示部３２と、後述する参加者位置表示部３３ａ〜３３ｄとは、別々のウィンドウに表示されるようにもできるし、図３（ａ）に示したように１つのウィンドウに表示されるようにもできる。また、このウィンドウをオープンするためのスイッチをアイコン化しておくことができる。なお、参加者表示部３２および／または参加者位置表示部３３ａ〜３３ｄは、拡大・縮小表示できるようにしてもよい。
【００２９】
図２において、参加者画像生成手段１６は、多重化信号入力部１２から入力したビデオ信号ＶＳを識別コードＳ_ｉｄと、位置記憶手段１８に記憶されている後述する配置情報ＰＩとに基づき、図３（ａ）に示した仮想画面３１上の所定位置に各参加者画像を表示する。
なお、後述するように、配置情報ＰＩはローカル参加者により変更されることがある。この場合には、参加者画像生成手段１６は、配置情報ＰＩを位置記憶手段１８から取得して、当該参加者画像の配置変更を行う。
【００３０】
スクロール操作手段１１５は、仮想画面３１を任意の方向にスクロール操作するために、ローカル参加者に提供されるもので、相対位置信号ＲＰＳを生成する。ローカル参加者が、スクロール操作手段１１５を操作すると、参加者画像生成手段１６は、所望の参加者画像が参加者表示部３２に表示されるように、仮想画面３１を表示する画像信号ＶＤを後述するディスプレイ・デバイス２１に出力する。そして、所望の参加者画像が、スクロールされつつモニタ１１１に表示される。ここで、スクロールは、通常、縦スクロール、横スクロール（すなわち、パン）、またはこれらの組み合わせ（斜めスクロール）である。
【００３１】
なお、スクロール操作手段１１５として、ジョイスティックの他、フットペダル、またはキーボードのキー（例えば、アローキー）を用いることができる。また、モニタ１１１に表示されたアロー形状をなすソフトキー（通常、マウスにより操作される）を用いることもできる。フットペダルを使用する場合には、手が常に自由となるので、筆記や飲食を行いつつ仮想画面３１をスクロールすることができる。
【００３２】
また、参加者画像生成手段１６は、相対位置信号ＲＰＳ、およびディスプレイ・デバイス２１が出力するウィンドウ情報ＷＩに基づき、参加者表示部３２に現在表示されている仮想画面３１の位置についての情報（現在位置情報）ＣＰを生成し、これを位置算出手段１７ａ〜１７ｄに出力する。ウィンドウ情報ＷＩは、参加者表示部３２の、モニタ１１１の表示画面上での位置、大きさ等に関する情報を含んでいる。
本実施例では、現在位置情報ＣＰは、参加者画像生成手段１６が保有しているが、他の手段が保有する（例えば、位置算出手段１７ａ〜１７ｄそれぞれが保有する）こともできる。
【００３３】
参加者画像生成手段１６は、上記した現在位置情報ＣＰを位置記憶手段１８にも出力し、位置記憶手段１８からローカル参加者の対話相手が誰であるかの情報（すなわち、識別コードＡ_ｉｄ）を取得する。そして、参加者画像生成手段１６はシステム情報出力部１５にこの識別コードＡ_ｉｄを出力し、システム情報出力部１５は、システム情報ＳＩ_ｏｕｔ（識別コードＡ_ｉｄ，ローカル参加者の識別コードＳ_ｉｄ，およびその役割を示す識別コードＲＣ）を、ＳＩとして通信回線２００に出力する。
【００３４】
システム情報入力部１４は、システム情報ＳＩ（識別コードＳ_ｉｄ，Ａ_ｉｄおよびＲＣ）をＳＩ_ｉｎとして通信回線２００から入力し、これらを内蔵する記憶部１４１に関連付けて記憶している。
特定参加者変更手段１９は、システム情報入力部１４から識別コードＲＣ、Ｓ_ｉｄを入力しており、これらを内蔵する記憶部１９１に関連付けて記憶している。
前述したようにＡ_ｉｄは、ある参加者の端末装置のモニタに表示される他の参加者を示す識別コードであり、当該端末装置において仮想画面がスクロールされると、これに応じてコード値が変更される。また、ＲＣは、ある参加者の役割を示す識別コードであり、例えば、ある参加者が発言しているときと発言していないときとではそのコード値は異なる。このため、記憶部１４１，１９１の記憶内容は、通常、頻繁に更新される。
【００３５】
ローカル参加者は、キーボード１１６から特定参加者を誰にするかを設定することができる。特定参加者変更手段１９は、キーボード１１６からの信号に応じて特定参加者を誰にするかの信号をシステム情報入力部１４に出力する。例えば、特定参加者変更手段１９は、議長（または発言参加者、あるいは任意の参加者）を特定参加者とするための信号がキーボード１１６から入力されると、議長（または発言参加者、あるいは任意の参加者）を示す選択用信号ＳＳ（識別コードＳ_ｉｄおよび／またはＲＣ）をシステム情報入力部１４に出力する。
なお、特定参加者の設定を、キーボード１１６を用いずに、モニタ１１１に表示されたソフトキー（図示せず）により設定するようにもできる。
【００３６】
本実施例では、特定参加者が、システムにより決定された議長（参加者画像Ｊ）およびその対話相手（参加者画像Ｄ）、ならびにローカル参加者が選択した参加者（参加者画像Ｍ）およびその対話相手（参加者画像Ｐ）の４人であるものとする。
システム情報入力部１４は、議長（参加者画像Ｊで示される）を示す識別コードＳ_ｉｄおよびその対話相手（参加者画像Ｄで示される）を示す識別コードＡ_ｉｄ（図２ではＳ_ｓｃｃ，Ｓ_{ａ＿ｓｃｃ}として示す）を、位置算出手段１７ａおよび１７ｂにそれぞれ出力する。
また、システム情報入力部１４は、ローカル参加者が選択した参加者（参加者画像Ｍで示される）を示す識別コードＳ_ｉｄおよびその対話相手（参加者画像Ｐで示される）を示す識別コードＡ_ｉｄ（図２ではＳ_ｌｃｃ，Ｓ_{ａ＿ｌｃｃ}として示す）を位置算出手段１７ｃ，１７ｄにそれぞれ出力する。
【００３７】
前述したように、位置記憶手段１８は、各参加者画像Ａ〜Ｐの、仮想画面３１上での配置情報ＰＩを記憶している。この配置情報ＰＩは、最も簡単な例では、ある参加者画像が配置された位置を基準に、単に各参加者画像の順序を、単純な序列行列や行列として表すこともできるし座標情報で表すこともできる。識別コードＳ_ｉｄがＳ_ｓｃｃ，Ｓ_ｌｃｃ，Ｓ_{ａ＿ｓｃｃ}，Ｓ_{ａ＿ｌｃｃ}である参加者（すなわち、特定参加者）の画像の、位置記憶手段１８の配置情報ＰＩは、位置算出手段１７ａ〜１７ｄに出力される。
位置算出手段１７ａ〜１７ｄは、（ｉ）位置記憶手段１８から取得した各特定参加者画像の仮想画面３１上での配置情報ＰＩと、（ｉｉ）参加者画像生成手段１６からの現在位置情報ＣＰとに基づき、上記各参加者画像の仮想画面３１上での参加者表示部３２を基準とする位置の算出を行う。この算出結果は、本実施例では、位置ベクトル情報ＰＶ_ａ〜ＰＶ_ｄである。
【００３８】
位置算出手段１７ａ〜１７ｄは、参加者表示部３２を基準として、予め選択されている上記４つの特定参加者画像（議長および議長の対話相手、ならびにローカル参加者が選択した参加者およびその対話相手のそれぞれ画像）Ｊ，Ｄ，Ｍ，Ｐの位置ベクトル情報ＰＶ_ａ〜ＰＶ_ｄをディスプレイ・デバイス２１に出力する。
【００３９】
ディスプレイ・デバイス２１は、位置算出手段１７ａ〜１７ｄにより算出された位置ベクトル情報ＰＶ_ａ〜ＰＶ_ｄに基づき、図３（ａ）に示したように、各特定参加者画像の位置をガイド手段（すなわち参加者位置表示部３３ａ〜３３ｄ）に表示させる。特定参加者位置表示部３３ａ〜３３ｄにおける位置表示は、通常は図３（ａ）に示したようにアロー３４により表示されるが、数値で表示することも可能である。
【００４０】
図３（ａ）の例では、参加者表示部３２の基準位置（通常、該表示部の中心位置）と、特定参加者画像の位置（通常、該画像の中心位置）との最短距離がアロー３４の長さで表示され、参加者表示部３２を基準とする特定参加者画像Ｊ，Ｄ，Ｍ，Ｐの方向がアロー３４の方向で表示されている。
【００４１】
ローカル参加者は、位置記憶手段１８に記憶されている配置情報ＰＩを変更することができる。すなわち、参加者配置変更手段２０は、キーボード１１６からの信号に応じて、位置記憶手段１８に配置変更用の信号（識別コードＳ_ｉｄと配置情報ＰＩからなる）を出力することができる。ローカル参加者は、自分が注目したい２以上の参加者の参加者画像を近接して配置することもできる。こうすることにより、短いスクロール距離で簡単に上記２以上の参加者を交互に参加者表示部３２に表示させることができる。また、参加者表示部３２を拡大することにより、該表示部３２に同時に上記２以上の参加者を表示させることができる。
なお、配置情報の変更をモニタ１１１に表示されたソフトキーにより設定するようにもできる。
【００４２】
以下、上記ビデオ会議システムの動作を図２および図３（ａ）を参照しつつより具体的に説明する。
図２，図３（ａ）の実施例においては、端末装置１００が起動されると、システム情報入力部１４が、通信回線２００からシステム情報ＳＩ（識別コードＳ_ｉｄ，Ａ_ｉｄ，ＲＣ，場合によりビデオ会議参加者人数等の情報を含む）をＳＩ_ｉｎとして入力する。なお、参加者画像生成手段１６は、位置記憶手段１８からのデフォルト情報に基づき、参加者画像Ａ〜Ｐを仮想画面３１上に配置する。
【００４３】
ここで、ローカル参加者が、必要に応じ、キーボード１１６から特定参加者画像を選択すると、特定参加者変更手段１９からの選択用信号ＳＳ（Ｓ_ｉｄおよび／またはＲＣ）がシステム情報入力部１４に出力される。システム情報入力部１４は、特定参加者の識別コードＳ_ｉｄ（Ｓ_ｓｃｃ，Ｓ_ｌｃｃ）および識別コードＡ_ｉｄ（Ｓ_{ａ＿ｓｃｃ}，Ｓ_{ａ＿ｌｃｃ}）を位置算出手段１７ａ〜１７ｄに出力する。位置算出手段１７ａ〜１７ｄは、参加者画像生成手段１６からの現在位置情報ＣＰおよび位置記憶手段１８に記憶された配置情報ＰＩを参照して、各特定参加者画像の仮想画面３１上での位置情報をディスプレイ・デバイス２１に出力する。そして、ディスプレイ・デバイス２１は、アロー３４を参加者位置表示部３３ａ〜３３ｄにそれぞれ表示させる。
【００４４】
ローカル参加者が、参加者位置表示部３３ａ〜３３ｄの何れかを参照して、スクロール操作手段１１５を操作すると、操作手段１１５からの相対位置信号ＲＰＳは参加者画像生成手段１６に出力される。
【００４５】
参加者画像生成手段１６は、特定参加者画像が参加者表示部３２に表示されるように仮想画面３１をスクロールする。また、位置算出手段１７ａ〜１７ｄは、現在位置信号ＣＰと配置情報ＰＩとに基づき、参加者位置表示部３３ａ〜３３ｄのアロー３４を変更する。
【００４６】
例えば、ローカル参加者は、参加者位置表示部３３ｂのアロー３４を参照して、スクロール操作手段１１５を操作することにより、議長（参加者画像Ｊ）の対話相手の画像（参加者画像Ｄ）を、参加者表示部３２に表示させることができる。
また、議長と、ある参加者（例えば、参加者画像Ｄとして表示されている参加者）とが頻繁に討議をしているような場合には、図３（ｂ）に示すように、キーボード１１６（図２参照）を用いて、議長の参加者画像Ｊと参加者画像Ｄとを近接位置に配置することもできる。
【００４７】
図３（ａ）の参加者表示部３２に複数の参加者を表示することもできる。このためには、通常、仮想画面３１は縮小表示されるか、または参加者表示部３２が横長および／または縦長にされる。
この場合、図２を参照して説明すると、参加者画像生成手段１６は複数の識別コードＡ_ｉｄをシステム情報出力部１５に出力し、システム情報出力部１５は、システム情報ＳＩ_ｏｕｔにこれら複数の識別コードＡ_ｉｄを含め、ＳＩとして通信回線２００に出力する。
他の端末装置が、このような複数の識別コードＡ_ｉｄを含むシステム情報ＳＩを出力している場合には、システム情報入力部１４は、他の端末装置から、このシステム情報ＳＩをシステム情報ＳＩ_ｉｎとして入力する。この場合には、メディアデータ処理装置１には、複数の対話相手を示す参加者位置表示部およびこれに対応する位置算出手段が、識別コードＡ_ｉｄの個数に応じて予め用意される。
【００４８】
本発明では、発言参加者を特定参加者とすることで、参加者Ｘ_１の発言に対して参加者Ｘ_２が発言し、この参加者Ｘ_２の発言に対して参加者Ｘ_３が発言するような場合に、ローカル参加者は、参加者Ｘ_１，Ｘ_２，Ｘ_３を追跡することができる。
この場合には、図３（ａ）の参加者位置表示部３３ｃには発言参加者画像の仮想画面３１上での位置が、同じく参加者位置表示部３３ｄにはその対話相手の仮想画面３１上での位置が表示されるようにしておく。
参加者Ｘ_１がその端末装置のモニタの参加者表示部に参加者Ｘ_２を表示させ、参加者Ｘ_２に話しかけると、参加者位置表示部３３ｃには参加者Ｘ_１の画像位置が表示され、参加者位置表示部３３ｄには参加者Ｘ_２の画像位置が表示される。次に、参加者Ｘ_２がその端末装置のモニタの参加者表示部に参加者Ｘ_３を表示させて参加者Ｘ_３に話しかけると、参加者位置表示部３３ｃには参加者Ｘ_２の画像位置が表示され、参加者位置表示部３３ｄには参加者Ｘ_３の画像位置が表示される。したがって、ローカル参加者は、参加者位置表示部３３ｃ，３３ｄを参照することにより、容易に発言参加者Ｘ_１，Ｘ_２、発言参加者の対話相手Ｘ_２，Ｘ_３を順次追跡することができる。
【００４９】
なお、特定参加者位置表示部３３ａ〜３３ｄに表示されたアロー３４を適宜位置にドラッグできるようにもできる。図３（ａ）には、参加者画像にドラッグされたアロー３４が点線で示されている。このアロー３４は、実際においても点線（あるいは、参加者画像の視認の障害とならないような形態）で表示することが好ましい。アロー３４を参加者にオーバーラップさせることで、ローカル参加者の視点がモニタ１１１の画面上を移動する頻度は減少する。これにより、アイコンタクトの不都合が低減される。
【００５０】
図３（ａ）においては、参加者画像Ａ〜Ｐが仮想画面３１に平面モードでマトリクス状に配置されている場合を示したが、図４（ａ）に示すように、参加者画像Ａ〜Ｐが表示される面を球面と同相とし、各参加者画像を任意の方向に自在にスクロールできるようにしてもよい。
また、図４（ｂ）に示すように、各参加者画像Ａ〜Ｐを、仮想画面３１に、垂直または水平（同図では水平）の直線モードで配置してもよい。また、この場合にも、図４（ｃ）に示すように、各参加者画像をループ状に配置してもよい。
【００５１】
【発明の効果】
以上述べたように、本発明は以下のような効果を奏することができる。
（１）ローカル参加者は、仮想画面上に配置された各参加者画像を、該仮想画面をスクロールすることにより自由に選択して表示できるので、全ての参加者画像を１つのモニタに表示する必要がなくなった。
【００５２】
（２）ローカル参加者は、他の参加者、および／またはその対話相手を視覚的なガイド手段を用いて容易に見つけ出すことができる。
【００５３】
（３）参加者位置表示部の指示に基づく仮想画面のスクロールは、ジョイスティックの操作、ソフトキーやハードキーの操作により行うことができる。また、参加者位置表示部は、シンプルな指示をローカル参加者に与えることができる。したがって、本発明においては、多数のアイコンがオーバーラップしたウィンドウから、マウス操作により所定の参加者画像を捜さなければならない従来技術と比較して、所望参加者をモニタ上に表示させる際の操作性が極めて高い。
【００５４】
（４）ビデオ会議参加者が多数であったとしても、モニタに表示される参加者画像をスクロールできる。したがって、該モニタに表示される参加者画像を１人あるいは極めて少数とできるので、ローカル参加者の視線が移動する頻度は極めて少ない。したがって、ローカル参加者の画像を通信回線を介して見ている他の参加者は、ローカル参加者の視線の動きが気になるといった、アイコンタクトの不都合は従来と比較して大幅に低減される。
【図面の簡単な説明】
【図１】本発明のビデオ会議システムが適用されるシステムの一例を示す図である。
【図２】本発明のビデオ会議システムの説明図である。
【図３】（ａ）は図２のビデオ会議システムのモニタの表示状態等を示す説明図であり、（ｂ）は議長画像Ｊと参加者画像Ｄとを近接位置に配置した様子を示す図である。
【図４】本発明のビデオ会議システムにおける仮想画面の態様を示す説明図であり、（ａ）は仮想画面が球面と同相な表示面に配置された様子を示す図、（ｂ）は仮想画面が直線モードで配置された様子を示す図、（ｃ）は、仮想画面がループ状の線モードで配置された様子を示す図である。
【符号の説明】
１メディアデータ処理装置
１２多重化信号入力部
１３多重化信号出力部
１４システム情報入力部
１４１，１９１記憶部
１５システム情報出力部
１６参加者画像生成手段
１７ａ〜１７ｄ位置算出手段
１８位置記憶手段
１９特定参加者変更手段
２０参加者配置変更手段
２１ディスプレイ・デバイス
３１仮想画面
３２参加者表示部
３３ａ〜３３ｄ参加者位置表示部
３４アロー
１１１モニタ
１１２スピーカ
１１３ビデオカメラ
１１４マイクロフォン
１１５スクロール操作手段
１１６キーボード
１２０コンピュータ筐体
１００端末装置
２００通信回線
ＭＤ，ＭＤ_ｏｕｔ，ＭＤ_ｉｎ多重化信号
ＳＩ，ＳＩ_ｏｕｔ，ＳＩ_ｉｎシステム情報
Ｓ_ｉｄ，Ａ_ｉｄ，ＲＣ識別コード
ＶＳビデオ信号
ＡＳ音声信号
ＲＰＳ相対位置信号
ＳＳ選択用信号
ＰＩ配置情報
ＣＰ現在位置信号
ＰＶ_ａ〜ＰＶ_ｄ位置ベクトル情報
ＶＤ画像信号

[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a multipoint video conferencing system with an improved conference environment. More specifically, it relates to a multipoint video conferencing system in which the participant at a terminal device can, using visual guide means, freely select and display from among a plurality of participant images (a) another participant to whom he or she wants to pay attention and/or (b) a further participant whom that other participant is displaying on the image display device of his or her own terminal device.
[0002]
BACKGROUND ART
Conventionally, a terminal device of a multipoint video conference system (hereinafter referred to as a “video conference system”) displays participant images by one of two methods: (1) a screen division method and (2) a window display method.
In method (1), the screen of the image display device (hereinafter referred to as a “monitor”) of the terminal device, or a single window displayed on the monitor, is divided into a plurality of regions, and a participant image is displayed in each region. In method (2), each participant image is displayed in an independent window.
[0003]
Methods (1) and (2) rarely cause problems when the number of video conference participants (that is, the number of terminal devices connected to the video conference system) is small (for example, four or fewer), but the following problems arise when the number of participants is large.
[0004]
In the method (1), the display area assigned to each participant is narrowed.
[0005]
In the method (2), when each participant is displayed in a small window, it becomes difficult to recognize the facial expression of the participant. On the other hand, when each participant is displayed in a large window, the windows overlap each other.
Normally, these windows are opened by clicking an icon with a cursor. To display a certain participant image when the window for that participant is not open, the user must find the icon and open the window. If the participant's window is already open, it often has to be brought to the front. The participant at the terminal device (hereinafter referred to as the “local participant”) therefore bears a heavy burden in searching for the target icon or window among a large number of icons and overlapping windows.
[0006]
Furthermore, in the method (1), when a specific participant displayed in the front window is discussing or conversing with another participant, the local participant may want to display the image of that other participant. In this case as well, however, it is not easy for the local participant to find the icon or window for that other participant among many icons and windows.
[0007]
In addition, in the methods (1) and (2), a problem of so-called eye contact occurs. That is, the image of the local participant is displayed on the monitor of the terminal device of another participant. For this reason, when a large number of participants are displayed on the monitor of the terminal device of the local participant, the viewpoint of the local participant moves on the monitor. For this reason, other participants feel strange about the facial expressions of the local participants.
[0008]
OBJECT OF THE INVENTION
One object of the present invention is to provide a video conference system in which a local participant can easily display a desired participant image using visual guide means even when that image is not currently displayed on the monitor.
Another object of the present invention is to provide such a system in which, when a participant is discussing or conversing with another participant, the local participant can easily find the participant image of that other participant and display it on the monitor.
Still another object of the present invention is to provide the above-described system capable of reducing inconvenience caused by so-called eye contact.
[0009]
SUMMARY OF THE INVENTION
The present invention is applied to a video conference system in which a plurality of terminal devices, each provided with a monitor, a sound generator (hereinafter referred to as a “speaker”), an imaging device (hereinafter referred to as a “video camera”), and an audio input device (hereinafter referred to as a “microphone”), exchange media data including participant images and participant voices with one another via a wired or wireless communication line. Each terminal device includes scroll operation means.
[0010]
Here, a participant image is an image captured by a video camera (usually an image of a video conference participant, but it may be an image of something else, such as a document). Normally, one participant image shows one participant, but it may show several.
Participant voice is, for example, voice input from a microphone (usually speech voice of a video conference participant).
[0011]
Media data and system information are passed through the communication line, and the media data includes the participant images and the participant voices described above.
The system information includes an identification code that identifies which participant an image or voice belongs to (this identification code is referred to as “S_id”). As will be described later, when participants with special roles, such as a chairperson or a vice chairperson, are determined in advance, the system information includes an identification code that identifies the role; when the media data belongs to a speaking participant, it includes an identification code indicating that the participant is speaking (both of these identification codes are referred to as “RC”).
Further, when the local participant is to be informed of the participant displayed on the participant display unit of a certain participant's terminal device (hereinafter referred to as the “conversation partner” for convenience), the system information also includes the identification code of that conversation partner (this identification code is referred to as “A_id”). Usually, A_id is the S_id of the conversation partner.
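As an illustration only, the three identification codes described above could be modelled as one record per participant. The following Python sketch is not taken from the patent: the class names, the bit-flag encoding of RC, and the helper `conversation_partner` are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Flag
from typing import Optional


class Role(Flag):
    """Hypothetical encoding of RC: special roles plus a 'speaking' flag."""
    NONE = 0
    CHAIR = 1
    VICE_CHAIR = 2
    SPEAKING = 4


@dataclass(frozen=True)
class SystemInfo:
    s_id: str                    # S_id: whose image and voice this is
    rc: Role = Role.NONE         # RC: role and/or "currently speaking"
    a_id: Optional[str] = None   # A_id: S_id of the conversation partner, if reported


def conversation_partner(records, s_id):
    """Return the A_id reported by participant s_id, or None if unknown."""
    for rec in records:
        if rec.s_id == s_id:
            return rec.a_id
    return None


# Example: chairperson J is speaking and currently has D on screen.
records = [
    SystemInfo("J", Role.CHAIR | Role.SPEAKING, a_id="D"),
    SystemInfo("D", a_id="J"),
    SystemInfo("K"),
]
```

With these records, `conversation_partner(records, "J")` yields `"D"`, which is the information a guide means would need in order to point at the chairperson's conversation partner.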
[0012]
A device for processing media data (media data processing device) may be provided in a terminal device, or may be provided in a device other than the terminal device such as a multipoint control unit.
The media data processing device has (1) participant image generation means, (2) position storage means, and (3) position calculation means (means for calculating the position on the virtual screen of a specific participant image, described later).
[0013]
The participant image generation means (1) creates a virtual screen in which a plurality of participant images are arranged at predetermined positions, and displays a part of the virtual screen on the participant display unit of the monitor. Normally, only one participant image among the participant images on the virtual screen is displayed on the participant display unit, and the virtual screen is scrolled according to the relative position signal from the scroll operation means. Note that the virtual screen displayed on the participant display section can be enlarged or reduced. When the virtual screen is reduced, a plurality of participants are displayed on the participant display section.
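The relationship between the virtual screen and the participant display unit can be sketched as a viewport that moves over a grid of participant images. This is a minimal sketch under stated assumptions: the tile size, the 4 x 4 row-major layout of images A to P, and the clamping at the screen edges are illustrative choices, not details given in the text.

```python
from dataclasses import dataclass

TILE_W, TILE_H = 320, 240   # assumed size of one participant image, in pixels
COLS, ROWS = 4, 4           # assumed 4 x 4 layout for the 16 images A..P


@dataclass
class Viewport:
    """The part of the virtual screen shown on the participant display unit."""
    x: int = 0            # top-left corner on the virtual screen
    y: int = 0
    w: int = TILE_W       # a larger w/h corresponds to a reduced (zoomed-out) view
    h: int = TILE_H

    def scroll(self, dx, dy):
        """Apply a relative position signal (dx, dy), clamped to the virtual screen."""
        max_x = COLS * TILE_W - self.w
        max_y = ROWS * TILE_H - self.h
        self.x = min(max(self.x + dx, 0), max_x)
        self.y = min(max(self.y + dy, 0), max_y)

    def visible_tiles(self):
        """Labels of the participant images that overlap the viewport."""
        labels = []
        for row in range(ROWS):
            for col in range(COLS):
                tx, ty = col * TILE_W, row * TILE_H
                overlaps_x = tx < self.x + self.w and tx + TILE_W > self.x
                overlaps_y = ty < self.y + self.h and ty + TILE_H > self.y
                if overlaps_x and overlaps_y:
                    labels.append(chr(ord("A") + row * COLS + col))
        return labels
```

A viewport the size of one tile shows a single participant; a viewport of two by two tiles corresponds to the reduced display in which several participants appear at once.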
[0014]
The position storage means (2) stores the arrangement of each participant image on the virtual screen. In the present invention, the local participant can also be allowed to specify or change the arrangement of the participant images stored in the position storage means; in that case, participant arrangement changing means is provided.
[0015]
The position calculation means (3) calculates the position of a specific participant on the virtual screen based on the arrangement information, stored in the position storage means, of the specific participant image (a specific participant image selected from among the participant images) on the virtual screen, and on the current arrangement information of the image shown on the participant display unit within the virtual screen. For example, the chairperson or vice chairperson of the video conference may be a specific participant, and a speaking participant may also be a specific participant.
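The calculation described here amounts to subtracting the current position of the participant display unit from the stored position of the specific participant image. The sketch below assumes that both are expressed as centre coordinates in virtual-screen pixels; the function name and the sample coordinates (a 4 x 4 grid of 320 x 240 tiles) are hypothetical.

```python
def position_vector(placement, s_id, viewport_center):
    """Offset from the viewport centre to the centre of participant s_id's image.

    placement maps S_id to the image centre on the virtual screen (assumed
    representation of the stored arrangement information).
    """
    px, py = placement[s_id]
    cx, cy = viewport_center
    return (px - cx, py - cy)


# Assumed tile centres for images J, D and K on a 4 x 4 grid of 320 x 240 tiles.
placement = {"J": (480, 600), "D": (1120, 120), "K": (800, 600)}
current = placement["K"]   # the participant display unit currently shows K
```

Here `position_vector(placement, "J", current)` is `(-320, 0)`, meaning J lies one tile to the left of the image currently displayed.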
[0016]
As will be described later, the guide means is an arrow that informs the local participant of the position of the specific participant.
[0017]
The guide means can also be used to inform the local participant of the position, on the virtual screen, of the image of the specific participant's conversation partner. Of course, the guide means can also be used to inform the local participant of the position on the virtual screen of the image of the conversation partner of a participant who is not a specific participant (a participant whose image position is not indicated by an arrow or the like).
That is, in the present invention, the guide means displayed on the monitor of a first terminal device can also visually inform the participant at the first terminal device of the position, on the virtual screen of the first terminal device, of the participant image displayed on the participant display unit of the monitor of a second terminal device.
The local participant cannot know the other participant, for example, when the chairman has a conversation with the other participant. That is, normally, the local participant cannot know the other participant except when the other participant is displayed on the monitor. For this reason, the local participant may not be able to fully understand the contents of the conference.
However, as described above, by indicating the position of a certain participant's conversation partner on the virtual screen with the guide means, the local participant can easily know who the conversation partner is.
[0018]
So that other participants can be informed of what role the local participant has (that is, what identification code RC the local participant has) and which participant the local participant is conversing with (that is, what identification code A_id the participant has), the media data processing device can have a system information output unit that outputs information identifying the participant image displayed on the participant image display unit of the monitor of the terminal device.
Conversely, so that the local participant can know what identification code RC and what identification code A_id a certain participant has, the media data processing device can have a system information input unit that receives information identifying the participant image displayed on the participant image display unit of the monitor of another terminal device.
[0019]
A guide means is displayed on the image display device. This guide means is provided to inform the local participant of the position of the specific participant image on the virtual screen, and can visually inform the local participant of the position of the specific participant image.
The guide means can typically be an arrow displayed on the position display unit of the monitor. In this case, the distance of the specific participant image from the participant display unit on the virtual screen can be indicated by the length of the arrow, and the direction of the specific participant image relative to the participant display unit on the virtual screen can be indicated by the direction of the arrow.
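One way to turn such an offset into an arrow is to take its Euclidean length and its angle. The sketch below is an assumption about how this could be done, not the patent's stated method: the scale factor, the cap on arrow length, and the convention that 0 degrees points right and 90 degrees points up are all illustrative.

```python
import math


def arrow_for(vector, scale=0.05, max_length=40.0):
    """Convert an offset (dx, dy) on the virtual screen into (length, angle).

    length grows with distance (capped so the arrow fits its display area),
    angle is in degrees in [0, 360), counter-clockwise from "right".
    Screen y grows downward, hence the sign flip on dy.
    """
    dx, dy = vector
    distance = math.hypot(dx, dy)
    length = min(distance * scale, max_length)
    angle = math.degrees(math.atan2(-dy, dx)) % 360 if distance else 0.0
    return length, angle
```

For example, an offset of `(-320, 0)` gives an arrow pointing left (180 degrees), and an offset of `(0, -100)` gives an arrow pointing up (90 degrees).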
[0020]
Hereinafter, an example of the operation of the video conference system of the present invention will be described.
As described above, the image displayed on the participant display unit is a part of the virtual screen. The local participant can move the virtual screen using the scroll operation means so that an arbitrary participant image is displayed on the participant display unit.
The position on the virtual screen of at least one specific participant image selected from the participant images (the direction of the specific participant image relative to the participant display unit, or its direction and distance) is displayed by the guide means (for example, an arrow).
This specific participant may be a conversation partner of another specific participant indicated by the guide means, or may be a conversation partner of another participant who is not indicated by the guide means.
[0021]
When the local participant wants to display the above-mentioned specific participant image on the participant display unit, he / she operates the scroll operation means with reference to the display on the position display unit. The operation means generates a relative position signal and outputs this signal to the participant image generation means. Based on this relative position signal, the participant image generating means scrolls the virtual screen. Further, the position calculation means calculates the position on the virtual screen of the specific participant image based on the participant display unit according to the scroll, and displays the position on the position display unit in real time.
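The interaction described in this paragraph is a small feedback loop: each relative position signal moves the displayed region, and the offsets to the specific participants are recomputed immediately. A minimal sketch, assuming centre coordinates in pixels and a hypothetical function name:

```python
def scroll_and_update(center, rps, targets):
    """Apply one relative position signal and recompute the guide offsets.

    center  : (x, y) centre of the participant display unit on the virtual screen
    rps     : (dx, dy) relative position signal from the scroll operation means
    targets : mapping S_id -> (x, y) centre of each specific participant image
    Returns the new centre and the offsets to every specific participant.
    """
    cx, cy = center[0] + rps[0], center[1] + rps[1]
    vectors = {s_id: (x - cx, y - cy) for s_id, (x, y) in targets.items()}
    return (cx, cy), vectors


# Scrolling left twice from K (800, 600) reaches J at (480, 600).
center = (800, 600)
targets = {"J": (480, 600)}
center, vectors = scroll_and_update(center, (-160, 0), targets)
center, vectors = scroll_and_update(center, (-160, 0), targets)
```

After the second step the offset to J is `(0, 0)`, which corresponds to J being shown on the participant display unit and its arrow shrinking to nothing.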
[0022]
By referring to the display on the position display unit, the local participant can easily find and display a specific participant image (for example, a participant that the local participant is paying attention to or a conversation partner thereof). Further, since the number of participant images displayed on the monitor is extremely small (for example, one person) as compared with the conventional terminal device, the so-called inconvenience of eye contact is alleviated.
[0023]
EMBODIMENT
FIG. 1 is a diagram showing an example of a system to which the video conference system of the present invention is applied, and FIG. 2 is an explanatory diagram specifically showing the video conference system of the present invention.
In FIG. 1, each of a plurality of terminal devices (hereinafter referred to as “local terminal devices”) 100 connected to a communication line 200 incorporates a media data processing device 1 (not shown in FIG. 1) and has, connected to this processing device 1, a monitor 111, a speaker 112, a video camera 113, a microphone 114, scroll operation means 115 (a joystick in the figure), a keyboard 116, and a computer case 120.
[0024]
As shown in FIG. 2, the communication line 200 carries media data MD (a multiplexed signal consisting of a video signal VS and an audio signal AS) and system information SI (identification codes S_id, RC, and A_id) from other terminal devices that are not shown (and that do not necessarily have the same configuration as the local terminal device 100).
As mentioned above, S_id is an identification code that identifies which participant the multiplexed signal MD belongs to. RC is an identification code that identifies the role of a participant having a special role and/or an identification code indicating that the participant is currently speaking. A_id is an identification code indicating a participant's conversation partner (that is, the identification code S_id of that conversation partner).
[0025]
The media data processing apparatus 1 includes a multiplexed signal input unit 12, a multiplexed signal output unit 13, a system information input unit 14, a system information output unit 15, a participant image generation unit 16, position calculation units 17a to 17d, and a position storage unit. 18, a specific participant changing unit 19, a participant arrangement changing unit 20, and a display device 21. Note that all the above-described components of the media data processing apparatus 1 may be built in the computer case 120 shown in FIG. 1, or some of the components (for example, the display device 21) may be included in the computer case. It may be built in the body 120, and other components may be built in another housing (not shown).
[0026]
The multiplexed signal output unit 13 receives the local participant image from the video camera 113 and the local participant voice from the microphone 114, and multiplexes them to generate media data MD_out (video signal VS, audio signal AS). The multiplexed signal output unit 13 outputs this MD_out to the communication line 200 as media data MD, together with the identification code S_id of the local participant.
The multiplexed signal input unit 12, on the other hand, receives media data MD (video signal VS, audio signal AS) from another terminal device as MD_in, together with the identification code S_id, from the communication line 200, and outputs the video signal VS and the identification code S_id to the participant image generation means 16. The audio signal AS is output to an audio processing circuit (not shown), which causes the speaker 112 to output the participant voice.
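The division of labour between the output unit 13 and the input unit 12 can be sketched as a pair of functions over a tagged packet. This is a simplified illustration: the `MediaData` record, the use of `bytes` for both signals, and the function names are assumptions, and real multiplexing of video and audio is far more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaData:
    """Hypothetical stand-in for the multiplexed signal MD tagged with S_id."""
    s_id: str
    video: bytes   # video signal VS
    audio: bytes   # audio signal AS


def multiplex(s_id, video_frame, audio_chunk):
    """Role of output unit 13: bundle camera and microphone data with the sender's S_id."""
    return MediaData(s_id, video_frame, audio_chunk)


def demultiplex(md):
    """Role of input unit 12: route (S_id, video) to image generation, audio to the speaker path."""
    to_image_generator = (md.s_id, md.video)
    to_audio_circuit = md.audio
    return to_image_generator, to_audio_circuit


packet = multiplex("K", b"frame-bytes", b"pcm-bytes")
image_input, audio_input = demultiplex(packet)
```

Keeping S_id attached to the video on the receiving side is what lets the participant image generation means place each image at the right position on the virtual screen.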
[0027]
The participant image generation means 16 creates a virtual screen 31 (see FIG. 3A) in which a plurality of participant images are arranged at predetermined positions. On the virtual screen 31, normally, participant images for all participants participating in the video conference are allocated in a predetermined arrangement.
[0028]
FIG. 3(a) shows the appearance of the terminal device and how the virtual screen 31 is displayed on the participant display unit 32 of the monitor 111. In FIG. 3(a), the monitor 111 is placed on the computer case 120. Participant images A to P are arranged on the virtual screen 31, and a part of the virtual screen 31 (usually at least one of the participant images; K in the figure) is displayed on the participant display unit 32 of the monitor 111.
The participant display unit 32 shown in FIG. 3(a) and the participant position display units 33a to 33d described later can be displayed in separate windows, or in a single window as shown in FIG. 3(a). A switch for opening this window can be provided as an icon. The participant display unit 32 and/or the participant position display units 33a to 33d may also be made enlargeable and reducible.
[0029]
In FIG. 2, the participant image generating means 16 places the video signal VS input from the multiplexed signal input unit 12 as a participant image at a predetermined position on the virtual screen 31 shown in FIG. 3A, on the basis of the identification code S_id and the arrangement information PI (described later) stored in the position storage unit 18.
As will be described later, the arrangement information PI may be changed by a local participant. In this case, the participant image generation unit 16 acquires the arrangement information PI from the position storage unit 18 and changes the arrangement of the participant image.
[0030]
The scroll operation means 115 is provided so that the local participant can scroll the virtual screen 31 in an arbitrary direction, and it generates a relative position signal RPS. When the local participant operates the scroll operation means 115, the participant image generation means 16 outputs to the display device 21 an image signal VD for displaying the virtual screen 31 so that a desired participant image is displayed on the participant display unit 32. The desired participant image is then displayed on the monitor 111 while being scrolled. Here, scrolling normally means vertical scrolling, horizontal scrolling (that is, panning), or a combination thereof (oblique scrolling).
[0031]
In addition to the joystick, a foot pedal or a keyboard key (for example, an arrow key) can be used as the scroll operation means 115. Also, an arrow-shaped soft key (usually operated by a mouse) displayed on the monitor 111 can be used. When using the foot pedal, the hand is always free, so the virtual screen 31 can be scrolled while writing or eating or drinking.
[0032]
Further, based on the relative position signal RPS and the window information WI output from the display device 21, the participant image generating means 16 generates information CP about the position of the part of the virtual screen 31 currently displayed on the participant display unit 32 (current position information), and outputs it to the position calculating means 17a to 17d. The window information WI includes information regarding the position and size of the participant display unit 32 on the display screen of the monitor 111.
In the present embodiment, the current position information CP is held by the participant image generation means 16, but may be held by other means (for example, each of the position calculation means 17a to 17d).
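How the current position information CP might be updated from the relative position signal RPS can be sketched as follows. The names `CurrentPosition` and `apply_scroll`, the pixel units, and the clamping at the edges of a planar virtual screen are all assumptions for illustration; the specification does not define the signal formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CurrentPosition:
    """Top-left corner of the displayed region on the virtual screen (assumed units: pixels)."""
    x: float
    y: float


def apply_scroll(cp, rps, window_size, screen_size):
    """Return the new CP after applying a relative displacement (dx, dy).

    window_size stands in for the size carried by the window information WI;
    screen_size is the size of the whole virtual screen. The result is clamped
    so the displayed region never leaves a planar virtual screen (an assumption).
    """
    dx, dy = rps
    max_x = max(screen_size[0] - window_size[0], 0)
    max_y = max(screen_size[1] - window_size[1], 0)
    new_x = min(max(cp.x + dx, 0), max_x)
    new_y = min(max(cp.y + dy, 0), max_y)
    return CurrentPosition(new_x, new_y)


cp = CurrentPosition(200, 200)
# Oblique scroll: right by 150 (clamped at the edge) and up by 50.
cp = apply_scroll(cp, (150, -50), window_size=(100, 100), screen_size=(400, 400))
print(cp)
```

A looped or sphere-like virtual screen would wrap the coordinates instead of clamping them.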
[0033]
The participant image generation means 16 also outputs the current position information CP described above to the position storage means 18, and obtains from the position storage means 18 information indicating who the conversation partner is (that is, the identification code A_id). The participant image generating means 16 then sends this identification code A_id to the system information output unit 15. The system information output unit 15 outputs the system information SI_out (the identification code A_id, the local participant's identification code S_id, and an identification code RC indicating the local participant's role) to the communication line 200 as SI.
[0034]
The system information input unit 14 receives the system information SI (identification codes S_id, A_id, and RC) from the communication line 200 as SI_in, and stores these codes in association with one another in its built-in storage unit 141.
The specific participant changing means 19 receives the identification codes RC and S_id from the system information input unit 14 and stores them in association with each other in its built-in storage unit 191.
As mentioned above, A_id is an identification code indicating the other participant displayed on the monitor of a given participant's terminal device; when the virtual screen is scrolled on that terminal device, the code value changes accordingly. RC is an identification code indicating the role of a given participant. For example, the code value differs depending on whether that participant is speaking or not. For this reason, the contents of the storage units 141 and 191 are normally updated frequently.
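The association of S_id, A_id, and RC described for the storage unit 141 can be sketched as a small keyed store. The class names, the string role values such as "speaking", and the lookup helpers are assumptions for illustration; the specification does not give the encoding of these codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemInfo:
    s_id: str  # participant who sent the system information
    a_id: str  # participant currently displayed on that participant's monitor
    rc: str    # role code (hypothetical values: "chair", "speaking", "idle")


class SystemInfoStore:
    """Keeps the most recent system information per sender."""

    def __init__(self):
        self._latest = {}

    def update(self, info):
        # A newer SI from the same sender replaces the older one.
        self._latest[info.s_id] = info

    def partner_of(self, s_id):
        info = self._latest.get(s_id)
        return info.a_id if info else None

    def with_role(self, rc):
        return sorted(s for s, info in self._latest.items() if info.rc == rc)


store = SystemInfoStore()
store.update(SystemInfo("J", "D", "chair"))
store.update(SystemInfo("M", "P", "idle"))
store.update(SystemInfo("M", "P", "speaking"))  # M starts speaking
print(store.partner_of("J"), store.with_role("speaking"))
```

Keeping only the latest entry per sender reflects the statement that the stored contents are updated frequently as participants scroll or change role.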
[0035]
The local participant can set who the specific participant is from the keyboard 116. In response to a signal from the keyboard 116, the specific participant changing unit 19 outputs a signal indicating who the specific participant is to the system information input unit 14. For example, when a signal for setting the chairperson (or the speaking participant, or any participant) as the specific participant is input from the keyboard 116, the specific participant changing unit 19 outputs a selection signal SS (identification code S_id and/or RC) indicating the chairperson (or the speaking participant, or any participant) to the system information input unit 14.
The specific participant can be set by a soft key (not shown) displayed on the monitor 111 without using the keyboard 116.
[0036]
In this example, it is assumed that there are four specific participants: the chairperson determined by the system (participant image J) and the chairperson's conversation partner (participant image D), and the participant selected by the local participant (participant image M) and that participant's conversation partner (participant image P).
The system information input unit 14 outputs the identification code S_id indicating the chairperson (participant image J) and the identification code A_id indicating the chairperson's conversation partner (participant image D) (S_scc and S_{a_scc} in FIG. 2) to the position calculating means 17a and 17b, respectively.
The system information input unit 14 also outputs the identification code S_id indicating the participant selected by the local participant (participant image M) and the identification code A_id indicating that participant's conversation partner (participant image P) (S_lcc and S_{a_lcc} in FIG. 2) to the position calculating means 17c and 17d, respectively.
[0037]
As described above, the position storage unit 18 stores the arrangement information PI of the participant images A to P on the virtual screen 31. In the simplest example, this arrangement information PI can represent the order of the participant images, relative to the position at which a given participant image is arranged, as a simple order matrix; it can also be expressed as coordinate information. The arrangement information PI held in the position storage means 18 for the images of the participants whose identification codes S_id are S_scc, S_lcc, S_{a_scc}, and S_{a_lcc} (that is, the specific participants) is output to the position calculation means 17a to 17d.
The position calculation means 17a to 17d calculate the position of each specific participant image on the virtual screen 31, relative to the participant display unit 32, based on (i) the arrangement information PI on the virtual screen 31 of each specific participant image acquired from the position storage means 18, and (ii) the current position information CP from the participant image generation means 16. In this embodiment, the calculation result is the position vector information PV_a to PV_d.
[0038]
The position calculation means 17a to 17d output to the display device 21 the position vector information PV_a to PV_d, relative to the participant display unit 32, of the four specific participant images J, D, M, and P selected in advance (the chairperson and the chairperson's conversation partner, and the participant selected by the local participant and that participant's conversation partner).
[0039]
Using the position vector information PV_a to PV_d calculated by the position calculation means 17a to 17d, the display device 21 displays the position of each specific participant image on the guide means (that is, the participant position display units 33a to 33d), as shown in FIG. 3A. The position shown in the specific participant position display sections 33a to 33d is normally indicated by the arrow 34 as shown in FIG. 3A, but it can also be displayed numerically.
[0040]
In the example of FIG. 3A, the shortest distance between the reference position of the participant display unit 32 (usually the center of the display unit) and the position of the specific participant image (usually the center of the image) is indicated by the length of the arrow 34, and the direction of the specific participant images J, D, M, and P relative to the participant display unit 32 is indicated by the direction of the arrow 34.
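The calculation performed by the position calculation means and the resulting arrow can be sketched as a vector difference followed by a length and an angle. The function names, the use of grid-cell coordinates as (x, y) = (column, row), and the screen convention of y increasing downward are assumptions for illustration only.

```python
import math


def position_vector(display_center, image_center):
    """Vector from the reference point of the participant display unit
    to the center of a specific participant image, on the virtual screen."""
    return (image_center[0] - display_center[0],
            image_center[1] - display_center[1])


def arrow_for(pv):
    """Arrow length (distance) and direction in degrees for a position vector.

    With y increasing downward (assumed), a negative angle points up the screen.
    """
    length = math.hypot(pv[0], pv[1])
    angle_deg = math.degrees(math.atan2(pv[1], pv[0]))
    return length, angle_deg


# Hypothetical cell coordinates: K is currently displayed at (2, 2);
# J sits one column to the left, D one column right and two rows up.
displayed = (2, 2)
pv_j = position_vector(displayed, (1, 2))
pv_d = position_vector(displayed, (3, 0))
print(pv_j, arrow_for(pv_j))
print(pv_d, arrow_for(pv_d))
```

When the displayed region is scrolled, only `display_center` changes, so the arrows can be refreshed by recomputing the same difference for each specific participant.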
[0041]
The local participant can change the arrangement information PI stored in the position storage means 18. That is, in response to a signal from the keyboard 116, the participant arrangement changing means 20 can output a signal for changing the arrangement (identification code S_id and arrangement information PI) to the position storage means 18. The local participant can also place the participant images of two or more participants of interest close to one another. In this way, those participants can be displayed alternately on the participant display unit 32 with a short scroll distance. Further, by enlarging the participant display unit 32, those participants can be displayed on the display unit 32 at the same time.
The change of the arrangement information can be set by a soft key displayed on the monitor 111.
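A change of arrangement that brings two participant images next to each other can be sketched as a swap of grid cells. The function `place_adjacent`, the 4 x 4 row-major starting layout, and the choice of which neighbouring cell to use are assumptions for illustration; the specification only states that the arrangement information can be changed.

```python
def place_adjacent(arrangement, anchor_id, moved_id):
    """Return a new arrangement in which moved_id occupies a cell next to anchor_id.

    The participant previously in that cell takes over moved_id's old cell.
    The input mapping is left unchanged.
    """
    updated = dict(arrangement)
    row, col = updated[anchor_id]
    occupant_at = {cell: pid for pid, cell in updated.items()}
    neighbours = [(row, col + 1), (row, col - 1), (row + 1, col), (row - 1, col)]
    if updated[moved_id] in neighbours:
        return updated  # already adjacent, nothing to do
    for cell in neighbours:
        if cell in occupant_at:  # cell exists on the virtual screen
            displaced = occupant_at[cell]
            updated[displaced], updated[moved_id] = updated[moved_id], cell
            return updated
    return updated


# Hypothetical starting layout: A..P in row-major order on a 4 x 4 grid.
arrangement = {chr(ord("A") + i): divmod(i, 4) for i in range(16)}
rearranged = place_adjacent(arrangement, anchor_id="J", moved_id="D")
print(rearranged["J"], rearranged["D"], rearranged["K"])
```

After the swap, scrolling between the two images of interest covers a single cell instead of several.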
[0042]
Hereinafter, the operation of the video conference system will be described more specifically with reference to FIGS. 2 and 3.
In the embodiment of FIGS. 2 and 3A, when the terminal device 100 is activated, the system information input unit 14 receives the system information SI (identification codes S_id, A_id, and RC, and possibly information on the number of participants in the video conference) from the communication line 200 as SI_in. The participant image generating unit 16 arranges the participant images A to P on the virtual screen 31 based on the default information from the position storage unit 18.
[0043]
Here, when the local participant selects a specific participant image from the keyboard 116 as necessary, the specific participant changing means 19 outputs a selection signal SS (S_id and/or RC) to the system information input unit 14. The system information input unit 14 outputs the identification codes S_id (S_scc, S_lcc) and the identification codes A_id (S_{a_scc}, S_{a_lcc}) of the specific participants to the position calculating means 17a to 17d. The position calculation means 17a to 17d refer to the current position information CP from the participant image generation means 16 and the arrangement information PI stored in the position storage means 18, and output the position information of each specific participant image on the virtual screen 31 to the display device 21. The display device 21 then displays the arrow 34 on each of the participant position display units 33a to 33d.
[0044]
When the local participant operates the scroll operation unit 115 with reference to any of the participant position display units 33a to 33d, the relative position signal RPS from the operation unit 115 is output to the participant image generation unit 16.
[0045]
The participant image generation means 16 scrolls the virtual screen 31 so that the specific participant image is displayed on the participant display unit 32. Further, the position calculating means 17a to 17d change the arrow 34 of the participant position display units 33a to 33d based on the current position signal CP and the arrangement information PI.
[0046]
For example, by referring to the arrow 34 of the participant position display unit 33b and operating the scroll operation unit 115, the local participant can display the image of the chairperson's (participant image J) conversation partner (participant image D) on the participant display section 32.
Further, when the chairperson and a certain participant (for example, the participant displayed as participant image D) converse frequently, the chairperson's participant image J and the participant image D can be arranged close to each other, as shown in FIG. 3B, by means of the participant arrangement changing means 20 (see FIG. 2).
[0047]
A plurality of participants can also be displayed on the participant display section 32 of FIG. 3A. For this purpose, the virtual screen 31 is usually displayed at a reduced size, or the participant display unit 32 is made horizontally and/or vertically elongated.
In this case, referring to FIG. 2, the participant image generating means 16 outputs a plurality of identification codes A_id to the system information output unit 15, and the system information output unit 15 outputs SI_out containing these identification codes A_id to the communication line 200 as SI.
When another terminal device outputs system information SI containing such a plurality of identification codes A_id, the system information input unit 14 receives that system information SI from the other terminal device as SI_in. In this case, the media data processing apparatus 1 is provided in advance with participant position display units indicating the plurality of conversation partners, and with corresponding position calculation units, in a number matching the number of identification codes A_id.
[0048]
In the present invention, with the speaking participant set as the specific participant, when participant X₁ speaks to participant X₂ and this participant X₂ then speaks to participant X₃, the local participant can track participants X₁, X₂, and X₃.
In this case, the participant position display section 33c in FIG. 3A shows the position on the virtual screen 31 of the speaking participant's image, and the participant position display section 33d shows the position on the virtual screen 31 of that participant's conversation partner.
When participant X₂ is displayed in the participant display section of the monitor of participant X₁'s terminal device and participant X₁ speaks to participant X₂, the participant position display section 33c displays the image position of participant X₁ and the participant position display section 33d displays the image position of participant X₂. Next, when participant X₂ displays participant X₃ in the participant display section of the monitor of his or her terminal device and speaks to participant X₃, the participant position display section 33c displays the image position of participant X₂ and the participant position display section 33d displays the image position of participant X₃. Therefore, by referring to the participant position display units 33c and 33d, the local participant can easily track, in sequence, the conversation partners X₂ and X₃ of the speaking participants X₁ and X₂.
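The tracking of a speaker and that speaker's conversation partner can be sketched as a lookup that decides which two participants the display sections 33c and 33d should point at. The function `guide_targets` and the dictionary `displayed_by` (each participant's S_id mapped to the A_id shown on that participant's monitor) are hypothetical names introduced here for illustration.

```python
def guide_targets(speaker_id, displayed_by):
    """Return (target for 33c, target for 33d): the current speaker and
    the participant shown on the speaker's monitor, if known."""
    return speaker_id, displayed_by.get(speaker_id)


# X1 is looking at X2, and X2 is looking at X3 (A_id values received as system information).
displayed_by = {"X1": "X2", "X2": "X3"}

first = guide_targets("X1", displayed_by)   # X1 speaks
second = guide_targets("X2", displayed_by)  # then X2 speaks
print(first, second)
```

Feeding each result into the position calculation would move the two arrows from (X₁, X₂) to (X₂, X₃) as the conversation passes along.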
[0049]
The arrow 34 displayed on the specific participant position display units 33a to 33d can be dragged to an appropriate position. In FIG. 3A, an arrow 34 dragged to the participant image is indicated by a dotted line. This arrow 34 is preferably displayed with a dotted line (or a form that does not hinder the visual recognition of the participant image). By overlapping the arrow 34 with the participant, the frequency of the local participant's viewpoint moving on the screen of the monitor 111 is reduced. Thereby, the inconvenience of eye contact is reduced.
[0050]
FIG. 3A shows a case where the participant images A to P are arranged in a matrix on the virtual screen 31 in a planar mode. However, as shown in FIG. 4A, the surface on which the participant images A to P are displayed may be topologically equivalent to a spherical surface, so that each participant image can be freely scrolled in any direction.
As shown in FIG. 4B, the participant images A to P may also be arranged on the virtual screen 31 in a vertical or horizontal (horizontal in the figure) linear mode. In this case too, the participant images may be arranged in a loop as shown in FIG. 4C.
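For the loop-shaped linear mode, the guide arrow would naturally point along the shorter way around the loop. The following sketch computes that signed offset for images indexed 0 to count - 1; the function name `loop_offset`, the index-based positions, and the tie-breaking toward the forward direction are assumptions, not details from the specification.

```python
def loop_offset(current, target, count):
    """Signed number of steps from current to target on a loop of `count` images.

    Positive means scroll forward, negative means scroll backward.
    Ties are resolved in favour of the forward direction.
    """
    forward = (target - current) % count
    backward = forward - count
    return forward if forward <= -backward else backward


# Sixteen images A..P arranged in a loop (indices 0..15).
print(loop_offset(1, 14, 16))  # going backward past index 0 is shorter
print(loop_offset(14, 1, 16))  # going forward past index 15 is shorter
```

On a non-looped linear arrangement the offset would simply be `target - current`.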
[0051]
[Effects of the invention]
As described above, the present invention can provide the following effects.
(1) Since the local participant can freely select and display each participant image arranged on the virtual screen by scrolling the virtual screen, it is no longer necessary to display all the participant images on one monitor.
[0052]
(2) The local participant can easily find other participants and / or their interaction partners using visual guide means.
[0053]
(3) The virtual screen can be scrolled, following the indication of the participant position display unit, by operating a joystick, a soft key, or a hard key. Further, the participant position display unit gives the local participant a simple indication. Therefore, in the present invention, operability when displaying a desired participant on the monitor is extremely high compared with the conventional technique, in which a given participant image must be searched for by mouse operation among a large number of icons and overlapping windows.
[0054]
(4) Even when there are many video conference participants, the participant images displayed on the monitor can be scrolled. Since the number of participant images displayed on the monitor can therefore be one or very few, the local participant's line of sight moves very infrequently. As a result, the eye contact problem, in which other participants viewing the local participant's image over the communication line are distracted by the local participant's eye movements, is greatly reduced compared with the conventional case.
[Brief description of the drawings]
FIG. 1 is a diagram showing an example of a system to which a video conference system of the present invention is applied.
FIG. 2 is an explanatory diagram of a video conference system according to the present invention.
FIG. 3A is an explanatory diagram showing a display state of a monitor of the video conference system of FIG. 2, and FIG. 3B is a diagram showing a state in which the chairperson image J and the participant image D are arranged at close positions.
FIGS. 4A to 4C are diagrams illustrating modes of the virtual screen in the video conference system according to the present invention: FIG. 4A shows the virtual screen arranged on a display surface topologically equivalent to a spherical surface, FIG. 4B shows the virtual screen arranged in a linear mode, and FIG. 4C shows the virtual screen arranged in a loop-shaped linear mode.
[Explanation of symbols]
1 Media data processing device
12 Multiplexed signal input section
13 Multiplexed signal output section
14 System information input section
141,191 storage unit
15 System information output section
16 Participant image generation means
17a-17d Position calculation means
18 Position storage means
19 Specific participant changing means
20 Participant placement change means
21 Display device
31 Virtual screen
32 Participant display
33a-33d Participant position display section
34 Arrow
111 Monitor
112 Speaker
113 Video camera
114 Microphone
115 Scroll operation means
116 keyboard
120 computer case
100 terminal device
200 communication line
MD, MD_out, MD_in Multiplexed signal
SI, SI_out, SI_in System information
S_id, A_id, RC Identification code
VS video signal
AS audio signal
RPS relative position signal
SS selection signal
PI placement information
CP current position signal
PV_a to PV_d Position vector information
VD image signal

Claims

A multipoint video conferencing system in which a plurality of terminal devices, each including an image display device, a sound generation device, an imaging device, and a voice input device, transmit and receive over a communication line media data including a participant image input from the imaging device and a participant voice input from the voice input device, wherein:
the terminal device includes scroll operation means;
a media data processing device connected to the image display device, the communication line, and the scroll operation means has:
(1) participant image generation means that creates a virtual screen in which a plurality of participant images given from the communication line are arranged at predetermined positions, displays a part of the virtual screen on the participant display unit of the image display device, and can scroll the virtual screen according to the relative position signal from the scroll operation means so that any of the plurality of participant images given from the communication line is displayed on the participant display unit;
(2) position storage means for storing the position of each participant image on the virtual screen; and
(3) means for calculating the position, on the virtual screen and with reference to the participant display unit, of at least one specific participant image selected from the participant images, based on
the arrangement information on the virtual screen of the specific participant image acquired from the position storage means, and
the position on the virtual screen of the image displayed on the participant display unit;
and the image display device displays guide means for visually informing the participant of the terminal device of the position of the specific participant image on the virtual screen.

The multipoint video conference system according to claim 1, wherein the guide means displayed on the image display device of a first terminal device visually notifies the participant of the first terminal device of the position, on the virtual screen of the first terminal device, of the participant image displayed on the participant display unit of the image display device of a second terminal device.

The multipoint video conference system according to claim 1 or 2, wherein:
the guide means is an arrow indicating the position of the specific participant image;
the length of the arrow represents the distance, on the virtual screen, from the arrangement position of the participant image to the arrangement position of the specific participant image; and
the direction of the arrow is the direction, on the virtual screen, from the position of the participant image to the position of the specific participant image.

The multipoint video conference system according to claim 1, further comprising specific participant changing means for selecting the specific participant image or changing the selection of the specific participant image after the selection.

The multipoint video conferencing system further comprising participant arrangement changing means for designating or changing the arrangement information, on the virtual screen, of the participant images stored in the position storage means.

The multipoint video conference system according to claim 1, wherein the media data processing device includes a system information output unit that outputs information specifying the participant image displayed on the participant display unit of the image display device, and a system information input unit that inputs such information from another terminal device.