JPH0746561A

JPH0746561A - Multimedia equipment

Info

Publication number: JPH0746561A
Application number: JP5189591A
Authority: JP
Inventors: Shuichi Kadowaki; 修一門脇
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1993-07-30
Filing date: 1993-07-30
Publication date: 1995-02-14

Abstract

PURPOSE: To allow a user to easily identify, among plural windows, the window of the party whose voice is being received. CONSTITUTION: A level detection circuit 22 detects the level of the voice from each party. When no voice above a prescribed level has been received from a party for a prescribed time, the level detection circuit 22 instructs a display control circuit 32 to display that party as an icon on a display device 34. When voice above the prescribed level is received from a party displayed as an icon, the display control circuit 32 is instructed to display that party in a window on the display device 34.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a multimedia device that communicates multimedia (images, voice, data, and the like) with a plurality of parties via a communication line.

[0002]

[Prior Art] Conventionally, a video conference system performs multimedia communication with a plurality of parties via a communication line by multiplexing images, voice, data, and the like. It is configured so that the voices received from the plurality of parties are mixed into one and output to a speaker, and the images received from the plurality of parties are displayed in multiple windows.

[0003]

[Problems to Be Solved by the Invention] However, in the conventional system, the images of all parties are displayed in windows regardless of whether voice is currently being received from them. In addition, the voices received from the plurality of parties are mixed equally and output irrespective of the position of each window, so the voice corresponding to every window sounds the same. It is therefore difficult for the user to tell which window's party the voice output from the speaker comes from. Moreover, displaying the images of all parties in windows places a heavy load on the device.

[0004] An object of the present invention is to provide a multimedia device that solves these problems.

[0005]

[Means for Solving the Problems] A multimedia device according to a first invention performs multimedia communication with a plurality of parties via a communication line and displays a plurality of images received from those parties in multiple windows. It is characterized by comprising voice level detection means for detecting the level of the voice received from each party, and means for switching to icon display, based on the detection result of the voice level detection means, the window that displays the image of a party from whom a predetermined voice has not been received for a predetermined time.

[0006] A multimedia device according to a second invention performs multimedia communication with a plurality of parties via a communication line and displays received images in multiple windows. It is characterized by comprising means for reading the positional relationship of the windows, and output means for outputting the voice received from each of the plurality of parties at a volume that depends on the position of the corresponding window.

[0007]

[Operation] With the above means, a party who has not spoken for a predetermined time is removed from the window display, so the user can easily recognize the window that deserves attention, that is, the window of the party corresponding to the voice.

[0008] In addition, the relative loudness of the voice output from each speaker makes it easier for the user to recognize the window of the party corresponding to that voice.

[0009]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0010] FIG. 1 is a schematic block diagram of a first embodiment of the present invention. Reference numeral 10 denotes a CPU that controls the entire device; 12, a ROM that stores the program executed by the CPU 10; 14, a RAM that stores data used by the CPU 10; 16, an operating device comprising a keyboard, a touch panel, and the like; 18, a voice encoding/decoding circuit that encodes and decodes voice signals in accordance with ITU-T (formerly CCITT) Recommendation G.722; 20, a microphone for voice input; 22, a level detection circuit that detects the level of the received voice; 24, a voice mixing circuit that mixes the voices received from the plurality of parties into one; and 26, a speaker that outputs the voice.

[0011] Reference numeral 28 denotes an image encoding/decoding circuit that encodes and decodes image signals in accordance with ITU-T Recommendation H.261; 30, a camera for image input; 32, a display control circuit that displays the images of the plurality of parties in multiple windows and switches between window display and icon display; and 34, a display device, such as an LCD or a CRT, that displays the images.

[0012] Reference numeral 36 denotes a multiplexing/demultiplexing circuit that multiplexes and demultiplexes images, voice, data, and the like in accordance with ITU-T Recommendation H.221; 38, a line control circuit that controls connection to and communication over the communication line; and 40, a clock from which the current time can be read.

[0013] FIG. 2 shows the structure of the voice frame supplied from the voice encoding/decoding circuit 18 to the voice mixing circuit 24. The voice signals from the plurality of parties are time-division multiplexed, and the voice signal of each party is stored in its own time slot. In FIG. 2, 42 is the time slot that stores the voice signal of party #1; 44, the time slot for party #2; 46, the time slot for party #3; and 48, the time slot for party #4. FIG. 2 shows the case of four parties; after time slot 48, the time slots repeat in the same order.
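The repeating time-slot layout of FIG. 2 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each time slot carries a single sample value; the source does not state the slot size, and the function name `split_time_slots` is hypothetical.

```python
from typing import Dict, List, Sequence


def split_time_slots(frame: Sequence[int], num_parties: int) -> Dict[int, List[int]]:
    """Split a time-division-multiplexed stream into per-party sample lists.

    Assumption (not from the source): one sample per time slot, with slots
    ordered party #1, #2, ..., #N and then repeating.
    """
    if num_parties <= 0:
        raise ValueError("num_parties must be positive")
    slots: Dict[int, List[int]] = {party: [] for party in range(1, num_parties + 1)}
    for index, sample in enumerate(frame):
        party = index % num_parties + 1  # slot position within the repeating cycle
        slots[party].append(sample)
    return slots
```

With four parties, the first, fifth, ninth, ... values belong to party #1, matching the order of time slots 42, 44, 46, and 48.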

[0014] FIG. 3 shows an example of the screen of the display device 34 during communication. In FIG. 3, 50 is the screen of the display device 34; 52 is a window in the screen 50 that displays the image received from party #1; 54, a window that displays the image received from party #2; 56, a window that displays the image received from party #3; and 58, an icon in the screen 50 that displays the image received from party #4. FIG. 3 shows the case of four parties. An icon displays a still image, a name, or the like that identifies the party.

[0015] FIG. 4 shows the memory layout of the RAM 14. Reference numeral 60 denotes a variable N that stores the number of parties; 62, a variable Tm that stores the maximum time for which the window of a party from whom no voice is being received remains displayed; 64, a variable T that stores the current time; 66, a variable I that stores a party number; 68, a variable V that stores the voice level of party I; 70, a variable Ti(1) that stores the time at which voice from party #1 was received and its window displayed; 72, a variable Ti(2) that stores the corresponding time for party #2; 74, a variable Ti(3) that stores the corresponding time for party #3; and 76, a variable Ti(N) that stores the corresponding time for party #N.

[0016] FIG. 5 shows a flowchart of the program stored in the ROM 12. At the start of communication, it is assumed that the number of parties has been set in the variable N, the maximum window display time for a party from whom no voice is being received has been set in the variable Tm, and the time at which communication with party I started has been set in each variable Ti(I).

[0017] This program is started each time the voice frame shown in FIG. 2 is received.

[0018] First, the current time is read from the clock 40 into the variable T (S1), and the variable I is set to 1 (S2). If the variable I is less than or equal to the variable N, the process proceeds to S4; otherwise it ends (S3).

[0019] In S4, the level detection circuit 22 detects the voice level from party I and stores it in the variable V. If the variable V is zero, the process proceeds to S6; otherwise it proceeds to S8 (S5). In S6, the variable T is compared with the variable Ti(I); if the difference is greater than the variable Tm, the process proceeds to S7, and otherwise to S10.

[0020] In S7, the display control circuit 32 is instructed to display the image of party I on the display device 34 as an icon. In S8, the display control circuit 32 is instructed to display the image of party I on the display device 34 in a window.

[0021] The contents of the variable T are set in the variable Ti(I) (S9), 1 is added to the variable I, and the process returns to S3 (S10).
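Steps S1 to S10 can be sketched in Python as follows. This is an illustrative reading of the flowchart description, not the program held in the ROM 12. The names `DisplayState`, `on_audio_frame`, and `read_level` are hypothetical, and the sketch assumes that S7 continues to S10 and that S8 continues to S9, which the text implies but does not state outright.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class DisplayState:
    n: int                                  # N: number of parties
    tm: float                               # Tm: max window time without voice
    ti: Dict[int, float]                    # Ti(I): last time voice was received
    mode: Dict[int, str] = field(default_factory=dict)  # "window" or "icon"


def on_audio_frame(state: DisplayState, now: float,
                   read_level: Callable[[int], float]) -> None:
    """Run once per received voice frame (hypothetical helper)."""
    t = now                                 # S1: read the current time into T
    i = 1                                   # S2
    while i <= state.n:                     # S3
        v = read_level(i)                   # S4: voice level of party I
        if v == 0:                          # S5
            if t - state.ti[i] > state.tm:  # S6
                state.mode[i] = "icon"      # S7 (assumed to continue at S10)
        else:
            state.mode[i] = "window"        # S8
            state.ti[i] = t                 # S9
        i += 1                              # S10
```

Because the display mode is simply overwritten, repeating an instruction for a party already in that mode changes nothing.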

[0022] Note that if, in S7, the display control circuit 32 is instructed to display as an icon a party who is already displayed as an icon, the icon display simply continues and nothing changes. Likewise, if, in S8, the display control circuit 32 is instructed to display in a window a party whose image is already displayed in a window, the window display simply continues and nothing changes.

[0023] According to the above embodiment, the voice levels of the communicating parties are monitored continuously, and a party who has not spoken for a predetermined time is switched from window display to icon display. The number of windows decreases accordingly, so the user can readily pick out the window that deserves attention in relation to the voice currently being received. Because fewer windows are displayed, the window display load on the device is also reduced.

[0024] In the embodiment described above, the ROM 12 and the RAM 14 are used as the storage devices for programs and data, but a floppy disk, a hard disk, a memory card, or the like may be used instead.

[0025] In the above embodiment, ITU-T Recommendation H.261 is used as the image encoding/decoding method, but another image encoding/decoding method such as MPEG (Moving Picture Coding Experts Group) may be used.

[0026] In the above embodiment, moving images are used as the images, but still images or a sequence of still images may be used.

[0027] In the above embodiment, ITU-T Recommendation H.221 is used as the multimedia multiplexing method, but a method that assigns a plurality of calls to images, voice, and data may be used.

[0028] In the above embodiment, the voices and images of the plurality of parties are multiplexed by time-division multiplexing, but frequency multiplexing, phase-difference multiplexing, or packet multiplexing may be used.

[0029] In the above embodiment, the windows are displayed partially overlapping, but a non-overlapping layout such as a split screen may also be used.

[0030] In the above embodiment, an icon displays a still image or a name that identifies the party, but a moving image of the party may be displayed in the icon.

[0031] In the above embodiment, it is judged that no voice is being received when the party's voice level is zero (0), but it may instead be judged that no voice is being received when the party's voice level is below a predetermined value. Alternatively, among all the parties, a predetermined number of parties with relatively low voice levels may be judged to be parties from whom no voice is being received.
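The two alternative criteria can be sketched as follows. The function names, the dictionary of levels keyed by party number, and the tie-breaking by party number are illustrative assumptions; the source specifies only the criteria themselves.

```python
from typing import Dict, Set


def silent_by_threshold(levels: Dict[int, float], threshold: float) -> Set[int]:
    """Parties whose voice level is below a predetermined value."""
    return {party for party, level in levels.items() if level < threshold}


def silent_by_rank(levels: Dict[int, float], count: int) -> Set[int]:
    """The `count` parties with the lowest voice levels.

    Ties are broken by party number (an assumption made for determinism).
    """
    ranked = sorted(levels, key=lambda party: (levels[party], party))
    return set(ranked[:max(count, 0)])
```

Either function could stand in for the "V is zero" test of S5, with the returned set marking the parties treated as not being received.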

[0032] In the above embodiment, the variable Tm is set at the start of communication, but the user may be allowed to change it during communication via the operating device 16. The variable Tm may also be set separately for each party, so that a priority for icon display can be assigned regardless of the voice level.

[0033] FIG. 6 is a schematic block diagram of a second embodiment of the present invention. Reference numeral 110 denotes a CPU that controls the entire device; 112, a ROM that stores the program executed by the CPU 110; 114, a RAM that stores data used by the CPU 110; 116, an operating device comprising a keyboard, a touch panel, and the like; 118, a voice encoding/decoding circuit that encodes and decodes voice signals in accordance with ITU-T Recommendation G.722; 120, a microphone for voice input; 122, a voice amplification circuit that amplifies the voice received from the voice encoding/decoding circuit 118; 124L and 124R, voice mixing circuits, each of which mixes the voices of the plurality of parties input from the voice amplification circuit 122 into one; 126L, a left speaker that outputs the voice input from the voice mixing circuit 124L; and 126R, a right speaker that outputs the voice input from the voice mixing circuit 124R.

[0034] Reference numeral 128 denotes an image encoding/decoding circuit that encodes and decodes image signals in accordance with ITU-T Recommendation H.261; 130, a camera for image input; 132, a display control circuit that displays the images of the plurality of parties in multiple windows and reads the display state of the windows; and 134, a display device, such as an LCD or a CRT, that displays the images.

[0035] Reference numeral 136 denotes a multiplexing/demultiplexing circuit that multiplexes and demultiplexes images, voice, data, and the like in accordance with ITU-T Recommendation H.221; and 138, a line control circuit that controls connection to and communication over the communication line.

[0036] The format of the voice frame sent from the voice encoding/decoding circuit 118 to the voice mixing circuits 124L and 124R is the same as that of the first embodiment shown in FIG. 2.

[0037] FIG. 7 shows an example of the screen of the display device 134 during communication, together with the speakers 126L and 126R arranged to its left and right. In FIG. 7, 150 is the screen of the display device 134; 152 is a window in the screen 150 that displays the image received from party #1; 154, a window that displays the image received from party #2; 156, a window that displays the image received from party #3; and 158, a window that displays the image received from party #4.

[0038] D1 is the distance from the left edge of the screen 150 to the center of the window 152 of party #1; D2, the distance to the center of the window 154 of party #2; D3, the distance to the center of the window 156 of party #3; and D4, the distance to the center of the window 158 of party #4. W is the distance from the left edge to the right edge of the screen 150, that is, the width of the screen 150. DL is the distance from the center of the left speaker 126L to the left edge of the screen 150, and DR is the distance from the center of the right speaker 126R to the right edge of the screen 150.

[0039] FIG. 7 shows the case of four parties. Each of the windows 152 to 158 can be moved within the screen 150 by operating the operating device 116.

[0040] FIG. 8 shows the memory layout of the RAM 114. Reference numeral 160 denotes a variable N that stores the number of parties; 162, a variable w that stores the width W of the screen 150; 164, a variable d(L) that stores the distance DL from the center of the left speaker 126L to the left edge of the screen 150; 166, a variable d(R) that stores the distance DR from the center of the right speaker 126R to the right edge of the screen 150; and 168, a variable A that stores the amplification factor by which the voice amplification circuit 122 amplifies the voice input from the voice encoding/decoding circuit 118. That is, the voice from the voice encoding/decoding circuit 118 multiplied by A corresponds to the total amount of voice output from the two voice mixing circuits 124L and 124R.

[0041] Reference numeral 170 denotes a variable I that stores a party number; 172, a variable di that stores the distance (D1, D2, D3, or D4) from the left edge of the screen 150 to the center of the window of party I; 174, a variable R(L) that stores the ratio of the voice output to the voice mixing circuit 124L to the total amount of voice output from the voice amplification circuit 122 to the two voice mixing circuits 124L and 124R; and 176, a variable R(R) that stores the ratio of the voice output to the voice mixing circuit 124R to that same total.

[0042] FIG. 9 shows a flowchart of the program stored in the ROM 112. At the start of communication, it is assumed that the number of parties has been set in the variable N, the width of the screen 150 in the variable w, the distance DL from the center of the left speaker 126L to the left edge of the screen 150 in the variable d(L), the distance DR from the center of the right speaker 126R to the right edge of the screen 150 in the variable d(R), and the amplification factor of the voice amplification circuit 122 in the variable A.

[0043] This program is started each time the voice frame shown in FIG. 2 is received.

[0044] First, the variable I is set to 1 (S101). If the variable I is less than or equal to the variable N, the process proceeds to S103; otherwise it ends (S102).

[0045] In S103, the distance (D1, D2, D3, or D4) from the left edge of the screen 150 to the center of the window of party I is read from the display control circuit 132 and stored in the variable di. The ratio of the voice output to the voice mixing circuit 124L is stored in the variable R(L) (S104). This ratio is the distance from the center of the left speaker 126L to the center of the window of party I, divided by the center-to-center distance between the left and right speakers 126L and 126R; that is, it can be calculated as (d(L) + di) / (d(L) + w + d(R)). The value 1 - R(L) is stored in the variable R(R) (S105).
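The ratios of S104 and S105 can be written out as below, followed by a worked example. The numerical values d(L) = 5, w = 40, d(R) = 5, and d_i = 10 are chosen only for illustration and do not come from the source. As written, R(L) grows as the window moves away from the left speaker.

```latex
\[
R(L) = \frac{d(L) + d_i}{d(L) + w + d(R)}, \qquad R(R) = 1 - R(L)
\]
% Illustrative values (assumed): d(L) = 5, w = 40, d(R) = 5, d_i = 10
\[
R(L) = \frac{5 + 10}{5 + 40 + 5} = \frac{15}{50} = 0.3, \qquad R(R) = 1 - 0.3 = 0.7
\]
```

The two ratios always sum to 1, so the total output across both speakers is independent of the window position.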

[0046] Based on the voice output ratios stored in S104 and S105, the voice amplification circuit 122 is instructed to set the amplification factor of the voice to be output to the voice mixing circuit 124L to A × R(L) (S106), and to set the amplification factor of the voice to be output to the voice mixing circuit 124R to A × R(R) (S107).

[0047] 1 is added to the variable I, and the process returns to S102 (S108).
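Steps S101 to S108 can be sketched in Python as follows. This is an illustrative reading of the flowchart description rather than the program held in the ROM 112. The function name `speaker_gains` and the representation of window positions as a list are assumptions.

```python
from typing import Dict, Sequence, Tuple


def speaker_gains(window_centers: Sequence[float], w: float, d_left: float,
                  d_right: float, a: float) -> Dict[int, Tuple[float, float]]:
    """Return (left, right) amplification factors for each party.

    window_centers[i - 1] plays the role of di for party I: the distance
    from the left edge of the screen to the center of that party's window.
    """
    span = d_left + w + d_right                # distance between speaker centers
    if span <= 0:
        raise ValueError("speaker spacing must be positive")
    gains: Dict[int, Tuple[float, float]] = {}
    i = 1                                      # S101
    while i <= len(window_centers):            # S102
        di = window_centers[i - 1]             # S103
        r_left = (d_left + di) / span          # S104: R(L)
        r_right = 1 - r_left                   # S105: R(R)
        gains[i] = (a * r_left, a * r_right)   # S106, S107
        i += 1                                 # S108
    return gains
```

For each party the two factors sum to A, which matches the statement that the total output corresponds to the input multiplied by A.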

[0048] The user can change the position of each window during communication by operating the operating device 116, and the value of the variable di changes accordingly.

[0049] As is clear from the above description, according to the second embodiment, the relative loudness of the voice from the speakers lets the user estimate the position of the window of the party corresponding to that voice. The user can also watch and listen to the voice and video without a sense of incongruity.

[0050] As in the first embodiment, the second embodiment may use another image encoding/decoding method such as MPEG in place of ITU-T Recommendation H.261, and still images or a sequence of still images may be used in place of moving images. In place of ITU-T Recommendation H.221, a multimedia multiplexing method that assigns a plurality of calls to images, voice, and data may be used. For multiplexing the voices and images of the plurality of parties, frequency multiplexing, phase-difference multiplexing, or packet multiplexing may be used in place of time-division multiplexing. The screen display may also be changed to one in which the windows do not overlap, such as a split screen.

[0051] In the second embodiment, only the two left and right speakers 126L and 126R are used, but three or more speakers may be used. For example, by providing upper and lower speakers and varying the voice output ratio between them, the direction of the voice can be matched more accurately to the position of the window.

[0052] The second embodiment has been described on the assumption that the variable A does not change during communication, but it may be made changeable by operating the operating device 116.

[0053]

[Effects of the Invention] As can be understood from the above description, according to the present invention, the user can at all times easily identify and watch the window of the party corresponding to the voice.

[Brief description of drawings]

FIG. 1 is a schematic block diagram of a first embodiment of the present invention.

FIG. 2 is a frame structure diagram of the voice frame during communication in the first embodiment.

FIG. 3 is an example of a display screen during communication in the first embodiment.

FIG. 4 shows the memory layout of the RAM 14 of the first embodiment.

FIG. 5 is a flowchart of the program stored in the ROM 12 of the first embodiment.

FIG. 6 is a schematic block diagram of a second embodiment of the present invention.

FIG. 7 is an example of a display screen during communication in the second embodiment.

FIG. 8 shows the memory layout of the RAM 114 of the second embodiment.

FIG. 9 is a flowchart of the program stored in the ROM 112 of the second embodiment.

[Explanation of symbols]

10: CPU
12: ROM
14: RAM
16: Operation circuit
18: Voice encoding/decoding circuit
20: Microphone
22: Level detection circuit
24: Voice mixing circuit
26: Speaker
28: Image encoding/decoding circuit
30: Camera
32: Display control circuit
34: Display device
36: Multiplexing/demultiplexing circuit
38: Line control circuit
40: Clock
42, 44, 46, 48: Time slots
50: Display screen
52, 54, 56: Windows
58: Icon
60: Variable N that stores the number of parties
62: Variable Tm that stores the maximum window display time for a party from whom no voice is being received
64: Variable T that stores the current time
66: Variable I that stores a party number
68: Variable V that stores the voice level of party I
70: Variable Ti(1) that stores the time at which voice from party #1 was received and its window displayed
72: Variable Ti(2) that stores the time at which voice from party #2 was received and its window displayed
74: Variable Ti(3) that stores the time at which voice from party #3 was received and its window displayed
76: Variable Ti(N) that stores the time at which voice from party #N was received and its window displayed
110: CPU
112: ROM
114: RAM
116: Operation circuit
118: Voice encoding/decoding circuit
120: Microphone
122: Voice amplification circuit
124L, 124R: Voice mixing circuits
126L, 126R: Left and right speakers
128: Image encoding/decoding circuit
130: Camera
132: Display control circuit
134: Display device
136: Multiplexing/demultiplexing circuit
138: Line control circuit
150: Display screen
152, 154, 156, 158: Windows
D1, D2, D3, D4: Distances from the left edge of the screen 150 to the center of each window
W: Width of the screen 150
DL: Distance from the center of the left speaker 126L to the left edge of the screen 150
DR: Distance from the center of the right speaker 126R to the right edge of the screen 150
160: Variable N that stores the number of parties
162: Variable w that stores the width W of the screen 150
164: Variable d(L) that stores the distance DL from the center of the left speaker 126L to the left edge of the screen 150
166: Variable d(R) that stores the distance DR from the center of the right speaker 126R to the right edge of the screen 150
168: Variable A that stores the voice amplification factor
170: Variable I that stores a party number
172: Variable di that stores the distance from the left edge of the screen 150 to the center of the window of party I
174: Variable R(L) that stores the ratio of the voice output to the voice mixing circuit 124L
176: Variable R(R) that stores the ratio of the voice output to the voice mixing circuit 124R

Claims

[Claims]

1. A multimedia device that performs multimedia communication with a plurality of parties via a communication line and displays a plurality of images received from the parties in multiple windows, the device comprising: voice level detection means for detecting the level of the voice received from each party; and means for switching to icon display, based on the detection result of the voice level detection means, the window that displays the image of a party from whom a predetermined voice has not been received for a predetermined time.

2. The multimedia device according to claim 1, further comprising means for displaying, in a window, the image of a party from whom voice at or above a predetermined level has been received.

3. A multimedia device that performs multimedia communication with a plurality of parties via a communication line and displays received images in multiple windows, the device comprising: means for reading the positional relationship of the windows; and output means for outputting the voice received from each of the plurality of parties at a volume that depends on the positional relationship of the corresponding window.

4. The multimedia device according to claim 3, wherein the output means outputs the voice from a plurality of speakers at volume ratios that depend on the positional relationship of the windows.