JP2012114511A

JP2012114511A - Conference system

Info

Publication number: JP2012114511A
Application number: JP2010259425A
Authority: JP
Inventors: Takero Hama (健朗濱); Kaitaku Ozawa (開拓小澤); Hiroaki Kubo (広明久保); Jun Kunioka (潤國岡); Ayumi Ito (歩伊藤); Yoshikazu Ikenoue (義和池ノ上)
Original assignee: Konica Minolta Business Technologies Inc
Current assignee: Konica Minolta Business Technologies Inc
Priority date: 2010-11-19
Filing date: 2010-11-19
Publication date: 2012-06-14

Abstract

PROBLEM TO BE SOLVED: To transmit nonverbal information when nonverbal information is not being transmitted from a presenter to a participant. SOLUTION: A conference system includes a first conference room in which a presenter participates and at least one second conference room in which participants participate. A PC disposed in the first conference room comprises: a line-of-sight direction detection part which detects the direction of the line of sight of the presenter; a line-of-sight image storage part which stores image data output by a camera while the line of sight of the presenter is detected as facing a display part; a selection part which selects either the image data output by the camera or the image data stored in an HDD; and a video data transmission part which transmits video data containing the selected image data to a PC disposed in the second conference room. When a state in which the line-of-sight direction does not face the display part continues for a predetermined time while the image data output by the camera is selected, the selection part selects the image data stored in the HDD.

Description

The present invention relates to a conference system, and more particularly to a conference system applicable to a conference held using a plurality of geographically separated conference rooms.

Video conference systems for conferences held across a plurality of geographically separated conference rooms are known. In a video conference system, however, the situation at a different location can be known only from video captured by a camera at that location, so the conference lacks a sense of presence compared with a conference held in one place. To bring such a conference closer to one held in a single place, Japanese Patent Laid-Open No. 6-253303 describes a video conference system in which the lines of sight of a large number of interlocutors are made to coincide. In this video conference system, a plurality of imaging devices Xij (i = a, b, c; j = R, C, L) photograph the same interlocutors Xa, Xb, Xc from different positions to obtain a plurality of images taken from different angles. When these images are projected at a remote location by projection devices PXij onto one of display screens Y1, Y2, Y3, a screen whose degree of scattering varies with the incident angle of light is used, and each image is projected with the projection angle of the projection device PXij onto the display screens Y1, Y2, Y3 made to correspond to the direction of shooting and set to an angle at which the degree of scattering is large.

For this reason, in the conventional conference system, when a speaker converses with a participant in one direction, for example, a participant in another direction never meets the speaker's gaze. Likewise, when the speaker speaks while reading material at hand, the gazes of the participants and the speaker do not meet. As a result, a participant whose gaze does not meet the speaker's has a weakened sense of taking part in the conference.
JP-A-6-253303

The present invention has been made to solve the above-described problem, and one object of the present invention is to provide a conference system capable of transmitting non-verbal information when non-verbal information is not being transmitted from a presenter to a participant.

Another object of the present invention is to provide a conference system capable of transmitting non-verbal information when non-verbal information is not being transmitted from a participant in a first conference room to either of the participants in a second or third conference room.

The present invention has been made to solve the above-described problem. According to one aspect of the present invention, a conference system generates a virtual conference room using a plurality of images obtained by imaging a plurality of geographically separated conference rooms. In each of the plurality of conference rooms are arranged display means for displaying an image, imaging means for imaging a participant in a subject direction from the display means side toward the participant present in the conference room, and control means for controlling the display means and the imaging means. The plurality of conference rooms include a first conference room in which a presenter participates and at least one second conference room in which participants participate.

The control means arranged in the first conference room includes: line-of-sight direction detection means for detecting the line-of-sight direction of the presenter based on image data output by the imaging means; line-of-sight image storage means for storing the image data output by the imaging means while the line-of-sight direction detection means detects that the presenter's line of sight faces the display means; selection means for selecting either the image data output by the imaging means or the image data stored in the storage means; and video data transmission means for transmitting video data including the image data selected by the selection means to the control means arranged in each of the at least one second conference room. When, while the image data output by the imaging means is selected, a state in which the line-of-sight direction detected by the line-of-sight direction detection means does not face the display means continues for a predetermined time, the selection means selects the image data stored in the storage means.

The control means arranged in the second conference room includes: video data receiving means for receiving the video data from the control means arranged in the first conference room; and display control means for causing the display means to display the image of the image data included in the video data received by the video data receiving means.
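The line-of-sight image storage means described above can be pictured as a bounded buffer that accepts a camera frame only while the presenter is detected as facing the display. The following is a minimal sketch under stated assumptions, not the patented implementation: the class name `GazeImageStore`, the `max_frames` limit, and the use of raw `bytes` for frames are all hypothetical, and the gaze flag is assumed to come from a separate detector.

```python
from collections import deque
from typing import Deque, Optional


class GazeImageStore:
    """Keeps recent frames captured while the presenter faced the display.

    Hypothetical helper: the source only says that image data is stored
    while the line of sight faces the display means.
    """

    def __init__(self, max_frames: int = 300) -> None:
        # Assumption: a bounded buffer; the oldest frame is dropped when full.
        self._frames: Deque[bytes] = deque(maxlen=max_frames)

    def offer(self, frame: bytes, gaze_on_display: bool) -> None:
        """Store the frame only if the gaze was detected as facing the display."""
        if gaze_on_display:
            self._frames.append(frame)

    def latest(self) -> Optional[bytes]:
        """Return the most recently stored frame, or None if nothing is stored."""
        return self._frames[-1] if self._frames else None

    def __len__(self) -> int:
        return len(self._frames)
```

Frames offered with `gaze_on_display=False` are simply ignored, so whatever the store returns later is an image in which the presenter was looking toward the display.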

According to this aspect, the control means arranged in the first conference room, in which the presenter participates, detects the line-of-sight direction of the presenter based on the image data output by the imaging means, stores the image data output by the imaging means while the presenter's line of sight is detected as facing the display means, selects either the image data output by the imaging means or the stored image data, and transmits video data including the selected image data to the control means arranged in the second conference room. When, while the image data output by the imaging means is selected, a state in which the detected line-of-sight direction does not face the display means continues for a predetermined time, the image data stored in the storage means is selected. While the line of sight of the participant in the first conference room faces the display means, the participant in the second conference room meets the gaze of the participant in the first conference room; while it does not face the display means, their gazes do not meet. When the state in which their gazes do not meet continues for the predetermined time, the image data stored while the line of sight of the participant in the first conference room was facing the display means is selected, so the participant in the second conference room sees an image in which his or her gaze meets that of the participant in the first conference room. It is therefore possible to provide a conference system capable of transmitting non-verbal information when non-verbal information is not being transmitted from the presenter to the participants.

Preferably, the selection means selects the image data output by the imaging means when the line-of-sight direction detected by the line-of-sight direction detection means faces the display means.
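The behaviour of the selection means can be sketched as a small timer-driven state machine: the live camera image is sent while the presenter faces the display, and the stored image is sent once the gaze has been away for the predetermined time. This is an illustrative sketch only. The names `ImageSelector`, `timeout_s`, and `has_stored` are hypothetical, and falling back to the live image when nothing has been stored yet is an assumption the source does not state.

```python
from typing import Optional


class ImageSelector:
    """Chooses between the live camera image and a stored gaze image."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s                # the "predetermined time"
        self._away_since: Optional[float] = None  # when the gaze left the display

    def select(self, gaze_on_display: bool, now_s: float, has_stored: bool = True) -> str:
        """Return "live" or "stored" for the frame observed at time now_s."""
        if gaze_on_display:
            # Gaze faces the display: send the camera image and reset the timer.
            self._away_since = None
            return "live"
        if self._away_since is None:
            self._away_since = now_s
        away_for = now_s - self._away_since
        # Assumption: without any stored image we keep sending the live one.
        if away_for >= self.timeout_s and has_stored:
            return "stored"
        return "live"
```

For example, with `timeout_s=3.0`, a presenter who looks down at time 1.0 is still shown live at 3.9, and the stored image is chosen from 4.0 until the gaze returns to the display.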

According to another aspect of the present invention, a conference system generates a virtual conference room using a plurality of images obtained by imaging three geographically separated conference rooms. In each of the three conference rooms are arranged: display means including two display areas; first imaging means for imaging a participant in a first direction from the display means side toward the participant participating in the conference room; second imaging means for imaging the participant in a second direction that intersects the first direction; third imaging means for imaging the participant in a third direction opposite to the second direction; and control means for controlling the display means and the first to third imaging means.

The control means includes: line-of-sight direction detection means for detecting, based on first image data output by the first imaging means, whether the participant's line-of-sight direction is a first line-of-sight direction toward the first display area or a second line-of-sight direction toward the second display area; utterance partner determination means for determining, based on the detection result of the line-of-sight direction detection means, the partner to whom the participant is speaking; first selection means for selecting one piece of first selected image data from among the first to third image data output by the first to third imaging means, respectively; second selection means for selecting one piece of second selected image data from among the first to third image data output by the first to third imaging means, respectively; first conference information transmission means for transmitting video data including the first selected image data, together with first utterance partner information indicating the utterance partner determined by the utterance partner determination means, to the control means arranged in one of the other two conference rooms; second conference information transmission means for transmitting video data including the second selected image data, together with the first utterance partner information, to the control means arranged in the other of the other two conference rooms; first conference information receiving means for receiving, from the control means arranged in the one conference room, video data and second utterance partner information indicating the partner to whom the participant in the one conference room is speaking; second conference information receiving means for receiving, from the control means arranged in the other conference room, video data and third utterance partner information indicating the partner to whom the participant in the other conference room is speaking; display control means for controlling the display means so that the image data included in the video data received from the control means arranged in the one conference room is displayed in the first display area and the image data included in the video data received from the control means arranged in the other conference room is displayed in the second display area; and conversation partner judgment means for judging, based on the first to third utterance partner information, with which of the participants in the other two conference rooms the participant is conversing.

The utterance partner determination means determines that the participant is speaking to the participant in the one conference room when the line-of-sight direction detected by the line-of-sight direction detection means is toward the first display area, and determines that the participant is speaking to the participant in the other conference room when the detected line-of-sight direction is toward the second display area.

The first selection means selects the first image data when the conversation partner judgment means judges that the participant is conversing with the participant in the one conference room; selects either the second image data or the third image data when the conversation partner judgment means judges that the participant is conversing with the participant in the other conference room; and, when the state judged to be a conversation with the participant in the other conference room continues for a first predetermined time, selects the first image data for a second predetermined time. The second selection means selects the first image data when the conversation partner judgment means judges that the participant is conversing with the participant in the other conference room; selects either the second image data or the third image data when the conversation partner judgment means judges that the participant is conversing with the participant in the one conference room; and, when the state judged to be a conversation with the participant in the one conference room continues for the first predetermined time, selects the first image data for the second predetermined time.

According to this aspect, when it is judged that the participant is conversing with the participant in the one conference room, the first image data, obtained by imaging the participant in the first direction toward the participant, is selected; when it is judged that the participant is conversing with the participant in the other conference room, either the second image data or the third image data, obtained by imaging the participant in the second direction intersecting the first direction or in the third direction opposite to the second direction, is selected; when the state judged to be a conversation with the participant in the other conference room continues for the first predetermined time, the first image data is selected for the second predetermined time; and the selected image data is transmitted to the control means arranged in the one conference room. Likewise, when it is judged that the participant is conversing with the participant in the other conference room, the first image data is selected; when it is judged that the participant is conversing with the participant in the one conference room, either the second image data or the third image data is selected; when the state judged to be a conversation with the participant in the one conference room continues for the first predetermined time, the first image data is selected for the second predetermined time; and the selected image data is transmitted to the control means arranged in the other conference room.

Consequently, a participant in the second or third conference room who views the first image data, obtained by imaging the participant in the first direction, meets the gaze of the participant in the first conference room, whereas a participant in the second or third conference room who views the second or third image data does not. While the participant in the first conference room is judged to be conversing with the participant in the other conference room, the participant in the one conference room does not meet the gaze of the participant in the first conference room; once that state has continued for the first predetermined time, however, the first image data is transmitted to the one conference room for the second predetermined time, so the participant in the one conference room meets the gaze of the participant in the first conference room for the second predetermined time. Similarly, while the participant in the first conference room is judged to be conversing with the participant in the one conference room, the participant in the other conference room does not meet the gaze of the participant in the first conference room; once that state has continued for the first predetermined time, the first image data is transmitted to the other conference room for the second predetermined time, so the participant in the other conference room meets the gaze of the participant in the first conference room for the second predetermined time. It is therefore possible to provide a conference system capable of transmitting non-verbal information when non-verbal information is not being transmitted from the participant in the first conference room to either of the participants in the second or third conference room.
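The timing rule applied by the first selection means can be sketched as a pure function of the conversation judgment and of how long that judgment has held. This is a hedged illustration: the function name `select_view`, the labels `"front"` and `"side"`, and in particular the assumption that the pattern repeats (side view for the first predetermined time, front view for the second, then side view again) are not stated in the source, which only says that the first image data is selected for the second predetermined time once the state has continued for the first predetermined time.

```python
def select_view(conversing_with: str, elapsed_s: float, t1_s: float, t2_s: float) -> str:
    """Pick the image sent to the "one" conference room.

    conversing_with: "one" or "other", the conversation partner judgment.
    elapsed_s: how long the current judgment has held continuously.
    t1_s, t2_s: the first and second predetermined times.
    Returns "front" (first image data) or "side" (second or third image data).
    """
    if conversing_with == "one":
        # Talking to the room that receives this image: send the frontal view.
        return "front"
    if conversing_with != "other":
        raise ValueError("conversing_with must be 'one' or 'other'")
    # Assumption: after the frontal view has been shown for t2_s the cycle restarts.
    phase = elapsed_s % (t1_s + t2_s)
    return "side" if phase < t1_s else "front"
```

The second selection means would use the same rule with the roles of the one and the other conference room exchanged.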

Preferably, the first imaging means is arranged with an angle of view that captures the participant from the front, the second imaging means is arranged with an angle of view that captures the participant from the participant's right side, the third imaging means is arranged with an angle of view that captures the participant from the participant's left side, and the first display area and the second display area are arranged side by side in the left-right direction. The control means includes: first display area information transmission means for transmitting, to the control means arranged in the one conference room, display area information indicating that the display control means is displaying in the first display area the image data included in the video data received from the control means arranged in the one conference room; and second display area information transmission means for transmitting, to the control means arranged in the other conference room, display area information indicating that the display control means is displaying in the second display area the image data included in the video data received from the control means arranged in the other conference room.

When the conversation partner judgment means judges that the participant is conversing with the participant in the other conference room, the first selection means selects the second image data if the display area information received from the control means arranged in the one conference room indicates the first display area, and selects the third image data if the display area information received from the control means arranged in the one conference room indicates the second display area. When the conversation partner judgment means judges that the participant is conversing with the participant in the one conference room, the second selection means selects the second image data if the display area information received from the control means arranged in the other conference room indicates the first display area, and selects the third image data if the display area information received from the control means arranged in the one conference room indicates the second display area.

According to this aspect, when it is judged that the participant is conversing with the participant in the other conference room, the second image data is selected if the display area information received from the control means arranged in the one conference room indicates the first display area, the third image data is selected if that display area information indicates the second display area, and the selected first selected image data is transmitted to the control means arranged in the one conference room. Likewise, when it is judged that the participant is conversing with the participant in the one conference room, the second image data is selected if the display area information received from the control means arranged in the other conference room indicates the first display area, the third image data is selected if the display area information received from the control means arranged in the one conference room indicates the second display area, and the selected second selected image data is transmitted to the control means arranged in the other conference room. Consequently, while the participants in the other two conference rooms are conversing, the participant in each of the three conference rooms can see an image in which the two participants in the other two conference rooms face each other and converse.
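The choice between the two side views reduces to a lookup on the display area information reported by the receiving conference room. The sketch below uses hypothetical names (`DisplayArea`, `View`, `side_view_for`); only the mapping itself (first display area to second image data, second display area to third image data) is taken from the text.

```python
from enum import Enum


class DisplayArea(Enum):
    FIRST = 1
    SECOND = 2


class View(Enum):
    FRONT = "first image data"   # captured from the front
    RIGHT = "second image data"  # captured from the participant's right side
    LEFT = "third image data"    # captured from the participant's left side


def side_view_for(reported_area: DisplayArea) -> View:
    """Map the display area reported by the receiving room to a side view."""
    if reported_area is DisplayArea.FIRST:
        return View.RIGHT
    return View.LEFT
```

Because the two display areas sit side by side, picking the side view from the area in which the receiving room shows this participant is what lets the two remote participants appear to face each other on the receiving room's display.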

Preferably, a microphone for collecting sound is further arranged in each of the three conference rooms, and the control means further controls the microphone and further includes speech detection means for judging, based on sound data output by the microphone, whether the participant is speaking. While either the second image data or the third image data is selected, when the state judged to be a conversation with the participant in the other conference room continues for the first predetermined time, the first selection means selects the first image data for the second predetermined time on condition that the speech detection means judges that the participant is speaking. While either the second image data or the third image data is selected, when the state judged to be a conversation with the participant in the one conference room continues for the first predetermined time, the second selection means selects the first image data for the second predetermined time on condition that the speech detection means judges that the participant is speaking.

According to this aspect, while either the second image data or the third image data is selected as the first selected image data to be transmitted to the control means arranged in the second conference room, when the state judged to be a conversation with the participant in the other conference room continues for the first predetermined time, the first image data is selected as the first selected image data for the second predetermined time, on condition that the participant in the first conference room is judged to be speaking, and is transmitted to the control means arranged in the second conference room. Likewise, while either the second image data or the third image data is selected as the second selected image data to be transmitted to the control means arranged in the third conference room, when the state judged to be a conversation with the participant in the one conference room continues for the first predetermined time, the first image data is selected as the second selected image data for the second predetermined time, on condition that the participant in the first conference room is judged to be speaking, and is transmitted to the control means arranged in the third conference room.

Consequently, while the participants in the first and third conference rooms are conversing, the participant in the second conference room sees an image in which the two participants face each other and converse; when that state continues for the first predetermined time and either of the participants in the first or third conference room is speaking, the participant in the second conference room sees an image in which his or her gaze meets that of the speaking participant. Similarly, while the participants in the first and second conference rooms are conversing, the participant in the third conference room sees an image in which the two participants face each other and converse; when that state continues for the first predetermined time and a participant in the first or second conference room is speaking, the participant in the third conference room sees an image in which his or her gaze meets that of the speaking participant. Further, while the participants in the second and third conference rooms are conversing, the participant in the first conference room sees an image in which the two participants face each other and converse; when that state continues for the first predetermined time and a participant in the second or third conference room is speaking, the participant in the first conference room sees an image in which his or her gaze meets that of the speaking participant. In short, the participant in each of the three conference rooms sees, while the participants in the other two conference rooms are conversing, an image in which those two participants face each other and converse, and, when that conversation continues for the predetermined time, can see an image in which his or her gaze meets that of whichever of the two is speaking.
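The speech-conditioned switch described above can be illustrated with two small functions. The source does not say how speech is detected from the microphone's sound data, so the RMS energy threshold used here is purely an assumption, as are the names `is_speaking`, `may_switch_to_front`, and the default `rms_threshold`.

```python
import math
from typing import Sequence


def is_speaking(samples: Sequence[int], rms_threshold: float = 500.0) -> bool:
    """Assumed detector: treat a block of PCM samples as speech if its RMS is high."""
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms >= rms_threshold


def may_switch_to_front(elapsed_s: float, t1_s: float, speaking: bool) -> bool:
    """True when the frontal (first) image may replace the side view.

    elapsed_s: how long the conversation with the other room has continued.
    t1_s: the first predetermined time.
    speaking: result of the speech detection for the local participant.
    """
    return elapsed_s >= t1_s and speaking
```

With this gating, a participant who has been listening silently for the first predetermined time is not switched to the frontal view; the switch happens only when that participant is actually speaking.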

Preferably, a microphone for collecting sound is further arranged in each of the three conference rooms, and the control means further controls the audio output device and the microphone and further includes speech detection means for judging, based on sound data output by the microphone, whether the participant is speaking. The utterance partner determination means determines that the participant is speaking to the participant in the one conference room when the line-of-sight direction detected by the line-of-sight direction detection means is toward the first display area and the participant is judged to be speaking, and determines that the participant is speaking to the participant in the other conference room when the detected line-of-sight direction is toward the second display area and the participant is judged to be speaking.
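The utterance partner determination that combines gaze and speech can be sketched as a single function. The name `utterance_partner` and the string labels are hypothetical; returning `None` when the participant is silent or looking at neither display area is an assumption, since the text only specifies the two positive cases.

```python
from typing import Optional


def utterance_partner(gaze_area: Optional[str], speaking: bool) -> Optional[str]:
    """Decide which remote room the participant is speaking to.

    gaze_area: "first" or "second" display area, or None if the gaze is elsewhere.
    speaking: result of the speech detection means.
    Returns "one", "other", or None (assumed result when no partner is determined).
    """
    if not speaking:
        return None
    if gaze_area == "first":
        return "one"    # the room shown in the first display area
    if gaze_area == "second":
        return "other"  # the room shown in the second display area
    return None
```

Requiring speech in addition to gaze keeps a participant who merely glances at one display area from being treated as addressing that room.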

Preferably, each of the three conference rooms is further provided with an audio output device that outputs sound, and the control means further controls the audio output device and further includes audio control means for outputting sound to the audio output device. The first conference information transmitting means transmits, to the control means arranged in one conference room, video data including the first selected image data and the sound data output by the microphone, together with first speech partner information indicating the speech partner determined by the speech partner determination means. The second conference information transmitting means transmits, to the control means arranged in the other conference room, video data including the second selected image data and the sound data output by the microphone, together with the first speech partner information. The audio control means outputs, to the audio output device, the sound data included in the video data received from the control means arranged in the one conference room and the sound data included in the video data received from the control means arranged in the other conference room.

A diagram showing an overview of the entire conference system in the first embodiment of the present invention.
A block diagram showing an example of the hardware configuration of the PC in the first embodiment.
A block diagram showing an example of the functions of the CPU of the PC in the first embodiment.
A flowchart showing an example of the flow of video data transmission processing in the first embodiment.
A block diagram showing an example of the functions of the CPU of the PC in the modification of the first embodiment.
A diagram showing an overview of the entire conference system in the second embodiment.
A block diagram showing an example of the hardware configuration of the PC in the second embodiment.
A diagram for explaining the positional relationship between the participants and the cameras in the first to third conference rooms.
A diagram showing an example of the data transmitted and received between the PC arranged in the first conference room and the PCs arranged in the second and third conference rooms, centered on the PC arranged in the first conference room.
A diagram showing an example of the display areas of the display unit.
A block diagram showing an example of the functions of the CPU of the PC in the second embodiment.
A flowchart showing an example of the flow of video data transmission processing in the second embodiment.
A flowchart showing an example of the flow of speech partner detection processing.
A flowchart showing an example of the flow of conversation partner detection processing.
A flowchart showing an example of the flow of selected image setting processing.
A flowchart showing an example of the flow of second selected image data setting processing.
A flowchart showing an example of the flow of first selected image data setting processing.
A diagram showing an example of the display state in the first to third conference rooms while the first participant and the second participant are conversing.
A diagram showing an example of the display state in the first to third conference rooms while the first participant and the third participant are conversing.
A diagram showing an example of the display state in the first to third conference rooms while the second participant and the third participant are conversing.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. In the following description, identical parts are denoted by identical reference numerals. Their names and functions are also the same. Detailed description of them will therefore not be repeated.

<First Embodiment>
FIG. 1 is a diagram showing an overview of the entire conference system in the first embodiment of the present invention. Referring to FIG. 1, the conference system 1 in the first embodiment includes seven personal computers (hereinafter "PCs") 100 and 100A to 100F connected to a network 3. The PCs 100 and 100A to 100F are installed in first to seventh conference rooms, respectively, which are geographically separated from one another.

The conference system 1 in the first embodiment is effective when any one of seven participants, one in each of the first to seventh conference rooms, speaks and the other six view an image of the speaker and listen to the speech. Here, the case where the participant in the first conference room speaks and the participants in the second to seventh conference rooms view and listen is described as an example. The PCs 100 and 100A to 100F arranged in the first to seventh conference rooms have the same configuration. In the following description, the components of the PCs 100A to 100F arranged in the second to seventh conference rooms are denoted by the reference numerals of the corresponding components of the PC 100 arranged in the first conference room with the suffixes A to F appended. Because the configurations are the same, the PC 100 arranged in the first conference room is described as an example unless otherwise noted.

The PC 100 arranged in the first conference room includes a microphone 31 that collects sound, a speaker 33 that outputs sound, and a camera 35 that captures images of the area in front of the PC 100. The network 3 is a local area network (LAN). The PCs 100 and 100A to 100F can therefore transmit and receive data to and from one another. The network 3 is not limited to a LAN and may be the Internet, a wide area network (WAN), a public switched telephone network, or the like. The network 3 may be wired or wireless.

In the conference system in the first embodiment, an image of the participant captured by the camera 35 of the PC 100 arranged in the first conference room is displayed on the display unit of each of the PCs 100A to 100F arranged in the second to seventh conference rooms, and the sound collected by the microphone 31 of the PC 100 is output from the speakers 33A to 33F of the PCs 100A to 100F, respectively.

The display unit of the PC 100 arranged in the first conference room displays a screen in which the six images captured by the cameras 35A to 35F of the PCs 100A to 100F arranged in the second to seventh conference rooms are arranged side by side, and the speaker 33 of the PC 100 outputs sound obtained by combining the sounds collected by the microphones 31A to 31F of the PCs 100A to 100F.

FIG. 2 is a block diagram showing an example of the hardware configuration of the PC in the first embodiment. Referring to FIG. 2, the PC 100 includes the following, each connected to a bus 41: a CPU 11; a ROM (Read Only Memory) 13 for storing programs executed by the CPU 11 and the like; a RAM 15 used as a work area of the CPU 11; an HDD 17 serving as a mass storage device; a card interface (I/F) 19 into which a memory card 18 is inserted; a communication I/F 21 for connecting the PC 100 to the network 3; a user interface 23 serving as an interface with the participant; and an external I/F 29.

The CPU 11 controls the entire PC 100 and executes the programs stored in the ROM 13. The CPU 11 may also load a program stored in the memory card 18 into the RAM 15 via the card I/F 19 and execute it.

Furthermore, the CPU 11 may download a program from a computer connected to the network 3 and store it in the HDD 17, or a computer connected to the network 3 may write a program to the HDD 17, and the CPU 11 may then execute that program. The term "program" here covers not only programs directly executable by the CPU 11 but also source programs, compressed programs, encrypted programs, and the like.

The user interface 23 includes an operation unit 25, which includes a keyboard and a pointing device such as a mouse, and a display unit 27, such as a liquid crystal display device, that displays data.

A microphone 31, a speaker 33, and a camera 35 are connected to the external I/F 29. The microphone 31 collects the voice uttered by the participant at the PC 100 and outputs sound data of the collected voice to the external I/F 29. The external I/F 29 outputs the sound data input from the microphone 31 to the CPU 11. The speaker 33 is controlled by the CPU 11 and outputs sound based on the sound data input from the CPU 11. The camera 35 images the participant operating the PC 100 and outputs image data of the captured image to the external I/F 29. The external I/F 29 outputs the image data input from the camera 35 to the CPU 11. The camera 35 is arranged above the display unit 27. Because the camera 35 thereby captures the participant looking at the display unit 27, the participant can be imaged from the front.

The recording medium that stores the program to be executed by the CPU 11 is not limited to the memory card 18 and may be a flexible disk, a cassette tape, an optical disc (CD-ROM (Compact Disc-Read Only Memory), MO (Magnetic Optical Disc), MD (Mini Disc), or DVD (Digital Versatile Disc)), an IC card, an optical card, or a semiconductor memory such as a mask ROM, an EPROM (Erasable Programmable ROM), or an EEPROM (Electrically Erasable and Programmable ROM).

FIG. 3 is a block diagram showing an example of the functions of the CPU of the PC in the first embodiment. Referring to FIG. 3, the PC 100 includes: an image data acquisition unit 51 that acquires the image data output by the camera 35; a sound data acquisition unit 53 that acquires the sound data output by the microphone 31; a line-of-sight direction detection unit 55 that detects the line-of-sight direction of the subject based on the image data; a line-of-sight image storage unit 57 that stores predetermined image data in the HDD 17 as line-of-sight image data; a selection unit 59 that selects either the image data acquired from the camera 35 or the line-of-sight image data stored in the HDD 17; a video data transmission unit 61 that transmits video data including sound data and image data; a video data reception unit 63 that controls the communication I/F 21 to receive video data; a display control unit 65 that controls the display unit 27; and an audio control unit 67 that controls the speaker 33.

The image data acquisition unit 51 controls the camera 35 connected to the external I/F 29 and acquires the image data output by the camera 35. The image data may be a moving image or a still image; here it is a moving image. The image data acquisition unit 51 outputs the image data to the line-of-sight direction detection unit 55, the selection unit 59, and the line-of-sight image storage unit 57.

The sound data acquisition unit 53 controls the microphone 31 connected to the external I/F 29 and acquires the sound data output by the microphone 31. The sound data acquisition unit 53 outputs the sound data to the video data transmission unit 61.

The line-of-sight direction detection unit 55 detects the line-of-sight direction of the subject based on the image data input from the image data acquisition unit 51. Specifically, it extracts the region of the subject from the image data and then extracts the eye region and the pupil region from the subject region. It detects the line-of-sight direction from the position of the pupil region relative to the eye region. The eye region is divided horizontally into three areas (left, middle, and right), and the unit determines in which of the three divided areas the pupil region lies. If the pupil region lies in the left divided area of the eye region, a rightward line of sight is detected; if it lies in the middle divided area, a frontal line of sight is detected; and if it lies in the right divided area, a leftward line of sight is detected. The line-of-sight direction detection unit 55 outputs the detected line-of-sight direction to the selection unit 59 and the line-of-sight image storage unit 57. When the eye region cannot be extracted from the subject region, the line-of-sight direction cannot be detected, and in that case the unit outputs a signal indicating that the line-of-sight direction cannot be detected.
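The three-way classification described above can be sketched as follows. This is a minimal sketch, assuming that the eye and pupil regions have already been extracted as bounding boxes in image coordinates and that the pupil's position is represented by the centre of its box; the box representation, the function name, and the treatment of a missing pupil box as "undetected" are assumptions, not details from the specification.

```python
from typing import Optional, Tuple

# (left, top, right, bottom) in image pixels; a hypothetical representation.
Box = Tuple[float, float, float, float]


def horizontal_gaze(eye: Optional[Box], pupil: Optional[Box]) -> str:
    """Classify the gaze as 'right', 'front', 'left', or 'undetected'."""
    if eye is None or pupil is None:
        return "undetected"  # eye region could not be extracted
    eye_left, _, eye_right, _ = eye
    width = eye_right - eye_left
    if width <= 0:
        return "undetected"
    pupil_cx = (pupil[0] + pupil[2]) / 2.0
    rel = (pupil_cx - eye_left) / width  # 0.0 = left edge, 1.0 = right edge
    if rel < 1.0 / 3.0:
        return "right"  # pupil in the left third of the eye region
    if rel < 2.0 / 3.0:
        return "front"  # pupil in the middle third
    return "left"       # pupil in the right third
```

The left/right inversion follows the text: because the camera faces the subject, a pupil in the left part of the eye region in the image corresponds to a rightward line of sight.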

Alternatively, the eye region may be divided into nine areas, three vertically (upper, middle, lower) by three horizontally (left, middle, right), and the unit may determine in which of the nine divided areas the pupil region lies, so that the vertical line-of-sight direction is detected in addition to the horizontal one. In this case, if the pupil region lies in the divided area that is vertically in the middle and horizontally on the left, a rightward line of sight is detected; if it lies in the divided area that is in the middle both vertically and horizontally, a frontal line of sight is detected; and if it lies in the divided area that is vertically in the middle and horizontally on the right, a leftward line of sight is detected. If the pupil region lies in a divided area in the upper or lower row of the eye region, the line-of-sight direction detection unit 55 detects an upward or downward line of sight, respectively.
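A sketch of the nine-area variant follows, again under assumptions: regions are bounding boxes, the pupil is represented by the centre of its box, and image y coordinates grow downward, so the upper row has the smaller y values. The names are hypothetical.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed


def gaze_3x3(eye: Optional[Box], pupil: Optional[Box]) -> str:
    """Classify the gaze as 'up', 'down', 'right', 'front', 'left', or 'undetected'."""
    if eye is None or pupil is None:
        return "undetected"
    left, top, right, bottom = eye
    width, height = right - left, bottom - top
    if width <= 0 or height <= 0:
        return "undetected"
    cx = (pupil[0] + pupil[2]) / 2.0
    cy = (pupil[1] + pupil[3]) / 2.0
    # Column and row index (0, 1, 2) of the divided area, clamped to the grid.
    col = max(0, min(2, int((cx - left) / width * 3)))
    row = max(0, min(2, int((cy - top) / height * 3)))
    if row == 0:
        return "up"
    if row == 2:
        return "down"
    # Middle row: same left/right inversion as in the three-area case.
    return ("right", "front", "left")[col]
```

The text does not say whether a horizontal component is also reported for the upper and lower rows, so this sketch returns only the vertical label there.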

The line-of-sight image storage unit 57 receives image data from the image data acquisition unit 51 and the line-of-sight direction from the line-of-sight direction detection unit 55. While the line-of-sight direction is the front direction, the line-of-sight image storage unit 57 stores the image data input from the image data acquisition unit 51 in the HDD 17 as line-of-sight image data. As a result, line-of-sight image data 101 including an image of the subject looking to the front is stored in the HDD 17.

The selection unit 59 receives image data from the image data acquisition unit 51 and the line-of-sight direction from the line-of-sight direction detection unit 55. Based on the line-of-sight direction input from the line-of-sight direction detection unit 55, the selection unit 59 selects either the image data input from the image data acquisition unit 51 or the line-of-sight image data 101 stored in the HDD 17. Specifically, when the state in which the frontal line-of-sight direction is not input from the line-of-sight direction detection unit 55 has continued for a predetermined time T1, the selection unit 59 selects the line-of-sight image data stored in the HDD 17 for a predetermined time T2; at all other times it selects the image data input from the image data acquisition unit 51.

While the frontal line-of-sight direction is input from the line-of-sight direction detection unit 55, the selection unit 59 selects the image data input from the image data acquisition unit 51. Even while the frontal line-of-sight direction is not input, the selection unit 59 selects the image data input from the image data acquisition unit 51 until the predetermined time T1 has elapsed since the frontal line-of-sight direction ceased to be input. Further, while the frontal line-of-sight direction is not input, after selecting the line-of-sight image data stored in the HDD 17, the selection unit 59 selects the image data input from the image data acquisition unit 51 once the predetermined time T2 has elapsed. Furthermore, while the frontal line-of-sight direction is not input, once the line-of-sight image data has been selected even once, the selection unit 59, after switching back to the image data input from the image data acquisition unit 51, continues to select that image data until the predetermined time T1 elapses.

The selection unit 59 outputs whichever it has selected, the image data input from the image data acquisition unit 51 or the line-of-sight image data stored in the HDD 17, to the video data transmission unit 61.

The video data transmission unit 61 receives sound data from the sound data acquisition unit 53 and image data from the selection unit 59, and transmits video data including the sound data and the image data to the PCs 100A to 100F via the communication I/F 21. Each of the PCs 100A to 100F installed in the second to seventh conference rooms likewise transmits video data to every PC other than itself among the PCs 100 and 100A to 100F.
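The fan-out described above, in which each PC sends its video data to every PC other than itself, can be sketched as follows. The `VideoData` container and the string identifiers are hypothetical; the specification does not define a data format or an addressing scheme.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Reference numerals of the seven PCs, used here as hypothetical identifiers.
PC_IDS = ("100", "100A", "100B", "100C", "100D", "100E", "100F")


@dataclass(frozen=True)
class VideoData:
    """Assumed container: one selected image plus the microphone sound data."""
    sender: str
    image: bytes
    sound: bytes


def destinations(own_id: str, all_ids: Sequence[str] = PC_IDS) -> List[str]:
    """Return every PC other than the sending device itself."""
    return [pc for pc in all_ids if pc != own_id]
```

For example, `destinations("100")` lists the six PCs 100A to 100F, and `destinations("100C")` lists the PC 100 together with the five remaining lettered PCs.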

The video data reception unit 63 controls the communication I/F 21 to receive the video data transmitted by each of the PCs 100A to 100F. The video data reception unit 63 outputs the image data included in the video data to the display control unit 65 and outputs the sound data included in the video data to the audio control unit 67. The display control unit 65 displays the image of the image data input from the video data reception unit 63 on the display unit 27. In the present embodiment, the video data reception unit 63 receives video data from each of the PCs 100A to 100F installed in the second to seventh conference rooms, so a screen in which six images based on the six sets of image data are arranged side by side is displayed on the display unit 27. Each of the PCs 100A to 100F installed in the second to seventh conference rooms receives video data from all of the PCs other than itself among the PCs 100 and 100A to 100F, but displays on its display unit 27A to 27F only the image of the image data included in the video data received from the PC 100.

The audio control unit 67 outputs the sound data input from the video data reception unit 63 to the speaker 33 and causes the speaker 33 to output sound. In the present embodiment, the video data reception unit 63 receives video data from each of the PCs 100A to 100F installed in the second to seventh conference rooms, so the audio control unit 67 causes the speaker 33 to output sound obtained by combining the six sets of sound data. Each of the PCs 100A to 100F installed in the second to seventh conference rooms receives video data from all of the PCs other than itself among the PCs 100 and 100A to 100F, and causes its speaker 33A to 33F to output sound obtained by combining the six sets of sound data included in the six sets of received video data.
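The specification does not say how the six sets of sound data are combined. One common approach, shown here purely as an assumption, is to sum time-aligned signed 16-bit PCM samples and clip the result to the valid range; the function name and the sample format are hypothetical.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(streams: Sequence[Sequence[int]]) -> List[int]:
    """Mix sample-aligned 16-bit PCM streams by summation with clipping.

    Streams shorter than the longest one are treated as silent
    after their last sample.
    """
    if not streams:
        return []
    length = max(len(s) for s in streams)
    mixed = []
    for i in range(length):
        total = sum(s[i] for s in streams if i < len(s))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

Plain summation keeps each speaker at its original level but can clip when several participants speak at once; averaging or automatic gain control would be alternative design choices.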

FIG. 4 is a flowchart showing an example of the flow of video data transmission processing in the first embodiment. The video data transmission processing is executed in each of the PCs 100 and 100A to 100F; because only the image data and sound data to be processed differ, the PC 100 is described here as an example. The video data transmission processing is executed by the CPU 11 of the PC 100 when the CPU 11 runs a video data transmission program stored in the ROM 13, the HDD 17, or the memory card 18.

Referring to FIG. 4, the CPU 11 detects the line-of-sight direction of the subject based on the image data input from the camera 35 (step S01). It then determines whether the detected line-of-sight direction is a predetermined direction (step S02). Here, the predetermined direction is the front direction. If the line-of-sight direction detected in step S01 is the front direction, the process proceeds to step S03; otherwise, the process proceeds to step S06. In step S03, the timer is reset. The timer measures the time elapsed since the line-of-sight direction of the subject changed from the front direction to a direction other than the front direction. Resetting the timer sets the timer value to 0, after which the timer starts counting.

In the next step S04, the image data input from the camera 35 is stored in the HDD 17 as line-of-sight image data. As a result, line-of-sight image data 101 including an image of the subject whose line of sight is directed to the front is stored in the HDD 17. In the next step S05, the image data input from the camera 35 is selected as the image data to be transmitted, and the process proceeds to step S11.

In step S06, it is determined whether the timer value is equal to or greater than a predetermined threshold value T1. If the timer value is equal to or greater than the threshold value T1, the process proceeds to step S07; otherwise, the process proceeds to step S10. Because the timer is reset in step S03, the timer value is the time elapsed since step S03 was last executed. In other words, it is the time elapsed since it was first determined in step S02 that the line-of-sight direction of the subject is not the front direction. Accordingly, step S07 is executed once the time since the subject stopped looking to the front reaches the threshold value T1 or more.

In step S07, the line-of-sight image data 101 stored in the HDD 17 in step S04 is selected as the image data to be transmitted, and the process proceeds to step S08. In step S08, it is determined whether the timer value is equal to or greater than a predetermined threshold value T2. The threshold value T2 is larger than the threshold value T1. If the timer value is equal to or greater than the threshold value T2, the process proceeds to step S09; otherwise, step S09 is skipped and the process proceeds to step S11. In step S09, the timer is reset. In step S10, on the other hand, the image data input from the camera 35 is selected as the image data to be transmitted, and the process proceeds to step S11.

When the timer is reset in step S09, the process proceeds to step S10 the next time step S06 is executed. Thus, the line-of-sight image data is selected while the time elapsed since the line-of-sight direction of the subject left the front direction is at least the threshold value T1 and less than the threshold value T2. In other words, the line-of-sight image data is selected for a time TM equal to the difference between the threshold value T2 and the threshold value T1. Furthermore, because the timer is reset when the timer value reaches the threshold value T2, if the state in which the line-of-sight direction of the subject is not the front direction continues for the threshold value T2 or longer, the selection of the image data input from the camera 35 for the duration T1 (step S10) followed by the selection of the line-of-sight image data for the subsequent time TM (step S07) is repeated.
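The selection loop of steps S02 to S10 can be sketched as a small state machine. This is an illustrative sketch, not the actual program: the class and parameter names are hypothetical, the current time is passed in explicitly to make the behaviour easy to follow, and falling back to the live frame when no frontal frame has been stored yet is an assumption (the text does not cover that case).

```python
from typing import Any, Optional


class FrameSelector:
    """Chooses between the live camera frame and the stored frontal frame."""

    def __init__(self, t1: float, t2: float) -> None:
        if not 0 < t1 < t2:
            raise ValueError("T2 must be larger than T1, and T1 positive")
        self.t1, self.t2 = t1, t2
        self.timer_start: Optional[float] = None
        self.stored: Optional[Any] = None  # line-of-sight image data

    def select(self, frame: Any, gaze_is_front: bool, now: float) -> Any:
        if self.timer_start is None:
            self.timer_start = now
        if gaze_is_front:                       # S02: facing front
            self.timer_start = now              # S03: reset timer
            self.stored = frame                 # S04: store frontal frame
            return frame                        # S05: send live frame
        elapsed = now - self.timer_start
        if elapsed >= self.t1 and self.stored is not None:  # S06
            if elapsed >= self.t2:              # S08
                self.timer_start = now          # S09: reset timer
            return self.stored                  # S07: send stored frame
        return frame                            # S10: send live frame
```

With T1 = 2 and T2 = 5, a participant who looks away at time 0 is shown live until time 2, is replaced by the stored frontal frame from time 2 through time 5, and is then shown live again for another T1 before the cycle repeats.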

In step S11, video data including the image data selected for transmission in step S05, step S07, or step S10 and the sound data input from the microphone 31 is transmitted to the other PCs, that is, to all of the PCs 100 and 100A to 100F other than the own device, via the communication I/F 21. In the next step S12, it is determined whether an end instruction has been accepted. An end instruction is accepted when the participant presses the key of the operation unit 25 to which the end instruction is assigned. If an end instruction has been accepted, the process ends; otherwise, the process returns to step S01.

The PC 100 in the present embodiment is placed in the first conference room and analyzes the image data output by the camera 35. It stores in the HDD 17, as line-of-sight image data, the image data captured while the line of sight of the subject, in other words the participant in the first conference room, faces the front direction. When the time during which the participant is not facing the front direction reaches the predetermined time T1, the PC 100 reads from the HDD 17 the image data captured earlier while the participant was facing the front, and transmits it to the other PCs 100A to 100F in place of the camera image data. Therefore, even when the participant in the first conference room speaks while, for example, looking at a manuscript, the non-verbal information of that participant can be displayed on the PCs 100A to 100F placed in the second to seventh conference rooms.

<Modification>
In the conference system 1 of the first embodiment described above, when each of the PCs 100, 100A to 100F transmits video data to all other PCs, the transmitting PC selects either the line-of-sight image data or the actually captured image data output by the camera 35. In the conference system 1 of the modification, the selection process performed by the transmitting PC is instead performed by the receiving PC. The following description therefore focuses on the differences from the conference system 1 of the first embodiment, taking the PC 100A installed in the second conference room as an example.

FIG. 5 is a block diagram showing an example of the functions of the CPU of a PC in the modification of the first embodiment. Referring to FIG. 5, the PC 100A includes: an image data acquisition unit 51A that acquires image data output by the camera 37; a sound data acquisition unit 53A that acquires sound data output by the microphone 31A; a video data transmission unit 61A that transmits video data containing the sound data and the image data; a video data reception unit 63A that controls the communication I/F 21A to receive video data; a line-of-sight direction detection unit 55A that detects the line-of-sight direction of the subject from image data; a line-of-sight image storage unit 57A that stores predetermined image data in the HDD 17A as line-of-sight image data; a selection unit 59A that selects either the image data contained in the received video data or the line-of-sight image data stored in the HDD 17A; a display control unit 65A that controls the display unit 27A; and an audio control unit 67A that controls the speaker 33A.

The image data acquisition unit 51A controls the camera 37 connected to the external I/F 29A, acquires the image data output by the camera 37, and outputs it to the video data transmission unit 61A. The sound data acquisition unit 53A controls the microphone 31A connected to the external I/F 29A, acquires the sound data output by the microphone 31A, and outputs it to the video data transmission unit 61A.

The video data transmission unit 61A receives image data from the image data acquisition unit 51A and sound data from the sound data acquisition unit 53A, and transmits video data containing the sound data and the image data via the communication I/F 21A to the PCs other than the own device among the PCs 100, 100A to 100F, that is, to the PC 100 and the PCs 100B to 100F. The PC 100 and the PCs 100B to 100F installed in the first and third to seventh conference rooms likewise transmit video data to the PCs other than themselves among the PCs 100, 100A to 100F.

The video data reception unit 63A controls the communication I/F 21A to receive the video data transmitted by each of the other PCs, namely the PC 100 and the PCs 100B to 100F. From the video data received from these PCs, the video data reception unit 63A selects the video data received from the PC 100 placed in the first conference room, where the speaking participant is present, and outputs the image data contained in the selected video data to the line-of-sight direction detection unit 55A and the selection unit 59A.

The video data reception unit 63A also outputs the six pieces of sound data contained in the video data received from the PC 100 and the PCs 100B to 100F to the audio control unit 67A. The audio control unit 67A outputs the sound data input from the video data reception unit 63A to the speaker 33A and causes the speaker 33A to output sound. In the present embodiment, the video data reception unit 63A receives video data from each of the PC 100 and the PCs 100B to 100F installed in the first and third to seventh conference rooms, so the speaker 33A is made to output a sound synthesized from the six pieces of sound data. Likewise, each of the PCs 100, 100B to 100F installed in the first and third to seventh conference rooms receives video data from all of the PCs other than itself among the PCs 100, 100A to 100F, and therefore causes its own speaker 33, 33B to 33F to output a sound synthesized from the six pieces of sound data contained in the six received pieces of video data.
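The text does not say how the six pieces of sound data are combined. One common approach, shown here purely as an assumption, is to add the streams sample by sample and clip the result to the sample range. The sketch assumes signed 16-bit PCM samples held in plain Python lists; the function name `mix` is hypothetical.

```python
from __future__ import annotations


def mix(streams: list[list[int]], lo: int = -32768, hi: int = 32767) -> list[int]:
    """Mix several PCM streams by summing samples and clipping.

    Assumption: samples are signed 16-bit integers; the patent does not
    specify the audio format or the mixing method.
    """
    if not streams:
        return []
    n = min(len(s) for s in streams)   # mix only the overlapping part
    return [max(lo, min(hi, sum(s[i] for s in streams))) for i in range(n)]
```

A receiving PC would call `mix` with one list per remote PC, six lists in the seven-room configuration described above.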

The line-of-sight direction detection unit 55A detects the line-of-sight direction of the subject from the image data input from the video data reception unit 63A. Here, it detects the line-of-sight direction of the subject from the image data contained in the video data received from the PC 100. The line-of-sight direction detection unit 55A outputs the detected line-of-sight direction to the selection unit 59A and the line-of-sight image storage unit 57A.

The line-of-sight image storage unit 57A receives image data from the video data reception unit 63A and the line-of-sight direction from the line-of-sight direction detection unit 55A. While the line-of-sight direction is the front direction, the line-of-sight image storage unit 57A stores in the HDD 17A, as line-of-sight image data, the image data that is input from the video data reception unit 63A and was contained in the video data received from the PC 100. As a result, line-of-sight image data 101 containing an image of the subject looking straight ahead is stored in the HDD 17A.

The selection unit 59A receives image data from the video data reception unit 63A and the line-of-sight direction from the line-of-sight direction detection unit 55A. Based on the line-of-sight direction, the selection unit 59A selects either the image data contained in the video data received from the PC 100, which is input from the video data reception unit 63A, or the line-of-sight image data 101 stored in the HDD 17A. Specifically, once the predetermined time T1 has elapsed since a line-of-sight direction indicating the front direction stopped being input from the line-of-sight direction detection unit 55A, the selection unit 59A selects the line-of-sight image data stored in the HDD 17A for the predetermined time T2; at all other times it selects the image data input from the video data reception unit 63A. The selection unit 59A outputs whichever of the two it has selected to the display control unit 65A.
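The receiving-side behaviour of the storage unit and the selection unit can be sketched together as follows. This is a minimal illustration under stated assumptions, not the implementation of the modification: frames are opaque `bytes`, gaze detection is supplied by the caller as a boolean, only the most recent frontal frame is kept, and after T1 + T2 the sketch simply falls back to the live frame (the text does not say what happens after T2 ends). The class name `ReceiverSelector` and its fields are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class ReceiverSelector:
    """Chooses between the received frame and a stored frontal frame."""
    t1: float                            # delay before substitution starts
    t2: float                            # how long the stored frame is shown
    stored: Optional[bytes] = None       # last frame with frontal gaze
    away_since: Optional[float] = None   # time the gaze left the front

    def step(self, now: float, frame: bytes, is_front: bool) -> bytes:
        if is_front:
            # Frontal gaze: remember the frame and show it as received.
            self.stored = frame
            self.away_since = None
            return frame
        if self.away_since is None:
            self.away_since = now
        away = now - self.away_since
        # Substitute the stored frame from T1 until T1 + T2 (assumption:
        # live frames are shown again afterwards).
        if self.stored is not None and self.t1 <= away < self.t1 + self.t2:
            return self.stored
        return frame
```

Calling `step` once per received frame with the current time yields the frame to hand to the display control unit.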

The display control unit 65A displays the image of the image data input from the selection unit 59A on the display unit 27A. The PC 100A in the modification of the present embodiment analyzes the image data contained in the video data received from the PC 100 placed in the first conference room. It stores in the HDD 17A the image data captured while the line of sight of the subject, in other words the participant in the first conference room, faces the front direction. When the time during which the participant is not facing the front direction reaches the predetermined time T1, the PC 100A reads from the HDD 17A the image data captured earlier while the participant was facing the front, and displays it. Therefore, even when the participant in the first conference room speaks while, for example, looking at a manuscript, the non-verbal information of that participant can be displayed.

In the modification of the first embodiment, the PC 100 placed in the first conference room receives video data from each of the PCs 100A to 100F installed in the second to seventh conference rooms, and therefore displays on the display unit 27 a screen in which six images based on the six pieces of image data are arranged side by side.

As described above, in the conference system 1 of the first embodiment, the PC 100 placed in the first conference room, in which the presenter participates, detects the presenter's line-of-sight direction from the image data output by the camera 35. While the presenter's line of sight is detected as facing the display unit 27, the image data output by the camera 35 is stored. Either the image data output by the camera 35 or the line-of-sight image data stored in the HDD 17 is selected, and video data containing the selected image data is transmitted to the PCs 100A to 100D placed in the second conference room. While the image data output by the camera 35 is selected, if the state in which the detected line-of-sight direction does not face the display unit 27 continues for the predetermined time T1, the line-of-sight image data stored in the HDD 17 is selected. When the line of sight of the participant in the first conference room faces the display unit 27, the participant in the second conference room meets the eyes of the participant in the first conference room; when it does not face the display unit, their eyes do not meet. Once the state in which their eyes do not meet has continued for the predetermined time T1, the line-of-sight image data stored while the participant in the first conference room was facing the display unit 27 is selected, so the participant in the second conference room sees an image in which their eyes meet. Non-verbal information can therefore be conveyed at times when it is not being conveyed from the presenter to the participants.

<Second Embodiment>
Next, the conference system of the second embodiment is described. The conference system of the second embodiment is applicable when three geographically separated people hold a conference.

FIG. 6 is a diagram showing an overview of the entire conference system of the second embodiment. Referring to FIG. 6, the conference system 1A of the second embodiment includes three PCs 100, 100A, and 100B connected to the network 3. The PCs 100, 100A, and 100B are installed in geographically separated first, second, and third conference rooms, respectively. The description here takes as an example the case in which a first participant is in the first conference room, a second participant is in the second conference room, and a third participant is in the third conference room.

The network 3 is a local area network (LAN), so the PCs 100, 100A, and 100B can transmit data to and receive data from one another. The network 3 is not limited to a LAN and may be the Internet, a wide area network (WAN), a public switched telephone network, or the like. The network 3 may be wired or wireless.

In the conference system 1A of the second embodiment, the PCs 100, 100A, and 100B have the same configuration. In the following description, the components of the PCs 100A and 100B placed in the second and third conference rooms are denoted by the reference numerals of the corresponding components of the PC 100 placed in the first conference room with the suffixes A and B added, respectively. Because the PCs 100, 100A, and 100B have the same configuration, the PC 100 placed in the first conference room is used as the example below unless otherwise noted.

FIG. 7 is a block diagram showing an example of the hardware configuration of a PC in the second embodiment. Referring to FIG. 7, the configuration differs from the hardware configuration of the PC 100 shown in FIG. 2 in that the camera 35 connected to the external I/F 29 is now called the first camera 35, and a second camera 37 and a third camera 39 are additionally connected to the external I/F 29. The rest of the configuration is the same as that shown in FIG. 2, so its description is not repeated here.

The microphone 31, the speaker 33, and the first to third cameras 35, 37, and 39 are connected to the external I/F 29. The external I/F 29 outputs the sound data input from the microphone 31 to the CPU 11. The first camera 35 captures the subject in front of the PC 100, here the front of the first participant, and outputs the resulting image data to the external I/F 29. The second camera 37 captures the right side of the first participant and outputs the resulting image data to the external I/F 29. The third camera 39 captures the left side of the first participant and outputs the resulting image data to the external I/F 29. The external I/F 29 outputs the image data input from the first to third cameras 35, 37, and 39 to the CPU 11.

The positional relationship between the participants and the cameras in the first to third conference rooms is described next. FIG. 8(A) is a diagram showing an example of the arrangement of devices in the first conference room. In FIG. 8(A), the first participant seen from behind is drawn with vertical hatching. Referring to FIG. 8(A), the display unit 27 is placed in front of the first participant. The first camera 35 is placed in front of the first participant above the display unit 27, the second camera 37 is placed to the right of the first participant, and the third camera 39 is placed to the left of the first participant. The first camera 35 captures the front of the first participant, the second camera 37 captures the right side of the first participant, and the third camera 39 captures the left side of the first participant.

FIG. 8(B) is a diagram showing an example of the arrangement of devices in the second conference room. In FIG. 8(B), the second participant seen from behind is drawn with horizontal hatching. Referring to FIG. 8(B), the display unit 27A is placed in front of the second participant. The first camera 35A is placed in front of the second participant above the display unit 27A, the second camera 37A is placed to the right of the second participant, and the third camera 39A is placed to the left of the second participant. The first camera 35A captures the front of the second participant, the second camera 37A captures the right side of the second participant, and the third camera 39A captures the left side of the second participant.

FIG. 8(C) is a diagram showing an example of the arrangement of devices in the third conference room. In FIG. 8(C), the third participant seen from behind is drawn without hatching. Referring to FIG. 8(C), the display unit 27B is placed in front of the third participant. The first camera 35B is placed in front of the third participant above the display unit 27B, the second camera 37B is placed to the right of the third participant, and the third camera 39B is placed to the left of the third participant. The first camera 35B captures the front of the third participant, the second camera 37B captures the right side of the third participant, and the third camera 39B captures the left side of the third participant.

In the conference system 1A of the second embodiment, an image selected from the three images of the first participant captured by the cameras 35, 37, and 39 of the PC 100 in the first conference room is displayed on the display unit 27A of the PC 100A in the second conference room, and an image selected from those three images is also displayed on the display unit 27B of the PC 100B in the third conference room. Likewise, an image selected from the three images of the second participant captured by the cameras 35A, 37A, and 39A of the PC 100A in the second conference room is displayed on the display unit 27 of the PC 100 in the first conference room, and an image selected from those three images is also displayed on the display unit 27B of the PC 100B in the third conference room. Further, an image selected from the three images of the third participant captured by the cameras 35B, 37B, and 39B of the PC 100B in the third conference room is displayed on the display unit 27 of the PC 100 in the first conference room, and an image selected from those three images is also displayed on the display unit 27A of the PC 100A in the second conference room.

In other words, the display unit 27 of the PC 100 in the first conference room displays an image selected from the three images of the second participant captured by the cameras 35A, 37A, and 39A in the second conference room and an image selected from the three images of the third participant captured by the cameras 35B, 37B, and 39B in the third conference room. The display unit 27A of the PC 100A in the second conference room displays an image selected from the three images of the first participant captured by the cameras 35, 37, and 39 in the first conference room and an image selected from the three images of the third participant captured by the cameras 35B, 37B, and 39B in the third conference room. The display unit 27B of the PC 100B in the third conference room displays an image selected from the three images of the first participant captured by the cameras 35, 37, and 39 in the first conference room and an image selected from the three images of the second participant captured by the cameras 35A, 37A, and 39A in the second conference room.

The speaker 33 of the PC 100 in the first conference room outputs a sound synthesized from the sound collected by the microphone 31A of the PC 100A in the second conference room and the sound collected by the microphone 31B of the PC 100B in the third conference room. The speaker 33A of the PC 100A in the second conference room outputs a sound synthesized from the sound collected by the microphone 31 of the PC 100 in the first conference room and the sound collected by the microphone 31B of the PC 100B in the third conference room. The speaker 33B of the PC 100B in the third conference room outputs a sound synthesized from the sound collected by the microphone 31 of the PC 100 in the first conference room and the sound collected by the microphone 31A of the PC 100A in the second conference room.
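The routing rule in this paragraph is simply "each PC plays the sound of every other PC". A short sketch makes the symmetry explicit; the function name `playback_sources` and the use of strings such as `"PC100"` as identifiers are illustrative assumptions.

```python
from __future__ import annotations


def playback_sources(pcs: list[str]) -> dict[str, list[str]]:
    """For each PC, list the PCs whose microphone sound it plays back.

    Each PC hears every PC except itself, in the order given.
    """
    return {pc: [other for other in pcs if other != pc] for pc in pcs}
```

For the three PCs of the second embodiment this gives two sources per speaker, which are then synthesized into one output sound.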

The following describes, with the PC 100 in the first conference room as the focus, the data exchanged with the PC 100A in the second conference room and the PC 100B in the third conference room, as well as the images displayed on the display unit 27 of the PC 100 and the sound output from the speaker 33.

FIG. 9 is a diagram showing an example of the data exchanged between the PC 100 in the first conference room and the PC 100A in the second conference room and the PC 100B in the third conference room. Referring to FIG. 9, the PC 100 in the first conference room transmits video data, first utterance partner information, and display area information to the PC 100A in the second conference room, and receives video data, second utterance partner information, and display area information from the PC 100A. The PC 100 also transmits video data, first utterance partner information, and display area information to the PC 100B in the third conference room, and receives video data, third utterance partner information, and display area information from the PC 100B.

The video data that the PC 100 in the first conference room transmits to the PC 100A in the second conference room contains one piece of first selected image data, selected from the three pieces of image data output by the first to third cameras 35, 37, and 39, and the sound data output by the microphone 31. The video data that the PC 100 transmits to the PC 100B in the third conference room contains one piece of second selected image data, selected from the same three pieces of image data, and the sound data output by the microphone 31. The first selected image data sent to the PC 100A and the second selected image data sent to the PC 100B may be the same or may differ. The sound data contained in the video data sent to the PC 100A and the sound data contained in the video data sent to the PC 100B are the same.
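The per-destination packaging described here can be sketched as a small data structure. The rule by which a camera is chosen for each destination is not given in this passage, so the sketch takes that choice as an input. The names `VideoData`, `build_outgoing`, and the camera keys are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VideoData:
    image: bytes   # the selected camera image for this destination
    sound: bytes   # microphone sound, identical for every destination


def build_outgoing(frames: dict[str, bytes],
                   choice: dict[str, str],
                   sound: bytes) -> dict[str, VideoData]:
    """Build one VideoData per destination PC.

    frames: camera id -> latest image from that camera
    choice: destination PC -> camera id selected for it (assumed given)
    """
    return {dest: VideoData(frames[cam], sound) for dest, cam in choice.items()}
```

Because the choice is made per destination, the two outgoing images may coincide or differ, while the sound field is always shared.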

The PC 100 in the first conference room displays on the display unit 27 the image data contained in the video data received from the PC 100A in the second conference room and the image data contained in the video data received from the PC 100B in the third conference room. The image data contained in the video data received from the PC 100A is the one piece selected by the PC 100A from the three pieces of image data output by the first to third cameras 35A, 37A, and 39A. The image data contained in the video data received from the PC 100B is the one piece selected by the PC 100B from the three pieces of image data output by the first to third cameras 35B, 37B, and 39B.

Here, the display area of the display unit of the PC 100 will be described. FIG. 10 is a diagram illustrating an example of the display area of the display unit. Referring to FIG. 10, the display unit 27 has a first display area and a second display area, which are the two areas obtained by dividing the display area of the display unit 27 into left and right halves. The first display area is the left half, and the second display area is the right half.

Returning to FIG. 9, the PC 100 in the first conference room displays the image data included in the video data received from the PC 100A in one of the first and second display areas, and displays the image data included in the video data received from the PC 100B in the other. The following description takes as an example the case where the PC 100 displays the image data received from the PC 100A in the first display area and the image data received from the PC 100B in the second display area.

The display area information that the PC 100 in the first conference room transmits to the PC 100A in the second conference room indicates whether the PC 100 is displaying the image data included in the video data received from the PC 100A in the first display area or in the second display area. Similarly, the display area information that the PC 100 transmits to the PC 100B in the third conference room indicates whether the PC 100 is displaying the image data included in the video data received from the PC 100B in the first display area or in the second display area. In this example, the display area information transmitted to the PC 100A indicates the first display area, and the display area information transmitted to the PC 100B indicates the second display area.

The display area information that the PC 100 in the first conference room receives from the PC 100A in the second conference room indicates whether the first selected image data, included in the video data that the PC 100 transmits to the PC 100A, is displayed in the first display area or in the second display area of the display unit 27A of the PC 100A. The display area information that the PC 100 receives from the PC 100B in the third conference room indicates whether the second selected image data, included in the video data that the PC 100 transmits to the PC 100B, is displayed in the first display area or in the second display area of the display unit 27B of the PC 100B.

The first utterance partner information is detected by the PC 100. It indicates whether the first participant in the first conference room is speaking and, if so, whether the first participant is speaking to the second participant or to the third participant. The second utterance partner information is detected by the PC 100A. It indicates whether the second participant in the second conference room is speaking and, if so, whether the second participant is speaking to the first participant or to the third participant. The third utterance partner information is detected by the PC 100B. It indicates whether the third participant in the third conference room is speaking and, if so, whether the third participant is speaking to the first participant or to the second participant.
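As a non-limiting illustration, the information exchanged between the conference rooms can be pictured as one record per destination. The following Python sketch is offered only under stated assumptions: the names `DisplayArea`, `UtterancePartnerInfo`, `VideoData`, `ConferenceRoomInfo`, and `build_messages`, and the use of raw bytes for image and sound data, are hypothetical and do not appear in the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class DisplayArea(Enum):
    FIRST = 1   # left half of the display unit
    SECOND = 2  # right half of the display unit


@dataclass(frozen=True)
class UtterancePartnerInfo:
    speaking: bool                      # whether the local participant is speaking
    partner_room: Optional[int] = None  # room being addressed, if any


@dataclass(frozen=True)
class VideoData:
    selected_image: bytes  # one of the three camera images (hypothetical encoding)
    sound: bytes           # microphone output, identical for every destination


@dataclass(frozen=True)
class ConferenceRoomInfo:
    video: VideoData
    utterance_partner: UtterancePartnerInfo
    display_area: DisplayArea  # where the sender shows the receiver's image


def build_messages(
    selected: Dict[int, bytes],
    sound: bytes,
    partner: UtterancePartnerInfo,
    layout: Dict[int, DisplayArea],
) -> Dict[int, ConferenceRoomInfo]:
    """Build one record per destination room.

    `selected` maps a destination room to the image chosen for it, and
    `layout` maps a destination room to the local display area showing it.
    """
    return {
        room: ConferenceRoomInfo(VideoData(selected[room], sound), partner, layout[room])
        for room in layout
    }
```

In this sketch the selected image may differ per destination, while the sound data and the utterance partner information are shared by both records.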

FIG. 11 is a block diagram illustrating an example of the functions of the CPU of the PC in the second embodiment. Referring to FIG. 11, the PC 100 includes: an image data acquisition unit 51 that acquires the first to third image data output by the first to third cameras 35, 37, and 39, respectively; a sound data acquisition unit 53 that acquires the sound data output by the microphone 31; an utterance detection unit 71 that detects an utterance by a participant; first and second selection units 73 and 77 that each select one of the first to third image data; a line-of-sight direction detection unit 75 that detects the line-of-sight direction of the subject based on the first image data; an utterance partner determination unit 79 that determines the utterance partner; a first conference room information transmission unit 81 that transmits conference room information to the PC 100A in the second conference room; a second conference room information transmission unit 83 that transmits conference room information to the PC 100B in the third conference room; a first conference room information receiving unit 85 that receives conference room information from the PC 100A in the second conference room; a second conference room information receiving unit 87 that receives conference room information from the PC 100B in the third conference room; a conversation partner determination unit 89 that determines the conversation partner; a display control unit 91 that controls the display unit 27; and an audio control unit 93 that controls the speaker 33.

The image data acquisition unit 51 controls the first to third cameras 35, 37, and 39 connected to the external I/F 29 and acquires the first to third image data that they respectively output. The first to third image data may be moving images or still images; here they are moving images. The image data acquisition unit 51 outputs the first to third image data to the first and second selection units 73 and 77, and outputs the first image data, output by the first camera 35, to the line-of-sight direction detection unit 75.

The sound data acquisition unit 53 controls the microphone 31 connected to the external I/F 29 and acquires the sound data output by the microphone 31. The sound data acquisition unit 53 outputs the sound data to the utterance detection unit 71, the first conference room information transmission unit 81, and the second conference room information transmission unit 83.

The line-of-sight direction detection unit 75 detects the line-of-sight direction of the subject based on the first image data input from the image data acquisition unit 51. Specifically, it extracts the area of the subject from the image data and then extracts the eye area and the pupil area from the subject area. The line-of-sight direction is detected from the position of the pupil area relative to the eye area: the eye area is divided horizontally into a left divided area and a right divided area, and the unit determines in which of the two the pupil area lies. If the pupil area lies in the left divided area of the eye area, the rightward line-of-sight direction is detected; if it lies in the right divided area, the leftward line-of-sight direction is detected. The line-of-sight direction detection unit 75 outputs the detected line-of-sight direction to the utterance partner determination unit 79. If the eye area cannot be extracted from the image data, the line-of-sight direction detection unit 75 does not detect a line-of-sight direction and outputs nothing to the utterance partner determination unit 79.
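As a non-limiting illustration, the left/right decision made by the line-of-sight direction detection unit can be sketched in Python as follows. The sketch assumes that the eye area and the pupil area have already been extracted as bounding boxes by some other means, that the eye area is the region split into left and right halves, and that a pupil centered exactly on the dividing line counts as lying in the right half. The function name `detect_gaze_direction` and the box representation are hypothetical.

```python
from typing import Optional, Tuple

# (left, top, right, bottom) in image pixel coordinates; hypothetical representation
Box = Tuple[int, int, int, int]


def detect_gaze_direction(eye: Optional[Box], pupil: Optional[Box]) -> Optional[str]:
    """Return "left", "right", or None when no direction can be detected."""
    if eye is None or pupil is None:
        # The eye area could not be extracted, so no direction is output.
        return None
    eye_center_x = (eye[0] + eye[2]) / 2
    pupil_center_x = (pupil[0] + pupil[2]) / 2
    # The camera faces the subject, so the image is mirrored relative to the
    # subject's own view: pupil in the left half means a rightward gaze.
    if pupil_center_x < eye_center_x:
        return "right"
    return "left"
```

Returning `None` rather than a default direction mirrors the behavior described above, in which nothing is output when the eye area cannot be found.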

The utterance detection unit 71 receives the sound data from the sound data acquisition unit 53 and, based on the sound data, detects whether a participant in the first conference room is speaking. It compares the sound level of the sound data with a predetermined threshold and determines that a participant in the first conference room is speaking if the sound level is equal to or higher than the threshold. The utterance detection unit 71 outputs an utterance presence/absence signal, indicating whether a participant in the first conference room is speaking, to the utterance partner determination unit 79, the first selection unit 73, and the second selection unit 77.
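As a non-limiting illustration, the threshold comparison can be sketched in Python as follows. The description does not say how the sound level is measured; the sketch assumes a root-mean-square level over a block of samples normalized to the range -1.0 to 1.0, and the function names and the default threshold value are hypothetical.

```python
import math
from typing import Sequence


def sound_level(samples: Sequence[float]) -> float:
    """Root-mean-square level of one block of samples (assumed measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speaking(samples: Sequence[float], threshold: float = 0.1) -> bool:
    """Utterance presence/absence signal: True when the level reaches the threshold."""
    return sound_level(samples) >= threshold
```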

The utterance partner determination unit 79 receives the line-of-sight direction from the line-of-sight direction detection unit 75 and the utterance presence/absence signal from the utterance detection unit 71, and determines the utterance partner based on both. The display control unit 91, described later, displays the image data included in the video data received from the second conference room in the first display area of the display unit 27 and the image data included in the video data received from the third conference room in the second display area. Therefore, when the utterance presence/absence signal indicates that the participant is speaking, the utterance partner determination unit 79 determines that the participant is speaking to the participant in the second conference room if the line-of-sight direction is leftward, and to the participant in the third conference room if the line-of-sight direction is rightward. The utterance partner determination unit 79 outputs first utterance partner information indicating the identified utterance partner to the conversation partner determination unit 89 and to the first and second conference room information transmission units 81 and 83.

Note that the utterance partner determination unit 79 may determine the utterance partner from the line-of-sight direction input from the line-of-sight direction detection unit 75 alone. In this case, regardless of the utterance presence/absence signal, the utterance partner determination unit 79 determines that the participant is speaking to the participant in the second conference room if the line-of-sight direction is leftward, and to the participant in the third conference room if the line-of-sight direction is rightward. It then outputs first utterance partner information indicating the identified utterance partner to the conversation partner determination unit 89 and to the first and second conference room information transmission units 81 and 83.

The display control unit 91 may display the image data included in the video data received from the second conference room in either the first display area or the second display area of the display unit 27. Accordingly, when the display control unit 91 displays the image data received from the second conference room in the second display area and the image data received from the third conference room in the first display area, the utterance partner determination unit 79 determines that the participant is speaking to the participant in the third conference room if the line-of-sight direction is leftward, and to the participant in the second conference room if the line-of-sight direction is rightward.
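As a non-limiting illustration, the determination of the utterance partner, including the variant that ignores the utterance presence/absence signal and the case where the two display areas are swapped, can be sketched in Python as follows. The function name, the `layout` dictionary, and the `require_speech` flag are hypothetical names introduced for this sketch.

```python
from typing import Dict, Optional


def determine_utterance_partner(
    speaking: bool,
    gaze: Optional[str],
    layout: Dict[str, int],
    require_speech: bool = True,
) -> Optional[int]:
    """Return the conference room being addressed, or None if undetermined.

    `layout` maps a gaze direction ("left" or "right") to the room whose
    image is shown on that side of the display unit.
    """
    if gaze is None:
        return None  # no line-of-sight direction was detected
    if require_speech and not speaking:
        return None  # the participant is not speaking
    return layout.get(gaze)
```

With the layout used as the running example, `{"left": 2, "right": 3}`, a leftward gaze while speaking yields room 2. Swapping the layout to `{"left": 3, "right": 2}` yields room 3 for the same gaze.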

The first conference room information receiving unit 85 controls the communication I/F 21 to receive video data, second utterance partner information, and display area information from the PC 100A in the second conference room. It outputs the image data included in the received video data to the display control unit 91 and the sound data to the audio control unit 93. It also outputs the second utterance partner information and the display area information to the conversation partner determination unit 89.

The second conference room information receiving unit 87 controls the communication I/F 21 to receive video data, third utterance partner information, and display area information from the PC 100B in the third conference room. It outputs the image data included in the received video data to the display control unit 91 and the sound data to the audio control unit 93. It also outputs the third utterance partner information and the display area information to the conversation partner determination unit 89.

The display control unit 91 displays the image data input from the first conference room information receiving unit 85, that is, the image data included in the video data received from the second conference room, in the first display area of the display unit 27. It displays the image data input from the second conference room information receiving unit 87, that is, the image data included in the video data received from the third conference room, in the second display area of the display unit 27. Which of the two display areas is used for the image data received from the second conference room is arbitrary. When the display control unit 91 displays the image data received from the second conference room in the second display area, it displays the image data received from the third conference room in the first display area.

The audio control unit 93 causes the speaker 33 to output sound obtained by combining the sound data input from the first conference room information receiving unit 85 with the sound data input from the second conference room information receiving unit 87.
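As a non-limiting illustration, combining the two sound streams can be sketched in Python as follows. The description does not specify how the sound data are combined; the sketch assumes 16-bit PCM samples that are added sample by sample and clipped to the 16-bit range, with the shorter stream padded by silence. The function name `mix` is hypothetical.

```python
from itertools import zip_longest
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Add two sample sequences and clip each sum to the 16-bit range."""
    return [
        max(INT16_MIN, min(INT16_MAX, x + y))
        for x, y in zip_longest(a, b, fillvalue=0)  # pad the shorter stream with silence
    ]
```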

The conversation partner determination unit 89 receives the second utterance partner information and display area information from the first conference room information receiving unit 85, the third utterance partner information and display area information from the second conference room information receiving unit 87, and the first utterance partner information from the utterance partner determination unit 79. The second utterance partner information indicates whether the second participant is speaking to the first participant or to the third participant. The third utterance partner information indicates whether the third participant is speaking to the first participant or to the second participant. When the first utterance partner information indicates that the first participant is speaking to the second participant and the second utterance partner information indicates that the second participant is speaking to the first participant, the conversation partner determination unit 89 determines a first conversation state, in which the first participant and the second participant are conversing. When the first utterance partner information indicates that the first participant is speaking to the third participant and the third utterance partner information indicates that the third participant is speaking to the first participant, the conversation partner determination unit 89 determines a second conversation state, in which the first participant and the third participant are conversing.
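As a non-limiting illustration, the mutual-addressing test that distinguishes the first and second conversation states can be sketched in Python as follows. Each argument is the number of the participant being addressed, or `None` when that participant is not speaking. The function name and the string return values are hypothetical.

```python
from typing import Optional


def conversation_state(
    first_partner: Optional[int],
    second_partner: Optional[int],
    third_partner: Optional[int],
) -> Optional[str]:
    """Return "first", "second", or None as seen from the first conference room."""
    if first_partner == 2 and second_partner == 1:
        return "first"   # first and second participants address each other
    if first_partner == 3 and third_partner == 1:
        return "second"  # first and third participants address each other
    return None
```

A conversation state is reported only when both sides address each other; one participant speaking toward another who is addressing a third party yields `None`.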

When the conversation partner determination unit 89 determines the first conversation state, in which the first participant and the second participant are conversing, it outputs a first conversation state signal to the first selection unit 73. Otherwise, it outputs to the first selection unit 73 the display area information input from the first conference room information receiving unit 85. That display area information indicates whether the first selected image data, included in the video data transmitted from the PC 100 to the PC 100A, is displayed in the first display area or in the second display area of the display unit 27A of the PC 100A in the second conference room.

When the conversation partner determination unit 89 determines the second conversation state, in which the first participant and the third participant are conversing, it outputs a second conversation state signal to the second selection unit 77. Otherwise, it outputs to the second selection unit 77 the display area information input from the second conference room information receiving unit 87. That display area information indicates whether the second selected image data, included in the video data transmitted from the PC 100 to the PC 100B, is displayed in the first display area or in the second display area of the display unit 27B of the PC 100B in the third conference room.

The first selection unit 73 receives the first to third image data from the image data acquisition unit 51, the utterance presence/absence signal from the utterance detection unit 71, and either the first conversation state signal or display area information from the conversation partner determination unit 89. When the first conversation state signal is input, the first selection unit 73 selects the first image data as the first selected image data. When display area information is input, it selects either the second or the third image data as the first selected image data: the second image data if the display area information indicates the first display area, and the third image data if the display area information indicates the second display area. Even while display area information is being input from the conversation partner determination unit 89, once that state has continued for a predetermined time T1, the first selection unit 73 selects the first image data as the first selected image data for a predetermined time T2, provided that an utterance presence/absence signal indicating that the participant is speaking is being input from the utterance detection unit 71. The first selection unit 73 outputs the first selected image data, selected from the first to third image data, to the first conference room information transmission unit 81.

The second selection unit 77 receives the first to third image data from the image data acquisition unit 51, the utterance presence/absence signal from the utterance detection unit 71, and either the second conversation state signal or display area information from the conversation partner determination unit 89. When the second conversation state signal is input, the second selection unit 77 selects the first image data as the second selected image data. When display area information is input, it selects either the second or the third image data as the second selected image data: the second image data if the display area information indicates the first display area, and the third image data if the display area information indicates the second display area. Even while display area information is being input from the conversation partner determination unit 89, once that state has continued for the predetermined time T1, the second selection unit 77 selects the first image data as the second selected image data for the predetermined time T2, provided that an utterance presence/absence signal indicating that the participant is speaking is being input from the utterance detection unit 71. The second selection unit 77 outputs the second selected image data, selected from the first to third image data, to the second conference room information transmission unit 83.
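As a non-limiting illustration, the selection rule shared by the first selection unit 73 and the second selection unit 77 can be sketched in Python as follows. The class name `ImageSelector`, its parameters, and the use of seconds are assumptions made for this sketch. The description does not state what happens once the time T2 has elapsed; the sketch assumes that the side view resumes and the T1 count restarts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ImageSelector:
    t1: float  # seconds of side view before the frontal view may be inserted
    t2: float  # seconds for which the frontal view is then held
    _side_since: Optional[float] = field(default=None, init=False)
    _front_until: Optional[float] = field(default=None, init=False)

    def select(self, now: float, in_conversation: bool,
               display_area: int, speaking: bool) -> int:
        """Return 1, 2, or 3: which camera's image data to transmit."""
        if in_conversation:
            # Conversation state signal: always send the first image data.
            self._side_since = None
            self._front_until = None
            return 1
        if self._front_until is not None:
            if now < self._front_until:
                return 1  # still inside the T2 hold
            # Assumption: after T2 the side view resumes and T1 restarts.
            self._front_until = None
            self._side_since = now
        if self._side_since is None:
            self._side_since = now  # display area information starts arriving
        if speaking and now - self._side_since >= self.t1:
            self._front_until = now + self.t2
            return 1
        # First display area -> second image data, second area -> third.
        return 2 if display_area == 1 else 3
```

One instance would be kept per destination: one driven by the first conversation state for the PC 100A and one driven by the second conversation state for the PC 100B.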

The first conference room information transmission unit 81 receives the sound data from the sound data acquisition unit 53, the one piece of first selected image data selected from the first to third image data from the first selection unit 73, and the first utterance partner information from the utterance partner determination unit 79. It controls the communication I/F 21 to transmit, to the PC 100A in the second conference room, video data including the first selected image data and the sound data, the first utterance partner information, and display area information indicating the first display area. The display area information may instead be input from the display control unit 91.

The second conference room information transmission unit 83 receives the sound data from the sound data acquisition unit 53, the one piece of second selected image data selected from the first to third image data from the second selection unit 77, and the first utterance partner information from the utterance partner determination unit 79. It controls the communication I/F 21 to transmit, to the PC 100B in the third conference room, video data including the second selected image data and the sound data, the first utterance partner information, and display area information indicating the second display area. The display area information may instead be input from the display control unit 91.

FIG. 12 is a flowchart illustrating an example of the flow of the video data transmission process in the second embodiment. The video data transmission process in the second embodiment is executed in each of the PCs 100, 100A, and 100B. Because only the image data and sound data being processed differ, the case where the PC 100 executes the process is described here as an example. The video data transmission process is executed by the CPU 11 of the PC 100 when the CPU 11 runs a video data transmission program stored in the ROM 13, the HDD 17, or the memory card 18.

Referring to FIG. 12, the CPU 11 receives video data, second utterance partner information, and display area information from the PC 100A in the second conference room (step S21), and receives video data, third utterance partner information, and display area information from the PC 100B in the third conference room (step S22).

The CPU 11 then displays the image data included in the video data of the second conference room, received from the PC 100A in step S21, in the first display area of the display unit 27 (step S23), and sets information indicating the first display area in the display area information for the second conference room (step S24). Further, it displays the image data included in the video data of the third conference room, received from the PC 100B in step S22, in the second display area of the display unit 27 (step S25), and sets information indicating the second display area in the display area information for the third conference room (step S26).

In the next step S27, the line-of-sight direction of the subject is detected based on the image data input from the first camera 35. In the next step S28, an utterance partner detection process is executed. This process, described in detail later, detects whether the first participant in the first conference room is speaking to the second participant in the second conference room or to the third participant in the third conference room, and generates first utterance partner information that includes information indicating the detected partner. In the next step S29, a conversation partner detection process is executed. This process, described in detail later, detects whether the participant in the first conference room is conversing with the second participant or with the third participant.

次のステップＳ３０においては、選択画像設定処理を実行する。選択画像設定処理の詳細は後述するが、第２会議室に送信する第１選択画像データに第１〜第３画像データのいずれかを設定するとともに、第３会議室に送信する第２選択画像データに第１〜第３画像データのいずれかを設定する処理である。 In the next step S30, a selected image setting process is executed. Although the details of the selected image setting process will be described later, any one of the first to third image data is set in the first selected image data to be transmitted to the second conference room, and the second selected image to be transmitted to the third conference room This is a process for setting any one of the first to third image data in the data.

次のステップＳ３１においては、通信Ｉ／Ｆ２１を介して、第２会議室に配置されたＰＣ１００Ａに、映像データ、第１発話相手情報、およびステップＳ２４において設定された第２会議室用の表示領域情報を送信する。映像データは、第１選択画像データおよび音データを含む。次のステップＳ３２においては、通信Ｉ／Ｆ２１を介して、第３会議室に配置されたＰＣ１００Ｂに、映像データ、第１発話相手情報、およびステップＳ２６において設定された第３会議室用の表示領域情報を送信する。映像データは、第２選択画像データおよび音データを含む。 In the next step S31, the video data, the first utterance partner information, and the display area information for the second conference room set in step S24 are transmitted via the communication I/F 21 to the PC 100A arranged in the second conference room. The video data includes the first selected image data and sound data. In the next step S32, the video data, the first utterance partner information, and the display area information for the third conference room set in step S26 are transmitted via the communication I/F 21 to the PC 100B arranged in the third conference room. The video data includes the second selected image data and sound data.

次のステップＳ３３においては、終了指示を受け付けたか否かを判断する。参加者が、操作部２５の備える終了指示が割り当てられたキーを押下すれば、終了指示を受け付ける。終了指示を受け付けたならば処理を終了するが、そうでなければ処理をステップＳ２１に戻す。 In the next step S33, it is determined whether an end instruction has been accepted. If the participant presses the key to which the end instruction provided in the operation unit 25 is assigned, the end instruction is accepted. If the end instruction is accepted, the process ends. If not, the process returns to step S21.

図１３は、発話相手検出処理の流れの一例を示すフローチャートである。発話相手検出処理は、図１２のステップＳ２８において実行される処理である。図１３を参照して、ＣＰＵ１１は、発話を検出したか否かを判断する（ステップＳ４１）。マイクロホン３１が出力する音データの音声レベルが予め定められたしきい値以上ならば第１会議室の第１参加者の発話を検出する。発話を検出したならば処理をステップＳ４２に進めるが、そうでなければ処理をステップＳ４６に進める。 FIG. 13 is a flowchart illustrating an example of the flow of the utterance partner detection process. The utterance partner detection process is executed in step S28 of FIG. 12. Referring to FIG. 13, the CPU 11 determines whether an utterance has been detected (step S41). If the sound level of the sound data output from the microphone 31 is equal to or higher than a predetermined threshold value, an utterance by the first participant in the first conference room is detected. If an utterance is detected, the process proceeds to step S42; otherwise, the process proceeds to step S46.

ステップＳ４２においては、視線方向が第１表示領域か否かを判断する。第１画像データに含まれる第１参加者の視線方向が左方向ならば視線方向が第１表示領域と判断する。視線方向が第１表示領域ならば処理をステップＳ４３に進めるが、そうでなければ処理をステップＳ４４に進める。ステップＳ４４においては、視線方向が第２表示領域か否かを判断する。第１画像データの第１参加者の視線方向が右方向ならば視線方向が第２表示領域と判断する。視線方向が第２表示領域ならば処理をステップＳ４５に進めるが、そうでなければ処理をステップＳ４６に進める。 In step S42, it is determined whether or not the viewing direction is the first display area. If the gaze direction of the first participant included in the first image data is the left direction, the gaze direction is determined as the first display area. If the line-of-sight direction is the first display area, the process proceeds to step S43; otherwise, the process proceeds to step S44. In step S44, it is determined whether or not the viewing direction is the second display area. If the line-of-sight direction of the first participant in the first image data is the right direction, the line-of-sight direction is determined as the second display area. If the line-of-sight direction is the second display area, the process proceeds to step S45; otherwise, the process proceeds to step S46.

処理がステップＳ４３に進む場合、第１会議室の参加者が発話しており、かつ、視線方向が第１表示領域の場合である。ステップＳ４３においては、第１発話相手情報に第２参加者を設定し、処理を図１２に示した映像データ送信処理に戻す。 The process proceeds to step S43 when the participant in the first conference room is speaking and the line-of-sight direction is toward the first display area. In step S43, the second participant is set in the first utterance partner information, and the process returns to the video data transmission process shown in FIG. 12.

処理がステップＳ４５に進む場合、第１会議室の参加者が発話しており、かつ、視線方向が第２表示領域の場合である。ステップＳ４５においては、第１発話相手情報に第３参加者を設定し、処理を図１２に示した映像データ送信処理に戻す。 The process proceeds to step S45 when the participant in the first conference room is speaking and the line-of-sight direction is toward the second display area. In step S45, the third participant is set in the first utterance partner information, and the process returns to the video data transmission process shown in FIG. 12.

処理がステップＳ４６に進む場合、第１会議室の参加者が発話していない場合、または、視線方向が第１表示領域および第２表示領域のいずれでもない場合である。ステップＳ４６においては、第１発話相手情報にブランクを設定し、処理を図１２に示した映像データ送信処理に戻す。 The process proceeds to step S46 when the participant in the first conference room is not speaking, or when the line-of-sight direction is toward neither the first display area nor the second display area. In step S46, a blank is set in the first utterance partner information, and the process returns to the video data transmission process shown in FIG. 12.
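
The branching of steps S41 to S46 can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Gaze` and `detect_utterance_partner`, the string labels, and the use of `None` for the blank value are assumptions made for illustration only.

```python
from enum import Enum
from typing import Optional


class Gaze(Enum):
    """Hypothetical encoding of the detected line-of-sight direction."""
    LEFT = "left"    # toward the first display area
    RIGHT = "right"  # toward the second display area
    OTHER = "other"  # toward neither display area


def detect_utterance_partner(sound_level: float,
                             threshold: float,
                             gaze: Gaze) -> Optional[str]:
    """Return the first utterance partner information (None stands for blank)."""
    # Step S41: an utterance is detected when the sound level reaches the threshold.
    if sound_level < threshold:
        return None          # step S46: blank
    # Step S42: gaze to the left means the first display area.
    if gaze is Gaze.LEFT:
        return "second"      # step S43: speaking to the second participant
    # Step S44: gaze to the right means the second display area.
    if gaze is Gaze.RIGHT:
        return "third"       # step S45: speaking to the third participant
    return None              # step S46: blank
```

For example, `detect_utterance_partner(0.8, 0.5, Gaze.LEFT)` yields `"second"`, while a sound level below the threshold yields the blank value whatever the gaze direction.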

図１４は、会話相手検出処理の流れの一例を示すフローチャートである。会話相手検出処理は、図１２のステップＳ２９において実行される処理である。図１４を参照して、ＣＰＵ１１は、第１発話相手情報に、第２参加者が設定されているか否かを判断する。第１発話相手情報に、第２参加者が設定されているならば処理をステップＳ５２に進め、そうでなければ処理をステップＳ５４に進める。ステップＳ５２においては、第２発話相手情報に第１参加者が設定されているか否かを判断する。第２発話相手情報に第１参加者が設定されているならば処理をステップＳ５３に進め、そうでなければ処理をステップＳ５４に進める。第２発話相手情報は、第２会議室に配置されたＰＣ１００Ａから受信され、第２会議室の第２参加者が発話している相手を特定する情報が設定されている。 FIG. 14 is a flowchart illustrating an example of the flow of the conversation partner detection process. The conversation partner detection process is executed in step S29 of FIG. 12. Referring to FIG. 14, the CPU 11 determines whether the second participant is set in the first utterance partner information. If the second participant is set in the first utterance partner information, the process proceeds to step S52; otherwise, the process proceeds to step S54. In step S52, it is determined whether the first participant is set in the second utterance partner information. If the first participant is set in the second utterance partner information, the process proceeds to step S53; otherwise, the process proceeds to step S54. The second utterance partner information is received from the PC 100A arranged in the second conference room, and contains information identifying the partner to whom the second participant in the second conference room is speaking.

処理がステップＳ５３に進む場合、第１発話相手情報に第２参加者が設定されている場合であって、かつ、第２発話相手情報に第１会議室の参加者が設定されている場合である。ステップＳ５３においては、第１参加者の会話相手を第２参加者に設定し、処理を図１２に示した映像データ送信処理に戻す。 The process proceeds to step S53 when the second participant is set in the first utterance partner information and the participant in the first conference room is set in the second utterance partner information. In step S53, the conversation partner of the first participant is set to the second participant, and the process returns to the video data transmission process shown in FIG. 12.

ステップＳ５４においては、第１発話相手情報に、第３参加者が設定されているか否かを判断する。第１発話相手情報に、第３参加者が設定されているならば処理をステップＳ５５に進め、そうでなければ処理を図１２に示した映像データ送信処理に戻す。ステップＳ５５においては、第３発話相手情報に第１参加者が設定されているか否かを判断する。第３発話相手情報に第１参加者が設定されているならば処理をステップＳ５６に進め、そうでなければ処理を図１２に示した映像データ送信処理に戻す。 In step S54, it is determined whether the third participant is set in the first utterance partner information. If the third participant is set in the first utterance partner information, the process proceeds to step S55; otherwise, the process returns to the video data transmission process shown in FIG. 12. In step S55, it is determined whether the first participant is set in the third utterance partner information. If the first participant is set in the third utterance partner information, the process proceeds to step S56; otherwise, the process returns to the video data transmission process shown in FIG. 12.

処理がステップＳ５６に進む場合、第１発話相手情報に第３会議室の参加者が設定されている場合であって、かつ、第３発話相手情報に第１会議室の参加者が設定されている場合である。ステップＳ５６においては、第１参加者の会話相手を第３参加者に設定し、処理を図１２に示した映像データ送信処理に戻す。 The process proceeds to step S56 when the participant in the third conference room is set in the first utterance partner information and the participant in the first conference room is set in the third utterance partner information. In step S56, the conversation partner of the first participant is set to the third participant, and the process returns to the video data transmission process shown in FIG. 12.
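
The conversation partner detection of FIG. 14 amounts to checking that two participants name each other as utterance partners. The following Python sketch illustrates this under stated assumptions: the function name, the string labels, and the return of `None` when no mutual pair is found are hypothetical, since the description only says that the process returns in that case.

```python
from typing import Optional


def detect_conversation_partner(first_info: Optional[str],
                                second_info: Optional[str],
                                third_info: Optional[str]) -> Optional[str]:
    """Return the conversation partner of the first participant.

    first_info  -- first utterance partner information (generated locally)
    second_info -- second utterance partner information (received from PC 100A)
    third_info  -- third utterance partner information (received from PC 100B)
    """
    # First check and step S52: the first and second participants address each other.
    if first_info == "second" and second_info == "first":
        return "second"  # step S53
    # Steps S54 and S55: the first and third participants address each other.
    if first_info == "third" and third_info == "first":
        return "third"   # step S56
    # Assumption: no conversation partner when neither pair is mutual.
    return None
```

A one-sided utterance, for example `first_info == "second"` while `second_info` is blank, therefore does not count as a conversation.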

図１５は、選択画像設定処理の流れの一例を示すフローチャートである。選択画像設定処理は、図１２のステップＳ３０において実行される処理である。図１５を参照して、ＣＰＵ１１は、会話相手は第２参加者か否かを判断する（ステップＳ６１）。会話相手が第２参加者ならば処理をステップＳ６２に進めるが、そうでなければ処理をステップＳ６５に進める。 FIG. 15 is a flowchart illustrating an example of the flow of the selected image setting process. The selected image setting process is executed in step S30 of FIG. 12. Referring to FIG. 15, the CPU 11 determines whether the conversation partner is the second participant (step S61). If the conversation partner is the second participant, the process proceeds to step S62; otherwise, the process proceeds to step S65.

ステップＳ６２においては、タイマＢをリセットする。タイマＢは、第１参加者の会話相手が第３参加者である時間を計時する。タイマＢをリセットすることによって、タイマＢ値は、０となり、その後タイマＢは計時を開始する。次のステップＳ６３においては、第１選択画像データに第１カメラ３５が出力する第１画像データを設定する。次のステップＳ６４においては、第２選択画像データ設定処理を実行し、処理を図１２に示した映像データ送信処理に戻す。第２選択画像データ設定処理は、後述するが、第２選択画像データに第１〜第３画像データのいずれかを設定する処理である。 In step S62, timer B is reset. Timer B measures the time during which the conversation partner of the first participant is the third participant. Resetting timer B sets the timer B value to 0, after which timer B starts timing. In the next step S63, the first image data output from the first camera 35 is set as the first selected image data. In the next step S64, the second selected image data setting process is executed, and the process returns to the video data transmission process shown in FIG. 12. The second selected image data setting process, described later, sets one of the first to third image data as the second selected image data.

ステップＳ６５においては、会話相手は第３参加者か否かを判断する。会話相手が第３参加者ならば処理をステップＳ６６に進めるが、そうでなければ処理をステップＳ６９に進める。 In step S65, it is determined whether or not the conversation partner is a third participant. If the conversation partner is the third participant, the process proceeds to step S66; otherwise, the process proceeds to step S69.

ステップＳ６６においては、タイマＡをリセットする。タイマＡは、第１参加者の会話相手が第２参加者である時間を計時する。タイマＡをリセットすることによって、タイマＡ値は、０となり、その後タイマＡは計時を開始する。次のステップＳ６７においては、第２選択画像データに第１カメラ３５が出力する第１画像データを設定する。次のステップＳ６８においては、第１選択画像データ設定処理を実行し、処理を図１２に示した映像データ送信処理に戻す。第１選択画像データ設定処理は、後述するが、第１選択画像データに第１〜第３画像データのいずれかを設定する処理である。 In step S66, timer A is reset. Timer A measures the time during which the conversation partner of the first participant is the second participant. Resetting timer A sets the timer A value to 0, after which timer A starts timing. In the next step S67, the first image data output from the first camera 35 is set as the second selected image data. In the next step S68, the first selected image data setting process is executed, and the process returns to the video data transmission process shown in FIG. 12. The first selected image data setting process, described later, sets one of the first to third image data as the first selected image data.

ステップＳ６９においては、タイマＡをリセットする。次のステップＳ７０においては、タイマＢをリセットする。そして、第１選択画像データに第１カメラ３５が出力する第１画像データを設定し（ステップＳ７１）、第２選択画像データに第１カメラ３５が出力する第１画像データを設定し（ステップＳ７２）、処理を図１２に示した映像データ送信処理に戻す。 In step S69, timer A is reset. In the next step S70, timer B is reset. Then, the first image data output from the first camera 35 is set as the first selected image data (step S71), the first image data output from the first camera 35 is set as the second selected image data (step S72), and the process returns to the video data transmission process shown in FIG. 12.
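
The three branches of FIG. 15 (steps S61 to S72) can be summarized in a short Python sketch. All names here are hypothetical: `Selection`, `set_selected_images`, and the two callables standing in for the sub-processes of FIG. 16 and FIG. 17 are assumptions for illustration, and timer resets are only recorded as labels rather than performed on real timers.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Selection:
    """Result of one pass through the selected image setting process."""
    first_selected: str = "first"   # image data sent to the second conference room
    second_selected: str = "first"  # image data sent to the third conference room
    resets: List[str] = field(default_factory=list)  # timers reset in this pass


def set_selected_images(partner: Optional[str],
                        set_first_selected: Callable[[], str],
                        set_second_selected: Callable[[], str]) -> Selection:
    sel = Selection()
    if partner == "second":                          # step S61
        sel.resets.append("B")                       # step S62
        sel.first_selected = "first"                 # step S63
        sel.second_selected = set_second_selected()  # step S64 (FIG. 16)
    elif partner == "third":                         # step S65
        sel.resets.append("A")                       # step S66
        sel.second_selected = "first"                # step S67
        sel.first_selected = set_first_selected()    # step S68 (FIG. 17)
    else:
        sel.resets.extend(["A", "B"])                # steps S69 and S70
        sel.first_selected = "first"                 # step S71
        sel.second_selected = "first"                # step S72
    return sel
```

The design point the sketch makes visible is that the conversation partner always receives the front-facing first image data, while the image sent to the other conference room is delegated to a separate sub-process.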

図１６は、第２選択画像データ設定処理の流れの一例を示すフローチャートである。第２選択画像データ設定処理は、図１５のステップＳ６４において実行される処理である。図１６を参照して、ＣＰＵ１１は、タイマＡ値がしきい値Ｔ１以上か否かを判断する（ステップＳ８１）。タイマＡ値がしきい値Ｔ１以上ならば処理をステップＳ８２に進めるが、そうでなければ処理をステップＳ８８に進める。タイマＡ値は、第１参加者の会話相手が第２参加者である時間を示す。 FIG. 16 is a flowchart illustrating an example of the flow of the second selected image data setting process. The second selected image data setting process is executed in step S64 of FIG. 15. Referring to FIG. 16, the CPU 11 determines whether the timer A value is equal to or greater than the threshold value T1 (step S81). If the timer A value is equal to or greater than the threshold value T1, the process proceeds to step S82; otherwise, the process proceeds to step S88. The timer A value indicates the time during which the conversation partner of the first participant is the second participant.

ステップＳ８２においては、第１参加者が発話しているか否かを判断する。マイクロホン３１が出力する音データの音声レベルをしきい値と比較することによって、発話しているか否かを判断する、第１参加者が発話しているならば処理をステップＳ８３に進めるが、そうでなければ処理をステップＳ８８に進める。 In step S82, it is determined whether or not the first participant is speaking. By comparing the sound level of the sound data output from the microphone 31 with a threshold value, it is determined whether or not the speaker is speaking. If the first participant is speaking, the process proceeds to step S83. Otherwise, the process proceeds to step S88.

ステップＳ８３においては、第２選択画像データに第１カメラ３５が出力する第１画像データを設定する。そして、ステップＳ８４においては、タイマＡ値が予め定められたしきい値Ｔ２以上か否かを判断する。しきい値Ｔ２は、しきい値Ｔ１よりも大きな値である。タイマＡ値がしきい値Ｔ２以上ならば処理をステップＳ８５に進めるが、そうでなければ処理をステップＳ８６に進める。ステップＳ８５においては、タイマＡをリセットし、処理を選択画像設定処理に戻す。 In step S83, the first image data output from the first camera 35 is set as the second selected image data. In step S84, it is determined whether or not the timer A value is greater than or equal to a predetermined threshold value T2. The threshold value T2 is larger than the threshold value T1. If the timer A value is greater than or equal to threshold value T2, the process proceeds to step S85; otherwise, the process proceeds to step S86. In step S85, timer A is reset, and the process returns to the selected image setting process.

ステップＳ８６においては、第３参加者の反応があったか否かを判断する。第３会議室に配置されたＰＣ１００Ｂから受信される映像データに含まれる画像データを解析して、画像データに含まれる被写体の動き、または、表情を分析する。被写体の動きが検出される場合、または表示の変化を検出する場合に、反応があったと判断する。第３参加者の反応があったと判断する場合、第１画像データをＨＤＤ１７に記憶し、処理を選択画像設定処理に戻す。ステップＳ８７が実行された後に、ステップＳ８３が実行される場合、第２選択画像データに、第１カメラ３５が出力する第１画像データに代えて、ＨＤＤ１７に記憶された第１画像データを設定する。これにより、第３会議室の第３参加者に影響を与えることのできる画像を視聴させることができる。 In step S86, it is determined whether there has been a reaction from the third participant. The image data included in the video data received from the PC 100B arranged in the third conference room is analyzed to examine the movement or facial expression of the subject in the image data. A reaction is determined to have occurred when movement of the subject is detected or when a change in the display is detected. If it is determined that the third participant has reacted, the first image data is stored in the HDD 17, and the process returns to the selected image setting process. When step S83 is executed after step S87 has been executed, the first image data stored in the HDD 17 is set as the second selected image data in place of the first image data output from the first camera 35. This allows the third participant in the third conference room to view an image that can influence that participant.

また、第３参加者の反応がなかった場合、または第３参加者の反応の有無に係わらず、表示部２７に注意を喚起するメッセージ、例えば、「第３会議室に視線を向けてください。」等を表示するようにしてもよい。これにより、第１参加者は、説明や議論が第２参加者側に偏っていることを知ることができ、第３参加者に対して話しかける等の対応が可能になる。 In addition, when there is no reaction from the third participant, or regardless of whether the third participant has reacted, a message calling for attention, such as "Please look toward the third conference room.", may be displayed on the display unit 27. This lets the first participant know that the explanation or discussion is biased toward the second participant, so that the first participant can respond, for example by speaking to the third participant.

一方、ステップＳ８８においては、第３会議室の表示領域情報によって処理を分岐させる。第３会議室の表示領域情報が第１表示領域ならば処理をステップＳ８９に進めるが、第３会議室の表示領域情報が第２表示領域ならば処理をステップＳ９０に進める。第３会議室の表示領域情報は、第３会議室に配置されたＰＣ１００Ｂが、第１会議室に配置されたＰＣ１００から受信される映像データに含まれる画像データ、換言すれば第２選択画像データを、表示部２７Ｂの第１表示領域および第２表示領域のいずれに表示しているかを示す情報である。 On the other hand, in step S88, the process branches according to the display area information of the third conference room. If the display area information of the third conference room indicates the first display area, the process proceeds to step S89; if it indicates the second display area, the process proceeds to step S90. The display area information of the third conference room indicates in which of the first display area and the second display area of the display unit 27B the PC 100B arranged in the third conference room displays the image data included in the video data received from the PC 100 arranged in the first conference room, in other words, the second selected image data.

ステップＳ８９においては、第２選択画像データに第２カメラ３７が出力する第２画像データを設定し、処理を選択画像設定処理に戻す。ステップＳ９０においては、第２選択画像データに第３カメラ３９が出力する第３画像データを設定し、処理を選択画像設定処理に戻す。 In step S89, the second image data output from the second camera 37 is set in the second selected image data, and the process returns to the selected image setting process. In step S90, the third image data output from the third camera 39 is set in the second selected image data, and the process returns to the selected image setting process.

ステップＳ８５においてタイマＡがリセットされると、次にステップＳ８１が実行される場合に、処理がステップＳ８８に進む。このため、第１参加者の会話相手が第２参加者の状態がしきい値Ｔ１以上しきい値Ｔ２未満の間は、第２選択画像データに第１画像データが選択される。換言すれば、しきい値Ｔ２としきい値Ｔ１との差分の時間ＴＭ、第１画像データが選択される。さらに、タイマＡ値が、しきい値Ｔ２以上になると、タイマＡがリセットされるので、第１参加者の会話相手が第２参加者である状態がしきい値Ｔ２以上継続する場合には、タイマＡ値がしきい値Ｔ１の間は第２選択画像データに第１カメラ３５から入力される第１画像データを設定する処理（ステップＳ８３）、その後の時間ＴＭの間は第２選択画像データに第２または第３画像データを選択する処理（ステップＳ８９、ステップＳ９０）が繰り返えされる。 When timer A is reset in step S85, the process proceeds to step S88 the next time step S81 is executed. Therefore, while the state in which the first participant's conversation partner is the second participant has lasted for at least the threshold value T1 but less than the threshold value T2, the first image data is selected as the second selected image data. In other words, the first image data is selected for the time TM, which is the difference between the threshold value T2 and the threshold value T1. Furthermore, since timer A is reset when the timer A value becomes equal to or greater than the threshold value T2, if the state in which the first participant's conversation partner is the second participant continues for the threshold value T2 or longer, the process of setting the first image data input from the first camera 35 as the second selected image data while the timer A value is at the threshold value T1 (step S83) and the process of selecting the second or third image data as the second selected image data during the subsequent time TM (steps S89 and S90) are repeated.

なお、ここでは第１参加者の会話相手が第２参加者の状態がしきい値Ｔ１以上継続すると、第１画像データを、第２選択画像データに設定するようにしたが、第１参加者の会話相手が第２参加者の状態が継続する時間を、ランダムな時間としてもよい。例えば、第１参加者の発話が中断した場合とすればよい。また、第１画像データを、第２選択画像データに設定する時間を時間ＴＭとしたが、時間ＴＭは、任意に定めることができ、第３参加者が第１画像データの画像を認識できる程度に短い時間であってもよい。 Here, the first image data is set as the second selected image data once the state in which the first participant's conversation partner is the second participant has continued for the threshold value T1 or longer; however, the time for which that state must continue may instead be a random time. For example, the switch may be made when the first participant's utterance is interrupted. In addition, although the time for which the first image data is set as the second selected image data is the time TM, the time TM can be set arbitrarily and may be as short as a time that just allows the third participant to recognize the image of the first image data.
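
As a rough illustration of the step-by-step flow of FIG. 16 (steps S81 to S85 and S88 to S90), the following Python sketch selects the image for the third conference room from the timer A value. It is a simplified sketch, not the patented implementation: the reaction check and the stored image of steps S86 and S87 are omitted, the timer is modeled as an integer tick count, and the names `set_second_selected_image` and `simulate` are hypothetical.

```python
from typing import List, Tuple


def set_second_selected_image(timer_a: int, t1: int, t2: int,
                              speaking: bool,
                              third_room_area: str) -> Tuple[str, bool]:
    """Return (selected image data, whether timer A must be reset)."""
    # Steps S81 and S82: timer A has reached T1 and the first participant speaks.
    if timer_a >= t1 and speaking:
        # Step S83 selects the front view; steps S84 and S85 reset timer A at T2.
        return "first", timer_a >= t2
    # Step S88: branch on where the third conference room displays the image.
    if third_room_area == "first":
        return "second", False   # step S89: second camera 37
    return "third", False        # step S90: third camera 39


def simulate(ticks: int, t1: int, t2: int, area: str = "first") -> List[str]:
    """Run the selection once per tick while the first participant keeps speaking."""
    timer_a = 0
    selected = []
    for _ in range(ticks):
        image, reset = set_second_selected_image(timer_a, t1, t2, True, area)
        selected.append(image)
        timer_a = 0 if reset else timer_a + 1
    return selected
```

Calling `simulate` with small thresholds shows the alternation between the side view and the front view that the timer reset produces for as long as the first and second participants keep conversing.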

図１７は、第１選択画像データ設定処理の流れの一例を示すフローチャートである。第１選択画像データ設定処理は、図１５のステップＳ６８において実行される処理である。図１７を参照して、ＣＰＵ１１は、タイマＢ値がしきい値Ｔ１以上か否かを判断する（ステップＳ９１）。タイマＢ値がしきい値Ｔ１以上ならば処理をステップＳ９３に進めるが、そうでなければ処理をステップＳ９８に進める。タイマＢ値は、第１参加者の会話相手が第３参加者である時間を示す。 FIG. 17 is a flowchart illustrating an example of the flow of the first selected image data setting process. The first selected image data setting process is executed in step S68 of FIG. 15. Referring to FIG. 17, the CPU 11 determines whether the timer B value is equal to or greater than the threshold value T1 (step S91). If the timer B value is equal to or greater than the threshold value T1, the process proceeds to step S93; otherwise, the process proceeds to step S98. The timer B value indicates the time during which the conversation partner of the first participant is the third participant.

ステップＳ９２においては、第１参加者が発話しているか否かを判断する。マイクロホン３１が出力する音データの音声レベルをしきい値と比較することによって、発話しているか否かを判断する、第１参加者が発話しているならば処理をステップＳ９３に進めるが、そうでなければ処理をステップＳ９８に進める。 In step S92, it is determined whether or not the first participant is speaking. By comparing the sound level of the sound data output from the microphone 31 with a threshold value, it is determined whether or not the speaker is speaking. If the first participant is speaking, the process proceeds to step S93. Otherwise, the process proceeds to step S98.

ステップＳ９３においては、第１選択画像データに第１カメラ３５が出力する第１画像データを設定する。そして、ステップＳ９４においては、タイマＢ値が予め定められたしきい値Ｔ２以上か否かを判断する。しきい値Ｔ２は、しきい値Ｔ１よりも大きな値である。タイマＢ値がしきい値Ｔ２以上ならば処理をステップＳ９５に進めるが、そうでなければ処理をステップＳ９６に進める。ステップＳ９５においては、タイマＢをリセットし、処理を選択画像設定処理に戻す。 In step S93, the first image data output from the first camera 35 is set as the first selected image data. In step S94, it is determined whether or not the timer B value is equal to or greater than a predetermined threshold value T2. The threshold value T2 is larger than the threshold value T1. If the timer B value is greater than or equal to threshold value T2, the process proceeds to step S95; otherwise, the process proceeds to step S96. In step S95, timer B is reset, and the process returns to the selected image setting process.

ステップＳ９６においては、第２参加者の反応があったか否かを判断する。第２会議室に配置されたＰＣ１００Ａから受信される映像データに含まれる画像データを解析して、画像データに含まれる被写体の動き、または、表情を分析する。被写体の動きが検出される場合、または表示の変化を検出する場合に、反応があったと判断する。第２参加者の反応があったと判断する場合、第１画像データをＨＤＤ１７に記憶し、処理を選択画像設定処理に戻す。ステップＳ９７が実行された後に、ステップＳ９３が実行される場合、第１選択画像データに、第１カメラ３５が出力する第１画像データに代えて、ＨＤＤ１７に記憶された第１画像データを設定する。これにより、第２会議室の第２参加者に影響を与えることのできる画像を視聴させることができる。 In step S96, it is determined whether there has been a reaction from the second participant. The image data included in the video data received from the PC 100A arranged in the second conference room is analyzed to examine the movement or facial expression of the subject in the image data. A reaction is determined to have occurred when movement of the subject is detected or when a change in the display is detected. If it is determined that the second participant has reacted, the first image data is stored in the HDD 17, and the process returns to the selected image setting process. When step S93 is executed after step S97 has been executed, the first image data stored in the HDD 17 is set as the first selected image data in place of the first image data output from the first camera 35. This allows the second participant in the second conference room to view an image that can influence that participant.

また、第２参加者の反応がなかった場合、または第２参加者の反応の有無に係わらず、表示部２７に注意を喚起するメッセージ、例えば、「第２会議室に視線を向けてください。」等を表示するようにしてもよい。これにより、第１参加者は、説明や議論が第３参加者側に偏っていることを知ることができ、第２参加者に対して話しかける等の対応が可能になる。 In addition, when there is no reaction from the second participant, or regardless of whether the second participant has reacted, a message calling for attention, such as "Please look toward the second conference room.", may be displayed on the display unit 27. This lets the first participant know that the explanation or discussion is biased toward the third participant, so that the first participant can respond, for example by speaking to the second participant.

一方、ステップＳ９８においては、第２会議室の表示領域情報によって処理を分岐させる。第２会議室の表示領域情報が第１表示領域ならば処理をステップＳ９９に進めるが、第２会議室の表示領域情報が第２表示領域ならば処理をステップＳ１００に進める。第２会議室の表示領域情報は、第２会議室に配置されたＰＣ１００Ａが、第１会議室に配置されたＰＣ１００から受信される映像データに含まれる画像データ、換言すれば第１選択画像データを、表示部２７Ａの第１表示領域および第２表示領域のいずれに表示しているかを示す情報である。 On the other hand, in step S98, the process branches according to the display area information of the second conference room. If the display area information of the second conference room indicates the first display area, the process proceeds to step S99; if it indicates the second display area, the process proceeds to step S100. The display area information of the second conference room indicates in which of the first display area and the second display area of the display unit 27A the PC 100A arranged in the second conference room displays the image data included in the video data received from the PC 100 arranged in the first conference room, in other words, the first selected image data.

ステップＳ９９においては、第１選択画像データに第２カメラ３７が出力する第２画像データを設定し、処理を選択画像設定処理に戻す。ステップＳ１００においては、第１選択画像データに第３カメラ３９が出力する第３画像データを設定し、処理を選択画像設定処理に戻す。 In step S99, the second image data output from the second camera 37 is set in the first selected image data, and the process returns to the selected image setting process. In step S100, the third image data output from the third camera 39 is set in the first selected image data, and the process returns to the selected image setting process.

ステップＳ９５においてタイマＢがリセットされると、次にステップＳ９１が実行される場合に、処理がステップＳ９８に進む。このため、第１会議室の参加者の会話相手が第３会議室の参加者の状態がしきい値Ｔ１以上しきい値Ｔ２未満の間は、第１選択画像データに第１画像データが選択される。換言すれば、しきい値Ｔ２としきい値Ｔ１との差分の時間ＴＭ、第１画像データが選択される。さらに、タイマＢ値が、しきい値Ｔ２以上になると、タイマＢがリセットされるので、第１会議室の参加者の会話相手が第３会議室の参加者である状態がしきい値Ｔ２以上継続する場合には、タイマＢ値がしきい値Ｔ１の間は第１選択画像データに第１カメラ３５から入力される第１画像データを設定する処理（ステップＳ９３）、その後の時間ＴＭの間は第１選択画像データに第２または第３画像データを選択する処理（ステップＳ９９、ステップＳ１００）が繰り返えされる。 When timer B is reset in step S95, the process proceeds to step S98 the next time step S91 is executed. Therefore, while the state in which the conversation partner of the participant in the first conference room is the participant in the third conference room has lasted for at least the threshold value T1 but less than the threshold value T2, the first image data is selected as the first selected image data. In other words, the first image data is selected for the time TM, which is the difference between the threshold value T2 and the threshold value T1. Furthermore, since timer B is reset when the timer B value becomes equal to or greater than the threshold value T2, if the state in which the conversation partner of the participant in the first conference room is the participant in the third conference room continues for the threshold value T2 or longer, the process of setting the first image data input from the first camera 35 as the first selected image data while the timer B value is at the threshold value T1 (step S93) and the process of selecting the second or third image data as the first selected image data during the subsequent time TM (steps S99 and S100) are repeated.

なお、ここでは第１参加者の会話相手が第３参加者の状態がしきい値Ｔ１以上継続すると、第１画像データを、第１選択画像データに設定するようにしたが、第１参加者の会話相手が第３参加者の状態が継続する時間を、ランダムな時間としてもよい。例えば、第１参加者の発話が中断した場合とすればよい。また、第１画像データを、第１選択画像データに設定する時間を時間ＴＭとしたが、時間ＴＭは、任意に定めることができ、第２参加者が第１画像データの画像を認識できる程度に短い時間であってもよい。 Here, the first image data is set as the first selected image data once the state in which the first participant's conversation partner is the third participant has continued for the threshold value T1 or longer; however, the time for which that state must continue may instead be a random time. For example, the switch may be made when the first participant's utterance is interrupted. In addition, although the time for which the first image data is set as the first selected image data is the time TM, the time TM can be set arbitrarily and may be as short as a time that just allows the second participant to recognize the image of the first image data.

図１８は、第１参加者と第２参加者が会話しているときの第１〜第３会議室の表示状態の一例を示す図である。図１８（Ａ）は、第１会議室に配置される表示部２７の画面の一例を示す図である。図１８（Ａ）を参照して、表示部２７の第１表示領域に第２参加者の正面を撮像した画像が表示され、表示部２７の第２表示領域に第３参加者の正面を撮像した画像が表示される。図１８（Ｂ）は、第２会議室に配置される表示部２７Ａの画面の一例を示す図である。図１８（Ｂ）を参照して、表示部２７Ａの第１表示領域に第１参加者の正面を撮像した画像が表示され、表示部２７Ａの第２表示領域に第３参加者の正面を撮像した画像が表示される。図１８（Ｃ）は、第３会議室に配置される表示部２７Ｂの画面の一例を示す図である。図１８（Ｃ）を参照して、表示部２７Ｂの第１表示領域に第２参加者の右側面を撮像した画像が表示され、表示部２７Ｂの第２表示領域に第１参加者の左側面を撮像した画像が表示される。 FIG. 18 is a diagram illustrating an example of the display states in the first to third conference rooms while the first participant and the second participant are conversing. FIG. 18A is a diagram illustrating an example of the screen of the display unit 27 arranged in the first conference room. Referring to FIG. 18A, an image capturing the front of the second participant is displayed in the first display area of the display unit 27, and an image capturing the front of the third participant is displayed in the second display area of the display unit 27. FIG. 18B is a diagram illustrating an example of the screen of the display unit 27A arranged in the second conference room. Referring to FIG. 18B, an image capturing the front of the first participant is displayed in the first display area of the display unit 27A, and an image capturing the front of the third participant is displayed in the second display area of the display unit 27A. FIG. 18C is a diagram illustrating an example of the screen of the display unit 27B arranged in the third conference room. Referring to FIG. 18C, an image capturing the right side of the second participant is displayed in the first display area of the display unit 27B, and an image capturing the left side of the first participant is displayed in the second display area of the display unit 27B.

第１参加者と第２参加者が会話しているときは、第１参加者は、表示部２７の第１表示領域に表示された第２参加者と視線が合い、第２参加者は、表示部２７Ａの第１表示領域に表示された第１参加者と視線が合う。また、第３参加者は、表示部２７Ｂの第１表示領域に表示された第２参加者と、表示部２７Ｂの第２表示領域に表示された第１参加者と、が会話していることを知ることができる。 While the first participant and the second participant are conversing, the first participant makes eye contact with the second participant displayed in the first display area of the display unit 27, and the second participant makes eye contact with the first participant displayed in the first display area of the display unit 27A. In addition, the third participant can tell that the second participant displayed in the first display area of the display unit 27B and the first participant displayed in the second display area of the display unit 27B are conversing.

図１８（Ｄ）は、第１参加者と第２参加者が会話して所定時間Ｔ１経過後の第３会議室の表示状態の一例を示す第１の図である。図１８（Ｄ）は、第１参加者と第２参加者が会話して所定時間Ｔ１経過後に、第１参加者が発話しているときに表示される画面の一例を示す。図１８（Ｄ）を参照して、表示部２７Ｂの第１表示領域に第２参加者の右側面を撮像した画像が表示され、表示部２７Ｂの第２表示領域に第１参加者の正面を撮像した画像が表示される。このため、第３参加者は、発話している第１参加者が視線を自分の方に向けた画像を見ることになり、第１参加者からのノンバーバル情報が第３参加者に伝えられる。このため、第３参加者は、会議に集中することができる。 FIG. 18D is a first diagram illustrating an example of a display state of the third conference room after a predetermined time T1 has elapsed after the first participant and the second participant have a conversation. FIG. 18D shows an example of a screen displayed when the first participant speaks after the first participant and the second participant have talked and a predetermined time T1 has elapsed. Referring to FIG. 18D, an image obtained by imaging the right side surface of the second participant is displayed in the first display area of display unit 27B, and the front of the first participant is displayed in the second display area of display unit 27B. The captured image is displayed. For this reason, the third participant sees an image in which the first participant who is speaking has his gaze directed toward him, and the non-verbal information from the first participant is transmitted to the third participant. For this reason, the 3rd participant can concentrate on a meeting.

図１８（Ｅ）は、第１参加者と第２参加者が会話して所定時間Ｔ１経過後の第３会議室の表示状態の一例を示す第２の図である。図１８（Ｅ）は、第１参加者と第２参加者が会話して所定時間Ｔ１経過後に、第２参加者が発話しているときに表示される画面の一例を示す。図１８（Ｅ）を参照して、表示部２７Ｂの第１表示領域に第２参加者の正面を撮像した画像が表示され、表示部２７Ｂの第２表示領域に第１参加者の左側面を撮像した画像が表示される。このため、第３参加者は、発話している第２参加者が視線を自分の方に向けた画像を見ることになり、第２参加者からのノンバーバル情報が第３参加者に伝えられる。このため、第３参加者は、会議に集中することができる。 FIG. 18E is a second diagram illustrating an example of a display state of the third conference room after a predetermined time T1 has elapsed after the first participant and the second participant have a conversation. FIG. 18E shows an example of a screen displayed when the second participant speaks after the first participant and the second participant have talked and a predetermined time T1 has elapsed. Referring to FIG. 18E, an image obtained by imaging the front of the second participant is displayed in the first display area of display unit 27B, and the left side surface of the first participant is displayed in the second display area of display unit 27B. The captured image is displayed. For this reason, the third participant sees an image in which the second participant who is speaking has his / her line of sight directed toward himself / herself, and the non-verbal information from the second participant is transmitted to the third participant. For this reason, the 3rd participant can concentrate on a meeting.

FIG. 19 is a diagram showing an example of the display states of the first to third conference rooms while the first participant and the third participant are conversing. FIG. 19A shows an example of the screen of the display unit 27 arranged in the first conference room. Referring to FIG. 19A, an image of the second participant captured from the front is displayed in the first display area of the display unit 27, and an image of the third participant captured from the front is displayed in the second display area of the display unit 27. FIG. 19B shows an example of the screen of the display unit 27A arranged in the second conference room. Referring to FIG. 19B, an image of the first participant captured from the right side is displayed in the first display area of the display unit 27A, and an image of the third participant captured from the left side is displayed in the second display area of the display unit 27A. FIG. 19C shows an example of the screen of the display unit 27B arranged in the third conference room. Referring to FIG. 19C, an image of the second participant captured from the front is displayed in the first display area of the display unit 27B, and an image of the first participant captured from the front is displayed in the second display area of the display unit 27B.

While the first participant and the third participant are conversing, the first participant makes eye contact with the third participant displayed in the second display area of the display unit 27, and the third participant makes eye contact with the first participant displayed in the second display area of the display unit 27B. The second participant, in turn, can tell that the first participant displayed in the first display area of the display unit 27A and the third participant displayed in the second display area of the display unit 27A are conversing with each other.

FIG. 19D is a first diagram showing an example of the display state of the display unit in the second conference room after the first participant and the third participant have conversed for the predetermined time T1. It shows an example of the screen displayed while the first participant is speaking after that time has elapsed. Referring to FIG. 19D, an image of the first participant captured from the front is displayed in the first display area of the display unit 27A, and an image of the third participant captured from the left side is displayed in the second display area of the display unit 27A. The second participant therefore sees an image in which the speaking first participant directs his or her gaze toward the second participant, so nonverbal information from the first participant is conveyed to the second participant. As a result, the second participant can concentrate on the conference.

FIG. 19E is a second diagram showing an example of the display state of the display unit in the second conference room after the first participant and the third participant have conversed for the predetermined time T1. It shows an example of the screen displayed while the third participant is speaking after that time has elapsed. Referring to FIG. 19E, an image of the first participant captured from the right side is displayed in the first display area of the display unit 27A, and an image of the third participant captured from the front is displayed in the second display area of the display unit 27A. The second participant therefore sees an image in which the speaking third participant directs his or her gaze toward the second participant, so nonverbal information from the third participant is conveyed to the second participant. As a result, the second participant can concentrate on the conference.

FIG. 20 is a diagram showing an example of the display states of the first to third conference rooms while the second participant and the third participant are conversing. FIG. 20A shows an example of the screen of the display unit 27 arranged in the first conference room. Referring to FIG. 20A, an image of the second participant captured from the right side is displayed in the first display area of the display unit 27, and an image of the third participant captured from the left side is displayed in the second display area of the display unit 27. FIG. 20B shows an example of the screen of the display unit 27A arranged in the second conference room. Referring to FIG. 20B, an image of the first participant captured from the front is displayed in the first display area of the display unit 27A, and an image of the third participant captured from the front is displayed in the second display area of the display unit 27A. FIG. 20C shows an example of the screen of the display unit 27B arranged in the third conference room. Referring to FIG. 20C, an image of the second participant captured from the front is displayed in the first display area of the display unit 27B, and an image of the first participant captured from the front is displayed in the second display area of the display unit 27B.

While the second participant and the third participant are conversing, the first participant can tell that the second participant displayed in the first display area of the display unit 27 and the third participant displayed in the second display area of the display unit 27 are conversing with each other. The second participant makes eye contact with the third participant displayed in the second display area of the display unit 27A, and the third participant makes eye contact with the second participant displayed in the first display area of the display unit 27B.

FIG. 20D is a first diagram showing an example of the display state of the display unit in the first conference room after the second participant and the third participant have conversed for the predetermined time T1. It shows an example of the screen displayed while the second participant in the second conference room is speaking after that time has elapsed. Referring to FIG. 20D, an image of the second participant captured from the front is displayed in the first display area of the display unit 27, and an image of the third participant captured from the left side is displayed in the second display area of the display unit 27. The first participant therefore sees an image in which the speaking second participant directs his or her gaze toward the first participant, so nonverbal information from the second participant is conveyed to the first participant. As a result, the first participant can concentrate on the conference.

FIG. 20E is a second diagram showing an example of the display state of the display unit in the first conference room after the second participant and the third participant have conversed for the predetermined time T1. It shows an example of the screen displayed while the third participant is speaking after that time has elapsed. Referring to FIG. 20E, an image of the second participant captured from the right side is displayed in the first display area of the display unit 27, and an image of the third participant captured from the front is displayed in the second display area of the display unit 27. The first participant therefore sees an image in which the speaking third participant directs his or her gaze toward the first participant, so nonverbal information from the third participant is conveyed to the first participant. As a result, the first participant can concentrate on the conference.

As described above, in the conference system 1A according to the second embodiment, when the first participant in the first conference room is determined to be conversing with the second participant, the first image data, obtained by imaging the first participant with the first camera 35, is selected as the first selected image data. When the first participant is determined to be conversing with the third participant, either the second image data or the third image data, obtained by imaging the first participant with the second camera 37 and the third camera 39 respectively, is selected as the first selected image data. When the state in which the first participant is determined to be conversing with the third participant continues for the predetermined time T1, the first image data is selected as the first selected image data for the predetermined time T2. The selected first selected image data is transmitted to the PC 100A arranged in the second conference room.

Likewise, when the first participant in the first conference room is determined to be conversing with the third participant, the first image data obtained by imaging the first participant with the first camera 35 is selected as the second selected image data. When the first participant is determined to be conversing with the participant in the one conference room, either the second image data or the third image data, obtained by imaging the first participant with the second camera 37 and the third camera 39 respectively, is selected as the second selected image data. When the state in which the first participant is determined to be conversing with the second participant continues for the predetermined time T1, the first image data is selected as the second selected image data for the predetermined time T2. The selected second selected image data is transmitted to the PC 100B arranged in the third conference room.

Consequently, a second or third participant who views the first image data, captured in the first direction toward the participant, makes eye contact with the first participant in the first conference room, whereas a participant in the second or third conference room who views the second or third image data does not make eye contact with the participant in the first conference room. While the first participant is determined to be conversing with the third participant, the second participant does not make eye contact with the first participant; once that state has continued for the predetermined time T1, however, the first image data is transmitted for the predetermined time T2 to the PC 100A arranged in the second conference room, so the second participant makes eye contact with the first participant for the predetermined time T2. Similarly, while the first participant is determined to be conversing with the second participant, the third participant does not make eye contact with the participant in the first conference room; once that state has continued for the predetermined time T1, the first image data is transmitted for the predetermined time T2 to the PC 100B arranged in the third conference room, so the third participant makes eye contact with the first participant for the predetermined time T2. Nonverbal information can thus be conveyed at times when it is not being conveyed from the first participant to either the second or the third participant.
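The T1/T2 switching summarized above can be illustrated with a minimal Python sketch. It is not the implementation disclosed in the embodiment: the class name `ViewSelector`, the string labels, and the use of seconds are hypothetical. It also assumes that the side view resumes and the T1 count restarts once T2 has elapsed, and that the front view is sent whenever the local participant is not conversing with the other room; this passage does not state either behavior.

```python
from dataclasses import dataclass
from typing import Optional

FRONT = "first image data (front view)"
SIDE = "second or third image data (side view)"


@dataclass
class ViewSelector:
    """Selects the image data sent to ONE remote room (hypothetical helper)."""
    t1: float  # T1: how long the side view may run before the switch
    t2: float  # T2: how long the front view is then held
    side_since: Optional[float] = None   # start of the current side-view run
    front_until: Optional[float] = None  # end of the temporary front-view window

    def select(self, talking_to_other_room: bool, now: float) -> str:
        if not talking_to_other_room:
            # Conversing with this room (or, by assumption, idle): front view.
            self.side_since = None
            self.front_until = None
            return FRONT
        if self.front_until is not None:
            if now < self.front_until:
                return FRONT  # still inside the T2 window
            # Assumption: after T2 the side view resumes and T1 restarts.
            self.front_until = None
            self.side_since = now
            return SIDE
        if self.side_since is None:
            self.side_since = now
        if now - self.side_since >= self.t1:
            # Side view has lasted T1: send the front view for T2.
            self.front_until = now + self.t2
            self.side_since = None
            return FRONT
        return SIDE


selector = ViewSelector(t1=10.0, t2=3.0)
timeline = [selector.select(True, t) for t in (1.0, 5.0, 11.0, 13.0, 14.0)]
```

In this sketch one selector instance would be kept per destination room, mirroring the separate first and second selected image data described above.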

Furthermore, when the first participant is determined to be conversing with the third participant, the second image data is selected as the first selected image data if the display area information received from the PC 100A arranged in the second conference room indicates the first display area, and the third image data is selected as the first selected image data if the display area information received from the PC 100A indicates the second display area. The selected first selected image data is transmitted to the PC 100A arranged in the second conference room. Likewise, when the first participant is determined to be conversing with the second participant, the second image data is selected as the second selected image data if the display area information received from the PC 100B arranged in the third conference room indicates the first display area, and the third image data is selected as the second selected image data if the display area information received from the PC 100B indicates the second display area. The selected second selected image data is transmitted to the PC 100B arranged in the third conference room. Consequently, while the participants in the other two conference rooms are conversing, each of the first to third participants sees an image in which those two participants face each other as they converse.
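The mapping from received display area information to the side-view image can be sketched as follows. This is an illustrative Python fragment only; the enum and function names are hypothetical and do not appear in the specification.

```python
from enum import Enum


class DisplayArea(Enum):
    FIRST = 1   # first display area on the receiving room's display unit
    SECOND = 2  # second display area on the receiving room's display unit


class ImageData(Enum):
    FIRST = "first image data"    # from the first camera 35
    SECOND = "second image data"  # from the second camera 37
    THIRD = "third image data"    # from the third camera 39


def select_side_view(reported_area: DisplayArea) -> ImageData:
    """Pick the side view to send while the local participant is conversing
    with the participant in the OTHER remote room.

    `reported_area` is the display area information received from the room
    that will show the image.
    """
    if reported_area is DisplayArea.FIRST:
        return ImageData.SECOND
    return ImageData.THIRD
```

Choosing the side view from the receiver's own layout is what lets the two conversing participants appear to face each other on the third room's display.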

Furthermore, while either the second image data or the third image data is selected as the first selected image data to be transmitted to the PC 100A arranged in the second conference room, when the state in which a conversation with the third participant in the third conference room is determined to be taking place continues for the predetermined time T1, the first image data is selected as the first selected image data for the predetermined time T2, on condition that the first participant is determined to be speaking, and is transmitted to the PC 100A arranged in the second conference room. Likewise, while either the second image data or the third image data is selected as the second selected image data to be transmitted to the PC 100B arranged in the third conference room, when the state in which a conversation with the first participant is determined to be taking place continues for the predetermined time T1, the first image data is selected as the second selected image data for the predetermined time T2, on condition that the first participant is determined to be speaking, and is transmitted to the PC 100B arranged in the third conference room.

Consequently, while the participants in the first and third conference rooms are conversing, the participant in the second conference room sees an image in which the two participants face each other as they converse; once that state has continued for the predetermined time T1, if either the first or the third participant is speaking, the participant in the second conference room sees an image in which he or she makes eye contact with whichever of the two is speaking. Similarly, while the first and second participants are conversing, the third participant sees an image in which the two participants face each other; once that state has continued for the predetermined time T1, if the first or second participant is speaking, the third participant sees an image in which he or she makes eye contact with whichever of the two is speaking. Likewise, while the second and third participants are conversing, the first participant sees an image in which the two participants face each other; once that state has continued for the predetermined time T1, if the second or third participant is speaking, the first participant sees an image in which he or she makes eye contact with whichever of the two is speaking.

In short, while the participants in the other two conference rooms are conversing, each of the first to third participants sees an image in which those two participants face each other; once that conversation has continued for the predetermined time T1, he or she can see, for the predetermined time T2, an image in which he or she makes eye contact with whichever of the two is speaking.

Furthermore, a microphone 31 that collects sound is arranged in each of the first to third conference rooms. Each of the PCs 100, 100A, and 100B also controls the microphone 31 and determines, based on the sound data output by the microphone 31, whether the participant is speaking. The PC 100 determines that the first participant is speaking to the second participant when the detected gaze direction of the first participant is toward the first display area and the first participant is determined to be speaking, and determines that the first participant is speaking to the third participant when the detected gaze direction of the first participant is toward the second display area and the first participant is determined to be speaking. Because the determination relies on both the gaze direction and the fact that the participant is speaking, the conversation partner can be identified accurately.
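A minimal Python sketch of this two-condition determination is given below. The specification does not say how speech is detected from the sound data, so the RMS threshold in `is_speaking` is purely an assumption, and all function names and return labels are hypothetical.

```python
import math
from typing import Optional, Sequence


def is_speaking(samples: Sequence[float], threshold: float = 0.1) -> bool:
    """Assumed detector: treat the participant as speaking when the RMS level
    of the microphone samples exceeds a threshold (not from the source)."""
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold


def utterance_partner(gaze_area: Optional[int], speaking: bool) -> Optional[str]:
    """Combine gaze direction and speech, as the PC 100 does for the first
    participant. `gaze_area` is 1 or 2 for the display area being looked at,
    or None when the gaze is directed at neither area."""
    if not speaking:
        return None  # gaze alone is not treated as an utterance
    if gaze_area == 1:
        return "second participant"
    if gaze_area == 2:
        return "third participant"
    return None


partner = utterance_partner(gaze_area=2, speaking=is_speaking([0.4, -0.3, 0.5]))
```

Requiring both conditions avoids treating a glance at a display area, or speech directed elsewhere, as an utterance to that participant.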

In the first embodiment described above, the PCs 100 and 100A to 100F were described as examples of the control means. Needless to say, the invention can also be understood as a display control method for causing each of the PCs 100 and 100A to 100F to execute the processing shown in FIG. 4, or as a display control program for causing a computer to execute that display control method. Likewise, in the second embodiment described above, the PCs 100, 100A, and 100B were described as examples of the control means. Needless to say, the invention can also be understood as a display control method for causing each of the PCs 100, 100A, and 100B to execute the processing shown in FIGS. 12 to 17, or as a display control program for causing a computer to execute that display control method.

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

1, 1A: conference system; 3: network; 11: CPU; 13: ROM; 15: RAM; 17: HDD; 18: memory card; 19: card I/F; 21: communication I/F; 23: user interface; 25: operation unit; 27: display unit; 29: external I/F; 31: microphone; 33: speaker; 35: camera; 35, 37, 39: first to third cameras; 51, 51A: image data acquisition unit; 53, 53A: sound data acquisition unit; 55, 55A: gaze direction detection unit; 57, 57A: line-of-sight image storage unit; 59, 59A: selection unit; 61, 61A: video data transmission unit; 63, 63A: video data reception unit; 65, 65A: display control unit; 67, 67A: audio control unit; 71: utterance detection unit; 73: first selection unit; 75: gaze direction detection unit; 77: second selection unit; 79: utterance partner determination unit; 81: first conference room information transmission unit; 83: second conference room information transmission unit; 85: first conference room information reception unit; 87: second conference room information reception unit; 89: conversation partner determination unit; 91: display control unit; 93: audio control unit; 101: line-of-sight image data.

Claims

1. A conference system that generates a virtual conference room using a plurality of images obtained by imaging a plurality of geographically separated conference rooms, wherein
each of the plurality of conference rooms is provided with:
display means for displaying an image;
imaging means for imaging a participant in a subject direction from the display means toward the participant present in the conference room; and
control means for controlling the display means and the imaging means,
the plurality of conference rooms include a first conference room in which a presenter participates and at least one second conference room in which participants participate,
the control means arranged in the first conference room includes:
gaze direction detection means for detecting the gaze direction of the presenter based on the image data output by the imaging means;
line-of-sight image storage means for storing the image data output by the imaging means while the gaze direction detection means detects that the gaze direction of the presenter faces the display means;
selection means for selecting either the image data output by the imaging means or the image data stored in the line-of-sight image storage means; and
video data transmission means for transmitting video data including the image data selected by the selection means to the control means arranged in each of the at least one second conference room,
the selection means, while selecting the image data output by the imaging means, selects the stored image data when the state in which the gaze direction detected by the gaze direction detection means does not face the display means continues for a predetermined time, and
the control means arranged in the second conference room includes:
video data receiving means for receiving the video data from the control means arranged in the first conference room; and
display control means for causing the display means to display an image of the image data included in the video data received by the video data receiving means.

2. The conference system according to claim 1, wherein the selection means selects the image data output by the imaging means when the gaze direction detected by the gaze direction detection means faces the display means.

3. A conference system that generates a virtual conference room using a plurality of images obtained by imaging three geographically separated conference rooms, wherein
each of the three conference rooms is provided with:
display means including two display areas;
first imaging means for imaging a participant in a first direction from the display means side toward the participant participating in the conference room;
second imaging means for imaging the participant in a second direction intersecting the first direction;
third imaging means for imaging the participant in a third direction opposite to the second direction; and
control means for controlling the display means and the first to third imaging means,
the control means includes:
gaze direction detection means for detecting, based on the first image data output by the first imaging means, whether the gaze direction of the participant is a first gaze direction toward the first display area or a second gaze direction toward the second display area;
utterance partner determination means for determining, based on the detection result of the gaze direction detection means, the partner to whom the participant is speaking;
first selection means for selecting one piece of first selected image data from the first to third image data output by the first to third imaging means, respectively;
second selection means for selecting one piece of second selected image data from the first to third image data output by the first to third imaging means, respectively;
first conference information transmitting means for transmitting video data including the first selected image data, and first utterance partner information indicating the utterance partner determined by the utterance partner determination means, to the control means arranged in one of the other two conference rooms;
second conference information transmitting means for transmitting video data including the second selected image data, and the first utterance partner information, to the control means arranged in the other of the other two conference rooms, different from the one conference room;
first conference information receiving means for receiving, from the control means arranged in the one conference room, video data and second utterance partner information indicating the partner to whom the participant in the one conference room is speaking;
second conference information receiving means for receiving, from the control means arranged in the other conference room, video data and third utterance partner information indicating the partner to whom the participant in the other conference room is speaking;
display control means for controlling the display means so that the image data included in the video data received from the control means arranged in the one conference room is displayed in the first display area, and the image data included in the video data received from the control means arranged in the other conference room is displayed in the second display area; and
conversation partner judging means for judging, based on the first to third utterance partner information, which participant in the other two conference rooms the participant is conversing with,
the utterance partner determination means determines that the participant is speaking to the participant in the one conference room when the gaze direction detected by the gaze direction detection means is toward the first display area, and determines that the participant is speaking to the participant in the other conference room when the gaze direction detected by the gaze direction detection means is toward the second display area,
the first selection means selects the first image data when the conversation partner judging means judges that the participant is conversing with the participant in the one conference room, selects either the second image data or the third image data when the conversation partner judging means judges that the participant is conversing with the participant in the other conference room, and selects the first image data for a second predetermined time when the state in which the participant is judged to be conversing with the participant in the other conference room continues for a first predetermined time, and
the second selection means selects the first image data when the conversation partner judging means judges that the participant is conversing with the participant in the other conference room, selects either the second image data or the third image data when the conversation partner judging means judges that the participant is conversing with the participant in the one conference room, and selects the first image data for the second predetermined time when the state in which the participant is judged to be conversing with the participant in the one conference room continues for the first predetermined time.

The first imaging means is arranged so as to have an angle of view for photographing the participant from the front, the second imaging means is arranged so as to have an angle of view for photographing the participant from the right side, and the third imaging means is arranged so as to have an angle of view for photographing the participant from the left side,
The first display area and the second display area are arranged side by side,
The control means includes first display area information transmitting means for transmitting, to the control means arranged in the one conference room, display area information indicating that the display control means is displaying, in the first display area, the image data included in the video data received from the control means arranged in the one conference room, and
Second display area information transmitting means for transmitting, to the control means arranged in the other conference room, display area information indicating that the display control means is displaying, in the second display area, the image data included in the video data received from the control means arranged in the other conference room,
When the conversation partner judgment means judges that the participant is talking with a participant in the other conference room, the first selection means selects the second image data when the display area information received from the control means arranged in the one conference room indicates the first display area, and selects the third image data when the display area information received from the control means arranged in the one conference room indicates the second display area,
When the conversation partner judgment means judges that the participant is talking with a participant in the one conference room, the second selection means selects the second image data when the display area information received from the control means arranged in the other conference room indicates the first display area, and selects the third image data when the display area information received from the control means arranged in the one conference room indicates the second display area. The conference system according to claim 3.
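The choice between the two side images recited above amounts to a lookup keyed on the received display area information. The following Python sketch shows that mapping for the first selection means only. The string constants and the function name `select_side_image` are hypothetical and are not defined in the specification.

```python
FIRST_AREA = "first_display_area"
SECOND_AREA = "second_display_area"


def select_side_image(display_area_info: str) -> str:
    """Pick the side image from the display area information (illustrative)."""
    if display_area_info == FIRST_AREA:
        return "second_image_data"  # image from the right side of the participant
    if display_area_info == SECOND_AREA:
        return "third_image_data"   # image from the left side of the participant
    # Assumption: any other value is treated as an error rather than defaulted.
    raise ValueError(f"unknown display area information: {display_area_info!r}")
```

The function would be consulted only while the participant is judged to be talking with a participant in the other conference room; otherwise the first image data is selected.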

In each of the three conference rooms, a microphone that collects sound is further arranged,
The control means further controls the microphone, and further comprises utterance detection means for determining whether or not the participant is speaking based on the sound data output from the microphone,
While selecting either the second image data or the third image data, the first selection means selects the first image data for the second predetermined time when the state judged to be talking with a participant in the other conference room continues for the first predetermined time, on condition that the utterance detection means determines that the participant is speaking,
While selecting either the second image data or the third image data, the second selection means selects the first image data for the second predetermined time when the state judged to be talking with a participant in the one conference room continues for the first predetermined time, on condition that the utterance detection means determines that the participant is speaking. The conference system according to claim 3.

In each of the three conference rooms, a microphone that collects sound is further arranged,
The control means further controls the audio output device and the microphone, and further comprises utterance detection means for determining whether or not the participant is speaking based on the sound data output from the microphone,
The utterance partner determination means determines that the participant is speaking to a participant in the one conference room when the line-of-sight direction detected by the line-of-sight direction detection means is directed toward the first display area and the participant is determined to be speaking, and determines that the participant is speaking to a participant in the other conference room when the line-of-sight direction detected by the line-of-sight direction detection means is directed toward the second display area and the participant is determined to be speaking. The conference system according to claim 3.
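The combination of line-of-sight direction and utterance detection recited above can be sketched in Python as follows. The claim does not state how speech is detected from the sound data, so the RMS energy threshold in `is_speaking` is purely an assumption, as are the function names, the default threshold, and the labels `"first"`, `"second"`, `"one"`, and `"other"`.

```python
import math
from typing import Optional, Sequence


def is_speaking(samples: Sequence[float], threshold: float = 0.1) -> bool:
    """Hypothetical utterance detector: RMS energy of the samples vs. a threshold."""
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms >= threshold


def utterance_partner(gaze_area: Optional[str], speaking: bool) -> Optional[str]:
    """Return "one", "other", or None when no utterance partner is determined."""
    if not speaking:
        return None       # gaze alone is not enough under this claim
    if gaze_area == "first":
        return "one"      # looking at the first display area while speaking
    if gaze_area == "second":
        return "other"    # looking at the second display area while speaking
    return None           # speaking, but looking at neither display area
```

For example, `utterance_partner("second", is_speaking(frame))` yields `"other"` only when the hypothetical sound frame `frame` exceeds the threshold.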

In each of the three conference rooms, an audio output device that outputs audio is further arranged,
The control means further controls the audio output device, and further comprises audio control means for outputting audio to the audio output device,
The first conference information transmission means transmits, to the control means arranged in the one conference room, video data including the first selected image data and the sound data output from the microphone, together with first utterance partner information indicating the utterance partner determined by the utterance partner determination means,
The second conference information transmission means transmits, to the control means arranged in the other conference room, video data including the second selected image data and the sound data output from the microphone, together with the first utterance partner information,
The audio control means outputs, to the audio output device, the sound data included in the video data received from the control means arranged in the one conference room and the sound data included in the video data received from the control means arranged in the other conference room. The conference system according to claim 5 or 6.