JP2010154387A

JP2010154387A - Communication terminal device, communication control method, and communication control program

Info

Publication number: JP2010154387A
Application number: JP2008331984A
Authority: JP
Inventors: Katsuhiro Amano; 勝博天野
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2008-12-26
Filing date: 2008-12-26
Publication date: 2010-07-08
Anticipated expiration: 2028-12-26
Also published as: JP5151970B2

Abstract

PROBLEM TO BE SOLVED: To provide a communication terminal device that enables smooth conversation with a listening participant at a counterpart terminal, together with a communication control method and a communication control program.

SOLUTION: A terminal device 3 is connected to other terminal devices 4-6 via a network 2 and constitutes a video conference system 1. In this system, when a terminal device detects that a listening participant has nodded during a remote conference, it notifies the speaking participant's terminal device that the nod was detected. The notified terminal device displays a nodding image of the listening participant that was stored beforehand. Unlike a streaming system, this approach does not require encoding and decoding of image data, so the listener's nod can be displayed without delay. Consequently, the gap between the timing of speech and the listener's reaction is reduced, enabling smooth conversation.

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a communication terminal device capable of bidirectionally transmitting and receiving images and sound to and from a counterpart terminal, a communication control method for the communication terminal device, and a communication control program.

Conventionally, video conference systems have been known in which a plurality of communication terminal devices are connected via a network and images and sound are transmitted and received bidirectionally, so that people in remote locations can hold a conference. For example, an electronic conference apparatus and an electronic conference system are known in which a predetermined odor, sound, image, or the like is output from a terminal to improve mental stability, concentration, and motivation, thereby making the conference more efficient (see, for example, Patent Document 1). Such systems often adopt a "streaming method" for transmitting and receiving images. The streaming method plays back multimedia data such as images and sound while the data is still being received over the network.

Patent Document 1: JP-A-7-107453

However, in the streaming method described above, encoding and decoding the image data takes time when image data is exchanged with the counterpart terminal. In other words, a delay arises when the image is displayed. For example, when a listener at the counterpart terminal nods in agreement with what the speaker has said during a conference, the image data captured at that moment is encoded. The encoded image data is received by the other terminal devices via the network. Each terminal device decodes the received image data and shows it on its display. With this method, the nodding reaction is displayed later than it actually occurred. As a result, the timing of the speech and the listener's reaction can drift slightly apart, which makes it difficult to speak.

The present invention has been made to solve the above problem, and its object is to provide a communication terminal device, a communication control method, and a communication control program that enable smooth conversation with a listener at a counterpart terminal.

To achieve the above object, the communication terminal device of the invention according to claim 1 is a communication terminal device that communicates via images with a counterpart terminal over a network, and comprises: display means for displaying image data transmitted from the counterpart terminal; reaction state detection means for detecting a reaction state of a user; reaction signal transmission means for transmitting to the counterpart terminal, when the reaction state is detected by the reaction state detection means, a reaction signal indicating that the reaction state has been detected for the user; reaction signal reception means for receiving the reaction signal transmitted by the reaction signal transmission means; reaction-time image storage means for storing a reaction-time image of the counterpart user of the counterpart terminal exhibiting the reaction state; and reaction-time image display control means for displaying on the display means, when the reaction signal is received by the reaction signal reception means, the reaction-time image of the counterpart user stored in the reaction-time image storage means.

In addition to the configuration of the invention according to claim 1, the communication terminal device of the invention according to claim 2 is configured such that the display means displays a streaming image of the counterpart user that is captured at and transmitted from the counterpart terminal, and the device comprises: interrupt display means for interrupting the streaming image displayed on the display means and displaying the reaction-time image when the reaction signal is received by the reaction signal reception means; and first streaming image cut means for cutting the streaming image, starting from the time the reaction-time image is inserted by the interrupt display means, for a period corresponding to the duration of the reaction-time image.

In addition to the configuration of the invention according to claim 1, the communication terminal device of the invention according to claim 3 comprises second streaming image cut means for cutting the user's streaming image to be transmitted to the counterpart terminal by a period corresponding to the duration of the reaction-time image data transmitted by the reaction-time image data transmission means, and for transmitting the cut streaming image to the counterpart terminal.

In addition to the configuration of the invention according to any one of claims 1 to 3, the communication terminal device of the invention according to claim 4 comprises: reaction-time image data transmission means for compressing the reaction-time image data captured when the user's reaction state is detected by the reaction state detection means and transmitting it to the counterpart terminal; reaction-time image data reception means for receiving the reaction-time image data transmitted by the reaction-time image data transmission means of the counterpart terminal; and decompression storage processing means for decompressing the reaction-time image data received by the reaction-time image data reception means and storing it in the reaction-time image storage means.

In addition to the configuration of the invention according to any one of claims 1 to 4, the communication terminal device of the invention according to claim 5 comprises: terminal specifying means for specifying, when the device is connected to a plurality of counterpart terminals via the network, one terminal from among the plurality of counterpart terminals, based on a predetermined condition, at which the reaction state of the counterpart user is to be detected; notification signal transmission means for transmitting, to the one terminal specified by the terminal specifying means, a notification signal notifying it that it has been specified as the terminal that detects the reaction state; and notification signal reception means for receiving the notification signal transmitted from a counterpart terminal, wherein the reaction state detection means detects the reaction state of the user when the notification signal is received by the notification signal reception means.

In addition to the configuration of the invention according to claim 5, the communication terminal device of the invention according to claim 6 comprises: CPU load status detection means for detecting, via the network, the CPU load status of each counterpart terminal; and CPU load status storage means for storing the CPU load status detected for each counterpart terminal by the CPU load status detection means, wherein the terminal specifying means has a first predetermined condition under which it refers to the CPU load status stored in the CPU load status storage means and specifies the counterpart terminal with the lowest CPU load as the one terminal.

In addition to the configuration of the invention according to claim 5, the communication terminal device of the invention according to claim 7 comprises: transmission time detection means for detecting the data transmission time in communication with each counterpart terminal via the network; and transmission time storage means for storing the detection result of the transmission time detection means, wherein the terminal specifying means has a second predetermined condition under which it refers to the detection result stored in the transmission time storage means and specifies the counterpart terminal with the shortest transmission time as the one terminal.

In addition to the configuration of the invention according to claim 5, the communication terminal device of the invention according to claim 8 comprises: login count detection means for detecting, for each counterpart terminal, the number of counterpart users who have logged in; and login count storage means for storing the login count detected by the login count detection means, wherein the terminal specifying means has a third predetermined condition under which it specifies, from the per-terminal login counts stored in the login count storage means, the counterpart terminal with the largest login count as the one terminal.

In addition to the configuration of the invention according to claim 5, the communication terminal device of the invention according to claim 9 comprises: utterance detection means for detecting utterances of the counterpart users; and elapsed time measurement means for measuring, for each counterpart user, the time elapsed since the utterance detected by the utterance detection means, wherein the terminal specifying means has a fourth predetermined condition under which it specifies, as the one terminal, the counterpart terminal of the counterpart user whose elapsed time measured by the elapsed time measurement means is the shortest.

In addition to the configuration of the invention according to any one of claims 1 to 9, the communication terminal device of the invention according to claim 10 is characterized in that the reaction state detection means detects, as the reaction state, a nodding state in which the user's head swings in a predetermined direction in a nod.

In addition to the configuration of the invention according to any one of claims 1 to 9, the communication terminal device of the invention according to claim 11 is characterized in that the reaction state detection means detects, as the reaction state, a refusal state in which the user's head swings sideways in refusal.

In addition to the configuration of the invention according to any one of claims 1 to 9, the communication terminal device of the invention according to claim 12 comprises operation means that the user operates when the user is convinced, and is characterized in that the reaction state detection means detects the user's convinced state as the reaction state by detecting an operation of the operation means.

In addition to the configuration of the invention according to any one of claims 1 to 12, the communication terminal device of the invention according to claim 13 comprises: first-reaction determination means for determining whether the user's reaction state detected by the reaction state detection means is the first occurrence; and reaction-time image storage processing means for storing the reaction-time image in the reaction-time image storage means when the first-reaction determination means determines that the reaction state is the first occurrence, wherein the reaction-time image data transmission means transmits the reaction-time image data stored in the reaction-time image storage means to the counterpart terminal in a compressed state.

In addition to the configuration of the invention according to any one of claims 1 to 13, the communication terminal device of the invention according to claim 14 comprises: reaction-time image storage determination means for determining, when the reaction signal is received by the reaction signal reception means, whether the reaction-time image data of the counterpart user is stored in the reaction-time image storage means; and substitute image display control means for displaying on the display means, when the reaction-time image storage determination means determines that the reaction-time image is not stored, a substitute image that indicates by characters, figures, symbols, or the like, in place of the reaction-time image, that the counterpart user is exhibiting the reaction state.

The communication control method of the invention according to claim 15 is a communication control method for a communication terminal device that communicates via images with a counterpart terminal over a network, and comprises: an image data reception step of receiving image data transmitted from the counterpart terminal; a reaction state detection step of detecting a reaction state of a user; a reaction signal transmission step of transmitting to the counterpart terminal, when the reaction state is detected in the reaction state detection step, a reaction signal indicating that the reaction state has been detected for the user; a reaction signal reception step of receiving the reaction signal transmitted in the reaction signal transmission step; and a reaction-time image display control step of displaying, when the reaction signal is received in the reaction signal reception step, the reaction-time image of the counterpart user, which is stored in reaction-time image storage means for storing a reaction-time image of the counterpart user of the counterpart terminal exhibiting the reaction state, on display means that displays the image data received in the image data reception step.

The communication control program of the invention according to claim 16 causes a computer to execute the processing steps of the communication control method according to claim 15.

In the communication terminal device of the invention according to claim 1, communication via images can be performed with the counterpart terminal. Image data transmitted from the counterpart terminal is displayed on the display means. The user's reaction state is detected by the reaction state detection means. When the user's reaction state is detected, the reaction signal transmission means transmits a reaction signal to the counterpart terminal. Meanwhile, a reaction-time image of the counterpart user exhibiting the reaction state is stored in the reaction-time image storage means. When the reaction signal is received by the reaction signal reception means, the reaction-time image display control means causes the counterpart user's reaction-time image stored in the reaction-time image storage means to be displayed on the display means. In other words, unlike the streaming method, the counterpart's reaction-time image is kept in the reaction-time image storage means and is displayed when a reaction occurs, so encoding and decoding of image data are unnecessary. The gap between the timing of speech and the listener's reaction therefore becomes small, and the user can converse smoothly with the listener at the counterpart terminal.
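The receiving side of this mechanism can be illustrated with a minimal Python sketch. The class name `ReceivingTerminal`, its methods, and the use of a dictionary as the reaction-time image storage are all assumptions made for illustration; the patent text does not prescribe any data structure or API.

```python
from dataclasses import dataclass, field


@dataclass
class ReceivingTerminal:
    """Hypothetical model of the terminal on the speaker's side."""

    # Reaction-time image storage: counterpart user id -> stored clip (opaque bytes).
    reaction_images: dict = field(default_factory=dict)
    # Everything "shown" on the display, recorded here so it can be inspected.
    displayed: list = field(default_factory=list)

    def store_reaction_image(self, user_id: str, clip: bytes) -> None:
        """Keep a clip of the counterpart user reacting, ahead of time."""
        self.reaction_images[user_id] = clip

    def on_reaction_signal(self, user_id: str) -> bool:
        """Show the stored clip for user_id; return False if none is stored."""
        clip = self.reaction_images.get(user_id)
        if clip is None:
            return False
        # No decoding of a live stream is involved: the clip is already local.
        self.displayed.append((user_id, clip))
        return True


terminal = ReceivingTerminal()
terminal.store_reaction_image("listener-4", b"nod-clip")
shown = terminal.on_reaction_signal("listener-4")
```

Only the small reaction signal crosses the network when the listener reacts; the image shown comes from local storage, which is the source of the reduced delay described above.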

In the communication terminal device of the invention according to claim 2, in addition to the effect of the invention according to claim 1, the display means displays a streaming image of the counterpart user that is captured at and transmitted from the counterpart terminal. When the reaction signal is received by the reaction signal reception means, the interrupt display means inserts the reaction-time image into the streaming image being displayed on the display means. Furthermore, starting from the time the reaction-time image is inserted, the first streaming image cut means cuts the streaming image for a period corresponding to the duration of the reaction-time image. As a result, the image of the reaction is not displayed twice on the display means, which avoids an unnatural impression.
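The cutting step can be sketched as a simple filter over timestamped frames. This is a hypothetical illustration: the function name `cut_stream`, the representation of the stream as `(timestamp, frame)` pairs, and the half-open cut interval are assumptions, not details taken from the patent.

```python
def cut_stream(frames, interrupt_time, clip_duration):
    """Drop streaming frames in [interrupt_time, interrupt_time + clip_duration).

    frames is a list of (timestamp_seconds, frame) pairs. The stored
    reaction-time clip is assumed to be shown during the cut interval.
    """
    end = interrupt_time + clip_duration
    return [(t, f) for (t, f) in frames if not (interrupt_time <= t < end)]


frames = [(0.0, "f0"), (0.5, "f1"), (1.0, "f2"), (1.5, "f3"), (2.0, "f4")]
kept = cut_stream(frames, interrupt_time=0.5, clip_duration=1.0)
```

With a one-second clip inserted at 0.5 s, the frames at 0.5 s and 1.0 s are dropped and streaming resumes with the frame at 1.5 s.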

In the communication terminal device of the invention according to claim 3, in addition to the effect of the invention according to claim 1, the second streaming image cut means cuts the user's streaming image to be transmitted to the counterpart terminal by a period corresponding to the duration of the reaction-time image data transmitted by the reaction-time image data transmission means, and transmits the result to the counterpart terminal. As a result, the image of the reaction is not displayed twice on the display means, which avoids an unnatural impression.

In the communication terminal device of the invention according to claim 4, in addition to the effect of the invention according to any one of claims 1 to 3, the reaction-time image data captured when the user's reaction state is detected by the reaction state detection means is compressed by the reaction-time image data transmission means and transmitted to the counterpart terminal. Reaction-time image data transmitted from the counterpart terminal is received by the reaction-time image data reception means. The received reaction-time image data is decompressed by the decompression storage processing means and stored in the reaction-time image storage means. Because the counterpart terminal's reaction-time image can thus be stored early in the conversation, reaction-time images can be displayed promptly on the display means for the rest of the conversation.
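The compress, transmit, decompress, and store sequence can be sketched as follows. The patent does not name a compression scheme; Python's standard `zlib` module is used here purely as a stand-in, and the function names and the dictionary used as storage are hypothetical.

```python
import zlib


def pack_reaction_clip(raw: bytes) -> bytes:
    """Sender side: compress the captured reaction-time image data."""
    return zlib.compress(raw)


def unpack_and_store(storage: dict, user_id: str, payload: bytes) -> None:
    """Receiver side: decompress once, then keep the result for later display."""
    storage[user_id] = zlib.decompress(payload)


raw_clip = b"\x00\x01" * 1000          # placeholder for captured image data
payload = pack_reaction_clip(raw_clip)  # what would travel over the network
storage = {}
unpack_and_store(storage, "listener-4", payload)
```

Decompression happens once, at storage time, so later reaction signals can be answered from `storage` without any further decoding work.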

In the communication terminal device of the invention according to claim 5, in addition to the effect of the invention according to any one of claims 1 to 4, when the device is connected to a plurality of counterpart terminals, the terminal specifying means specifies, from among those counterpart terminals and based on a predetermined condition, one terminal at which the reaction state of the counterpart user is to be detected. Once the terminal is specified, the notification signal transmission means transmits to that terminal a notification signal notifying it that it has been specified as the terminal that detects the reaction state. A notification signal transmitted from a counterpart terminal is received by the notification signal reception means. The reaction state detection means detects the user's reaction state when the notification signal is received by the notification signal reception means. That is, even when the device is connected to a plurality of counterpart terminals via the network, a single terminal is specified for detecting the reaction state, so the communication load and delay time on the network can be kept to a minimum.

In the communication terminal device of the invention according to claim 6, in addition to the effect of the invention according to claim 5, the CPU load status of each counterpart terminal is detected via the network by the CPU load status detection means. The detected CPU load status is stored for each counterpart terminal in the CPU load status storage means. The terminal specifying means has a first predetermined condition under which it refers to the CPU load status stored in the CPU load status storage means and specifies the counterpart terminal with the lowest CPU load as the one terminal. This keeps the impact of the load caused by detecting the user's reaction state to a minimum.

In the communication terminal device of the invention according to claim 7, in addition to the effect of the invention according to claim 5, the data transmission time in communication with each counterpart terminal via the network is detected by the transmission time detection means. The detection result is stored in the transmission time storage means. The terminal specifying means has a second predetermined condition under which it refers to the detection result stored in the transmission time storage means and specifies the counterpart terminal with the shortest transmission time as the one terminal. Because image data can then be transmitted quickly, the image of the reaction can be displayed promptly on the display means.

In the communication terminal device of the invention according to claim 8, in addition to the effect of the invention according to claim 5, the number of logged-in counterpart users is detected for each counterpart terminal by the login count detection means. The detected login count is stored in the login count storage means. The terminal specifying means has a third predetermined condition under which it specifies, from the per-terminal login counts stored in the login count storage means, the counterpart terminal with the largest login count as the one terminal. Because more reaction states can then be detected at a single site (a single terminal), the conversation can proceed more smoothly.

In the communication terminal device of the invention according to claim 9, in addition to the effect of the invention according to claim 5, utterances of the counterpart users are detected by the utterance detection means. The elapsed time measurement means then measures, for each counterpart user, the time elapsed since the utterance detected by the utterance detection means. The terminal specifying means has a fourth predetermined condition under which it specifies, as the one terminal, the counterpart terminal of the counterpart user whose measured elapsed time is the shortest. This allows the terminal of the participant who spoke most recently to be specified with priority.
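The four predetermined conditions of claims 6 to 9 each reduce to choosing a minimum or maximum over per-terminal measurements. The following Python sketch assumes a hypothetical `TerminalStatus` record and condition names (`"cpu"`, `"transmission"`, `"logins"`, `"recent_speaker"`); the field names, units, and sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TerminalStatus:
    name: str
    cpu_load: float              # fraction of CPU in use, 0.0 to 1.0
    transmission_ms: float       # measured data transmission time
    logged_in: int               # number of logged-in users at the terminal
    seconds_since_speech: float  # time elapsed since the last utterance


# One selector per predetermined condition (claims 6, 7, 8, and 9 in order).
SELECTORS = {
    "cpu": lambda ts: min(ts, key=lambda t: t.cpu_load),
    "transmission": lambda ts: min(ts, key=lambda t: t.transmission_ms),
    "logins": lambda ts: max(ts, key=lambda t: t.logged_in),
    "recent_speaker": lambda ts: min(ts, key=lambda t: t.seconds_since_speech),
}


def pick_terminal(terminals, condition):
    """Return the single terminal that should detect reaction states."""
    return SELECTORS[condition](terminals)


terminals = [
    TerminalStatus("terminal-4", 0.70, 40.0, 1, 5.0),
    TerminalStatus("terminal-5", 0.20, 90.0, 3, 60.0),
    TerminalStatus("terminal-6", 0.50, 15.0, 2, 120.0),
]
```

For these sample values, `"cpu"` and `"logins"` both select terminal-5, `"transmission"` selects terminal-6, and `"recent_speaker"` selects terminal-4. How ties are broken is not addressed in the text above; `min` and `max` simply return the first matching entry.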

In the communication terminal device of the invention according to claim 10, in addition to the effect of the invention according to any one of claims 1 to 9, the reaction state detection means detects, as the reaction state, a nodding state in which the user's head swings in a predetermined direction in a nod. This lets the speaker recognize the listener's nod promptly, so the speaker is not made uneasy by a delayed reaction from the listener. A smooth conversation between speaker and listener is therefore possible.

In the communication terminal device of the invention according to claim 11, in addition to the effect of the invention according to any one of claims 1 to 9, the reaction state detection means detects, as the reaction state, a refusal state in which the user's head swings sideways in refusal. This lets the speaker recognize the listener's refusal promptly, so the speaker is not made uneasy by a delayed reaction from the listener.

In the communication terminal device of the invention according to claim 12, in addition to the effect of the invention according to any one of claims 1 to 9, the user operates the operation means when the user is convinced. The reaction state detection means detects the user's convinced state as the reaction state by detecting the operation of the operation means. This lets the speaker recognize promptly that the listener is convinced, so the speaker is not made uneasy by a delayed reaction from the listener.

In the communication terminal device of the invention according to claim 13, in addition to the effect of the invention according to any one of claims 1 to 12, the first-reaction determination means determines whether the user's reaction state detected by the reaction state detection means is the first occurrence. When the reaction state is determined to be the first occurrence, it is highly likely that no reaction-time image is yet stored in the reaction-time image storage means. The reaction-time image storage processing means therefore stores the reaction-time image in the reaction-time image storage means at that point, so the reaction-time image can be captured and stored during the conversation. In other words, there is no need to store a reaction-time image in the reaction-time image storage means in advance.

In the communication terminal device of the invention according to claim 14, in addition to the effects of the invention according to any one of claims 1 to 13, when a reaction signal is received by the reaction signal receiving means, the reaction image storage determination means determines whether a reaction image of the counterpart user is stored in the reaction image storage means. When it is determined that no reaction image is stored, the substitute image display control means causes the display means to show, in place of the reaction image, a substitute image that uses characters, figures, symbols, or the like to indicate that the counterpart user is showing the reaction state. Thus, even when no reaction image is stored in the reaction image storage means, the substitute image can be displayed instead, so a smooth conversation can still be provided.

In the communication control method of the invention according to claim 15, image data transmitted from the counterpart terminal is first received in the image data receiving step. Next, the user's reaction state is detected in the reaction state detection step. When a reaction state is detected in the reaction state detection step, a reaction signal indicating that the user's reaction state has been detected is transmitted to the counterpart terminal in the reaction signal transmission step. A reaction signal transmitted from the counterpart terminal is received in the reaction signal receiving step. When the reaction signal is received, the reaction image of the counterpart user stored in the reaction image storage means is displayed on the display means in the reaction image display control step. That is, unlike a streaming scheme, the counterpart's reaction image is stored in the reaction image storage means in advance and is displayed on the display means when a reaction occurs, so no encoding or decoding of image data is required at that moment. The gap between the timing of the speech and the listener's reaction therefore becomes small, and the speaker can converse smoothly with the listener at the counterpart terminal.
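The receiving side of this method can be sketched as follows. This is only an illustrative sketch under stated assumptions: the class `ReactionDisplay`, its method `on_reaction_signal`, and the use of a file name as a stand-in for stored image data are all hypothetical and do not appear in the specification. The fallback branch follows the substitute-image behavior described for claim 14.

```python
from dataclasses import dataclass, field


@dataclass
class ReactionDisplay:
    """Receiving-side handling of reaction signals (illustrative only)."""

    # Terminal ID -> pre-stored reaction clip (a file name stands in for image data).
    stored_clips: dict = field(default_factory=dict)
    # Stand-in for the display means: a log of what would be shown.
    shown: list = field(default_factory=list)

    def on_reaction_signal(self, sender_id: str) -> str:
        """Show the stored clip for the sender, or a substitute mark if none exists."""
        clip = self.stored_clips.get(sender_id)
        if clip is not None:
            # No image data crosses the network at this point; only the signal did.
            action = f"play {clip}"
        else:
            action = f"show substitute mark for {sender_id}"
        self.shown.append(action)
        return action


display = ReactionDisplay(stored_clips={"0002": "bbb.avc"})
print(display.on_reaction_signal("0002"))
print(display.on_reaction_signal("0003"))
```

The point of the design is that only a small signal is transmitted when the reaction occurs, while the image itself is already held locally.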

The communication control program of the invention according to claim 16 causes a computer to execute the processing steps of the communication control method according to claim 15, thereby providing the effects of the invention according to claim 15.

The terminal device 3, which is a first embodiment of the present invention, is described below with reference to the drawings. First, the configuration of the video conference system 1, of which the terminal device 3 is a component, is described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the video conference system 1.

The video conference system 1 includes a network 2 and a plurality of terminal devices 3, 4, 5, and 6 that are connected to one another via the network 2, one at each site. In the video conference system 1, a remote conference is held by exchanging images and audio among the terminal devices 3, 4, 5, and 6 via the network 2.

In this embodiment, when a particular terminal device detects during a remote conference that a listener is nodding, it notifies the terminal device on the speaker's side that the listener's nod has been detected. The notified terminal device displays a nod image of the listener that was stored in advance. Because this scheme requires no encoding or decoding of the nod image at that moment, its distinguishing feature is that the listener's nod can be displayed promptly.

Next, the electrical configuration of the terminal device 3 is described with reference to FIG. 2. FIG. 2 is a block diagram showing the electrical configuration of the terminal device 3. Since the terminal devices 3 to 6 all have the same configuration, only the terminal device 3 is described here, and description of the other terminal devices 4 to 6 is omitted.

The terminal device 3 is provided with a CPU 20 serving as a controller that controls the terminal device 3. Connected to the CPU 20 are a ROM 21 that stores the BIOS and the like, a RAM 22 that temporarily stores various data, and an I/O interface 30 that mediates data transfer. A hard disk drive 31 (hereinafter, HDD 31) having various storage areas is connected to the I/O interface 30.

Also connected to the I/O interface 30 are a communication device 25 for communicating with the network 2, a mouse 27, a video controller 23, a key controller 24, a card reader control unit 32, a camera 34 for capturing images of the user, a microphone 35 for capturing the user's voice, and a CD-ROM drive 26. A display 28 is connected to the video controller 23. A keyboard 29 is connected to the key controller 24. Connected to the card reader control unit 32 is a card reader 33 that reads the user ID, which identifies the user, stored on an identification card (not shown) held by each user.

A CD-ROM 114 inserted into the CD-ROM drive 26 stores the main program of the terminal device 3, the communication control program of the present invention, and the like. When the CD-ROM 114 is installed, these programs are set up from the CD-ROM 114 onto the HDD 31 and stored in the program storage area 315 (see FIG. 3) described later.

Next, the storage areas of the HDD 31 are described with reference to FIG. 3. FIG. 3 is a conceptual diagram showing the storage areas of the HDD 31. The HDD 31 is provided with at least the following: a login table storage area 311 that stores a login table 3111 (see FIG. 4) for managing the users logged in to the network 2; a terminal status table storage area 312 that stores a terminal status table 3121 (see FIG. 5), which manages the terminal IDs of the terminal devices 3 to 6 connected to the network 2 and holds the operating status of each connected terminal device; a nod image data table storage area 313 that stores a nod image data table 3131 (see FIG. 6), which holds and manages the image data captured at each of the terminal devices 3 to 6 when a user nods (hereinafter, nod image data); a camera image data storage area 314 that stores camera images captured by the camera 34; a program storage area 315 that stores various programs; an other information storage area 316; a waveform pattern storage area 317 that stores in advance the waveform pattern of a user's nod; and a content image storage area 318 that stores content images used to elicit a nod.

The program storage area 315 stores the main program of the terminal device 3, the communication control program for holding a remote conference with the other terminal devices 4, 5, and 6, and the like. The other information storage area 316 stores other information used by the terminal device 3. When the terminal device 3 is a dedicated machine without the HDD 31, the programs are stored in the ROM 21.

Next, the login table 3111 is described with reference to FIG. 4. FIG. 4 is a conceptual diagram of the login table 3111. The login table 3111 stores the user ID of each user logged in to the network 2 and the terminal ID of the terminal device 3 to 6 at which that user ID was registered. Specifically, the user ID column holds the user ID stored on the identification card (not shown) read by the card reader 33. The terminal ID column holds the terminal ID of the terminal device 3 to 6 that transmitted that user ID. The terminal ID is, for example, the MAC address of the terminal device. The user IDs and terminal IDs of the other sites are included in the terminal information transmitted from the other terminal devices 4 to 6 connected via the network 2, and are registered in the login table 3111 on the basis of that terminal information.

For example, as shown in FIG. 4, when Mr. B, the user of the terminal device 4, logs in, he has the card reader 33 read his own identification card. A login signal is then transmitted to the counterpart terminal devices, notifying them of the login. In this case, the user ID "B0001" stored on the identification card and the terminal ID "0002" of the terminal device 4 that transmitted the user ID are stored in the login table 3111. Entries for the other users are made in the same way.

The login table 3111 shown in FIG. 4 represents a state in which one user (user ID A0001) is logged in from the terminal device 3 (terminal ID 0001), one user (user ID B0001) from the terminal device 4 (terminal ID 0002), two users (user IDs C0001 and C0002) from the terminal device 5 (terminal ID 0003), and three users (user IDs D0001, D0002, and D0003) from the terminal device 6 (terminal ID 0004).
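The login state of FIG. 4 can be written out as a small table to make the structure concrete. This is a minimal sketch: the list-of-pairs layout and the helper name `login_counts` are assumptions made for illustration, while the IDs are those given in the text.

```python
from collections import Counter

# (user ID, terminal ID) pairs, as listed for FIG. 4.
login_table = [
    ("A0001", "0001"),
    ("B0001", "0002"),
    ("C0001", "0003"),
    ("C0002", "0003"),
    ("D0001", "0004"),
    ("D0002", "0004"),
    ("D0003", "0004"),
]


def login_counts(table):
    """Count logged-in users per terminal ID."""
    return Counter(terminal_id for _user_id, terminal_id in table)


counts = login_counts(login_table)
print(counts["0003"], counts["0004"])
```

A per-terminal count of this kind is what the login count column of the terminal status table would hold.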

Next, the terminal status table 3121 is described with reference to FIG. 5. FIG. 5 is a conceptual diagram of the terminal status table 3121. The terminal status table 3121 records the operating status of the local terminal and of each counterpart terminal device connected via the network 2 (hereinafter, connected terminal). The terminal status table 3121 has the following columns: a terminal ID column 61 that stores the terminal ID; a conversation direction column 62 that stores, for each terminal, the direction of the conversation between sites, that is, which side is the speaker addressing the listeners; a CPU load column 63 that stores the CPU load of each terminal; a delay column 64 that stores the data transmission delay of each terminal; a nod detection function column 65 that stores whether each terminal has a nod detection function; a login count column 66 that stores, for each terminal, the number of users registered in the login table 3111 (the login count); and a post-utterance elapsed time column 67 that stores, for each terminal, the time elapsed since the last utterance.

The data transmission delay is the time taken for data to travel from the sender to the destination. The post-utterance elapsed time column 67 stores, for each of the terminal devices 3 to 6, the time elapsed since an utterance was detected by the microphone 35. In the conversation direction column 62, the terminal device whose microphone 35 detected a user's utterance is registered as the speaker, and the other terminal devices are registered as listeners.

The values in the terminal status table 3121 are stored on the basis of the terminal information transmitted from each terminal device via the network 2. The terminal information includes each terminal's terminal ID, CPU load (%), data transmission delay (ms), presence or absence of the nod detection function, post-utterance elapsed time, and so on. The login count column 66 stores, for each terminal, the number of user IDs stored in the login table 3111.

For example, as shown in FIG. 5, the operating status of the terminal device 4 is stored in the second row of the terminal status table 3121. That is, the row holds terminal ID column 61 = "0002", conversation direction column 62 = "listener", CPU load column 63 = "50%", delay column 64 = "10 ms", nod detection function column 65 = "present", login count column 66 = "1", and post-utterance elapsed time column 67 = "1 second ago".
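One row of the terminal status table, and the way received terminal information updates it, might be modeled as below. The field names, the `TerminalStatus` class, and the `update_status` helper are hypothetical; only the values for terminal ID 0002 come from the description of FIG. 5.

```python
from dataclasses import dataclass


@dataclass
class TerminalStatus:
    """One row of the terminal status table (columns 61 to 67)."""

    terminal_id: str                     # column 61
    role: str                            # column 62: "speaker" or "listener"
    cpu_load_percent: float              # column 63
    delay_ms: float                      # column 64
    has_nod_detection: bool              # column 65
    login_count: int                     # column 66
    seconds_since_last_utterance: float  # column 67


def update_status(table: dict, info: TerminalStatus) -> None:
    """Store or overwrite the row for the terminal that sent the information."""
    table[info.terminal_id] = info


status_table = {}
# Second row of FIG. 5 (terminal device 4), as described in the text.
update_status(
    status_table,
    TerminalStatus("0002", "listener", 50.0, 10.0, True, 1, 1.0),
)
print(status_table["0002"])
```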

Next, the nod image data table 3131 is described with reference to FIG. 6. FIG. 6 is a conceptual diagram of the nod image data table 3131. For each terminal, the nod image data table 3131 stores the terminal ID, the nod image data of the user corresponding to that terminal ID, and its recording time. As described later, a nod image captured at each terminal device is transmitted in an encoded, compressed state. The received nod image data is then decoded, managed per terminal ID, and stored in the nod image data table 3131.

For example, as shown in FIG. 6, the second row of the nod image data table 3131 stores, in association with the ID 0002 of the terminal device 4, nod image data = bbb.avc and recording time = 2.4 seconds. Note that bbb.avc is stored in an uncompressed state.

Next, the image displayed on the display 28 of the terminal device 3 is described with reference to FIG. 7. FIG. 7 shows one display mode of the display 28. During a conference, the display 28 of the terminal device 3 shows three split screens 281, 282, and 283, one for the users of each of the other terminal devices 4, 5, and 6. For example, the split screen 281 occupies roughly the left half of the display 28, the split screen 282 the upper part of the right half, and the split screen 283 the lower part of the right half. The split screen 281 shows the image of the user of the terminal device 4, the split screen 282 the image of the user of the terminal device 5, and the split screen 283 the image of the user of the terminal device 6. The display mode is not limited to this, and the arrangement and size of the split screens 281 to 283 can be changed freely. FIG. 7 shows a state in which the user of the terminal device 4 is shown nodding on the split screen 281.

Next, the method of detecting a user's nod is described with reference to FIGS. 8 to 12. FIG. 8 is an explanatory diagram of the feature amount d, which indicates the degree of downward tilt of the face (before a nod). FIG. 9 is an explanatory diagram of the feature amount d (after a nod). FIG. 10 is a conceptual diagram of the camera image data 40. FIG. 11 is a graph showing a detected waveform pattern (during a nod). FIG. 12 is a graph showing the registered nod waveform pattern.

Here, the "nodding state" is a state in which the listener's head swings vertically by at least a predetermined amount when the listener agrees with what the speaker is saying. In this embodiment, the movement of the user's head is detected by well-known image processing; for example, the identification method of the state identification device described in Japanese Unexamined Patent Application Publication No. 2007-97668 can be applied.

A specific example of a nod detection method that applies the above identification method is now described. First, the camera image data transferred from the camera 34 is stored in the camera image data storage area 314 (see FIG. 3) of the HDD 31. Images of persons are then detected in the camera images stored in the camera image data storage area 314. Next, a facial feature amount is calculated for each detected person. In this embodiment, the position coordinates of the point between the eyebrows are obtained by detecting that point or the eyes, and the position coordinates of the lowest visible point of the face in the image are obtained from the detected face contour. The difference between the two position coordinates is then calculated.

For example, when the face in the camera image is a frontal face, the position coordinates of the chin are obtained as the lowest visible point of the face, as shown in FIG. 8. When the face in the camera image is tilted downward, the coordinates of a position closer to the eyes, such as the nose, are obtained as the lowest visible point of the face, as shown in FIG. 9. As is clear from comparing FIGS. 8 and 9, the distance d from the point between the eyebrows to the lowest visible point of the face is longest for a frontal face and becomes shorter as the downward tilt increases. The degree of downward tilt of the face can therefore be determined from the difference between the two position coordinates. Various techniques are known for identifying a face on the basis of feature extraction, and any of them can be adopted in this embodiment.
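The feature amount d described above amounts to a single subtraction once the two landmark positions are known. The sketch below assumes image coordinates in pixels with y increasing downward and assumes the landmarks have already been located by some face detector; the function name and the sample coordinates are invented for illustration.

```python
def downward_tilt_feature(brow_y: float, lowest_face_y: float) -> float:
    """Return d: vertical distance from the point between the eyebrows
    to the lowest visible point of the face (y grows downward)."""
    return lowest_face_y - brow_y


# Frontal face: the lowest visible point is the chin (hypothetical pixel values).
frontal = downward_tilt_feature(brow_y=100, lowest_face_y=220)
# Face tilted down: the lowest visible point is nearer the eyes, e.g. the nose.
bowed = downward_tilt_feature(brow_y=100, lowest_face_y=160)

print(frontal, bowed)
```

A smaller d therefore corresponds to a stronger downward tilt, which is the relationship the comparison of FIGS. 8 and 9 relies on.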

Camera image data 40 (see FIG. 10) is then generated by attaching to the calculated feature amount d the capture time contained in the management information of the camera image and the user ID assigned when the face was detected and identified, and is stored in the camera image data storage area 314 (see FIG. 3). By repeating this processing, a plurality of camera image data 40 items, each representing the listener's degree of downward tilt at a given time, accumulate in the camera image data storage area 314.

Next, the camera image data 40 for the most recent 10 seconds of capture time is read from the camera image data storage area 314 and sorted by user on the basis of the user ID. The data for each listener is then arranged in chronological order on the basis of the time information. From this time-ordered data, a detected waveform pattern (see FIG. 11) representing the change over time of the feature amount (distance d) is generated.
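The grouping and ordering step can be sketched as follows. The record layout `(user_id, time, d)`, the function name `build_waveforms`, and the sample values are assumptions for illustration; the 10-second window is the one stated in the text.

```python
from collections import defaultdict


def build_waveforms(records, now, window_s=10.0):
    """Group (user_id, time, d) records from the last `window_s` seconds
    by user and return each user's d values in time order."""
    per_user = defaultdict(list)
    for user_id, t, d in records:
        if now - window_s <= t <= now:
            per_user[user_id].append((t, d))
    # Sorting the (t, d) pairs orders each user's samples chronologically.
    return {user: [d for _t, d in sorted(samples)]
            for user, samples in per_user.items()}


records = [
    ("B0001", 2.0, 90.0),
    ("B0001", 1.0, 120.0),
    ("C0001", 1.5, 110.0),
    ("B0001", 3.0, 120.0),
    ("B0001", -20.0, 50.0),  # older than the window, so it is ignored
]
waveforms = build_waveforms(records, now=3.0)
print(waveforms)
```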

The generated detected waveform pattern is then compared with the waveform pattern (see FIG. 12) registered in advance in the waveform pattern storage area 317 (see FIG. 3) of the HDD 31. In this embodiment, a short waveform pattern of about one second (see FIG. 12), representing a slight lowering of the head, is stored. This waveform pattern is called the "nod pattern". When the detected waveform pattern matches the nod pattern, the user can be judged to be nodding. The waveform of the nod pattern is not limited to this one and can be changed freely.
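The specification does not say how the match between the detected waveform and the nod pattern is computed. Purely as an illustration, the sketch below slides the template over the detected waveform and accepts a segment when the mean absolute difference in shape is within a tolerance. The matching rule, the tolerance, the function name, and all sample values are assumptions, not the method of the specification or of the cited publication.

```python
def matches_nod(detected, template, tolerance=10.0):
    """Return True if any segment of `detected` has roughly the shape of `template`.

    Shapes are compared relative to each segment's first sample, so the
    absolute size of the face in the image does not matter.
    """
    n = len(template)
    for start in range(len(detected) - n + 1):
        window = detected[start:start + n]
        error = sum(
            abs((w - window[0]) - (p - template[0]))
            for w, p in zip(window, template)
        ) / n
        if error <= tolerance:
            return True
    return False


# Hypothetical nod pattern: d dips and then recovers.
nod_template = [0.0, -30.0, -60.0, -30.0, 0.0]
nodding = [120, 120, 118, 90, 62, 91, 119, 120]
still = [120] * 8

print(matches_nod(nodding, nod_template), matches_nod(still, nod_template))
```

Other similarity measures, such as normalized correlation, would serve equally well for this purpose; the choice here is only for brevity.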

Next, the communication control processing executed by the CPU 20 of the terminal device 3 configured as above is described with reference to the flowcharts of FIGS. 13 to 17 and to FIG. 18. FIG. 13 is a flowchart of the communication control processing. FIG. 14 is a flowchart continuing from FIG. 13. FIG. 15 is a flowchart continuing from FIG. 14. FIG. 16 is a flowchart continuing from FIG. 15. FIG. 17 is a flowchart continuing from FIG. 16. FIG. 18 is a timing chart for explaining the image cut processing performed when a nod is detected.

This communication control processing is performed in the same way not only in the terminal device 3 but also in the other terminal devices 4 to 6. Accordingly, only the communication control processing executed by the CPU 20 of the terminal device 3 is described here.

As shown in FIG. 13, various data are first initialized (S1). To record a nod image before the conference, the user operates a recording switch (not shown) provided on the terminal device 3. Accordingly, it is determined whether recording of a nod image has been instructed by operation of the recording switch (S2). While the recording switch is not operated (S2: NO), monitoring of the recording switch continues (S2). When the recording switch is operated (S2: YES), a content image intended to elicit a nod is played back on the display 28 (S3). The content image is stored in the content image storage area 318 of the HDD 31 and may be, for example, an image that evokes the user's empathy or an image that prompts a nod.

It is then determined whether a nod by the user being captured by the camera 34 has been detected (S4). The user's nod is detected by the detection method described above. Until the user nods (S4: NO), the processing returns to S4 and continues to determine whether a nod has been detected. When the user nods while watching the content image (S4: YES), the nod image is recorded (S5). The nod image data of the recorded nod image is stored, together with its recording time, in the nod image data table 3131 held in the nod image data table storage area 313 of the HDD 31.

Next, it is determined whether a connection has been established via the network 2 with at least one of the other terminal devices 4 to 6 (S6). Until a connection with one of the other terminal devices 4 to 6 is established (S6: NO), the processing returns to S6 and repeats. When a connection with another terminal device is established (S6: YES), the operating status of that terminal device is stored, on the basis of the terminal information it transmits, in the terminal status table 3121 (see FIG. 5) held in the terminal status table storage area 312 of the HDD 31 (S7). The load on the CPU 20 of the terminal device 3 is then measured and stored in the terminal status table 3121 (S8). The data transmission delay of the terminal device 3 is also measured and stored in the terminal status table 3121 (S9).

Whether the terminal device 3 has the nod detection function is also stored in the terminal status table 3121 (see FIG. 5) (S10). In this embodiment the terminal device 3 has the nod detection function, so "present" is stored in the nod detection function column 65 of the terminal status table 3121. More specifically, the presence or absence of the nod detection function is registered as a nod function flag: "1" is stored when the function is present, and "0" when it is absent.

Although not shown in the flowchart, the number of users logged in to the network 2 from the terminal device 3 is also detected, and the detected login count is stored in the terminal status table 3121 (see FIG. 5). In addition, the user's utterances are detected, and the time elapsed since the utterance is stored in the terminal status table 3121. The CPU 20 that detects the login count corresponds to the "login count detection means" of the present invention, and the CPU 20 that stores the detected login count corresponds to the "login count storage means" of the present invention. The CPU 20 that measures the time elapsed since an utterance was detected by the microphone 35 corresponds to the "elapsed time measuring means" of the present invention.

Next, it is determined whether a speaker has been detected (S12). When the user's voice is detected by the microphone 35, the terminal is judged to be the speaker; when no voice is detected, it is judged to be a listener. Because the subsequent processing differs depending on whether the terminal is the speaker or a listener, the case where a speaker is detected and the case where no speaker is detected and the terminal becomes a listener are described in turn below.

First, the case where a speaker is detected at the terminal device 3 is described. As shown in FIG. 14, when a speaker is detected at the site of the terminal device 3 (S12: YES), one terminal device whose nods are to be detected is selected from among the other terminal devices, which have become listeners (S13). This terminal device is selected according to a condition based on the operating status of each of the terminal devices 4 to 6. When there is only one other terminal device acting as a listener, the selection processing is not performed.

The conditions for specifying the terminal are as follows. In the present embodiment there are four conditions: a first, second, third, and fourth condition. The first condition specifies the terminal device with the lowest CPU load. A terminal device with a low CPU load can process quickly even while using the nod detection function. The second condition specifies the terminal device with the shortest data transmission delay. The shorter the delay, the less time it takes to send and receive the nod signal, so processing is faster. The third condition specifies the terminal device with the largest number of logged-in users. At a site with many logged-in users, nods are more likely to be detected, so the conversation can proceed more smoothly. The fourth condition specifies the terminal device with the shortest time elapsed since its last utterance (post-utterance elapsed time). In other words, the terminal device of the previous speaker is given priority. Because the nod of the user who spoke most recently in the current conversation can be detected, the current speaker receives an effective impression and a smoother conversation can be provided.

In the present embodiment, any one of these first to fourth conditions can be selected at the terminal device 3. Based on the condition selected by the user, the CPU 20 specifies the one terminal device that is to detect nodding. When specifying that terminal device, terminal devices without a nod detection function are excluded. In the terminal status table 3121 shown in FIG. 5, the terminal device 6 (terminal ID = 0004) has no nod detection function. Accordingly, one terminal is specified from the counterpart terminal devices 4 and 5, which have the nod detection function and are listeners. Once the one terminal device that is to detect nodding has been specified based on the above condition, a nod detection instruction signal, which instructs the terminal to detect nodding, is transmitted to the specified terminal device (S14).
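
The selection in S13 can be illustrated with a short sketch. The record layout, field names, and function name below are assumptions chosen for the example; the specification states only the four conditions and the exclusion of terminals without a nod detection function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class TerminalStatus:
    """Hypothetical row of the terminal status table (field names are illustrative)."""
    terminal_id: str
    cpu_load: float              # CPU load, e.g. in percent
    delay_ms: float              # data transmission delay
    logins: int                  # number of logged-in users at the site
    secs_since_utterance: float  # time elapsed since the last utterance
    has_nod_detection: bool


# Each condition is expressed as a sort key where the smallest value wins.
CONDITIONS: Dict[int, Callable[[TerminalStatus], float]] = {
    1: lambda t: t.cpu_load,              # first condition: lowest CPU load
    2: lambda t: t.delay_ms,              # second condition: shortest delay
    3: lambda t: -t.logins,               # third condition: most logged-in users
    4: lambda t: t.secs_since_utterance,  # fourth condition: most recent speaker
}


def select_nod_detector(listeners: List[TerminalStatus],
                        condition: int) -> Optional[TerminalStatus]:
    # Terminals without a nod detection function are excluded first.
    candidates = [t for t in listeners if t.has_nod_detection]
    if not candidates:
        return None
    return min(candidates, key=CONDITIONS[condition])
```

Expressing every condition as a "smallest wins" key keeps the selection itself to a single `min` call; how ties are broken is not addressed in the text, and here the first candidate in list order is returned.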

Subsequently, it is determined whether or not a nod detection instruction signal has been received from another terminal device (S15). As described above, when the terminal device 3 is the speaker, it transmits the nod detection instruction signal rather than receives it (S15: NO), so the process moves to the flow shown in FIG. 16, and it is determined whether or not a nod image has been received from a counterpart terminal device (S19). As described above, the nod image is transmitted together with the terminal ID of the sender and the recording time of the nod image. When a nod image is received (S19: YES), it is in an encoded, compressed state, so the nod image data is decoded (S20). The decoded, uncompressed nod image data, the terminal ID, and the recording time are then registered in the nod image data table 3131 stored in the HDD 31 (S21). That is, because the nod image data is stored uncompressed, it can be shown on the display 28 promptly when a nod signal is received.
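
Steps S19 to S21 amount to decoding once on receipt and keeping the uncompressed result keyed by sender. A minimal sketch follows, assuming a dictionary as the table and using `zlib` purely as a stand-in for the video codec, which the specification does not name; all identifiers are hypothetical.

```python
import zlib
from dataclasses import dataclass
from typing import Dict


@dataclass
class NodImageEntry:
    """Hypothetical row of the nod image data table."""
    terminal_id: str
    recording_time_s: float
    raw_frames: bytes  # decoded, uncompressed image data


def register_nod_image(table: Dict[str, NodImageEntry], terminal_id: str,
                       recording_time_s: float, encoded: bytes) -> NodImageEntry:
    # S20: decode on receipt (zlib stands in for the real video decoder).
    raw = zlib.decompress(encoded)
    # S21: store the uncompressed data so that no decoding is needed
    # later, when a nod signal arrives and the image must be shown at once.
    entry = NodImageEntry(terminal_id, recording_time_s, raw)
    table[terminal_id] = entry
    return entry
```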

Incidentally, in the video conference system 1, a conference is held by transmitting and receiving images and sound between terminal devices. As shown in FIG. 18, image data of a recorded image is transmitted by streaming from the terminal device that recorded it to the counterpart terminal device, and is played back from a buffer on the display 28. With streaming, encoding and decoding take time, so display on the display 28 of the playback-side terminal device is delayed. For example, image data recorded at timing t0 is played back at timing t1 after a delay time P. Likewise, image data recorded at timing t1 is played back at timing t2 after the delay time P.

Then, as shown in FIG. 16, the terminal device 3 determines whether or not a nod signal has been received from the terminal device that is the listener and is on the recording side (S22). A nod signal carries far less information than image data. Therefore, the nod signal transmitted from the terminal device specified as the terminal that detects nodding is delivered promptly, via the network 2, to the terminal device 3, which is the speaker.

Here, for example, when a nod is detected at timing t3 (see FIG. 18) at the recording-side terminal device, a nod signal is transmitted to the terminal device 3, which is the speaker and is on the playback side, in a time shorter than the delay time P. The nod signal is then received at timing t4 (see FIG. 18) at the playback-side terminal device 3. Next, it is determined whether or not nod image data corresponding to the terminal device that transmitted the nod signal is stored in the nod image data table 3131 stored in the HDD 31 (S23).

When it is determined that a nod image is stored (S23: YES), at timing t4 (see FIG. 18) the nod image is played back on the display 28, interrupting the image currently being played, based on the decoded nod image data (S25). At this time, at the recording-side terminal device that distributes the streaming image, the streaming image is cut by the playback time Q of the nod image. The CPU 20 that performs this cut processing corresponds to the “second streaming image cutting means” of the present invention.

Further, at timing t6, when the playback time Q of the nod image has elapsed, the portion of image data of duration R that remained in the buffer when the nod image interrupted playback is played back with a delay (S26). Then, from timing t7, when playback of the portion of duration R ends, normal buffered playback of the streaming image is performed (S27).
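
The timing relationships among t3, t4, t6, and t7 can be written out as simple arithmetic. The sketch below is an illustration only: the function name, the `signal_latency` parameter, and the assumption that a streamed nod would otherwise appear at t3 + P (by analogy with the t0/t1 example) are not taken from the specification.

```python
from typing import Dict


def nod_timeline(t3: float, signal_latency: float,
                 p: float, q: float, r: float) -> Dict[str, float]:
    """Illustrative timeline for a nod detected at t3 on the recording side.

    p: streaming delay P, q: nod image playback time Q,
    r: duration R of image data left in the buffer at the interruption.
    """
    if signal_latency >= p:
        # The text assumes the nod signal arrives in less than the delay time P.
        raise ValueError("nod signal latency must be shorter than the delay time P")
    t4 = t3 + signal_latency   # nod signal received; nod image starts playing
    t6 = t4 + q                # nod image ends; buffered remainder starts
    t7 = t6 + r                # buffered remainder ends; normal playback resumes
    streamed = t3 + p          # when the same nod would have appeared via streaming
    return {"t4": t4, "t6": t6, "t7": t7,
            "streamed": streamed, "lead": streamed - t4}
```

With these definitions, `lead` is the amount by which the stored nod image appears earlier than the streamed picture of the same nod would have.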

Even when a nod signal is received (S22: YES), if it is determined that no nod image is stored in the nod image data table 3131 of the HDD 31 (S23: NO), an alternative image stored in advance in the HDD 31 is displayed (S24). The alternative image may be, for example, characters, a figure, or the like, as long as it shows the speaker that the listener is nodding.
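
The branch at S23 reduces to a table lookup with a fallback. A minimal sketch, assuming the nod image table is a dictionary from terminal ID to decoded image data; the function name and the returned labels are hypothetical.

```python
from typing import Dict, Tuple


def image_for_nod_signal(nod_images: Dict[str, bytes], sender_id: str,
                         alternative: bytes) -> Tuple[str, bytes]:
    """Choose what to display when a nod signal arrives from sender_id."""
    stored = nod_images.get(sender_id)
    if stored is not None:
        return ("nod_image", stored)      # S23: YES, play the stored nod image (S25)
    return ("alternative", alternative)   # S23: NO, show the alternative image (S24)
```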

Next, as shown in FIG. 17, it is determined whether or not a video call is in progress between the terminal devices (S28). When a video call is in progress (S28: YES), the image data of the camera image at the local site is encoded (S29), and the encoded image data is streamed to the counterpart terminal device (S30). Subsequently, it is determined whether or not all connections with the terminal devices have been disconnected (S31). When all connections have been disconnected (S31: YES), the process ends. When a connection still remains (S31: NO), the process returns to S6 in FIG. 14, and it is determined which terminals are connected. The operating status of each terminal device whose connection has been maintained is then updated, the operating status of any newly connected terminal device is stored, and the process is repeated in the same manner.

Next, the case where no speaker is detected at the terminal device 3 is described. As shown in FIG. 14, when no speaker is detected at the site of the terminal device 3 (S12: NO), the terminal device 3 becomes a listener. Then, as shown in FIG. 15, it is determined whether or not a nod detection instruction signal has been received from the counterpart terminal device that has become the speaker (S15). When a nod detection instruction signal has been received, nod detection processing is executed (S16). In this nod detection processing, a person who is nodding is detected in the camera image according to the nod detection method described above, and the nod is thereby detected from the camera image. A nod signal is then transmitted to the counterpart terminal device that has become the speaker.

Thereafter, as shown in FIG. 16, it is determined whether or not a nod image has been received (S19). When a nod image is received (S19: YES), it is in an encoded, compressed state, so the nod image data is decoded (S20). The decoded, uncompressed nod image data, the terminal ID, and the recording time are then registered in the nod image data table 3131 stored in the HDD 31 (S21).

Further, it is determined whether or not a nod signal has been received (S22). At this point the terminal device 3 is a listener and is the side that transmits the nod signal (S22: NO), so it is next determined, as shown in FIG. 17, whether or not a video call is in progress between the terminal devices (S28). When a video call is in progress (S28: YES), the image data of the camera image at the local site is encoded (S29), and the encoded image data is streamed to the counterpart terminal device (S30). Subsequently, it is determined whether or not all connections with the terminal devices have been disconnected (S31). When all connections have been disconnected (S31: YES), the process ends. When a connection still remains (S31: NO), the process returns to S6 in FIG. 14, the connection status with the terminals is determined, the operating status of each connected terminal device is updated to the latest (S7 to S10), and the process is repeated as described above.

In the above description, the display 28 shown in FIGS. 2 and 7 corresponds to the “display means” of the present invention. The microphone 35 shown in FIG. 2 corresponds to the “speech detection means” of the present invention. The nod image data table storage area 313 of the HDD 31 shown in FIG. 3 corresponds to the “reaction-time image storage means” of the present invention. The CPU 20 that executes the process of S8 shown in FIG. 14 corresponds to the “CPU load detection means” and the “CPU load status storage means” of the present invention. The CPU 20 that executes the process of S9 shown in FIG. 14 corresponds to the “transmission time detection means” and the “transmission time storage means” of the present invention. The CPU 20 that executes the process of S11 shown in FIG. 14 corresponds to the “reaction-time image data transmission means” of the present invention. The CPU 20 that executes the process of S13 shown in FIG. 14 corresponds to the “terminal specifying means” of the present invention. The CPU 20 that executes the process of S14 shown in FIG. 14 corresponds to the “notification signal transmission means” of the present invention. The CPU 20 that executes the process of S15 shown in FIG. 14 corresponds to the “notification signal receiving means” of the present invention. The CPU 20 that executes the process of S16 shown in FIG. 15 corresponds to the “reaction state detection means” of the present invention. The CPU 20 that executes the process of S17 shown in FIG. 15 corresponds to the “reaction signal transmission means” of the present invention. The CPU 20 that executes the process of S19 shown in FIG. 16 corresponds to the “reaction-time image data receiving means” of the present invention. The CPU 20 that executes the processes of S20 and S21 shown in FIG. 16 corresponds to the “decompression storage processing means” of the present invention. The CPU 20 that executes the process of S22 shown in FIG. 16 corresponds to the “reaction signal receiving means” of the present invention. The CPU 20 that executes the process of S23 shown in FIG. 16 corresponds to the “reaction-time image storage determination means” of the present invention. The CPU 20 that executes the process of S24 shown in FIG. 16 corresponds to the “alternative image display control means” of the present invention. The CPU 20 that executes the process of S25 shown in FIG. 16 corresponds to the “reaction-time image display control means” of the present invention.

As described above, the terminal device 3 of the first embodiment is interconnected with the other terminal devices 4 to 6 via the network 2. These terminal devices constitute the video conference system 1, which conducts a remote conference by exchanging images and sound. In the video conference system 1, when it is detected during a remote conference that a listener is nodding at a specific terminal device (any of the terminal devices 3 to 6), the counterpart terminal device where the speaker is located (any of the terminal devices 3 to 6) is notified that the listener's nod has been detected. The counterpart terminal device notified of the nod displays a nod image of the listener stored in advance. That is, unlike the streaming method, this method does not require encoding and decoding of image data, so the listener's nod can be displayed without delay. Accordingly, the gap between the timing of speech and the listener's reaction can be reduced, and a smooth conversation can be provided.

Next, a terminal device 130 according to a second embodiment is described with reference to the drawings. In the first embodiment, in the communication control processing by the CPU 20, the nod image is recorded before the conference, and recording and distribution are stopped for the duration of the nod image. In contrast, the communication control processing by the CPU 120 in the second embodiment differs in that the nod image is recorded during the conference and in that the streaming image is cut by the playback time of the nod image. To focus on these differences, the description below centers on the communication control processing by the CPU 120, which differs from the first embodiment. Like the terminal device 3 of the first embodiment, the terminal device 130 of the second embodiment constitutes the video conference system 1 shown in FIG. 1.

First, the configuration of the terminal device 130 is described with reference to FIG. 19. FIG. 19 is a block diagram showing the electrical configuration of the terminal device 130. The terminal device 130 is provided with a CPU 120 serving as a controller that controls the terminal device 130. Connected to the CPU 120 are a ROM 121 that stores a BIOS and the like, a RAM 122 that temporarily stores various data, and an I/O interface 30 that mediates data transfer. A hard disk drive 131 (hereinafter, HDD 131) having various storage areas is connected to the I/O interface 30. The HDD 131 has the same storage areas (see FIG. 3) as the HDD 31 of the first embodiment. The rest of the configuration is the same as that of the terminal device 3 of the first embodiment (see FIG. 2), so its description is omitted.

Next, the communication control processing by the CPU 120 is described with reference to the flowcharts of FIGS. 20 to 23 and to FIG. 24. FIG. 20 is a flowchart of the communication control processing by the CPU 120. FIG. 21 is a flowchart continuing from FIG. 20. FIG. 22 is a flowchart continuing from FIG. 21. FIG. 23 is a flowchart continuing from FIG. 22. FIG. 24 is a timing chart for explaining the image cut processing at the time of nod detection.

As shown in FIG. 20, various data are first initialized (S40). Subsequently, it is determined whether or not a connection has been made to at least one of the other terminal devices via the network 2 (S41). Until a connection is made to one of the other terminal devices (S41: NO), the process returns to S41 and repeats. When a connection has been made to another terminal device (S41: YES), the operating status of that terminal device, taken from the terminal information it transmits, is stored in the terminal status table 3121 (see FIG. 5) held in the terminal status table storage area 312 of the HDD 31 (S42). Further, the load on the CPU 120 of the terminal device 130 is measured and stored in the terminal status table 3121 (see FIG. 5) (S43). Further, the data transmission delay of the terminal device 130 is measured and stored in the terminal status table 3121 (see FIG. 5) (S44).

Further, whether or not the terminal device has a nod detection function is stored in the terminal status table 3121 (see FIG. 5) (S45). When the terminal device has a nod detection function, “present” is stored in the nod detection function column 65 of the terminal status table 3121. Specifically, a nod function flag of “1” is stored when the function is present, and a nod function flag of “0” is stored when it is absent.

Next, it is determined whether or not a speaker has been detected (S47). When the user's voice is detected by the microphone 35, the terminal is judged to be the speaker; when no voice is detected, it is judged to be a listener. Because the subsequent processing differs depending on whether the terminal is the speaker or a listener, the case where a speaker is detected and the case where no speaker is detected and the terminal becomes a listener are described below in turn.

First, the case where a speaker is detected is described. As shown in FIG. 20, when a speaker is detected at the site of the terminal device (S47: YES), the terminal device that is to detect nodding is specified (S48). Here, the one terminal device that is to detect nodding is specified according to a condition based on the operating status of each terminal device. As in the first embodiment, the condition for specifying the terminal is one of the first to fourth conditions described above.

In the second embodiment as well, any one of these first to fourth conditions can be selected at the terminal device 130. Based on the condition selected by the user, the CPU 120 specifies the one terminal device that is to detect nodding. When specifying that terminal device, terminal devices without a nod detection function are excluded. Once the one terminal device that is to detect nodding has been specified based on the above condition, a nod detection instruction signal, which instructs the terminal to detect nodding, is transmitted to the specified terminal device (S49).

Subsequently, it is determined whether or not a nod detection instruction signal has been received from another terminal device (S50). When the terminal device 130 is the speaker, it transmits the nod detection instruction signal rather than receives it (S50: NO), so the process moves to the flow shown in FIG. 22, and it is determined whether or not a nod image has been received from a counterpart terminal device (S57). As described above, the nod image is transmitted together with the terminal ID of the sender and the recording time of the nod image. When a nod image is received (S57: YES), it is in an encoded, compressed state, so the nod image data is decoded (S58). The decoded, uncompressed nod image data, the terminal ID, and the recording time are then registered in the nod image data table 3131 stored in the HDD 31 (S59).

Incidentally, in the second embodiment as well, a conference is held by transmitting and receiving images and sound between terminal devices. As shown in FIG. 24, image data of a recorded image is transmitted by streaming from the terminal device that recorded it to the counterpart terminal device, and is played back from a buffer on the display 28. With streaming, encoding and decoding take time, so display on the display 28 of the playback-side terminal device is delayed. For example, image data recorded at timing r0 is played back at timing r1 after the delay time P. Likewise, image data recorded at timing r1 is played back at timing r2 after the delay time P.

Then, as shown in FIG. 22, the terminal device 130 determines whether or not a nod signal has been received from the terminal device that is the listener and is on the recording side (S60). A nod signal carries far less information than image data. Therefore, the nod signal transmitted from the terminal device specified as the terminal that detects nodding is delivered promptly, via the network 2, to the terminal device 130, which is the speaker.

Here, when a nod is detected at timing r3 (see FIG. 24) at the recording-side terminal device, a nod signal is transmitted to the terminal device that is the speaker and is on the playback side in a time shorter than the delay time P. The nod signal is then received at timing r4 (see FIG. 24) at the playback-side terminal device. Next, it is determined whether or not a nod image corresponding to the terminal device that transmitted the nod signal is stored in the nod image data table 3131 (see FIG. 6) stored in the HDD 131 (S61).

When it is determined that a nod image is stored (S61: YES), at timing r4 (see FIG. 24) the nod image is played back on the display 28, interrupting the image currently being played, based on the decoded nod image data (S63). Further, the streaming distribution from the recording-side terminal device is cut by the playback time (Q) of the played nod image (S64). That is, the portion T1 of the streamed image and the portion T2 of the image remaining in the buffer at the time of the interruption are cut. Then, at timing r5, when the playback time Q of the nod image has elapsed, the portion T2 of the image that remained in the buffer when the nod image interrupted playback is played back with a delay. From timing r6, when playback of the portion T2 ends, normal buffered playback of the streaming image is performed (S65).

Even when a nod signal is received (S60: YES), if it is determined that no nod image is stored in the nod image data table 3131 of the HDD 31 (S61: NO), an alternative image stored in advance in the HDD 31 is displayed (S62). The alternative image may be, for example, characters, a figure, or the like, as long as it shows the speaker that the listener is nodding.

Next, as shown in FIG. 23, it is determined whether or not a video call is in progress between the terminal devices (S66). When a video call is in progress (S66: YES), the image data of the camera image at the local site is encoded (S67), and the encoded image data is streamed to the counterpart terminal device (S68). Subsequently, it is determined whether or not all connections with the other terminal devices have been disconnected (S69). When all connections have been disconnected (S69: YES), the process ends. When a connection still remains (S69: NO), the process returns to S41 in FIG. 20, the operating status of each connected terminal device is updated to the latest, and the process is repeated.

Next, the case where no speaker is detected at the terminal device 130 is described. As shown in FIG. 20, when no speaker is detected at the site of the terminal device 130 (S47: NO), the terminal device 130 becomes a listener. Then, as shown in FIG. 21, it is determined whether or not a nod detection instruction signal has been received from the counterpart terminal device that has become the speaker (S50). When a nod detection instruction signal has been received, nod detection processing is executed (S51). In this nod detection processing, a person who is nodding is detected in the camera image according to the nod detection method described above, and the nod is thereby detected from the camera image. A nod signal is then transmitted to the counterpart terminal device that has become the speaker.

Subsequently, it is determined whether this is the first time a nod has been detected (S52). On the first detection (S52: YES), the speaker-side terminal device does not yet store nod image data for the terminal device 130, which is the listener. Therefore, the nod image of the user whose nod was detected is encoded (S53) and stored in the HDD 131 (S54). The encoded nod image data is then transmitted to the other terminal devices connected via the network 2 (S55). If the nod was not the first detection (S52: NO), a nod signal is transmitted to the terminal device that is the speaker.
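
The branch at S52 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the class name `ListenerTerminal`, the `outbox` list standing in for the network 2, the `stored_nod_image` field standing in for the HDD 131, and the use of `zlib` in place of a video encoder are all assumptions made for illustration.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ListenerTerminal:
    """Hypothetical listener-side terminal (stands in for terminal device 130)."""
    terminal_id: str
    stored_nod_image: Optional[bytes] = None          # assumed stand-in for HDD 131
    outbox: List[Tuple] = field(default_factory=list)  # assumed stand-in for network 2
    first_detection_done: bool = False

    def on_nod_detected(self, raw_nod_frames: bytes,
                        speaker_id: str, peers: List[str]) -> None:
        if not self.first_detection_done:              # S52: YES (first detection)
            # S53: encode the nod image (zlib is only a placeholder codec)
            encoded = zlib.compress(raw_nod_frames)
            # S54: store the encoded nod image locally
            self.stored_nod_image = encoded
            # S55: send the encoded nod image to every connected terminal
            for peer in peers:
                self.outbox.append(("nod_image", peer, self.terminal_id, encoded))
            self.first_detection_done = True
        else:                                          # S52: NO (later detections)
            # Only the lightweight nod signal goes to the speaker's terminal
            self.outbox.append(("nod_signal", speaker_id, self.terminal_id))
```

Under these assumptions, only the first nod carries image data; every later nod is reduced to a short signal, which is what keeps the notification free of encoding delay.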

Thereafter, it is determined whether a nod image has been received (S57). If a nod image has been received (S57: YES), it is in an encoded, compressed state, so the nod image data is decoded (S58). The decoded, uncompressed nod image data, the terminal ID, and the recording time are then registered in the nod image data table 3131 stored in the HDD 131 (S59).
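
The receiving side (S57 to S59) might be sketched as follows. The names `NodImageEntry` and `NodImageTable`, the in-memory dictionary standing in for the nod image data table 3131, and `zlib` as a placeholder decoder are assumptions for illustration only.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodImageEntry:
    terminal_id: str         # ID of the terminal whose user nodded
    image_data: bytes        # decoded, uncompressed nod image data
    recording_time_s: float  # length of the recorded nod image in seconds


class NodImageTable:
    """Hypothetical in-memory stand-in for the nod image data table 3131."""

    def __init__(self) -> None:
        self._entries: Dict[str, NodImageEntry] = {}

    def register(self, terminal_id: str, encoded: bytes,
                 recording_time_s: float) -> NodImageEntry:
        # S58: decode the received nod image (zlib is only a placeholder codec)
        decoded = zlib.decompress(encoded)
        # S59: register image data, terminal ID, and recording time together
        entry = NodImageEntry(terminal_id, decoded, recording_time_s)
        self._entries[terminal_id] = entry
        return entry

    def lookup(self, terminal_id: str) -> Optional[NodImageEntry]:
        # Returns None when no nod image has been received from that terminal
        return self._entries.get(terminal_id)
```

Storing the image already decoded means that a later nod signal can trigger playback without any decoding step, at the cost of more storage.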

Further, it is determined whether a nod signal has been received (S60). At this point the terminal device 130 is the listener, that is, the side that transmits the nod signal, so no nod signal is received (S60: NO). Then, as shown in FIG. 23, it is determined whether a video call is in progress between the terminal devices (S66). If a video call is in progress (S66: YES), the image data of the camera image captured at the local site is encoded (S67), and the encoded image data is streamed to the counterpart terminal device (S68). It is then determined whether all connections with the terminal devices have been disconnected (S69). If all connections have been disconnected (S69: YES), the process ends. If any connection remains (S69: NO), the process returns to S41 in FIG. 20, the connection status with the terminals is determined, the operation status of the connected terminal devices is updated to the latest state (S42 to S45), and the process is repeated in the same manner as described above.

As described above, in the terminal device 130 of the second embodiment, the communication control process of the CPU 120 can record a nod image during the conference, so no preparation such as recording before the conference is required. Furthermore, when the nod image is played back, the streaming image is cut by the playback time of the nod image. Consequently, unlike the first embodiment, the terminal device on the streaming side does not need to perform processing to pause streaming.
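
The cutting of the streaming image on the display side can be pictured with a minimal Python sketch. The function name `build_display_schedule`, the representation of frames as `(timestamp, label)` pairs, and the `"NOD_CLIP"` marker are assumptions for illustration; the embodiment does not specify these data structures.

```python
from typing import List, Tuple

Frame = Tuple[float, str]  # (presentation time in seconds, frame label)


def build_display_schedule(stream_frames: List[Frame],
                           nod_arrival_s: float,
                           nod_duration_s: float) -> List[Frame]:
    """Drop streaming frames that fall inside the nod playback window
    and insert the stored nod clip at the moment the nod signal arrives."""
    cut_end = nod_arrival_s + nod_duration_s
    # Keep only frames outside [nod_arrival_s, cut_end)
    kept = [(ts, f) for ts, f in stream_frames
            if not (nod_arrival_s <= ts < cut_end)]
    # The nod clip takes over the display for the whole cut window
    schedule = kept + [(nod_arrival_s, "NOD_CLIP")]
    return sorted(schedule, key=lambda item: item[0])


frames = [(0.0, "s0"), (0.5, "s1"), (1.0, "s2"),
          (1.5, "s3"), (2.0, "s4"), (2.5, "s5")]
schedule = build_display_schedule(frames, nod_arrival_s=1.0, nod_duration_s=1.0)
```

In this example the frames at 1.0 s and 1.5 s are discarded and the nod clip is shown in their place, so the sender keeps streaming without interruption.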

In the above description, the CPU 20 that executes the process of S52 shown in FIG. 21 corresponds to the "initial reaction state detection means" of the present invention. The CPU 20 that executes the processes of S53 and S54 shown in FIG. 21 corresponds to the "reaction-time image storage processing means" of the present invention. The CPU 20 that executes the process of S63 shown in FIG. 22 corresponds to the "interrupt display means" of the present invention. The CPU 20 that executes the process of S64 shown in FIG. 22 corresponds to the "first streaming image cut means" of the present invention.

Next, a terminal device 230 according to a third embodiment of the present invention will be described with reference to FIG. 25. FIG. 25 is a block diagram showing the electrical configuration of the terminal device 230. In the first and second embodiments, the user's nod is detected from the camera image by image processing. In the third embodiment, when the listener is convinced by what the speaker has said, the listener presses a nod button 70 provided on the terminal device 230.

The configuration of the terminal device 230 will be described. As shown in FIG. 25, the terminal device 230 is provided with a CPU 220 as a controller that controls the terminal device 230. Connected to the CPU 220 are a ROM 221 that stores the BIOS and the like, a RAM 222 that temporarily stores various data, and an I/O interface 30 that mediates data transfer. A hard disk drive 231 having various storage areas is connected to the I/O interface 30. In addition to having the same configuration as the terminal device 3 of the first embodiment (see FIG. 2), the terminal device 230 has the nod button 70 connected to the I/O interface 30.

The nod button 70 is pressed when the listener is convinced by what the speaker says. When the nod button 70 is pressed, a nod signal like that of the first embodiment is transmitted to the terminal device that is the speaker. In other words, the communication control process by the CPU 220 differs from that of the first embodiment in one point: in the nod detection processing (FIG. 15: S16), which is performed when the device has been specified as the one terminal device that detects nods, it is determined whether the nod button 70 has been pressed. When the nod button 70 is pressed, the nod signal is transmitted (FIG. 15: S17), so the same effects as in the first embodiment can be obtained.
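
As a rough illustration of this variant, the following Python sketch replaces image-based detection with a button check. The class name `NodButtonTerminal`, the `send` callback, and the message dictionary layout are hypothetical and are not taken from the embodiment.

```python
from typing import Callable, Dict


class NodButtonTerminal:
    """Hypothetical terminal with a nod button (stands in for terminal device 230)."""

    def __init__(self, terminal_id: str, send: Callable[[Dict], None]) -> None:
        self.terminal_id = terminal_id
        self.send = send  # assumed transport callback toward the network

    def poll(self, button_pressed: bool, speaker_id: str) -> bool:
        # Corresponds to S16: check the button instead of analyzing the camera image
        if button_pressed:
            # Corresponds to S17: transmit the nod signal to the speaker's terminal
            self.send({"type": "nod_signal",
                       "from": self.terminal_id,
                       "to": speaker_id})
            return True
        return False


sent = []
terminal = NodButtonTerminal("T230", sent.append)
terminal.poll(button_pressed=False, speaker_id="T3")  # nothing is sent
terminal.poll(button_pressed=True, speaker_id="T3")   # one nod signal is sent
```

Because the rest of the flow is unchanged, the speaker-side terminal cannot tell whether the nod signal came from image processing or from the button.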

As described above, in the terminal device 230 of the third embodiment, the listener presses the nod button 70 provided on the terminal device 230 when convinced by what the speaker says. Compared with the image processing of the first embodiment, this conveys the listener's agreement to the speaker more reliably. Note that the nod button 70 shown in FIG. 25 corresponds to the "operation means" of the present invention.

The present invention is not limited to the first to third embodiments described above, and various modifications are possible. For example, the above embodiments detect a nod in which the user's head swings vertically, but a reaction state in which the user shakes the head from side to side to express disagreement with the speaker can also be detected by the image processing described above. In that case, if an image of the user shaking the head is stored in the same way as the nod image, the listener's disagreement can be conveyed promptly to the speaker.

In addition, people express their feelings to others not only by moving the head but also through various gestures. By detecting the characteristic features of such listener reactions, various reactions can be detected, and the present invention can be applied to them.

FIG. 1 is a block diagram showing the configuration of the video conference system 1.
FIG. 2 is a block diagram showing the electrical configuration of the terminal device 3.
FIG. 3 is a conceptual diagram showing the various storage areas of the HDD 31.
FIG. 4 is a conceptual diagram of the login table 3111.
FIG. 5 is a conceptual diagram of the terminal status table 3121.
FIG. 6 is a conceptual diagram of the nod image data table 3131.
FIG. 7 is a diagram showing one display mode on the display 28.
FIG. 8 is an explanatory diagram of the feature quantity d indicating the degree of head lowering (before a nod).
FIG. 9 is an explanatory diagram of the feature quantity d indicating the degree of head lowering (after a nod).
FIG. 10 is a conceptual diagram of the camera image data 40.
FIG. 11 is a graph showing a detected waveform pattern (during a nod).
FIG. 12 is a graph showing a registered nod waveform pattern.
FIG. 13 is a flowchart of the communication control process by the CPU 20.
FIG. 14 is a flowchart continuing from FIG. 13.
FIG. 15 is a flowchart continuing from FIG. 14.
FIG. 16 is a flowchart continuing from FIG. 15.
FIG. 17 is a flowchart continuing from FIG. 16.
FIG. 18 is a timing chart for explaining the image cut processing at the time of nod detection.
FIG. 19 is a block diagram showing the electrical configuration of the terminal device 130 according to the second embodiment.
FIG. 20 is a flowchart of the communication control process by the CPU 120.
FIG. 21 is a flowchart continuing from FIG. 20.
FIG. 22 is a flowchart continuing from FIG. 21.
FIG. 23 is a flowchart continuing from FIG. 22.
FIG. 24 is a timing chart for explaining the image cut processing at the time of nod detection.
FIG. 25 is a block diagram showing the electrical configuration of the terminal device 230 according to the third embodiment.

Explanation of symbols

1 Video conference system
2 Network
3 Terminal device
20 CPU
28 Display
31 Hard disk drive
34 Camera
35 Microphone
70 Nod button
120 CPU
130 Terminal device
220 CPU
230 Terminal device
311 Login table storage area
312 Terminal status table storage area
313 Nod image data table storage area
314 Camera image data storage area

Claims

A communication terminal device that performs image-based communication with a counterpart terminal via a network, comprising:
display means for displaying image data transmitted from the counterpart terminal;
reaction state detection means for detecting a reaction state of a user;
reaction signal transmission means for transmitting, to the counterpart terminal, a reaction signal indicating that the reaction state of the user has been detected, when the reaction state is detected by the reaction state detection means;
reaction signal receiving means for receiving the reaction signal transmitted by the reaction signal transmission means;
reaction-time image storage means for storing a reaction-time image captured when a counterpart user of the counterpart terminal shows the reaction state; and
reaction-time image display control means for displaying, on the display means, the reaction-time image of the counterpart user stored in the reaction-time image storage means when the reaction signal is received by the reaction signal receiving means.

The communication terminal device according to claim 1, wherein the display means displays a streaming image of the counterpart user that is captured by and transmitted from the counterpart terminal, the communication terminal device further comprising:
interrupt display means for interrupting the streaming image displayed on the display means and displaying the reaction-time image when the reaction signal is received by the reaction signal receiving means; and
first streaming image cut means for cutting the streaming image for a time corresponding to the duration of the reaction-time image, starting from when the reaction-time image is displayed by the interrupt display means.

The communication terminal device according to claim 1, further comprising second streaming image cut means for cutting the streaming image of the user to be transmitted to the counterpart terminal by a time corresponding to the duration of the reaction-time image data transmitted by the reaction-time image data transmitting means, and transmitting the result to the counterpart terminal.

The communication terminal device according to claim 1, further comprising:
reaction-time image data transmitting means for compressing the reaction-time image data obtained when the reaction state of the user is detected by the reaction state detection means, and transmitting the compressed data to the counterpart terminal;
reaction-time image data receiving means for receiving the reaction-time image data transmitted by the reaction-time image data transmitting means of the counterpart terminal; and
decompression storage processing means for decompressing the reaction-time image data received by the reaction-time image data receiving means and storing the decompressed data in the reaction-time image storage means.

The communication terminal device according to any one of claims 1 to 4, further comprising:
terminal specifying means for specifying, when connected to a plurality of the counterpart terminals via the network, one terminal from among the plurality of counterpart terminals, based on a predetermined condition, as the terminal that detects the reaction state of the counterpart user;
notification signal transmitting means for transmitting, to the one terminal specified by the terminal specifying means, a notification signal notifying that it has been specified as the terminal that detects the reaction state; and
notification signal receiving means for receiving the notification signal transmitted from the counterpart terminal,
wherein the reaction state detection means detects the reaction state of the user when the notification signal is received by the notification signal receiving means.

The communication terminal device according to claim 5, further comprising:
CPU load status detecting means for detecting the CPU load status of each counterpart terminal via the network; and
CPU load status storage means for storing the CPU load status detected for each counterpart terminal by the CPU load status detecting means,
wherein the terminal specifying means has a first predetermined condition under which it refers to the CPU load status stored in the CPU load status storage means and specifies the counterpart terminal with the lowest CPU load as the one terminal.

A transmission time detecting means for detecting a transmission time of data in communication with the counterpart terminal via the network;
Transmission time storage means for storing the detection result by the transmission time detection means,
The terminal specifying means includes
6. The second predetermined condition for specifying the counterpart terminal with the shortest transmission time as the one terminal with reference to the detection result stored in the transmission time storage means. The communication terminal device according to 1.

The communication terminal device according to claim 5, further comprising:
login number detection means for detecting, for each counterpart terminal, the number of counterpart users logged in; and
login number storage means for storing the number of logins detected by the login number detection means,
wherein the terminal specifying means has a third predetermined condition under which it specifies, as the one terminal, the counterpart terminal with the largest number of logins among the numbers of logins stored for each counterpart terminal in the login number storage means.

Utterance detection means for detecting the utterance of the other user;
Elapsed time measuring means for measuring the elapsed time from the time of utterance for each partner user detected by the utterance detecting means,
The terminal specifying means includes
6. The fourth predetermined condition for specifying, as the one terminal, the counterpart terminal of the counterpart user with the shortest elapsed time measured by the elapsed time measuring means. Communication terminal device.

The communication terminal device according to claim 1, wherein the reaction state detection means detects, as the reaction state, a state in which the user's head swings in a predetermined direction.

The communication terminal device according to claim 1, wherein the reaction state detection means detects, as the reaction state, a state of denial in which the user's head is shaken in the horizontal direction.

The communication terminal device according to claim 1, further comprising operation means operated by the user when the user is convinced,
wherein the reaction state detection means detects the user's state of being convinced as the reaction state by detecting that the operation means has been operated.

Initial reaction state determination means for determining whether or not the reaction state of the user detected by the reaction state detection means is the first time;
A reaction time image storage processing means for storing the reaction time image in the reaction time image storage means when the reaction state is determined to be the first time by the initial reaction state determination means;
13. The reaction image data transmitting unit transmits the reaction image data stored in the reaction image storage unit to the counterpart terminal in a compressed state. The communication terminal device described.

The communication terminal device according to claim 1, further comprising:
reaction-time image storage judging means for judging, when the reaction signal is received by the reaction signal receiving means, whether the reaction-time image data of the counterpart user is stored in the reaction-time image storage means; and
substitute image display control means for displaying on the display means, when the reaction-time image storage judging means judges that the reaction-time image is not stored, a substitute image that indicates by characters, figures, symbols, or the like that the counterpart user has shown the reaction state, in place of the reaction-time image.

A communication control method for a communication terminal device that performs image-based communication with a counterpart terminal via a network, the method comprising:
an image data receiving step of receiving image data transmitted from the counterpart terminal;
a reaction state detection step of detecting a reaction state of a user;
a reaction signal transmission step of transmitting, to the counterpart terminal, a reaction signal indicating that the reaction state of the user has been detected, when the reaction state is detected in the reaction state detection step;
a reaction signal receiving step of receiving the reaction signal transmitted in the reaction signal transmission step; and
a reaction-time image display control step of displaying, when the reaction signal is received in the reaction signal receiving step, the reaction-time image of the counterpart user on display means that displays the image data received in the image data receiving step, the reaction-time image being stored in reaction-time image storage means that stores a reaction-time image captured when the counterpart user of the counterpart terminal shows the reaction state.

A communication control program for causing a computer to execute the processing steps of the communication control method according to claim 15.