JP2010199906A

JP2010199906A - Video conference apparatus, video conference system, video conference control method, and program of video conference apparatus

Info

Publication number: JP2010199906A
Application number: JP2009041631A
Authority: JP
Inventors: Tomohiro Inagaki; 友大稲垣
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2009-02-25
Filing date: 2009-02-25
Publication date: 2010-09-09
Anticipated expiration: 2029-02-25
Also published as: JP5391725B2

Abstract

PROBLEM TO BE SOLVED: To provide a video conference apparatus, a video conference system, a video conference control method, and a program for the video conference apparatus that do not disturb the tempo of conversation.

SOLUTION: When main processing is executed, a priority order table is acquired (S11) and voice data deletion processing is started (S12). When voice data is received (S14: YES), the voice data is stored (S16). When other voice data is being output (S17: YES), voice data information relating to the received voice data is stored (S18). When the stored voice data includes voice data that has been delayed for time T1 or longer (S19: YES), the invalidation flag of the corresponding voice data information is set to "1" (S20). Output order setting processing is then performed (S21), and when there is voice data that can be output (S22: YES), the voice data that is first in the output order is output (S24).

COPYRIGHT: (C)2010, JPO&INPIT
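As a reading aid, steps S19 to S24 of the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `VoiceDataInfo`, `mark_stale`, and `next_to_output`, the 3-second value of `T1`, and the assumption that the pending list is already sorted by output order are all hypothetical.

```python
from dataclasses import dataclass

T1 = 3.0  # assumed value in seconds; the abstract only names the threshold "T1"


@dataclass
class VoiceDataInfo:
    terminal_id: str
    reception_start: float  # time at which reception of the voice data began
    invalid: int = 0        # invalidation flag: 1 means "do not output"


def mark_stale(pending, now, t1=T1):
    """Set the invalidation flag on entries delayed for t1 or longer (S19, S20)."""
    for info in pending:
        if now - info.reception_start >= t1:
            info.invalid = 1
    return pending


def next_to_output(pending):
    """Return the first valid entry in output order, or None (S22, S24).

    `pending` is assumed to be sorted by output order already (S21).
    """
    for info in pending:
        if info.invalid == 0:
            return info
    return None
```

In this sketch a stale entry is only flagged, as in the abstract; actually discarding flagged entries would be the job of the separately started voice data deletion processing.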

Description

The present invention relates to a video conference apparatus, a video conference system, a video conference control method, and a program for the video conference apparatus, and more specifically to a video conference apparatus, a video conference system, a video conference control method, and a program for the video conference apparatus that do not disturb the tempo of conversation.

Conventionally, a communication control device is known that lets a plurality of communication devices at remote locations connect to the same video conference system so that a conference can be held as if the participants were in the same place (see, for example, Patent Document 1). The communication control device described in Patent Document 1 receives call information transmitted from the plurality of communication devices and records it. When portions of the recorded call information overlap in time, the device sets a priority for the timing at which each piece of call information is output, and outputs the call information shifted in time according to that priority. As a result, even when several users speak at the same time, each utterance can be heard easily and comfortable communication can be achieved.

Patent Document 1: JP 2002-232576 A

However, when call information overlaps in time, the communication control device of the invention described in Patent Document 1 delays each user's call information according to its priority and reproduces the pieces one after another. The conversation is therefore reproduced late, which disturbs the tempo of the conversation.

The present invention has been made to solve the above problem, and its object is to provide a video conference apparatus, a video conference system, a video conference control method, and a program for the video conference apparatus that do not disturb the tempo of conversation.

To achieve the above object, the video conference apparatus of the invention according to claim 1 is a video conference apparatus that controls a video conference conducted among a plurality of terminal devices via a network, and comprises: voice data receiving means for receiving voice data of an utterance transmitted from a terminal device; output judging means for judging, when the voice data receiving means receives the voice data from the terminal device, whether voice data received from another terminal device is being output; priority information storage means for storing, as priority information, terminal identification information identifying the terminal device that transmitted the voice data in association with a priority indicating the order of precedence for outputting the voice data to the terminal devices; voice data information storage control means for, when the output judging means judges that voice data received from another terminal device is being output, storing in voice data information storage means, as voice data information and on the basis of the priority information stored in the priority information storage means, the voice data received by the voice data receiving means, a reception start time that is the time at which reception of the voice data started, an output order that is information on the order in which the voice data is to be output to the terminal devices, and the terminal identification information identifying the terminal device that transmitted the voice data, in association with one another; voice data output means for outputting the voice data to the plurality of terminal devices performing data communication in the video conference; first judging means for judging whether the voice data information stored in the voice data information storage means includes, among the voice data not yet output to the terminal devices by the voice data output means, voice data information for which a first predetermined time has elapsed since its reception start time; and first deleting means for deleting from the voice data information storage means, when the first judging means judges that such voice data information exists, the voice data information for which the first predetermined time has elapsed. The voice data output means outputs, to the plurality of terminal devices performing data communication in the video conference, the voice data corresponding to the highest output order among the voice data information stored in the voice data information storage means.
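The storage, deletion, and output elements of claim 1 can be pictured with a small queue sketch. It is only an illustration under stated assumptions: the names `VoiceDataInfo`, `store`, `delete_expired`, and `pop_highest` are hypothetical, a smaller number is assumed to mean a higher priority and a higher output order, and ties between entries of equal priority are assumed to be broken by reception start time, which the claim does not specify.

```python
from dataclasses import dataclass


@dataclass
class VoiceDataInfo:
    terminal_id: str
    reception_start: float  # time at which reception of the voice data started
    audio: bytes
    output_rank: int = 0    # 1 is the highest output order (assumption)


def store(queue, info, priority_table):
    """Add an entry and renumber output ranks from the terminal priorities."""
    queue.append(info)
    queue.sort(key=lambda e: (priority_table[e.terminal_id], e.reception_start))
    for rank, entry in enumerate(queue, start=1):
        entry.output_rank = rank


def delete_expired(queue, now, first_predetermined_time):
    """Drop not-yet-output entries whose wait has reached the threshold."""
    queue[:] = [e for e in queue
                if now - e.reception_start < first_predetermined_time]


def pop_highest(queue):
    """Remove and return the entry with the highest output order, if any."""
    if not queue:
        return None
    best = min(queue, key=lambda e: e.output_rank)
    queue.remove(best)
    return best
```

With a priority table such as `{"A": 1, "B": 2, "C": 3}`, an utterance from terminal C that has waited for the whole threshold is dropped by `delete_expired`, and `pop_highest` then returns the waiting utterances of A and B in that order.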

In addition to the configuration of the invention according to claim 1, the video conference apparatus of the invention according to claim 2 comprises: second judging means for judging whether the voice data information stored in the voice data information storage means includes, among the voice data not yet output to the terminal devices by the voice data output means, voice data information whose voice data has an output time shorter than a second predetermined time; and first output order updating means for updating, when the second judging means judges that such voice data information exists, the output order of the voice data information whose output time is shorter than the second predetermined time to the highest order.
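The promotion of short utterances in claim 2 might look like the following sketch. The names `VoiceDataInfo` and `promote_short` are hypothetical, rank 1 is assumed to be the highest output order, and renumbering the remaining entries after a promotion is an assumption of this example rather than something the claim states.

```python
from dataclasses import dataclass


@dataclass
class VoiceDataInfo:
    terminal_id: str
    duration: float   # output time of the voice data, in seconds
    output_rank: int  # 1 is the highest output order (assumption)


def promote_short(queue, second_predetermined_time):
    """Move entries shorter than the threshold to the front and renumber ranks."""
    # sorted() is stable, so entries keep their relative order within each group:
    # short entries (key False) come first, longer entries (key True) follow.
    ordered = sorted(
        sorted(queue, key=lambda e: e.output_rank),
        key=lambda e: e.duration >= second_predetermined_time,
    )
    for rank, entry in enumerate(ordered, start=1):
        entry.output_rank = rank
    return ordered
```

For example, with a threshold of 1 second, a 0.4-second interjection queued behind a 5-second utterance moves to rank 1 while the longer utterances keep their relative order.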

In addition to the configuration of the invention according to claim 1 or 2, the video conference apparatus of the invention according to claim 3 comprises: third judging means for judging whether, before a third predetermined time elapses after the voice data output means starts outputting voice data to the plurality of terminal devices, the voice data receiving means has received voice data from a terminal device specified by terminal identification information whose priority is higher than the priority corresponding to the terminal identification information of the terminal device that transmitted the voice data being output; second deleting means for deleting from the voice data storage control means, when the third judging means judges that such voice data has been received, the voice data information corresponding to the voice data being output, among the voice data information stored in the voice data storage means; and second output order updating means for updating, when the third judging means judges that such voice data has been received, the output order corresponding to the received voice data, in the voice data information stored in the voice data storage means, to the highest output order.
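The early-interruption rule of claim 3 can be condensed into a single decision function. This is a hedged sketch: `Utterance` and `early_interrupt` are hypothetical names, a smaller number is assumed to mean a higher priority, and the strict `<` comparison at the window boundary is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    terminal_id: str
    priority: int           # smaller number = higher priority (assumption)
    reception_start: float  # time at which reception started


def early_interrupt(current, output_start, incoming, third_predetermined_time):
    """Return the utterance that should be output after applying the rule.

    If a higher-priority terminal starts speaking before the window closes,
    the current output is discarded and the incoming utterance takes the
    highest output order; otherwise the current output continues.
    """
    within_window = (incoming.reception_start - output_start
                     < third_predetermined_time)
    if within_window and incoming.priority < current.priority:
        return incoming
    return current
```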

In addition to the configuration of the invention according to any one of claims 1 to 3, the video conference apparatus of the invention according to claim 4 comprises: fourth judging means for judging whether, after a fourth predetermined time has elapsed since the voice data output means started outputting voice data to the plurality of terminal devices, the voice data receiving means has received voice data from a terminal device specified by terminal identification information whose priority is higher than the priority corresponding to the terminal identification information of the terminal device that transmitted the voice data being output; fifth judging means for judging whether a fifth predetermined time has elapsed since the fourth judging means judged that the voice data receiving means received voice data from such a higher-priority terminal device; third deleting means for deleting from the voice data storage control means, when the fifth judging means judges that the fifth predetermined time has elapsed since the voice data was received, the voice data information corresponding to the voice data being output, among the voice data information stored in the voice data storage means; and third output order updating means for updating, when the fifth judging means judges that the fifth predetermined time has elapsed since the voice data was received, the output order corresponding to the received voice data, in the voice data information stored in the voice data storage means, to the highest output order.
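The timing rule of claim 4 can be illustrated by computing the moment at which the output switches to the higher-priority speaker. The function name `late_switch_time`, the convention that a smaller number means a higher priority, and the treatment of the exact boundary of the fourth predetermined time are assumptions of this sketch.

```python
from typing import Optional


def late_switch_time(output_start: float, incoming_start: float,
                     current_priority: int, incoming_priority: int,
                     fourth_time: float, fifth_time: float) -> Optional[float]:
    """Return the time at which the current output is replaced, or None.

    The rule applies only when a higher-priority terminal starts speaking
    after the current output has already run for `fourth_time`; the switch
    then happens `fifth_time` after reception of the new voice data began.
    """
    if incoming_priority >= current_priority:
        return None  # the incoming terminal does not have a higher priority
    if incoming_start - output_start < fourth_time:
        return None  # too early: this rule does not apply
    return incoming_start + fifth_time
```

For instance, with a fourth predetermined time of 3 seconds and a fifth of 2 seconds, a higher-priority utterance that starts 4 seconds into the current output would take over at the 6-second mark.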

The video conference system of the invention according to claim 5 comprises the video conference apparatus according to claim 1 and terminal devices each having an output unit that outputs the voice data output from the video conference apparatus and a display unit that displays image data used in the video conference controlled by the video conference apparatus. The video conference apparatus comprises output order transmitting means for transmitting the output order of the voice data information stored in the voice data storage means to the terminal device specified by the terminal identification information corresponding to that output order. Each terminal device comprises output order receiving means for receiving the output order transmitted by the output order transmitting means of the video conference apparatus, and output order display control means for displaying on the display unit, on the basis of the output order received by the output order receiving means, information notifying the user of the output order.
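The notification path of claim 5 could be sketched as below. Everything here is illustrative: `build_notices` is a hypothetical helper, the input list is assumed to be sorted by output order with the highest first, and the message wording is invented for the example.

```python
def build_notices(waiting_terminal_ids):
    """Return {terminal_id: message} for terminals whose voice data is waiting.

    `waiting_terminal_ids` is assumed to be sorted by output order,
    highest first, so the position in the list is the output order.
    """
    notices = {}
    for rank, terminal_id in enumerate(waiting_terminal_ids, start=1):
        notices[terminal_id] = (
            f"Your utterance is number {rank} in the output order."
        )
    return notices
```

Each terminal would receive only its own entry and show it on its display unit, so that the speaker can tell whether the utterance is being delayed.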

The video conference control method of the invention according to claim 6 is a video conference control method for controlling a video conference conducted among a plurality of terminal devices via a network, and comprises: a voice data receiving step of receiving voice data of an utterance transmitted from a terminal device; an output judging step of judging, when the voice data is received from the terminal device in the voice data receiving step, whether voice data received from another terminal device is being output; priority information storage means that stores, as priority information, terminal identification information identifying the terminal device that transmitted the voice data in association with a priority indicating the order of precedence for outputting the voice data to the terminal devices; a voice data information storage control step of, when it is judged in the output judging step that voice data received from another terminal device is being output, storing in voice data information storage means, as voice data information and on the basis of the priority information stored in the priority information storage means, the voice data received in the voice data receiving step, a reception start time that is the time at which reception of the voice data started, an output order that is information on the order in which the voice data is to be output to the terminal devices, and the terminal identification information identifying the terminal device that transmitted the voice data, in association with one another; a voice data output step of outputting the voice data to the plurality of terminal devices performing data communication in the video conference; a first judging step of judging whether the voice data information stored in the voice data information storage means includes, among the voice data not yet output to the terminal devices, voice data information for which a first predetermined time has elapsed since its reception start time; and a first deleting step of deleting from the voice data information storage means, when it is judged in the first judging step that such voice data information exists, the voice data information for which the first predetermined time has elapsed. In the voice data output step, the voice data corresponding to the highest output order among the voice data information stored in the voice data information storage means is output to the plurality of terminal devices performing data communication in the video conference.

The program for a video conference apparatus of the invention according to claim 7 causes a computer to function as the various processing means of the video conference apparatus according to any one of claims 1 to 4.

The video conference apparatus according to claim 1 includes output judging means for judging, when voice data of an utterance is received from a terminal device connected to a video conference conducted via a network, whether voice data received from another terminal device is being output. When the output judging means judges that voice data received from another terminal device is not being output, voice data information in which the terminal identification information, the received voice data, the output order, and the reception start time are associated with one another is stored in the voice data information storage means in accordance with the priority of each terminal device stored in the priority information storage means. If, among the voice data not yet output, there is voice data for which the first predetermined time has elapsed since its reception start time, that voice data is deleted by the first deleting means. The voice data output means outputs, to the plurality of terminals performing data communication in the video conference, the voice data corresponding to the voice data information with the highest output order among the voice data information stored in the voice data information storage means. The video conference apparatus thus determines the output order of received voice data according to a predetermined priority, and can output the voice data to each terminal device connected to the video conference in descending output order. In addition, voice data that has been delayed for the first predetermined time or longer is deleted and is therefore never output. As a result, voice data is not output later than necessary, which reduces the possibility that the tempo of conversation is disturbed during a video conference.

The video conference apparatus according to claim 2 provides the effects of the invention according to claim 1 and additionally includes second judging means for judging whether the voice data not yet output includes voice data whose output time is shorter than the second predetermined time. It also includes first output order updating means for updating, when the second judging means judges that such voice data exists, the output order corresponding to the voice data shorter than the second predetermined time to the highest order. Consequently, voice data whose output time is shorter than the second predetermined time is output simultaneously with other voice data. Short remarks made by participants, such as back-channel responses (aizuchi), can thus be included in the remarks heard in the video conference, which encourages more natural conversation.

The video conference apparatus according to claim 3 provides the effects of the invention according to claim 1 or 2 and additionally includes third judging means for judging whether, before the third predetermined time elapses after the voice data output means starts outputting voice data to the plurality of terminal devices, voice data has been received from a terminal device whose priority is higher than that of the terminal device that transmitted the voice data being output. It also includes second deleting means for deleting, when the third judging means judges that such voice data has been received, the voice data information corresponding to the voice data being output from the voice data information stored in the voice data storage means. It further includes second output order updating means for updating, when the third judging means judges that such voice data has been received, the output order corresponding to the received voice data, in the voice data information stored in the voice data storage means, to the highest output order. Consequently, even while other voice data is being output in the video conference, voice data transmitted from a terminal device with a higher priority is output preferentially. As a result, the remarks of the user of a terminal device given a high priority in advance can be output without delay.

The video conference apparatus according to claim 4 provides the effects of the invention according to any one of claims 1 to 3 and additionally includes fifth judging means for judging whether the fifth predetermined time has elapsed since it was judged that, after the fourth predetermined time had elapsed from the start of output of voice data, voice data was received from a terminal device whose priority is higher than that of the terminal device that transmitted the voice data being output. It also includes third deleting means for deleting the voice data information corresponding to the voice data being output when the fifth judging means judges that the fifth predetermined time has elapsed since the voice data was received. It further includes third output order updating means for updating, when the fifth judging means judges that the fifth predetermined time has elapsed, the output order corresponding to the received voice data to the highest output order. Consequently, if voice data transmitted from a low-priority terminal device has been output in the video conference for a predetermined time or longer, voice data transmitted from a high-priority terminal device is output once the fifth predetermined time has elapsed from its reception start time. As a result, voice data that is transmitted from a high-priority terminal device and whose output time is a predetermined time or longer can be output with little delay.

The video conference system according to claim 5 comprises the video conference apparatus according to claim 1 and terminal devices each having an output unit that outputs voice data and a display unit that displays image data used in the video conference. The video conference apparatus includes output order transmitting means for transmitting an output order to the terminal device specified by the terminal identification information corresponding to that output order. Each terminal device receives, with its output order receiving means, the output order transmitted by the output order transmitting means and, with its output order display control means, displays on the display unit information notifying the user of the output order on the basis of the received output order. The user of a terminal device can thus learn the position at which his or her own remark will be output and can tell whether the remark is being delayed.

The video conference control method according to claim 6 includes an output judging step of judging, when voice data of an utterance is received from a terminal device connected to a video conference conducted via a network, whether voice data received from another terminal device is being output. When it is judged in the output judging step that voice data received from another terminal device is not being output, voice data information in which the terminal identification information, the received voice data, the output order, and the reception start time are associated with one another is stored in the voice data information storage means in accordance with the priority of each terminal device stored in the priority information storage means. If, among the voice data not yet output, there is voice data for which the first predetermined time has elapsed since its reception start time, that voice data is deleted in the first deleting step. In the voice data output step, the voice data corresponding to the voice data information with the highest output order among the voice data information stored in the voice data information storage means is output to the plurality of terminals performing data communication in the video conference. The video conference apparatus thus determines the output order of received voice data according to a predetermined priority, and can output the voice data to each terminal device connected to the video conference in descending output order. In addition, voice data that has been delayed for the first predetermined time or longer is deleted and is therefore never output. As a result, voice data is not output later than necessary, which reduces the possibility that the tempo of conversation is disturbed during a video conference.

The program for a video conference apparatus according to claim 7 causes a computer to function as the various processing means of the video conference apparatus according to any one of claims 1 to 4. Therefore, by causing a computer to execute the program, the effects of the invention according to any one of claims 1 to 4 can be obtained.

A diagram showing an example of the connection configuration of the video conference system 1.
A block diagram showing the electrical configuration of the video conference apparatus 100.
A schematic diagram showing the structure of the voice data table 1100.
A schematic diagram showing the structure of the storage areas of the HDD 104 of the video conference apparatus 100.
A schematic diagram showing the structure of the priority order table 1200.
A block diagram showing the electrical configuration of the terminal device 200.
A diagram showing a specific example of the video conference screen 271 displayed on the monitor 270 of the terminal device 200.
A flowchart of the main processing executed by the video conference apparatus 100.
A flowchart of the voice data deletion processing subroutine executed by the video conference apparatus 100.
A flowchart of the output judgment processing subroutine executed by the video conference apparatus 100.
A timing chart showing an example of voice data and output voice data.
A flowchart of the output order setting processing subroutine executed by the video conference apparatus 100.
A timing chart showing an example of voice data and output voice data.
A timing chart showing an example of voice data and output voice data.
A flowchart of the output order information display processing subroutine executed by the terminal device 200.
A diagram showing a specific example of the video conference screen 271 on which output order information is displayed.
A flowchart of the output order setting processing subroutine executed by the video conference apparatus 100 in a modification.
A timing chart showing an example of voice data and output voice data.

A video conference system 1 according to an embodiment of the present invention is described below with reference to the drawings. First, an overview of the video conference system 1 is given with reference to FIG. 1.

As shown in FIG. 1, a plurality of terminal devices 200 are connected to the video conference apparatus 100 via the network 2, and a video conference is held by exchanging images and audio among the terminal devices 200. FIG. 1 shows three terminal devices 200, but any number of two or more suffices as long as a video conference can be held via the network 2. The terminal devices 200 may, for example, all be located at a single site of one company, or they may be spread across different offices, regions, or countries.

The video conference apparatus 100 is, for example, a well-known general-purpose personal computer. It realizes a video conference among the terminal devices 200 by relaying the images, audio, data, and the like transmitted from them. The video conference apparatus 100 combines the individual image data transmitted from each terminal device 200 to create the display image data shared in the video conference. It then transmits the audio and the combined image data to the terminal devices 200 connected to the video conference.

The terminal device 200 is, for example, a well-known general-purpose personal computer. The terminal device 200 is provided with a camera 250 (see FIG. 6) for capturing images used in the video conference and a microphone 240 (see FIG. 6) for capturing the voices of the video conference participants. It is also provided with a monitor 270 (see FIG. 6) for displaying images received from the video conference apparatus 100 and a speaker 260 (see FIG. 6) for outputting audio.

Next, the electrical configuration of the video conference apparatus 100 is described with reference to the block diagram of FIG. 2. As shown in FIG. 2, the video conference apparatus 100 includes a CPU 101 that controls the video conference apparatus 100. A ROM 102, a RAM 103, a hard disk drive (HDD) 104, a counter 105, a display control unit 106, an input control unit 107, a timing device 108, and a communication control unit 109 are connected to the CPU 101 via a bus 111.

The ROM 102 stores the program that starts the BIOS executed by the CPU 101, together with setting values. The RAM 103 temporarily stores various data. The RAM 103 is also provided with an audio data table storage area 1301, in which the audio data table 1100 (see FIG. 3) is stored. The audio data table 1100 is described in detail later. The audio data table storage area 1301 is a shared memory area used to share information between different processes. The HDD 104 stores the various programs executed by the video conference apparatus 100. The counter 105 measures time as a timer.

The display control unit 106 is connected to a monitor 120 that displays an operation screen, and controls the display of the monitor 120. The input control unit 107 is connected to a keyboard 130 and a mouse 140, with which the user enters operations, and controls their input. The timing device 108 keeps time as an internal clock. The communication control unit 109 is connected to the network 2 and controls the transmission and reception of data to and from the terminal devices 200.

The audio data table 1100 is now described with reference to FIG. 3. The audio data table 1100 is created in the audio data table storage area 1301 of the RAM 103 when the main processing (see FIG. 8) of the video conference apparatus 100 is executed.

The audio data table 1100 stores records (hereinafter referred to as "audio data information") that associate a terminal ID, an audio data name, an output order, an invalid flag, a status flag, and a start time with one another. The terminal ID identifies the terminal device 200 that transmitted the audio data. In the present embodiment, as an example, the terminal ID is the MAC address of the terminal device 200. Any unique information that identifies the terminal device 200 may be used instead, such as the IP address of the terminal device 200.

The audio data name is a file name that identifies the audio data received from a terminal device 200 connected to the video conference. The output order is the position at which the audio data received from each terminal device 200 is output to all the terminal devices 200 connected to the video conference system 1.

The invalid flag indicates whether output of the stored audio data to the terminal devices 200 connected to the video conference has become invalid. A value of "1" indicates that output of the stored audio data has become invalid. A value of "0" indicates that output of the stored audio data is still valid.

The status flag stores the state of the audio data identified by the corresponding audio data name. A value of "0" indicates that the audio data is waiting to be output to the terminal devices 200 connected to the video conference. A value of "1" indicates that the audio data is currently being output to the terminal devices 200 connected to the video conference. A value of "2" indicates that output of the audio data has finished.

The start time is the time at which reception of the audio data from a terminal device 200 connected to the video conference system 1 began.
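
The record layout described above can be pictured with a short Python sketch. This is only an illustration, not the data format of the embodiment: the class name, the field names, the enum, and the use of seconds for the start time are all assumptions made for the example, while the flag values 0, 1, and 2 follow the description.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Status flag values as described for the audio data table 1100."""
    WAITING = 0     # waiting to be output
    OUTPUTTING = 1  # currently being output
    FINISHED = 2    # output has finished


@dataclass
class AudioDataInfo:
    """One row of audio data information (field names are hypothetical)."""
    terminal_id: str          # e.g. the MAC address of the sending terminal
    audio_name: str           # file name identifying the stored audio data
    output_order: int         # position in the output order (1 = first)
    invalid: int = 0          # 1 = output has become invalid, 0 = still valid
    status: Status = Status.WAITING
    start_time: float = 0.0   # time at which reception began (seconds, assumed)


# A row such as the one for audio data "A1" sent by "terminal A".
row = AudioDataInfo("terminal A", "A1", output_order=1, start_time=0.0)
```

New rows start with both flags at 0, which matches the initial values given for S18 in the main processing.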

Next, the HDD 104 is described in detail with reference to FIG. 4. As shown in FIG. 4, the HDD 104 has a plurality of storage areas, including a priority table storage area 1401, a program storage area 1402, and a program-related information storage area 1403.

The priority table storage area 1401 stores the priority table 1200 (see FIG. 5), which is described in detail later. The program storage area 1402 stores the programs that cause the video conference apparatus 100 to execute its various processes. These programs are, for example, stored on a CD-ROM, installed via a CD-ROM drive (not shown), and stored in the program storage area 1402. Alternatively, a program downloaded via the network 2 may be stored in the program storage area 1402. The program-related information storage area 1403 stores information needed to run the programs, such as settings, initial values, and data.

Next, the priority table 1200 is described in detail with reference to FIG. 5. The priority table 1200 stores priority information in which the terminal ID of each terminal device 200 participating in the video conference is associated in advance with a priority. This priority is used to decide which audio data is output to the terminal devices 200 in the video conference (hereinafter referred to as "output audio data"). In the present embodiment, the MAC address is used as the terminal ID, but any unique information that identifies the terminal device 200 within the video conference system 1 may be used.

Next, the electrical configuration of the terminal device 200 is described with reference to the block diagram of FIG. 6. As shown in FIG. 6, the terminal device 200 includes a CPU 201 that controls the terminal device 200. A ROM 202, a RAM 203, a hard disk drive (HDD) 204, a counter 205, a timing device 206, an input/output control unit 207, and a communication control unit 208 are connected to the CPU 201 via a bus 211.

The ROM 202 stores the program that starts the BIOS executed by the CPU 201, together with setting values. The RAM 203 temporarily stores various data. The HDD 204 stores the various programs executed by the terminal device 200. The counter 205 measures time as a timer. The timing device 206 keeps time as an internal clock.

Connected to the input/output control unit 207 are a keyboard 220 and a mouse 230, with which the user enters operations, a microphone 240 that captures spoken audio data for the video conference, and a camera 250 that captures image data for the video conference. Also connected to the input/output control unit 207 are a speaker 260, which outputs the audio data transmitted from the video conference apparatus 100 during a video conference, and a monitor 270, which displays the image data transmitted from the video conference apparatus 100. The communication control unit 208 is connected to the network 2 and controls the transmission and reception of data to and from external devices such as the video conference apparatus 100.

Next, the video conference screen 271 displayed on the monitor 270 of the terminal device 200 is described with reference to FIG. 7.

When a video conference is held, the video conference screen 271 is displayed on the monitor 270 of each terminal device 200, as shown in FIG. 7. The video conference screen 271 has one individual area 272 per terminal device 200, which displays the image data that the corresponding terminal device 200 connected to the video conference transmitted to the video conference apparatus 100. The image data displayed in each individual area 272 is the image data captured by the camera 250 of that terminal device 200.

Next, the main processing executed by the video conference apparatus 100 is described with reference to FIG. 8. The main processing of FIG. 8 is executed when the video conference apparatus 100 is powered on, and it ends automatically when the video conference apparatus 100 is powered off.

When the main processing of FIG. 8 is executed, the priority table 1200 stored in the priority table storage area 1401 of the HDD 104 is acquired (S11). The audio data deletion processing (see FIG. 9) is then started (S12). At this point a process separate from the main processing is created, and the audio data deletion processing runs in it. The process running the main processing is the parent process, and the audio data deletion processing started from the main processing runs as a child process. The audio data deletion processing is described in detail later.

Next, the output determination processing (see FIG. 10) is started (S13). At this point a process separate from the main processing is created, and the output determination processing runs in it. The process running the main processing is the parent process, and the output determination processing started from the main processing runs as a child process. The output determination processing is described in detail later.

Next, it is determined whether audio data containing the user's speech has been received from a terminal device 200 (S14). This audio data is captured by the microphone 240 of the terminal device 200, and the determination is made according to whether the received audio data exceeds a predetermined volume level. Audio data that does not reach the predetermined volume level is regarded as noise picked up by the microphone 240.
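
As a rough sketch of the S14 test, the Python function below treats a chunk of audio as speech when its peak amplitude exceeds a threshold. The text only says that a predetermined volume level is compared; the use of peak amplitude over integer PCM samples, the threshold value, and the names `VOLUME_THRESHOLD` and `is_speech` are assumptions for illustration.

```python
VOLUME_THRESHOLD = 1000  # hypothetical level; the source gives no number


def is_speech(samples, threshold=VOLUME_THRESHOLD):
    """Return True if the chunk exceeds the volume threshold (S14: YES).

    `samples` is assumed to be a sequence of signed integer PCM samples.
    Chunks at or below the threshold are treated as microphone noise.
    """
    peak = max((abs(s) for s in samples), default=0)
    return peak > threshold
```

An empty chunk has a peak of 0 and is therefore treated as noise.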

When it is determined that audio data containing the user's speech has been received (S14: YES), that is, when audio data exceeding the predetermined volume level has been received, only the portion produced by the user's speech is extracted from the received audio data (S15). The extracted audio data is then stored in an audio data storage area (not shown) of the RAM 103 in association with the terminal ID of the terminal device 200 that transmitted it (S16). This audio data storage area is a shared memory area used to share information between different processes.

Next, it is determined whether audio data other than the audio data stored in the audio data storage area in S16 is currently being output to all the terminal devices 200 connected to the video conference (S17). That is, the determination depends on whether the audio data table 1100 contains audio data information for other audio data whose status flag is "1". If other audio data is being output (S17: YES), audio data information for the audio data stored in S16 is stored in the audio data table 1100 (S18).

In this record, the terminal ID is that of the terminal device 200 that transmitted the audio data. The audio data name is the file name of the audio data stored in the audio data storage area. The output order is the largest output order currently stored in the audio data table 1100 plus 1. The invalid flag is set to its initial value of "0". The status flag is set to its initial value of "0". The start time is the time at which reception of the audio data stored in the audio data storage area began.
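
The S18 registration step can be sketched in Python as follows. The table is modelled as a list of dictionaries, and the function name and keys are hypothetical. Treating an empty table as having a largest output order of 0 is also an assumption, since the text describes S18 only for the case where other audio data is already being output.

```python
def register_audio(table, terminal_id, audio_name, start_time):
    """Append audio data information for newly stored audio data (S18)."""
    # Largest output order currently in the table; 0 for an empty table
    # is an assumption not stated in the source.
    highest = max((row["output_order"] for row in table), default=0)
    row = {
        "terminal_id": terminal_id,
        "audio_name": audio_name,
        "output_order": highest + 1,  # queued behind everything stored so far
        "invalid": 0,                 # initial value
        "status": 0,                  # initial value: waiting
        "start_time": start_time,     # time reception began
    }
    table.append(row)
    return row
```

For example, with rows of output order 1 and 2 already present, a newly registered row receives output order 3.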

It is then determined whether any audio data has been delayed for the predetermined time T1 or longer (S19). That is, it is determined whether the audio data information stored in the audio data table 1100 includes an entry for which T1 or more has elapsed since its start time. Specifically, T1 is added to the start time of each entry and the result is compared with the current time, which is obtained from the timing device 108; an entry is delayed if the current time has reached or passed that result. T1 need only be short enough not to disturb the tempo of the conversation and is, as an example, 10 seconds in the present embodiment.

When it is determined that audio data containing the user's speech has not been received (S14: NO), that is, when the transmitted audio data does not reach the predetermined volume level and is judged to be noise picked up by the microphone 240, the processing proceeds to S19.

When it is determined that no audio data other than the audio data stored in the audio data storage area in S16 is being output (S17: NO), that is, when the audio data table 1100 contains no audio data information for other audio data whose status flag is "1", the processing proceeds to S19.

When it is determined in S19 that some audio data has been delayed for the predetermined time T1 or longer (S19: YES), that is, when the audio data table 1100 contains audio data information for which T1 or more has elapsed since the start time, the invalid flag of that audio data information is set to "1" (S20). The output order setting processing is then performed (S21). The output order setting processing is described in detail later.

When no audio data has been delayed for the predetermined time T1 or longer (S19: NO), that is, when the audio data table 1100 contains no audio data information for which T1 or more has elapsed since the start time, the processing proceeds to S21.
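
The delay check of S19 and the flagging of S20 might look like the Python sketch below. The table is modelled as a list of dictionaries with hypothetical keys, times are assumed to be in seconds, and the function name is illustrative. The sketch applies the test to every row, as the text states; whether rows that are currently being output are exempt is not stated in this passage.

```python
T1_SECONDS = 10.0  # example value of T1 given for the embodiment


def invalidate_delayed(table, now, t1=T1_SECONDS):
    """Set the invalid flag on rows delayed by T1 or longer (S19, S20).

    Returns the audio data names that were newly flagged.
    """
    flagged = []
    for row in table:
        # Delayed when T1 or more has elapsed since reception began.
        if row["invalid"] == 0 and now >= row["start_time"] + t1:
            row["invalid"] = 1
            flagged.append(row["audio_name"])
    return flagged
```

In this model the flagged rows are not removed here; removal is left to the separate audio data deletion processing.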

After the output order setting processing is executed in S21, it is determined whether there is audio data that can be output to all the terminal devices 200 connected to the video conference (S22). The determination depends on whether the audio data table 1100 contains audio data information whose output order is "1" and whose status flag is other than "2".

When it is determined that there is audio data that can be output (S22: YES), the output order is transmitted to the terminal devices 200 connected to the video conference (S23). That is, each terminal device 200 identified by a terminal ID in the audio data information stored in the audio data table 1100 is sent its corresponding output order. With the audio data table 1100 shown in FIG. 3, for example, output order "1" is sent to the terminal device 200 whose terminal ID is "terminal A", output order "2" is sent to the terminal device 200 whose terminal ID is "terminal B", and output order "3" is sent to the terminal device 200 whose terminal ID is "terminal C".

Next, the audio data whose audio data information has output order "1" is output to all the terminal devices 200 connected to the video conference (S24), and the processing returns to S14. With the audio data table 1100 shown in FIG. 3, for example, the audio data identified by the audio data name "A1" is read from the audio data storage area of the RAM 103 and output to all the terminal devices 200 connected to the video conference. At that time, the status flag of the audio data output in S24 is set in the audio data table 1100 to "1", indicating that it is being output. When output of that audio data finishes, its status flag is automatically set to "2" and its invalid flag is automatically set to "1". When it is determined that there is no audio data that can be output (S22: NO), the processing returns to S14.
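
Steps S22 to S24 can be sketched in Python as below, using a list of dictionaries for the audio data table. The function name and keys are hypothetical, the actual transmission to the terminals is replaced by returning the values that would be sent, and the output-eligibility test uses the output order, on the reading that the audio data table holds no priority field.

```python
def start_next_output(table):
    """Pick the audio data to output, if any (S22 to S24).

    Returns (audio_name, orders) where `orders` maps each terminal ID to
    the output order it would be sent in S23, or None for S22: NO.
    """
    # S22: a row with output order 1 whose output has not finished.
    row = next(
        (r for r in table if r["output_order"] == 1 and r["status"] != 2),
        None,
    )
    if row is None:
        return None
    # S23: each terminal in the table is notified of its own output order.
    orders = {r["terminal_id"]: r["output_order"] for r in table}
    # S24: mark the selected audio data as being output.
    row["status"] = 1
    return row["audio_name"], orders
```

With the three rows used in the FIG. 3 example (terminal A, terminal B, and terminal C with output orders 1, 2, and 3), the sketch selects "A1" and reports orders 1, 2, and 3 to the respective terminals.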

Next, the audio data deletion processing executed by the video conference apparatus 100 is described with reference to FIG. 9. The audio data deletion processing of FIG. 9 is started as a child process in S12 of the main processing of FIG. 8, and it ends automatically when the main processing of FIG. 8 ends.

When the audio data deletion processing is executed, it is determined whether the audio data table 1100 contains audio data information whose invalid flag is "1" (S31).

When it is determined that the audio data table 1100 contains audio data information whose invalid flag is "1" (S31: YES), that audio data information is deleted (S32). That is, the audio data information whose invalid flag is "1" is deleted from the audio data table 1100, and the audio data corresponding to its audio data name is deleted from the audio data storage area of the RAM 103. The processing then returns to S31.

When it is determined that the audio data table 1100 contains no audio data information whose invalid flag is "1" (S31: NO), the determination of S31 is performed again.

As a result, while the audio data deletion processing is running, the invalid flags in the audio data table 1100 are monitored continuously, and whenever processing in another process changes an invalid flag to "1", the corresponding audio data information and audio data are deleted.
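
One pass of the S31 and S32 loop might be sketched in Python as follows. The table is a list of dictionaries and the audio data storage area is modelled as a dictionary keyed by audio data name; both structures and the function name are assumptions. The shared memory and the separate child process of the embodiment are not modelled here.

```python
def delete_invalid(table, audio_store):
    """Remove rows whose invalid flag is 1, plus their audio (S31, S32).

    Returns the number of rows removed; 0 corresponds to S31: NO.
    """
    removed = [row for row in table if row["invalid"] == 1]
    for row in removed:
        # Delete the stored audio data that the row's name refers to.
        audio_store.pop(row["audio_name"], None)
    # Keep only the rows that are still valid, updating the list in place
    # so that other holders of `table` see the change.
    table[:] = [row for row in table if row["invalid"] != 1]
    return len(removed)
```

In the embodiment this check repeats for as long as the main processing runs; the sketch would correspond to the body of that loop.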

Next, the output determination processing executed by the video conference apparatus 100 is described with reference to FIG. 10. The output determination processing of FIG. 10 is started as a child process in S13 of the main processing of FIG. 8, and it ends automatically when the main processing of FIG. 8 ends.

When the output determination processing is executed, it is determined whether there is audio data ranked first in the output order (S41). That is, it is determined whether the audio data table 1100 contains audio data information whose output order is "1". When there is none (S41: NO), the determination of S41 is performed again.

When it is determined that there is audio data ranked first in the output order (S41: YES), that is, when the audio data table 1100 contains audio data information whose output order is "1", it is determined whether output of that audio data has finished (S42). The determination depends on whether the status flag of the audio data information whose output order is "1" is "2".

When it is determined that output of the audio data ranked first has not finished (S42: NO), that is, when the status flag of the audio data information whose output order is "1" is not "2", the processing returns to S41.

When it is determined that output of the audio data ranked first has finished (S42: YES), that is, when the status flag of the audio data information whose output order is "1" is "2", it is determined whether there is audio data ranked second or lower in the output order (S43). That is, it is determined whether the audio data table 1100 contains audio data information whose output order is "2" or greater.

When it is determined that there is no audio data ranked second or lower (S43: NO), that is, when the audio data table 1100 contains no audio data information whose output order is "2" or greater, the processing returns to S41.

When it is determined that there is audio data ranked second or lower (S43: YES), that is, when the audio data table 1100 contains audio data information whose output order is "2" or greater, the output order in the audio data table 1100 is updated (S44) and the processing returns to S41. In this update, every output order stored in the audio data table 1100 is decreased by 1. An entry whose output order was "1" therefore becomes "0".
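
A single pass of the output determination processing (S41 to S44) can be sketched in Python as below. The table is a list of dictionaries with hypothetical keys, and the function returns whether the output order was updated; the continuous loop and the separate child process of the embodiment are left out.

```python
def advance_output_order(table):
    """One pass of the output determination processing (S41 to S44).

    Returns True if every output order was decreased by 1 (S44).
    """
    first = [r for r in table if r["output_order"] == 1]
    if not first:
        return False  # S41: NO, nothing is ranked first
    if any(r["status"] != 2 for r in first):
        return False  # S42: NO, the first entry has not finished
    if not any(r["output_order"] >= 2 for r in table):
        return False  # S43: NO, nothing is waiting behind it
    for r in table:
        r["output_order"] -= 1  # S44: the finished entry drops to 0
    return True
```

After a successful pass, the entry that was second becomes first, so the S22 test in the main processing can select it next.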

Next, the operation of the main processing described above is illustrated with the timing chart of FIG. 11. FIG. 11 shows how the output audio data output from the video conference apparatus 100 changes over time, based on the audio data (denoted "A1", "B1", and "C1") received from the terminal devices 200 connected to the video conference.

Here, as shown in FIG. 3, the priorities stored in the priority order table 1200 are assumed to be “1” for the terminal device 200 that transmitted audio data (A1), “2” for the terminal device 200 that transmitted audio data (B1), and “3” for the terminal device 200 that transmitted audio data (C1).

First, reception of audio data (A1) starts at time tA1. At time tA1, no audio data received from another terminal device 200 is being output, so audio data (A1) is output as the output audio data to each terminal device 200 connected to the video conference.

Next, reception of audio data (C1) starts at time tC1. At time tC1, audio data (A1) is being output, and the terminal device 200 that transmitted audio data (A1) has a higher priority than the terminal device 200 that transmitted audio data (C1), so audio data (C1) is not output.

Next, reception of audio data (B1) starts at time tB1. At time tB1, audio data (A1) is being output, and the terminal device 200 that transmitted audio data (A1) has a higher priority than the terminal device 200 that transmitted audio data (B1), so audio data (B1) is not output.

Next, reception and output of audio data (A1) end at time t1. At time t1 the output order is updated: the output order of audio data (B1) becomes “1” and that of audio data (C1) becomes “2”. Audio data (B1) then becomes the output audio data, and its output starts.

Next, at time t2, when the predetermined time T1 has elapsed since reception of audio data (C1) started, the invalid flag of the audio data information corresponding to audio data (C1) is updated to “1”, and both that audio data information and audio data (C1) are deleted.

Next, output of audio data (B1), which is the output audio data, ends at time t3.

Time t2 is the timing at which the main process described above determines whether there is audio data that has been delayed for the time T1 or longer (see S19 in FIG. 8).
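The T1 check of S19 and S20, applied to the FIG. 11 scenario, might look like the following sketch. The concrete times, the value of T1, the dictionary layout, and the function name `mark_delayed` are all hypothetical; the sketch also assumes that only audio data that is not currently being output counts as delayed.

```python
def mark_delayed(table, now, t1):
    """Sketch of S19-S20: flag waiting entries delayed for T1 or longer."""
    flagged = []
    for row in table:
        waiting = row["output_order"] != 1      # not currently being output
        if waiting and now - row["start"] >= t1:
            row["invalid"] = 1                  # S20: set the invalid flag
            flagged.append(row["name"])
    return flagged


# Hypothetical times in seconds for the FIG. 11 scenario, with T1 = 5.
T1 = 5.0
table = [
    {"name": "B1", "start": 3.0, "output_order": 1, "invalid": 0},  # being output
    {"name": "C1", "start": 2.0, "output_order": 2, "invalid": 0},  # waiting
]
flagged_early = mark_delayed(table, now=6.5, t1=T1)  # before t2 = tC1 + T1
flagged_at_t2 = mark_delayed(table, now=7.0, t1=T1)  # at t2
```

Flagged entries are only marked here; in the description, the actual removal of the audio data and its audio data information is left to the separate deletion processing.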

Next, the output order setting process executed by the video conference apparatus 100 is described with reference to FIG. 12. The output order setting process of FIG. 12 is called from the main process of the video conference apparatus 100 at S21 in FIG. 8.

When the output order setting process is executed, it is determined whether there is audio data whose output order is first (S51), that is, whether the audio data table 1100 contains audio data information whose output order value is “1”. When it is determined that there is no audio data whose output order is first (S51: NO), that is, when the audio data table 1100 contains no audio data information whose output order value is “1”, the output order in the audio data table 1100 is set based on the priority order table 1200 stored in the priority order table storage area 1401 (S53).

In S53, output orders are assigned to the audio data in descending order of the priority stored in the priority order table 1200, each item of audio data being matched to a priority through the terminal ID of the terminal device 200 that transmitted it. It is then determined whether there is audio data whose duration does not exceed the predetermined time T2 (S54). Specifically, among the audio data information whose output order value in the audio data table 1100 is “2” or greater, it is determined whether any has a duration, that is, the time from the reception start time to the reception end time, that does not exceed T2. In the present embodiment, T2 is intended to capture short phrases such as back-channel responses (aizuchi) from video conference participants, and is set to “1 second” as an example. The reception end time of audio data is calculated from the data size of the audio data, and the duration is calculated by subtracting the reception start time from the reception end time.

When it is determined that there is audio data whose output order is first (S51: YES), that is, when the audio data table 1100 contains audio data information whose output order value is “1”, output orders are set for the audio data whose output order is second or lower (S52). Here, the output orders of the audio data ranked second or lower are set based on the priority order table 1200. The process then proceeds to S54.
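The priority-based assignment of S51 to S53 can be illustrated with a short Python sketch. The function name `set_output_order`, its parameters, and the use of a plain dictionary for the priority order table 1200 are hypothetical choices made for the example.

```python
def set_output_order(waiting_ids, priority, first_exists):
    """Sketch of S51-S53: order waiting audio data by terminal priority.

    waiting_ids  -- terminal IDs whose audio data still needs an output order
    priority     -- mapping of terminal ID to priority (1 = highest)
    first_exists -- True when some audio data already has output order 1
    """
    first_order = 2 if first_exists else 1   # S52 leaves order 1 untouched
    ranked = sorted(waiting_ids, key=lambda tid: priority[tid])
    return {tid: first_order + i for i, tid in enumerate(ranked)}


priority_table = {"A": 1, "B": 2, "C": 3}
while_a_is_output = set_output_order(["C", "B"], priority_table, True)
nothing_output = set_output_order(["C", "A", "B"], priority_table, False)
```

In the first call terminal A is assumed to hold output order 1 already, so B and C receive 2 and 3; in the second call no audio data is being output and the orders start from 1.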

When it is determined that there is audio data whose duration does not exceed the predetermined time T2 (S54: YES), the output order of the audio data information whose duration does not exceed T2 is set to first in the audio data table 1100 (S55). As a result, the audio data table 1100 contains more than one item of audio data information whose output order value is “1”, and those items of audio data are output simultaneously.
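As a rough illustration of S54 and S55, the following sketch promotes short utterances to output order 1. The row layout and the name `promote_short_audio` are hypothetical, and representing audio data that is still being received with `end = None` is an assumption made for the example.

```python
def promote_short_audio(table, t2):
    """Sketch of S54-S55: give output order 1 to waiting audio data
    whose duration (end - start) does not exceed T2."""
    promoted = []
    for row in table:
        if row["output_order"] < 2 or row["end"] is None:
            continue                      # only finished, waiting audio data
        if row["end"] - row["start"] <= t2:
            row["output_order"] = 1       # S55: output together with order 1
            promoted.append(row["name"])
    return promoted


T2 = 1.0  # the embodiment uses 1 second as an example
table = [
    {"name": "A1", "start": 0.0, "end": None, "output_order": 1},
    {"name": "B1", "start": 1.0, "end": 1.6, "output_order": 2},  # 0.6 s
    {"name": "C1", "start": 0.5, "end": 4.0, "output_order": 3},  # 3.5 s
]
promoted = promote_short_audio(table, T2)
```

Here only B1 is short enough to be promoted, so A1 and B1 both hold output order 1 and would be output at the same time, while C1 keeps waiting.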

It is then determined whether there is audio data information that satisfies tA1 < tB1 + T3 (S56). Here, tA1 is the reception start time of audio data received from the terminal device 200 whose priority in the priority order table 1200 is first, and is stored in the audio data table 1100. tB1 is the reception start time of audio data received from a terminal device 200 whose priority in the priority order table 1200 is second or lower, and is likewise stored in the audio data table 1100. T3 is a predetermined time, set to “1 second” as an example in the present embodiment. In effect, S56 determines whether output of the audio data with start time tA1 began before T3 had elapsed since output of the audio data with start time tB1 began.

When it is determined that there is audio data that satisfies tA1 < tB1 + T3 (S56: YES), the invalid flag corresponding to the audio data received from the terminal device 200 whose priority is second or lower is set to “1” in the audio data table 1100 (S57). In other words, if output of the audio data with start time tA1 began before T3 had elapsed since output of the audio data with start time tB1 began, the invalid flag of the audio data information with start time tB1 is set to “1” in the audio data table 1100.

Next, the output order corresponding to the audio data received from the terminal device 200 with the higher priority is set to first (S58). That is, in the audio data table 1100, the output order of the audio data information with start time tA1 is set to “1”. The output orders from “2” onward in the audio data table 1100 are then updated (S59), and the output order setting process ends.
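Steps S56 to S59 can be sketched as below. The sketch rests on an assumption that goes slightly beyond the stated formula: the test tA1 < tB1 + T3 is applied only to audio data that is currently being output (output order 1) and that started no later than tA1, which matches the accompanying explanation of S56. The row layout and the function name `preempt_for_top_priority` are hypothetical.

```python
def preempt_for_top_priority(table, priority, t3):
    """Sketch of S56-S59; returns True when a preemption took place."""
    top = next((r for r in table if priority[r["terminal"]] == 1), None)
    if top is None:
        return False
    preempted = False
    for row in table:
        if row is top or row["output_order"] != 1:
            continue
        # Assumed reading of S56: tB1 <= tA1 < tB1 + T3.
        if row["start"] <= top["start"] < row["start"] + t3:
            row["invalid"] = 1                                   # S57
            preempted = True
    if not preempted:
        return False
    top["output_order"] = 1                                      # S58
    rest = [r for r in table if r is not top and not r["invalid"]]
    rest.sort(key=lambda r: priority[r["terminal"]])
    for order, row in enumerate(rest, start=2):                  # S59
        row["output_order"] = order
    return True
```

If audio data from the priority-1 terminal arrives 0.5 seconds after a lower-priority utterance started and T3 is 1 second, the lower-priority entry is flagged invalid and the new audio data takes output order 1; if it arrives 1.5 seconds later, nothing changes.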

Next, the operation of the video conference apparatus 100 described above is explained, by way of example, with reference to the timing chart of FIG. 13. FIG. 13 shows how the output audio data changes over time, based on the audio data (referred to as “A1”, “B1”, and “C1”) received from the terminal devices 200 connected to the video conference.

Next, reception of audio data (B1) ends at time t4. At this point the predetermined time T2 has not elapsed since reception of audio data (B1) started, so the output order of audio data (B1) is set to “1” and its output starts. In other words, audio data (B1) is treated as a short phrase such as a back-channel response from a video conference participant, and audio data (A1) and audio data (B1) are output simultaneously.

Next, reception of audio data (A1) ends at time t5. At this point the time by which audio data (C1) has been delayed does not exceed T1, so the output order of audio data (C1) is set to “1” and its output starts.

Time t4 is the timing at which the output order setting process described above determines whether there is audio data whose duration does not exceed T2 (see S54 in FIG. 12).

Next, the operation of the video conference apparatus 100 described above is explained, by way of example, with reference to the timing chart of FIG. 14. FIG. 14 shows how the output audio data changes over time, based on the audio data (“A1” and “B1”) received from the terminal devices 200 connected to the video conference.

Here, as shown in FIG. 3, the priorities stored in the priority order table 1200 are assumed to be “1” for the terminal device 200 that transmitted audio data (A1) and “2” for the terminal device 200 that transmitted audio data (B1).

First, reception of audio data (B1) starts at time tB1. At time tB1, no audio data received from another terminal device 200 is being output, so audio data (B1) is output as the output audio data to each terminal device 200 connected to the video conference.

Next, reception of audio data (A1) starts at time tA1. At time tA1, reception of audio data (A1) has started before T3 has elapsed since output of audio data (B1), which is being output, started. That is, tA1 < tB1 + T3 is satisfied. The invalid flag of the audio data information corresponding to audio data (B1) is therefore set to “1”, and both that audio data information and audio data (B1) are deleted. In addition, the output order of audio data (A1) is set to “1”, and output of audio data (A1) starts.

Next, output of audio data (A1) ends at time t6.

Time tA1 is the timing at which the output order setting process described above determines whether there is audio data that satisfies tA1 < tB1 + T3 (see S56 in FIG. 12).

Next, the output order information display process executed by the terminal device 200 is described with reference to FIG. 15 and FIG. 16. The output order information display process runs in a process started from the main process (not shown) of the terminal device 200. Other processing performed by the terminal device 200 is assumed to be carried out in the main process, and its details are omitted. The output order information display process ends when it receives a signal such as forced termination from the main process.

As shown in FIG. 15, when the output order information display process is executed, it is determined whether output order information has been received from the video conference apparatus 100 (S51). The output order information indicates the order in which audio data is output to the terminal devices 200 connected to the video conference. The output order information is transmitted to each terminal device 200 in the main process of the video conference apparatus 100 (see S23 in FIG. 8).

When it is determined that output order information has been received from the video conference apparatus 100 (S51: YES), information based on the received output order information is displayed in the output order information display area 273 of the video conference screen 271, as shown in FIG. 16 (S52), and the process returns to S51. In the example of FIG. 16, a number based on the output order information is displayed. The display is not limited to numbers; symbols or other forms of display may be used instead.

When it is determined that output order information has not been received from the video conference apparatus 100 (S51: NO), the processing of S51 is performed again. That is, the terminal device 200 waits to receive output order information from the video conference apparatus 100.

As described above, in the present embodiment the video conference apparatus 100 receives audio data of utterances by video conference participants from the terminal devices 200 connected to the video conference. Based on the priority order table 1200 stored in advance in the priority order table storage area 1401 of the HDD 104, it determines the output order in which audio data is output in the video conference and stores it as audio data information in the audio data table 1100 held in the audio data table storage area 1301 of the RAM 103. It then outputs the audio data to each terminal device 200 according to the output order. Audio data that has been delayed for the predetermined time T1 or longer is deleted, and the output order is updated. Because such audio data is deleted, it is never output. As a result, audio data is not output later than necessary, which reduces the likelihood that the tempo of the conversation is disturbed during a video conference.

In addition, when the audio data that has not yet been output includes audio data whose output time is shorter than the predetermined time T2, the output order of that audio data is updated to the highest order. Audio data with an output time shorter than T2 is thus output simultaneously with other audio data. As a result, short remarks by participants, such as back-channel responses, can be included among the remarks made in the video conference, which encourages more natural conversation.

In addition, when audio data with a higher priority than the audio data being output is received before T3 has elapsed since output of that audio data started, the audio data being output and its audio data information are deleted. The output order of the received audio data is then updated to the highest output order, and the received audio data is output. Thus, even while other audio data is being output in a video conference, audio data transmitted from a terminal device 200 with a higher priority is output preferentially. As a result, remarks by the user of a terminal device 200 that has been given a high priority in advance can be output without delay.

In addition, the video conference apparatus 100 transmits the output order to each terminal device 200. Each terminal device 200 receives the transmitted output order and displays it on the video conference screen 271 of the monitor 270. The user of a terminal device 200 can thereby see the order in which his or her remark will be output and can tell whether that remark is being delayed.

Note that the terminal identification information of the present embodiment corresponds to the “terminal ID”, and the start time in the audio data table 1100 corresponds to the “reception start time”. The audio data table storage area 1301 of the RAM 103 corresponds to the “audio data information storage means”, a record of the priority order table 1200 corresponds to the “priority information”, and the priority order table storage area 1401 corresponds to the “priority information storage means”.

The time T1 corresponds to the “first predetermined time”. The CPU 101 that executes S14: YES in FIG. 8 corresponds to the “audio data receiving means” and the “audio data receiving step”, and the CPU 101 that executes S17 in FIG. 8 corresponds to the “output determination means” and the “output determination step”. The CPU 101 that executes S18 in FIG. 8 corresponds to the “audio data information storage control means” and the “audio data information storage control step”, and the CPU 101 that executes S24 in FIG. 8 corresponds to the “audio data output means” and the “audio data output step”. The CPU 101 that executes S19 in FIG. 8 corresponds to the “first determination means” and the “first determination step”, and the CPU 101 that executes S20 and S32 in FIG. 8 corresponds to the “first deletion means” and the “first deletion step”.

The time T2 corresponds to the “second predetermined time”. The CPU 101 that executes S54 in FIG. 12 corresponds to the “second determination means”, and the CPU 101 that executes S55 in FIG. 12 corresponds to the “first output order changing means”.

The time T3 corresponds to the “third predetermined time”. The CPU 101 that executes S56 in FIG. 12 corresponds to the “third determination means”, the CPU 101 that executes S57 in FIG. 12 and S32 in FIG. 8 corresponds to the “first deletion means”, and the CPU 101 that executes S58 in FIG. 12 corresponds to the “second output order changing means”.

The speaker 260 corresponds to the “output unit”, and the monitor 270 corresponds to the “display unit”. The CPU 101 that executes S23 in FIG. 8 corresponds to the “output order transmitting means”, the CPU 101 that executes S51: YES in FIG. 15 corresponds to the “output order receiving means”, and the CPU 101 that executes S52 in FIG. 15 corresponds to the “output order display control means”.

The present invention is not limited to the embodiment described above, and various modifications are possible. A modification is described below.

In the video conference apparatus 100 of the embodiment described above, S56 of the output order setting process of FIG. 12 determines whether there is audio data information that satisfies the condition tA1 < tB1 + T3, and, if there is, the corresponding audio data information and audio data are deleted. However, the condition is not limited to this one; for example, a different condition may be used. Such a case is described in the following modification with reference to FIG. 17 and FIG. 18.

The output order setting process of the modification is described with reference to FIG. 17. S51 to S55 in FIG. 12 are the same processing as S61 to S65 in FIG. 17, and S57 to S59 in FIG. 12 are the same processing as S68 to S70 in FIG. 17, so their descriptions are omitted. As in the embodiment described above, the output order setting process of FIG. 17 is called from the main process of the video conference apparatus 100 at S21 in FIG. 8.

When the output order setting process of the modification is executed, the processing of S61 to S65 is performed, and it is then determined whether there is audio data information that satisfies tB1 + T3 < tA1 (S66). Here, tA1 is the reception start time of audio data received from the terminal device 200 whose priority in the priority order table 1200 is first, and is stored in the audio data table 1100. tB1 is the reception start time of audio data received from a terminal device 200 whose priority in the priority order table 1200 is second or lower, and is likewise stored in the audio data table 1100. As in the embodiment described above, T3 is set to “1 second” in the modification. In effect, S66 determines whether the audio data with start time tA1 began only after T3 had elapsed since output of the audio data with start time tB1 began.

When it is determined that there is audio data that satisfies tB1 + T3 < tA1 (S66: YES), it is determined whether the time by which the audio data received from the terminal device 200 with the higher priority has been delayed exceeds T2 (S67). This is determined by whether the time obtained by adding T2 to tA1, the reception start time of the audio data from the terminal device 200 with the higher priority, exceeds the current time. The current time is acquired from the timing device 108.

When it is determined that there is no audio data that satisfies tB1 + T3 < tA1 (S66: NO), the process proceeds to S70.

When it is determined that the time by which the audio data received from the terminal device 200 with the higher priority has been delayed exceeds T2 (S67: YES), that is, when the time obtained by adding T2 to tA1, the reception start time of the audio data from the terminal device 200 with the higher priority, exceeds the current time, the process proceeds to S68.

When it is determined that the time by which the audio data received from the terminal device 200 with the higher priority has been delayed does not exceed T2 (S67: NO), that is, when the time obtained by adding T2 to tA1, the reception start time of the audio data from the terminal device 200 with the higher priority, does not exceed the current time, the process proceeds to S68.

Next, the operation of the video conference apparatus 100 described above is explained, by way of example, with reference to the timing chart of FIG. 18. FIG. 18 shows how the output audio data changes over time, based on the audio data (referred to as “A1” and “B1”) received from the terminal devices 200 connected to the video conference.

Next, reception of audio data (A1) starts at time tA1. At time tA1, reception of audio data (A1) has started after T3 has elapsed since output of audio data (B1), which is being output, started. That is, tB1 + T3 < tA1 is satisfied.

Next, at time t7, the time T2 has elapsed since reception of audio data (A1) started. The invalid flag of the audio data information corresponding to audio data (B1) is therefore set to “1”, and both that audio data information and audio data (B1) are deleted. In addition, the output order of audio data (A1) is set to “1”, and output of audio data (A1) starts.

As described above, in the modification, once T3 has elapsed since output of audio data started, it is determined whether T2 has elapsed since audio data was received from a terminal device 200 whose priority is higher than the priority corresponding to the audio data being output. If T2 has elapsed, the audio data being output and its audio data information are deleted, and the audio data received from the terminal device 200 with the higher priority is transmitted. Audio data that is transmitted from a high-priority terminal device 200 and whose output time is at least a predetermined length can thereby be output with little delay.
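The two checks of the modification (S66 and S67) can be condensed into a small Python predicate. This is only a sketch: the function name `variant_preempts` and its parameters are hypothetical, and S67 is read here as “T2 has elapsed since tA1”, which is an assumption taken from the FIG. 18 walkthrough and the summary above.

```python
def variant_preempts(t_a1, t_b1, now, t2, t3):
    """Sketch of S66-S67 of the modification.

    t_a1 -- reception start time of the higher-priority audio data
    t_b1 -- reception start time of the audio data being output
    now  -- current time (the description obtains it from timing device 108)
    """
    arrived_late = t_b1 + t3 < t_a1    # S66: A1 began after T3 had elapsed
    sustained = now - t_a1 >= t2       # S67 (assumed reading): T2 has elapsed
    return arrived_late and sustained


# Hypothetical times in seconds, with T2 = T3 = 1.
at_t7 = variant_preempts(t_a1=2.0, t_b1=0.0, now=3.0, t2=1.0, t3=1.0)
before_t7 = variant_preempts(t_a1=2.0, t_b1=0.0, now=2.5, t2=1.0, t3=1.0)
too_early = variant_preempts(t_a1=0.5, t_b1=0.0, now=3.0, t2=1.0, t3=1.0)
```

Under this reading, the lower-priority audio data keeps playing until the higher-priority speaker has talked for T2, so a brief interjection from the priority-1 terminal does not cut off the current speaker.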

In the modification, the time T3 corresponds to the “fourth predetermined time”, and the time T2 corresponds to the “fifth predetermined time”. The CPU 101 that executes S66 in FIG. 17 corresponds to the “fourth determination means”, and the CPU 101 that executes S67 in FIG. 17 corresponds to the “fifth determination means”. The CPU 101 that executes S68 in FIG. 17 and S32 in FIG. 8 corresponds to the “third deletion means”, and the CPU 101 that executes S69 in FIG. 17 corresponds to the “third output order changing means”.

1 Video conference system
2 Network
100 Video conference apparatus
101, 201 CPU
103, 203 RAM
104, 204 HDD
108 Timing device
200 Terminal device
230 Mouse
250 Camera
260 Speaker
270 Monitor
271 Video conference screen
272 Individual display area
273 Output order information display area
1100 Audio data table
1200 Priority order table
1301 Audio data table storage area
1401 Priority order table storage area

Claims

A video conference apparatus that controls a video conference conducted among a plurality of terminal devices via a network, the apparatus comprising:
audio data receiving means for receiving audio data of an utterance transmitted from a terminal device;
output determination means for determining, when the audio data receiving means receives the audio data from the terminal device, whether audio data received from another terminal device is being output;
priority information storage means for storing, in association with each other, terminal identification information identifying the terminal device that transmitted the audio data and a priority indicating the order of precedence in which the audio data is output to the terminal devices;
audio data information storage control means for, when the output determination means determines that the audio data received from the other terminal device is being output, storing in audio data information storage means, as audio data information and on the basis of the priority information stored in the priority information storage means, the audio data received by the audio data receiving means, a reception start time at which reception of the audio data started, an output order indicating the position in which the audio data is to be output to the terminal devices, and the terminal identification information identifying the terminal device that transmitted the audio data, in association with one another;
audio data output means for outputting the audio data to the plurality of terminal devices performing data communication in the video conference;
first determination means for determining whether the audio data information stored in the audio data information storage means includes audio data information that has not been output to the terminal devices by the audio data output means and for which a first predetermined time has elapsed since its reception start time; and
first deletion means for deleting, when the first determination means determines that audio data information for which the first predetermined time has elapsed since its reception start time exists, that audio data information from the audio data information storage means,
wherein the audio data output means outputs, to the plurality of terminal devices performing data communication in the video conference, the audio data corresponding to the highest output order among the audio data information stored in the audio data information storage means.

The video conference apparatus according to claim 1, further comprising:
second determination means for determining whether the audio data information stored in the audio data information storage means includes, among the audio data not output to the terminal devices by the audio data output means, audio data information whose audio data has an output time shorter than a second predetermined time; and
first output order update means for updating, when the second determination means determines that audio data information whose audio data has an output time shorter than the second predetermined time exists, the output order of that audio data information to the highest order.

The video conference apparatus according to claim 1, further comprising:
third determination means for determining whether, before a third predetermined time elapses after the audio data output means starts outputting the audio data to the plurality of terminal devices, the audio data receiving means has received audio data from a terminal device identified by terminal identification information associated with a priority higher than the priority associated with the terminal identification information identifying the terminal device that transmitted the audio data being output;
second deletion means for deleting, when the third determination means determines that the audio data has been received, the audio data information corresponding to the audio data being output, among the audio data information stored in the audio data storage means, from the audio data storage control means; and
second output order update means for updating, when the third determination means determines that the audio data has been received, the output order corresponding to the received audio data, among the audio data information stored in the audio data storage means, to the highest output order.

The video conference apparatus according to claim 1, further comprising:
fourth determination means for determining whether, after a fourth predetermined time has elapsed since the audio data output means started outputting the audio data to the plurality of terminal devices, the audio data receiving means has received audio data from a terminal device identified by terminal identification information associated with a priority higher than the priority associated with the terminal identification information identifying the terminal device that transmitted the audio data being output;
fifth determination means for determining whether a fifth predetermined time has elapsed since the fourth determination means determined that the audio data receiving means received the audio data from the terminal device identified by the terminal identification information associated with a priority higher than the priority associated with the terminal identification information identifying the terminal device that transmitted the audio data being output;
third deletion means for deleting, when the fifth determination means determines that the fifth predetermined time has elapsed since reception of the audio data, the audio data information corresponding to the audio data being output, among the audio data information stored in the audio data storage means, from the audio data storage control means; and
third output order update means for updating, when the fifth determination means determines that the fifth predetermined time has elapsed since reception of the audio data, the output order corresponding to the received audio data, among the audio data information stored in the audio data storage means, to the highest output order.

A video conference system comprising the video conference apparatus according to claim 1 and the terminal devices, each terminal device including an output unit that outputs audio data output from the video conference apparatus and a display unit that displays image data used in the video conference controlled by the video conference apparatus,
wherein the video conference apparatus comprises:
output order transmission means for transmitting the output order of the audio data information stored in the audio data storage means to the terminal device identified by the terminal identification information corresponding to that output order, and
wherein each terminal device comprises:
output order receiving means for receiving the output order transmitted by the output order transmission means of the video conference apparatus; and
output order display control means for displaying information indicating the output order on the display unit on the basis of the output order received by the output order receiving means.

A video conference control method for controlling a video conference conducted among a plurality of terminal devices via a network, the method comprising:
an audio data receiving step of receiving audio data of an utterance transmitted from a terminal device;
an output determination step of determining, when the audio data is received from the terminal device in the audio data receiving step, whether audio data received from another terminal device is being output;
priority information storage means for storing, in association with each other, terminal identification information identifying the terminal device that transmitted the audio data and a priority indicating the order of precedence in which the audio data is output to the terminal devices;
an audio data information storage control step of, when it is determined in the output determination step that the audio data received from the other terminal device is being output, storing in audio data information storage means, as audio data information and on the basis of the priority information stored in the priority information storage means, the audio data received in the audio data receiving step, a reception start time at which reception of the audio data started, an output order indicating the position in which the audio data is to be output to the terminal devices, and the terminal identification information identifying the terminal device that transmitted the audio data, in association with one another;
an audio data output step of outputting the audio data to the plurality of terminal devices performing data communication in the video conference;
a first determination step of determining whether the audio data information stored in the audio data information storage means includes audio data information that has not been output to the terminal devices by the audio data output means and for which a first predetermined time has elapsed since its reception start time; and
a first deletion step of deleting, when it is determined in the first determination step that audio data information for which the first predetermined time has elapsed since its reception start time exists, that audio data information from the audio data information storage means,
wherein, in the audio data output step, the audio data corresponding to the highest output order among the audio data information stored in the audio data information storage means is output to the plurality of terminal devices performing data communication in the video conference.

A program for a video conference apparatus, the program causing a computer to function as the various processing means of the video conference apparatus according to claim 1.