JP6289178B2

JP6289178B2 - Call conferencing system

Info

Publication number: JP6289178B2
Application number: JP2014048511A
Authority: JP
Inventors: 茂明鈴木; 山浦　正; 正山浦; 渉伏見
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2014-03-12
Filing date: 2014-03-12
Publication date: 2018-03-07
Anticipated expiration: 2034-03-12
Also published as: JP2015173376A

Description

The present invention relates to a call conference system that realizes a voice conference over an IP (Internet Protocol) network.

A call conference system is often composed of a server and a plurality of terminals: the server synthesizes the voice data transmitted by each terminal and distributes the synthesized voice data to each terminal.
In an IP conference communication system, the network load can be reduced by having the server multicast the voice data it sends to the terminals. However, each terminal must then remove the echo that arises when the voice data it transmitted comes back inside the voice data received from the server.

One known method of echo removal works as follows: the server synthesizes the voice data received from all terminals participating in the conference into a single stream of synthesized voice data and multicasts it to all terminals, while each terminal removes its own voice data from the synthesized voice data received from the server before playing it back (see, for example, Patent Document 1). In this method, each terminal adds a terminal ID indicating the terminal type to the voice packets in which it packetizes the voice data to be sent to the server, and stores the transmitted voice data internally. When the server synthesizes the voice data received from the terminals, packetizes it, and transmits it to each terminal, it adds the terminal IDs of the terminals whose voice data was synthesized. When a voice packet received from the server contains a terminal's own terminal ID, that terminal removes the echo by subtracting the stored, already-transmitted voice data from the received voice data.

Patent Document 1: Japanese Patent Laid-Open No. H10-164239

To further reduce the network load of voice packet transmission from the server to the terminals, it is also effective to reduce the amount of information by encoding the voice data at a low bit rate. However, low bit rate coding introduces coding distortion, so the method of subtracting voice data described in Patent Document 1 cannot remove the echo.
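The problem stated above can be illustrated with a minimal sketch. It assumes a toy "codec" that is nothing more than coarse quantization (a stand-in chosen for illustration, not the codec used in the patent) and hypothetical sample values. With exact PCM, subtracting the terminal's own samples from the mix recovers the other speaker; once the mix has passed through a lossy step, the same subtraction leaves a residue.

```python
def lossy_codec(samples, step=64):
    """Hypothetical stand-in for low bit rate coding: coarse quantization.

    Encoding followed by decoding does not return the original samples,
    which is the property (coding distortion) that matters here.
    """
    return [step * round(s / step) for s in samples]


def subtract_own(received, own):
    """Prior-art style echo removal: subtract the stored own samples."""
    return [r - o for r, o in zip(received, own)]


# Hypothetical PCM samples for two terminals.
own = [100, -250, 37, 512]
other = [-40, 90, 200, -300]
mixed = [a + b for a, b in zip(own, other)]

# Exact PCM path: subtraction recovers the other speaker exactly.
residual_pcm = subtract_own(mixed, own)

# Lossy path: the mix is distorted before the terminal can subtract.
residual_lossy = subtract_own(lossy_codec(mixed), own)

print(residual_pcm == other)    # subtraction works on undistorted PCM
print(residual_lossy == other)  # and fails once distortion is introduced
```

The sketch only demonstrates that subtraction is not exact after a lossy step; how audible the residue is depends on the real codec.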

The present invention has been made to solve the above problem, and its object is to enable echo removal at the terminals even when the voice data multicast from the server to the terminals is encoded at a low bit rate.

In the call conference system according to the present invention, three or more terminals are divided into high-priority terminals and low-priority terminals. To each high-priority terminal, the server transmits a synthesis of the voice signals sent to the server from the terminals other than that high-priority terminal. For the low-priority terminals, the server selects between the voice data sent to the server from the high-priority terminals and the voice data sent to the server from the low-priority terminals, giving priority to the voice data from the high-priority terminals, and multicasts the selected voice signal together with selection information indicating whether the selected voice signal came from a high-priority terminal or a low-priority terminal. When a low-priority terminal receives and plays back the voice data multicast from the server, then while it is transmitting valid voice data of its own to the server, it uses the selection information to play back the received voice data only if it is voice data from a high-priority terminal.

In another call conference system according to the present invention, three or more terminals are divided into high-priority terminals and low-priority terminals. To each high-priority terminal, the server transmits a synthesis of the voice signals sent to the server from the terminals other than that high-priority terminal. To the low-priority terminals, the server multiplexes the voice data sent to the server from the high-priority terminals with the voice data sent to the server from the low-priority terminals and multicasts the result. When a low-priority terminal receives and plays back the voice data multicast from the server, then while it is transmitting valid voice data of its own to the server, it plays back only the voice data from the high-priority terminals within the received voice data.

In yet another call conference system according to the present invention, three or more terminals are divided into high-priority terminals and low-priority terminals. To each high-priority terminal, the server transmits a synthesis of the voice signals sent to the server from the terminals other than that high-priority terminal. To the low-priority terminals, the server multicasts the voice data sent to the server from the high-priority terminals and the voice data sent to the server from the low-priority terminals separately. When a low-priority terminal receives and plays back the voice data multicast from the server, then while it is transmitting valid voice data of its own to the server, it plays back the received voice data only if it is voice data from a high-priority terminal.

According to the present invention, the terminals are divided into high-priority terminals and low-priority terminals. The server transmits to each high-priority terminal a synthesis of the voice data from the terminals other than that terminal, and for the low-priority terminals it selects voice data with priority given to the high-priority terminals, adds selection information, and multicasts the result. While a low-priority terminal is transmitting valid voice data of its own to the server, it plays back the voice data received from the server if that data is from a high-priority terminal and does not play it back if it is from a low-priority terminal. As a result, no echo is perceived at the terminals even when the voice data multicast from the server is encoded at a low bit rate.

According to the present invention, the terminals are divided into high-priority terminals and low-priority terminals. The server transmits to each high-priority terminal a synthesis of the voice data from the terminals other than that terminal, and to the low-priority terminals it multiplexes the voice data from the high-priority terminals with the voice data from the low-priority terminals and multicasts the result. While a low-priority terminal is transmitting valid voice data of its own to the server, it plays back the voice data from the high-priority terminals within the voice data received from the server and does not play back the voice data from the low-priority terminals. As a result, no echo is perceived at the terminals even when the voice data multicast from the server is encoded at a low bit rate.

According to the present invention, the terminals are divided into high-priority terminals and low-priority terminals. The server transmits to each high-priority terminal a synthesis of the voice data from the terminals other than that terminal, and to the low-priority terminals it multicasts the voice data from the high-priority terminals and the voice data from the low-priority terminals separately. While a low-priority terminal is transmitting valid voice data of its own to the server, it plays back the voice data from the high-priority terminals received from the server and does not play back the voice data from the low-priority terminals. As a result, no echo is perceived at the terminals even when the voice data multicast from the server is encoded at a low bit rate.

FIG. 1 is a block diagram showing the configuration of a call conference system according to Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the internal configuration of a low-priority terminal of the call conference system according to Embodiment 1.
FIG. 3 is a block diagram showing the internal configuration of a high-priority terminal of the call conference system according to Embodiment 1.
FIG. 4 is a block diagram showing the configuration of a call conference system according to Embodiment 2 of the present invention.
FIG. 5 is a block diagram showing the configuration of a call conference system according to Embodiment 3 of the present invention.
FIG. 6 is a block diagram showing the internal configuration of a low-priority terminal of the call conference system according to Embodiment 3.
FIG. 7 is a block diagram showing the configuration of a call conference system according to Embodiment 4 of the present invention.
FIG. 8 is a block diagram showing the internal configuration of a low-priority terminal of the call conference system according to Embodiment 4.
FIG. 9 is a block diagram showing the configuration of a call conference system according to Embodiment 5 of the present invention.
FIG. 10 is a block diagram showing the internal configuration of a low-priority terminal of the call conference system according to Embodiment 5.
FIG. 11 is a block diagram showing the configuration of a call conference system according to Embodiment 6 of the present invention.
FIG. 12 is a block diagram showing the internal configuration of a low-priority terminal of the call conference system according to Embodiment 6.
FIG. 13 is a block diagram showing the configuration of a call conference system according to Embodiment 7 of the present invention.
FIG. 14 is a block diagram showing the internal configuration of a low-priority terminal of the call conference system according to Embodiment 7.

Embodiment 1.
The call conference system shown in FIG. 1 is configured for a conference held between one server and five terminals. In FIG. 1, the server 100 receives the voice data transmitted from each terminal, synthesizes it, and transmits it to each terminal. In the example of FIG. 1, two of the five terminals are high-priority terminals 200a and 200b, for which the conference call priority is set high, and the remaining three are low-priority terminals 300a to 300c, for which the conference call priority is set low.

The server 100 includes voice packet reception units 101 and 102 that receive the voice packets transmitted from the terminals, a voice synthesis unit 103 that synthesizes voice data, a voice packet transmission unit 104 that packetizes and transmits voice data, a voice selection unit 105 that selects the voice data to be multicast, a voice encoder 106 that encodes voice data at a low bit rate, a selection information addition unit 107 that adds selection information to the voice data, a multicast voice packet transmission unit 108 that packetizes voice data and multicasts it, and voice decoders 109a to 109c that decode voice data encoded at a low bit rate.
The number of voice decoders 109a to 109c is the same as the number of low-priority terminals 300a to 300c.

The server 100 is built around a CPU (Central Processing Unit), not shown. By executing a program stored in internal memory, the CPU realizes the functions of the voice packet reception units 101 and 102, the voice synthesis unit 103, the voice packet transmission unit 104, the voice selection unit 105, the voice encoder 106, the selection information addition unit 107, the multicast voice packet transmission unit 108, and the voice decoders 109a to 109c.
Some of these functions, such as the voice encoder 106 and the voice decoders 109a to 109c, may instead be implemented as dedicated circuits.

Next, the operation of the server 100 will be described.
The high-priority terminals 200a and 200b and the low-priority terminals 300a to 300c each packetize their own voice data and transmit it to the server 100 as voice packets. The payload of the voice packets transmitted by the high-priority terminals 200a and 200b, that is, the voice data, is PCM (Pulse Code Modulation) code that has not been encoded at a low bit rate. The payload of the voice packets transmitted by the low-priority terminals 300a to 300c, in contrast, is voice data encoded at a low bit rate. Low bit rate coding here refers to a coding scheme whose bit rate is lower than that of PCM and for which the voice signal decoded after encoding is not identical to the voice signal before encoding (that is, coding distortion occurs).

In the server 100, the voice packet reception unit 101 first receives the voice packets transmitted by the high-priority terminals 200a and 200b, extracts the payload of each packet, that is, the PCM-coded voice data, and outputs it to the voice synthesis unit 103.
The voice packet reception unit 102, for its part, receives the voice packets transmitted by the low-priority terminals 300a to 300c, extracts the payload of each packet, that is, the low bit rate coded voice data, and outputs it to the voice decoders 109a to 109c. The voice decoders 109a to 109c decode the low bit rate code into PCM code and output it to the voice synthesis unit 103 and the voice selection unit 105.

The voice synthesis unit 103 receives the voice data of the high-priority terminals 200a and 200b output by the voice packet reception unit 101 and the voice data of the low-priority terminals 300a to 300c output by the voice decoders 109a to 109c. As the voice data to be transmitted to each of the high-priority terminals 200a and 200b (voice data for transmission to high-priority terminals), it generates voice data synthesized from the voice data of all terminals other than the destination terminal and outputs it to the voice packet transmission unit 104.
That is, for transmission to the high-priority terminal 200a, it generates synthesized voice data from the voice data of the high-priority terminal 200b and the low-priority terminals 300a to 300c. For transmission to the high-priority terminal 200b, it generates synthesized voice data from the voice data of the high-priority terminal 200a and the low-priority terminals 300a to 300c.
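The per-destination synthesis described above can be sketched as follows. This is an illustrative sketch only: the patent does not state how samples are combined, so the example assumes plain sample-wise addition of 16-bit PCM with clamping, and the function names and sample values are hypothetical.

```python
def clamp16(x):
    """Keep a mixed sample inside the signed 16-bit PCM range (assumed format)."""
    return max(-32768, min(32767, x))


def synthesize_excluding(target_id, pcm_by_terminal):
    """Mix the PCM frames of every terminal except the destination terminal."""
    others = [pcm for tid, pcm in pcm_by_terminal.items() if tid != target_id]
    return [clamp16(sum(samples)) for samples in zip(*others)]


# One hypothetical 3-sample frame per terminal, already decoded to PCM.
frames = {
    "200a": [1, 2, 3],
    "200b": [10, 20, 30],
    "300a": [100, 200, 300],
    "300b": [0, 0, 0],
    "300c": [-5, -5, -5],
}

to_200a = synthesize_excluding("200a", frames)  # 200b + 300a + 300b + 300c
to_200b = synthesize_excluding("200b", frames)  # 200a + 300a + 300b + 300c
print(to_200a, to_200b)
```

Because each high-priority terminal never receives its own samples, no subtraction is needed on that side.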

The voice packet transmission unit 104 packetizes the voice data input from the voice synthesis unit 103 that excludes the high-priority terminal 200a and transmits it to the high-priority terminal 200a, and packetizes the voice data input from the voice synthesis unit 103 that excludes the high-priority terminal 200b and transmits it to the high-priority terminal 200b.

The voice synthesis unit 103 also generates, as the voice data to be transmitted to the low-priority terminals 300a to 300c (voice data for transmission to low-priority terminals), voice data synthesized from the voice data of the high-priority terminals 200a and 200b, and outputs it to the voice selection unit 105.

The voice selection unit 105 receives the voice data of the high-priority terminals 200a and 200b output by the voice synthesis unit 103 (the voice data for transmission to low-priority terminals) and the voice data of the low-priority terminals 300a to 300c output by the voice decoders 109a to 109c, selects one of them, and outputs it to the voice encoder 106. The selection gives priority to the voice data for transmission to low-priority terminals input from the voice synthesis unit 103, that is, the synthesis of the voice data of the high-priority terminals 200a and 200b.
The voice selection unit 105 also outputs selection information to the selection information addition unit 107: 0 when it has selected the voice data input from the voice synthesis unit 103, and 1 otherwise.

As the specific selection method, the voice selection unit 105 first detects whether each input stream of voice data is voiced or silent. If the voice data input from the voice synthesis unit 103 is voiced, the unit always selects that voice data. The voice selection unit 105 also selects the voice data of the voice synthesis unit 103 when all of the voice data input from the voice synthesis unit 103 and the voice decoders 109a to 109c is silent. If only one of the voice data streams input from the voice synthesis unit 103 and the voice decoders 109a to 109c is voiced, the voice selection unit 105 selects that voice data.
If the voice data input from the voice synthesis unit 103 is silent and two or more of the voice data streams input from the voice decoders 109a to 109c are voiced, the voice selection unit 105 selects the one that became voiced first.
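The selection rules above reduce to a small decision function. The sketch below is one possible reading, with assumptions: voiced/silent detection is taken as given, each low-priority stream is represented by the (hypothetical) frame index at which it became voiced, or `None` if silent, and a tie between streams that became voiced at the same time is broken by dictionary order, which the source does not specify.

```python
def select_source(high_mix_voiced, low_voiced_since):
    """Return (source, selection_info) for one frame.

    high_mix_voiced:  True if the synthesized high-priority voice data is voiced.
    low_voiced_since: dict mapping a low-priority terminal id to the frame index
                      at which it became voiced, or None if it is silent.
    source is "high_mix" or the id of the selected low-priority terminal.
    selection_info is 0 for the high-priority mix and 1 otherwise.
    """
    active = {tid: t for tid, t in low_voiced_since.items() if t is not None}
    # Voiced high-priority mix always wins; it is also the default when
    # every input is silent.
    if high_mix_voiced or not active:
        return "high_mix", 0
    # Otherwise pick the low-priority stream that became voiced first
    # (with a single voiced stream this is simply that stream).
    earliest = min(active, key=active.get)
    return earliest, 1


print(select_source(True, {"300a": 3, "300b": None, "300c": None}))
print(select_source(False, {"300a": None, "300b": None, "300c": None}))
print(select_source(False, {"300a": 7, "300b": 4, "300c": None}))
```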

The voice encoder 106 encodes the voice data input from the voice selection unit 105 at a low bit rate and outputs the low bit rate coded voice data to the selection information addition unit 107.
The selection information addition unit 107 adds the selection information (0 or 1) input from the voice selection unit 105 to the low bit rate coded voice data input from the voice encoder 106 and outputs the result to the multicast voice packet transmission unit 108.
The multicast voice packet transmission unit 108 packetizes the low bit rate coded voice data and selection information input from the selection information addition unit 107 and multicasts them to the low-priority terminals 300a to 300c.
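A minimal sketch of attaching and separating the selection information is given below. The patent does not define a payload layout, so the one-byte prefix used here is purely an assumption for illustration, as are the function names and the sample bytes.

```python
def add_selection_info(selection_info, coded_voice):
    """Prefix the coded voice data with a one-byte selection flag (assumed layout)."""
    if selection_info not in (0, 1):
        raise ValueError("selection information must be 0 or 1")
    return bytes([selection_info]) + coded_voice


def separate_selection_info(payload):
    """Split a payload back into (selection_info, coded_voice)."""
    if not payload:
        raise ValueError("empty payload")
    return payload[0], payload[1:]


# Hypothetical low bit rate coded frame.
coded = b"\x12\x34\x56"
payload = add_selection_info(1, coded)
info, voice = separate_selection_info(payload)
print(info, voice == coded)
```

The same split is what the selection information separation unit of a low-priority terminal would perform on reception under this assumed layout.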

Next, the operation of the low-priority terminals 300a to 300c will be described with reference to FIG. 2. FIG. 2 is a block diagram showing the internal configuration of the low-priority terminal 300a; the low-priority terminals 300b and 300c have the same configuration.

For transmission, the low-priority terminal 300a includes a microphone 301 that picks up the voice of the terminal user, an A/D converter 302 that converts the analog voice input from the microphone 301 into PCM code, a voice encoder 304 that encodes the PCM-coded voice data at a low bit rate, and a voice packet transmission unit 305 that packetizes and transmits the low bit rate coded voice data.
For reception, the low-priority terminal 300a includes a multicast voice packet reception unit 306 that receives the multicast voice packets transmitted by the server 100, a selection information separation unit 307 that separates the selection information added to the voice data of a received multicast voice packet, a voice decoder 308 that decodes the received voice data from low bit rate code into PCM code, a voice activity detection unit 303 that determines whether the terminal user's voice is voiced or silent, a switch 309 that switches the output of the PCM-coded voice data, a D/A converter 310 that converts PCM code into analog voice, and a speaker 311 that outputs the analog voice.

The low-priority terminal 300a is built around a CPU, not shown. By executing a program stored in internal memory, the CPU realizes the functions of the voice activity detection unit 303, the voice packet transmission unit 305, the multicast voice packet reception unit 306, the selection information separation unit 307, and so on. Some functions, such as the A/D converter 302, the D/A converter 310, the voice encoder 304, the voice decoder 308, and the switch 309, may instead be implemented as dedicated circuits.

In the low-priority terminal 300a, the user's voice is picked up by the microphone 301, converted into PCM code by the A/D converter 302, and then input to the voice activity detection unit 303 and the voice encoder 304.
The voice activity detection unit 303 determines whether the input PCM-coded voice data is voiced or silent, and outputs 1 to the switch 309 if it is voiced and 0 if it is silent.
The voice encoder 304 encodes the PCM code input from the voice activity detection unit 303 at a low bit rate and outputs the low bit rate coded voice data to the voice packet transmission unit 305.
The voice packet transmission unit 305 packetizes the low bit rate coded voice data input from the voice encoder 304 and transmits it to the server 100.
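The text does not say how the voiced/silent decision is made. As one common possibility, not taken from the patent, the sketch below uses a fixed threshold on the root-mean-square level of a PCM frame; the threshold value, frame length, and function name are all assumptions.

```python
def is_voiced(pcm_frame, threshold=500.0):
    """Return 1 if the frame's RMS level reaches the threshold, else 0.

    Simple energy-based detection, used here only as a placeholder for
    whatever detector a real terminal would use.
    """
    if not pcm_frame:
        return 0
    mean_square = sum(s * s for s in pcm_frame) / len(pcm_frame)
    return 1 if mean_square ** 0.5 >= threshold else 0


silence = [0] * 160           # hypothetical 160-sample frame of silence
speech = [2000, -2000] * 80   # hypothetical loud frame
print(is_voiced(silence), is_voiced(speech))
```

The 1/0 output matches the convention the text uses for the signal sent to the switch 309.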

The multicast voice packet reception unit 306 receives a multicast voice packet from the server 100, extracts the payload of the packet, and outputs it to the selection information separation unit 307. As described in the explanation of the server 100, this payload is low bit rate coded voice data to which selection information of 0 or 1 has been added. The selection information separation unit 307 separates the two, outputting the low bit rate coded voice data to the voice decoder 308 and the selection information to the switch 309.

The voice decoder 308 decodes the low bit rate coded voice data input from the selection information separation unit 307 into PCM code and outputs it to the switch 309.
When either the input from the selection information separation unit 307 or the input from the voice activity detection unit 303 is 0, or both are 0, the switch 309 outputs the PCM code input from the voice decoder 308 to the D/A converter 310. When the input from the selection information separation unit 307 and the input from the voice activity detection unit 303 are both 1, the switch 309 does not output the PCM code input from the voice decoder 308 and instead outputs a silent PCM code to the D/A converter 310.
The D/A converter 310 converts the PCM code input from the switch 309 into analog voice and outputs it to the user through the speaker 311.
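The behavior of the switch 309 is a two-input gate, which can be sketched as below. The function name and the representation of a silent PCM code as all-zero samples are assumptions made for the example.

```python
def switch_309(selection_info, own_voiced, decoded_pcm):
    """Pass the decoded PCM through unless both control inputs are 1.

    selection_info: 1 if the multicast frame carries low-priority voice, else 0.
    own_voiced:     1 if this terminal's own user is currently speaking, else 0.
    """
    if selection_info == 1 and own_voiced == 1:
        # Possible echo of this terminal's own voice: output silence instead.
        return [0] * len(decoded_pcm)
    return decoded_pcm


frame = [120, -80, 45]
for sel in (0, 1):
    for vad in (0, 1):
        print(sel, vad, switch_309(sel, vad, frame))
```

Muting is decided per terminal from local information only, so the terminal never needs to know which low-priority terminal's voice was actually selected.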

Next, the operation of the high-priority terminals 200a and 200b will be described with reference to FIG. 3. FIG. 3 is a block diagram showing the internal configuration of the high-priority terminal 200a; the high-priority terminal 200b has the same configuration. The high-priority terminal 200a includes a microphone 201 that picks up the voice of the terminal user, an A/D converter 202 that converts the analog voice input from the microphone 201 into PCM code, a voice packet transmission unit 203 that packetizes the PCM-coded voice data and transmits it to the server 100, a voice packet reception unit 204 that receives the voice packets transmitted by the server 100, a D/A converter 205 that converts the voice data contained in a received voice packet into analog voice, and a speaker 206 that outputs the analog voice.

In this call conference system, when the server 100, the high-priority terminals 200a and 200b, and the low-priority terminals 300a to 300c operate as described above, the selection information in a voice packet multicast by the server 100 is 1 whenever the packet contains voice data that one of the low-priority terminals 300a to 300c sent to the server 100, so the input from the selection information separation unit 307 to the switch 309 is 1 in each of the low-priority terminals 300a to 300c. While the user of one of the low-priority terminals 300a to 300c is speaking, the input from the voice activity detection unit 303 to the switch 309 in that terminal is also 1, so that terminal's switch 309 outputs a silent PCM code to the D/A converter 310. Consequently, even if a multicast voice packet received by a low-priority terminal contains voice data that the terminal itself transmitted, that data is not played back, and echo is prevented.

However, the user of one of the low-priority terminals 300a to 300c may miss the voice of another low-priority terminal. For example, if the users of the low-priority terminals 300a and 300b speak at the same time and the multicast voice packets transmitted by the server 100 contain the voice data of the low-priority terminal 300a, the user of the low-priority terminal 300b will miss the voice of the low-priority terminal 300a.
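The scenario in this paragraph can be traced with a few lines of code. The sketch restates the switch rule (play unless selection information and own voice activity are both 1) and applies it to a hypothetical frame in which 300a and 300b speak at once and the server has selected 300a's voice; the names are illustrative only.

```python
def plays_frame(selection_info, own_voiced):
    """A low-priority terminal plays the frame unless both inputs are 1."""
    return not (selection_info == 1 and own_voiced == 1)


# 300a and 300b speak simultaneously; 300c only listens.
speaking = {"300a": 1, "300b": 1, "300c": 0}

# Case 1: the multicast frame carries 300a's voice (selection information 1).
heard_low = {tid: plays_frame(1, v) for tid, v in speaking.items()}

# Case 2: the multicast frame carries the high-priority mix (selection information 0).
heard_high = {tid: plays_frame(0, v) for tid, v in speaking.items()}

print(heard_low)   # 300a avoids its echo, 300b misses 300a, 300c hears 300a
print(heard_high)  # every low-priority terminal plays the high-priority voice
```

The first case shows the trade-off: 300a is protected from its own echo, but 300b, which is also speaking, mutes the same frame and so loses 300a's voice.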

On the other hand, when the voice data contained in the multicast voice packet transmitted by the server 100 is that of the high-priority terminals 200a and 200b, the attached selection information is 0, so none of the users of the low-priority terminals 300a to 300c misses it.

The users of the high-priority terminals 200a and 200b never miss the voice from any other terminal, because none of it goes unreproduced, and everything they say is reproduced at all terminals.
Furthermore, the voice packet sent from the server 100 to the high-priority terminal 200a does not contain the voice data of the high-priority terminal 200a, and the voice packet sent from the server 100 to the high-priority terminal 200b does not contain the voice data of the high-priority terminal 200b, so echo is prevented.

This call conference system assumes that the conference moderators use the high-priority terminals 200a and 200b and that the other, ordinary conference participants use the low-priority terminals 300a to 300c.
A moderator can reliably hear the voices of all conference participants, and each participant can reliably hear the moderator. In addition, no user of any terminal hears the echo that would arise from his or her own voice returning via the server 100.

As described above, according to Embodiment 1, three or more terminals are divided into high-priority terminals and low-priority terminals. For each high-priority terminal, the server synthesizes and transmits the voice data from the terminals other than that terminal. For the low-priority terminals, the server selects voice data with priority given to that from the high-priority terminals, attaches selection information, and multicasts the result. While a low-priority terminal is transmitting valid voice data of its own to the server, it uses the selection information to refrain from reproducing voice data received from the server that contains voice data from a low-priority terminal. As a result, a call conference system is obtained in which no echo is perceived at the terminals even when the voice data multicast from the server is low-bit-rate encoded.

Also according to Embodiment 1, the low-priority terminal detects whether its own voice data is voiced or silent, and judges that it is transmitting valid voice data to the server while voice is detected. With this operation, the low-priority terminal refrains from reproducing low-priority-terminal voice data received from the server only while its user is speaking and the terminal is in the voiced state (that is, while it is transmitting valid voice data). Accordingly, when the user is not speaking and the terminal is in the silent state, it can reproduce voice data from other low-priority terminals.

Embodiment 2.
FIG. 4 is a block diagram showing the configuration of the call conference system according to Embodiment 2. The server 100 additionally includes a reception-side voice synthesis unit 121 that synthesizes the voice data received from the low-priority terminals 300a to 300c.
The internal configurations of the high-priority terminals 200a and 200b and the low-priority terminals 300a to 300c are the same as those shown in FIGS. 2 and 3.

Next, the operation of the server 100 of Embodiment 2 is described.
In FIG. 4, the voice packet receiving units 101 and 102, the voice packet transmission unit 104, the voice encoder 106, the selection information adding unit 107, the multicast voice packet transmission unit 108, and the voice decoders 109a to 109c of the server 100 operate as in Embodiment 1 described with reference to FIG. 1. The high-priority terminals 200a and 200b and the low-priority terminals 300a to 300c also operate as in Embodiment 1 described with reference to FIGS. 2 and 3.

The reception-side voice synthesis unit 121 synthesizes the voice data input from the voice decoders 109a to 109c and outputs the result to the voice synthesis unit 103 and the voice selection unit 105.

The voice synthesis unit 103 operates in the same way as the voice synthesis unit 103 of the server 100 shown in FIG. 1, except that it receives fewer voice data inputs.
First, for each of the high-priority terminals 200a and 200b, the voice synthesis unit 103 generates voice data for transmission to that high-priority terminal by synthesizing the voice data from sources other than that terminal, and outputs it to the voice packet transmission unit 104. That is, for the high-priority terminal 200a it synthesizes the voice data from the high-priority terminal 200b and from the reception-side voice synthesis unit 121, and for the high-priority terminal 200b it synthesizes the voice data from the high-priority terminal 200a and from the reception-side voice synthesis unit 121.
In addition, the voice synthesis unit 103 generates voice data for transmission to the low-priority terminals by synthesizing the voice data from the high-priority terminals 200a and 200b, and outputs it to the voice selection unit 105.
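The three outputs of the voice synthesis unit 103 in Embodiment 2 can be sketched as follows. The description does not say how voice data is synthesized, so the sample-wise sum clipped to the 16-bit range used here is an assumption, as are the names `mix` and `synthesize_103` and the dictionary keys.

```python
def mix(*frames):
    """Sample-wise sum of equal-length PCM frames, clipped to 16 bits (assumed)."""
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*frames)]


def synthesize_103(high_a, high_b, low_mix):
    """Build the outputs of the voice synthesis unit 103 (Embodiment 2).

    high_a, high_b: PCM frames from the high-priority terminals 200a and 200b.
    low_mix: PCM frame from the reception-side voice synthesis unit 121.
    """
    return {
        # Each high-priority terminal gets everything except its own voice.
        "to_200a": mix(high_b, low_mix),
        "to_200b": mix(high_a, low_mix),
        # Candidate for the low-priority multicast: high-priority voices only.
        "to_low": mix(high_a, high_b),
    }
```

Because a terminal's own frame is never part of the mix sent back to it, the high-priority terminals need no echo removal of their own.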

The voice selection unit 105 operates in the same way as the voice selection unit 105 of the server 100 shown in FIG. 1, except that it receives fewer voice data inputs.
First, the voice selection unit 105 selects either the voice data input from the voice synthesis unit 103 or the voice data input from the reception-side voice synthesis unit 121, and outputs it to the voice encoder 106. The selection gives priority to the voice data for low-priority terminal transmission input from the voice synthesis unit 103, that is, the synthesized voice data of the high-priority terminals 200a and 200b.
Specifically, the voice selection unit 105 first detects whether each input voice data is voiced or silent. If the voice data input from the voice synthesis unit 103 is voiced, the voice selection unit 105 always selects that voice data, and it also selects the voice data from the voice synthesis unit 103 when both inputs are silent. If only the voice data input from the reception-side voice synthesis unit 121 is voiced, the voice selection unit 105 selects the voice data from the reception-side voice synthesis unit 121.
The voice selection unit 105 then outputs, as selection information, 0 to the selection information adding unit 107 if it selected the voice data input from the voice synthesis unit 103, and 1 otherwise.
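The selection rule of the voice selection unit 105 reduces to a small decision function, sketched below in Python. The voiced/silent detector is not specified in the description, so `is_voiced`, with its mean-absolute-amplitude test and threshold of 500, is a hypothetical stand-in; `select_105` is likewise an illustrative name.

```python
def is_voiced(frame, threshold=500):
    """Hypothetical detector: mean absolute amplitude above a threshold."""
    if not frame:
        return False
    return sum(abs(s) for s in frame) / len(frame) > threshold


def select_105(high_mix, low_mix):
    """Return (selected_frame, selection_info).

    high_mix: output of the voice synthesis unit 103 (high-priority voices).
    low_mix: output of the reception-side voice synthesis unit 121.
    """
    if is_voiced(high_mix) or not is_voiced(low_mix):
        # High-priority audio wins whenever it is voiced, and also when
        # both inputs are silent.
        return high_mix, 0
    # Only the low-priority mix is voiced.
    return low_mix, 1
```

The returned flag corresponds to the selection information that the selection information adding unit 107 attaches to the encoded audio.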

When the server 100, the high-priority terminals 200a and 200b, and the low-priority terminals 300a to 300c operate as described above, then, as in Embodiment 1, any voice packet multicast by the server 100 that contains voice data sent to the server 100 by one of the low-priority terminals 300a to 300c carries selection information of 1, so in each of the low-priority terminals 300a to 300c shown in FIG. 2 the input from the selection information separation unit 307 to the switch 309 is 1. While the user of a low-priority terminal is speaking, the input from the voice detection unit 303 to the switch 309 in that terminal is also 1, so the switch 309 of that terminal outputs silence to the D/A converter 310. Therefore, even if a multicast voice packet received by a low-priority terminal contains the voice data that the terminal itself transmitted, that data is not reproduced, and echo is prevented.

As described above, according to Embodiment 2, three or more terminals are divided into high-priority terminals and low-priority terminals. For each high-priority terminal, the server synthesizes and transmits the voice data from the terminals other than that terminal. For the low-priority terminals, the server selects voice data with priority given to that from the high-priority terminals, attaches selection information, and multicasts the result. While the voice transmitted by a low-priority terminal is voiced, that terminal uses the selection information to refrain from reproducing voice data received from the server that contains voice data from a low-priority terminal. As a result, a call conference system is obtained in which no echo is perceived at the terminals even when the voice data multicast from the server is low-bit-rate encoded.

Also according to Embodiment 2, as the low-priority-terminal voice data to be included in the voice data multicast to the low-priority terminals, the server uses a synthesis of the voice data sent to the server from the plurality of low-priority terminals. The voice selection unit 105 therefore only has to choose between the voice data from the reception-side voice synthesis unit 121 and the voice data from the voice synthesis unit 103, which reduces processing compared with taking the voice data from the voice decoders 109a to 109c as separate inputs and selecting among them.

Embodiment 3.
FIG. 5 is a block diagram showing the configuration of the call conference system according to Embodiment 3. The server 100 additionally includes a control packet receiving unit 131 that receives control packets from the low-priority terminals 300a to 300c, a speech reception control unit 132 that determines whether to grant the speech permission requested by the low-priority terminals 300a to 300c, and a control packet transmission unit 133 that transmits control packets to the low-priority terminals 300a to 300c. In place of the voice decoders 109a to 109c provided for each low-priority terminal in Embodiments 1 and 2, the server 100 includes a voice decoder 109 shared by the low-priority terminals.

Next, the operation of the server 100 of Embodiment 3 is described.
In FIG. 5, the voice packet receiving units 101 and 102, the voice packet transmission unit 104, the voice encoder 106, the selection information adding unit 107, and the multicast voice packet transmission unit 108 of the server 100 operate as in Embodiment 1 described with reference to FIG. 1. The voice synthesis unit 103 and the voice selection unit 105 operate as in Embodiment 2 described with reference to FIG. 4.
The high-priority terminals 200a and 200b operate as in Embodiment 1 described with reference to FIG. 3.

As described in detail later, each of the low-priority terminals 300a to 300c of Embodiment 3 sends the server 100 a control packet requesting speech permission when its user wishes to speak. The following description assumes that the low-priority terminal 300a has sent such a control packet.
In the server 100, the control packet receiving unit 131 receives this control packet requesting speech permission and notifies the speech reception control unit 132 of its reception.

On receiving the notification from the control packet receiving unit 131, the speech reception control unit 132 determines whether to permit the user of the low-priority terminal 300a, which sent the control packet, to speak, and notifies the control packet transmission unit 133 of the result. When speech permission is requested by a plurality of low-priority terminals, the speech reception control unit 132 makes its determination so that permission is granted to only one of them.

The control packet transmission unit 133 sends the requesting low-priority terminal 300a a control packet announcing speech permission if the speech reception control unit 132 reports that speech is permitted, or a control packet announcing that speech is not permitted if it reports that speech is not permitted.
The speech reception control unit 132 also notifies the voice packet receiving unit 102 of the determination result.

As described in detail later, when the low-priority terminal 300a that requested speech permission receives the control packet announcing permission, it starts transmitting voice packets containing the voice data spoken by its user. When the user finishes speaking, it stops transmitting voice packets and sends the server 100 a control packet announcing speech completion.

In the server 100, the voice packet receiving unit 102 follows the determination result of the speech reception control unit 132 and receives only the voice packets from the low-priority terminal 300a that has been permitted to speak. It extracts the payload (that is, the low-bit-rate-coded voice data) from each received voice packet and outputs it to the voice decoder 109.
The voice decoder 109 decodes the low-bit-rate-coded voice data input from the voice packet receiving unit 102 into a PCM code and outputs it to the voice synthesis unit 103 and the voice selection unit 105.

The control packet receiving unit 131 also receives the control packet announcing speech completion from the low-priority terminal 300a and notifies the speech reception control unit 132 of its reception. The speech reception control unit 132 thereby recognizes that the low-priority terminal 300a, to which it granted permission, has finished speaking. It sends a response instruction to the control packet transmission unit 133 and notifies the voice packet receiving unit 102 of the speech completion.

On receiving the response instruction from the speech reception control unit 132, the control packet transmission unit 133 sends a control packet acknowledging receipt of the speech completion notification to the low-priority terminal 300a that sent the notification.
On receiving the speech completion notification from the speech reception control unit 132, the voice packet receiving unit 102 stops receiving voice packets from the low-priority terminals 300a to 300c and also stops outputting voice data to the voice decoder 109.
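The request, grant, and completion exchange handled by the speech reception control unit 132 can be sketched as a small Python class. The description requires only that permission go to a single low-priority terminal at a time; the first-come, first-served policy, the class name `SpeechReceptionControl`, and the string labels for the control packets are assumptions made for illustration.

```python
class SpeechReceptionControl:
    """Illustrative model of the speech reception control unit 132."""

    def __init__(self):
        self.holder = None  # terminal currently permitted to speak, if any

    def request(self, terminal_id):
        """Handle a speech permission request; return the reply to send."""
        if self.holder is None:
            # Nobody holds permission: grant it (assumed policy).
            self.holder = terminal_id
            return "PERMIT"
        # Another terminal already holds permission.
        return "DENY"

    def complete(self, terminal_id):
        """Handle a speech completion notification from a terminal."""
        if self.holder == terminal_id:
            self.holder = None
            return "COMPLETION_ACK"
        # Completion from a terminal that does not hold permission: ignored.
        return None

    def accepts_voice_from(self, terminal_id):
        """Whether the voice packet receiving unit 102 should accept packets."""
        return self.holder == terminal_id
```

In this sketch, `accepts_voice_from` stands in for the notification that the speech reception control unit 132 gives to the voice packet receiving unit 102.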

When the input of voice data from the voice packet receiving unit 102 stops, the voice decoder 109 outputs silence to the voice synthesis unit 103 and the voice selection unit 105. The operation of the voice decoder 109 is otherwise basically the same as that of the voice decoders 109a to 109c shown in FIG. 1; only this silence output is added.

Next, the operation of the low-priority terminals 300a to 300c of Embodiment 3 is described with reference to FIG. 6.
FIG. 6 is a block diagram showing the internal configuration of the low-priority terminal 300a according to Embodiment 3. The low-priority terminal 300a additionally includes a talk switch 331 that the user of the terminal presses when speaking, a control packet transmission unit 332 that transmits control packets to the server 100, a control packet receiving unit 333 that receives control packets from the server 100, and a speech permission lamp 334 that indicates whether the user of the terminal is currently permitted to speak.

In FIG. 6, the microphone 301, the A/D converter 302, the voice detection unit 303, the voice encoder 304, the voice packet transmission unit 305, the multicast voice packet receiving unit 306, the selection information separation unit 307, the voice decoder 308, the switch 309, the D/A converter 310, and the speaker 311 operate as in Embodiment 1 described with reference to FIG. 2.
The internal configurations of the low-priority terminals 300b and 300c are the same as in FIG. 6.
The high-priority terminals 200a and 200b operate as in Embodiment 1 described with reference to FIG. 3.

In the low-priority terminal 300a, the talk switch 331 is pressed when the user of the terminal speaks. The talk switch 331 outputs its pressed state to the control packet transmission unit 332.
If the talk switch 331 changes from the released state to the pressed state while the low-priority terminal 300a is not in the speech-permitted state, the control packet transmission unit 332 generates a control packet requesting speech permission and sends it to the server 100. If the talk switch 331 returns from the pressed state to the released state while the low-priority terminal 300a is in the speech-permitted state, the control packet transmission unit 332 generates a control packet announcing speech completion and sends it to the server 100.
Whether the low-priority terminal 300a is in the speech-permitted state is obtained from the output of the control packet receiving unit 333, as described later.
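The edge-triggered behavior of the control packet transmission unit 332 can be written as a pure function of the previous and current switch states, as in the Python sketch below. The function name and the packet labels `"REQUEST_PERMISSION"` and `"SPEECH_COMPLETE"` are hypothetical; the description does not define a packet format.

```python
def control_packet_to_send(prev_pressed, pressed, permitted):
    """Decide which control packet, if any, the unit 332 should send.

    prev_pressed, pressed: talk switch 331 state before and after the change.
    permitted: True while the terminal is in the speech-permitted state.
    """
    if not permitted and not prev_pressed and pressed:
        # Released -> pressed while not permitted: ask for permission.
        return "REQUEST_PERMISSION"
    if permitted and prev_pressed and not pressed:
        # Pressed -> released while permitted: announce completion.
        return "SPEECH_COMPLETE"
    # No relevant edge: nothing is sent.
    return None
```

Keying on transitions rather than on the switch level means that holding the switch down does not generate repeated requests.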

Based on the control packets received from the server 100, the control packet receiving unit 333 determines whether the terminal is in the speech-permitted state. It outputs 1 when the terminal is in the speech-permitted state and 0 otherwise to the control packet transmission unit 332, the voice packet transmission unit 305, the switch 309, and the speech permission lamp 334.

The speech-permitted state is determined as follows. The initial state is the not-permitted state. If the control packet receiving unit 333 receives a control packet announcing speech permission from the server 100 while in the not-permitted state, it changes to the speech-permitted state. If it receives a control packet from the server 100 acknowledging receipt of the speech completion notification while in the speech-permitted state, it returns to the not-permitted state.
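This two-state behavior can be modeled as a minimal state machine. In the Python sketch below, the class name `PermissionState` and the packet labels `"PERMIT"`, `"DENY"`, and `"COMPLETION_ACK"` are assumed names, not part of the description; the `permitted` attribute mirrors the 0/1 output of the control packet receiving unit 333.

```python
class PermissionState:
    """Illustrative model of the state kept by the control packet receiving unit 333."""

    def __init__(self):
        self.permitted = 0  # initial state: speech not permitted

    def on_control_packet(self, kind):
        """Update the state for a received control packet; return the 0/1 output."""
        if self.permitted == 0 and kind == "PERMIT":
            self.permitted = 1
        elif self.permitted == 1 and kind == "COMPLETION_ACK":
            self.permitted = 0
        # Any other packet (for example "DENY") leaves the state unchanged.
        return self.permitted
```

The same output value would drive the speech permission lamp 334, the voice packet transmission unit 305, and the switch 309.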

The speech permission lamp 334 lights when the output of the control packet receiving unit 333 is 1, that is, in the speech-permitted state, and goes out when the output of the control packet receiving unit 333 is 0.
The user of the low-priority terminal 300a checks whether the speech permission lamp 334 is lit to know whether he or she may speak.

When the output of the control packet receiving unit 333 is 1, that is, in the speech-permitted state, the voice packet transmission unit 305 packetizes the low-bit-rate-coded voice data input from the voice encoder 304 and sends it to the server 100.

The switch 309 outputs the PCM code input from the voice decoder 308 to the D/A converter 310 when either or both of the input from the selection information separation unit 307 and the input from the control packet receiving unit 333 are 0. When both inputs are 1, the switch 309 does not output the PCM-coded voice data input from the voice decoder 308 and instead outputs silence to the D/A converter 310.

When the server 100, the high-priority terminals 200a and 200b, and the low-priority terminals 300a to 300c operate as described above, the input from the control packet receiving unit 333 to the switch 309 is 1 while the user of a low-priority terminal that has been granted speech permission is speaking. In addition, when the multicast voice packet received by the multicast voice packet receiving unit 306 contains voice data sent to the server 100 by one of the low-priority terminals 300a to 300c, the attached selection information is 1, so the switch 309 outputs silence to the D/A converter 310. Therefore, even if a multicast voice packet received by a low-priority terminal contains the voice data that the terminal itself transmitted, that data is not reproduced, and echo is prevented.

Moreover, because only one of the low-priority terminals 300a to 300c is permitted to speak at a time, the user of the permitted low-priority terminal never misses the voice of another low-priority terminal. In this respect the system is superior to the call conference systems of Embodiments 1 and 2.
Furthermore, because the voice packets of a plurality of low-priority terminals 300a to 300c are never transmitted at the same time, the network load is further reduced.

As described above, according to Embodiment 3, three or more terminals are divided into high-priority terminals and low-priority terminals. For each high-priority terminal, the server synthesizes and transmits the voice data from the terminals other than that terminal. For the low-priority terminals, the server selects voice data with priority given to that from the high-priority terminals, attaches selection information, and multicasts the result. While a low-priority terminal is transmitting valid voice data of its own to the server, it uses the selection information to refrain from reproducing voice data received from the server that contains voice data from a low-priority terminal. As a result, a call conference system is obtained in which no echo is perceived at the terminals even when the voice data multicast from the server is low-bit-rate encoded.

Also according to Embodiment 3, the server permits or refuses speech by the user of a low-priority terminal, and a low-priority terminal transmits its user's speech to the server as valid voice data when it has received permission from the server for that user to speak. By limiting the low-priority terminals permitted to speak to a single terminal, the server prevents a plurality of low-priority terminals from sending voice data to the server at the same time, which further reduces the network load.

Also according to Embodiment 3, as the low-priority-terminal voice data to be included in the voice data multicast to the low-priority terminals, the server uses the voice data sent to the server from the low-priority terminal that has been permitted to speak. By limiting the low-priority terminals permitted to speak to a single terminal, the server prevents a low-priority terminal from missing the speech of another low-priority terminal.

Embodiment 4.
FIG. 7 is a block diagram showing the configuration of the call conference system according to Embodiment 4. The server 100 additionally includes a voice encoder 141 that low-bit-rate encodes the voice data received from the low-priority terminals 300a to 300c, and a multicast voice packet transmission unit 142 that packetizes the low-bit-rate-coded voice data and multicasts it. In place of the voice synthesis unit 103 of Embodiments 1 and 2, the server 100 includes a voice synthesis unit 143 whose operation differs in part.

The server 100 of Embodiment 4 has two multicast voice packet transmission units, 108 and 142. The multicast voice packet transmission unit 108 transmits multicast voice packets in which the voice data of the high-priority terminals 200a and 200b has been synthesized and low-bit-rate encoded. The multicast voice packet transmission unit 142 transmits multicast voice packets in which the voice data of the low-priority terminals 300a to 300c has been synthesized and low-bit-rate encoded.

The operation of the server 100 of Embodiment 4 is described below.
In FIG. 7, the voice packet receiving units 101 and 102, the voice packet transmission unit 104, the voice encoder 106, and the voice decoders 109a to 109c of the server 100 operate as in Embodiment 1 described with reference to FIG. 1. The multicast voice packet transmission unit 108 also operates as in Embodiment 1 described with reference to FIG. 1, except that it packetizes low-bit-rate-coded voice data to which no selection information has been added rather than low-bit-rate-coded voice data to which selection information has been added. The reception-side voice synthesis unit 121 operates as in Embodiment 2 described with reference to FIG. 4.

The voice encoder 141 encodes the PCM code input from the receiving-side voice synthesis unit 121 at a low bit rate and outputs the resulting low-bit-rate-coded voice data to the multicast voice packet transmission unit 142.
The multicast voice packet transmission unit 142 packetizes the low-bit-rate-coded voice data input from the voice encoder 141 and multicasts it to the low-priority terminals 300a to 300c.
In other words, the chain consisting of the voice encoder 141 and the multicast voice packet transmission unit 142 multicasts the voice data received from the low-priority terminals 300a to 300c back to the low-priority terminals 300a to 300c.

The voice synthesis unit 143 receives the voice data of the high-priority terminals 200a and 200b output by the voice packet receiving unit 102 and the voice data of the low-priority terminals 300a to 300c output by the receiving-side voice synthesis unit 121. As the voice data to be sent to each of the high-priority terminals 200a and 200b, it synthesizes the voice data from all terminals other than the destination terminal and outputs the result to the voice packet transmission unit 104.
This operation is the same as that of the voice synthesis unit 103 shown in FIG. 4.

The voice synthesis unit 143 also synthesizes the voice data of the high-priority terminals 200a and 200b output by the voice packet receiving unit 102 and outputs the result to the voice encoder 106. The voice encoder 106 encodes this voice data at a low bit rate, and the multicast voice packet transmission unit 108 multicasts it to the low-priority terminals 300a to 300c.
In other words, the chain consisting of the voice encoder 106 and the multicast voice packet transmission unit 108 multicasts the voice data received from the high-priority terminals 200a and 200b to the low-priority terminals 300a to 300c.
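The routing performed by the server of Embodiment 4 can be sketched as follows. This is only an illustration under stated assumptions: frames are modeled as lists of 16-bit PCM samples of equal length, synthesis is modeled as sample-wise addition with clipping, and the names `mix` and `route_frames` are hypothetical. The description does not specify how the synthesis itself is computed.

```python
from typing import Dict, List, Tuple

Frame = List[int]  # one frame of 16-bit PCM samples (assumed representation)


def mix(frames: List[Frame], length: int) -> Frame:
    """Add frames sample by sample and clip to the signed 16-bit range."""
    mixed = [sum(f[i] for f in frames) for i in range(length)]
    return [max(-32768, min(32767, s)) for s in mixed]


def route_frames(
    high: Dict[str, Frame], low: Dict[str, Frame], length: int
) -> Tuple[Dict[str, Frame], Frame, Frame]:
    """Return (unicast mix per high-priority terminal, high mix, low mix)."""
    # Role of the receiving-side voice synthesis unit 121: mix all low-priority voice.
    low_mix = mix(list(low.values()), length)
    # Unicast path: each high-priority terminal hears everyone except itself.
    unicast = {
        dest: mix([f for name, f in high.items() if name != dest] + [low_mix], length)
        for dest in high
    }
    # Multicast path 1 (encoder 106, unit 108): high-priority voice only.
    high_mix = mix(list(high.values()), length)
    # Multicast path 2 (encoder 141, unit 142) carries low_mix.
    return unicast, high_mix, low_mix
```

In this sketch the two multicast mixes stay separate, which is what lets a low-priority terminal later discard only the stream that may contain its own voice.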

Next, the operation of the low-priority terminals 300a to 300c of Embodiment 4 is described with reference to FIG. 8.
FIG. 8 is a block diagram showing the internal configuration of the low-priority terminal 300a according to Embodiment 4. The low-priority terminal 300a newly includes a multicast voice packet receiving unit 341, which receives from the server 100 the multicast voice packets containing voice data from the low-priority terminals 300a to 300c; a voice decoder 342, which decodes the low-bit-rate-coded voice data from the low-priority terminals 300a to 300c; and a voice synthesis unit 343, which synthesizes the voice data from the high-priority terminals 200a and 200b with the voice data from the low-priority terminals 300a to 300c.

In FIG. 8, the microphone 301, A/D converter 302, voice detection unit 303, voice encoder 304, voice packet transmission unit 305, multicast voice packet receiving unit 306, voice decoder 308, switch 309, D/A converter 310, and speaker 311 operate as in Embodiment 1 described with reference to FIG. 2.
The low-priority terminals 300b and 300c have the same internal configuration as shown in FIG. 8.
The high-priority terminals 200a and 200b operate as in Embodiment 1 described with reference to FIG. 3.

In the low-priority terminal 300a, the multicast voice packet receiving unit 306 receives the multicast voice packets containing the voice data of the high-priority terminals 200a and 200b (the packets transmitted by the multicast voice packet transmission unit 108 of the server 100), extracts the payload, that is, the low-bit-rate-coded voice data, and outputs it to the voice decoder 308.

Meanwhile, the multicast voice packet receiving unit 341 receives the multicast voice packets containing the voice data of the low-priority terminals 300a to 300c (the packets transmitted by the multicast voice packet transmission unit 142 of the server 100), extracts the payload, that is, the low-bit-rate-coded voice data, and outputs it to the voice decoder 342.
The voice decoder 342 decodes the low-bit-rate-coded voice data of the low-priority terminals 300a to 300c into PCM code and outputs it to the switch 309.
When the output of the voice detection unit 303 is 0, the switch 309 passes the PCM-coded voice data input from the voice decoder 342 to the voice synthesis unit 343. When the output of the voice detection unit 303 is 1, it does not.

The voice synthesis unit 343 synthesizes the voice data of the high-priority terminals 200a and 200b input from the voice decoder 308 with the voice data of the low-priority terminals 300a to 300c input from the switch 309 and outputs the result to the D/A converter 310. When there is no input from the switch 309, it outputs only the input from the voice decoder 308 to the D/A converter 310.

When the server 100, the high-priority terminals 200a and 200b, and the low-priority terminals 300a to 300c operate as described above, then while the user of one of the low-priority terminals 300a to 300c is speaking, the input from the voice detection unit 303 to the switch 309 in that terminal is 1, and the switch 309 outputs silent PCM code to the voice synthesis unit 343. Consequently, even if the multicast voice packets received by the multicast voice packet receiving unit 341 of that terminal contain the voice data the terminal itself transmitted, that data is not played back, and echo is prevented.
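The playback path of a low-priority terminal in Embodiment 4 can be illustrated with a short sketch. It assumes decoded frames are lists of 16-bit PCM samples of equal length and models synthesis as clipped addition. The function names are hypothetical, and `own_voice_active` stands for the 0/1 output of the voice detection unit 303.

```python
from typing import List

Frame = List[int]  # one frame of decoded 16-bit PCM samples (assumed representation)


def switch_309(low_pcm: Frame, own_voice_active: bool) -> Frame:
    """Pass the low-priority stream, or silence it while the local user speaks."""
    return [0] * len(low_pcm) if own_voice_active else low_pcm


def playback_frame(high_pcm: Frame, low_pcm: Frame, own_voice_active: bool) -> Frame:
    """Role of the voice synthesis unit 343: mix the high stream with the gated low stream."""
    gated = switch_309(low_pcm, own_voice_active)
    return [max(-32768, min(32767, h + l)) for h, l in zip(high_pcm, gated)]
```

The high-priority stream is never gated, so the user keeps hearing the high-priority terminals even while speaking. Only the stream that may contain the terminal's own voice is suppressed.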

As described above, according to Embodiment 4, three or more terminals are divided into high-priority terminals and low-priority terminals. To each high-priority terminal, the server transmits a synthesis of the voice data from all terminals other than that terminal. To the low-priority terminals, it multicasts the voice data from the high-priority terminals and the voice data from the low-priority terminals separately. While its own transmitted voice is active, a low-priority terminal does not play back the voice data from the low-priority terminals received from the server. The resulting call conference system therefore produces no perceptible echo at the terminals, even when the voice data multicast by the server is encoded at a low bit rate.

Embodiment 5.
FIG. 9 is a block diagram showing the configuration of a call conference system according to Embodiment 5.
In FIG. 9, the voice packet receiving unit 101, voice packet transmission unit 104, voice encoder 106, and multicast voice packet transmission unit 108 of the server 100 operate as in Embodiment 1 described with reference to FIG. 1. The voice packet receiving unit 102, voice decoder 109, control packet receiving unit 131, speech acceptance control unit 132, and control packet transmission unit 133 operate as in Embodiment 3 described with reference to FIG. 5. The multicast voice packet transmission unit 142 and the voice synthesis unit 143 operate as in Embodiment 4 described with reference to FIG. 7.

The high-priority terminals 200a and 200b operate as in Embodiment 1 described with reference to FIG. 3.

In the server 100, the multicast voice packet transmission unit 142 packetizes the low-bit-rate-coded voice data input from the voice packet receiving unit 102 and multicasts it to the low-priority terminals 300a to 300c. When no voice packet is received from any of the low-priority terminals 300a to 300c, the voice packet receiving unit 102 produces no output, and so the multicast voice packet transmission unit 142 transmits no multicast voice packet.

Next, the operation of the low-priority terminals 300a to 300c of Embodiment 5 is described with reference to FIG. 10. FIG. 10 is a block diagram showing the internal configuration of the low-priority terminal 300a according to Embodiment 5. In FIG. 10, the microphone 301, A/D converter 302, voice encoder 304, voice packet transmission unit 305, voice decoder 308, D/A converter 310, and speaker 311 operate as in Embodiment 1 described with reference to FIG. 2. The talk switch 331, control packet transmission unit 332, control packet receiving unit 333, and speech permission lamp 334 operate as in Embodiment 3 described with reference to FIG. 6. The switch 309 and the multicast voice packet receiving unit 306 operate as in Embodiment 4 described with reference to FIG. 8.
The low-priority terminals 300b and 300c have the same internal configuration as shown in FIG. 10.

The multicast voice packet receiving unit 341 and the voice decoder 342 also operate basically as described with reference to FIG. 8.
On receiving a multicast voice packet containing the voice data of the low-priority terminals 300a to 300c (a packet transmitted by the multicast voice packet transmission unit 142 of the server 100), the multicast voice packet receiving unit 341 extracts the payload, that is, the low-bit-rate-coded voice data, and outputs it to the voice decoder 342. The voice decoder 342 decodes the low-bit-rate code into PCM code and outputs it to the switch 309.
As noted above, however, the multicast voice packet transmission unit 142 of the server 100 sometimes transmits no multicast voice packet. In that case the multicast voice packet receiving unit 341 outputs nothing to the voice decoder 342, and the voice decoder 342 outputs silent PCM code to the switch 309.
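The silence substitution performed by the voice decoder 342 can be sketched as below. The actual low-bit-rate codec is not specified in this description, so the sketch uses a stand-in "decoder" that treats the payload as raw little-endian 16-bit PCM. The function name, the frame length parameter, and the use of `None` to mean "no packet arrived in this frame period" are all assumptions made for illustration.

```python
import struct
from typing import List, Optional


def decode_or_silence(payload: Optional[bytes], frame_len: int) -> List[int]:
    """Decode one frame, or emit a silent frame when no payload was received."""
    if payload is None:
        # No multicast packet arrived: output silent PCM so downstream mixing still works.
        return [0] * frame_len
    # Stand-in decoder (assumption): the payload is frame_len raw 16-bit samples.
    # struct.unpack raises struct.error if the payload length does not match.
    return list(struct.unpack("<%dh" % frame_len, payload))
```

Emitting an explicit silent frame keeps the switch 309 and the voice synthesis unit 343 running on a fixed frame cadence, whether or not the server sent anything.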

When the server 100, the high-priority terminals 200a and 200b, and the low-priority terminals 300a to 300c operate as described above, then while the user of the low-priority terminal that has been permitted to speak is speaking, the input from the control packet receiving unit 333 to the switch 309 is 1, and the switch 309 outputs silence to the voice synthesis unit 343. Consequently, the voice of the low-priority terminals 300a to 300c is not played back from the speaker 311, and echo is prevented.
Furthermore, because only one of the low-priority terminals 300a to 300c is permitted to speak at a time, the user of the permitted low-priority terminal does not miss the voice of any other low-priority terminal.

As described above, according to Embodiment 5, three or more terminals are divided into high-priority terminals and low-priority terminals. To each high-priority terminal, the server transmits a synthesis of the voice data from all terminals other than that terminal. To the low-priority terminals, it multicasts the voice data from the high-priority terminals and the voice data from the low-priority terminals separately. While the low-priority terminal that has been permitted to speak is transmitting voice data to the server, it does not play back the voice data from the low-priority terminals received from the server. The resulting call conference system therefore produces no perceptible echo at the terminals, even when the voice data multicast by the server is encoded at a low bit rate.

Embodiment 6.
FIG. 11 is a block diagram showing the configuration of a call conference system according to Embodiment 6. The server 100 newly includes a voice code multiplexing unit 161, which multiplexes two kinds of low-bit-rate-coded voice data.
In FIG. 11, the voice packet receiving units 101 and 102, voice packet transmission unit 104, voice encoder 106, and voice decoders 109a to 109c of the server 100 operate as in Embodiment 1 described with reference to FIG. 1. The receiving-side voice synthesis unit 121 operates as in Embodiment 2 described with reference to FIG. 4. The voice encoder 141 and the voice synthesis unit 143 operate as in Embodiment 4 described with reference to FIG. 7.

In the server 100, the voice code multiplexing unit 161 multiplexes the low-bit-rate-coded voice data of the high-priority terminals 200a and 200b input from the voice encoder 106 with the low-bit-rate-coded voice data of the low-priority terminals 300a to 300c input from the voice encoder 141 and outputs the multiplexed data to the multicast voice packet transmission unit 108. When multiplexing, it always places the low-bit-rate code input from the voice encoder 141 after the low-bit-rate code input from the voice encoder 106.

Because the voice encoder 106 encodes the voice data from the high-priority terminals 200a and 200b at a low bit rate and the voice encoder 141 encodes the voice data from the low-priority terminals 300a to 300c at a low bit rate, the multiplexed data output by the voice code multiplexing unit 161 carries the low-bit-rate-coded voice data of the high-priority terminals 200a and 200b in its first half and that of the low-priority terminals 300a to 300c in its second half.

The multicast voice packet transmission unit 108 packetizes the multiplexed data input from the voice code multiplexing unit 161 and multicasts it to the low-priority terminals 300a to 300c.

Next, the operation of the low-priority terminals 300a to 300c of Embodiment 6 is described with reference to FIG. 12. FIG. 12 is a block diagram showing the internal configuration of the low-priority terminal 300a according to Embodiment 6. The low-priority terminal 300a newly includes a voice code separation unit 361, which separates the multiplexed data in a multicast voice packet.
In FIG. 12, the microphone 301, A/D converter 302, voice detection unit 303, voice encoder 304, voice packet transmission unit 305, voice decoder 308, D/A converter 310, and speaker 311 operate as in Embodiment 1 described with reference to FIG. 2. The voice decoder 342, voice synthesis unit 343, and switch 309 operate as in Embodiment 4 described with reference to FIG. 8.
The low-priority terminals 300b and 300c have the same internal configuration as shown in FIG. 12.

In the low-priority terminal 300a, the multicast voice packet receiving unit 306 receives the multicast voice packet transmitted by the server 100, extracts the payload, and outputs it to the voice code separation unit 361. As described above, this payload is multiplexed data. The voice code separation unit 361 outputs the first half, the low-bit-rate-coded voice data of the high-priority terminals 200a and 200b, to the voice decoder 308 and the second half, the low-bit-rate-coded voice data of the low-priority terminals 300a to 300c, to the voice decoder 342.
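The first-half/second-half arrangement of Embodiment 6 can be sketched as a pair of functions. The sketch assumes that each coded high-priority frame has a fixed, known length, which is how the receiver finds the boundary here. The constant `HIGH_LEN`, its value, and the function names are hypothetical, and the description does not state how the boundary is determined.

```python
from typing import Tuple

HIGH_LEN = 20  # bytes per coded high-priority frame (assumed fixed by the codec)


def multiplex(high_code: bytes, low_code: bytes) -> bytes:
    """Role of unit 161: high-priority code first, low-priority code after it."""
    if len(high_code) != HIGH_LEN:
        raise ValueError("high-priority frame must be HIGH_LEN bytes")
    return high_code + low_code


def demultiplex(payload: bytes) -> Tuple[bytes, bytes]:
    """Role of unit 361: split the payload back into (high part, low part)."""
    return payload[:HIGH_LEN], payload[HIGH_LEN:]
```

For example, `demultiplex(multiplex(h, l))` returns `(h, l)` for any 20-byte `h`. The first element would go to the voice decoder 308 and the second to the voice decoder 342.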

When the server 100, the high-priority terminals 200a and 200b, and the low-priority terminals 300a to 300c operate as described above, then while the user of one of the low-priority terminals 300a to 300c is speaking, the input from the voice detection unit 303 to the switch 309 in that terminal is 1, and the switch 309 outputs silent PCM code to the voice synthesis unit 343. Consequently, the voice data of the low-priority terminals 300a to 300c multiplexed into the second half of the multicast voice packet is not played back, and echo is prevented.

As described above, according to Embodiment 6, three or more terminals are divided into high-priority terminals and low-priority terminals. To each high-priority terminal, the server transmits a synthesis of the voice data from all terminals other than that terminal. To the low-priority terminals, it multiplexes the voice data from the high-priority terminals with the voice data from the low-priority terminals and multicasts the result. When its own transmitted voice is active, a low-priority terminal does not play back the portion of the voice data received from the server that comes from the low-priority terminals. The resulting call conference system therefore produces no perceptible echo at the terminals, even when the voice data multicast by the server is encoded at a low bit rate.

Embodiment 6 described an example in which the voice data from the high-priority terminals is placed in the first half of the multiplexed data and the voice data from the low-priority terminals in the second half, but the arrangement is not limited to this. It suffices to define an arrangement rule in advance, so that a low-priority terminal receiving the multiplexed data can separate the voice data of the high-priority terminals from that of the low-priority terminals, and to have the server multiplex the voice data according to that rule. For example, the rule may place the voice data of the high-priority terminals and the voice data of the low-priority terminals alternately bit by bit, or alternately every 8 bits.
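As an illustration of the alternative rule that alternates every 8 bits, the following sketch interleaves the two coded streams byte by byte. It assumes both coded frames have the same length, which the description does not state, and the function names are hypothetical.

```python
from typing import Tuple


def interleave_bytes(high_code: bytes, low_code: bytes) -> bytes:
    """Alternate one byte (8 bits) of high-priority code with one byte of low-priority code."""
    if len(high_code) != len(low_code):
        raise ValueError("this sketch assumes equal-length coded frames")
    out = bytearray()
    for h, l in zip(high_code, low_code):
        out.append(h)
        out.append(l)
    return bytes(out)


def deinterleave_bytes(payload: bytes) -> Tuple[bytes, bytes]:
    """Recover (high part, low part): even-indexed bytes are high, odd-indexed are low."""
    return payload[0::2], payload[1::2]
```

Any rule works as long as the server and the low-priority terminals agree on it in advance. Byte interleaving is just one instance that is easy to invert.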

Embodiment 7.
FIG. 13 is a block diagram showing the configuration of a call conference system according to Embodiment 7.
In FIG. 13, the voice packet receiving unit 101, voice synthesis unit 143, voice encoder 106, and multicast voice packet transmission unit 108 of the server 100 operate as in Embodiment 1 described with reference to FIG. 1. The voice packet receiving unit 102, voice decoder 109, control packet receiving unit 131, speech acceptance control unit 132, and control packet transmission unit 133 operate as in Embodiment 3 described with reference to FIG. 5. The voice synthesis unit 143 operates as in Embodiment 4 described with reference to FIG. 7, and the voice code multiplexing unit 161 operates as in Embodiment 6 described with reference to FIG. 11.

In the server 100, the voice code multiplexing unit 161 multiplexes the output of the voice encoder 106 with the output of the voice packet receiving unit 102 and outputs the result to the multicast voice packet transmission unit 108. When there is no output from the voice packet receiving unit 102, it passes the output of the voice encoder 106 unchanged to the multicast voice packet transmission unit 108.

Next, the operation of the low-priority terminals 300a to 300c of Embodiment 7 is described with reference to FIG. 14. FIG. 14 is a block diagram showing the internal configuration of the low-priority terminal 300a according to Embodiment 7.
In FIG. 14, the microphone 301, A/D converter 302, voice encoder 304, voice packet transmission unit 305, voice decoder 308, D/A converter 310, and speaker 311 operate as in Embodiment 1 described with reference to FIG. 2. The talk switch 331, control packet transmission unit 332, control packet receiving unit 333, and speech permission lamp 334 operate as in Embodiment 3 described with reference to FIG. 6. The switch 309 and the multicast voice packet receiving unit 306 operate as in Embodiment 4 described with reference to FIG. 8.
The low-priority terminals 300b and 300c have the same internal configuration as shown in FIG. 10.

The voice code separation unit 361, voice decoder 342, and voice synthesis unit 343 also operate basically as described with reference to FIG. 12. On receiving a multicast voice packet containing the multiplexed voice data of the high-priority terminals 200a and 200b and the low-priority terminals 300a to 300c, the multicast voice packet receiving unit 306 extracts the payload, that is, the multiplexed data, and outputs it to the voice code separation unit 361. The voice code separation unit 361 outputs the low-bit-rate code in the first half of the multiplexed data to the voice decoder 308 and the low-bit-rate code in the second half to the voice decoder 342.
As noted above, however, the two kinds of low-bit-rate-coded voice data are sometimes not multiplexed. In that case the voice code separation unit 361 outputs the input from the multicast voice packet receiving unit 306 only to the voice decoder 308 and outputs nothing to the voice decoder 342. The voice decoder 342 decodes low-bit-rate-coded voice data into PCM code and outputs it to the switch 309 when such data is input from the voice code separation unit 361, and outputs silent PCM code to the switch 309 when there is no input.
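The optional multiplexing of Embodiment 7 can be sketched as follows. The description does not say how the voice code separation unit 361 recognizes a payload that carries only high-priority data. This sketch assumes a fixed coded frame length and uses the payload length to tell the two cases apart. `HIGH_LEN`, its value, and the function names are hypothetical.

```python
from typing import Optional, Tuple

HIGH_LEN = 20  # bytes per coded high-priority frame (assumed fixed by the codec)


def multiplex_optional(high_code: bytes, low_code: Optional[bytes]) -> bytes:
    """Role of unit 161: append the low-priority code only when one was received."""
    return high_code if low_code is None else high_code + low_code


def separate(payload: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Role of unit 361: return (high part, low part or None if nothing was multiplexed)."""
    if len(payload) <= HIGH_LEN:
        return payload, None  # only decoder 308 gets input; decoder 342 emits silence
    return payload[:HIGH_LEN], payload[HIGH_LEN:]
```

A `None` low part corresponds to the case in which the voice decoder 342 receives no input and outputs silent PCM code to the switch 309.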

When the server 100, the high-priority terminals 200a and 200b, and the low-priority terminals 300a to 300c operate as described above, then while the user of the low-priority terminal that has been permitted to speak is speaking, the input from the control packet receiving unit 333 to the switch 309 is 1, and the switch 309 outputs silent PCM code to the voice synthesis unit 343. Consequently, the voice data of the low-priority terminals 300a to 300c multiplexed into the second half of the multicast voice packet is not played back, and echo is prevented.
Furthermore, because only one of the low-priority terminals 300a to 300c is permitted to speak at a time, the user of the permitted low-priority terminal does not miss the voice of any other low-priority terminal.

Because the server 100 multicasts to the low-priority terminals 300a to 300c multiplexed data containing two kinds of low-bit-rate-coded voice data, the network load in that direction increases. In the reverse direction, from the low-priority terminals 300a to 300c to the server 100, voice packets are never transmitted simultaneously from more than one low-priority terminal, so the network load decreases.

As described above, according to Embodiment 7, three or more terminals are divided into high-priority terminals and low-priority terminals. To each high-priority terminal, the server transmits a synthesis of the voice data from all terminals other than that terminal. To the low-priority terminals, it multiplexes the voice data from the high-priority terminals with the voice data from the low-priority terminals and multicasts the result. While the low-priority terminal that has been permitted to speak is transmitting voice data to the server, it does not play back the portion of the voice data received from the server that comes from the low-priority terminals. The resulting call conference system therefore produces no perceptible echo at the terminals, even when the voice data multicast by the server is encoded at a low bit rate.

In Embodiments 1, 2, 4, and 6 above, the voice detection unit 303 of each of the low-priority terminals 300a to 300c determines whether the transmitted voice is active and outputs the result to the switch 309. The same effect can be obtained with a configuration in which a talk switch replaces the voice detection unit 303, the user holds the talk switch down while speaking, and the pressed or not-pressed state of the talk switch is output to the switch 309 in place of the active or silent determination.

In Embodiments 1 to 3 above, the voice selection unit 105 selects the voice data to be multicast from among the input PCM-coded voice data. A configuration in which low-bit-rate-coded voice data is input to the voice selection unit 105 and selected is also possible.

In Embodiments 1 to 7 above, low-bit-rate-coded voice data is transmitted and received between the server 100 and the low-priority terminals 300a to 300c. Even if PCM code is used instead of the low-bit-rate code, a call conference system in which no echo is perceived at the low-priority terminals 300a to 300c can be obtained.

In Embodiments 1 to 7 above, the priority of each terminal (high-priority or low-priority) has been described as being determined in advance based on the IP address or the like. However, the priorities may be exchanged during a conference call, for example when the moderator changes. In that case, each terminal may be provided with both the low-priority terminal functions shown in FIG. 2 and the high-priority terminal functions shown in FIG. 3, and switch between them according to its priority.
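One way to picture the priority exchange is a table keyed by terminal address whose entries are swapped when the moderator changes. The sketch below is purely illustrative: `Priority`, `PriorityTable`, and `hand_over` are hypothetical names, the addresses are documentation-range examples, and the specification does not say how the exchange is signalled.

```python
from enum import Enum
from typing import Dict


class Priority(Enum):
    HIGH = "high"
    LOW = "low"


class PriorityTable:
    """Hypothetical per-terminal priority table keyed by IP address."""

    def __init__(self, initial: Dict[str, Priority]) -> None:
        self._table = dict(initial)

    def priority_of(self, addr: str) -> Priority:
        return self._table[addr]

    def hand_over(self, old_moderator: str, new_moderator: str) -> None:
        """Swap priorities: the old moderator becomes low, the new one high."""
        if self._table[old_moderator] is not Priority.HIGH:
            raise ValueError("old moderator is not a high-priority terminal")
        if self._table[new_moderator] is not Priority.LOW:
            raise ValueError("new moderator is not a low-priority terminal")
        self._table[old_moderator] = Priority.LOW
        self._table[new_moderator] = Priority.HIGH


table = PriorityTable({"192.0.2.1": Priority.HIGH, "192.0.2.2": Priority.LOW})
table.hand_over("192.0.2.1", "192.0.2.2")
```

A terminal that implements both function sets would consult its current entry and enable the corresponding send and receive path.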

In addition to the above, within the scope of the invention, the embodiments may be freely combined, any component of any embodiment may be modified, and any component of any embodiment may be omitted.

100: server; 101, 102, 204: voice packet receiving units; 103, 143, 343: voice synthesis units; 104, 203, 305: voice packet transmitting units; 105: voice selection unit; 106, 141, 304: voice encoders; 107: selection information adding unit; 108, 142: multicast voice packet transmitting units; 109, 109a to 109c, 308, 342: voice decoders; 121: receiving-side voice synthesis unit; 131, 333: control packet receiving units; 132: speech acceptance control unit; 133, 332: control packet transmitting units; 161: voice code multiplexing unit; 200a, 200b: high-priority terminals; 201, 301: microphones; 202, 302: A/D converters; 205, 310: D/A converters; 206, 311: speakers; 300a to 300c: low-priority terminals; 303: voice activity detection unit; 306, 341: multicast voice packet receiving units; 307: selection information separating unit; 309: switch; 331: talk switch; 334: speech permission lamp; 361: voice code separating unit.

Claims

A call conference system comprising a server and three or more terminals, wherein
the three or more terminals are divided into high-priority terminals and low-priority terminals,
the server, for each of the high-priority terminals, synthesizes and transmits the voice signals transmitted to the server from the terminals other than that high-priority terminal, and, for the low-priority terminals, selects a voice signal from among the voice data transmitted from the high-priority terminals to the server and the voice data transmitted from the low-priority terminals to the server, giving priority to the high-priority terminals, and transmits the selected voice signal by multicast together with selection information indicating whether a voice signal from the high-priority terminals or from the low-priority terminals has been selected, and
the low-priority terminal, when receiving and reproducing the voice data transmitted from the server by multicast, reproduces the received voice data, while it is transmitting valid voice data of its own terminal to the server, if the selection information indicates that the received voice data is voice data from the high-priority terminals.

A call conference system comprising a server and three or more terminals, wherein
the three or more terminals are divided into high-priority terminals and low-priority terminals,
the server, for each of the high-priority terminals, synthesizes and transmits the voice signals transmitted to the server from the terminals other than that high-priority terminal, and, for the low-priority terminals, multiplexes the voice data transmitted from the high-priority terminals to the server and the voice data transmitted from the low-priority terminals to the server and transmits the multiplexed data by multicast, and
the low-priority terminal, when receiving and reproducing the voice data transmitted from the server by multicast, reproduces, while it is transmitting valid voice data of its own terminal to the server, the voice data from the high-priority terminals among the received voice data.

A call conference system comprising a server and three or more terminals, wherein
the three or more terminals are divided into high-priority terminals and low-priority terminals,
the server, for each of the high-priority terminals, synthesizes and transmits the voice signals transmitted to the server from the terminals other than that high-priority terminal, and, for the low-priority terminals, transmits the voice data transmitted from the high-priority terminals to the server and the voice data transmitted from the low-priority terminals to the server separately by multicast, and
the low-priority terminal, when receiving and reproducing the voice data transmitted from the server by multicast, reproduces, while it is transmitting valid voice data of its own terminal to the server, the voice data from the high-priority terminals among the received voice data.

The call conference system according to any one of claims 1 to 3, wherein the low-priority terminal detects the presence or absence of speech in the voice data of its own terminal and transmits the voice data in which speech is detected to the server as the valid voice data.

The call conference system according to claim 1, wherein the server permits or denies speech by the user of the low-priority terminal, and
the low-priority terminal, upon receiving from the server permission for the user of its own terminal to speak, transmits the user's speech to the server as the valid voice data.

The call conference system described in the section, wherein the voice data transmitted and received between the server and the low-priority terminal is encoded at a bit rate lower than that of PCM.

The call conference system according to claim 4, wherein the server uses, as the voice data from the low-priority terminals to be included in the voice data transmitted by multicast to the low-priority terminals, a synthesis of the voice data transmitted to the server from the plurality of low-priority terminals.

The call conference system according to claim 5, wherein the server uses, as the voice data from the low-priority terminals to be included in the voice data transmitted by multicast to the low-priority terminals, the voice data transmitted to the server from the low-priority terminal that has been permitted to speak.