JP2009094630A

JP2009094630A - Distribution system and method

Info

Publication number: JP2009094630A
Application number: JP2007261066A
Authority: JP
Inventors: Takahiro Tanaka; 孝浩田中
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2007-10-04
Filing date: 2007-10-04
Publication date: 2009-04-30

Abstract

PROBLEM TO BE SOLVED: To provide a distribution system and a distribution method that, by blocking only the user's own speech at a communication terminal, let the user of that terminal hear the speech of users of other communication terminals even while speaking, and that maintain voice quality.

SOLUTION: In this distribution system, a distribution server transmits to each communication terminal distribution voice data, in which the input voice data generated in the distribution server and the terminal voice data transmitted from each communication terminal are each allocated to a channel, together with allocation information indicating the contents of the allocation. Each communication terminal, on the basis of the allocation information, emits the distribution voice data of the channels other than the channel to which the terminal voice data it transmitted is allocated. In this way, the user of a communication terminal can listen to speech other than his or her own.

COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a technique for improving voice quality when a distribution server distributes data to communication terminals and the users of the distribution server and of the communication terminals communicate with one another.

In a system that distributes a television broadcast or the like to communication terminals via a communication network, the distribution server on the sending side and a communication terminal on the receiving side may communicate with each other. In such a case, an echo canceller is provided to cancel the echo caused by acoustic coupling between the microphone and the speaker of the distribution server or the communication terminal, so that voice quality is maintained.

In a system where the distribution server also distributes the communication itself to each communication terminal via the communication network, so that parties other than those taking part can follow it, the delay that arises during distribution over the network can have an effect. For example, the voice of a communication terminal's user, contained in the sound of the television broadcast, may arrive back at that user with a delay after the user spoke. This is disconcerting, and acoustic coupling may further degrade voice quality. To counter this degradation, a technique has been disclosed in which, while the user of a communication terminal is speaking, the voice data from the communication terminals that the distribution server distributes to that terminal is blocked so that it is not emitted (see, for example, Patent Document 1).
Patent Document 1: Japanese Patent Laid-Open No. 3-18130

With the technique described in Patent Document 1, however, when users of several communication terminals are speaking, the communication terminal blocks not only the voice data of its own user's speech but also the voice data of the other terminals' users. While speaking, the user of a communication terminal could therefore hear only the sound from the distribution server.

The present invention has been made in view of these circumstances. Its object is to provide a distribution system and a distribution method that, by blocking only the speech of a communication terminal's own user at that terminal, allow the user to hear the speech of users of other communication terminals even while speaking, and that maintain voice quality.

To solve the above problem, the present invention provides a distribution system having a distribution server and a plurality of communication terminals that communicate via a communication network. The distribution server comprises: distribution voice data transmitting means for transmitting distribution voice data composed of a plurality of channels to the communication terminals; terminal voice data receiving means for receiving terminal voice data from the communication terminals; voice data input means to which voice data in streaming format is input; and allocating means for allocating the voice data input to the voice data input means and the terminal voice data of each communication terminal received by the terminal voice data receiving means each to one of the plurality of channels of the distribution voice data.
Each communication terminal comprises: terminal voice data input means to which terminal voice data in streaming format is input; terminal voice data transmitting means for transmitting the terminal voice data input to the terminal voice data input means to the distribution server; receiving means for receiving the distribution voice data from the distribution server; specifying means for specifying, among the plurality of channels of the distribution voice data received by the receiving means, the channel to which the terminal voice data transmitted by the terminal voice data transmitting means is allocated; and output means for outputting, among the plurality of channels of the distribution voice data received by the receiving means, the distribution voice data of the channels other than the channel specified by the specifying means.

In another preferred aspect, the terminal voice data receiving means, when receiving terminal voice data, further receives terminal information identifying the communication terminal that transmitted it; the distribution voice data transmitting means generates allocation information that associates the channel allocated to the terminal voice data by the allocating means with the communication terminal that transmitted that terminal voice data, and further transmits the allocation information to the communication terminals; the terminal voice data transmitting means, when transmitting terminal voice data, further transmits terminal information identifying its own terminal to the distribution server; the receiving means further receives the allocation information from the distribution server; and the specifying means may specify the channel on the basis of the allocation information received by the receiving means.

In another preferred aspect, the specifying means may specify the channel by comparing the voice pattern of each of the plurality of channels of the distribution voice data received by the receiving means with the voice pattern of the terminal voice data transmitted by the terminal voice data transmitting means.
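The text does not say how the voice patterns are compared. Purely as an illustration, the sketch below uses normalized cross-correlation over a small range of delays, which is an assumed technique and not one named in the source. The function names, the `max_lag` and `threshold` parameters, and the representation of audio as lists of float samples are all hypothetical. The returned channel index is 0-based.

```python
import math
from typing import List, Optional, Sequence


def best_correlation(sent: Sequence[float], received: Sequence[float], max_lag: int) -> float:
    """Highest normalized correlation of `sent` against `received` for delays 0..max_lag."""
    best = 0.0
    norm_sent = math.sqrt(sum(s * s for s in sent))
    for lag in range(max_lag + 1):
        window = received[lag:lag + len(sent)]
        if len(window) < len(sent):
            break  # not enough received samples left for a full comparison
        norm_win = math.sqrt(sum(w * w for w in window))
        if norm_sent == 0.0 or norm_win == 0.0:
            continue  # silence carries no pattern to compare
        dot = sum(s * w for s, w in zip(sent, window))
        best = max(best, dot / (norm_sent * norm_win))
    return best


def identify_own_channel(
    sent: Sequence[float],
    channels: List[Sequence[float]],
    max_lag: int = 8,          # assumed upper bound on the delay, in samples
    threshold: float = 0.9,    # assumed similarity needed to accept a match
) -> Optional[int]:
    """Return the 0-based index of the channel that best matches `sent`, or None."""
    scores = [best_correlation(sent, ch, max_lag) for ch in channels]
    if not scores:
        return None
    index = max(range(len(scores)), key=scores.__getitem__)
    return index if scores[index] >= threshold else None
```

In practice the received audio has passed through an encoder and decoder, so an exact match cannot be expected and the threshold would need tuning.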

In another preferred aspect, the allocating means may allocate, among the terminal voice data of each communication terminal received by the terminal voice data receiving means, each item of terminal voice data that satisfies a predetermined condition to one of the plurality of channels of the distribution voice data.

In another preferred aspect, the allocating means may allocate, among the terminal voice data of each communication terminal received by the terminal voice data receiving means, each item of terminal voice data that satisfies a predetermined condition to one of the plurality of channels of the distribution voice data, and may allocate the terminal voice data that does not satisfy the predetermined condition to a single channel as data obtained by combining that voice data.
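The predetermined condition is not defined in this passage. The following sketch assumes, only for the sake of example, that the condition is a minimum RMS level: terminals at or above it keep a dedicated channel, and the rest are summed into one shared channel. The names `split_by_condition` and `min_rms`, and the sample-list representation, are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Samples = List[float]


def rms(samples: Samples) -> float:
    """Root-mean-square level of one block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def split_by_condition(
    terminal_audio: Dict[str, Samples],
    min_rms: float = 0.1,  # assumed condition: a minimum signal level
) -> Tuple[Dict[str, Samples], Samples]:
    """Return (dedicated, shared).

    `dedicated` maps each terminal that meets the condition to its own samples.
    `shared` is the sample-wise sum of all remaining terminals, intended for
    one common channel; it is empty when every terminal meets the condition.
    """
    dedicated: Dict[str, Samples] = {}
    rest: List[Samples] = []
    for label, samples in terminal_audio.items():
        if rms(samples) >= min_rms:
            dedicated[label] = samples
        else:
            rest.append(samples)
    shared = [sum(frame) for frame in zip(*rest)] if rest else []
    return dedicated, shared
```

A terminal whose audio ends up in the shared channel cannot have its own voice removed individually, which is presumably why only data below the condition is combined.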

The present invention also provides a distribution method used in a distribution system having a distribution server and a plurality of communication terminals that communicate via a communication network. The method used in the distribution server comprises: a distribution voice data transmitting step of transmitting distribution voice data composed of a plurality of channels to the communication terminals; a terminal voice data receiving step of receiving terminal voice data from the communication terminals; a voice data input step in which voice data in streaming format is input; and an allocating step of allocating the voice data input in the voice data input step and the terminal voice data of each communication terminal received in the terminal voice data receiving step each to one of the plurality of channels of the distribution voice data.
The method used in each communication terminal comprises: a terminal voice data input step in which terminal voice data in streaming format is input; a terminal voice data transmitting step of transmitting the terminal voice data input in the terminal voice data input step to the distribution server; a receiving step of receiving the distribution voice data from the distribution server; a specifying step of specifying, among the plurality of channels of the distribution voice data received in the receiving step, the channel to which the terminal voice data transmitted in the terminal voice data transmitting step is allocated; and an output step of outputting, among the plurality of channels of the distribution voice data received in the receiving step, the distribution voice data of the channels other than the channel specified in the specifying step.

According to the present invention, it is possible to provide a distribution system and a distribution method that, by blocking only the speech of a communication terminal's own user at that terminal, allow the user to hear the speech of users of other communication terminals even while speaking, and that maintain voice quality.

An embodiment of the present invention is described below.

<Embodiment>
As shown in FIG. 1, the distribution system according to the embodiment of the present invention has a distribution server 1 and, as a plurality of communication terminals, communication terminals 2-A and 2-B. Where the communication terminals 2-A and 2-B need not be distinguished, they are referred to simply as the communication terminal 2. The distribution server 1 and the communication terminals 2 are connected via a communication network 1000 and exchange various data. The distribution server 1 can communicate with the plurality of communication terminals 2 by multicast. The configurations of the distribution server 1 and the communication terminal 2 are described in turn below.

The distribution server 1 has a control unit including a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like (not shown); the CPU loads a program stored in the ROM into the RAM and executes it. As shown in FIG. 2, the distribution server 1 has a communication unit 10, an audio output unit 12, and an audio input unit 13, and the CPU controls the distribution server 1 so as to implement a decoding unit 11, a channel allocation unit 14, and an encoding unit 15. FIG. 2 is a block diagram showing the configuration of the distribution server 1.

The communication unit 10 is communication means that exchanges data with the communication terminals 2 via the communication network 1000 by wire, wirelessly, or the like. In this embodiment, it transmits distribution data to the communication terminals 2 and receives terminal data from them. The distribution data and the terminal data are described later.

The decoding unit 11 decodes the terminal data that the communication unit 10 receives from a communication terminal 2. As described later, the terminal data contains terminal voice data encoded in the communication terminal 2 and terminal information identifying the communication terminal 2 that transmitted the terminal data. By decoding the terminal data, the decoding unit 11 obtains the terminal voice data and the terminal information. It outputs the terminal voice data to the audio output unit 12 and the channel allocation unit 14, and also outputs the terminal information, associated with the terminal voice data, to the channel allocation unit 14.

When the communication unit 10 receives terminal data from a plurality of communication terminals 2, the decoding unit 11 outputs the terminal voice data and the terminal information separately for each communication terminal 2.

The audio output unit 12 has sound emitting means such as a speaker and emits sound based on the terminal voice data, which is audio data input from the decoding unit 11. When several items of terminal voice data are input, it mixes them and emits the result. The audio input unit 13 has a microphone that picks up sound, generates input voice data, which is streaming-format audio data based on the sound picked up by the microphone, and outputs it to the channel allocation unit 14. In this embodiment, the input voice data is audio data composed of two channels, an L channel and an R channel.

The channel allocation unit 14 allocates the input voice data from the audio input unit 13 and the terminal voice data from the decoding unit 11 to different channels, thereby generating distribution voice data, which is multi-channel audio data. It also generates allocation information indicating, for each channel, the content of the audio data allocated to it. For terminal voice data, the content is the communication terminal 2 that transmitted the corresponding terminal data: the communication terminal 2-A is indicated as "communication terminal A" and the communication terminal 2-B as "communication terminal B". The transmitting communication terminal 2 is identified from the terminal information input from the decoding unit 11. For input voice data, the L channel is indicated as "distribution server L" and the R channel as "distribution server R".

In this embodiment, the distribution voice data is composed of three channels, and the allocation information is generated as a table such as those shown in FIG. 3. FIG. 3(a) shows the case where no terminal data is being transmitted from the communication terminals 2: the L channel of the input voice data is allocated to channel 1 and the R channel to channel 2. Channel 3 is unused because there is no data to allocate to it.

FIG. 3(b) shows the case where terminal data is being transmitted from the communication terminal 2-A: the terminal voice data from the communication terminal 2-A is allocated to channel 3, which was unused in FIG. 3(a). FIG. 3(c) shows the case where terminal data is being transmitted simultaneously from the communication terminals 2-A and 2-B: the L and R channels of the input voice data are mixed into a single channel and allocated to channel 1, the terminal voice data from the communication terminal 2-A is allocated to channel 2, and the terminal voice data from the communication terminal 2-B is allocated to channel 3. When terminal data is transmitted alternately from the communication terminals 2-A and 2-B, the respective terminal voice data may be allocated alternately to channel 3 of FIG. 3(b). In this way, the channel allocation unit 14 allocates each item of audio data to a channel according to the state of reception of terminal data from the communication terminals 2. Because this allocation is carried out continually, the content of the allocation information also changes with the allocation.
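The three cases of FIG. 3 can be sketched as a single allocation routine. This is a minimal illustration under stated assumptions, not the implementation of the channel allocation unit 14: the function name `allocate_channels`, the use of lists of float samples, averaging as the mixing rule, and the label "distribution server L+R" for the mixed channel are all hypothetical. The other labels follow the ones given in the description.

```python
from typing import Dict, List, Optional, Tuple

Samples = List[float]


def mix(a: Samples, b: Samples) -> Samples:
    """Average two equal-length blocks into one channel (assumed mixing rule)."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def allocate_channels(
    server_l: Samples,
    server_r: Samples,
    terminal_audio: Dict[str, Samples],
    num_channels: int = 3,
) -> Tuple[List[Optional[Samples]], List[Optional[str]]]:
    """Return (channels, allocation_table) for one block of audio.

    `terminal_audio` maps a label such as "communication terminal A" to the
    samples currently being received from that terminal. Unused channels are
    represented by None in both returned lists.
    """
    if len(terminal_audio) > num_channels - 1:
        raise ValueError("more active terminals than available channels")
    channels: List[Optional[Samples]]
    table: List[Optional[str]]
    if len(terminal_audio) <= num_channels - 2:
        # Enough room to keep the server audio in stereo: FIG. 3(a) and 3(b).
        channels = [server_l, server_r]
        table = ["distribution server L", "distribution server R"]
    else:
        # All remaining channels are needed for terminals: FIG. 3(c).
        channels = [mix(server_l, server_r)]
        table = ["distribution server L+R"]
    for label in sorted(terminal_audio):
        channels.append(terminal_audio[label])
        table.append(label)
    while len(channels) < num_channels:
        channels.append(None)
        table.append(None)
    return channels, table
```

Calling the routine again for every block of audio mirrors the statement that the allocation, and with it the allocation information, changes as terminals start and stop transmitting.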

Returning to FIG. 2, the encoding unit 15 applies a predetermined encoding to the distribution voice data input from the channel allocation unit 14 and generates distribution data. The distribution data also contains the allocation information input from the channel allocation unit 14. The distribution data generated by the encoding unit 15 is output to the communication unit 10 and transmitted to each communication terminal 2. This concludes the description of the configuration of the distribution server 1.

Next, the configuration of the communication terminal 2 is described with reference to FIG. 4. FIG. 4 is a block diagram showing the configuration of the communication terminal 2.

The decoding unit 21 decodes the distribution data that the communication unit 20 receives from the distribution server 1. As described above, the distribution data contains the distribution voice data, to which the predetermined encoding has been applied, and the allocation information. By decoding the distribution data, the decoding unit 21 obtains the distribution voice data and the allocation information and outputs them to the mute unit 26.

Based on the allocation information input from the decoding unit 21, the mute unit 26 identifies the channel corresponding to its own communication terminal 2, that is, the channel to which the terminal voice data of the terminal data transmitted by its own communication terminal 2 is allocated. When such a channel exists, the mute unit 26 outputs to the audio output unit 22, among the plurality of channels of the distribution voice data input from the decoding unit 21, the distribution voice data of the channels other than the identified channel. When no such channel exists, it outputs the distribution voice data of all channels to the audio output unit 22. The distribution voice data output from the mute unit 26 is hereinafter referred to as output voice data.
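The behaviour of the mute unit 26 amounts to filtering the received channels by the allocation table. The sketch below assumes the table is a list of labels, one per channel, with None marking an unused channel; the function name `mute_own_channel` and this data layout are illustrative assumptions rather than details from the source.

```python
from typing import List, Optional, Sequence


def mute_own_channel(
    channels: Sequence[Optional[List[float]]],
    allocation: Sequence[Optional[str]],
    own_label: str,
) -> List[List[float]]:
    """Return the channels to emit: everything except the terminal's own channel.

    Unused channels (None) are dropped as well. If `own_label` does not appear
    in the allocation, all used channels are returned unchanged.
    """
    return [
        samples
        for samples, label in zip(channels, allocation)
        if samples is not None and label != own_label
    ]
```

For example, with the allocation of FIG. 3(c), a terminal labelled "communication terminal A" would keep channel 1 (the server) and channel 3 (the other terminal) and drop channel 2.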

The audio output unit 22 has sound emitting means such as a speaker and emits sound based on the output voice data, which is multi-channel audio data input from the mute unit 26. When only monaural emission is supported, for example when there is only one speaker, the sounds of the channels contained in the output voice data are mixed and emitted. When multi-channel emission is supported, for example when there are several speakers, the speaker that emits the sound of each channel may be set in advance.
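For the monaural case, the mixing step could look like the following sketch. The source does not state how the channels are mixed; summing sample by sample and clipping to the range [-1.0, 1.0] is an assumption made here, as is the function name `downmix_to_mono`.

```python
from typing import List


def downmix_to_mono(channels: List[List[float]]) -> List[float]:
    """Sum equal-length channels sample by sample and clip to [-1.0, 1.0]."""
    if not channels:
        return []
    mixed = [sum(frame) for frame in zip(*channels)]
    # Clipping keeps the sum inside the assumed normalized sample range.
    return [max(-1.0, min(1.0, s)) for s in mixed]
```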

The audio input unit 23 has a microphone that picks up sound, generates terminal voice data, which is streaming-format audio data based on the sound picked up by the microphone, and outputs it to the encoding unit 25. The terminal information storage unit 24 is storage means that stores terminal information identifying its own communication terminal 2.

The encoding unit 25 applies a predetermined encoding to the terminal voice data input from the audio input unit 23 and generates terminal data. The encoding unit 25 also reads the terminal information from the terminal information storage unit 24, includes it in the terminal data, and outputs the terminal data to the communication unit 20. The communication unit 20 then transmits the terminal data to the distribution server 1. This concludes the description of the configuration of the communication terminal 2.
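The "predetermined encoding" and the layout of the terminal data are not specified. To make the idea of bundling terminal information with terminal voice data concrete, the sketch below invents a simple framing: a header holding the identifier length and the sample count, followed by the identifier and 16-bit PCM samples in network byte order. This layout and the function names are purely hypothetical; the decoder stands in for what the decoding unit 11 would do on the server side.

```python
import struct
from typing import List, Tuple

_HEADER = "!BH"  # assumed header: 1-byte identifier length, 2-byte sample count


def encode_terminal_data(terminal_id: str, samples: List[int]) -> bytes:
    """Pack terminal information and 16-bit PCM samples into one byte string."""
    ident = terminal_id.encode("utf-8")
    header = struct.pack(_HEADER, len(ident), len(samples))
    body = struct.pack(f"!{len(samples)}h", *samples)
    return header + ident + body


def decode_terminal_data(packet: bytes) -> Tuple[str, List[int]]:
    """Recover (terminal_id, samples) from a packet built by encode_terminal_data."""
    id_len, count = struct.unpack_from(_HEADER, packet, 0)
    offset = struct.calcsize(_HEADER)
    terminal_id = packet[offset:offset + id_len].decode("utf-8")
    samples = list(struct.unpack_from(f"!{count}h", packet, offset + id_len))
    return terminal_id, samples
```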

Next, the operation of the distribution system according to the embodiment of the present invention is described. In this description, the user of the communication terminal 2-A is called user A, the user of the communication terminal 2-B user B, and the user of the distribution server 1 user C.

User C is giving a live performance on a musical instrument (hereinafter, the live performance) in a music studio where the distribution server 1 is installed. User C operates an operation unit (not shown) of the distribution server 1 so that the CPU makes the communication unit 10 transmit distribution data but not receive terminal data.

The audio input unit 13 generates input voice data representing the recorded sound of the live performance (hereinafter, the live sound) and outputs it to the channel allocation unit 14. Because the communication unit 10 does not receive terminal data, no terminal voice data or terminal information is input to the channel allocation unit 14. The channel allocation unit 14 therefore outputs to the encoding unit 15 distribution voice data in which the L channel of the live sound is allocated to channel 1 and the R channel to channel 2, and generates allocation information as shown in FIG. 3(a) and outputs it to the encoding unit 15. The encoding unit 15 generates distribution data containing the encoded distribution voice data and the allocation information, and this distribution data is transmitted to the communication terminals 2-A and 2-B via the communication unit 10.

The communication units 20 of the communication terminals 2-A and 2-B receive the distribution data, the decoding unit 21 decodes it, and the distribution voice data and the allocation information are input to the mute unit 26. The mute unit 26 refers to the allocation information to identify the channel corresponding to its own communication terminal 2. Because the allocation information is as shown in FIG. 3(a), no channel is identified, and all channels of the distribution voice data are output unchanged to the audio output unit 22 as output voice data. The communication terminals 2-A and 2-B thus emit the live sound in stereo, and users A and B can enjoy it.

User C then ends the live performance and, in order to communicate with users A and B, operates the operation unit of the distribution server 1 so that the CPU makes the communication unit 10 receive terminal data. User C also proposes this to users A and B by voice through the audio input unit 13. When user A starts talking, the audio input unit 23 of the communication terminal 2-A generates terminal voice data (hereinafter, terminal voice data A) representing the recorded voice of user A's conversation (hereinafter, conversation voice A) and outputs it to the encoding unit 25. The encoding unit 25 encodes the terminal voice data A, reads the terminal information indicating the communication terminal 2-A from the terminal information storage unit 24, generates terminal data (hereinafter, terminal data A), and has it transmitted to the distribution server 1 via the communication unit 20.

The decoding unit 11 of the distribution server 1 decodes the terminal data A received by the communication unit 10 and outputs the terminal voice data A to the audio output unit 12, which emits conversation voice A. The channel allocation unit 14 receives the terminal voice data A and the terminal information indicating the communication terminal 2-A from the decoding unit 11, and receives from the audio input unit 13 input voice data representing the recorded voice of user C's conversation (hereinafter, conversation voice C).

Because terminal voice data A is newly input, the channel allocation unit 14 allocates it to channel 3 of the distribution voice data, generates allocation information as shown in FIG. 3(b), and outputs it to the encoding unit 15. The distribution data output by the encoding unit 15 is transmitted to the communication terminals 2-A and 2-B via the communication unit 10.

When the communication terminal 2-A receives the distribution data from the distribution server 1, the allocation information shown in FIG. 3(b) and the distribution voice data are input to the mute unit 26. The mute unit 26 refers to the allocation information, determines that the channel indicating its own communication terminal 2-A is channel 3, removes channel 3, to which terminal voice data A is allocated, from the channels of the distribution voice data, and outputs the output voice data consisting of channels 1 and 2 to the audio output unit 22. The audio output unit 22 thus emits conversation voice C but not conversation voice A, user A's own speech. User A does not have to hear his or her own speech with a delay, so there is nothing disconcerting and voice quality is not degraded.

In communication terminal 2-B, on the other hand, the mute unit 26 finds no channel corresponding to its own communication terminal 2-B in the allocation information, so the voice output unit 22 emits both conversation voice A and conversation voice C, and user B can hear the speech of both user A and user C.

Next, suppose user B also starts speaking, and the voice input unit 23 of communication terminal 2-B records that speech (hereinafter, conversation voice B) and generates terminal voice data (hereinafter, terminal voice data B). The channel allocation unit 14 of the distribution server 1 then receives the input voice data representing conversation voice C from the voice input unit 13, and receives two pairs from the decoding unit 11: the terminal information indicating communication terminal 2-A together with terminal voice data A, and the terminal information indicating communication terminal 2-B together with terminal voice data B. As a result, the channel allocation unit 14 generates distribution voice data in which a mix of the L and R channels of the input voice data is allocated to channel 1, terminal voice data A to channel 2, and terminal voice data B to channel 3, and also generates the allocation information shown in FIG. 3(c).

The distribution server 1 then emits conversation voices A and B, communication terminal 2-A emits conversation voices B and C, and communication terminal 2-B emits conversation voices A and C, so users A, B, and C each hear the voices other than their own speech.

If user A stops speaking for a predetermined time, terminal voice data A may be kept from being input to the channel allocation unit 14 of the distribution server 1. The distribution server 1 may make this determination: when the volume of conversation voice A represented by terminal voice data A stays at or below a predetermined value for a predetermined time or longer, it judges that user A is not speaking and withholds the data from the channel allocation unit 14. Alternatively, communication terminal 2-A may make the determination and stop transmitting terminal data from the communication unit 20. When terminal voice data B becomes the only terminal voice data input to the channel allocation unit 14, channel 1 may be allocated to the L channel of the input voice data, channel 2 to the R channel, and channel 3 to terminal voice data B, with allocation information generated in which the content of channel 3 in FIG. 3(b) is communication terminal 2-B. In this way, the channel allocation may be changed as needed.
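The allocation rule used by the channel allocation unit 14 in this embodiment (three channels, stereo input kept while one terminal is speaking, input mixed to one channel when two terminals are speaking) can be sketched as follows. The function name, the labels in the allocation information, and mixing by averaging the L and R samples are assumptions made for illustration; the text only says the two input channels are mixed.

```python
def allocate_channels(input_l, input_r, terminal_audio, total_channels=3):
    """Build distribution channels and allocation information.

    terminal_audio: {terminal label: samples} for terminals currently speaking.
    Returns (channels, allocation), both keyed by 1-based channel number.
    """
    n = len(terminal_audio)
    if n > total_channels - 1:
        raise ValueError("at least one channel must remain for the input voice data")

    channels, allocation = {}, {}
    if total_channels - n >= 2:
        # Enough room to keep the input voice data in stereo.
        channels[1], allocation[1] = list(input_l), "input-L"
        channels[2], allocation[2] = list(input_r), "input-R"
    else:
        # Only one channel left: mix L and R (averaging is an assumption).
        channels[1] = [(l + r) / 2 for l, r in zip(input_l, input_r)]
        allocation[1] = "input-mix"

    # Terminal voice data occupies the last n channels, in input order.
    first = total_channels - n + 1
    for offset, (terminal, samples) in enumerate(terminal_audio.items()):
        channels[first + offset] = list(samples)
        allocation[first + offset] = terminal
    return channels, allocation


both_ch, both_info = allocate_channels(
    [1.0, 0.0], [0.0, 1.0], {"2-A": [0.5, 0.5], "2-B": [0.2, 0.2]}
)
only_b_ch, only_b_info = allocate_channels([1.0, 0.0], [0.0, 1.0], {"2-B": [0.2, 0.2]})
```

With both terminals speaking, the result corresponds to the FIG. 3(c) case (mixed input on channel 1, A on channel 2, B on channel 3). With only terminal 2-B speaking, the input returns to channels 1 and 2 and B moves to channel 3.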

As described above, in the distribution system of the present invention, the distribution server 1 transmits to each communication terminal 2 the distribution voice data, in which the input voice data generated in the distribution server 1 and the terminal voice data transmitted from each communication terminal 2 are allocated channel by channel, together with allocation information indicating the content of that allocation. Each communication terminal 2, based on the allocation information, emits the distribution voice data of the channels other than the channel to which the terminal voice data of its own transmitted terminal data is allocated, so the user of that communication terminal 2 hears the voices other than his or her own speech. The user therefore experiences no discomfort, and the quality of the voice used for communication between the user of the distribution server 1 and the users of the communication terminals 2 is maintained.

An embodiment of the present invention has been described above, but the present invention can also be implemented in various other forms, as follows.

<Modification 1>
In the embodiment described above, the communication terminal 2 transmits terminal information and receives allocation information, and the mute unit 26 uses the allocation information to identify, among the channels of the distribution voice data received from the distribution server 1, the channel to which the terminal voice data of the terminal data sent by its own communication terminal 2 is allocated. The channel may instead be identified by another method that uses neither the terminal information nor the allocation information.

For example, an analysis unit 27 is provided in the communication terminal 2 as shown in FIG. 5. The analysis unit 27 receives terminal voice data from the voice input unit 23 and distribution voice data from the decoding unit 21. By comparing the voice pattern of each channel of the distribution voice data with the voice pattern of the terminal voice data, it determines whether any channel of the distribution voice data carries this terminal voice data and, if so, identifies that channel. The comparison may use features such as the change in volume over time or a spectrum such as the frequency distribution; the two patterns are regarded as the same if they match, or if a similarity score indicating their degree of resemblance is at or above a predetermined value.

Between the time the terminal voice data is input from the voice input unit 23 to the analysis unit 27 and the time it returns as distribution voice data from the decoding unit 21 to the analysis unit 27, there is a time lag caused by processing and communication delays. It is therefore desirable to delay the terminal voice data input from the voice input unit 23 by an amount corresponding to that lag before comparing its voice pattern with each channel of the distribution voice data input from the decoding unit 21. Alternatively, the analysis unit 27 may store a predetermined section of the terminal voice data input from the voice input unit 23, such as the first few seconds, and compare the voice pattern of that stored section with the voice pattern of each channel of the distribution voice data input from the decoding unit 21 to identify the channel that can be regarded as the same.

By comparing the voice pattern of the terminal voice data generated in the voice input unit 23 with that of each channel of the distribution voice data output from the decoding unit 21 in this way, the channel to which the terminal voice data of the terminal's own transmitted terminal data is allocated can be identified, and the same effect as in the embodiment is obtained without using terminal information or allocation information.
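One way the analysis unit 27 might perform this comparison is sketched below. It is an illustration under assumptions, not the method fixed by the text: the voice pattern is taken to be a frame-by-frame volume envelope, the similarity score is the cosine similarity of two envelopes, the round-trip delay is assumed to be known in samples, and the threshold of 0.9 and all function names are hypothetical.

```python
import math


def envelope(samples, frame=4):
    """Mean absolute amplitude per frame, used here as the 'voice pattern'."""
    return [
        sum(abs(s) for s in samples[i:i + frame]) / frame
        for i in range(0, len(samples) - frame + 1, frame)
    ]


def similarity(a, b):
    """Cosine similarity of two envelopes (0.0 if either one is silent)."""
    n = min(len(a), len(b))
    dot = sum(x * y for x, y in zip(a[:n], b[:n]))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(y * y for y in b[:n]))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_own_channel(own_samples, channels, delay, threshold=0.9, frame=4):
    """Return the channel whose pattern matches own_samples, or None.

    delay: assumed round-trip lag in samples; the received channel is shifted
    by this amount so that it lines up with the locally recorded voice.
    """
    reference = envelope(own_samples, frame)
    best_channel, best_score = None, threshold
    for ch, samples in channels.items():
        score = similarity(reference, envelope(samples[delay:], frame))
        if score >= best_score:
            best_channel, best_score = ch, score
    return best_channel


own = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
received = {
    1: [0.5] * 18,                                      # steady input voice
    2: [0, 0] + [1, 1, 1, 1, 0, 0, 0, 0] * 2,           # another speaker
    3: [0, 0] + own,                                    # own voice, delayed by 2 samples
}
own_channel = find_own_channel(own, received, delay=2)
no_match = find_own_channel(own, {1: [0.5] * 18}, delay=2)
```

Channel 3 is identified because, once the two-sample delay is compensated, its envelope equals that of the locally recorded voice. When no channel reaches the threshold, the function returns `None` and nothing would be muted.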

<Modification 2>
In the embodiment described above, user C operates the operation unit of the distribution server 1 to control whether the distribution server 1 accepts the terminal data transmitted from the communication terminals 2, and when reception is enabled, terminal data from all communication terminals 2 can be received. The set of communication terminals 2 whose terminal data is accepted may instead be restricted. The restriction may be applied by operating the operation unit of the distribution server 1, or the number of eligible communication terminals 2 may be limited according to the number of channels of the distribution voice data.

When the limit depends on the number of channels and there are, for example, six channels, at least one channel must be reserved for the input voice data of the distribution server 1, so limiting the number of communication terminals 2 to five ensures that every terminal voice data can be allocated a channel. If three channels are needed for the input voice data, the number of communication terminals 2 may be limited to two. When such a limit is applied, the channel allocated to each communication terminal 2 may be set in advance. In that case, if the number of channels needed for the input voice data is fixed and does not change, the content of the allocation information can also be fixed.

<Modification 3>
In the embodiment described above, there were only two communication terminals 2 (2-A and 2-B), so even when the distribution server 1 received terminal data from both at the same time, the channel allocation unit 14 could allocate each terminal voice data to its own channel. When there are more communication terminals 2, the following methods may be used.

As a first method, from the multiple terminal voice data input to the channel allocation unit 14, as many as can be allocated (at most two channels in the embodiment) are selected based on their volume level. For example, the two terminal voice data with the highest volume levels may be selected and allocated. Alternatively, over the period from a predetermined time before the allocation up to the allocation, terminal voice data may be allocated in descending order of the length of time for which the volume level was at or above a predetermined value, or in descending order of average volume level. The allocation may be performed at predetermined intervals or in real time. In other words, a predetermined condition is set, and terminal voice data that does not satisfy the condition and is therefore not allocated is excluded from the distribution voice data. The predetermined condition is not limited to one based on volume level; any condition based on a physical quantity that can be extracted from the terminal voice data may be used.
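The first method can be illustrated by ranking terminals by volume and keeping as many as there are free channels. This is a sketch under assumptions: the volume level is measured here as the root mean square (RMS) of the samples in the current window, which the text does not specify, and the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_loudest(terminal_audio, slots=2):
    """Return the labels of the `slots` loudest terminals, loudest first.

    Terminals that are not returned would be excluded from the
    distribution voice data for this allocation period.
    """
    ranked = sorted(terminal_audio, key=lambda t: rms(terminal_audio[t]), reverse=True)
    return ranked[:slots]


window = {
    "2-A": [0.1, -0.1, 0.1, -0.1],
    "2-B": [0.9, -0.9, 0.9, -0.9],
    "2-C": [0.5, -0.5, 0.5, -0.5],
}
selected = select_loudest(window, slots=2)
```

Calling `select_loudest` again at each allocation interval gives the periodic or real-time reallocation described above. Replacing `rms` with another measure, such as the time spent above a threshold, gives the other variants.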

As a second method, a predetermined condition is set as in the first method. Each terminal voice data that satisfies the condition is allocated its own channel, and the remaining terminal voice data that do not satisfy the condition are combined and allocated a single shared channel. For example, the terminal voice data with the highest volume level is allocated one channel, and the remaining terminal voice data are combined, for example by mixing, and allocated together to one channel. A communication terminal 2 that transmitted one of the terminal voice data allocated to the shared channel may, based on the allocation information, either output that channel to the voice output unit 22 or not output it. Even if the channel is output to the voice output unit 22, the degradation in voice quality is small when the volume level is low.
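A minimal sketch of the second method follows, using the example condition from the text (the loudest terminal gets its own channel). Measuring loudness by RMS, mixing by averaging sample by sample, and the function names are assumptions made for illustration.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0


def split_loudest_and_mix(terminal_audio):
    """Give the loudest terminal its own channel and mix all the others.

    Returns (loudest label, its samples, mixed samples of the remaining terminals).
    """
    loudest = max(terminal_audio, key=lambda t: rms(terminal_audio[t]))
    rest = [samples for t, samples in terminal_audio.items() if t != loudest]
    # Average the remaining streams sample by sample (assumed mixing rule).
    mixed = [sum(column) / len(rest) for column in zip(*rest)] if rest else []
    return loudest, list(terminal_audio[loudest]), mixed


streams = {
    "2-A": [0.25, 0.25],
    "2-B": [1.0, 1.0],
    "2-C": [0.75, 0.75],
}
loudest, solo_channel, shared_channel = split_loudest_and_mix(streams)
```

Here terminal 2-B is allocated its own channel, and terminals 2-A and 2-C share one channel containing their average.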

As a third method, when terminal voice data are allocated together as in the second method, one channel of the distribution voice data may be divided along the time axis, and the multiple terminal voice data allocated alternately to the resulting time slots. In this case, each terminal voice data may be compressed along the time axis before allocation so that no data is lost in the time direction. For example, when two terminal voice data are allocated to one channel, each is divided into 1-second segments, each 1-second segment is compressed by a factor of two along the time axis into a 0.5-second segment, and the segments are allocated alternately to the one channel. The allocation information then changes every 0.5 seconds. The mute unit 26 of the communication terminal 2 expands each 0.5-second compressed segment back to 1 second and outputs to the voice output unit 22 everything except the terminal voice data of the terminal data it transmitted itself. In this way, the number of channels can effectively be increased.
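The time-division scheme of the third method can be sketched as below. The sketch makes strong simplifying assumptions: time compression is stood in for by dropping every other sample and expansion by repeating each sample, which a real system would replace with proper resampling or time-scale modification; the segment length is even; and both streams have the same length, a multiple of the segment length. All names are hypothetical.

```python
def compress(segment, factor=2):
    """Stand-in for time-axis compression: keep every `factor`-th sample."""
    return segment[::factor]


def expand(segment, factor=2):
    """Stand-in for time-axis expansion: repeat each sample `factor` times."""
    return [s for s in segment for _ in range(factor)]


def interleave(streams, seg_len):
    """Pack two streams into one channel as alternating compressed segments.

    streams: {label: samples} with exactly two entries.
    Returns (channel samples, schedule of labels, one per half-length slot);
    the schedule plays the role of the changing allocation information.
    """
    length = min(len(s) for s in streams.values())
    channel, schedule = [], []
    for start in range(0, length, seg_len):
        for label, samples in streams.items():
            channel.extend(compress(samples[start:start + seg_len]))
            schedule.append(label)
    return channel, schedule


def extract_others(channel, schedule, own, seg_len):
    """Terminal side: expand and keep every slot not sent by `own`."""
    slot = seg_len // 2
    out = []
    for i, label in enumerate(schedule):
        if label != own:
            out.extend(expand(channel[i * slot:(i + 1) * slot]))
    return out


a = [1, 1, 1, 1, 2, 2, 2, 2]
b = [5, 5, 5, 5, 6, 6, 6, 6]
shared, schedule = interleave({"2-A": a, "2-B": b}, seg_len=4)
heard_by_a = extract_others(shared, schedule, own="2-A", seg_len=4)
```

In this toy case, terminal 2-A recovers the stream of terminal 2-B from the shared channel and skips its own slots. The exact recovery only holds because the test signal is constant within each pair of samples; real audio would lose detail with this naive compression.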

<Modification 4>
In the embodiment described above, an echo canceller circuit may be provided in the voice input unit 13 to prevent the echo that arises from acoustic coupling when sound emitted from the voice output unit 12 is picked up by the microphone of the voice input unit 13.

FIG. 1 is an explanatory diagram showing the connection between the distribution server and the communication terminals in the distribution system according to the embodiment.
FIG. 2 is a block diagram showing the configuration of the distribution server according to the embodiment.
FIG. 3 is an explanatory diagram of the content of the allocation information according to the embodiment.
FIG. 4 is a block diagram showing the configuration of the communication terminal according to the embodiment.
FIG. 5 is a block diagram showing the configuration of the communication terminal according to Modification 1.

Explanation of symbols

1: distribution server; 2, 2-A, 2-B: communication terminal; 10, 20: communication unit; 11, 21: decoding unit; 12, 22: voice output unit; 13, 23: voice input unit; 14: channel allocation unit; 15, 25: encoding unit; 24: terminal information storage unit; 26: mute unit; 27: analysis unit; 1000: communication network

Claims

A distribution system having a distribution server and a plurality of communication terminals that communicate via a communication network, wherein
the distribution server comprises:
distribution voice data transmitting means for transmitting distribution voice data composed of a plurality of channels to the communication terminals;
terminal voice data receiving means for receiving terminal voice data from the communication terminals;
voice data input means to which voice data is input in a streaming format; and
allocating means for allocating each of the voice data input to the voice data input means and the terminal voice data of each communication terminal received by the terminal voice data receiving means to one of the plurality of channels of the distribution voice data, and
each communication terminal comprises:
terminal voice data input means to which terminal voice data is input in a streaming format;
terminal voice data transmitting means for transmitting the terminal voice data input to the terminal voice data input means to the distribution server;
receiving means for receiving the distribution voice data from the distribution server;
specifying means for specifying, among the plurality of channels of the distribution voice data received by the receiving means, the channel to which the terminal voice data transmitted by the terminal voice data transmitting means is allocated; and
output means for outputting, among the plurality of channels of the distribution voice data received by the receiving means, the distribution voice data of the channels other than the channel specified by the specifying means.

The distribution system according to claim 1, wherein:
when receiving the terminal voice data, the terminal voice data receiving means further receives terminal information specifying the communication terminal that transmitted the terminal voice data;
the distribution voice data transmitting means generates allocation information associating the channel allocated to the terminal voice data by the allocating means with the communication terminal that transmitted that terminal voice data, and further transmits the allocation information to the communication terminals;
when transmitting the terminal voice data, the terminal voice data transmitting means further transmits terminal information specifying its own communication terminal to the distribution server;
the receiving means further receives the allocation information from the distribution server; and
the specifying means specifies the channel based on the allocation information received by the receiving means.

The distribution system according to claim 1, wherein the specifying means specifies the channel based on a comparison between the voice pattern represented by each of the plurality of channels of the distribution voice data received by the receiving means and the voice pattern represented by the terminal voice data transmitted by the terminal voice data transmitting means.

The distribution system according to any one of claims 1 to 3, wherein the allocating means allocates, among the terminal voice data of each communication terminal received by the terminal voice data receiving means, each terminal voice data that satisfies a predetermined condition to one of the plurality of channels of the distribution voice data.

The distribution system according to claim 1, wherein the allocating means allocates, among the terminal voice data of each communication terminal received by the terminal voice data receiving means, each terminal voice data that satisfies a predetermined condition to one of the plurality of channels of the distribution voice data, and allocates the terminal voice data that do not satisfy the predetermined condition to a single channel as data synthesized from those terminal voice data.

A distribution method used in a distribution system having a distribution server and a plurality of communication terminals that communicate via a communication network, wherein
the method used in the distribution server comprises:
a distribution voice data transmitting step of transmitting distribution voice data composed of a plurality of channels to the communication terminals;
a terminal voice data receiving step of receiving terminal voice data from the communication terminals;
a voice data input step in which voice data is input in a streaming format; and
an allocating step of allocating each of the voice data input in the voice data input step and the terminal voice data of each communication terminal received in the terminal voice data receiving step to one of the plurality of channels of the distribution voice data, and
the method used in each communication terminal comprises:
a terminal voice data input step in which terminal voice data is input in a streaming format;
a terminal voice data transmitting step of transmitting the terminal voice data input in the terminal voice data input step to the distribution server;
a receiving step of receiving the distribution voice data from the distribution server;
a specifying step of specifying, among the plurality of channels of the distribution voice data received in the receiving step, the channel to which the terminal voice data transmitted in the terminal voice data transmitting step is allocated; and
an output step of outputting, among the plurality of channels of the distribution voice data received in the receiving step, the distribution voice data of the channels other than the channel specified in the specifying step.