JP2004072354A - Audio teleconference system

Info

Publication number: JP2004072354A
Application number: JP2002228059A
Authority: JP
Inventors: Masaaki Yonezawa; 米澤　正明
Original assignee: Yokogawa Electric Corp
Current assignee: Yokogawa Electric Corp
Priority date: 2002-08-06
Filing date: 2002-08-06
Publication date: 2004-03-04

Abstract

PROBLEM TO BE SOLVED: To realize an audio teleconference system enabling the receiving side to easily identify a speaker.

SOLUTION: The audio teleconference system, which holds a conference with persons in a remote place by transmitting and receiving voice data through a network, is provided with the network, a 1st remote conference device that adds an identifier to voice data to be transmitted through the network and transmits the voice data, and a 2nd remote conference device that receives the voice data to which the identifier is added through the network and changes the position of the sound image of the reproduced voice data on the basis of the identifier.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、ネットワークを利用して音声データを送受信することにより遠隔地の者と会議を行う音声会議システムに関し、特に受信側において発言者を容易に識別することが可能な音声会議システムに関する。
【０００２】
【従来の技術】
従来の音声会議システムは複数台の遠隔会議装置をネットワークで接続し、一の遠隔会議装置で収集した発言者の音声データをネットワーク経由で他の遠隔会議装置に送信し音声を再生させることにより遠隔地の者と会議を可能にするものである。
【０００３】
図４はこのような従来の音声会議システムの一例を示す構成ブロック図である。図４において１，２，３及び４は音声を収集するマイクロフォンや音声を再生するスピーカ等を有しネットワークを介して音声データの送受信が可能な遠隔会議装置、１００はインターネット、イントラネット、電話回線、若しくは、専用回線等の汎用のネットワークである。
【０００４】
遠隔会議装置１，２，３及び遠隔会議装置４はそれぞれネットワーク１００に接続される。
【０００５】
また、図５は遠隔会議装置１〜４の具体例を示す構成ブロック図である。図５において５はネットワーク１００を介して他の遠隔会議装置との間で通信（音声データの送受信等）を行う通信手段、６は受信した音声データをアナログ信号に変換する音声再生手段、７は音声再生手段６のアナログ出力信号を適宜増幅して音声として出力するスピーカ等の拡声手段、８はマイクロフォン等により音声を収集し通信手段５を制御して他の遠隔会議装置に音声データを送信させる音声収集手段である。
【０００６】
通信手段５はネットワーク１００（図示せず。）に接続されると共に出力が音声再生手段６に接続され、音声再生手段６の出力は拡声手段７に接続される。また、音声収集手段８の出力が通信手段５に接続される。
【０００７】
ここで、図４及び図５に示す従来例の動作を図６を用いて説明する。図６は音声会議システムの動作を説明する説明図である。但し、音声の再生に関してのみ説明し、音声の収集及び送信に関しては説明は省略する。
【０００８】
先ず第１に、遠隔会議装置単体では、図５中”ＳＳ０１”に示すような音声データをネットワーク１００を介して受信した通信手段５は受信した音声データを音声再生手段６に出力する。音声再生手段６は音声データをアナログ信号に変換して出力する。
【０００９】
そして、拡声手段７は音声再生手段６の出力であるアナログ信号を適宜増幅し図５中”ＰＳ０１”に示すように音声として会議を行っている者に対して出力する。
【００１０】
一方、図４に示す４台の遠隔会議装置から構成される音声会議システムでは、遠隔会議装置１において収集された図６中”ＴＫ１１”に示す発言者の音声は図６中”ＳＳ１１”に示すような音声データとして遠隔会議装置４に送信される。
【００１１】
同様に、遠隔会議装置２及び３において収集された図６中”ＴＫ１２”及び”ＴＫ１３”に示す発言者の音声は図６中”ＳＳ１２”及び”ＳＳ１３”に示すような音声データとして遠隔会議装置４に送信される。
【００１２】
この時、遠隔会議装置４では受信した図６中”ＳＳ１１”、”ＳＳ１２”及び”ＳＳ１３”に示すような音声データを再生して図６中”ＳＰ１１”に示す拡声手段７から図６中”ＰＳ１１”に示すような音声として出力し、図６中”ＬＮ１１”に示す聴取者によって発言者の音声が知覚される。
【００１３】
この結果、複数台の遠隔会議装置をネットワークにより相互に接続して、各遠隔会議装置で収集した音声を音声データとして送信し音声を再生させることにより遠隔地の者との会議が可能になる。
【００１４】
【発明が解決しようとする課題】
しかし、図４に示す従来例では複数の発言者がいる場合には、発言者を識別するために発言者がその都度氏名を名乗ったり、聴取者が再生される音声の特徴を聞き分けて発言者を識別しなければならないと言った問題点があった。
【００１５】
このため、発言者が氏名を名乗らない場合には、初見の相手や音声の特徴が類似する発言者を聴取者が識別することは極めて困難であると言った問題点があった。
従って本発明が解決しようとする課題は、受信側において発言者を容易に識別することが可能な音声会議システムを実現することにある。
【００１６】
【課題を解決するための手段】
このような課題を達成するために、本発明のうち請求項１記載の発明は、
ネットワークを利用して音声データを送受信することにより遠隔地の者と会議を行う音声会議システムにおいて、
ネットワークと、前記ネットワーク経由で送信する音声データに識別子を付加して送信する第１の遠隔会議装置と、前記ネットワーク経由で前記識別子を付加された音声データを受信し再生される前記音声データの音像の位置を前記識別子に基づき変える第２の遠隔会議装置とを備えたことにより、発言者を容易に識別することが可能になる。
【００１７】
請求項２記載の発明は、
請求項１記載の発明である音声会議システムにおいて、
前記第１の遠隔会議装置が、
前記ネットワークを介して他の遠隔会議装置との間で通信を行う通信手段と、マイクロフォンにより音声を収集し前記識別子を付加した後に前記通信手段を制御して他の遠隔会議装置にデータを送信させる音声収集手段とから構成されることにより、発言者を容易に識別することが可能になる。
【００１８】
請求項３記載の発明は、
請求項１記載の発明である音声会議システムにおいて、
前記第２の遠隔会議装置が、
前記ネットワークを介して他の遠隔会議装置との間で通信を行う通信手段と、受信したデータから前記音声データを抽出してアナログ信号に変換する音声信号抽出手段と、受信したデータから前記識別子を抽出する音声識別子抽出手段と、前記アナログ信号の再生音量のバランスを制御する再生均衡制御手段と、前記識別子に基づき前記バランスを決定して前記再生均衡制御手段に設定する均衡比率設定手段と、前記再生均衡制御手段の制御により音声データの音像の位置を変えて再生する２つの拡声手段とから構成されることにより、発言者を容易に識別することが可能になる。
【００１９】
請求項４記載の発明は、
請求項３記載の発明である音声会議システムにおいて、
前記拡声手段が、
複数であることにより、発言者を容易に識別することが可能になる。
【００２０】
請求項５記載の発明は、
請求項３記載の発明である音声会議システムにおいて、
前記拡声手段が、
奥行き方向にずらして配置されたことにより、発言者を容易に識別することが可能になる。
【００２１】
請求項６記載の発明は、
請求項３記載の発明である音声会議システムにおいて、
前記拡声手段が、
高さ方向にずらして配置されたことにより、発言者を容易に識別することが可能になる。
【００２２】
請求項７記載の発明は、
請求項１乃至請求項３のいずれかの発明である音声会議システムにおいて、
前記識別子が、
予め定義された識別子であることにより、発言者を容易に識別することが可能になる。
【００２３】
請求項８記載の発明は、
請求項１乃至請求項３のいずれかの発明である音声会議システムにおいて、
前記識別子が、
遠隔会議装置のＩＰアドレスであることにより、発言者を容易に識別することが可能になる。
【００２４】
請求項９記載の発明は、
請求項１乃至請求項３のいずれかの発明である音声会議システムにおいて、
前記識別子が、
遠隔会議装置のＭＡＣアドレスであることにより、発言者を容易に識別することが可能になる。
【００２５】
請求項１０記載の発明は、
請求項１若しくは請求項２の発明である音声会議システムにおいて、
前記第２の遠隔会議装置が、
複数の前記マイクロフォンを具備し、複数の前記マイクロフォンが収集した音声データ毎に異なる識別子を付加して送信することにより、発言者を容易に識別することが可能になる。
【００２６】
【発明の実施の形態】
以下本発明を図面を用いて詳細に説明する。図１は本発明に係る音声会議システムの一実施例を示す構成ブロック図である。
【００２７】
図１において１００は図４と同一符号を付してあり、９，１０，１１及び１２は改良された遠隔会議装置である。遠隔会議装置９，１０，１１及び遠隔会議装置１２はそれぞれネットワーク１００に接続される。
【００２８】
また、図２は遠隔会議装置９〜１１の具体例を示す構成ブロック図である。図２において５は図５と同一符号を付してあり、１３は受信したデータから音声データを抽出してアナログ信号に変換する音声信号抽出手段、１４は音声信号抽出手段１３のアナログ出力信号の再生音量のバランス（均衡）を制御する再生均衡制御手段、１５は受信したデータから識別子を抽出する音声識別子抽出手段、１６は当該識別子に基づき再生音量のバランス（均衡）を設定する均衡比率設定手段、１７及び１８は再生均衡制御手段１４のアナログ出力信号を適宜増幅し音声として出力するスピーカ等の拡声手段、１９はマイクロフォン等により音声を収集し識別子を付加した後に通信手段５を制御して他の遠隔会議装置にデータを送信させる音声収集手段である。
【００２９】
通信手段５はネットワーク１００（図示せず。）に接続されると共に出力が音声信号抽出手段１３及び音声識別子抽出手段１５に接続され、音声信号抽出手段１３の出力は再生均衡制御手段１４に接続される。
【００３０】
音声識別子抽出手段１５の出力は均衡比率設定手段１６に接続され、均衡比率設定手段１６の出力は再生均衡制御手段１４の制御入力端子に接続される。
【００３１】
また、再生均衡制御手段１４の２つの出力はそれぞれ左右に配置された拡声手段１７及び１８に接続され、音声収集手段１９の出力が通信手段５に接続される。
【００３２】
ここで、図１及び図２に示す実施例の動作を図３を用いて説明する。図３は音声会議システムの動作を説明する説明図である。
【００３３】
音声収集手段１９はマイクロフォン等により発言者の音声を収集し識別子を付加した後に通信手段５を制御して他の遠隔会議装置にデータを送信させる音声収集手段である。
【００３４】
例えば、遠隔会議装置９において収集された図３中”ＴＫ３１”に示す発言者の音声には識別子”Ａ”が付加され図３中”ＳＳ３１”に示すようなデータとして遠隔会議装置１３に送信される。
【００３５】
例えば、同様に、遠隔会議装置１０及び１１において収集された図３中”ＴＫ３２”及び”ＴＫ３３”に示す発言者の音声にはそれぞれ識別子”Ｂ”及び”Ｃ”が付加され図３中”ＳＳ３２”及び”ＳＳ３３”に示すようなデータとして遠隔会議装置１３に送信される。
【００３６】
また、図２中”ＳＳ２１”に示すようなデータ（音声データ＋識別子）をネットワーク１００を介して受信した通信手段５は受信したデータ（音声データ＋識別子）を音声信号抽出手段１３及び音声識別子抽出手段１５に出力する。音声信号抽出手段１３はデータ（音声データ＋識別子）から音声データを抽出してアナログ信号に変換する。
【００３７】
一方、音声識別子抽出手段１５はデータ（音声データ＋識別子）から識別子を抽出して出力し、均衡比率設定手段１６は抽出された識別子に基づき再生される音声の音量を決定して再生均衡制御手段１４を制御する。
【００３８】
そして、右側に設置された拡声手段１７及び左側に設置された拡声手段１８は再生均衡制御手段１４の出力である２つ（左右）のアナログ信号を図２中”ＰＳ２１”及び”ＰＳ２２”に示すように音声として会議を行っている者に対して出力する。
【００３９】
例えば、識別子”Ａ”、”Ｂ”及び”Ｃ”に対する再生音声の音量のバランスがそれぞれ”１００：０”、”５０：５０”及び”０：１００”と定義されていた場合を想定する。
【００４０】
ここで、音量のバランスは”１００：０”の場合には左側の拡声手段１８から再生する音声の音量の”１００％”が出力され、右側の拡声手段１７から再生する音声の音量の”０％”が出力されることを示す。
【００４１】
このため、識別子”Ａ”が付加された図３中”ＴＫ３１”に示す発言者の音声データは左側の拡声手段１８から再生する音声の音量の”１００％”が出力され、右側の拡声手段１７から再生する音声の音量の”０％”が出力されるので、図３中”ＰＳ３１”に示すように左側から聞こえるように、言い換えれば、左端に音像が定位するように再生される。
【００４２】
同様に、識別子”Ｂ”が付加された図３中”ＴＫ３２”に示す発言者の音声データは左側の拡声手段１８から再生する音声の音量の”５０％”が出力され、右側の拡声手段１７から再生する音声の音量の”５０％”が出力されるので、図３中”ＰＳ３２”に示すように中央から聞こえるように、言い換えれば、中央に音像が定位するように再生される。
【００４３】
さらに、識別子”Ｃ”が付加された図３中”ＴＫ３３”に示す発言者の音声データは左側の拡声手段１８から再生する音声の音量の”０％”が出力され、右側の拡声手段１７から再生する音声の音量の”１００％”が出力されるので、図３中”ＰＳ３３”に示すように右端から聞こえるように、言い換えれば、右端に音像が定位するように再生される。
【００４４】
すなわち、図３中”ＬＮ３１”に示す聴取者は図３中”ＴＫ３１”，”ＴＫ３２”及び”ＴＫ３３”に示す発言者の再生された音声が図３中”ＰＳ３１”、”ＰＳ３２”及び”ＰＳ３３”に示すようにそれぞれ異なる方向から聞こえてくるので発言者を容易に識別することが可能になる。
【００４５】
この結果、送信側が送信する音声データに識別子を付加して送信し、受信側が当該識別子に基づき再生される音声の聞こえてくる方向、言い換えれば、再生される音声データの音像の位置を識別子に基づき変えることにより、発言者を容易に識別することが可能になる。
【００４６】
なお、図１及び図３に示す実施例では発言者が３人である場合を例示しているが、発言者の数が増えた場合には、発言者と同数の識別子を設け再生音声の音量バランスの比率を適宜変更して再生音声の音像の位置が互いに重ならないようにすれば良い。
【００４７】
また、図１及び図３に示す実施例では２つの拡声手段１７及び１８を例示しているが、勿論２つに限定されるものではなく、必要に応じて複数の拡声手段を備えることにより、聴取者を中心に３６０度に再生音声の音像の位置を配置することも可能になる。
【００４８】
また、図１及び図３に示す実施例では拡声手段を左右にずらして設置しているが、拡声手段の設置位置として奥行き方向や高さ方向にずらしても良く、さらに、複数の拡声手段を３次元方向にずらして配置して再生音声の音像の位置を増加させることにより、さらに、多数の発言者の識別をすることも可能になる。
【００４９】
また、図１及び図３に示す実施例では遠隔会議装置が予め定義された識別子を付加しているが、遠隔会議装置のＭＡＣ（Ｍｅｄｉａ　Ａｃｃｅｓｓ　Ｃｏｎｔｒｏｌ　ａｄｒｅｓｓ）アドレスやＩＰ（Ｉｎｔｅｒｎｅｔ　Ｐｒｏｔｏｃｏｌ）アドレス等の一意の情報を識別子として用いても構わない。
【００５０】
また、図１及び図３に示す実施例では発言者と遠隔会議装置が同数であったが、１つの遠隔会議装置に複数人の発言者があっても構わない。この場合には、各発言者毎に音声収集用のマイクロフォンを割り当て、マイクロフォン毎に識別子を付加すれば良い。
【００５１】
【発明の効果】
以上説明したことから明らかなように、本発明によれば次のような効果がある。
請求項１，２，３，４，５，６，７，８，９及び請求項１０の発明によれば、送信側が送信する音声データに識別子を付加して送信し、受信側が当該識別子に基づき再生される音声の聞こえてくる方向、言い換えれば、再生される音声データの音像の位置を識別子に基づき変えることにより、発言者を容易に識別することが可能になる。
【図面の簡単な説明】
【図１】本発明に係る音声会議システムの一実施例を示す構成ブロック図である。
【図２】遠隔会議装置の具体例を示す構成ブロック図である。
【図３】音声会議システムの動作を説明する説明図である。
【図４】従来の音声会議システムの一例を示す構成ブロック図である。
【図５】遠隔会議装置の具体例を示す構成ブロック図である。
【図６】音声会議システムの動作を説明する説明図である。
【符号の説明】
１，２，３，４，９，１０，１１，１２　遠隔会議装置
５　通信手段
６　音声再生手段
７，１７，１８　拡声手段
８，１９　音声収集手段
１３　音声信号抽出手段
１４　再生均衡制御手段
１５　音声識別子抽出手段
１６　均衡比率設定手段
１００　ネットワーク
[0001]
[Technical field of the invention]
The present invention relates to a voice conference system for holding a conference with a remote person by transmitting and receiving voice data using a network, and more particularly to a voice conference system capable of easily identifying a speaker on a receiving side.
[0002]
[Prior art]
In a conventional voice conference system, a plurality of remote conference devices are connected via a network, and voice data of a speaker collected by one remote conference device is transmitted via the network to the other remote conference devices, where the voice is reproduced, thereby enabling a conference with persons at remote locations.
[0003]
FIG. 4 is a configuration block diagram showing an example of such a conventional voice conference system. In FIG. 4, reference numerals 1, 2, 3, and 4 denote remote conference devices, each having a microphone for collecting voice, a speaker for reproducing voice, and the like, and capable of transmitting and receiving voice data via a network; 100 denotes a general-purpose network such as the Internet, an intranet, a telephone line, or a dedicated line.
[0004]
The remote conference devices 1, 2, 3, and 4 are each connected to the network 100.
[0005]
FIG. 5 is a configuration block diagram showing a specific example of the remote conference devices 1 to 4. In FIG. 5, reference numeral 5 denotes communication means for communicating (transmitting and receiving voice data, etc.) with other remote conference devices via the network 100; 6 denotes voice reproduction means for converting received voice data into an analog signal; 7 denotes loudspeaker means, such as a speaker, for amplifying the analog output signal of the voice reproduction means 6 as appropriate and outputting it as sound; and 8 denotes voice collecting means for collecting voice with a microphone or the like and controlling the communication means 5 to transmit the voice data to other remote conference devices.
[0006]
The communication means 5 is connected to the network 100 (not shown), and its output is connected to the voice reproduction means 6; the output of the voice reproduction means 6 is connected to the loudspeaker means 7. The output of the voice collecting means 8 is connected to the communication means 5.
[0007]
Here, the operation of the conventional example shown in FIGS. 4 and 5 will be described with reference to FIG. 6. FIG. 6 is an explanatory diagram illustrating the operation of the voice conference system. Only the reproduction of voice is described; the collection and transmission of voice are omitted.
[0008]
First, in a single remote conference device, the communication means 5, on receiving voice data such as that indicated by "SS01" in FIG. 5 via the network 100, outputs the received voice data to the voice reproduction means 6. The voice reproduction means 6 converts the voice data into an analog signal and outputs it.
[0009]
Then, the loudspeaker means 7 amplifies the analog signal output from the voice reproduction means 6 as appropriate and outputs it as sound, as indicated by "PS01" in FIG. 5, to the persons holding the conference.
[0010]
On the other hand, in the voice conference system composed of the four remote conference devices shown in FIG. 4, the voice of the speaker indicated by "TK11" in FIG. 6, collected by the remote conference device 1, is transmitted to the remote conference device 4 as voice data such as that indicated by "SS11" in FIG. 6.
[0011]
Similarly, the voices of the speakers indicated by "TK12" and "TK13" in FIG. 6, collected by the remote conference devices 2 and 3, are transmitted to the remote conference device 4 as voice data such as that indicated by "SS12" and "SS13" in FIG. 6.
[0012]
At this time, the remote conference device 4 reproduces the received voice data indicated by "SS11", "SS12", and "SS13" in FIG. 6 and outputs it as sound, as indicated by "PS11" in FIG. 6, from the loudspeaker means 7 indicated by "SP11" in FIG. 6, and the voices of the speakers are perceived by the listener indicated by "LN11" in FIG. 6.
[0013]
As a result, a plurality of teleconferencing devices can be connected to each other via a network, and the voice collected by each teleconferencing device can be transmitted as voice data to reproduce the voice, thereby enabling a conference with a remote person.
[0014]
[Problems to be solved by the invention]
However, in the conventional example shown in FIG. 4, when there are a plurality of speakers, there is the problem that, to identify the speaker, either each speaker must state his or her name every time, or the listener must identify the speaker by distinguishing the characteristics of the reproduced voice.
[0015]
Consequently, when a speaker does not state his or her name, it is extremely difficult for the listener to identify a speaker met for the first time or speakers whose voice characteristics are similar.
Therefore, an object of the present invention is to realize a voice conference system that can easily identify a speaker on a receiving side.
[0016]
[Means for Solving the Problems]
In order to achieve such an object, the invention according to claim 1 of the present invention is:
In a voice conference system that performs a conference with a remote person by transmitting and receiving voice data using a network,
a network, a first remote conference device that adds an identifier to voice data to be transmitted via the network and transmits it, and a second remote conference device that receives the voice data with the identifier added via the network and changes the position of the sound image of the reproduced voice data based on the identifier are provided, whereby the speaker can be easily identified.
[0017]
The invention according to claim 2 is
In the audio conference system according to the first aspect,
The first teleconferencing device comprises:
communication means for communicating with other remote conference devices via the network, and voice collecting means for collecting voice with a microphone, adding the identifier, and then controlling the communication means to transmit the data to other remote conference devices, whereby the speaker can be easily identified.
[0018]
The invention according to claim 3 is
In the audio conference system according to the first aspect,
The second teleconferencing device comprises:
communication means for communicating with other remote conference devices via the network, voice signal extraction means for extracting the voice data from received data and converting it into an analog signal, voice identifier extraction means for extracting the identifier from the received data, reproduction balance control means for controlling the balance of the reproduction volume of the analog signal, balance ratio setting means for determining the balance based on the identifier and setting it in the reproduction balance control means, and two loudspeaker means for reproducing the voice data with the position of its sound image changed under the control of the reproduction balance control means, whereby the speaker can be easily identified.
[0019]
The invention according to claim 4 is
The audio conference system according to claim 3,
The loudspeaker means
are plural in number, whereby the speaker can be easily identified.
[0020]
The invention according to claim 5 is
The audio conference system according to claim 3,
The loudspeaker means
are arranged offset in the depth direction, whereby the speaker can be easily identified.
[0021]
The invention according to claim 6 is
The audio conference system according to claim 3,
The loudspeaker means
are arranged offset in the height direction, whereby the speaker can be easily identified.
[0022]
The invention according to claim 7 is
In the audio conference system according to any one of claims 1 to 3,
The identifier
is a predefined identifier, whereby the speaker can be easily identified.
[0023]
The invention according to claim 8 is
In the audio conference system according to any one of claims 1 to 3,
The identifier
is the IP address of the remote conference device, whereby the speaker can be easily identified.
[0024]
The invention according to claim 9 is
In the audio conference system according to any one of claims 1 to 3,
The identifier
is the MAC address of the remote conference device, whereby the speaker can be easily identified.
[0025]
The invention according to claim 10 is
In the audio conference system according to claim 1 or 2,
The second teleconferencing device comprises:
By providing a plurality of the microphones and adding a different identifier to each of the voice data collected by the plurality of the microphones and transmitting the same, the speaker can be easily identified.
[0026]
[Embodiments of the invention]
Hereinafter, the present invention will be described in detail with reference to the drawings. FIG. 1 is a configuration block diagram showing one embodiment of a voice conference system according to the present invention.
[0027]
In FIG. 1, the network 100 bears the same reference numeral as in FIG. 4, and 9, 10, 11, and 12 denote improved remote conference devices. The remote conference devices 9, 10, 11, and 12 are each connected to the network 100.
[0028]
FIG. 2 is a configuration block diagram showing a specific example of the remote conference devices 9 to 11. In FIG. 2, the communication means 5 bears the same reference numeral as in FIG. 5; 13 denotes voice signal extraction means for extracting voice data from received data and converting it into an analog signal; 14 denotes reproduction balance control means for controlling the balance of the reproduction volume of the analog output signal of the voice signal extraction means 13; 15 denotes voice identifier extraction means for extracting an identifier from the received data; 16 denotes balance ratio setting means for setting the reproduction volume balance based on that identifier; 17 and 18 denote loudspeaker means, such as speakers, for amplifying the analog output signals of the reproduction balance control means 14 as appropriate and outputting them as sound; and 19 denotes voice collecting means for collecting voice with a microphone or the like, adding an identifier, and then controlling the communication means 5 to transmit the data to other remote conference devices.
[0029]
The communication means 5 is connected to the network 100 (not shown), and its output is connected to the voice signal extraction means 13 and the voice identifier extraction means 15. The output of the voice signal extraction means 13 is connected to the reproduction balance control means 14.
[0030]
The output of the voice identifier extracting means 15 is connected to the balance ratio setting means 16, and the output of the balance ratio setting means 16 is connected to the control input terminal of the reproduction balance control means 14.
[0031]
Further, two outputs of the reproduction balance control means 14 are connected to loudspeakers 17 and 18 arranged on the left and right, respectively, and an output of the sound collection means 19 is connected to the communication means 5.
[0032]
Here, the operation of the embodiment shown in FIGS. 1 and 2 will be described with reference to FIG. 3. FIG. 3 is an explanatory diagram illustrating the operation of the voice conference system.
[0033]
The voice collecting means 19 collects the voice of the speaker using a microphone or the like, adds an identifier, and then controls the communication means 5 to transmit the data to other remote conference devices.
[0034]
For example, the identifier "A" is added to the voice of the speaker indicated by "TK31" in FIG. 3, collected by the remote conference device 9, and the result is transmitted to the remote conference device 13 as data such as that indicated by "SS31" in FIG. 3.
[0035]
Similarly, the identifiers "B" and "C" are added to the voices of the speakers indicated by "TK32" and "TK33" in FIG. 3, collected by the remote conference devices 10 and 11, respectively, and the results are transmitted to the remote conference device 13 as data such as that indicated by "SS32" and "SS33" in FIG. 3.
[0036]
In addition, the communication unit 5 that has received the data (audio data + identifier) as indicated by “SS21” in FIG. 2 via the network 100 transmits the received data (audio data + identifier) to the audio signal extraction unit 13 and the audio identifier extraction. Output to the means 15. The audio signal extracting means 13 extracts audio data from the data (audio data + identifier) and converts it into an analog signal.
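The pairing of an identifier with voice data, and its separation again on the receiving side, can be sketched as follows. This is only an illustrative sketch: the source does not specify a data format, so the frame layout (a one-byte length, the identifier bytes, then the voice payload) and the function names `add_identifier`, `extract_identifier`, and `extract_voice_data` are assumptions made for this example.

```python
import struct

# Hypothetical frame layout (not specified in the source):
#   1 byte  : length of the identifier in bytes
#   N bytes : identifier, UTF-8 encoded
#   rest    : voice data payload


def add_identifier(identifier: str, voice_data: bytes) -> bytes:
    """Sending side: prepend the identifier to the voice data."""
    ident = identifier.encode("utf-8")
    if len(ident) > 255:
        raise ValueError("identifier too long for a 1-byte length field")
    return struct.pack("B", len(ident)) + ident + voice_data


def extract_identifier(frame: bytes) -> str:
    """Receiving side: recover the identifier from a received frame."""
    (length,) = struct.unpack_from("B", frame, 0)
    return frame[1:1 + length].decode("utf-8")


def extract_voice_data(frame: bytes) -> bytes:
    """Receiving side: recover the voice payload from a received frame."""
    (length,) = struct.unpack_from("B", frame, 0)
    return frame[1 + length:]
```

In this sketch the two extraction functions read the same received frame independently, mirroring the way the received data is passed both to the voice signal extraction means 13 and to the voice identifier extraction means 15.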
[0037]
Meanwhile, the voice identifier extraction means 15 extracts the identifier from the data (voice data + identifier) and outputs it, and the balance ratio setting means 16 determines the volume of the voice to be reproduced based on the extracted identifier and controls the reproduction balance control means 14.
[0038]
The loudspeaker means 17 installed on the right side and the loudspeaker means 18 installed on the left side output the two (left and right) analog signals output from the reproduction balance control means 14 as sound to the persons holding the conference, as indicated by "PS21" and "PS22" in FIG. 2.
[0039]
For example, assume that the volume balance of the reproduced voice for the identifiers "A", "B", and "C" is defined as "100:0", "50:50", and "0:100", respectively.
[0040]
Here, a volume balance of "100:0" means that the left loudspeaker means 18 outputs "100%" of the volume of the voice to be reproduced and the right loudspeaker means 17 outputs "0%" of that volume.
[0041]
Therefore, for the voice data of the speaker indicated by "TK31" in FIG. 3, to which the identifier "A" is added, the left loudspeaker means 18 outputs "100%" of the volume of the voice to be reproduced and the right loudspeaker means 17 outputs "0%", so the voice is reproduced so as to be heard from the left side as indicated by "PS31" in FIG. 3; in other words, the sound image is localized at the left end.
[0042]
Similarly, for the voice data of the speaker indicated by "TK32" in FIG. 3, to which the identifier "B" is added, the left loudspeaker means 18 outputs "50%" of the volume of the voice to be reproduced and the right loudspeaker means 17 outputs "50%", so the voice is reproduced so as to be heard from the center as indicated by "PS32" in FIG. 3; in other words, the sound image is localized at the center.
[0043]
Further, for the voice data of the speaker indicated by "TK33" in FIG. 3, to which the identifier "C" is added, the left loudspeaker means 18 outputs "0%" of the volume of the voice to be reproduced and the right loudspeaker means 17 outputs "100%", so the voice is reproduced so as to be heard from the right end as indicated by "PS33" in FIG. 3; in other words, the sound image is localized at the right end.
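The balance example for the identifiers "A", "B", and "C" can be illustrated with a short sketch. The embodiment controls the balance of an analog signal; the version below instead scales digital samples, which is an assumption made purely for illustration, and the names `BALANCE` and `apply_balance` are hypothetical.

```python
# Balance table from the example: identifier -> (left %, right %).
BALANCE = {
    "A": (100, 0),   # sound image at the left end
    "B": (50, 50),   # sound image at the center
    "C": (0, 100),   # sound image at the right end
}


def apply_balance(identifier, samples):
    """Split mono samples into left/right channels scaled by the balance.

    `samples` is a list of floats; returns (left_samples, right_samples).
    """
    left_pct, right_pct = BALANCE[identifier]
    left = [s * left_pct / 100 for s in samples]
    right = [s * right_pct / 100 for s in samples]
    return left, right
```

For instance, `apply_balance("B", [1.0, -0.5])` yields the same half-amplitude samples on both channels, which corresponds to the centered sound image described for the identifier "B".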
[0044]
That is, the listener indicated by "LN31" in FIG. 3 hears the reproduced voices of the speakers indicated by "TK31", "TK32", and "TK33" in FIG. 3 from different directions, as indicated by "PS31", "PS32", and "PS33" in FIG. 3, and can therefore easily identify the speaker.
[0045]
As a result, the transmitting side adds an identifier to the voice data it transmits, and the receiving side changes, based on the identifier, the direction from which the reproduced voice is heard, in other words the position of the sound image of the reproduced voice data, so that the speaker can be easily identified.
[0046]
In the embodiment shown in FIGS. 1 and 3, the case of three speakers is illustrated. When the number of speakers increases, it suffices to provide as many identifiers as there are speakers and to adjust the volume balance ratios of the reproduced voices as appropriate so that the positions of their sound images do not overlap.
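One way to choose non-overlapping balance ratios for an arbitrary number of speakers is to space them evenly between the left end and the right end. The source only says the ratios should be changed "as appropriate", so the even spacing below and the function name `spread_balances` are assumptions for illustration.

```python
def spread_balances(identifiers):
    """Assign evenly spaced (left %, right %) balances to the identifiers.

    The first identifier is placed at the left end, the last at the right
    end, and the others at equal steps in between. A single identifier is
    placed at the center.
    """
    count = len(identifiers)
    if count == 0:
        return {}
    if count == 1:
        return {identifiers[0]: (50.0, 50.0)}
    table = {}
    for index, identifier in enumerate(identifiers):
        right = 100.0 * index / (count - 1)
        table[identifier] = (100.0 - right, right)
    return table
```

With three identifiers this reproduces the "100:0", "50:50", and "0:100" balances of the example; with five identifiers the steps become 25 percentage points.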
[0047]
Further, in the embodiment shown in FIGS. 1 and 3, two loudspeaker means 17 and 18 are illustrated, but the number is of course not limited to two; by providing more loudspeaker means as necessary, the positions of the sound images of the reproduced voices can also be arranged over 360 degrees around the listener.
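As a rough sketch of how a sound image might be placed anywhere around the listener with more than two loudspeaker means, the volume can be shared between the two loudspeakers nearest to the desired direction. The source does not describe a method for this; the equally spaced ring of loudspeakers, the linear sharing of volume between adjacent loudspeakers, and the function name `ring_gains` are all assumptions.

```python
def ring_gains(angle_deg, speaker_count):
    """Return one gain per loudspeaker for a sound image at `angle_deg`.

    Loudspeakers are assumed to sit at equal angular steps on a ring
    around the listener, with loudspeaker 0 at 0 degrees. The volume is
    shared linearly between the two loudspeakers adjacent to the angle.
    """
    if speaker_count < 2:
        raise ValueError("at least two loudspeakers are required")
    spacing = 360.0 / speaker_count
    position = (angle_deg % 360.0) / spacing
    lower = int(position) % speaker_count    # loudspeaker just before the angle
    upper = (lower + 1) % speaker_count      # next loudspeaker on the ring
    fraction = position - int(position)      # how far toward `upper`
    gains = [0.0] * speaker_count
    gains[lower] = 1.0 - fraction
    gains[upper] += fraction
    return gains
```

With four loudspeakers, an angle of 45 degrees gives equal gains on loudspeakers 0 and 1, and an angle of 315 degrees wraps around to share the volume between loudspeakers 3 and 0.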
[0048]
In the embodiment shown in FIGS. 1 and 3, the loudspeaker means are installed offset to the left and right, but they may instead be offset in the depth direction or the height direction. Furthermore, by arranging a plurality of loudspeaker means offset in three dimensions to increase the number of sound image positions for the reproduced voices, an even larger number of speakers can be identified.
[0049]
Also, in the embodiment shown in FIGS. 1 and 3, the remote conference device adds a predefined identifier, but unique information such as the MAC (Media Access Control) address or the IP (Internet Protocol) address of the remote conference device may be used as the identifier.
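When the identifier is an address rather than a predefined label, the receiving side cannot know the set of identifiers in advance. One possible approach, sketched below, is to hand out sound image positions in order of first appearance. This policy, the class name `BalanceAssigner`, and the example addresses are assumptions; the source does not say how addresses are mapped to balances.

```python
class BalanceAssigner:
    """Assign a fixed balance to each identifier the first time it is seen."""

    def __init__(self, slots):
        # `slots` is a list of (left %, right %) balances that are available.
        self._slots = list(slots)
        self._assigned = {}

    def balance_for(self, identifier):
        """Return the balance for `identifier`, allocating one if needed."""
        if identifier not in self._assigned:
            if len(self._assigned) >= len(self._slots):
                raise RuntimeError("no free sound image position")
            self._assigned[identifier] = self._slots[len(self._assigned)]
        return self._assigned[identifier]
```

A given address keeps the same position for the rest of the conference, so the listener can continue to associate a direction with a speaker.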
[0050]
Also, in the embodiment shown in FIGS. 1 and 3, the number of speakers and the number of remote conference devices are the same, but a single remote conference device may have a plurality of speakers. In this case, a microphone for voice collection may be assigned to each speaker, and an identifier may be added to each microphone.
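The per-microphone identifiers can be sketched as follows. The identifier format (a device label followed by a microphone number) and the function name `tag_microphone_chunks` are hypothetical; the source only requires that each microphone receive its own identifier.

```python
def tag_microphone_chunks(device_id, chunks_by_microphone):
    """Pair each microphone's voice data with a distinct identifier.

    `chunks_by_microphone` holds one chunk of voice data per microphone,
    in microphone order. Returns a list of (identifier, chunk) pairs.
    """
    tagged = []
    for mic_index, chunk in enumerate(chunks_by_microphone, start=1):
        # Hypothetical identifier format: "<device>-mic<number>".
        identifier = f"{device_id}-mic{mic_index}"
        tagged.append((identifier, chunk))
    return tagged
```

Each pair can then be transmitted separately, so that the receiving side localizes the speakers sharing one remote conference device at different sound image positions.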
[0051]
[Effect of the invention]
As is apparent from the above description, the present invention has the following effects.
According to the inventions of claims 1 to 10, the transmitting side adds an identifier to the voice data it transmits, and the receiving side changes, based on the identifier, the direction from which the reproduced voice is heard, in other words the position of the sound image of the reproduced voice data, so that the speaker can be easily identified.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an embodiment of a voice conference system according to the present invention.
FIG. 2 is a configuration block diagram illustrating a specific example of a remote conference device.
FIG. 3 is an explanatory diagram illustrating an operation of the audio conference system.
FIG. 4 is a configuration block diagram illustrating an example of a conventional audio conference system.
FIG. 5 is a configuration block diagram showing a specific example of a remote conference device.
FIG. 6 is an explanatory diagram illustrating the operation of the audio conference system.
[Explanation of symbols]
1, 2, 3, 4, 9, 10, 11, 12 Remote conference device
5 Communication means
6 Voice reproduction means
7, 17, 18 Loudspeaker means
8, 19 Voice collecting means
13 Voice signal extraction means
14 Reproduction balance control means
15 Voice identifier extraction means
16 Balance ratio setting means
100 Network

Claims

1. A voice conference system for holding a conference with persons at remote locations by transmitting and receiving voice data using a network, comprising:
a network;
a first remote conference device that adds an identifier to voice data to be transmitted via the network and transmits it; and
a second remote conference device that receives the voice data with the identifier added via the network and changes the position of the sound image of the reproduced voice data based on the identifier.

2. The voice conference system according to claim 1, wherein the first remote conference device comprises:
communication means for communicating with other remote conference devices via the network; and
voice collecting means for collecting voice with a microphone, adding the identifier, and then controlling the communication means to transmit the data to other remote conference devices.

3. The voice conference system according to claim 1, wherein the second remote conference device comprises:
communication means for communicating with other remote conference devices via the network;
voice signal extraction means for extracting the voice data from received data and converting it into an analog signal;
voice identifier extraction means for extracting the identifier from the received data;
reproduction balance control means for controlling the balance of the reproduction volume of the analog signal;
balance ratio setting means for determining the balance based on the identifier and setting it in the reproduction balance control means; and
two loudspeaker means for reproducing the voice data with the position of its sound image changed under the control of the reproduction balance control means.

4. The voice conference system according to claim 3, wherein the loudspeaker means are plural in number.

5. The voice conference system according to claim 3, wherein the loudspeaker means are arranged offset in the depth direction.

6. The voice conference system according to claim 3, wherein the loudspeaker means are arranged offset in the height direction.

7. The voice conference system according to any one of claims 1 to 3, wherein the identifier is a predefined identifier.

8. The voice conference system according to any one of claims 1 to 3, wherein the identifier is the IP address of the remote conference device.

9. The voice conference system according to any one of claims 1 to 3, wherein the identifier is the MAC address of the remote conference device.

10. The voice conference system according to claim 1 or 2, wherein the second remote conference device comprises a plurality of the microphones, and a different identifier is added to each item of voice data collected by the plurality of microphones before transmission.