JP2008017126A

JP2008017126A - Voice conference system

Info

Publication number: JP2008017126A
Application number: JP2006185674A
Authority: JP
Inventors: Noriyuki Hata; 紀行畑
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2006-07-05
Filing date: 2006-07-05
Publication date: 2008-01-24

Abstract

PROBLEM TO BE SOLVED: To provide a voice conference system of simple constitution that enables the respective members of a conference to reliably hear a speaker when the conference is held in a large conference room. SOLUTION: Respective voice conference devices 1A to 1D pick up utterances of the corresponding conference members 200A to 200H, generate picked-up voice signals SsA to SsD, and transmit them to a conference voice controller 2. The conference voice controller 2 sets gains that increase with the distances between the voice conference devices, adjusts the gains of the picked-up voice signals SsA to SsD, and mixes them into output voice signals SdA to SdD, which it transmits. The voice conference devices 1A to 1D, excluding the speaker's own device, emit the received output voice signals SdA to SdD toward the respective conference members 200. COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a voice conference system in which a plurality of voice conference devices, each equipped with microphones and speakers, are arranged in a predetermined space such as a single room, and a voice conference is realized by voice communication among these devices.

Various voice conference systems have been disclosed for holding a conference in a large space such as a large conference room or a conference hall.

Patent Document 1 discloses a presentation system in which a microphone is installed at the lecturer's seat and a microphone and a speaker are installed at each of the other participants' seats. In this system, a sound emission level is set according to the distance between each microphone and each speaker, and the emission signal supplied to each speaker is attenuated according to this set level before being emitted.

Patent Document 2 discloses audio equipment in which a microphone and a speaker are installed at the lecturer's seat and a speaker is installed at each audience seat. In this equipment, the propagation delay time of the live voice, based on the distance between the microphone at the lecturer's seat and the speaker at each audience seat, is set in advance, and the sound emitted from each audience-seat speaker is delayed in accordance with this propagation delay time.

With such configurations, the voice conference systems described in these patent documents control the volume and the delay amount according to the distance between the speaker and each listener, so that the speaker's voice can be provided to all listeners at approximately the same level.
Cited patent documents: Japanese Unexamined Patent Application Publication No. H4-312098; Japanese Unexamined Patent Application Publication No. 2004-104210

However, in each of the prior arts described above, a speaker and a microphone are installed individually for the lecturer and for every listener, so the system becomes very large when there are many participants. In addition, when the number of participants increases partway through a conference, a microphone and a speaker cannot easily be installed for a newly arrived participant, and if that participant sits far from the speaker, he or she cannot hear the speaker's voice.

Accordingly, an object of the present invention is to realize, with a simple configuration, a voice conference system that allows a conference in a large conference room to be held largely independently of the number of participants and that enables each participant to reliably hear the speaker's voice.

A further object is to realize, with a simple configuration, a voice conference system that lets a participant reliably hear the speaker's voice and take part in the conference regardless of where and when that participant is seated.

The voice conference system of the present invention comprises: a plurality of voice conference devices arranged in a predetermined pattern, each including sound pickup means that realizes a plurality of mutually different pickup directivities and sound emission means that realizes a plurality of mutually different emission directivities; and conference sound control means that receives picked-up signals from the plurality of voice conference devices and generates emission signals whose volume is adjusted according to the distance from the voice conference device that generated each picked-up signal.

In this configuration, each voice conference device individually picks up the voice of each participant seated near it and emits sound individually toward each participant. The conference sound control means mixes the picked-up signals input from the voice conference devices and outputs a separate emission signal for each device. The conference sound control means sets the emission signal for each voice conference device so that the signal level becomes higher as the distance from the device that supplied the picked-up signal (hereinafter, the pickup-source voice conference device) to that device increases. As a result, even a listener located far from the speaker in a large conference room can reliably hear the speaker's voice. Moreover, because the speaker's own voice is not emitted loudly back toward the speaker, the speaker feels no discomfort and howling can be prevented.
Furthermore, because every voice conference device has both an emission function and a pickup function, when two voice conference devices pick up sound at the same time, the signal levels in the emission signals of the other devices are set according to the distance from each pickup-source device before mixing. All participants can therefore hear reliably, and conference audio with a greater sense of presence is achieved.
In addition, the voice conference device of the voice conference system of the present invention comprises a microphone array consisting of a plurality of microphones, a speaker array consisting of a plurality of speakers, and participant direction detection means that detects participant directions based on the sound picked up by the plurality of microphones. When the participant direction detection means detects a plurality of different participant directions, the device controls the sound supplied to the plurality of speakers so that sound is emitted toward each participant direction simultaneously with an individual emission directivity.

In this configuration, when the participant direction detection means detects a participant's position, a pickup directivity and an emission directivity that are strong in the participant's direction are set. This raises the S/N ratio both of the sound arriving from the participant's direction (the utterance when the participant is the speaker) and of the sound emitted toward the participant's direction (the emitted voice when the participant is a listener). Even when a plurality of participants are seated at one voice conference device, sound can be emitted to and picked up from each participant individually. The participant direction may be detected by an operation switch or the like. Alternatively, the signals picked up by the microphones of the microphone array may be delay-processed to form pickup beam signals for each direction, and the direction may be detected from their signal levels. If this participant direction detection is kept running during the conference, either continuously or at predetermined intervals, then even when a new participant joins or a participant moves during the conference, that participant can reliably hear the speaker's voice and his or her own speech can be picked up.

According to the present invention, a voice conference system in which each participant can reliably hear the speaker's voice during a conference in a large conference room can be realized with a simple configuration. Because each voice conference device has a plurality of emission and pickup directivities, each participant can individually hear the speaker's voice even when several people are seated at one voice conference device.

Furthermore, according to the present invention, each voice conference device detects the participant directions, so a participant can reliably hear the speaker's voice regardless of where and when he or she is seated.

A voice conference system according to an embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a configuration diagram of the voice conference system of the present embodiment, showing the case where two participants are seated at each voice conference device.
FIG. 2 is a configuration diagram showing the communication wiring of the voice conference system of the present embodiment, showing the case where four voice conference devices are connected.
Although the present embodiment shows four voice conference devices, the number is not limited to four; the number of installed devices may be set as appropriate according to specifications such as the size of the conference room and the number of participants.
As shown in FIG. 1, the voice conference system of the present embodiment consists of a plurality of voice conference devices 1A to 1D arranged in a conference room 100, which is a large space such as a large conference hall. The voice conference devices 1A to 1D are built to the same specifications and have an elongated shape. They are arranged on a long desk 101 in a straight line parallel to the direction in which the long desk 101 extends, with the longitudinal direction of each device parallel to that direction.

Two of the participants 200A to 200H are seated at each of the voice conference devices 1A to 1D and hold the voice conference using the corresponding device. Specifically, participants 200A and 200E are seated at voice conference device 1A, participants 200B and 200F at device 1B, participants 200C and 200G at device 1C, and participants 200D and 200H at device 1D. Although this embodiment describes two participants 200 seated at one voice conference device 1, three or more participants may be seated at one device at the same time.

As shown in FIG. 2, the voice conference devices 1A to 1D and the conference voice controller 2 are connected by a LAN or the like. Each of the voice conference devices 1A to 1D picks up the speech of the participants 200A to 200H seated at it, generates a picked-up voice signal SsA to SsD, and transmits it to the conference voice controller 2. The conference voice controller 2 mixes the picked-up voice signals SsA to SsD with a mixing ratio that differs for each of the voice conference devices 1A to 1D, generates an output voice signal SdA to SdD for each device, and transmits it to that device. Each of the voice conference devices 1A to 1D emits the received output voice signal SdA to SdD toward its seated participants 200A to 200H. A detailed example of the mixing ratio settings is given later; conceptually, each output voice signal SdA to SdD is generated so that (1) it does not contain the destination device's own picked-up voice signal at a high level, and (2) each picked-up voice signal is amplified by a gain that depends on the distance between the device that emits the sound and the device that picked it up.

If the conference voice controller 2 is connected via a network or the like to the conference voice controller 2 of another voice conference system outside the conference room 100, it generates an external output voice signal SdO by mixing the picked-up voice signals SsA to SsD at equal levels and transmits it to that other system. It also mixes the external input voice signal SsO received from the other voice conference system into the group of picked-up voice signals SsA to SsD and transmits the result to the voice conference devices 1A to 1D.

Next, the configuration and processing of the voice conference device 1 (1A to 1D) and the conference voice controller 2 that make up the voice conference system of the present embodiment will be described in more detail.

FIG. 3 is a three-view drawing of the voice conference devices 1A to 1D of the present embodiment, in which (A) and (C) are side views and (B) is a bottom view.
FIG. 4 is a block diagram showing the main configuration of the voice conference devices 1A to 1D of the present embodiment.
As shown in FIG. 3, the voice conference device 1 (1A to 1D) of the present embodiment mechanically comprises a housing 112, legs 113, and an operation unit 114.
The housing 112 has a substantially rectangular parallelepiped shape elongated in one direction. Legs 113 of a predetermined height, which hold the lower surface of the housing 112 a predetermined distance above the installation surface, are provided at both ends of the long sides (surfaces) of the housing 112. In the following description, of the four side surfaces of the housing 112, the longer surfaces are called the long surfaces and the shorter surfaces are called the short surfaces. The voice conference devices 1A to 1D are arranged in a row along the direction of these long surfaces, as shown in FIG. 1.

An operation unit 114 consisting of a plurality of buttons and a display screen is provided at one longitudinal end of the upper surface of the housing 112. The operation unit 114 is connected to a main control unit 10 installed inside the housing 112. It accepts operation inputs from participants and outputs them to the main control unit 10, and it shows the operation contents, the execution mode, and the like on the display screen.

Although not shown, various input/output interface terminals such as a network connection terminal are provided on the short surface of the housing 112 on the side where the operation unit 114 is installed. Through this network connection terminal, the voice conference device 1 (1A to 1D) is connected to the conference voice controller 2 by a LAN or the like.

Speakers SP1 to SP16 of identical shape are installed on the lower surface of the housing 112. The speakers SP1 to SP16 are arranged in a straight line at regular intervals along the longitudinal direction and together form a speaker array. Microphones MIC101 to MIC116 of identical shape are installed on one long surface of the housing 112, arranged in a straight line at regular intervals along the longitudinal direction, and form a microphone array. Microphones MIC201 to MIC216 of identical shape are likewise installed on the other long surface of the housing 112, arranged in a straight line at regular intervals along the longitudinal direction, and form a second microphone array. A lower grille (not shown) made of punched mesh and shaped to cover the speaker array and the microphone arrays is installed on the lower side of the housing 112. In this embodiment the speaker array has 16 speakers and each microphone array has 16 microphones, but the numbers are not limited to these and may be set as appropriate according to the specifications.

Functionally, as shown in FIG. 4, each of the voice conference devices 1A to 1D comprises a main control unit 10, a communication control unit 11, an emission control unit 12, D/A converters 13, emission amplifiers (AMP) 14, pickup amplifiers (AMP) 15, A/D converters 16, a pickup control unit 17, an echo cancellation unit 18, a remote control transceiver 19, the operation unit 114, the speakers SP1 to SP16, and the microphones MIC101 to MIC116 and MIC201 to MIC216.

The main control unit 10 performs overall control of the voice conference device, handles controls entered from the operation unit 114 such as power on/off, and performs various other controls of the signal processing system.

When a seated participant 200 operates a remote controller 120 and the main control unit 10 receives conference participation information via the remote control transceiver 19, the main control unit 10 detects the direction of the participant 200 from the direction of the remote controller 120 from which the information was received. Based on the detected direction, the main control unit 10 sets an emission directivity that is strong in the direction of the participant 200 and supplies it to the emission control unit 12.

The communication control unit 11 receives the output voice signal Sd from the conference voice controller 2 connected via the LAN, converts it from communication-format data into an ordinary voice signal, and outputs it to the emission control unit 12 via the echo cancellation unit 18.

The communication control unit 11 also converts the picked-up voice signal Ss output from the echo cancellation unit 18 into the communication format and transmits it to the conference voice controller 2.

Based on the emission directivity supplied from the main control unit 10, the emission control unit 12 applies delay processing, amplitude processing, and the like to the input output voice signal Sd and generates an emission signal for each of the speakers SP1 to SP16 so that an emission beam with strong directivity toward the seated participant 200 is formed.
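The per-speaker delay processing described above can be illustrated with a minimal sketch. It assumes plane-wave steering of a uniformly spaced linear array, which the source does not specify; the function name `steering_delays_s`, the 0.05 m spacing, and the 343 m/s speed of sound are all hypothetical values chosen for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def steering_delays_s(num_speakers, spacing_m, angle_deg,
                      speed=SPEED_OF_SOUND_M_S):
    """Per-speaker delays (seconds) that steer a linear array's beam.

    angle_deg is measured from the array's broadside direction; positive
    angles steer toward the high-index end of the array. This is a
    plane-wave approximation, not necessarily the patent's method.
    """
    step = spacing_m * math.sin(math.radians(angle_deg)) / speed
    raw = [i * step for i in range(num_speakers)]
    offset = min(raw)  # shift so that every delay is non-negative
    return [t - offset for t in raw]


# Hypothetical 16-speaker array with 5 cm spacing.
broadside = steering_delays_s(16, 0.05, 0.0)
toward_right = steering_delays_s(16, 0.05, 30.0)
toward_left = steering_delays_s(16, 0.05, -30.0)
```

Emitting several beams at once, as the invention describes for multiple participant directions, would amount to computing one such delay set per direction and summing the resulting per-speaker signals.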

Each D/A converter 13 converts its input emission signal from digital to analog and supplies it to the corresponding emission amplifier 14. Each emission amplifier 14 amplifies the analog emission signal and supplies it to the corresponding speaker SP1 to SP16. Each speaker SP1 to SP16 converts the input electrical emission signal into sound and emits it.

The microphones MIC101 to MIC116 and MIC201 to MIC216 pick up ambient sound, including the voices of the participants 200 seated at the device, convert it into electrical pickup signals, and supply them to the pickup amplifiers 15. The pickup amplifiers 15 amplify the pickup signals and supply them to the A/D converters 16, which convert the analog pickup signals to digital and output them to the pickup control unit 17.

The pickup control unit 17 applies delay processing and the like to the pickup signals of the microphones MIC101 to MIC116 and MIC201 to MIC216 and generates a plurality of pickup beam signals, each with strong directivity in a different direction. The pickup control unit 17 compares the amplitudes of the generated pickup beam signals, selects the pickup beam signal MB with the largest amplitude, and outputs it to the echo cancellation unit 18. If a participant 200 is speaking at this time, the pickup beam signal with strong directivity in the direction of that participant 200 is selected. This direction information may therefore be supplied to the main control unit 10, which may set the emission directivity based on it. Conversely, when direction information has been entered by the participant 200 through remote control operation as described above, the pickup control unit 17 may form only the pickup beam signal directed toward that direction, or may form pickup beam signals only within a predetermined angular range that includes that direction and then select among them by amplitude.
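The beam formation and selection described above can be sketched as a delay-and-sum beamformer followed by a strongest-beam pick. This is a minimal illustration under stated assumptions: integer sample delays, beam energy used as the amplitude measure, and hypothetical names (`delay_and_sum`, `select_strongest_beam`, the `steering` table). The source does not give the actual comparison rule or delay values.

```python
def delay_and_sum(mic_signals, delays):
    """Shift each microphone signal by its steering delay (in samples)
    and average the shifted signals into one beam signal."""
    n = len(mic_signals[0])
    beam = [0.0] * n
    for signal, delay in zip(mic_signals, delays):
        for i in range(n):
            j = i - delay
            if 0 <= j < n:
                beam[i] += signal[j]
    return [value / len(mic_signals) for value in beam]


def select_strongest_beam(mic_signals, steering):
    """Form one beam per steering entry and return (name, beam) of the
    beam with the largest energy (assumed amplitude measure)."""
    best_name, best_beam, best_energy = None, None, -1.0
    for name, delays in steering.items():
        beam = delay_and_sum(mic_signals, delays)
        energy = sum(value * value for value in beam)
        if energy > best_energy:
            best_name, best_beam, best_energy = name, beam, energy
    return best_name, best_beam


# Toy example: a pulse reaches microphone 2 first, then 1, then 0,
# as if the talker sat toward the microphone-2 end of the array.
mics = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
steering = {"left": [2, 1, 0], "front": [0, 0, 0], "right": [0, 1, 2]}
direction, beam_mb = select_strongest_beam(mics, steering)
```

Only the "right" delay set aligns the three pulses, so that beam carries the most energy and is selected; its name plays the role of the direction information passed to the main control unit.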

The echo cancellation unit 18 comprises an adaptive filter and a post-processor. The adaptive filter generates a pseudo echo signal based on the output voice signal Sd. The post-processor subtracts this pseudo echo signal from the pickup beam signal MB output by the pickup control unit 17 and outputs the result to the communication control unit 11 as the picked-up voice signal Ss. This suppresses the sound that leaks from the speakers SP into the microphones MIC.
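As a rough illustration of an adaptive filter that estimates the echo and a subtraction stage that removes it, the sketch below uses a normalized LMS (NLMS) update. NLMS is an assumption made for illustration; the source does not name the adaptation algorithm, and the function name, tap count, and step size are hypothetical.

```python
import random


def nlms_echo_cancel(reference, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `reference` from `mic`.

    reference: samples sent to the speakers (the role of Sd).
    mic:       picked-up samples containing the echo (the role of MB).
    Returns (residual, weights); the residual plays the role of Ss.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for x, d in zip(reference, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # subtraction of the pseudo echo
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Toy echo path: the microphone hears the reference delayed by one
# sample and attenuated by half, with no local talker.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(500)]
mic = [0.0] + [0.5 * x for x in reference[:-1]]
residual, weights = nlms_echo_cancel(reference, mic)
```

In this noise-free toy case the filter converges toward the true echo path (a weight near 0.5 at a one-sample lag), so the residual shrinks toward zero. A real canceller must also cope with double talk and longer room responses.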

FIG. 5 is a block diagram showing the main configuration of the conference voice controller 2 of the present embodiment.
The conference voice controller 2 comprises a CPU 21, a memory 22, and a mixer 23.
The CPU 21 performs overall control of the conference voice controller 2 and, based on the picked-up voice signals SsA to SsD, reads the mixing ratio and delay time for each of the output voice signals SdA to SdD from the memory 22 and supplies them to the mixer 23.
The memory 22 stores the mixing ratios and delay time ratios of the picked-up voice signals SsA to SsD for each of the output voice signals SdA to SdD.
FIG. 6(A) shows the relationship between the gains G of the picked-up voice signals SsA to SsD used to compose each of the output voice signals SdA to SdD, and FIG. 6(B) shows the relationship between the delay times T of the picked-up voice signals SsA to SsD used to compose each of the output voice signals SdA to SdD.
As shown in FIG. 6(A), the gain G of each picked-up voice signal SsA to SsD serving as a mixing element is set in advance for each output voice signal SdA to SdD. The gain G is set so that it increases with the distance from the destination voice conference device. In order of increasing distance from voice conference device 1A, the other devices are (1) voice conference device 1B, (2) voice conference device 1C, and (3) voice conference device 1D. Therefore, for the output voice signal SdA of voice conference device 1A, with G1 as the gain of the picked-up voice signal SsB of device 1B, G3 as the gain of the picked-up voice signal SsC of device 1C, and G5 as the gain of the picked-up voice signal SsD of device 1D, the gains are set so that G1 < G3 < G5. The gains G1, G3, and G5 are chosen so that the volume levels of the picked-up voice signals SsB to SsD after gain adjustment are approximately the same. Furthermore, the picked-up voice signal from the destination voice conference device itself is excluded from the mixing elements (equivalent to a gain G of 0).
Alternatively, the picked-up voice signal from the destination voice conference device may be included in the mixing elements. In that case it is mixed at a lower volume level than the picked-up voice signals obtained by the other voice conference devices. That is, if the gain in this case is G0, then 0 < G0 << G1. The destination voice conference device then reproduces (emits) its own picked-up voice signal at a very low level. The purpose is not amplification but to let the speaker monitor his or her own voice; reproducing it at such a low level supports natural conversation by the speaker while still preventing howling.
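The gain relationship G1 < G3 < G5 can be made concrete with a small sketch that builds a per-destination gain table from device positions along the desk. The linear gain-versus-distance law, the 2 m spacing, and the names `build_gain_table` and `self_gain` are assumptions made for illustration; the source only requires that the gain increase with distance and that the destination's own signal be excluded or strongly attenuated.

```python
def build_gain_table(positions, self_gain=0.0, ref_distance=1.0):
    """Return table[dst][src]: the gain applied to the signal picked up
    at device `src` when mixing the output for device `dst`.

    positions:    device name -> position along the desk in metres.
    self_gain:    gain for the destination's own signal (0.0 excludes it;
                  a small positive value gives the monitor level G0).
    ref_distance: distance at which the gain equals 1.0 (assumed law).
    """
    table = {}
    for dst, dst_pos in positions.items():
        row = {}
        for src, src_pos in positions.items():
            if src == dst:
                row[src] = self_gain
            else:
                row[src] = abs(dst_pos - src_pos) / ref_distance
        table[dst] = row
    return table


# Hypothetical layout: four devices spaced 2 m apart on the long desk.
positions = {"A": 0.0, "B": 2.0, "C": 4.0, "D": 6.0}
gains = build_gain_table(positions, ref_distance=2.0)
gains_with_monitor = build_gain_table(positions, self_gain=0.05,
                                      ref_distance=2.0)
```

Here `gains["A"]` holds the mix for output signal SdA: zero for SsA and increasing values for SsB, SsC, and SsD. The `gains_with_monitor` variant corresponds to the optional case 0 < G0 << G1.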

As shown in FIG. 6B, the delay time T of each of the collected sound signals SsA to SsD, which are the mixing elements, is also set in advance for each of the emitted sound signals SdA to SdD. The delay time T is set so as to increase with the distance from the audio conference apparatus that is the sound emission destination. More specifically, for the emitted sound signal SdA of the audio conference apparatus 1A, where T1 is the delay time of the collected sound signal SsB of the audio conference apparatus 1B, T3 is the delay time of the collected sound signal SsC of the audio conference apparatus 1C, and T5 is the delay time of the collected sound signal SsD of the audio conference apparatus 1D, the delay times are set so that T1 < T3 < T5. The delay times T1, T3, and T5 are set so that each of the collected sound signals SsB to SsD contained in the delayed emitted sound signal SdA reaches the conference participant 200A seated at the audio conference apparatus 1A at the same time as the live voice of the corresponding conference participants 200B to 200D and 200F to 200H at the audio conference apparatuses 1B to 1D.
The gain G and the delay time T may be set by measuring the distances between the apparatuses at installation and entering them.
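
The description does not state how the delay time is derived from the measured distance. A minimal sketch, assuming the delay simply equals the travel time of the live voice over the inter-apparatus distance, and ignoring network and processing latency as well as the short path from the loudspeaker to the listener. The constant, the function names, and the distances are assumptions for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 degrees C


def delay_seconds(distance_m):
    """Time the live voice needs to travel `distance_m` metres."""
    return distance_m / SPEED_OF_SOUND_M_S


def delay_samples(distance_m, sample_rate_hz=48000):
    """The same delay expressed in whole samples at `sample_rate_hz`."""
    return round(delay_seconds(distance_m) * sample_rate_hz)


# Hypothetical distances from apparatus 1A to 1B, 1C and 1D, in metres.
t1, t3, t5 = (delay_seconds(d) for d in (2.0, 4.0, 6.0))
```

Because the delay grows linearly with distance, any set of increasing distances yields T1 < T3 < T5 as required.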

The mixer 23 mixes the collected sound signals SsA to SsD received from the audio conference apparatuses 1A to 1D, based on the gain G and the delay time T supplied by the CPU 21, to generate the emitted sound signals SdA to SdD. More specifically, for each of the emitted sound signals SdA to SdD, the mixer 23 applies the corresponding gain G and delay time T set for that signal to each of the collected sound signals SsA to SsD, and then adds the adjusted collected sound signals together. The mixer 23 transmits the emitted sound signals SdA to SdD to the audio conference apparatuses 1A to 1D.
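
The mixing step for one emitted sound signal can be sketched as follows. This is an illustrative Python model on plain sample lists, not the implementation of the mixer 23. Delays are whole samples, and all names and values are hypothetical.

```python
def delay_and_scale(signal, gain, delay):
    """Scale `signal` by `gain` and shift it later by `delay` samples."""
    return [0.0] * delay + [gain * x for x in signal]


def mix(collected, gains, delays):
    """Sum the gain- and delay-adjusted collected signals for one destination.

    collected: {source: list of samples}
    gains, delays: {source: value} configured for this destination
    """
    adjusted = [delay_and_scale(collected[s], gains[s], delays[s])
                for s in collected]
    out = [0.0] * max(len(a) for a in adjusted)
    for a in adjusted:
        for i, x in enumerate(a):
            out[i] += x
    return out


# Emitted signal SdA: SsA itself is excluded (gain 0). Values are illustrative.
collected = {"A": [1.0, 1.0], "B": [1.0, 0.0], "C": [0.0, 1.0]}
sd_a = mix(collected,
           gains={"A": 0.0, "B": 1.0, "C": 2.0},
           delays={"A": 0, "B": 1, "C": 2})
```

Running the same function once per destination, each with its own gain and delay row, would produce SdA to SdD.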

With the gains set and the signals mixed in this way, the voice of whichever conference participant speaks can be emitted at an equal volume to all listening conference participants, regardless of the distance between conference participants (between audio conference apparatuses).

Furthermore, with the delay times set and the signals mixed in this way, whichever conference participant speaks, each listening conference participant receives the speaker's live voice and the sound emitted from the audio conference apparatus at the same time, regardless of the distance between conference participants (between audio conference apparatuses).

When the audio conference apparatuses 1A to 1D are connected to an external audio conference apparatus via a network, an external output audio signal SdO, obtained by adjusting the collected sound signals SsA to SsD with the same level of gain and adding them, is transmitted to the external audio conference apparatus. Conversely, when an external input audio signal SsO is received from the external audio conference apparatus, an appropriate gain is set for it and it is added to the emitted sound signals SdA to SdD, each of which is a mix of the collected sound signals SsA to SsD at the predetermined mixing ratio. This makes it possible to hold an audio conference with an external audio conference apparatus while maintaining the sound emission and collection environment described above.
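
The two external paths can be sketched in the same style. This is an illustrative model only: the function names are hypothetical, and the gain value for SsO is an arbitrary stand-in for the "appropriate gain", which the description does not specify.

```python
def external_output(collected, gain=1.0):
    """SdO: sum of all collected signals, each adjusted with the same gain."""
    out = [0.0] * max(len(s) for s in collected.values())
    for s in collected.values():
        for i, x in enumerate(s):
            out[i] += gain * x
    return out


def add_external_input(emitted, external_in, ext_gain):
    """Add the external input SsO, scaled by `ext_gain`, to one emitted signal."""
    out = [0.0] * max(len(emitted), len(external_in))
    for i, x in enumerate(emitted):
        out[i] += x
    for i, x in enumerate(external_in):
        out[i] += ext_gain * x
    return out


# Illustrative sample values only.
collected = {"A": [1.0, 0.0], "B": [0.0, 2.0], "C": [1.0, 1.0], "D": [0.0, 0.0]}
sd_o = external_output(collected)
sd_a = add_external_input([0.0, 1.0], external_in=[2.0, 2.0, 2.0], ext_gain=0.5)
```

The in-room mix is left untouched. The external signal is simply added on top of it, which is how the local gain and delay settings are preserved.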

Next, the functions described above are explained for a specific situation with reference to the drawings.
FIG. 7 shows a situation in which only the conference participant 200A is speaking. In FIG. 7, broken lines indicate collected sound and solid lines indicate emitted sound. SsX(Gm, Tn) denotes the collected sound signal SsX adjusted with the gain Gm and the delay time Tn.
As shown in FIG. 7, when the audio conference apparatus 1A picks up the speech of the conference participant 200A, it generates the collected sound signal SsA. The audio conference apparatus 1A transmits the collected sound signal SsA to the conference audio control apparatus 2, and the conference audio control apparatus 2 generates the emitted sound signals SdA to SdD based on the collected sound signal SsA and transmits them to the audio conference apparatuses 1A to 1D, respectively. In doing so, the conference audio control apparatus 2 generates the emitted sound signals SdA to SdD using the gain G and the delay time T set according to the criteria described above. Each of the audio conference apparatuses 1A to 1D receives its emitted sound signal SdA to SdD and emits it.
Specifically, the audio conference apparatus 1A either emits the 0-level emitted sound signal SdA or detects the signal level (0 level) and emits no sound. The audio conference apparatus 1B emits the gain- and delay-adjusted emitted sound signal SdB = SsA(G1, T1). The audio conference apparatus 1C emits the gain- and delay-adjusted emitted sound signal SdC = SsA(G3, T3). The audio conference apparatus 1D emits the gain- and delay-adjusted emitted sound signal SdD = SsA(G5, T5).
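
The single-speaker case of FIG. 7 can be traced numerically. In the Python sketch below, the numeric gains and delays are hypothetical stand-ins for G1, G3, G5 and T1, T3, T5, chosen only to satisfy G1 < G3 < G5 and T1 < T3 < T5. Delays are in samples.

```python
# (gain, delay in samples) applied to SsA for each destination apparatus.
SETTINGS_FOR_SSA = {
    "1A": (0.0, 0),  # own signal excluded: 0-level SdA
    "1B": (1.0, 1),  # stands in for (G1, T1)
    "1C": (2.0, 2),  # stands in for (G3, T3)
    "1D": (3.0, 3),  # stands in for (G5, T5)
}


def emitted_signals(ss_a):
    """Return {destination: emitted samples} when only 200A is speaking."""
    out = {}
    for dest, (gain, delay) in SETTINGS_FOR_SSA.items():
        out[dest] = [0.0] * delay + [gain * x for x in ss_a]
    return out


sd = emitted_signals([1.0, 0.5])
```

The signal for 1A stays at zero, while the signals for 1B, 1C and 1D start progressively later and are progressively louder.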

With this processing, the conference participants 200B to 200D and 200F to 200H can hear the speech of the conference participant 200A at a sufficient volume and without the discomfort caused by a time lag between the live voice and the emitted sound. Because the conference participant 200E is seated close to and in front of the conference participant 200A, 200E can hear the live voice of 200A directly. And because the conference participant 200A does not hear his or her own voice as emitted sound, 200A can speak without discomfort.

As described above, with the configuration and processing of this embodiment, even when a conference is held in a large space, all conference participants can hear each speaker at a sufficient volume and without discomfort.

In this sound emission and collection processing, each of the audio conference apparatuses 1A to 1D can, as described above, form emitted sound with strong directivity toward a conference participant. By setting each apparatus so that sound is emitted only toward its own conference participants, the emitted sounds do not mix between adjacent conference participants. As a result, each conference participant can hear the speaker's voice with even less discomfort.
In addition, because each of the audio conference apparatuses 1A to 1D can detect operation inputs and collected sound beam signals as described above, it can reliably emit sound toward a conference participant even if that participant moves. As a result, a conference participant can reliably hear the speaker's voice even after moving.

The description above covers the case where both gain adjustment and delay time adjustment are performed, but only one of them may be performed. In that case, the load of the mixing process is reduced, so the emitted sound signals can be generated more responsively.

The description above also covers the case where two conference participants are seated at one audio conference apparatus, but the configuration can be applied even when more people are seated at one audio conference apparatus. In that case, the audio conference apparatus at which many people are seated may either emit sound with wide directivity so that it reaches everyone seated there equally, or emit sound narrowed toward each person.

In addition, each of the audio conference apparatuses 1A to 1D can detect conference participants as described above, and can provide emitted sound to each person even when several people are seated at it. Using these capabilities, even if a participant joins at one of the audio conference apparatuses partway through and the number of participants increases, all conference participants can hear the speaker's voice at a sufficient volume and without discomfort.

In this case, each of the audio conference apparatuses 1A to 1D can supply the direction information of each conference participant to the conference audio control apparatus 2, so that the conference audio control apparatus 2 can set the distance between individual conference participants in more detail from the distance information of the audio conference apparatuses and the direction information of the conference participants. Using these participant-to-participant distances, the gain and the delay time can be set in more detail.
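
The description does not say how participant-to-participant distances are computed from the apparatus distances and the participant directions. One possible geometric sketch in Python follows, under strong assumptions that are not in the source: each apparatus has a known 2D position, each participant sits at a fixed assumed radius from the apparatus, and the direction is an azimuth angle in a shared room coordinate system. All names and values are hypothetical.

```python
import math


def participant_position(device_xy, azimuth_deg, seat_radius_m=0.5):
    """Estimate a participant's position from the apparatus position and
    the detected direction. `seat_radius_m` is an assumed seating distance."""
    x, y = device_xy
    a = math.radians(azimuth_deg)
    return (x + seat_radius_m * math.cos(a), y + seat_radius_m * math.sin(a))


def participant_distance(p, q):
    """Straight-line distance between two estimated positions."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Two apparatuses 4 m apart; the participants sit on the outer sides.
speaker = participant_position((0.0, 0.0), azimuth_deg=180.0)
listener = participant_position((4.0, 0.0), azimuth_deg=0.0)
d = participant_distance(speaker, listener)
```

A distance obtained this way could replace the apparatus-to-apparatus distance when the gain and the delay time are set.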

A configuration diagram of the audio conference system according to an embodiment of the present invention.
A configuration diagram showing the communication wiring of the audio conference system according to the embodiment of the present invention.
A three-view drawing of the audio conference apparatuses 1A to 1D according to the embodiment of the present invention.
A block diagram showing the main configuration of the audio conference apparatuses 1A to 1D according to the embodiment of the present invention.
A block diagram showing the main configuration of the conference audio control apparatus 2 according to the embodiment of the present invention.
A diagram showing the relationship of the gains G of the collected sound signals SsA to SsD used to compose each of the emitted sound signals SdA to SdD, and a diagram showing the relationship of the delay times T of the collected sound signals SsA to SsD used to compose each of the emitted sound signals SdA to SdD.
A diagram showing a situation in which only the conference participant 200A is speaking.

Explanation of symbols

1 (1A to 1D) - audio conference apparatus, 2 - conference audio control apparatus, 10 - main control unit, 11 - communication control unit, 12 - sound emission control unit, 13 - D/A converter, 14 - sound emission amplifier (AMP), 15 - sound collection amplifier (AMP), 16 - A/D converter, 17 - sound collection control unit, 18 - echo cancellation unit, 19 - remote control transmission/reception unit, 112 - housing, 113 - leg, 114 - operation unit, 200A to 200H - conference participant, SP1 to SP16 - speaker, MIC101 to MIC116, MIC201 to MIC216 - microphone

Claims

A voice conference system comprising:
a plurality of audio conference apparatuses arranged in a predetermined pattern, each having sound collection means for realizing a plurality of mutually different sound collection directivities and sound emission means for realizing a plurality of mutually different sound emission directivities; and
conference sound control means for receiving sound pickup signals from the plurality of audio conference apparatuses and generating sound emission signals whose volume is adjusted according to the distance from the audio conference apparatus that generated the sound pickup signal.

The voice conference system according to claim 1, wherein the audio conference apparatus includes a microphone array comprising a plurality of microphones, a speaker array comprising a plurality of speakers, and conference participant direction detection means for detecting the direction of a conference participant based on the sound collected by the plurality of microphones, and wherein, when a plurality of different conference participant directions are detected by the conference participant direction detection means, the sound output to the plurality of speakers is controlled so that sound is emitted simultaneously with an individual sound emission directivity for each conference participant direction.