JP4929673B2

JP4929673B2 - Audio conferencing equipment

Info

Publication number: JP4929673B2
Application number: JP2005306687A
Authority: JP
Inventors: 康祐斉藤; 利晃石橋
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2005-10-21
Filing date: 2005-10-21
Publication date: 2012-05-09
Anticipated expiration: 2025-10-21
Also published as: JP2007116494A

Abstract

PROBLEM TO BE SOLVED: To provide an audio conference apparatus that lets a user change sound source positions easily, does not require as many speakers as there are counterparts, and lets the user easily identify which of a plurality of counterparts is speaking.

SOLUTION: The audio conference apparatus 200 includes an input unit 21 that receives an audio signal from each of a plurality of counterparts, together with a delay unit 24, an adding unit 25, D/A converters 26, and amplifiers 27. These are connected to a speaker array 20 in which a plurality of speaker units SP are arranged in a line and directed downward. Each audio signal received by the input unit 21 is given independent delay control so that its audio beam focuses at a different virtual point sound source, and the result is input to the speaker array 20.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to an audio conference apparatus that conducts an audio conference with a plurality of counterparts by communicating with the counterparts' audio conference apparatuses.

Audio conference apparatuses for holding a conference, such as a teleconference, with a plurality of counterparts at different locations have long been widely known. Such an apparatus includes a speaker and a microphone. It provides, in parallel, a function of transmitting the speech of conference attendees picked up by the microphone to a counterpart's audio conference apparatus, and a function of emitting from the speaker the counterpart's speech received from that apparatus (see, for example, Patent Document 1). Conference attendees can thus talk with counterparts at remote locations and hold an audio conference with them.

Patent Document 1: JP-A-5-158492

A conventional audio conference apparatus emits the speech of a plurality of counterparts through a single omnidirectional speaker (one with a wide directivity range). Because the speech of all counterparts is emitted from the same sound source, it is difficult for the user to identify which counterpart is speaking.

One conceivable solution is to prepare a plurality of speakers and emit the sound from each counterpart through a different speaker. With this method, however, as many speakers as there are counterparts must be prepared in advance. In addition, the user must physically reposition a speaker every time a sound source position is to be changed, which is laborious.

To solve these problems, an object of the present invention is to provide an audio conference apparatus that makes it easy for the user to identify which of a plurality of counterparts is speaking, while requiring little effort from the user to change sound source positions and without requiring as many speakers as there are counterparts.

To solve the above problems, the present invention employs the following means.

The present invention is an audio conference apparatus comprising: an input unit that receives audio signals from a plurality of transmission sources together with transmission source identification information; and a sound emitting unit to which is connected a speaker array having a plurality of speaker units arranged in a line and facing downward, the sound emitting unit performing independent delay control on each audio signal received by the input unit and inputting the result to the speaker array. The sound emitting unit selects a position pattern for the virtual point sound sources of the audio signals and, based on the transmission source identification information received with each audio signal from the input unit and on the selected position pattern, performs control so that the audio beam of each audio signal focuses at a virtual point sound source at a different position along the longitudinal direction of the speaker array.

With this configuration, the audio signals from the plurality of counterparts are input to the input unit. The sound emitting unit controls these signals so that the audio beams of the signals from the counterparts' audio conference apparatuses focus at different virtual point sound sources, and inputs them to the speaker array. The sound from each counterpart is therefore localized at a different position, so the user can easily identify which counterpart is being heard.

Because the speaker array is used to localize sound at virtual point sound sources, the positions and the number of the virtual point sound sources can be changed easily. As described above, the user can therefore easily identify which counterpart is being heard, with little effort needed to change sound source positions and without preparing as many speakers as there are counterparts.

In addition, because the sound emitting side of the speaker unit row of the speaker array faces downward, the row outputs the audio beam downward. The beam is reflected by an obstacle below the speaker array (for example, the desk or the floor when the apparatus is placed on a desk) and propagates obliquely upward from the obstacle.

This makes it possible to output audio beams with the same content in two directions orthogonal to the line direction of the speaker units. When there are several users, they can therefore sit on the two sides and listen to the sound.

It is also possible to prevent sound from leaking to a person behind a particular listener (assuming the listener faces the direction from which the audio beam arrives). Furthermore, although a speaker array can control the directivity of an audio beam, some sound still leaks in the direction orthogonal to the sound emitting surface. In the present invention the audio beam is output downward from the speaker array, which effectively prevents this leakage from being heard by people other than the intended listeners.

According to the present invention, the speaker array is used to localize the sound from each of a plurality of counterparts at a different virtual point sound source. The positions of the virtual point sound sources can therefore be changed easily to arbitrary positions, and their number can be changed easily as well. As a result, the user can easily identify which counterpart is being heard, with little effort needed to change sound source positions and without preparing as many speakers as there are counterparts.

In addition, the apparatus is installed so that the sound emitting side of the speaker unit row faces downward. The speaker unit row therefore outputs the audio beam downward, and the beam reflected by an obstacle below the speaker array (for example, the desk or the floor when the apparatus is placed on a desk) propagates obliquely upward from the obstacle.

Audio beams with the same content can thus be output in two directions orthogonal to the line direction of the speaker units, so that when there are several users they can sit on the two sides and listen. It is also possible to effectively prevent people other than the intended listeners from hearing the sound.

An audio conference system 100 according to an embodiment of the present invention is described in detail below with reference to FIGS. 1 to 8. FIG. 1 is a diagram schematically showing the audio conference system 100. The audio conference system 100 is a system for holding an audio conference among a plurality of conference attendees h at a plurality of conference locations. In the system, a plurality of audio conference apparatuses 200 placed at the respective conference locations are connected to a network N. The network N is, for example, a wide area network such as the Internet, or a LAN.

Each audio conference apparatus 200 has a function of communicating with the other (counterpart) audio conference apparatuses 200 connected to the network N (counterpart devices 200'), and exchanges conference audio signals with them through this communication. The communication protocol is not particularly limited; for example, TCP (Transmission Control Protocol)/IP (Internet Protocol) is used.

FIG. 2 is a cross-sectional perspective view of the audio conference apparatus 200 shown in FIG. 1. The audio conference apparatus 200 includes a frame 1 and a speaker device 2 mounted on top of it. The frame 1 is a long, substantially rectangular parallelepiped with a hollow rectangular cross section whose side faces in the Y-Y direction in the figure are open. The speaker device 2 is also a long, substantially rectangular parallelepiped. A microphone 3 is connected to the speaker device 2 via a connection line 3a. A speaker array 20 is disposed on the lower surface of the speaker device 2 with its sound emitting side facing downward.

When conference audio transmitted from a counterpart device 200' is input to the speaker array 20, the conference audio is emitted. Likewise, the speech of a conference attendee h, who is the user, is picked up by the microphone 3, transmitted to the counterpart device 200', and emitted from the speaker array 20 of the counterpart device 200'. An audio conference can thus be held among a plurality of audio conference apparatuses 200.

The speaker array 20 consists of eight speaker units SP (SP1 to SP8) arranged in a line along the longitudinal direction and facing downward. When audio signals are input to the speaker units SP1 to SP8, the speaker array 20 outputs an audio beam directed downward.

The audio beam output downward is reflected by the lower surface 10 of the frame 1 (hereinafter, the reflector 10), travels obliquely upward, and is emitted from the sides of the frame 1 in the Y-Y direction in the figure. The reflector 10 is, for example, a flat plate made of resin, which reflects the audio beam well. As long as the audio beam is reflected well, the material of the reflector 10 is not limited to resin and may be, for example, a metal. Likewise, the shape of the reflector 10 is not limited to a flat plate and may be, for example, gently curved.

By controlling the timing at which the audio signals are input to the speaker units SP1 to SP8, the audio beam can be focused at a virtual point sound source P, and the position of the virtual point sound source P can be moved to any desired position. That is, by giving the audio signals input to the speaker units SP1 to SP8 delay times such as those indicated by the arrows in the figure, the sound image of the audio beam can be localized at the virtual point sound source P. The conference attendees h therefore perceive the conference audio as coming from the virtual point sound source P.
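The relationship between the position of the virtual point sound source P and the per-unit delay times can be illustrated with a minimal sketch. It follows the rule stated later in the description, namely that each unit is delayed by the time sound would take to travel from P to that unit. The unit spacing, the speed of sound, the coordinate convention, the function names, and the normalization that gives the nearest unit zero delay are all assumptions made for illustration and are not taken from the patent.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def unit_positions(num_units=8, spacing=0.05):
    """X coordinates (m) of the speaker units, centered on 0.

    The 0.05 m spacing is a hypothetical value, not from the patent.
    """
    offset = (num_units - 1) * spacing / 2.0
    return [i * spacing - offset for i in range(num_units)]


def focus_delays(source_x, source_depth, xs):
    """Delay (s) for each unit so that sound appears to come from P.

    P is at (source_x, source_depth) relative to the unit line. Each delay is
    the travel time from P to the unit, minus the shortest travel time, so the
    nearest unit gets zero delay (an assumed normalization).
    """
    times = [math.hypot(x - source_x, source_depth) / SPEED_OF_SOUND for x in xs]
    earliest = min(times)
    return [t - earliest for t in times]
```

Calling `focus_delays` again with a different `source_x` yields a new delay set, which is the sense in which the virtual point sound source can be moved without touching the hardware.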

FIG. 3 is a diagram for explaining how the audio conference apparatus 200 shown in FIG. 2 is installed and how the directivity of the audio beam is controlled. The room R is an ordinary office room rather than a conference room dedicated to audio conferencing. FIG. 3 shows the room R viewed from the side, and FIG. 4 shows the room R of FIG. 3 viewed from the Y direction. For clarity, FIG. 4 omits conference attendee h2 and the chair in which conference attendee h2 is seated.

In the room R, two conference attendees h (h1, h2) are seated facing each other across a conference desk. The audio conference apparatus 200 is placed on the desk so that the Y-Y direction (see FIG. 1) is aligned with the direction in which conference attendees h1 and h2 face each other. As shown in FIG. 3, the audio conference apparatus 200 outputs the audio beam downward; the beam is reflected by the reflector 10, splits into two directions along Y-Y, and propagates obliquely upward. That is, the audio beam propagates obliquely upward toward each of conference attendees h1 and h2.

Compared with outputting the audio beam horizontally toward conference attendees h1 and h2, this more effectively prevents people behind h1 and h2 from hearing the sound. In addition, because the directivity of the audio beam can be controlled in the X-X direction in FIG. 2, sound leakage in the X-X direction can also be effectively prevented.

Although the speaker device 2 can control the directivity of the audio beam, some sound still leaks in the direction orthogonal to the sound emitting surface. In this embodiment the audio beam is output downward from the speaker array 20, which prevents this leakage from being heard by people other than the intended listeners.

As shown in FIG. 4, the audio beams are controlled so that the sound from each of the plurality of counterpart devices 200' focuses at a different virtual point sound source P along the line direction of the speaker units SP (the X-X direction). For example, in an audio conference among four locations, the audio beams are controlled so that sound images are localized at three different virtual point sound sources P (P1 to P3). Conference attendee h1 therefore perceives each sound source as being at the position of the virtual point sound source P in the X-X direction and, in the Y-Y direction, displaced from the virtual point sound source P toward the Y side by the distance from the virtual point sound source P to the reflector 10. Conference attendee h2 likewise perceives the sound source at the position of P in the X-X direction and, in the Y-Y direction, displaced from P toward the -Y side by the same distance.

Conference attendees h1 and h2 thus perceive the sound from each counterpart device 200' as coming from a different sound source, and can identify relatively easily which counterpart they are hearing.

As described above, the audio conference apparatus 200 focuses the audio beams from the speaker array 20 at a plurality of virtual point sound sources P so that the conference attendees h perceive sound as coming from a plurality of sound sources. The position of a virtual point sound source P can therefore be changed easily by changing the delay times given to the audio signals input to the speaker units SP. Likewise, the number of virtual point sound sources P can be changed simply by providing as many channels in the delay-applying functional unit as the desired number of virtual point sound sources P.

Furthermore, as described above, the audio beam is reflected by the reflector 10 and output in two directions along Y-Y, so the sound components in the two directions have the same content. When listening to these components, the conference attendees h on the two sides therefore perceive the sound image as localized on opposite sides of themselves. Specifically, when the sound from a counterpart device 200' in Hokkaido is localized at the virtual point sound source P1, conference attendee h1 hears the source of this sound at his or her far right, while conference attendee h2 hears it at his or her far left. Both conference attendees h1 and h2 thus perceive the sound from each of the plurality of counterpart devices 200' as coming from a different sound source.

FIG. 5 is a block diagram schematically showing the configuration of the audio conference apparatus 200 shown in FIG. 1. In addition to the speaker array 20, the speaker device 2 includes an input/output interface 21, an echo canceller 22, a signal distribution unit 23, a delay unit 24, an adding unit 25, a D/A (digital/analog) converter 26, an amplifier 27, an operation unit 28, and a control unit 29.

The input/output interface 21 corresponds to the input unit of the present invention. It is connected to the network N via, for example, a communication cable (not shown) connected to a connection terminal 30, and exchanges digital audio signals with the audio conference apparatuses 200' connected to the network N. The conference audio signal received from an audio conference apparatus 200' via the input/output interface 21 (the received audio signal) is input to the echo canceller 22. In this embodiment the received audio signal is input in units of packets. Communication between audio conference apparatuses 200 is not limited to packet communication, however, and the received audio signal need not be input in units of packets.

Using this input signal, the echo canceller 22 generates a pseudo signal that simulates the echo component output from the speaker array 20 and fed back to the microphone 3. The echo canceller 22 passes the received audio signal, which it obtained via the input/output interface 21, on to the signal distribution unit 23. It then removes the echo component by removing the pseudo signal from the audio signal input from the microphone 3 (described later).
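The echo removal step can be sketched as follows. The description does not say how the pseudo signal is produced, so this sketch assumes, purely for illustration, that the echo path from the speaker array to the microphone is modeled as a fixed FIR filter whose coefficients are already known; practical echo cancellers usually estimate such coefficients adaptively. The function names and the `echo_path` parameter are hypothetical.

```python
def pseudo_echo(received, echo_path):
    """Simulate the echo of `received` by FIR-filtering it with `echo_path`.

    `echo_path` is an assumed, already-known list of filter coefficients.
    """
    out = []
    for n in range(len(received)):
        acc = 0.0
        for k, coeff in enumerate(echo_path):
            if n - k >= 0:
                acc += coeff * received[n - k]
        out.append(acc)
    return out


def cancel_echo(mic, received, echo_path):
    """Subtract the pseudo echo from the microphone signal, sample by sample."""
    estimate = pseudo_echo(received, echo_path)
    return [m - e for m, e in zip(mic, estimate)]
```

If the assumed echo path matches the real one, what remains after `cancel_echo` is the local attendee's speech, which is the signal to be sent to the counterpart devices.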

The signal distribution unit 23 is implemented by, for example, a DSP (Digital Signal Processor). It refers to the transmission source ID contained in each received audio signal and inputs the signal to the channel of the delay unit 24 that corresponds to that transmission source ID (distribution processing). The distribution processing is described in detail later with reference to FIG. 7(b). The transmission source ID is, for example, an IP address, and is included in the communication data exchanged among the audio conference apparatuses 200. The signal distribution unit 23 holds a setting that associates each transmission source ID with the corresponding channel of the delay unit 24, and based on this setting it distributes the received audio signals to the channels of the delay unit 24.
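The distribution processing amounts to a lookup from transmission source ID to channel. A minimal sketch is given below; the packet representation as a `(source_id, samples)` pair, the in-order channel assignment, and the choice to drop packets from unknown senders are assumptions for illustration, not details from the patent.

```python
def build_channel_map(source_ids):
    """Associate each transmission source ID with a channel index.

    Channels are assigned in list order, which is an assumed policy.
    """
    return {source_id: channel for channel, source_id in enumerate(source_ids)}


def distribute(packets, channel_map, num_channels):
    """Route (source_id, samples) packets into per-channel sample lists."""
    channels = [[] for _ in range(num_channels)]
    for source_id, samples in packets:
        channel = channel_map.get(source_id)
        if channel is None:
            continue  # unknown sender: ignored in this sketch
        channels[channel].extend(samples)
    return channels
```

For example, with IP addresses as transmission source IDs, `build_channel_map(["10.0.0.1", "10.0.0.2"])` sends audio from the first address to channel 0 and from the second to channel 1.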

The delay unit 24 corresponds to the sound emitting unit of the present invention and has a plurality of channels (n channels). Where the n channels need to be distinguished, they are written as delay units 24_1 to 24_n, with the channel number as a subscript. Each channel of the delay units 24_1 to 24_n contains as many delay units as there are speaker units SP1 to SP8 (eight). Where these individual delay units 24 need to be distinguished, they are given the same number as the corresponding speaker unit SP1 to SP8; for example, the delay unit 24 corresponding to speaker unit SP1 is written as delay unit 24-1. When each of the delay units 24-1 to 24-8 is set to the delay time that sound from the virtual point sound source P would take to reach the corresponding speaker unit SP (for example, the delay times shown in FIG. 2), the sound image can be localized at the virtual point sound source P.

The delay units 24_1 to 24_n of the respective channels correspond to virtual point sound sources P at different positions in the X-X direction. That is, each channel is set with delay times that focus the audio beam at a different virtual point sound source P, and applies those delay times to the signal it receives. Each channel is also associated with a different transmission source, and receives the received audio signal of that transmission source from the signal distribution unit 23 as described above. Received audio signals from different transmission sources are therefore given delay times that localize their sound images at different virtual point sound sources P, so the sound received from different counterpart devices 200' can be focused at different virtual point sound sources P.

The delay units 24-1 to 24-8 of each channel input the delayed audio signals to the corresponding adding units 25. The adding unit 25 corresponds to the sound emitting unit of the present invention, and as many adding units are provided as there are delay units 24-1 to 24-8. Where the adding units 25 need to be distinguished, they are given the same suffix as the corresponding delay units 24-1 to 24-8. Each of the adding units 25-1 to 25-8 receives the delayed received audio signals from the corresponding delay units 24, combines the signals from all channels, and inputs the result to the corresponding D/A converter 26.
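The per-channel delay stage followed by the per-unit adders can be sketched as a delay-and-sum operation. In this illustration the delays are whole numbers of samples and signals are plain lists of floats; both are simplifying assumptions, as are the function names. `channel_delays[c][k]` stands for the delay that channel c applies to the signal destined for speaker unit k.

```python
def delay_signal(samples, delay, length):
    """Shift `samples` later by `delay` samples, zero-padded to `length`."""
    out = [0.0] * length
    for i, sample in enumerate(samples):
        j = i + delay
        if j < length:
            out[j] = sample
    return out


def mix_for_units(channel_signals, channel_delays, num_units=8):
    """Return one summed signal per speaker unit.

    channel_signals[c] holds the samples routed to channel c.
    channel_delays[c][k] is the integer delay of channel c for unit k.
    """
    length = max(
        len(signal) + max(delays)
        for signal, delays in zip(channel_signals, channel_delays)
    )
    outputs = []
    for k in range(num_units):
        total = [0.0] * length
        for signal, delays in zip(channel_signals, channel_delays):
            delayed = delay_signal(signal, delays[k], length)
            total = [a + b for a, b in zip(total, delayed)]
        outputs.append(total)
    return outputs
```

Each entry of the returned list corresponds to what one adding unit would pass on toward its D/A converter: the sum, over all channels, of that channel's signal delayed for that particular speaker unit.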

The D/A converter 26 corresponds to the sound emitting unit of the present invention, and as many D/A converters are provided as there are adding units 25-1 to 25-8. Where the D/A converters 26 need to be distinguished, they are given the same suffix as the corresponding adding units 25-1 to 25-8. Each of the D/A converters 26-1 to 26-8 receives the delayed received audio signal from the corresponding adding unit 25, converts this digital signal to analog, and inputs it to the corresponding amplifier 27.

The amplifier 27 corresponds to the sound emitting unit of the present invention and amplifies the level of the audio signal input to it. As many amplifiers 27 are provided as there are speaker units SP1 to SP8. Where the amplifiers 27 need to be distinguished, they are given the same number as the corresponding speaker unit SP1 to SP8.

The amplifiers 27-1 to 27-8 receive the received audio signals from the adding units 25-1 to 25-8 via the D/A converters 26 (26-1 to 26-8), amplify them, and input them to the corresponding speaker units SP1 to SP8. The speaker units SP1 to SP8 thus emit the sound of the received audio signals, that is, the speech of the counterparts sent from the audio conference apparatuses 200'.

The operation unit 28 accepts operations by a user, such as a conference attendee h or an operator, and inputs an operation signal indicating the content of the operation to the control unit 29. User operations include an operation for starting an audio conference and operations for configuring an audio conference in advance.

The advance configuration operations include, for example, selecting conference locations. This operation selects which of the audio conference apparatuses 200 placed at the locations making up the audio conference system 100 will take part in the audio conference.

For example, suppose that audio conference apparatuses 200 are installed in "Hokkaido", "Tokyo", "Nagoya", and "Osaka", and that the present audio conference apparatus 200 is the one in Osaka. To hold an audio conference with "Tokyo" and "Nagoya" out of "Hokkaido", "Tokyo", and "Nagoya", the user enters the numbers or the like that indicate "Tokyo" and "Nagoya" using a numeric keypad or the like (not shown) of the operation unit 28. This inputs "Tokyo" and "Nagoya" as the conference points.

In the audio conference apparatus 200, the audio from the counterpart devices 200' at the selected conference points is localized at a plurality of different point sound source positions, as described above. The presetting operations therefore also include an operation for selecting a position pattern of the virtual point sound sources P. The position patterns of the virtual point sound sources P are described below.

FIG. 6 shows the positions of the virtual point sound sources P as seen from above. (a) shows a conference among four points, in which the audio from the counterpart devices 200' at three points is localized at virtual point sound sources P separated by equal distances. (b) shows a conference among three points, in which the audio from the counterpart devices 200' at two points is localized at virtual point sound sources P separated by equal distances.

(c) shows a conference among three points in which the other party using the counterpart device 200' at one of the points is the chairperson. Here the audio from the chairperson is localized at a position away from the virtual point sound sources P for the counterpart devices 200' at the other two points. This lets the conference attendees h clearly recognize which counterpart device 200' is carrying the chairperson's voice.

In this way, the position pattern of the virtual point sound sources P is set according to the user's selection: for example, whether to arrange the virtual point sound sources P for the audio from the counterpart devices 200' evenly, as shown in FIG. 6(a) and (b), or to designate a chairperson and arrange them as shown in FIG. 6(c). When designating a chairperson, the user performs an operation to select the conference point that is to act as chairperson.

As described above, when the operation unit 28 accepts the selection of the conference points and of the position pattern of the virtual point sound sources P, the control unit 29 performs processing (association processing) that associates the selected conference points with the positions of the virtual point sound sources P (described in detail later).

The control unit 29 includes, for example, a CPU (Central Processing Unit) and a storage unit such as a memory. By executing a program stored in the memory, the control unit 29 controls the operation of each part of the speaker device 2, for example calls with the audio conference apparatuses 200'.

The control unit 29 also executes processing that applies the audio conference presettings to the channels of the delay unit 24 and to the signal distribution unit 23. Specifically, the control unit 29 receives from the operation unit 28 an operation signal indicating the selection of conference points and an operation signal indicating the selection of the position pattern of the virtual point sound sources P. Using these input signals, the control unit 29 performs the association processing described later with reference to FIG. 7(a). By executing this association processing, the control unit 29 associates the transmission source ID of each selected conference point with the position of a virtual point sound source P.

In addition, the control unit 29 associates the transmission source ID of each selected conference point with a channel of the delay unit 24 and sets this association in the signal distribution unit 23. With this setting, as described above, the signal distribution unit 23 refers to the transmission source ID of each received audio signal and inputs the audio signal to the channel corresponding to that transmission source ID. The control unit 29 also calculates, for each conference point, the delay times that focus the audio beam at the position of the corresponding virtual point sound source P, and sets them in the corresponding channel of the delay unit 24.

With this setting, a received audio signal input to a channel of the delay unit 24 is given the delay times corresponding to its conference point, so its sound image is localized at the position of the virtual point sound source P corresponding to that conference point.
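The delay calculation itself is not spelled out in this passage. As a minimal sketch, assuming ordinary delay-and-sum focusing, each speaker unit can be delayed so that all wavefronts arrive at the focal point at the same instant. The speaker spacing, the focal coordinates, the speed of sound, and the function name `focus_delays` are illustrative assumptions, and the reflector 10 is ignored.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the source


def focus_delays(speaker_xs, focus, c=SPEED_OF_SOUND):
    """Return per-speaker delays (seconds) that focus the beam at `focus`.

    speaker_xs: positions of the speaker units along the array axis (m).
    focus: (x, y) of the virtual point sound source, where y is the
           perpendicular distance from the array axis (m).
    The farthest unit gets zero delay; nearer units wait longer so that
    every wavefront reaches the focal point simultaneously.
    """
    fx, fy = focus
    dists = [math.hypot(x - fx, fy) for x in speaker_xs]
    farthest = max(dists)
    return [(farthest - d) / c for d in dists]


# Hypothetical layout: eight units at 5 cm pitch, focus 30 cm from the array centre.
speaker_xs = [0.05 * i for i in range(8)]
delays = focus_delays(speaker_xs, focus=(0.175, 0.30))
```

One delay set of this kind would be computed per conference point, each with a different focal position along the longitudinal direction of the array.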

In addition to the above configuration, the audio conference apparatus 200 includes a microphone amplifier 31 and an A/D converter 32. The microphone amplifier 31 receives the conference audio signal (transmission audio signal) from the microphone 3 connected to the connection terminal 33. The microphone amplifier 31 amplifies the transmission audio signal and inputs it to the A/D converter 32. The A/D converter 32 converts the analog transmission audio signal to digital form and inputs it to the echo canceller 22.

As described above, the echo canceller 22 removes the echo component by subtracting the pseudo signal from the input transmission audio signal. The transmission audio signal is then transmitted to the audio conference apparatuses 200' via the echo canceller 22, the input/output interface 21, and the network N. When the transmission audio signal is sent from the input/output interface 21, the transmission source ID of the present audio conference apparatus 200 is included in it.

FIG. 7 shows flowcharts of the processing executed by the audio conference apparatus 200 shown in FIG. 6: (a) is a flowchart of the association processing, and (b) is a flowchart of the distribution processing.

Referring to (a), the control unit 29 first receives an operation signal indicating the selection of conference points (S1) and then receives an operation signal indicating the position pattern of the virtual point sound sources P (S2).

The control unit 29 acquires the virtual point sound sources P and associates the transmission source IDs of the selected conference points with the acquired virtual point sound sources P (S3). This association is described concretely below. The control unit 29 stores a table T1 (see FIG. 8(a)) that registers the positions of the virtual point sound sources P in association with the number of conference points and the position pattern of the virtual point sound sources P. The control unit 29 looks up this table T1 by the number of conference points and the position pattern, and acquires the corresponding positions of the virtual point sound sources P.

The control unit 29 also stores a table T2 (see FIG. 8(b)) that registers each conference point together with the transmission source ID of the counterpart device 200' at that conference point. The control unit 29 looks up the table T2 by each selected conference point and acquires the corresponding transmission source ID. The control unit 29 then associates the acquired transmission source IDs with the acquired positions of the virtual point sound sources P at random and stores this association.
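The lookup and random association in step S3 can be sketched as follows. The contents of tables T1 and T2 are not given in this passage, so the positions, pattern names, and IDs below are hypothetical, as is the assumption that the number of conference points in T1 counts the local site (consistent with FIG. 6, where a four-point conference has three counterpart devices).

```python
import random

# Hypothetical stand-in for table T1:
# (number of conference points, position pattern) -> positions along the array (m).
T1 = {
    (4, "even"): [-0.3, 0.0, 0.3],
    (3, "even"): [-0.2, 0.2],
}

# Hypothetical stand-in for table T2: conference point -> transmission source ID.
T2 = {"Hokkaido": "ID-01", "Tokyo": "ID-02", "Nagoya": "ID-03"}


def associate(selected_points, pattern, rng=random):
    """Return {transmission source ID: virtual point sound source position}."""
    n_points = len(selected_points) + 1        # assumed: count includes the local site
    positions = list(T1[(n_points, pattern)])  # look up T1
    source_ids = [T2[p] for p in selected_points]  # look up T2
    rng.shuffle(positions)                     # random association, as in step S3
    return dict(zip(source_ids, positions))


# Osaka holding a conference with Tokyo and Nagoya; a seeded generator keeps
# the example reproducible.
mapping = associate(["Tokyo", "Nagoya"], "even", rng=random.Random(0))
```

A chairperson pattern would need a fixed rather than random assignment for the chairperson's ID; that case is left out of this sketch.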

Next, the control unit 29 associates each transmission source ID with a channel of the delay unit 24 and sets this association between channels and transmission source IDs in the signal distribution unit 23 (S4). The control unit 29 then calculates, for each acquired virtual point sound source P, the delay times that focus the audio beam at that virtual point sound source P (S5), and sets them in the corresponding channel of the delay unit 24 (S6). The control unit 29 then ends this processing.
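Steps S4 to S6 amount to building two settings: a map from transmission source ID to delay channel, and a set of delay times per channel. A minimal sketch follows, in which the function name `configure`, the sequential channel numbering, and the `delay_fn` callback are all assumptions made for illustration.

```python
def configure(assoc, delay_fn):
    """Build the settings produced by steps S4 to S6.

    assoc: {transmission source ID: virtual point sound source position}.
    delay_fn: callable mapping a position to a list of per-speaker delays.
    Returns (channel_of, channel_delays), where channel_of is the association
    given to the signal distribution unit and channel_delays holds the delay
    times set in each channel of the delay unit.
    """
    channel_of = {}
    channel_delays = {}
    for channel, (source_id, position) in enumerate(assoc.items()):
        channel_of[source_id] = channel               # S4
        channel_delays[channel] = delay_fn(position)  # S5 and S6
    return channel_of, channel_delays


# Placeholder delay function; a real one would compute focusing delays.
channel_of, channel_delays = configure(
    {"ID-02": -0.2, "ID-03": 0.2},
    delay_fn=lambda position: [abs(position)] * 8,
)
```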

In this flowchart steps S2 and S3 are executed, but an alternative configuration is also possible: instead of the table T1, the control unit 29 may store a table that registers each conference point together with the position of the virtual point sound source P corresponding to it, use this table to acquire the position of the virtual point sound source P for each conference point, and thereby acquire the position of the virtual point sound source P corresponding to the transmission source ID of that conference point.

Next, the distribution processing executed by the signal distribution unit 23 is described with reference to FIG. 7(b). The signal distribution unit 23 first waits, checking at predetermined time intervals whether a received audio signal has been input, until it determines that one has (S11). When it determines that a received audio signal has been input (YES in S11), the signal distribution unit 23 refers to the transmission source ID included in the received audio signal. Based on the configured association between transmission source IDs and channels of the delay unit 24, the signal distribution unit 23 identifies the channel of the delay unit 24 corresponding to that transmission source ID (S12).

The signal distribution unit 23 then inputs the received audio signal to the identified channel (S13) and returns the processing to step S11.
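Steps S12 and S13 can be sketched as a lookup followed by a hand-off to the chosen channel. The class name, the `(source_id, samples)` packet shape, and the choice to drop signals with an unknown transmission source ID are assumptions; the text does not say how unknown IDs are handled. The polling wait of step S11 is left to the caller.

```python
from collections import defaultdict


class SignalDistributor:
    """Routes received audio to delay channels by transmission source ID."""

    def __init__(self, channel_of):
        # Association between transmission source IDs and delay channels.
        self.channel_of = dict(channel_of)
        # Audio handed to each channel, kept in lists for illustration.
        self.channels = defaultdict(list)

    def route(self, packet):
        """Handle one received signal; return the channel used, or None."""
        source_id, samples = packet
        channel = self.channel_of.get(source_id)   # S12: identify the channel
        if channel is None:
            return None                            # assumed: unknown IDs are dropped
        self.channels[channel].append(samples)     # S13: input to that channel
        return channel


distributor = SignalDistributor({"ID-02": 0, "ID-03": 1})
used = distributor.route(("ID-03", [0.1, 0.2, 0.3]))
```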

As described above, in the present embodiment the received audio signals transmitted from the counterpart devices 200' at different conference points are given delay times that focus them at the positions of different virtual point sound sources P. The speaker array 20 can thus output audio beams that localize the sound images of the audio from different counterpart devices 200' at the positions of different virtual point sound sources P. The conference attendees h can therefore easily identify which conference point the audio they are hearing comes from.

In the present embodiment, the speaker array 20 is arranged so that the sound emitting sides of the speaker units SP1 to SP8 face downward, and the reflector 10 is arranged below the speaker array 20. As a result, the audio beam output downward from the speaker array 20 and reflected by the reflector 10 splits into two directions along the X-X direction, travels obliquely upward, and is output from the sides of the frame 1.

Audio beams that localize the sound images of the audio from different counterpart devices 200' at the positions of different virtual point sound sources P can thus be output in two directions. As shown in FIG. 3, the audio from different counterpart devices 200' can therefore be localized at the positions of different virtual point sound sources P for both of the conference attendees h1 and h2, who sit facing each other across the conference desk. If a horizontal audio beam were output toward the conference attendees h1 and h2, sound would tend to leak to people located in the output direction of the beam. In the present embodiment, the audio beam is output toward the conference attendees h1 and h2 from below, traveling obliquely upward, which effectively prevents such sound leakage to the surroundings of the conference attendees h1 and h2.

Even though the speaker array 20 can control the directivity of the audio beam, some sound still leaks in the direction orthogonal to the sound emitting surface. In the present embodiment, the speaker array 20 outputs the audio beam downward and the reflector 10 directs it toward the listeners, so this leaked sound can be prevented from being heard by people other than the intended listeners.

The present embodiment can adopt the following modifications.

(1) In the present embodiment the audio conference apparatus 200 includes the reflector 10, but it is not limited to this. For example, the frame 1 may omit the reflector 10 (its lower surface), and the audio beam may instead be reflected by the conference desk or floor on which the audio conference apparatus 200 is installed.

(2) In the present embodiment the number of speaker units SP is eight, but the number is not limited to eight; it suffices to provide at least as many units as are needed to control the directivity and beam width of the audio beam.

FIG. 1 is a diagram schematically showing an audio conference system.

FIG. 2 is a perspective view, in cross section, of the audio conference apparatus shown in FIG. 1.

FIG. 3 is a diagram for explaining the installation method of the audio conference apparatus shown in FIG. 2 and the directivity control of the audio beam.

FIG. 4 is a view of the room R shown in FIG. 3 as seen from the Y direction.

FIG. 5 is a block diagram schematically showing the configuration of the audio conference apparatus shown in FIG. 1.

FIG. 6 is a diagram showing the positions of the virtual point sound sources. (a) shows a conference among four points, in which the audio from the counterpart devices at three points is localized at virtual point sound sources separated by equal distances. (b) shows a conference among three points, in which the audio from the counterpart devices at two points is localized at virtual point sound sources separated by equal distances.

FIG. 7 shows flowcharts of the processing executed by the audio conference apparatus shown in FIG. 6: (a) is a flowchart of the association processing, and (b) is a flowchart of the distribution processing.

FIG. 8 is a diagram showing the tables stored by the control unit.

Explanation of symbols

200 - audio conference apparatus
3 - microphone
20 - speaker array
21 - input/output interface (input unit)
24 - delay unit (sound emitting unit)
25 - adder (sound emitting unit)
26 - D/A converter
27 - amplifier (sound emitting unit)
29 - control unit
P - virtual point sound source
SP - speaker unit

Claims

An audio conference apparatus comprising:
an input unit that inputs each audio signal from a plurality of transmission sources together with transmission source identification information; and
a sound emitting unit to which a speaker array, comprising a plurality of speaker units arranged in a line and facing downward, is connected, and which performs independent delay control on each audio signal input at the input unit and inputs the resulting signals to the speaker array,
wherein the sound emitting unit selects a position pattern of the virtual point sound sources of the audio signals, and,
based on the transmission source identification information input together with each audio signal from the input unit and on the selected position pattern, performs control so that the audio beam of each audio signal focuses at a virtual point sound source at a different position in the longitudinal direction of the speaker array.

The audio conference apparatus according to claim 1, wherein the sound emitting unit sets the position of the virtual point sound source above the speaker array.