JP2006211487A

JP2006211487A - Voice input/output device

Info

Publication number: JP2006211487A
Application number: JP2005022981A
Authority: JP
Inventors: Akane Noguchi (野口 あかね); Toshiaki Ishibashi (石橋 利晃)
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2005-01-31
Filing date: 2005-01-31
Publication date: 2006-08-10

Abstract

PROBLEM TO BE SOLVED: To provide a voice input/output device that can appropriately pick up the voice of a user in any posture, so that the user can talk in a free posture.

SOLUTION: A plurality of unit devices DU(i, j), each of which has both a microphone function and a speaker function and can be switched so that either one of these functions is enabled, are arranged in a plane. A controller 200 generates ultrasonic waves using the speaker function of a plurality of distributed unit devices, picks up the reflections of the ultrasonic waves using the microphone function of the remaining distributed unit devices, selects one or more unit devices corresponding to the position of the user's mouth based on the result, and operates the selected unit devices as microphones.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a voice input/output device suitable as a means for voice calls.

The telephone is a common means for voice calls. To send his or her voice to the other party with a telephone, the user must hold the handset, position its microphone near the mouth, and speak into the microphone. Talking while holding a handset in this way is cumbersome. One conceivable way to allow a call without a handset is to place the telephone's microphone at a fixed position and have the user talk in a posture from which the microphone can properly pick up his or her voice. However, adopting a posture to suit the microphone position is very constraining for the user, who would rather talk hands-free and in any posture. In this regard, Patent Document 1 discloses a technique in which a camera captures an image of the user, the position of the user's mouth is obtained from that image, and the microphone is moved to a position where it can appropriately pick up the sound uttered from the mouth.
Patent Document 1: JP-A-7-28488

According to the technique of Patent Document 1, the microphone can be moved to a position where it can pick up the user's voice without troubling the user. However, because this technique obtains the position of the user's mouth by image recognition and then moves the microphone to that position, it takes time to bring the microphone properly into line with the user's mouth. In addition, devices that are not inherently needed for a call, such as a camera for capturing the user's image, a device for image recognition processing, and a device for moving the microphone, must be provided separately, and in that sense the technique is uneconomical.

The present invention has been made in view of the circumstances described above, and its object is to provide a voice input/output device that can appropriately pick up the voice of a user in any posture and thereby enables a call in a free posture.

The present invention provides a voice input/output device comprising: a voice input/output device array in which a plurality of unit devices, each having both a microphone function and a speaker function and capable of switching control that enables either one of these functions, are arranged in a plane; detection means for detecting whether or not each of the plurality of unit devices is facing some object; and control means for controlling the voice input/output device array so as to select one or more unit devices from among those detected by the detection means as facing some object, operate the selected unit devices as microphones, and operate at least some of the other unit devices as speakers.
In another preferred aspect, the voice input/output device array also serves as the detection means. In this aspect, the control means generates ultrasonic waves using the speaker function of a plurality of distributed unit devices in the voice input/output device array, picks up the reflections of the ultrasonic waves using the microphone function of the remaining distributed unit devices in the array, detects from the pickup result whether or not each of the plurality of unit devices is facing some object, selects one or more unit devices from among those found to be facing some object, and controls the voice input/output device array so that the selected unit devices operate as microphones and at least some of the other unit devices operate as speakers.
In another preferred aspect, the control means repeatedly executes the control for selecting the unit devices to function as microphones.

Embodiments of the present invention will be described below with reference to the drawings.
<First Embodiment>
FIG. 1 is a block diagram showing the configuration of a voice input/output system according to the first embodiment of the present invention. This voice input/output system comprises m × n unit devices DU(i, j) (i = 1 to m, j = 1 to n), a controller 200 that controls the entire system, and an output voice processing/control unit 201, an input voice processing/control unit 202, and a switching control unit 203 that control each unit device DU(i, j) under the control of the controller 200.

Each unit device DU(i, j) (i = 1 to m, j = 1 to n) comprises: a voice input/output unit 101 having both a microphone function and a speaker function; a speaker amplifier 102 that, when the voice input/output unit 101 is used as a speaker, amplifies the audio signal supplied from the output voice processing/control unit 201 and supplies it to the voice input/output unit 101; a microphone amplifier 103 that, when the voice input/output unit 101 is used as a microphone, amplifies the signal obtained from the voice input/output unit 101 and outputs the resulting audio signal to the input voice processing/control unit 202; and a microphone/speaker switch 104 that switches the voice input/output unit 101 between use as a speaker and use as a microphone in accordance with a switching control signal from the switching control unit 203.

FIG. 2 is a diagram showing a configuration example of the unit device DU(i, j). In this example, the voice input/output unit 101 of the unit device DU(i, j) is a capacitor-type device having a diaphragm 101a and two fixed plates 101b and 101c that face each other with the diaphragm 101a between them. In a preferred aspect, the voice input/output unit 101 of each unit device DU(i, j) (i = 1 to m, j = 1 to n) has the shape of a thin square plate, and the units are arranged vertically and horizontally, for example so as to cover a wall in a room, to form a voice input/output device array. The speaker amplifier 102 of the unit device DU(i, j) consists of an amplifier 102a, which amplifies the audio signal supplied from the output voice processing/control unit 201 without changing its phase, and an amplifier 102b, which inverts the phase of the same audio signal and amplifies it. The microphone amplifier 103 is a differential amplifier.

When given a switching control signal for making the voice input/output unit 101 function as a speaker, the microphone/speaker switch 104 connects the output terminal of the amplifier 102a in the speaker amplifier 102 to the fixed plate 101b of the voice input/output unit 101, the output terminal of the amplifier 102b to the fixed plate 101c, and the electrode of the bias power supply VB1 to the diaphragm 101a. In this state, the audio signal from the output voice processing/control unit 201 is applied to the fixed plate 101b via the amplifier 102a, and an audio signal of opposite phase is applied to the fixed plate 101c via the amplifier 102b. As a result, the diaphragm 101a undergoes a reciprocating vibration, moving toward the fixed plate 101b or the fixed plate 101c as the instantaneous value of the audio signal changes. Plane waves corresponding to the original audio signal are thus radiated from both the front and back surfaces (the upper and lower surfaces in FIG. 2) of the flat voice input/output unit 101.

When, on the other hand, it is given a switching control signal for making the voice input/output unit 101 function as a microphone, the microphone/speaker switch 104 connects the non-inverting input terminal of the microphone amplifier 103 to the fixed plate 101b of the voice input/output unit 101, the inverting input terminal to the fixed plate 101c, and the electrode of the bias power supply VB2 to the diaphragm 101a. In this state, when the diaphragm 101a vibrates on receiving sound waves from outside, electrical signals whose waveforms follow the vibration and which are opposite in phase to each other appear on the fixed plates 101b and 101c. The microphone amplifier 103 differentially amplifies these signals and outputs the resulting audio signal to the input voice processing/control unit 202. The above are the details of the unit device DU(i, j) shown in FIG. 2.
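The effect of taking the two plate signals differentially can be illustrated with an idealized numerical model. The sketch below is an illustration under stated assumptions, not circuitry from this document: `plate_signals` and `differential_amplify` are hypothetical names, the gain is unity, and the interference term is a made-up disturbance assumed to appear equally on both fixed plates.

```python
def plate_signals(vibration, interference):
    """Idealized signals on fixed plates 101b and 101c in microphone mode.

    The diaphragm vibration appears with opposite phase on the two plates,
    while the (hypothetical) interference appears identically on both.
    """
    plate_b = [v + n for v, n in zip(vibration, interference)]
    plate_c = [-v + n for v, n in zip(vibration, interference)]
    return plate_b, plate_c


def differential_amplify(plate_b, plate_c, gain=1.0):
    """Model of microphone amplifier 103: amplify the difference b - c."""
    return [gain * (b - c) for b, c in zip(plate_b, plate_c)]


vibration = [0.5, -0.25, 0.0]
interference = [0.125, 0.125, 0.125]
b, c = plate_signals(vibration, interference)
# The common interference cancels and the opposite-phase parts add.
output = differential_amplify(b, c)
```

In this model the output depends only on the vibration, which is one reason a differential amplifier suits a pair of opposite-phase signals.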

In this embodiment, the voice input/output unit 101 of each unit device DU(i, j) (i = 1 to m, j = 1 to n) serves not only as a microphone and a speaker but also as a sensor that detects an object located nearby in the horizontal direction. To avoid repetition, this role is explained in the description of the operation of this embodiment.

In a preferred aspect, the voice input/output system is connected to a telephone network or the Internet via a communication control unit for voice communication. When the telephone number of the other party is entered by operating an operation unit (not shown), the controller 200 has the communication control unit establish a two-way voice call connection that links the voice receiving unit of the other party's device with the output voice processing/control unit 201 and the voice transmitting unit of the other party's device with the input voice processing/control unit 202.
The above are the details of the configuration of the voice input/output system according to this embodiment.

Next, the operation of this embodiment will be described.
When the user enters the telephone number or IP address of the other party's device by operating the operation unit (not shown) and instructs the start of a call, the controller 200 executes processing for establishing a connection with the other party's device and, in parallel, executes the routine whose flow is shown in FIG. 3.

First, the controller 200 executes talker head position measurement processing, which locates the head of the person speaking (step S1). In this processing, the controller 200 instructs the switching control unit 203 to output switching control signals for talker head position measurement. In accordance with this instruction and a pattern stored in advance for this measurement, the switching control unit 203 sends a switching control signal that makes the voice input/output unit 101 function as a speaker to the microphone/speaker switches 104 of a plurality of distributed unit devices among the unit devices DU(i, j) (i = 1 to m, j = 1 to n), and sends a switching control signal that makes the voice input/output unit 101 function as a microphone to the microphone/speaker switches 104 of the other unit devices. FIG. 4 shows the state of the voice input/output units 101 of the unit devices DU(i, j) after this switching control. In this example, the voice input/output units 101 of the unit devices DU(i, j) whose indices i and j are both odd or both even act as microphones, while those of the unit devices whose indices i and j are one odd and one even act as speakers, so that the speakers and microphones as a whole form a mosaic pattern.
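The mosaic arrangement of FIG. 4 can be sketched as follows. This is a minimal illustration rather than part of the original disclosure: the function name `measurement_pattern` and the string role labels are hypothetical, and indices are 1-based to match DU(i, j).

```python
def measurement_pattern(m, n):
    """Return {(i, j): role} for the head position measurement pattern.

    Units whose indices are both odd or both even become microphones;
    units with one odd and one even index become speakers.
    """
    roles = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # "both odd or both even" is equivalent to "i + j is even"
            roles[(i, j)] = "mic" if (i + j) % 2 == 0 else "speaker"
    return roles


pattern = measurement_pattern(3, 4)
```

Every microphone unit in this pattern has speaker units as its horizontal and vertical neighbours, so a reflection from an object in front of a speaker unit can reach an adjacent microphone unit.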

Next, the controller 200 instructs the output voice processing/control unit 201 to output ultrasonic waves. In accordance with this instruction, the output voice processing/control unit 201 sends the drive signal needed to generate ultrasonic waves to the speaker amplifiers 102 of the unit devices DU(i, j) (i = 1 to m, j = 1 to n). In each unit device that has been given the switching control signal to function as a speaker, the drive signal is amplified by the amplifiers 102a and 102b of the speaker amplifier 102 and applied via the microphone/speaker switch 104 to the fixed plates 101b and 101c of the voice input/output unit 101, and ultrasonic waves are radiated from both the front and back surfaces of the voice input/output unit 101. Because these ultrasonic waves are highly directional plane waves, they are reflected by any object in front of the voice input/output unit 101, and the reflected ultrasonic waves reach the voice input/output unit 101 that radiated them and the other voice input/output units 101 adjacent to it.

In each voice input/output unit 101 that the ultrasonic waves reach and that is functioning as a microphone, the diaphragm 101a vibrates in response to the ultrasonic waves, electrical signals corresponding to this vibration appear on the fixed plates 101b and 101c, and these two-phase signals are sent to the microphone amplifier 103 via the microphone/speaker switch 104. The microphone amplifier 103 differentially amplifies the two-phase signals and sends the result to the input voice processing/control unit 202. The input voice processing/control unit 202 monitors the output signals of the microphone amplifiers 103 of the unit devices DU(i, j) (i = 1 to m, j = 1 to n), determines from them which of the voice input/output units 101 functioning as microphones the ultrasonic waves have reached, and reports these to the controller 200.

In the example shown in FIG. 5, a person, drawn with a broken line, is standing in front of the vertically and horizontally arranged voice input/output units 101, and the ultrasonic waves reflected by the person's body have reached the hatched ones among the voice input/output units 101 functioning as microphones. On receiving the report of the voice input/output units 101 that the ultrasonic waves have reached, the controller 200 determines that the user's head is at the position of the uppermost of those units, which in the example of FIG. 5 is the voice input/output unit 101H.
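The head position decision can be sketched as follows. The document states only that the microphone amplifier outputs are monitored and that the uppermost unit reached by the reflection is taken as the head position; the level threshold, the convention that row index i = 1 is the top row, the middle-column tie-break, and the names `units_with_echo` and `head_position` are all assumptions made for this illustration.

```python
def units_with_echo(levels, threshold):
    """Return the microphone units whose ultrasonic level reaches the threshold.

    levels: {(i, j): measured level} for units acting as microphones.
    The thresholding rule is an assumption, not taken from the document.
    """
    return {pos for pos, level in levels.items() if level >= threshold}


def head_position(echo_units):
    """Pick the uppermost unit reached by the reflection, or None if there is none.

    Assumes i is the row index with i = 1 at the top of the array. If several
    units share the top row, the middle one is chosen (also an assumption).
    """
    if not echo_units:
        return None
    top_row = min(i for i, _ in echo_units)
    cols = sorted(j for i, j in echo_units if i == top_row)
    return (top_row, cols[len(cols) // 2])


levels = {(2, 4): 0.9, (3, 3): 0.8, (3, 5): 0.7, (4, 4): 0.9, (1, 1): 0.05}
head = head_position(units_with_echo(levels, threshold=0.5))
```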

When the talker head position measurement processing has been completed in this way, the controller 200 executes speaker and microphone setting processing (step S2). In this setting processing, the controller 200 selects as microphones the voice input/output units 101 near the voice input/output unit 101 that was determined to be at the head position. Here the controller 200 may estimate the position of the talker's mouth from the head position obtained by the measurement and select as a microphone either the single voice input/output unit 101 facing the mouth position or a plurality of voice input/output units 101 within a certain range centered on that unit. After making this selection, the controller 200 instructs the switching control unit 203 to output switching control signals that make the selected voice input/output units 101 function as microphones and the other voice input/output units 101 function as speakers.
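The setting processing of step S2 can be sketched as follows. The document does not say how the mouth position is estimated or how the "certain range" is measured, so the one-row downward offset, the square (Chebyshev) neighbourhood, and the names `assign_roles`, `mouth_offset`, and `radius` are hypothetical choices for illustration only.

```python
def assign_roles(head, m, n, mouth_offset=1, radius=0):
    """Return (mouth, roles) for step S2 of the first embodiment.

    head: (i, j) of the unit judged to be at the head position.
    mouth_offset: assumed number of rows between head and mouth (i grows downward).
    radius: units within this many rows and columns of the mouth become microphones.
    """
    head_i, head_j = head
    mouth = (min(head_i + mouth_offset, m), head_j)
    roles = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            near = max(abs(i - mouth[0]), abs(j - mouth[1])) <= radius
            # First embodiment: every unit not selected as a microphone is a speaker.
            roles[(i, j)] = "mic" if near else "speaker"
    return mouth, roles


mouth, roles = assign_roles(head=(2, 4), m=6, n=8, radius=1)
```

With `radius=0` a single unit facing the estimated mouth position is selected; a larger radius selects the surrounding block of units as well.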

Thereafter, even after the connection for the call has been established and the call has started, the controller 200 repeats the processing of steps S1 and S2. That is, talker head position measurement is performed by generating ultrasonic waves from the distributed voice input/output units and detecting their reflections (see FIG. 6(a)), the voice input/output units to function as microphones are selected based on the measurement result (see FIG. 6(b)), talker head position measurement is performed again (see FIG. 6(c)), the voice input/output units to function as microphones are selected again (see FIG. 6(d)), and so on, with the measurement and the selection alternating repeatedly.

Therefore, even if the talker changes posture during the call, for example by sitting on a chair as shown in FIG. 7 or standing as shown in FIG. 8, the voice input/output unit 101 facing the talker's mouth is selected as a microphone, and the talker's voice is appropriately picked up and transmitted to the other party's device.
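The alternation of steps S1 and S2 during a call can be sketched as a simple loop. The callback names `measure_head`, `configure_array`, and `call_active` are hypothetical, and keeping the previous configuration when no reflection is found is an assumption; the document does not describe that case.

```python
def track_talker(measure_head, configure_array, call_active):
    """Alternate step S1 (measure) and step S2 (configure) while the call lasts."""
    while call_active():
        head = measure_head()        # step S1: head position measurement
        if head is not None:         # assumption: keep the old setup if nothing is found
            configure_array(head)    # step S2: speaker and microphone setting


# Usage with stand-in callbacks: three measurement rounds, one with no reflection.
heads = iter([(2, 4), None, (4, 4)])
ticks = iter([True, True, True, False])
applied = []
track_talker(lambda: next(heads), applied.append, lambda: next(ticks))
```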

<Second Embodiment>
FIG. 9 is a diagram showing the operation of a voice input/output device according to the second embodiment of the present invention. In the first embodiment, when the voice input/output units to function as microphones were selected based on the result of the talker head position measurement processing, all voice input/output units other than those selected as microphones were made to function as speakers (see FIGS. 6(b) and 6(d)). In this embodiment, by contrast, a voice input/output unit that is not selected as a microphone is made to function as a speaker only if it functioned as a speaker during the talker head position measurement processing (see FIGS. 9(b) and 9(d)). In other words, a voice input/output unit that is not selected as a microphone and did not function as a speaker during the measurement is made to function neither as a microphone nor as a speaker. This control reduces the change in the number and arrangement of the voice input/output units functioning as speakers between the periods in which the measurement is performed and the other periods, and so reduces the change in the loudness of the reproduced sound.
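The role assignment of the second embodiment can be sketched as follows. The sketch assumes that the measurement pattern is the mosaic of FIG. 4 (speakers where one index is odd and the other even); the function name `call_roles_second_embodiment` and the role label `"idle"` are hypothetical.

```python
def call_roles_second_embodiment(m, n, selected_mics):
    """Return {(i, j): role} used between measurement rounds.

    selected_mics: set of (i, j) chosen as microphones in step S2.
    A unit becomes a speaker only if it was a speaker during measurement.
    """
    roles = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if (i, j) in selected_mics:
                roles[(i, j)] = "mic"
            elif (i + j) % 2 == 1:
                # one index odd, one even: speaker in the assumed mosaic pattern
                roles[(i, j)] = "speaker"
            else:
                roles[(i, j)] = "idle"   # neither microphone nor speaker
    return roles


roles = call_roles_second_embodiment(4, 4, selected_mics={(2, 2)})
```

Because the set of speaker units is nearly the same as during measurement, switching between the two periods changes the reproduced loudness less than in the first embodiment.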

<Third Embodiment>
FIG. 10 is a diagram showing the operation of a voice input/output device according to the third embodiment of the present invention. This embodiment is a further improvement on the second embodiment. In the second embodiment, the voice input/output units at the same fixed positions were made to function as microphones in every round of talker head position measurement processing (see FIGS. 9(a) and 9(c)). In this embodiment, by contrast, the maximum number of voice input/output units function as microphones in the first round of measurement (see FIG. 10(a)), but after the voice input/output units to function as microphones have been selected based on its result (see FIG. 10(b)), the second round of measurement uses as microphones only the voice input/output units within a predetermined range centered on the unit that was selected as a microphone immediately before (see FIG. 10(c)). The same applies thereafter: in the second and subsequent rounds, only the voice input/output units within a predetermined range centered on the one selected as a microphone immediately before function as microphones. According to this embodiment, many voice input/output units are used as microphones at the start of a call, but thereafter the number of units used as microphones is reduced and varies less. The volume heard by the other party therefore varies less over time after the call starts.
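The narrowing of the measurement microphones in the third embodiment can be sketched as follows. Several details are assumptions made for illustration: the speaker units keep the mosaic positions of FIG. 4 in every round, the "predetermined range" is a square neighbourhood of `radius` units, and the names `measurement_roles_third_embodiment`, `previous_mic`, and `"idle"` are hypothetical.

```python
def measurement_roles_third_embodiment(m, n, previous_mic=None, radius=2):
    """Return {(i, j): role} for one round of head position measurement.

    previous_mic: (i, j) selected as microphone just before this round,
    or None for the first round, in which all microphone positions are used.
    """
    roles = {}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if (i + j) % 2 == 1:
                roles[(i, j)] = "speaker"      # emits the ultrasonic wave
            elif previous_mic is None:
                roles[(i, j)] = "mic"          # first round: maximum number of mics
            else:
                pi, pj = previous_mic
                near = max(abs(i - pi), abs(j - pj)) <= radius
                roles[(i, j)] = "mic" if near else "idle"
    return roles


first_round = measurement_roles_third_embodiment(6, 6)
later_round = measurement_roles_third_embodiment(6, 6, previous_mic=(3, 3), radius=1)
```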

<Fourth Embodiment>
FIG. 11 is a time chart showing the operation of a voice input/output device according to the fourth embodiment of the present invention. Apart from people who like to talk while walking, a person absorbed in talking usually changes posture little, so the position of the mouth changes little. When not talking and listening to the other party, however, a person is often bored and tends to shift posture. This embodiment takes advantage of this: the talker head position measurement processing is executed while the user is not talking and is not executed while the user is talking. To perform this control, the output voice processing/control unit 201 in FIG. 1 is given a function for detecting the level of the voice sent from the other party, and the input voice processing/control unit 202 is given a function for detecting the level of the user's voice input through the microphones. When a state with no voice input from the user has continued for a predetermined time or longer, the controller 200 starts alternately repeating the talker head position measurement processing and the processing that selects, based on its result, the voice input/output units to function as microphones. During this time the voice from the other party may be reproduced from the voice input/output units acting as speakers, but regardless of whether that happens, the measurement and the selection are repeated alternately as long as there is no voice input from the user. When voice input from the user starts, the voice input/output units that were selected as microphones immediately before continue to be used as microphones.
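The gating of the measurement by the user's silence can be sketched as a small state holder. The class name `SilenceGate`, the level threshold used to decide that there is "no voice input", and the use of explicit timestamps are assumptions for illustration; the document specifies only that measurement starts after silence has lasted a predetermined time and stops while the user is talking.

```python
class SilenceGate:
    """Decide whether head position measurement may run (fourth embodiment)."""

    def __init__(self, level_threshold, hold_time):
        self.level_threshold = level_threshold  # below this, the user counts as silent
        self.hold_time = hold_time              # required duration of silence, in seconds
        self._silent_since = None               # time at which the current silence began

    def update(self, user_level, now):
        """Feed the current user voice level; return True if measurement may run."""
        if user_level >= self.level_threshold:
            self._silent_since = None           # user is talking: freeze the mic choice
            return False
        if self._silent_since is None:
            self._silent_since = now
        return now - self._silent_since >= self.hold_time


gate = SilenceGate(level_threshold=0.1, hold_time=2.0)
decisions = [
    gate.update(0.5, now=0.0),   # user talking
    gate.update(0.0, now=1.0),   # silence begins
    gate.update(0.0, now=2.5),   # silent for 1.5 s
    gate.update(0.0, now=3.0),   # silent for 2.0 s
    gate.update(0.4, now=3.5),   # user talks again
]
```

Note that the gate looks only at the user's level, in line with the statement that measurement proceeds regardless of whether the other party's voice is being reproduced.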

Because posture still changes somewhat even while talking, the following control may be performed after voice input from the user has started. The voice input/output units within a predetermined range centered on the unit selected as a microphone immediately before the start of the voice input are selected as microphones, the movement of the user's mouth is predicted from the output levels of the microphone amplifiers 103 of the selected units, and the voice input/output units to function as microphones are updated based on the prediction; this operation is repeated. For example, if, among vertically adjacent voice input/output units, the output level of the microphone amplifier 103 of the upper unit is rising while that of the lower unit is falling, the position of the user's mouth is predicted to be rising, and the range of voice input/output units functioning as microphones is shifted upward. According to this aspect, the intermittently repeated talker head position measurement is not performed while the user is speaking (that is, while the other party is listening to the user's voice), so the variation over time in the volume of the user's voice heard by the other party can be suppressed.
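The level-based prediction of mouth movement can be sketched as follows. The document gives only the upward example; the symmetric downward case, the `min_change` dead band, the convention that a smaller row index is higher on the wall, and the name `predict_row_shift` are assumptions added for illustration.

```python
def predict_row_shift(prev_levels, curr_levels, min_change=0.0):
    """Predict how the microphone range should move, from amplifier output levels.

    prev_levels, curr_levels: {row: output level of microphone amplifier 103}
    for vertically adjacent microphone units at two successive times.
    Returns -1 to shift the range up, +1 to shift it down, 0 to leave it.
    """
    rows = sorted(curr_levels)
    upper, lower = rows[0], rows[-1]
    d_upper = curr_levels[upper] - prev_levels[upper]
    d_lower = curr_levels[lower] - prev_levels[lower]
    if d_upper > min_change and d_lower < -min_change:
        return -1   # upper level rising, lower falling: mouth is moving up
    if d_lower > min_change and d_upper < -min_change:
        return 1    # assumed mirror case: mouth is moving down
    return 0


shift = predict_row_shift({3: 0.4, 4: 0.6}, {3: 0.6, 4: 0.4})
```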

<Other embodiments>
In each embodiment described above, the voice input/output unit also serves as a sensor that detects whether an object is facing the unit device. Instead of using the voice input/output unit as a sensor in this way, however, a separate sensor such as an infrared sensor may be provided in each unit device to detect whether an object is facing that unit device.

FIG. 1 is a block diagram showing the configuration of a voice input/output device according to a first embodiment of the present invention.
FIG. 2 is a diagram showing a configuration example of a unit device in the embodiment.
FIG. 3 is a flowchart showing the operation of the controller in the embodiment.
FIG. 4 is a diagram showing the operation of the embodiment.
FIG. 5 is a diagram showing the operation of the embodiment.
FIG. 6 is a diagram showing the operation of the embodiment.
FIG. 7 is a diagram showing the operation of the embodiment.
FIG. 8 is a diagram showing the operation of the embodiment.
FIG. 9 is a diagram showing the operation of a voice input/output device according to a second embodiment of the present invention.
FIG. 10 is a diagram showing the operation of a voice input/output device according to a third embodiment of the present invention.
FIG. 11 is a diagram showing the operation of a voice input/output device according to a fourth embodiment of the present invention.

Explanation of symbols

200: controller; 201: output voice processing/control unit; 202: input voice processing/control unit; 203: switching control unit; DU(i, j) (i = 1 to m, j = 1 to n): unit device; 101: voice input/output unit; 102: speaker amplifier; 103: microphone amplifier; 104: microphone/speaker switcher.

Claims

A voice input/output device comprising:
a voice input/output device array in which a plurality of unit devices, each having both a function as a microphone and a function as a speaker and capable of being switched so that either one of these functions is enabled, are arranged in a plane;
detection means for detecting whether or not each of the plurality of unit devices is facing an object; and
control means for controlling the voice input/output device array so as to select one or more unit devices from among those detected by the detection means as facing an object, operate the selected unit devices as microphones, and operate at least some of the other unit devices as speakers.

A voice input/output device comprising:
a voice input/output device array in which a plurality of unit devices, each having both a function as a microphone and a function as a speaker and capable of being switched so that either one of these functions is enabled, are arranged in a plane; and
control means for controlling the voice input/output device array so as to cause a plurality of unit devices distributed in the voice input/output device array to generate ultrasonic waves by their function as speakers, cause a plurality of the remaining unit devices distributed in the voice input/output device array to collect the reflected sound of the ultrasonic waves by their function as microphones, detect, based on the sound collection result, whether or not each of the plurality of unit devices is facing an object, select one or more unit devices from among those recognized as facing an object, operate the selected unit devices as microphones, and operate at least some of the other unit devices as speakers.

The voice input/output device according to claim 1, wherein the control means repeatedly executes the control for selecting the unit devices to function as microphones.